Subsetting

Document # D3716R0
Date 2025-05-19
Targeted subgroups EWG, SG23
Ship vehicle C++29
Reply-to Peter Bindels <dascandy@gmail.com>

What does "-Wall" in "g++ -Wall test.cpp -o test" do? -- It's short for "warn all"; it turns on (almost) all the warnings that g++ can tell you about. Typically a good idea, especially if you're a beginner, because understanding and fixing those warnings can help you fix lots of different kinds of problems in your code.

1 Abstract

We propose a standard facility in C++ to define subsets of the language, and to enforce a chosen subset in a given environment.

2 Prior art

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1881r1.html

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3081r1.pdf

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3390r0.html

https://wg21.link/p2759

Reddit https://www.reddit.com/r/cpp/comments/ee3a48/subset_of_c/

StackOverflow https://stackoverflow.com/questions/3073642/official-c-language-subsets

3 Existing subsetting of C++

4 Design principles

5 Why is subsetting a thing we can and want to do?

The only thing subsetting allows is removing a construct, function, type or keyword from use. The only change a user can see in their program is that it becomes ill-formed, with a specific indication of where the given subset is violated.

Most people have used an axe and a gun at some point in their life, but don't use axes or guns often. Similarly, most C++ code ends up relying on pointer arithmetic, but does not try to do any pointer arithmetic itself. Rules in subsets can be suppressed, so a rule that should nearly always hold can stay enabled while the rare exception is marked explicitly.

Subsets define specific actions that are disallowed. The sum of two subsets is the union of their disallowed actions. If any subset disallows suppressing a given rule, the sum subset disallows suppressing that rule.
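As a rough illustration of how subsets add up, the following sketch models a subset as a map from rule name to a "may be suppressed" flag. The `Subset` alias, the `combine` function and the rule names used with it are hypothetical and exist only for this illustration; they are not proposed syntax or library API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model, not proposed API: each disallowed rule maps to
// whether that rule may be suppressed.
using Subset = std::map<std::string, bool>;  // rule name -> suppressible?

// The sum of two subsets disallows everything that either one disallows.
// A rule stays suppressible only if every subset naming it allows that.
Subset combine(const Subset& a, const Subset& b) {
    Subset result = a;
    for (const auto& [rule, suppressible] : b) {
        auto [it, inserted] = result.emplace(rule, suppressible);
        if (!inserted) {
            // Rule present in both: one "not suppressible" wins.
            it->second = it->second && suppressible;
        }
    }
    return result;
}
```

In this model, combining a subset that allows suppressing a rule with one that forbids it yields a sum in which the rule cannot be suppressed.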

Several existing practices already amount to subsetting. Building with warnings-as-errors for a warning set subsets out the warning-causing constructs. Building in -std=c++17 mode subsets out all C++20+ constructs. Building with -fno-exceptions -fno-rtti disables exceptions and RTTI.
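The flag-based subsets above are already observable from within code today through the standard feature-test macros `__cpp_exceptions` and `__cpp_rtti`, which GCC and Clang leave undefined under -fno-exceptions and -fno-rtti respectively. The sketch below shows code adapting to such a subset. The helper names `enabled_features` and `parse_or_default` are hypothetical, and the fallback parser is deliberately simplistic (decimal digits only, no overflow check).

```cpp
#include <string>

// Reports which of the two flag-controlled features this translation
// unit was compiled with.
std::string enabled_features() {
    std::string out;
#ifdef __cpp_exceptions
    out += "exceptions ";
#endif
#ifdef __cpp_rtti
    out += "rtti ";
#endif
    return out;
}

// Hypothetical helper: converts text to int, returning fallback on failure.
int parse_or_default(const std::string& text, int fallback) {
#ifdef __cpp_exceptions
    // Full language: rely on std::stoi and catch its exceptions.
    try {
        return std::stoi(text);
    } catch (...) {
        return fallback;
    }
#else
    // Exceptions subsetted out: accept plain decimal digits only.
    if (text.empty()) return fallback;
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return fallback;
        value = value * 10 + (c - '0');
    }
    return value;
#endif
}
```

The two paths only agree for simple inputs, which is part of the cost of today's ad hoc subsetting: the code has to carry both variants itself.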

6 How to subset

Define a subset by doing one or more of the following

Each subset indicates whether its rules are suppressible. The set of subsets should be open-ended, so that other organizations (SEI, MISRA, AutoSAR, LLVM, Microsoft etc.) can define subsets.

Implementations are allowed to use the knowledge gained from a subset while compiling a TU, and can use the knowledge of the subset in linking if they can be certain that all TUs were compiled with that subset. All of this falls under the existing as-if rule.

7 Evolving subsets over time

A subset should have a semantic meaning: a user-understandable goal of the subset. The semantic meaning determines which rules belong in the subset. Most subsets are defined once and do not naturally accumulate more rules over time, because they restrict the existing language by removing particular features that already exist, rather than features that are still being added. Some subsets, however, particularly those oriented around restricting new language features, naturally accumulate new rules over time as new features are added to the language. In a different way, companies tend to maintain their own subset of the language, roughly corresponding to "all the warnings we've been able to fix around our software". Such a company tries to expand the subset over time: it makes sure the old subset is never violated, while adding new rules as the software is fixed to satisfy them, preventing those same issues from showing up in future code.

The first of these, the subset that restricts new language features, will naturally accumulate more rules over time, but retains the same meaning. Such subsets can be evolved in place.

The second of these, the company-maintained subset, will also accumulate more rules over time, as the company using it increases the set of rules it enforces. In this case, though, the logical meaning of the subset changes with each expansion; the best descriptions for the logical subsets are likely "The set of warnings we enforce in 2024 and on" and, similarly, "The set of warnings we enforce in 2026 and on". The latter naturally includes the former and expands on it, so these subsets should themselves be written as composed subsets.
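The composition of dated subsets can be sketched as follows, with a subset modeled simply as a set of rule names. The `RuleSet` alias, the `extend` and `includes_all` functions and the rule names used with them are assumptions made for this illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Hypothetical model: a subset is just the set of rule names it enforces.
using RuleSet = std::set<std::string>;

// Builds a later subset as "everything in base, plus the added rules".
RuleSet extend(const RuleSet& base, const RuleSet& added) {
    RuleSet result = base;
    result.insert(added.begin(), added.end());
    return result;
}

// True when every rule of older is also enforced by newer.
// std::includes needs sorted ranges, which std::set guarantees.
bool includes_all(const RuleSet& newer, const RuleSet& older) {
    return std::includes(newer.begin(), newer.end(),
                         older.begin(), older.end());
}
```

Writing the 2026 subset as `extend(warnings_2024, ...)` keeps the 2024 subset intact under its own name, so code that only promises the older subset remains checkable against it.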

If a subset is found to contain a rule that should be omitted, the rule can be removed, as doing so only relaxes the subset and allows more of the full language to be used.

8 Suppressing a rule or a subset?

Suppressing rules is more verbose, because a single statement can be disallowed by multiple rules.

Suppressing subsets requires a closed set of subset definitions, so that a suppression can target a specific subset. We see a major desire in many places to define subsets corresponding to the restrictions that various groups want. At the moment we can think of compiler-designed subsets (removing anything from after C++17, removing all -Wall-triggering constructs), standards-body subsets (removing all constructs that violate MISRA rules), standard C++ subsets (removing all constructs that are considered obsolete in C++29), regulatory subsets (removing specific constructs considered to be unacceptable) and code-owner subsets (removing the constructs that the owner has removed and wants to make sure the code base stays devoid of).

We propose to add suppressions on *rules*. This lets subset definitions be varied by their owners, allowing (for example) MISRA and AutoSAR to update their definitions. It also means that a construct disallowed by the same rule in multiple subsets in use needs only a single suppression, which works across all of them. Finally, it has the subtle effect that code breaking several subsetted properties needs multiple suppressions, making "worse" code "smell worse".
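A minimal sketch of rule-level suppression, under the assumption that each subset records per rule whether it may be suppressed. The `Subset` alias, the `remaining_violations` function and the rule names used with it are hypothetical; the paper does not define any suppression syntax yet.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: rule name -> whether the rule may be suppressed.
using Subset = std::map<std::string, bool>;

// Returns the rules that still reject a construct after suppressions.
// triggered:  rules the construct would violate
// suppressed: rules the user suppressed at that construct
std::vector<std::string> remaining_violations(
    const std::vector<Subset>& active,
    const std::set<std::string>& triggered,
    const std::set<std::string>& suppressed) {
    std::vector<std::string> out;
    for (const std::string& rule : triggered) {
        bool disallowed = false;
        bool suppressible = true;
        for (const Subset& subset : active) {
            auto it = subset.find(rule);
            if (it == subset.end()) continue;
            disallowed = true;
            // Any subset forbidding suppression forbids it for the sum.
            suppressible = suppressible && it->second;
        }
        if (!disallowed) continue;
        if (suppressible && suppressed.count(rule) > 0) continue;
        out.push_back(rule);
    }
    return out;
}
```

Because the suppression names the rule and not the subset, a single suppression satisfies every active subset containing that rule, while a construct that triggers two different rules needs two suppressions.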

9 Example use

No exceptions subset: Disallow use of the catch keyword.

No RTTI subset: Disallow catching non-final types, disallow use of the dynamic_cast keyword, disallow typeid applied to a non-type argument (an expression).

Annex K subset: Disallow use of all functions mentioned in Annex K.

Type-safety subset: Disallow use of reinterpret_cast, const_cast etc. as described in paragraph 4 of p3081.

C++17 subset: Everything the C++20 subset disallows, plus all changes between C++17 and C++20 (not listed).
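To make some of these subsets concrete, the following is valid C++ today, annotated with which of the subsets listed above we would expect to reject each function. The annotations reflect our reading of the rules as stated here, not any implemented checker, and all type and function names are made up for this illustration. The Annex K and C++17 subsets are not shown.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <typeinfo>

struct Shape { virtual ~Shape() = default; };
struct Circle final : Shape { double radius = 1.0; };

// No RTTI subset: uses dynamic_cast.
bool is_circle(const Shape& s) {
    return dynamic_cast<const Circle*>(&s) != nullptr;
}

// No RTTI subset: typeid applied to an expression, not a type.
bool same_dynamic_type(const Shape& a, const Shape& b) {
    return typeid(a) == typeid(b);
}

// No exceptions subset: uses catch.
// No RTTI subset as well: std::invalid_argument is a non-final type,
// so this one handler is disallowed by two different rules.
int to_int_or(const std::string& text, int fallback) {
    try {
        return std::stoi(text);
    } catch (const std::invalid_argument&) {
        return fallback;
    }
}

// Type-safety subset: uses reinterpret_cast.
unsigned char first_byte(const std::uint32_t& value) {
    return *reinterpret_cast<const unsigned char*>(&value);
}
```

The handler in `to_int_or` shows why a single statement can need more than one suppression once several subsets are enabled together.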

10 Wording

To be added if the paper is marked as desirable by EWG / SG23