Enumerating Core Undefined Behavior

Published Proposal,

ISO/IEC JTC1/SC22/WG21 14882: Programming Language — C++


Adding an undefined behavior annex to the Standard and creating a C++ undefined behavior TR

1. Introduction

Explaining undefined behavior is complicated. First you need to explain what undefined behavior is, then all the unintuitive consequences it entails: removal of safety checks, finite loops turned infinite, booleans that can be both false and true, and undefined behavior that can time travel 🤯
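
As a hedged illustration of the "removal of safety checks" consequence, consider an overflow guard that is written in terms of the overflow itself. The function names below are hypothetical and chosen only for this sketch; whether a given compiler actually removes the check depends on the implementation and optimization level.

```cpp
#include <limits>

// Hypothetical guard that relies on signed overflow wrapping around.
// Signed integer overflow is undefined behavior, so an optimizer is
// allowed to assume x + 1 never overflows and fold this to `false`.
bool will_overflow_broken(int x) {
    return x + 1 < x;  // undefined behavior when x is the maximum int
}

// A well-defined alternative: compare against the limit before adding.
bool will_overflow(int x) {
    return x == std::numeric_limits<int>::max();
}
```

The second form never evaluates an overflowing expression, so the check cannot be optimized away on the grounds that overflow "cannot happen".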

Then comes the next logical question: how can I know what all the undefined behaviors are so I can avoid them? This may be followed by an awkward silence. “That is complicated”, one might say. We might follow up and mention that we have both explicit and implicit undefined behavior. A fair response might be, “Makes sense, but surely you can tell me what all the explicit undefined behaviors are?” This would be followed by more awkward silence, and then perhaps a sheepish, “Well, you see, that is also complicated”.

We would have to follow up and point out that the C++ Standard does indeed list all the explicit undefined behavior, but you would have to manually go through the 1700+ page Standard to find it. Merely finding all the mentions of “undefined” is only partially helpful. The Standard, being a specification and not a tutorial, does not explain each one in plain language, and honestly some defy explanation in plain language. Examples are not always provided, nor do we have rationales or explanations of how to avoid or catch violations of these rules (if possible).

The goal of this paper is twofold. The first goal is to create an annex of undefined behavior. Its purpose would be to list all the explicit core undefined behavior, with at least one example demonstrating each. Having this list will enable the C++ community to better grasp the scope and depth of undefined behavior. It should benefit not just users but also those teaching C++ and those developing tools for writing better code. It will benefit implementors because it lets them know what is undefined and how. It will help the committee track its undefined behavior and revisit it.

The second goal is to develop a core undefined behavior TR, which would expand upon the content of the annex with more examples, including examples showing surprising consequences. It would also include tools, if any, that could aid in detecting or avoiding each undefined behavior. If possible, we would also like to include a rationale for each undefined behavior. The TR will have all the benefits of the annex, and its additional details and rationale should aid in teaching. It should also help researchers in understanding undefined behavior, developing better tools, and perhaps finding alternative approaches to undefined behavior.

2. Goals of Undefined Behavior Annex

3. Benefits of Undefined Behavior Annex

4. Implementation of Undefined Behavior Annex

The Standard's implementation-defined behavior index is currently implemented using a macro, \indeximpldef. We envision implementing an undefined behavior annex in a similar fashion, for example using an \undefbehavior macro. The advantage of this approach would be ease of maintenance, since the annex would be self-maintaining: when new proposals introduce or remove undefined behavior, it would be a matter of adding or removing markup.

The one additional feature I would add is an example for each undefined behavior, which would also be part of the annex. This could be implemented as part of the \undefbehavior macro or via a second macro, for example \undefbehaviorexample.

5. Goals of Undefined Behavior TR

6. Benefits of Undefined Behavior TR

7. How would the Undefined Behavior TR relate to the Core Guidelines

The C++ Core Guidelines are focused on "relatively high-level issues", which is appropriate for a document that seeks to "help people to use modern C++ effectively". Undefined behavior itself may be one high-level topic and does deserve specific mention in the Core Guidelines. The Undefined Behavior TR would be more focused, drilling into each specific core undefined behavior with details. The undefined behavior TR would therefore inform the Core Guidelines.

8. What about Standard Library Undefined Behavior?

Undefined behavior is a large topic. To make it a more tractable problem, we believe tackling Core undefined behavior separately from Library undefined behavior makes sense. Core and Library already have separate processes, and tackling them separately will allow those with expertise in Core or Library to focus on those areas respectively. This proposal focuses specifically on Core; while acknowledging that documenting Library undefined behavior is important, we leave that to a future proposal.

9. How Might the TR look

There has been some effort to document core undefined behavior, and below I provide an example of one approach to an undefined behavior TR. This work covers most of the explicit core undefined behavior, with at least one example for each undefined behavior. To a lesser extent it covers rationales, backgrounds, and tools:

9.1. [lex]

9.1.1. [lex.phases]

9.1.2. [lex.string]

9.2. [basic]

9.2.1. [basic.def.odr]

9.2.2. [basic.life]

9.2.3. [basic.indet]

9.2.4. [basic.start]

9.3. [expr]

9.3.1. [expr.pre]

9.3.2. [conv.double]

9.3.3. [conv.fpint]

9.3.4. [expr.call]

9.3.5. [expr.static.cast]

9.3.6. [expr.delete]

9.3.7. [expr.mptr.oper]

9.3.8. [expr.mul]
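
As a minimal sketch of the kind of example an entry here might carry (not the paper's own text): for the built-in `/` and `%` operators on integers, a zero second operand is undefined behavior, as is a quotient that is not representable in the result type. The helper name `checked_div` is hypothetical, and `std::optional` assumes C++17.

```cpp
#include <limits>
#include <optional>

// Hypothetical helper: performs lhs / rhs only when the result is
// defined, and returns std::nullopt for the undefined cases instead
// of evaluating them.
std::optional<int> checked_div(int lhs, int rhs) {
    if (rhs == 0) {
        return std::nullopt;  // division by zero is undefined
    }
    if (lhs == std::numeric_limits<int>::min() && rhs == -1) {
        return std::nullopt;  // quotient is not representable in int
    }
    return lhs / rhs;  // well-defined; truncates toward zero
}
```

Callers then handle the empty result explicitly rather than relying on whatever the hardware happens to do for the undefined cases.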

9.3.9. [expr.add]

9.3.10. [expr.shift]
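
As a minimal sketch of an example that could accompany this entry (not the paper's own text): a shift is undefined behavior when the right operand is negative, or greater than or equal to the width of the promoted left operand. The helper name `checked_shl` is hypothetical, and `std::optional` assumes C++17.

```cpp
#include <limits>
#include <optional>

// Hypothetical helper: returns value << amount only when the shift
// amount is in range; otherwise returns std::nullopt instead of
// evaluating an expression with undefined behavior.
std::optional<unsigned> checked_shl(unsigned value, int amount) {
    // For an unsigned type, digits equals the width in bits.
    constexpr int width = std::numeric_limits<unsigned>::digits;
    if (amount < 0 || amount >= width) {
        return std::nullopt;  // value << amount would be undefined here
    }
    return value << amount;
}
```

For example, `checked_shl(1u, 3)` holds 8, while a negative amount or an amount equal to the width yields an empty result.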

9.3.11. [expr.ass]

9.4. [stmt.stmt]

9.4.1. [stmt.return]

9.4.2. [stmt.dcl]

9.5. [dcl.dcl]

9.5.1. [dcl.type.cv]

9.5.2. [dcl.attr.contract.syn]

9.5.3. [dcl.attr.contract.syn]

9.5.4. [dcl.attr.contract.check]

9.5.5. [dcl.attr.noreturn]

9.6. [class]

9.6.1. [class.mfct.non-static]

9.6.2. [class.dtor]

9.6.3. [class.union]

9.6.4. [class.abstract]

9.6.5. [class.base.init]

9.6.6. [class.cdtor]

10. Acknowledgement

Thanks to JF Bastien for his review.