Submitter: Freek Wiedijk and Robbert Krebbers (Radboud University Nijmegen, The Netherlands)
Submission Date: 2013-08-30
Source: WG 14
Reference Document: N1747
Date: April 2014
Subject: Instability of uninitialized automatic variables
The standard is unclear about the following questions:
unsigned char x; /* intentionally uninitialized */ printf("%d\n", x); printf("%d\n", x);
Does the standard allow an implementation to let this code print two different values? And if so, if we insert either of the following three statements
x = x; x += 0; x *= 0;
between the declaration and the
printf statements, is this
behavior still allowed?
Or alternatively, can these
printf statements exhibit undefined
behavior instead of having to print a reasonable number.
The standard is unclear about these questions.
On the one hand the committee response to Defect Report #260 strongly suggests that the committee decided that the standard implies the answer to question 1 to be "yes". (Although Defect Report #260 applies to the C99 standard and hence has been superseded by the C11 standard, no modification to the standard text was deemed necessary at the time, and all relevant text in the C11 standard is identical to that in the C99 standard.) The relevant quote from the committee response to Defect Report #260 is:
In the case of an indeterminate value [...] the actual bit-pattern may change without direct action of the program.
A subtlety is that Defect Report #260 talks about bit-patterns and not about values,
but for variables of type
unsigned char there is a one-to-one correspondence between
bit-patterns and values.
Another argument in favor of "instability" of indeterminate values is that values can "become indeterminate" (e.g. 184.108.40.206p5, 6.2.4p2, and 6.2.4p6). In these cases the value of an object may also change without an explicit store (and can keep changing?)
On the other hand, 6.7.9p10 states that the kind of uninitialized variables that we are discussing get an indeterminate value. From 3.19.2 it follows that if a type has no trap values, then indeterminate and unspecified values are the same. And in 3.19.3, it is stated explicitly that an unspecified value is chosen. Which implies that the value - after having been chosen - cannot change anymore.
Another argument against "instability" is that 6.8p3 states that "the values are stored in the objects (including storing an indeterminate value in objects without an initializer) each time the declaration is reached in the order of execution", and that 6.2.4p2 states that "An object [...] retains its last-stored value throughout its lifetime." The only way that one could read this in light of Defect Report #260 is if "retaining an indeterminate value" is read as meaning that the indeterminateness of the value is retained, without the value having a specific value.
It seems attractive to make a distinction between indeterminate values that are allowed to change without direct action of the program in the way that Defect Report #260 interpreted the standard, and unspecified values that do not have this property. However the current text of 3.19.2 does not allow for this interpretation. Also, probably some instances of "indeterminate" and "unspecified" would need to be changed for such an interpretation to make sense. (For example in 220.127.116.11p6 "the bytes of the object representation that correspond to any padding bytes take unspecified values." should probably become "... take indeterminate values.")
The reason for question 3 is that if the kind of "instability"
that questions 1 and 2 ask for is allowed to propagate maximally, then it
becomes impossible to implement
printf in C itself.
When converting an indeterminate value to a string of output characters, the
value can keep changing underneath, and the code cannot protect itself
On the other hand, if library functions exhibit undefined behavior on these kinds
of "unstable" uninitialized values, then an
fwrite of a struct with uninitialized
padding bytes would also give undefined behavior.
The fact that one wants to be able to copy uninitialized padding bytes in
memcpy without undefined behavior is the reason that using the
value of an uninitialized object is not undefined behavior.
This seems to suggest that an
fwrite of a struct with uninitialized padding
bytes should not exhibit undefined behavior.
We see three reasonable sets of answers to these questions:
Easy to repair the unclarity in the standard. Just add text that explicitly states that indeterminate values cannot change without direct action from the program. This will prevent people from invoking the response to Defect Report #260 from then on.
Restricts the kind of optimizations compilers are allowed to perform.
Specifically, "unstable" values will also propagate through function calls. Also, after
x *= 0;
the value of
x still will be "unstable" and hence
still can be any byte, and will not necessary be
Gives compilers more freedom to perform optimizations.
Is Defect Report #260-compliant (i.e., "the committee did not change its mind").
Needs more modifications to the text of the standard. It will then be necessary to make an explicit distinction between "indeterminate non-trap value" and "unspecified value".
fwriteshould be immune from this)
Restricts program behaviors least, giving compilers even more freedom.
Needs even more modification to the text of the standard.
Needs a decision on what library functions will not have undefined behavior when working on indeterminate values.
This is certainly not compatible with the current version of the standard, as no undefined behavior of this kind related to library functions is described there.
In 6.2.4p2, change "An object exists, has a constant address, and retains its last-stored value throughout its lifetime." to "An object exists and has a constant address throughout its lifetime. The value of an object is retained, including the object representation, until some other value is stored into it, or until the moment when the value becomes indeterminate (at which moment it is replaced with an indeterminate value, and after which that value is retained again)."
In 3.19.2 change "either an unspecified value or a trap representation" to "either an unspecified value or a trap representation, which can change arbitrarily without direct action from the program".
In 6.2.4p2, change "An object exists, has a constant address, and retains its last-stored value throughout its lifetime." to "An object exists, has a constant address, and retains its last-stored value (provided this value is not indeterminate), throughout its lifetime."
At the end of 6.5p1 add "If at least one of the operands of an operator is indeterminate, the result of the operator is also indeterminate."
Some instances of "indeterminate" and "unspecified" (to be determined) should be replaced by respectively "unspecified" and "indeterminate". See for example the instance in 18.104.22.168p6 mentioned earlier.
The changes for resolution (b), and also:
In 7.1.4p1 add: "If a function is called with an indeterminate value, the behavior is undefined."
In a selection (to be determined) of functions from the library, add text that counters this general statement added to 7.1.4p1.
Apr 2014 meeting
The author provided N1793 and an accompanying presentation N1818 in which his position changed to believing that "wobbly" values are not actually defined by the standard, and after discussion agreed that the following committee response would be an acceptable resolution.
Proposed Committee Response