ISO/IEC JTC1/SC22/WG14 N732

The meaning of "implementation-defined"

Clive D.W. Feather


Abstract
========

Discussion at the last meeting showed that there were divergent views
on the meaning of the term "implementation-defined". This paper
examines this matter and proposes changes to resolve the issue.


Discussion
==========

The Standard defines three significant terms (presented here out of
order):

    3.11 Implementation-defined behavior

    Behavior, for a correct program construct and correct data, that
    depends on the characteristics of the implementation and that each
    implementation shall document.

    3.19 Unspecified behavior

    Behavior, for a correct program construct and correct data, for
    which this International Standard explicitly imposes no
    requirements.

    3.18 Undefined behavior

    Behavior, upon use of a nonportable or erroneous program construct,
    of erroneous data, or of indeterminately valued objects, for which
    this International Standard imposes no requirements. Permissible
    undefined behavior ranges from ignoring the situation completely
    with unpredictable results, [...]

First consider the term "unspecified behavior". Most commentators on
the Standard are of the opinion that this has the following properties:

(1) There are a number of possible courses of action, or the behavior
    is one that generates a result and then has a number of possible
    results.

(2) The implementation can make any of the available choices, and can
    make different choices at different places or times.

(3) The implementation need not document its choices.

(4) No matter what choice the implementation makes, it cannot affect
    anything outside the range of that choice. If a value has to be
    chosen, it must be a valid value for that type.

Property number 4 is the interesting one: it is usually taken to mean
that the implementation cannot generate a spurious signal, branch to a
random place in the code, or choose a trap representation. All of
these, of course, are valid "undefined behavior".
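
As an illustration of properties 1, 2, and 4 (this example is not taken
from the Standard or from the discussion at the meeting), the order in
which the arguments of a function call are evaluated is a commonly
cited case of unspecified behavior. The sketch below uses hypothetical
helper names (next_value, difference, order_dependent); it is meant
only to show that the result is confined to a small set of
possibilities, whichever order the implementation picks.

```c
#include <assert.h>

static int counter = 0;

/* Hypothetical helper: returns 1, 2, 3, ... on successive calls. */
static int next_value(void)
{
    return ++counter;
}

/* Hypothetical helper: the result depends on which argument is which. */
static int difference(int a, int b)
{
    return a - b;
}

/* The two calls to next_value() may be evaluated in either order,
   so the result is either 1 - 2 or 2 - 1 (properties 1 and 2). */
int order_dependent(void)
{
    counter = 0;
    return difference(next_value(), next_value());
}

/* Property 4: whichever order is chosen, the result stays within
   the set of possibilities and nothing else is affected. */
void check_order_dependent(void)
{
    int r = order_dependent();
    assert(r == -1 || r == 1);
}
```

A program may call order_dependent() on any implementation, but it may
rely only on the result being one of the two values, not on which one.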

This interpretation, and particularly property number 4, is usually
assumed to be what is meant by:

    for a correct program construct and correct data

At this point, it should be noted that these words are perhaps not the
best ever written. It has been claimed, in other contexts, that they
mean that the construct must be correct, and if it is not the
implementation is not constrained. Since that interpretation would make
unspecified behavior indistinguishable from undefined behaviour, I
reject it. But the wording should be improved.

Now consider the definition of "implementation-defined behavior".
Clearly, this is similar to unspecified behavior, but carries the
words:

    that depends on the characteristics of the implementation and that
    each implementation shall document

The most obvious reading is that property 3 above does not apply, and
should be replaced by:

(3A) The implementation must document its choices.

Given the similarities in wording otherwise, I conclude that it was
intended that properties 1, 2, and 4 still apply.

At this point it should be noted that there are really two separate
situations where implementation-defined behavior occurs. In the first
(for example, whether plain char is signed or unsigned), property 4
remains desirable; no matter what choice is made, a program should be
able to safely use the construct. In the second (for example, the
result of left-shifting a negative value), there are implementations
that wish to generate a visible exception or invoke behavior outside
the range of that property. It is the wish to do the latter that has
led to the belief that "implementation-defined behavior can be
undefined behavior provided it's in the manual as such".

Since both these types of "implementation-defined behavior" are of use
within the Standard, we should explicitly have both. This proposal
introduces the term "implementation-limited behavior" for the second
type.
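
The two situations can be sketched in C as follows. This is only an
illustration under stated assumptions: the function names are
hypothetical, and the second helper assumes that unsigned int has no
padding bits. The first pair of functions shows a choice that a program
can safely live with either way; the last shows a program avoiding the
second kind of behavior altogether by doing the shift in unsigned
arithmetic.

```c
#include <assert.h>
#include <limits.h>

/* First kind: the implementation chooses, and either choice is safe.
   Returns 1 if plain char is signed here, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Works the same under either choice: going through unsigned char
   always yields a value in the range 0..UCHAR_MAX. */
int char_to_index(char c)
{
    return (int)(unsigned char)c;
}

/* Second kind: rather than left-shifting a possibly negative int,
   convert to unsigned first. Assumes unsigned int has no padding
   bits, so its width is sizeof(unsigned int) * CHAR_BIT. */
unsigned int shift_left_bits(int value, unsigned int count)
{
    if (count >= sizeof(unsigned int) * CHAR_BIT)
        return 0u; /* treat an oversized count as shifting everything out */
    return (unsigned int)value << count;
}

/* Checks that hold whichever choices the implementation has made. */
void check_examples(void)
{
    assert(char_to_index('A') == 'A');
    assert(shift_left_bits(-1, 1u) == UINT_MAX - 1u);
}
```

The first two functions can be used on every implementation; the third
exists precisely because a program that shifted the negative value
directly might not run correctly everywhere.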

Note that programs that contain implementation-defined (or unspecified)
behavior are conforming to all conforming implementations, though their
output will vary, while programs that contain implementation-limited
(or undefined) behavior can only be run on some subset of conforming
implementations.


Proposal
========

Replace 3.19 ("unspecified behavior") with:

    3.19 Unspecified behavior

    Behavior where this International Standard provides two or more
    possibilities and imposes no requirements on which is chosen in
    any instance. An otherwise correct program containing unspecified
    behaviour shall execute correctly on all implementations.

Replace 3.11 ("implementation-defined behavior") with:

    3.11 Implementation-defined behavior

    Unspecified behavior where each implementation shall document how
    the choice is made.

Add a new definition:

    3.X Implementation-limited behavior

    Behavior that depends on the characteristics of the implementation
    and that each implementation shall document. A program that
    contains implementation-limited behaviour need not execute
    correctly on a given implementation.

In clause 4 paragraph 1, change:

    It shall not produce output dependent on any unspecified,
    undefined, or implementation-defined behaviour ...

to:

    It shall not produce output dependent on any unspecified,
    undefined, implementation-defined, or implementation-limited
    behaviour ...

and in paragraph 4 change:

    An implementation shall be accompanied by a document that defines
    all implementation-defined characteristics and all extensions.

to:

    An implementation shall be accompanied by a document that defines
    all implementation-defined and implementation-limited
    characteristics and all extensions.

In 5.1.1.3 paragraph 1, change:

    ... even if the behavior is also explicitly specified as undefined
    or implementation-defined.

to:

    ... even if the behavior is also explicitly specified as undefined,
    implementation-defined, or implementation-limited.


Application
===========

Change the following uses of "implementation-defined" to
"implementation-limited":

- 6.1.3.4 paragraph 10 (character constants)
- 6.1.3.4 paragraph 11 (wide character constants)
- 6.1.3.4 examples 3 and 4 (character constants)
- 6.3.2.3 paragraph 5 (type punning in unions)
- 6.3.4 paragraph 4 (certain pointer casts)
- 6.8.6 paragraph 1 (#pragma)
- 6.8.9 paragraph 1 (pragma operator)
- 7.10 paragraph 4 (semantics of signals)
- 7.12.4.4 paragraph 3 (tmpnam called too often)
- 7.12.6.2 p specifier second use (fscanf())
- 7.12.10.4 paragraph 2 (perror())
- 7.13.4.5 paragraph 2 (system())
- 7.18.2.2 p specifier second use (fwscanf())
- N739 item 9a

Change the following uses of "implementation-defined" to
"implementation-defined or implementation-limited":

- 6.3 paragraph 4 (certain operators in expressions)

Change the following uses of "undefined" to "implementation-limited":

- N723
- N729