SC22/WG14 N790

The meaning of "implementation-defined"

Clive D.W. Feather
clive@demon.net
1997-10-21


Abstract
========

N732 proposed new definitions for "undefined" and "implementation-defined",
and added a new term "implementation-limited". It then proposed that various
items that are currently implementation-defined should become
implementation-limited.

After discussion at the Menlo Park meeting, it was agreed that the new
definitions of "undefined" and "implementation-defined" were useful, though
some adjustments were required. However, the new "implementation-limited"
concept was disliked; instead, case-by-case changes should be made to the
body of the Standard.


Discussion
==========

The Standard defines two significant terms:

    3.11 Implementation-defined behavior

    Behavior, for a correct program construct and correct data, that
    depends on the characteristics of the implementation and that each
    implementation shall document.

    3.19 Unspecified behavior

    Behavior, for a correct program construct and correct data, for which
    this International Standard explicitly imposes no requirements.

First consider the term "unspecified behavior". Most commentators on the
Standard are of the opinion that this has the following properties:

(1) There are a number of possible courses of action, or the behavior is
    one that generates a result and then has a number of possible results.

(2) The implementation can make any of the available choices, and can make
    different choices at different places or times.

(3) The implementation need not document its choices.

(4) No matter what choice the implementation makes, it cannot affect
    anything outside the range of that choice. If a value has to be chosen,
    it must be a valid value for that type.

Property number 4 is the interesting one: it is usually taken to mean that
the implementation cannot generate a spurious signal, branch to a random
place in the code, or choose a trap representation.
All of these, of course, are valid "undefined behavior".

This interpretation, and particularly property number 4, is usually assumed
to be what is meant by:

    for a correct program construct and correct data

At this point, it should be noted that these words are perhaps not the best
ever written. It has been claimed, in other contexts, that they mean that
the construct must be correct, and that if it is not, the implementation is
not constrained. Since that interpretation would make unspecified behavior
indistinguishable from undefined behavior, I reject it. But the wording
should be improved.

Now consider the definition of "implementation-defined behavior". Clearly,
this is similar to unspecified behavior, but carries the words:

    that depends on the characteristics of the implementation and that
    each implementation shall document

The most obvious reading is that property 3 above does not apply, and
should be replaced by:

(3A) The implementation must document its choices.

Given the similarities in wording otherwise, I conclude that it was
intended that properties 1, 2, and 4 still apply.

There are a number of places that are "implementation-defined behavior" but
where it is desirable to give the implementation more flexibility,
including the ability to violate property 4. Rather than introduce a new
concept, these places have been specifically identified and new wording
suggested.


Proposal
========

Part 1
------

Replace 3.19 ("unspecified behavior") with:

    3.19 Unspecified behavior

    Behavior where this International Standard provides two or more
    possibilities and imposes no requirements on which is chosen in any
    instance. An otherwise correct program, operating on correct data,
    containing unspecified behavior shall nonetheless be a correct program
    and act in accordance with subclause 5.1.2.3.

Replace 3.11 ("implementation-defined behavior") with:

    3.11 Implementation-defined behavior

    Unspecified behavior where each implementation shall document how the
    choice is made.
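
The distinction between the two terms can be sketched in C. What follows is
a minimal illustration only, not text from the Standard or from N732; the
names note, add, unspecified_order_demo and plain_char_is_signed are
hypothetical and exist only for this example. It assumes the usual reading
that the order of evaluation of function arguments is unspecified, and that
the signedness of plain char is implementation-defined.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical bookkeeping: records the order in which note() is called. */
static int order[2];
static int count;

static int note(int id)
{
    if (count < 2)
        order[count] = id;
    count++;
    return id;
}

static int add(int a, int b)
{
    return a + b;
}

/* Unspecified behavior: the order in which the two arguments of add()
   are evaluated. Either order is permitted, the choice may differ from
   one call to the next, and it need not be documented (properties 1-3). */
static int unspecified_order_demo(void)
{
    int sum;

    count = 0;
    sum = add(note(1), note(2));

    /* Property 4: whichever order was chosen, it is one of the permitted
       orders, and nothing outside that choice is affected. */
    assert(count == 2);
    assert((order[0] == 1 && order[1] == 2) ||
           (order[0] == 2 && order[1] == 1));

    return sum; /* 3 under either order */
}

/* Implementation-defined behavior: whether plain char has the range of
   signed char or of unsigned char. The implementation must document its
   choice (property 3A); the choice is visible through CHAR_MIN. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

A program that relies only on the sum returned by unspecified_order_demo is
unaffected by the choice made; a program that inspects the order array, or
that depends on plain_char_is_signed, has behavior that varies between
implementations, but always within the permitted set of outcomes.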
Part 2
------

In 6.3.2.4 (Structure and union members) paragraph 5, change:

    With one exception, if the value of a member of a union object is used
    when the most recent store to the object was to a different member, the
    behavior is implementation-defined. One special guarantee is made ...

to

  | If the value of a member of a union object is used when the most
  | recent store to the object was to a member with a different type:
  | if the two types are compatible, or are differently qualified versions
  | of compatible types, or the value is one that is required to have the
  | same representation in the two types, then the value retrieved is the
  | same as the value that was stored. Otherwise, with one exception, the
  | value retrieved is determined by the object representation as
  | described in subclause 6.1.2.8, and might be a trap representation
  | (in which case the behavior is undefined).
    One special guarantee is made ...

Part 3
------

Change 6.8.6 (Pragma directive) paragraph 1 from:

    A preprocessing directive of the form

        # pragma pp-tokens/opt new-line

    where the preprocessing token STDC does not immediately follow the
    pragma on the directive causes the implementation to behave in an
    implementation-defined manner. Any such pragma that is not recognized
    by the implementation is ignored.

to:

    A preprocessing directive of the form

        # pragma pp-tokens/opt new-line

    where the preprocessing token STDC does not immediately follow the
    pragma on the directive causes the implementation to behave in a
    manner which it shall document. The behavior might cause translation
    to fail or the resulting program to behave in a non-conforming manner.

and add at the end of the subclause:

    Recommended practice: Any pragma, not beginning with STDC, that the
    implementation does not have a specific meaning for should be ignored.

In subclause 6.8.9 (Pragma operator), change:

    ...
    The original four preprocessing tokens in the unary operator
    expression are replaced by the (possibly empty) implementation-defined
    sequence of preprocessing-tokens that result from that execution.

to:

    ... The original four preprocessing tokens in the unary operator
    expression are removed.

[6.8.6 doesn't allow for preprocessing-tokens being added.]

Part 4
------

In subclause 7.13.4.4 (The tmpnam function) paragraph 3, change:

    If it is called more than TMP_MAX times, the behavior is
    implementation-defined.

to:

    If it is called more than TMP_MAX times, the behavior is undefined.

Part 5
------

In the various *scanf functions, p conversion specifier, delete the words

    The interpretation of the input item is implementation-defined.

Part 6
------

In subclause 7.13.11.4 (The perror function) paragraph 2, change:

    The contents of the error message strings are the same as those
    returned by the /strerror/ function with argument /errno/, which are
    implementation-defined.

to:

    The contents of the error message strings are the same as those
  | returned by the /strerror/ function with argument /errno/.

[They are locale-specific, not implementation-defined.]

Part 7
------

In subclause 7.14.4.5 (The system function), change paragraphs 2 and 3
from:

    Description

    The /system/ function passes the string pointed to by /string/ to the
    host environment to be executed by a /command processor/ in an
    implementation-defined manner. A null pointer may be used for /string/
    to inquire whether a command processor exists.

    Returns

    If the argument is a null pointer, the /system/ function returns
    nonzero only if a command processor is available. If the argument is
    not a null pointer, the /system/ function returns an
    implementation-defined value.

to:

    Description

    If /string/ is a null pointer, the /system/ function determines
    whether the host environment has a /command processor/.
    If /string/ is not a null pointer, the /system/ function passes the
    string pointed to by /string/ to that command processor to be executed
    in a manner which the implementation shall document; this might then
    cause the program calling /system/ to behave in a non-conforming
    manner or to terminate.

    Returns

    If the argument is a null pointer, the /system/ function returns
    nonzero only if a command processor is available. If the argument is
    not a null pointer and the /system/ function does return, it returns
    an implementation-defined value.

Part 8
------

In subclause 6.3 (Expressions) paragraph 4, change:

    These operators return values that depend on the internal
    representations of integers, and thus have implementation-defined
    aspects for signed types.

to:

    These operators return values that depend on the internal
    representations of integers, and have implementation-defined and
    undefined aspects for signed types.
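
The two kinds of "aspects" named in Part 8 can be sketched with the shift
operators. This is a minimal illustration under stated assumptions, not
part of the proposal: it takes the rules as they appear in the later
published C99 text (right shift of a negative value gives an
implementation-defined result; left shift of a negative value, or one whose
result is not representable, is undefined), and it assumes int has no
padding bits. The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Implementation-defined aspect: the result of right-shifting a negative
   value. The implementation documents whether the sign bit is propagated,
   so no particular result is asserted here. */
static int shift_right_negative(void)
{
    int x = -8;
    return x >> 1;
}

/* For unsigned operands the result does not depend on how signed
   integers are represented. */
static unsigned shift_right_unsigned(void)
{
    unsigned u = 8u;
    return u >> 1; /* always 4 */
}

/* Undefined aspect: a left shift of a negative value, a shift by a
   negative or too-large count, or a left shift whose result is not
   representable. This hypothetical helper refuses those cases instead of
   evaluating them. Returns 1 and stores the result on success, 0 if the
   shift would not be well defined.
   Assumption: int has no padding bits, so its width is
   sizeof(int) * CHAR_BIT. */
static int checked_shift_left(int value, int n, int *out)
{
    const int width = (int)(sizeof(int) * CHAR_BIT);

    if (value < 0 || n < 0 || n >= width)
        return 0;
    if (value > (INT_MAX >> n))
        return 0; /* value * 2^n would exceed INT_MAX */

    *out = value << n;
    return 1;
}
```

The design choice is to test the operands before shifting, because once an
undefined shift has been evaluated there is no portable way to detect it
afterwards.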