SC22/WG14 N861
1998-11-03

Issues with CD2 (FCD1)

Clive Feather
clive@demon.net

The following 13 items represent further issues with CD2. They are in order
of location within the CD.

========

[Item 01]

Category: Inconsistency
Committee Draft subsection: 6.2.5, 6.5.3.4, 6.7
Title: Issues with prototypes and completeness.
Detailed description:

6.2.5p23 says "An array type of unknown size is an incomplete type". Is the
type "int [*]" (which can only occur within a prototype) complete or
incomplete?

If it is complete, then what is its size? This can occur in the construct

    int f (int a [sizeof (int [*][*])]);

If it is incomplete, then the type "int [*][*]" is not permitted, which is
clearly wrong.

Now consider the prototype:

    int g (int a []);

The parameter clearly has an incomplete type, but since a parameter is an
object (see 3.16) this is forbidden by 6.7p7:

    If an identifier for an object is declared with no linkage, the
    type for the object shall be complete by the end of its declarator,
    or by the end of its init-declarator if it has an initializer.

This is also clearly not what was intended.

One way to fix the first item would be to change 6.5.3.4p1 to read:

    [#1] The sizeof operator shall not be applied to an expression
    that has function type or an incomplete type, to
 || an array type with unspecified size 72a), to
    the parenthesized name of such a type, or to an lvalue that
    designates a bit-field object.

 || 72a) An array type with unspecified size occurs in function
 || prototypes when the notation [*] is used, as in "int [*][5][*]".

One way to fix the second item would be to change 6.7p7 to read:

    If an identifier for an object is declared with no linkage, the
    type for the object shall be complete by the end of its declarator,
    or by the end of its init-declarator if it has an initializer;
 || in the case of function arguments (including in prototypes) this shall
 || be after making the adjustments of 6.7.5.3 (from array and function
 || types to pointer types).
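The two constructs discussed in Item 01 can be seen side by side in the following sketch. It assumes a compiler that implements variable length arrays and the [*] notation as they appear in the CD; the names sum2d, first and first_ptr are hypothetical and do not come from the draft.

```c
#include <stddef.h>

/* Prototype only: [*] marks array bounds of unspecified size.
   After the 6.7.5.3 adjustment the parameter is a pointer to a row. */
long sum2d(size_t rows, size_t cols, int m[*][*]);

/* Declared with an array type of unknown size, adjusted to int *. */
int first(int a[]);

/* A definition has to name the bounds; [*] is not permitted here. */
long sum2d(size_t rows, size_t cols, int m[rows][cols])
{
    long total = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += m[i][j];
    return total;
}

int first(int a[])
{
    return a[0];
}

/* Accepted without a cast because the parameter type of first
   has already been adjusted from int [] to int *. */
int (*first_ptr)(int *) = first;
```

Calling sum2d(2, 3, m) with an object declared as int m[2][3] shows the prototype and the definition agreeing on the adjusted parameter type, which is the reading that the proposed change to 6.7p7 makes explicit.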
========

[Item 02]

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 6.2.5, 6.7
Title: Problems with flexible array members
Detailed description:

Sometime after CD1 the following wording was added to 6.2.5p23:

    A structure type containing a flexible array member is an
    incomplete type that cannot be completed.

Presumably this was done to eliminate some conceptual problems with
structures that contain such members. However, this change makes almost
all use of such structures forbidden, because it is no longer possible to
take their size, and it is unclear what other operations are valid. This
was also not the intent behind the original proposal.

On the other hand, if such a structure is a complete type, there are a
number of issues to be defined, such as what happens when the structure is
copied or initialized. These need to be addressed.

The wording defining flexible array members is in 6.7.2.1p15:

    [#15] As a special case, the last element of a structure
    with more than one named member may have an incomplete array
    type. This is called a flexible array member, and the size
    of the structure shall be equal to the offset of the last
    element of an otherwise identical structure that replaces
    the flexible array member with an array of unspecified
    length.95) When an lvalue whose type is a structure with a
    flexible array member is used to access an object, it
    behaves as if that member were replaced with the longest
    array, with the same element type, that would not make the
    structure larger than the object being accessed; the offset
    of the array shall remain that of the flexible array member,
    even if this would differ from that of the replacement
    array. If this array would have no elements, then it
    behaves as if it had one element, but the behavior is
    undefined if any attempt is made to access that element or
    to generate a pointer one past it.
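The intended use of such a structure, which depends on being able to take its size, can be sketched as follows. This assumes the structure is treated as a complete type (the reading this item argues for). The structure s with members n and d follows the example in 6.7.2.1; the function s_alloc is a hypothetical helper, not part of the draft.

```c
#include <stddef.h>
#include <stdlib.h>

/* Structure with a flexible array member, as in the 6.7.2.1 example. */
struct s {
    int n;
    double d[];
};

/* Hypothetical helper: allocate a struct s with room for count elements
   of d. This relies on sizeof being applicable to struct s. */
struct s *s_alloc(int count)
{
    struct s *p;

    if (count < 0)
        return NULL;
    p = malloc(sizeof *p + (size_t)count * sizeof p->d[0]);
    if (p != NULL) {
        p->n = count;
        for (int i = 0; i < count; i++)
            p->d[i] = 0.0;      /* elements 0 .. count-1 are accessible */
    }
    return p;
}
```

If the structure type were incomplete, the sizeof expression in s_alloc would violate the constraint in 6.5.3.4p1, which is the practical consequence described above.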
A solution to the problem is to leave the structure as complete but have
the flexible member ignored in most contexts. To do this, delete the last
sentence of 6.2.5p23, and change 6.7.2.1p15 as follows:

    [#15] As a special case, the last element of a structure
    with more than one named member may have an incomplete array
 || type. This is called a flexible array member. With two
 || exceptions the flexible array member is ignored. Firstly, the
 || size of the structure shall be equal to the offset of the last
    element of an otherwise identical structure that replaces
    the flexible array member with an array of unspecified
 || length.95) Secondly, when the . or -> operator has a left
 || operand which is, or is a pointer to, a structure with a flexible
 || array member and the right operand names that member, it
    behaves as if that member were replaced with the longest
    array, with the same element type, that would not make the
    structure larger than the object being accessed; the offset
    of the array shall remain that of the flexible array member,
    even if this would differ from that of the replacement
    array. If this array would have no elements, then it
    behaves as if it had one element, but the behavior is
    undefined if any attempt is made to access that element or
    to generate a pointer one past it.

Finally, add further example text after 6.7.2.1p18:

    The assignment:

        *s1 = *s2;

    only copies the member n, and not any of the array elements.
    Similarly:

        struct s  t1 = { 0 };          // valid
        struct s  t2 = { 2 };          // valid
        struct ss tt = { 1, { 4.2 }};  // valid
        struct s  t3 = { 1, { 4.2 }};  // error; there is nothing
                                       // for the 4.2 to initialize
        t1.n = 4;                      // valid
        t1.d [0] = 4.2;                // undefined behavior

========

[Item 03]

Category: Inconsistency
Committee Draft subsection: 6.2.5, 6.7.2.2
Title: Circular definition of enumerated types
Detailed description:

6.7.2.2 para 4 says:

    Each enumerated type shall be compatible with an integer type.
However, 6.2.5 para 17 says:

    The type char, the signed and unsigned integer types, and the
    enumerated types are collectively called integer types.

Thus we have a circular definition. To fix this, change the former to one
of:

    Each enumerated type shall be compatible with a signed or unsigned
    integer type.

or:

    Each enumerated type shall be compatible with a standard integer type
    or an extended integer type.

========

[Item 04]

Category: Clarification
Committee Draft subsection: 6.2.6.2
Title: Clarify aspects of negative zeros and related situations
Detailed description:

Subclause 6.2.6.2p2 makes it clear that there are only three permitted
representations for signed integers - two's complement, one's complement,
and sign-and-magnitude. It is reported, however, that certain
implementations have problems with the "minus zero" representation;
furthermore, because signed and unsigned integer types have the same
representation over the common value range, it is useful to know when a
minus zero can appear.

The suggested change is to alter the last part of this paragraph:

    If the sign bit is one, then the value shall be modified in one of
    the following ways:
    -- the corresponding value with sign bit 0 is negated;
    -- the sign bit has the value -2^N;
    -- the sign bit has the value 1-2^N.

to:

    If the sign bit is one, then the value shall be modified in one of
    the following ways:
    -- the corresponding value with sign bit 0 is negated (/sign and
       magnitude/);
    -- the sign bit has the value -2^N (/two's complement/);
    -- the sign bit has the value 1-2^N (/one's complement/).
    The implementation shall document which shall apply, and whether the
    value with sign bit 1 and all value bits 0 (for the first two), or
    with sign bit and all value bits 1 (for one's complement) is a trap
    representation or a normal value. In the case of sign and magnitude
    and one's complement, if this representation is a normal value it is
    called a /negative zero/.
    If the implementation supports negative zeros, then they shall only
    be generated by:
    - the & | ^ ~ << and >> operators with appropriate arguments;
    - the + - * / and % operators where one argument is a negative zero
      and the result is zero;
    - compound assignment operators based on the above cases.
    It is unspecified if these cases actually generate negative zero or
    normal zero, and whether a negative zero becomes a normal zero or
    remains a negative zero when stored in an object.

    If the implementation does not support negative zeros, the behavior
    of an & | ^ ~ << or >> operator with appropriate arguments is
    undefined.

========

[Item 05]

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 6.3.2.3
Title: Null pointer constants should be castable to pointer types
Detailed description:

6.3.2.3p3 says that:

    If a null pointer constant is assigned to or compared for equality to
    a pointer, the constant is converted to a pointer of that type. Such
    a pointer, called a null pointer, ...

However, this doesn't cover cases such as:

    (char *) 0

which is neither an assignment nor a comparison. Therefore this is not a
null pointer constant, but rather an implementation-defined conversion
from an integer to a pointer. This is clearly an oversight and should be
fixed.

Either change:

    If a null pointer constant is assigned to or compared for equality to
    a pointer, the constant is converted to a pointer of that type. Such
    a pointer, called a null pointer, is guaranteed to compare unequal to
    a pointer to any object or function.

to:

    If a null pointer constant is assigned to or compared for equality to
    a pointer, the constant is converted to a pointer of that type. When
    a null pointer constant is converted to a pointer, the result (called
    a /null pointer/) is guaranteed to compare unequal to a pointer to
    any object or function.
or change:

    If a null pointer constant is assigned to or compared for equality to
    a pointer, the constant is converted to a pointer of that type. Such
    a pointer, called a null pointer, is guaranteed to compare unequal to
    a pointer to any object or function.

    [#4] Conversion of a null pointer to another pointer type yields a
    null pointer of that type. Any two null pointers shall compare equal.

to:

    If a null pointer constant is assigned to or compared for equality to
    a pointer, the constant is converted to a pointer of that type.

    A /null pointer/ is a special value of any given pointer type that is
    guaranteed to compare unequal to a pointer to any object or function.
    Conversion of a null pointer constant to a pointer type, or of a null
    pointer to another pointer type, yields a null pointer of that type.
    Any two null pointers shall compare equal.

========

[Item 06]

Category: Inconsistency
Committee Draft subsection: 6.4
Title: UCNs as preprocessing-tokens
Detailed description:

In 6.4 the syntax for "preprocessing-token" includes:

    identifier
    each universal-character-name that cannot be one of the above

In 6.4.2.1 the syntax for "identifier" includes:

    identifier:
        identifier-nondigit
        identifier identifier-nondigit
        identifier digit

    identifier-nondigit:
        nondigit
        universal-character-name
        other implementation-defined characters

Therefore a universal-character-name is always a valid identifier
preprocessing token, and so the second alternative can never apply. It is
true that 6.4.2.3p3 makes certain constructs undefined, but this does not
alter the tokenisation.

There are two ways to fix this situation. The first is to delete the
second alternative for preprocessing-token. The second would be to add
text to 6.4p3, or as a footnote, along the following lines:

    The alternative "each universal-character-name that cannot be one of
    the above" can never occur in the initial tokenisation of a program
    in translation phase 3.
    However, if an identifier includes a universal-character-name that
    is not listed in Annex I, the implementation may choose to
    retokenise using this alternative.

========

[Item 07]

Category: Normative change where the intent is unclear
Committee Draft subsection: 6.7.5.2
Title: Side effects in VLAs
Detailed description:

There has been a long discussion on both the reflector and comp.std.c
about the issue of side effects in VLA types. I do not intend to repeat
all the arguments for and against. However, after considering all the
issues it seems to me that the problems are to do with function prototypes
more than they are to do with VLAs or side effects. In particular, code
such as:

    int n;
    /* ... */
    int vla [n++];

is both meaningful and easy to compile; the problems come with constructs
like:

    void f (int n, int a [n++]);

Therefore the changes proposed below implement the following principles:

* If a declarator or an abstract-declarator is not part of a
  parameter-declaration, any expressions occurring in variably modified
  types are evaluated in the normal way, including all function calls and
  side effects.

* If the declarator or abstract-declarator *is* part of a
  parameter-declaration, any expressions within array declarators are
  *not* evaluated, and thus the behaviour is as if all such expressions
  were replaced with "*" (though the latter would be forbidden in a
  function definition).

An optional addition to the proposal enforces the latter rule with a
constraint that forbids side effects and function calls in such
expressions.

Recommended changes:

In 6.7.5.2, add a new heading and paragraph before the constraints:

    Definitions

    The direct-declarator or direct-abstract-declarator which forms the
    array declarator can occur in one of three contexts, each of which is
    given a name:

    (1) It is derived from the parameter-type-list of the declarator of
        a function definition. This is a /parameter-array-declarator/.

    (2) It is derived from some other parameter-type-list.
        This is a /prototype-array-declarator/.

    (3) It is not derived from a parameter-type-list. This is an
        /actual-array-declarator/.

In the existing paragraph 3, replace the two sentences:

    If the size expression is not a constant expression, and it is
    evaluated at program execution time, it shall evaluate to a value
    greater than zero. It is unspecified whether side effects are
    produced when the size expression is evaluated.

with:

    If the size expression is part of a parameter-array-declarator or of
    an actual-array-declarator, and is not a constant expression, it
    shall be evaluated at program execution time (in the former case on
    each entry to the function) and shall evaluate to a value greater
    than zero. Otherwise it shall not be evaluated.

Optional addition: add a further constraint:

    A parameter-array-declarator or prototype-array-declarator shall not
    contain a function call operator, nor shall it contain any operator
    which can modify an object.

========

[Item 08]

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 6.8.5
Title: Error in new for syntax
Detailed description:

C9X adds a new form of syntax for for statements:

    for ( declaration ; expr-opt ; expr-opt ) statement

However, 6.7 states that /declaration/ *includes* the trailing semicolon.
The simplest solution is to remove the corresponding semicolon in 6.8.5
and not worry about the informal use of the term in 6.8.5.3p1.
Alternatively the syntax needs to be completely reviewed to allow the term
to exclude the trailing semicolon.

========

[Item 09]

Category: Inconsistency
Committee Draft subsection: 6.9
Title: References to sizeof not allowing for VLAs
Detailed description:

6.9p3 and p5 use sizeof without allowing for VLAs.
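The reason VLAs matter here is that sizeof evaluates its operand when the operand has a variable length array type, and does not evaluate it otherwise. The following sketch shows the difference; the function names are hypothetical, and it assumes an implementation that follows the sizeof rule for VLA operands described in 6.5.3.4.

```c
#include <stddef.h>

/* The operand is a VLA type, so the size expression is evaluated at
   run time and *n is incremented as a side effect. */
size_t sizeof_vla_type(int *n)
{
    return sizeof(int [(*n)++]);
}

/* The operand is an ordinary int expression, so it is not evaluated
   and *n is left unchanged. */
size_t sizeof_plain_expr(int *n)
{
    return sizeof((*n)++);
}
```

With n initially 3, the first function yields 3 * sizeof(int) and leaves n equal to 4, while the second yields sizeof(int) and does not touch n. An identifier that appears in an evaluated sizeof operand is therefore genuinely used.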
In each case, change the parenthetical remark:

    (other than as a part of the operand of a sizeof operator)

to:

    (other than as a part of the operand of a sizeof operator which is
    not evaluated)

========

[Item 10, based on PC-UK0027]

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 6.10.3
Title: Problems with extended characters in object-like macros
Detailed description:

When an object-like macro is #defined, there is no requirement for a
delimiter between the macro identifier and the replacement list. This can
be a problem when extended characters are involved - for example, some
implementations view $ as valid in a macro identifier while others do not.
Thus the line:

    #define THIS$AND$THAT(x) ((x)+42)

can be parsed in either of two ways:

    Identifier       Arguments   Replacement list
    THIS             -           $AND$THAT(x) ((x)+42)
    THIS$AND$THAT    x           ((x)+42)

TC1 addressed this by requiring the use of a space in certain
circumstances so as to eliminate the ambiguity. However, this requirement
has been removed in C9X for good reasons. Regrettably this reintroduces
the original ambiguity.

The simplest solution seems to be to require a space - or possibly one of
the basic graphic characters - between the identifier and the replacement
list. This change is unlikely to affect much code, and is not a Quiet
Change - it will require a diagnostic for any affected code. It has the
big advantage of eliminating the ambiguity.

Insert a new Constraint in 6.10.3, either:

    In the definition of an object-like macro there shall be white space
    between the identifier and the replacement list.

or

    In the definition of an object-like macro there shall be white space
    between the identifier and the replacement list unless the
    replacement list begins with one of the 26 graphic characters in the
    required character set other than ( _ or \.
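The way the character immediately after the macro name decides between the two parses can be shown without relying on $ in identifiers. The sketch below uses two hypothetical macros that differ only in the white space after the name, and stringizes their expansions so that the outcome can be inspected; STR, XSTR and the function names are illustrative only.

```c
#include <string.h>

#define STR(x)  #x
#define XSTR(x) STR(x)   /* expand the argument, then stringize it */

/* No white space after the name: a function-like macro with
   parameter x and replacement list ((x)+42). */
#define THIS_AND_THAT(x) ((x)+42)

/* White space after the name: an object-like macro whose replacement
   list is everything that follows, starting at the "(x)". */
#define THIS_OR_THAT (x) ((x)+42)

/* Text of the function-like expansion with argument 1. */
const char *function_like_text(void)
{
    return XSTR(THIS_AND_THAT(1));
}

/* Text of the object-like expansion. */
const char *object_like_text(void)
{
    return XSTR(THIS_OR_THAT);
}

/* Nonzero if the text still contains the letter x. */
int mentions_x(const char *text)
{
    return strchr(text, 'x') != NULL;
}
```

In the function-like case the parameter has been substituted and no x remains in the expansion; in the object-like case x is ordinary replacement text and survives. An implementation that stops the identifier at $ makes the same kind of choice for THIS$AND$THAT, which is why the proposed constraint asks for an explicit delimiter.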
========

[Item 11]

Category: Feature that should be included
Committee Draft subsection: 7.8
Title: Missing functions for intmax_t values
Detailed description:

Several utility functions have versions for types int and long int, and
when long long was added corresponding versions were added. Then when
intmax_t was added to C9X, further versions were provided for some of
these functions. However, three cases were missed. For intmax_t to be
useful to the same audience as other features of the Standard, these three
functions should be added. Obviously they should be added to <inttypes.h>.

Add a new subclause 7.8.3:

    7.8.3 Miscellaneous functions

    7.8.3.1 The atoimax function

    Synopsis

        #include <inttypes.h>
        intmax_t atoimax(const char *nptr);

    Description

    The atoimax function converts the initial portion of the string
    pointed to by nptr to intmax_t representation. Except for the
    behaviour on error, it is equivalent to

        strtoimax(nptr, (char **)NULL, 10)

    The function atoimax need not affect the value of the integer
    expression errno on an error. If the value of the result cannot be
    represented, the behavior is undefined.

    Returns

    The atoimax function returns the converted value.

    7.8.3.2 The imaxabs function

    Synopsis

        #include <inttypes.h>
        intmax_t imaxabs(intmax_t j);

    Description

    The imaxabs function computes the absolute value of an integer j. If
    the result cannot be represented, the behavior is undefined.

    Returns

    The imaxabs function returns the absolute value.

    7.8.3.3 The imaxdiv function

    Synopsis

        #include <inttypes.h>
        imaxdiv_t imaxdiv(intmax_t numer, intmax_t denom);

    Description

    The imaxdiv function computes numer/denom and numer%denom in a single
    operation.

    Returns

    The imaxdiv function returns a structure of type imaxdiv_t,
    comprising both the quotient and the remainder. The structure shall
    contain (in either order) the members quot (the quotient) and rem
    (the remainder), each of which has the type intmax_t. If either part
    of the result cannot be represented, the behavior is undefined.
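A sketch of how the three functions might be used together is given below. It assumes an <inttypes.h> that already provides strtoimax, imaxabs, imaxdiv and imaxdiv_t with the signatures proposed above; atoimax_sketch is a hypothetical stand-in for the proposed atoimax, written in terms of the equivalence stated in its description, and parse_abs_div is an illustrative helper.

```c
#include <inttypes.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed atoimax, using the stated
   equivalence with strtoimax (error behaviour aside). */
intmax_t atoimax_sketch(const char *nptr)
{
    return strtoimax(nptr, (char **)NULL, 10);
}

/* Parse text, take the absolute value, and split it into quotient and
   remainder by denom. The caller must pass a non-zero denom and text
   whose value and absolute value are representable. */
imaxdiv_t parse_abs_div(const char *text, intmax_t denom)
{
    intmax_t value = imaxabs(atoimax_sketch(text));
    return imaxdiv(value, denom);
}
```

For example, parse_abs_div("-47", 10) yields a quotient of 4 and a remainder of 7, all computed in intmax_t without narrowing to long or long long.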
7.8 paragraph 2 will need consequential changes.

========

[Item 12]

Category: Clarification
Committee Draft subsection: 7.19.5.1
Title: Clarify meaning of a failed fclose
Detailed description:

If a call to fclose() fails it is not clear:
- whether it is still possible to access the stream;
- whether fflush(NULL) will attempt to flush the stream.

It is probably best to take the view that fclose always "closes" the
stream as far as the program is concerned, whether or not the process was
fully successful.

The existing wording - read strictly - also requires the full list of
actions to be carried out successfully whether or not the call fails. This
is clearly an oversight.

Change 7.19.5.1p2 to read:

    A successful call to the fclose function causes the stream pointed to
    by stream to be flushed and the associated file to be closed. Any
    unwritten buffered data for the stream are delivered to the host
    environment to be written to the file; any unread buffered data are
    discarded. Whether or not the call succeeds, the stream is
    disassociated from the file, and if the associated buffer was
    automatically allocated, it is deallocated.

========

[Item 13]

Category: Correction restoring original intent
Committee Draft subsection: 7.23.3.7
Title: Wrong time system notation used
Detailed description:

In 7.23.3.7p2, the expression "UTC-UT1" appears. This should read
"TAI-UTC".

--
Clive D.W. Feather   | Regulation Officer, LINX | Work:
Tel: +44 1733 705000 | (on secondment from      | Home:
Fax: +44 1733 353929 |  Demon Internet)         |
Written on my laptop; please observe the Reply-To address