SC22/WG14 N847 1998-09-04 Issues with CD2 Clive Feather clive@demon.net The following 43 items represent issues with CD2. Regrettably, many of them were also issues with CD1 and do not seem to have been addressed; where possible the item has been rewritten to explain the problem better. Items 1 to 42 are in order of location within CD2. [Item 01, based on PC-UK0021] Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 4 Title: Further requirements on the conformance documentation Detailed description: The Standard requires an implementation to be accompanied by documentation of various items. However, there is a subtle difference between the terms "implementation-defined" and "described by the implementation" which has been missed by this wording (this is partly due to the tightening up of the uses of this term between C89 and C9X - see for example subclause 6.10.6). As a result, the wording does not actually require the latter items to be documented. Change the paragraph to: An implementation shall be accompanied by a document that describes all features that this International Standard requires to be described by the implementation, including all implementation-defined characteristics and all extensions. ======== [Item 02, based on PC-UK0001] Category: Editorial change/non-normative contribution Committee Draft subsection: 5.1.1.2 Title: Error in applying working paper N673 Detailed description: When N673 was applied to the draft, a new footnote was erroneously left out. The following footnote should be included, with a reference at the end of translation phase 2: [*] Thus the physical source lines (delimited by | characters): |\\\| || |n| generate the logical source lines: |\\| |n| and a source file may end with a backslash followed by two physical newlines, which will generate a last logical source line ending in a backslash. 
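The splicing behaviour that the missing footnote in Item 02 describes can be modelled with a small sketch of translation phase 2. This is only an illustration, not how any particular translator works, and the function name splice_lines is invented here. It makes a single left-to-right pass, so a backslash that ends up in front of a new-line only because an earlier pair was deleted is not spliced again.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src to dst, deleting every backslash that is
   immediately followed by a new-line character, together with that
   new-line. dst must have room for strlen(src) + 1 bytes.
   Returns the length of the result. */
size_t splice_lines(char *dst, const char *src)
{
    size_t n = 0;

    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;            /* drop the backslash/new-line pair */
        } else {
            dst[n++] = *src++;   /* ordinary character: keep it */
        }
    }
    dst[n] = '\0';
    return n;
}
```

Given two backslashes, a new-line, an empty line and then a line containing n, the sketch leaves a logical line that still ends in a backslash, followed by the line containing n.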
======== [Item 03] Category: Feature that should be included Committee Draft subsection: 5.1.1.2, 5.2.2, 6.4.4.4 Title: provide a \s character Detailed description: Translation phase 5 states that if the execution character set cannot represent a character in the source set, it is converted to "an implementation-defined member" (of the execution character set). It would be useful to have access to this character in a consistent manner, and the escape sequence \s ("substitute") is proposed for this purpose. Change translation phase 5 (5.1.1.2p1) to end: if there is no corresponding member, it is converted to the character represented by \s. In 5.2.2p2, add an entry to the list: \s (substitute) Produces a visible indication that a source character was used that does not correspond to a member of the execution character set. The active position is advanced as for a graphic character. In 6.4.4.4, add \s to "simple-escape-sequence" in p1 and to the list in p8. ======== [Item 04, based on PC-UK0027] Category: Inconsistency Committee Draft subsection: 5.2.1 plus scattered other changes Title: inconsistencies in use of "basic" and "extended" character sets Detailed description: [Please note: this is *not* a UCN issue.] The Standard uses the terms "basic character set" and "extended character set" at various places. However, the exact meaning of these two is not clear, and this leads to confusion. Consider the UTF-8 encoding (codes from 0 to 127 are single byte, codes from 128 to 255 form part of multibyte characters with length from 2 to 6 bytes). At execution time there are five "interesting" character sets based on this encoding: [1] The 95 characters required by 5.2.1p3, plus the null character. [2] The 128 single byte characters. [3] The 2**31 multibyte characters. [4] Set [3] minus set [1]. [5] Set [3] minus set [2]. (and of course the corresponding source sets). 
It is unclear whether the "basic character set" means [1] or [2] or something else, and as a result the Standard has to use circumlocutions such as "the required characters". Looking at the various places where the term is used has led me to believe that it is most useful to have terms for [1] and for [4], while there is little or no need to refer to any of the others. Therefore it would be logical for "basic character set" to represent [1] (the set of characters required in all implementations) and "extended character set" to represent [4] (any other characters provided by the specific implementation). This requires the following changes: Replace 5.2.1p1, second sentence, by: Each set is further divided into a /basic character set/, whose contents are given by this subclause, and an /extended character set/, consisting of zero or more locale-specific members (which are not members of the basic character set). [Note that this defines the two terms.] In 5.2.1p3, delete "at least" in the first sentence, and in the fourth sentence change "In the execution character set" to "In the basic execution character set". Replace 5.2.1.2p1, first bullet, by: - The basic character set shall be present and shall be encoded using single-byte characters. In 6.2.5p3, replace "required source character set enumerated in 5.1.2" with "basic execution character set". (Note that the execution set is more sensible in this context than the source set.) In 6.4.2.1p3, change "that are not part of the required source character set" to "that are in the extended source character set". In 6.4.3p2 and p3, change "required" to "basic". In 6.4.4.4p8 change "required" to "basic". Change 7.1.1p2 to: A /letter/ is one of the 52 lowercase and uppercase letters in the basic execution character set. All letters are printing characters. In Annex Ip2, delete "required". In Annex K.2p1, third bullet, change "required" to "basic". In Annex K.2p1, fifth bullet, delete "required". 
In Annex K.3.4p1, fourth bullet, change "required" to "basic". Change K.4p1, first bullet, to: - Any members of the extended execution character set (5.2.1). Change K.4p1, second bullet, to: - The presence, meaning, and representation of any multibyte characters in the extended execution character set (5.2.1.2). In Annex K.5.2p1, change "required" to "basic". ======== [Item 05, based on PC-UK0015] Category: Feature that should be included Committee Draft subsection: 5.2.4.2.1 Title: ensure int can hold all characters Detailed description: A number of functions in the standard library (particularly in <ctype.h> and <stdio.h>) assume that an int is capable of holding every possible unsigned char value. If this is not the case then these functions will, to say the least, behave in a peculiar manner. While it is arguable that this implies that int must be able to hold every unsigned char value, it would be better to make this an explicit requirement on the implementation. To do so, append to 5.2.4.2.1p2: On a hosted implementation, INT_MAX shall be not less than UCHAR_MAX. Note that this does *not* forbid char and int from being the same type. ======== [Item 06, based on PC-UK0047] Category: Request for information/clarification Committee Draft subsection: 6.10 Title: Parsing ambiguity in preprocessing directives Detailed description: Consider parsing the following text during the preprocessing phase (translation phase 4):
# if 0
xxxx
# else
yyyy
# endif
The third line fits the syntax for the first option of group-part, and thus generates two possible parsings. One of these will cause both text lines to be skipped, while the other only causes the first to be skipped. It is easy to fix this ambiguity. In the syntax in 6.10p1, change group-part to: group-part: non-directive new-line if-section control-line and add: non-directive: pp-tokens/opt Then add a new paragraph to the Constraints, after 6.10p3: The first preprocessing-token (if any) in a non-directive shall not be /#/. 
Finally, delete 6.10.3p8, because this can no longer occur. Note that this change has the added benefit of making it clear that unknown preprocessing directives require a diagnostic and do not affect conditional inclusion. ======== [Item 07, based on PC-UK0049] Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 6.10.1 Title: Handling of UCNs in character constants in #if directives Detailed description: Consider the line: #if '\u0024' < 100 where dollar is in the single-byte execution character set. It is not completely clear from 6.10.1p3 that the UCN is converted to a single character, since this normally happens in translation phase 5 and it is not specifically mentioned here. In 6.10.1p3 (near the end), change: ... which may involve converting escape sequences into execution character set members. to: ... which may involve converting escape sequences and universal character names into execution character set members in the manner of translation phase 5. ======== [Item 08, based on PC-UK0071] Category: Inconsistency Committee Draft subsection: 6.10.2 Title: Clarify included file process Detailed description: 6.10.2p3 ends: If this search is not supported, or if the search fails, the directive is reprocessed as if it read #include <h-char-sequence> new-line with the identical contained sequence (including > characters, if any) from the original directive. The wording is technically incorrect, precisely because the original directive could contain angle brackets within the quotes whereas an h-char-sequence cannot. Better wording would be: If this search is not supported, or if the search fails, the directive is reprocessed as if it read #include <h-char-sequence> new-line with the identical contained sequence from the original directive (if the q-char-sequence contains a > character, this is retained in the name searched for even though it could not appear in a true h-char-sequence). 
======== [Item 09, based on PC-UK0052] Category: Feature that should be included Committee Draft subsection: 6.10.3 Title: Add a __VA_COUNT__ facility for varargs macros Detailed description: Unlike with function calls, it is trivial for an implementation to determine the number of arguments that match the ... in a varargs macro. There are a number of useful things that can be done with this (at the least, providing argument counts to varargs functions). Therefore this information should be made available to the macro expansion. In 6.10.3p5, change The identifier /__VA_ARGS__/ ... to: The identifiers /__VA_ARGS__/ and /__VA_COUNT__/ ... Append to 6.10.3.1p2: An identifier /__VA_COUNT__/ that occurs in the replacement list shall be replaced by a single token which is the number of trailing arguments (as a decimal constant) that were merged to form the variable arguments. ======== [Item 10, based on PC-UK0054] Category: Other: C++ conflict avoidance Committee Draft subsection: 6.10.8 Title: Require that __cplusplus not be defined Detailed description: Add to 6.10.8 a new paragraph 5: The implementation shall not predefine the macro /__cplusplus/, nor shall it define this macro in any header defined in clause 7. This change was agreed by the full committee at the Menlo Park meeting, but seems to have been lost. ======== [Item 11, based on PC-UK0169] Category: Feature that should be included Committee Draft subsection: 6.10.8 Title: provide a __STDC_HOSTED__ macro Detailed description: There is currently no way for a program to determine if the implementation is hosted or freestanding. A standard predefined macro should be provided. Add to the list in 6.10.8p1: __STDC_HOSTED__ The decimal constant 0 if the implementation is a freestanding one and the decimal constant 1 if it is a hosted one. 
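A program might consult the macro proposed in Item 11 along the following lines. This is a minimal sketch that assumes the spelling __STDC_HOSTED__ and the values 0 and 1 given above; the function name is_hosted is invented for the illustration, and the #else branch covers translators that do not provide the macro at all.

```c
/* Hypothetical helper: report what the translator says about itself.
   Returns 1 for a hosted implementation, 0 for a freestanding one,
   and -1 if the macro is not predefined. */
int is_hosted(void)
{
#if defined(__STDC_HOSTED__)
    return __STDC_HOSTED__;
#else
    return -1;
#endif
}
```

The same test can be applied directly in an #if directive to decide, for example, whether to include headers that are only required of hosted implementations.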
======== [Item 12, based on PC-UK0024] Category: Editorial change/non-normative contribution Committee Draft subsection: 6.2.5 Title: Replace footnote 25 Detailed description: Footnote 25 is unclear in the context in which it appears (implementation-defined types). The wording of footnote 29 explains what is meant much more clearly, and can be applied to both situations. Replace the text of footnote 25 with that of footnote 29, and change all references to the latter to be references to the former. ======== [Item 13, based on PC-UK0050] Category: Inconsistency Committee Draft subsection: 6.2.6.1, 6.5.2.3 Title: Effects on other members of assigning to a union member Detailed description: 6.5.2.3p5 has wording concerning the storing of values into a union member: With one exception, if the value of a member of a union object is used when the most recent store to the object was to a different member, the behavior is implementation-defined. When this wording was written, "implementation-defined" was interpreted more loosely and there was no other relevant wording concerning the representation of values. Neither of these is the case anymore. The requirement to be implementation-defined means that an implementation must ensure that all stored values are valid in the types of all the other members, and eliminates the possibility of them being trap representations. It also makes it practically impossible to have trap representations at all. This is not the intention of other parts of the Standard. It turns out that the wording of 6.2.6.1 is sufficient to explain the behavior in these circumstances, and the cited wording in 6.5.2.3 merely muddles the issue. It should be removed; the rest of the paragraph can stand alone. 
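The situation discussed in Item 13 looks like this in code. The sketch stores through one union member and reads through another; the member that is read is an array of unsigned char, chosen here on the assumption that such a read simply yields the bytes of the object representation and cannot encounter a trap representation. The type and function names are invented for the illustration.

```c
#include <string.h>

union word {
    unsigned int  u;
    unsigned char bytes[sizeof(unsigned int)];
};

/* Returns 1 if reading `bytes` after a store to `u` gives exactly the
   object representation of the stored value, 0 otherwise. */
int bytes_match_representation(unsigned int value)
{
    union word w;
    unsigned char expect[sizeof(unsigned int)];

    w.u = value;                           /* most recent store is to u */
    memcpy(expect, &value, sizeof value);  /* representation of value   */
    return memcmp(w.bytes, expect, sizeof expect) == 0;
}
```

Reading through a member of some other type, such as a floating type, is the case where the bytes stored might not represent a valid value of the type being read.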
======== [Item 14] Category: Improved terminology (technically normative) Committee Draft subsection: 6.2.4, plus scattered other changes Title: better terminology for object lifetimes Detailed description: The term "lifetime" is used at a few places in the Standard but never defined. Meanwhile a number of places use circumlocutions such as "while storage is guaranteed to be reserved". These would be much easier to read if the term "lifetime" was defined and used. Make the following changes to subclause 6.2.4. Delete paragraph 5 and insert a new paragraph between 1 and 2: The /lifetime/ of an object is the portion of program execution during which storage is guaranteed to be reserved for that object. An object exists and retains its last-stored value throughout its lifetime. Objects with static or automatic storage duration have a constant address throughout their lifetime.23 If an object is referred to outside its lifetime, the behavior is undefined. The value of a pointer is indeterminate after the end of the lifetime of the object it points to. Change paragraphs 2 to 4 (which will become 3 to 5) to: [#2] An object whose identifier is declared with external or internal linkage, or with the storage-class specifier static, has static storage duration. The lifetime of the object is the entire execution of the program. Its stored value is initialized only once. [#3] An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration. For objects that do not have a variable length array type, the lifetime extends from entry into the block with which it is associated until execution of the block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively a new object is created each time. 
The initial value of the object is indeterminate; if an initialization is specified for the object, it is performed each time the declaration is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached. [#4] For objects that do have a variable length array type, the lifetime extends from the declaration of the object until execution of the program leaves the scope of that declaration24. If the scope is entered recursively a new object is created each time. The initial value of the object is indeterminate. Other changes: In 5.1.2p1 change "in static storage" to "with static storage duration". Change footnote 9 to: 9) In accordance with 6.2.4, a call to exit will remain within the lifetime of objects with automatic storage duration declared in main but a return from main will end their lifetime. Delete 5.1.2.3p5 as it just duplicates material in 6.2.4p3-4. Change the last portion of 6.5.2.5p17 to: of the loop only, and on entry next time around p would be pointing to an object outside of its lifetime, which would result in undefined behavior. Change the last portion of footnote 72 to: and the address of an automatic storage duration object after the end of its lifetime. Change the first sentence of 6.7.3.1p5 to: Here an execution of B means the lifetime of a notional object with type /char/ and automatic storage duration associated with B. Add to 7.20.3 a second paragraph: The lifetime of an object allocated by the calloc, malloc, or realloc functions extends from the function call until the object is freed by the free or realloc functions. The object has a constant address throughout its lifetime except when moved by a call to the realloc function. The last sentence of 7.20.3p1 is redundant and could be deleted. Relevant bullet points in annex K should also be changed. 
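The three kinds of lifetime described in Item 14 can be contrasted in a short sketch. It assumes the proposed definitions above; the function names are invented for the illustration, and the invalid case (returning the address of an automatic object) is described in a comment rather than executed.

```c
#include <stdlib.h>

/* Static storage duration: the lifetime is the entire execution of the
   program, so the address may be returned and used by the caller. */
int *static_counter(void)
{
    static int count = 0;
    ++count;
    return &count;
}

/* Allocated storage: the lifetime runs from the successful malloc call
   until the object is passed to free. Returns NULL on failure. */
int *make_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}

/* Automatic storage duration: the lifetime of `local` ends when the
   block ends, so returning &local would give the caller a pointer whose
   value is indeterminate. The value itself is returned instead. */
int automatic_value(void)
{
    int local = 42;
    return local;
}
```

A caller of make_int is responsible for ending the lifetime of the object by calling free on the returned pointer.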
======== [Item 15] Category: Editorial change/non-normative contribution Committee Draft subsection: 6.4.3 Title: reword the list of forbidden UCNs Detailed description: Change 6.4.3p2 to read: A universal-character-name shall not specify (in either form) a character short identifier less than 000000A0 other than: 00000024 00000040 00000060 or in the range 0000D800 to 0000DFFF inclusive. This wording makes it easier to understand the restriction, because it is not necessary to cross-reference the list in 5.2.1 and then determine the UCNs of those characters. ======== [Item 16, based on PC-UK0026] Category: Editorial change/non-normative contribution Committee Draft subsection: 6.4.5 Title: improve the example of character string literals Detailed description: Append to 6.4.5p7, the example: When this is used to initialize a static array, the array has three members that are initialized to /18/, the value of /'3'/, and /0/ respectively. ======== [Item 17, based on PC-UK0036] Category: Normative change where the intent is unclear Committee Draft subsection: 6.5.16 Title: Define the result of the assignment operator Detailed description: 6.5.16p3 states: An assignment expression has the value of the left operand after the assignment, but is not an lvalue. Two interpretations have been put on this wording: * the value of the assignment expression is the value that will also be stored in the left operand ("same-value" semantics); * the value of the assignment expression is the result of reading the left operand after storing the value in it ("write-then-read" semantics). These two have different results when the left operand is a volatile object that can be changed by external causes (such as a clock or a memory-mapped device register). This ambiguity needs to be resolved. Consider the code: int x; extern volatile int system_timer; // precision of 1 microsecond extern volatile int serial_port; // writing sends a word, reading // returns the next word received // ... 
x = system_timer = 42; // statement 1
serial_port = 66; // statement 2
With same-value semantics, statement 1 will set x to 42 and statement 2 will send the value 66 to the serial port. With write-then-read semantics, statement 1 will set x to some other value (the change in the timer between writing to it and reading it back). More important, though, are the effects of statement 2 in write-then-read semantics. Because an expression statement is evaluated for its side effects, it is reasonable to require the value of the assignment expression to be determined before being thrown away (in particular, there is *no* statement in the Standard as to when the value of the assignment expression is or is not evaluated). This means that statement 2 always has the side effect of reading a word from the serial port, and there is no way to write without reading. Assuming that same-value semantics are intended, replace the cited words by: The value of the assignment expression is the value stored into the left operand, but is not an lvalue. If write-then-read semantics are intended, replace the cited words by something along the lines of: The value of the assignment expression is the result of reading the left operand after the value has been stored in it [*], but is not an lvalue. If the value of the assignment expression is not used as an operand in another expression, it is unspecified whether or not the left operand is actually read. [*] Thus if the left operand has volatile-qualified type and can be changed by external means, the value of the expression might not be the same as the value stored. [I do *not* claim that these later words actually have the desired effect.] ======== [Item 18, based on PC-UK0033] Category: Editorial change/non-normative contribution Committee Draft subsection: 6.5.2.2 Title: Fix wording relating to "number of arguments" Detailed description: 6.5.2.2p2 states "the number of arguments shall agree with the number of parameters". 
This does not clearly take account of varargs functions. Similarly the second sentence does not allow for the trailing arguments of varargs functions. Change the paragraph to: If the expression that denotes the called function has a type that includes a prototype, the number of arguments shall agree with the number of parameters (that is, if the prototype contains an ellipsis there shall be at least as many arguments as parameters, otherwise there shall be the same number of arguments as parameters). Each argument that corresponds to a declared parameter shall have a type such that its value may be assigned to an object with the unqualified version of the type of its corresponding parameter. ======== [Item 19, based on PC-UK0003] Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 6.5.2.2, 7.15.1.1 Title: Adjustment to permitted incompatible argument types Detailed description: At the Menlo Park meeting we agreed to amend 6.5.2.2p6 to permit incompatible parameter and argument types in certain cases where the representation is required to be the same. The cases permitted form the two bullet points at the end of the paragraph. However, the second case was intended to be slightly wider than the wording appearing in the draft; it should have been: - both types are pointers to qualified or unqualified versions of /void/ or of character types. (in other words, it should be possible to pass unsigned char * values to parameters of type char * as well as of void *). The same change needs to be made in 7.15.1.1p2. ======== [Item 20] Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 6.5.3.4 Title: Forbid sizeof bit-fields when not lvalues Detailed description: Consider the expression: sizeof func().bit_field This is currently not forbidden by 6.5.3.4p1 because it is not an lvalue. This is clearly an oversight. 
Change 6.5.3.4p1 to: The sizeof operator shall not be applied to an expression that has function type, an incomplete type, a bit-field type, or to the parenthesized name of any such type. ======== [Item 21, based on PC-UK0017] Category: Editorial Committee Draft subsection: 6.5.9 Title: tidy up changes to pointer comparison Detailed description: Though the wording is mostly correct, 6.5.9p6 does not completely cover every case and does not make it clear that it is exhaustive. Append to 6.5.9p6: Otherwise they shall compare unequal. Append to footnote 80: Two different subobjects of an object are not "the same object" and pointers to them compare unequal. ======== [Item 22, based on PC-UK0040] Category: Normative change to intent of existing feature Committee Draft subsection: 6.7.2.1 Title: Bitfields of unsupported types should require a diagnostic. Detailed description: If a bitfield is declared with a type other than /_Bool/ or plain, signed, or unsigned int, the behavior is undefined. Since this can easily be determined at compile time, a diagnostic should be required. It is reasonable to exempt other integer types that the implementation knows how to handle. Add to the end of 6.7.2.1p3: A bit-field shall have a type that is a qualified or unqualified version of /_Bool/, /signed int/ or /unsigned int/, or of some other implementation-defined integer type. Delete the first sentence of 6.7.2.1p8. Note that this wording allows additional implementation-defined bitfield types so long as they are integers. If they are not, the behaviour would not be defined by the Standard and so a diagnostic should still be required. An implementer can also allow non-integer bitfield types, but a diagnostic is still required. 
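For illustration, the declarations below use only the bit-field types named in the constraint proposed in Item 22. The structure and function names are invented; a member such as "float ratio : 4;" is the kind of declaration that would require a diagnostic under the proposed wording.

```c
/* Bit-fields declared with the types named in the proposed constraint. */
struct flags {
    unsigned int mode  : 3;   /* can hold 0 to 7          */
    signed int   delta : 4;   /* can hold at least -7..+7 */
    _Bool        ready : 1;   /* can hold 0 or 1          */
};

/* Returns 1 if each bit-field reads back the value stored in it. */
int flags_roundtrip(void)
{
    struct flags f;

    f.mode  = 5;
    f.delta = -3;
    f.ready = 1;
    return f.mode == 5 && f.delta == -3 && f.ready == 1;
}
```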
======== [Item 23, based on PC-UK0007] Category: Other: outstanding problem Committee Draft subsection: 6.7.3.1 Title: Problem with restrict and string literals Detailed description: Consider any function which takes two char * parameters where one of them is restrict-qualified and a call where the corresponding arguments are both string literals. For example: char *s = "test string\n"; printf ("This - %s - is the test string\n", s); Because of the restrict qualification, it is not permitted for the two strings to share storage. However, an implementation is entitled to let the literals do so, quite possibly without the programmer realizing that the situation happened (for example, the first parameter might be a macro defined in a makefile). A similar situation occurs when compound literals share storage; in this case the parameters might have almost any restrict-qualified type. One solution would be to exempt unmodifiable objects from the requirements of restrict. Another would be to adopt alternative semantics for restrict as proposed by the UK (that the object pointed to by a restrict-qualified pointer is either not altered or is only accessed via that pointer). ======== [Item 24, based on PC-UK0042] Category: Editorial change/non-normative contribution Committee Draft subsection: 6.7.4 Title: Clarify some aspects of inline Detailed description: A good inlining implementation can inline calls to the comparison function of qsort and other indirect calls. It should be clearer that this is permitted. In 6.7.4p6, add a footnote referenced at the end of the last but one sentence ("An inline definition provides ... the same translation unit"): [*] The call need not be due to the direct appearance of the name of the function at the point of calling; it may be through some kind of indirection. 
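The kind of indirect call that Item 24 has in mind can be sketched as follows. The comparison function is only ever reached through the pointer handed to qsort, never by a direct call naming it. The function names are invented for the illustration, and the comparison function is given internal linkage (static inline) to keep the example to a single translation unit; whether any given implementation actually inlines such a call is entirely up to that implementation.

```c
#include <stdlib.h>

/* Comparison function declared inline, with internal linkage. */
static inline int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);   /* -1, 0 or +1 without overflow */
}

/* Sorts n ints in ascending order. compare_ints is called only
   indirectly, through the pointer passed to qsort. */
void sort_ints(int *v, size_t n)
{
    qsort(v, n, sizeof v[0], compare_ints);
}
```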
======== [Item 25, based on PC-UK0042] Category: Editorial change/non-normative contribution Committee Draft subsection: 6.7.4 Title: Clarify some aspects of inline Detailed description: The exact relationship between the inline and extern keywords is not obvious, particularly when the extern declaration of an inline function occurs after its definition. For this reason it should be made clearer in the examples. In 6.7.4p8, after: because /fahr/ is also declared with /extern/ add: (even though that declaration is not visible at the definition of /fahr/) ======== [Item 26, based on PC-UK0167] Category: Normative change to intent of existing feature Committee Draft subsection: 6.7.5.2 Title: require side effects in VLA declarations to work normally Detailed description: 6.7.5.2p3 states in part: It is unspecified whether side effects are produced when the size expression is evaluated. This rule will be extremely confusing to the normal programmer. It places an unreasonable burden on anyone who needs to write code with side-effects (particularly if the size is determined via a function call), and it does not offer any significant benefit to the implementation; to see this, consider that, however the implementation handles: int vla [n++][func()]; it must correctly handle the equivalent code: int vla_size [2] = { n++, func () }; int vla [vla_size [0]][vla_size [1]]; Other issues, such as the order of side effects, can be ignored here and handled in the same way as elsewhere in the Standard. See the WG14 archives for a fuller discussion of the topic. Change required: delete this sentence. ======== [Item 27, based on PC-UK0046] Category: Editorial change/non-normative contribution Committee Draft subsection: 6.7.7 Title: Correct ranges of bitfields in an example Detailed description: Example 3 in 6.7.7p6 describes the ranges of various bit-fields in terms of "at least the range". This is because C89 was not clear on what the permitted ranges of integer types were. 
These ranges are now tightly specified by 6.2.6.2, and so the wording of this example should be altered accordingly: - change "at least the range [-15, +15]" to "either the range [-15, +15] or the range [-16, +15]" - change "values in the range [0, 31] or values in at least the range [-15, +15]" to "values in one of the ranges [0, 31], [-15, +15], or [-16, +15]" ======== [Item 28, based on PC-UK0014] Category: Inconsistency Committee Draft subsection: 6.7.8 Title: problems with initializing unsigned char arrays. Detailed description: Consider the following declaration: unsigned char s [] = "\x80\xff"; The first element of the string literal has the value: (char) 128 and the second element has the value: (char) 255 If the type char is signed and CHAR_MAX is less than 128, these two expressions are implementation-defined. In particular, on a ones-complement implementation likely values are -127 and -0 respectively. When these are converted back to unsigned char during the initialization, then (if UCHAR_MAX is 255) they will be converted to 129 and 0 respectively. This is *not* intuitive. Furthermore, while I do not have access to an implementation with ones-complement arithmetic, I suspect that they apply "copy bytes" semantics for initialization, rather than the "double cast" semantics that the strict wording requires. The following changes provide the more intuitive semantics: Append to 6.7.8p14: The value of each element is determined by converting the corresponding numerical representation of the mapped character, or the octal or hexadecimal escape sequence, directly to the array element type, not via the type char. 
Append to example 8 in 6.7.8p32: The declaration: unsigned char c [] = "\xFF"; is identical to: unsigned char c [2] = { 0xFF, 0 }; and not to: unsigned char c [2] = { (unsigned char)(char) 0xFF, 0 }; (the latter could be different if /CHAR_MAX/ is less than 255 and the implementation-defined value of the expression /(char) 0xFF/ is not equal to /254-UCHAR_MAX/). ======== [Item 29, based on PC-UK0072] Category: Feature that should be included Committee Draft subsection: 7.14.1.1, 7.20.4 Title: _exit function Detailed description: As part of a working paper (N789), I suggested that C provide an _exit() function like that in POSIX, and signal handlers should be allowed to call this function. The Menlo Park meeting agreed to add this function unless an unresolvable technical issue was found that would make it not conformant to POSIX - no such issue has been raised. Since the meeting I have made some minor improvements to the wording. In 7.14.1.1p5, change: or the signal handler calls any function in the standard library other than the /abort/ function or the /signal/ function to: or the signal handler calls any function in the standard library other than the /abort/ function, the /_exit/ function, or the /signal/ function Add a new subclause within 7.20.4: 7.20.4.X The _exit function Synopsis #include <stdlib.h> void _exit (int status); Description The /_exit/ function causes normal program termination to occur, and control to be returned to the host environment. No functions registered by the /atexit/ function or signal handlers registered by the /signal/ function are called. The /_exit/ function never returns to the caller. The status returned to the implementation is determined in the same manner as for the /exit/ function. It is implementation-defined whether open output streams are flushed, open streams closed, or temporary files removed. 
======== [Item 30, based on PC-UK0056] Category: Feature that should be included Committee Draft subsection: 7.17 Title: Add a symbol giving the maximum alignment Detailed description: When writing functions that use the results of malloc et al. in a general way (such as malloc wrappers) it is necessary to know what the worst possible alignment is. This value is known to the implementation in order to provide malloc in the first place, but cannot be derived by an application program. Thus it is eminently suitable for standardisation. Typical use might be:

    struct mallocinfo { char *file; unsigned line; time_t time; };

    #define HDRSIZE (((sizeof (struct mallocinfo) - 1) / _ALIGNMENT_ALL \
                      + 1) * _ALIGNMENT_ALL)

    void *my_malloc (size_t n, char *file, unsigned line)
    {
        unsigned char *p = malloc (n + HDRSIZE);
        if (p == NULL)
            return p;
        struct mallocinfo *h = (struct mallocinfo *) p;
        h->file = file;
        h->line = line;
        h->time = time (NULL);
        return p + HDRSIZE;
    }

    void my_free (void *p)
    {
        if (p != NULL)
            free ((unsigned char *) p - HDRSIZE);
    }

[I eventually decided <stddef.h> is the right place for this.] Add a new macro to <stddef.h>: _ALIGNMENT_ALL which expands to an integer constant expression that has type /size_t/, the value of which is the least common multiple of the alignments of all object types.[*] [*] If /p/ has pointer to character type and is suitably aligned for some type /t/, then /(p + _ALIGNMENT_ALL)/ is also suitably aligned for the same type /t/, no matter what /t/ is. ======== [Item 31, based on PC-UK0169] Category: Feature that should be included Committee Draft subsection: 7.17 Title: relax restrictions on the offsetof macro Detailed description: The offsetof macro currently requires its first argument to be a structure type, and is unclear what the second argument is. There is no particular reason to forbid unions for the first argument, nor to forbid complex constructs for the second argument, provided only that the address constant requirement continues to hold. 

In 7.17p3, change "structure" to "structure or union" in two places,
and change:

    The /member-designator/ shall be such that given

to:

    The /member-designator/ may be any construct, provided that given

and add a footnote to the end of the paragraph:

    [*] Thus the member-designator may be a construct like /m [2]/ or
    /a.b.c/. The offset of any member of a union is 0.

========

[Item 32, based on PC-UK0057]

Category: Normative change to intent of existing feature
Committee Draft subsection: 7.19.2, 7.24.3.5, 7.24.6
Title: Better locale handling for wide oriented streams
Detailed description:

7.19.2p6 associates an /mbstate_t/ object with each stream, and
7.19.3p11-13 state that this is used with the various I/O functions. On
the other hand, 7.24.6p3 places very strict restrictions on the use of
such objects, restrictions that cannot be met through the functions
provided in the Standard while allowing convenient use of wide formatted
I/O.

Furthermore, an /mbstate_t/ object is tied to a single locale based on
the first time it is used. This means that a wide oriented stream is
tied to the locale in use the first time it is read or written. This
will be surprising to many users of the Standard.

Therefore, at the very least these objects should be exempt from the
restrictions of 7.24.6; the restrictions of 7.19 (for example, 7.19.2p5
bullet 2) are sufficient to prevent unreasonable behaviour. In addition,
the locale of the object should be tied and not affected by the current
locale. The most sensible way to do this is to use the locale in effect
when the file is opened, but allow /fwide/ to override this.

In 7.19.2p6, add after the first sentence:

    This object is not subject to the restrictions on direction of use
    and of locale that are given in subclause 7.24.6.
    All conversions using this object shall take place as if the
    /LC_CTYPE/ category setting of the current locale is the setting
    that was in effect when the orientation of the stream was set with
    the /fwide/ function or, if this has not been used, when the stream
    was opened with the /fopen/ or /freopen/ function.

In 7.24.3.5, add a new paragraph after paragraph 2:

    If the stream is successfully made wide oriented, the /LC_CTYPE/
    category that is used with the /mbstate_t/ object associated with
    the stream shall be set to that of the current locale.

In 7.24.6p3, append:

    These restrictions do not apply to the /mbstate_t/ objects
    associated with streams.

========

[Item 33, based on PC-UK0058]

Category: Request for information/clarification
Committee Draft subsection: 7.19.4.3
Title: Unclear how many times tmpfile() can be called.
Detailed description:

Nowhere does the Standard state how many times tmpfile() can be called,
nor does it state that several successful calls will actually access
different files!

Append to 7.19.4.3p2:

    The file will be different from any other existing file, including
    any opened by a previous successful call to the /tmpfile/ function.

Add a new part to 7.19.4.3:

    Recommended practice

    It should be possible to open at least /TMP_MAX/ temporary files
    during the lifetime of the program, and no limit on the number
    simultaneously open other than this limit and any limit on the
    number of open streams (FOPEN_MAX). The limit of /TMP_MAX/ could be
    shared with calls to /tmpnam/.

========

[Item 34, based on PC-UK0064]

Category: Request for information/clarification
Committee Draft subsection: 7.19.8.1, 7.19.8.2
Title: Clarify the actions of fread and fwrite
Detailed description:

The exact behaviour of fread and fwrite is not well specified,
particularly on text streams but in actuality even on binary streams.
These changes apply the obvious semantics.
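
One reading of the per-byte semantics that Item 34 asks for can be sketched for /fread/ as below. This is a model for discussion, not the behaviour of any particular library: model_fread and model_fread_demo are hypothetical names, and the choice to count only complete objects when end-of-file arrives part way through one is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of fread: for each object, size calls are made to
   fgetc and the results are stored, in order, in an array of unsigned
   char overlaying the object. */
size_t model_fread(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    unsigned char *bytes = ptr;
    size_t done, i;

    if (size == 0)
        return 0;
    for (done = 0; done < nmemb; done++) {
        for (i = 0; i < size; i++) {
            int c = fgetc(stream);
            if (c == EOF)
                return done;    /* assumption: count complete objects only */
            bytes[done * size + i] = (unsigned char) c;
        }
    }
    return done;
}

/* Usage check: a round trip through a temporary binary file. */
void model_fread_demo(void)
{
    int in[3] = { 1, 2, 3 };
    int out[3] = { 0, 0, 0 };
    size_t written, got, extra;
    FILE *f = tmpfile();

    assert(f != NULL);
    written = fwrite(in, sizeof in[0], 3, f);
    rewind(f);
    got = model_fread(out, sizeof out[0], 3, f);
    extra = model_fread(out, sizeof out[0], 1, f);  /* now at end of file */
    fclose(f);

    assert(written == 3);
    assert(got == 3);
    assert(extra == 0);
    assert(memcmp(in, out, sizeof in) == 0);
}
```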

In 7.19.8.1p2, add after the first sentence:

    For each object, /size/ calls are made to the /fgetc/ function and
    the results stored, in the order read, in an array of /unsigned
    char/ exactly overlaying the object.

In 7.19.8.2p2, add after the first sentence:

    For each object, /size/ calls are made to the /fputc/ function,
    taking the values (in order) from an array of /unsigned char/
    exactly overlaying the object.

========

[Item 35, based on PC-UK0063]

Category: Feature that should be included
Committee Draft subsection: 7.19.9
Title: Provide a way to compare fpos_t values.
Detailed description:

There is no way to determine whether two fpos_t values represent the
same position in a file. Therefore, it is not possible to do operations
such as the following:

  - open a file
  - move through it, looking for some mark
  - note the position using fgetpos()
  - rewind
  - move through it again to the same position, using calls to
    fgetpos() to determine where you are, rather than relying on having
    made exactly the same sequence of reads and seeks

Add a new function to 7.19.9:

    7.19.9.6 The fcmppos function

    Synopsis

        #include <stdio.h>
        struct fcmppos fcmppos (fpos_t* pos1, fpos_t* pos2, FILE *stream);

    Description

    The /fcmppos/ function compares the values pointed to by /pos1/ and
    /pos2/, which must both refer to the stream /stream/. If either of
    the first two arguments is a null pointer, the result of a call to
    the /fgetpos/ function on the stream is used instead. If the stream
    has been written to at any point before the later of the two
    positions, the behaviour is undefined.

    Returns

    The value returned is a structured type containing at least the
    following fields:

        int before;   // Less than, equal to, or greater than zero according
                      // to whether /*pos1/ is before, at the same location
                      // as, or after /*pos2/ in the file.
        int mbstate;  // Zero if and only if the two positions have the same
                      // multibyte parsing status.

It will also be necessary to add /struct fcmppos/ to the start of 7.19.
========

[Item 36, based on PC-UK0061]

Category: Normative change to intent of existing feature
Committee Draft subsection: 7.2.1.1
Title: Explicitly allow assert on non-Boolean arguments
Detailed description:

DR 107 asked questions about the assert macro (it was written when the
parameter type was given as int). Part c asked:

    Must a conforming implementation convert the value yielded by the
    expression given in an invocation of the assert macro to type int
    before checking to see if it compares equal to zero?

and the answer given was "no". Other parts of the response stated:

    Passing a non-int argument in such a context will render the
    translation unit not strictly conforming.

and:

    a violation of this requirement results in undefined behavior

It is clear from these, as well as from a reasoned consideration of
6.10, that the argument to the assert macro is *not* converted to the
required type, but must already have that type. This means that
expressions such as:

    assert (n > 0)     // new problem in CD2 - int is not _Bool
    assert (p != NULL) // new problem in CD2 - int is not _Bool
    assert (1U)        // problem in C89 - unsigned int is not int
    assert (2.5)       // problem in C89 - double is not int

all produce undefined behavior. The change made between CD1 and CD2 has
only exacerbated this, requiring explicit casts of comparisons:

    assert ((_Bool) (n != 0))

The wording changes required to fix this are simple and do not affect
the spirit of assert. The implementation is also trivial - the
definition of the assert macro might need to have "expression" changed
to either "!!(expression)" or "(expression) != 0" where it is tested,
though it is possible that the existing definition might already be
valid.

In 7.2.1.1p1, change "_Bool expression" to "scalar expression", where
the word "scalar" is in italics. Add to paragraph 2, either after the
first sentence or at the end:

    The argument of the /assert/ macro may be any expression with
    scalar type.
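
The "(expression) != 0" form mentioned in Item 36 can be sketched as follows. This is an illustration only, not any implementation's definition of /assert/: the macro SCALAR_ASSERT and the helpers check_report, scalar_assert_failures and scalar_assert_demo are hypothetical names, and the sketch counts and reports failures rather than calling /abort/ so that the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Number of failed checks seen so far (hypothetical bookkeeping). */
static int failure_count;

/* Receives an int that is already 0 or 1, whatever the argument type. */
static void check_report(int ok, const char *text, const char *file, int line)
{
    if (!ok) {
        failure_count++;
        fprintf(stderr, "Assertion failed: %s, file %s, line %d\n",
                text, file, line);
    }
}

/* The comparison with 0 is done in the argument's own scalar type, so
   no conversion of the argument to int or _Bool is needed. */
#define SCALAR_ASSERT(expression) \
    check_report((expression) != 0, #expression, __FILE__, __LINE__)

/* Applies the macro to several scalar types; returns the failure count. */
int scalar_assert_failures(void)
{
    int n = 1;
    const char *p = "text";

    failure_count = 0;
    SCALAR_ASSERT(n > 0);       /* int */
    SCALAR_ASSERT(p != NULL);   /* int */
    SCALAR_ASSERT(p);           /* pointer */
    SCALAR_ASSERT(1U);          /* unsigned int */
    SCALAR_ASSERT(0.25);        /* double; converting to int would give 0 */
    SCALAR_ASSERT(0.0);         /* the only check expected to fail */
    return failure_count;
}

/* Usage check using the standard macro with an int argument. */
void scalar_assert_demo(void)
{
    int failures = scalar_assert_failures();
    assert(failures == 1);
}
```

The 0.25 case shows why the test should be made in the argument's own type: a definition that first converted the argument to int would treat it as false.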
========

[Item 37, based on PC-UK0067]

Category: Other: tidy up (technically normative)
Committee Draft subsection: 7.20
Title: tidy up definitions of macros
Detailed description:

In 7.20p3, change:

    EXIT_SUCCESS
    which expand to integer expressions which ...

to:

    EXIT_SUCCESS
    which expand to integer constant expressions which ...

and change:

    MB_CUR_MAX
    which expands to a positive integer expression whose value ...
    never greater than /MB_LEN_MAX/.

to:

    MB_CUR_MAX
    which expands to a positive integer expression whose type is
    /size_t/ and whose value ... never greater than /MB_LEN_MAX/.
    This is not a constant expression: it may change whenever the
    locale changes.

========

[Item 38, based on PC-UK0070]

Category: Feature that should be included
Committee Draft subsection: 7.22
Title: Type-generic macros should be generally useful
Detailed description:

7.9 introduces the concept of type-generic macros, but these are only
available for a small range of mathematical functions. This facility
should be made generally available so that it can be used for general
programming.

========

[Item 39]

Category: Various (some normative)
Committee Draft subsection: 7.23
Title: various changes to <time.h>
Detailed description:

The following items constitute a number of changes to 7.23 <time.h>.
Some are editorial and some are normative. They are all included in one
place for convenience, though each item stands alone.

Sub-item 1 (editorial):

In 7.23.1p5, "tm_extlen object" should read "tm_extlen member".

Sub-item 2 (editorial):

In 7.23.2.6p2 the list should be closed up and indented, in the same
style as the lists in 7.23.1p4 and p5.

Sub-item 3 (normative):

There is an error in the algorithm in 7.23.2.6p3. The first line of the
expression for D should read:

    D = Y*365 + DIV(Z,400)*97 + MOD(Z,400)/4 - MOD(Z,400)/100 +

Sub-item 4 (editorial):

In footnote 252 "401 B.C.E." should read "401 B.C.".

Sub-item 5 (editorial):

In 7.23.3.5p3, item %C, delete the "(00-99)" that seems to have appeared
since CD1 (the year is not limited to 0 to 9999).

Sub-item 6 (normative):

It would be convenient to provide a way to produce the output generated
in the "C" locale even when in another locale (for example to produce
mixed format output). This could reasonably be done by a "C" modifier
(there is no case where it makes sense to have this combined with the
"E" or "O" modifiers).

In 7.23.3.5p2 add "C" to the list of modifiers, and add a new paragraph
before p4:

    The C modifier indicates that the replacement text shall be that
    produced in the "C" locale, irrespective of the current locale.

Sub-item 7 (normative):

The "E" and "O" modifiers could be made more general while, at the same
time, making their meanings clearer. The following wording replaces
7.23.3.5p4 completely, though in principle one changed modifier could be
adopted without the other. If this change is not made, it should be
noted that "%OW" has been incorrectly written as "%Ou".

    [E modifier]

    The E modifier indicates that a locale-specific alternate calendar
    shall be used. All specifiers whose replacement depends on the date
    shall use the alternate calendar, and the replacement text shall
    depend on all the members tm_year, tm_mon, and tm_mday as well as
    any listed for the specifier. If there is no such alternate system,
    the modifier is ignored. If the alternate system makes use of base
    years (also known as eras) and offsets from the base, then the
    following specifiers have different meanings:

        %EC is replaced by the name of the base year or era.
        %Ey is replaced by the offset from %EC (year only).
        %EY is replaced by the locale's full alternative year
            representation including both era and offset.

    [O modifier]

    The O modifier indicates that a locale-specific set of alternative
    numeric symbols is to be used instead of decimal digits in the text
    replacing the conversion specifier.
    If there are no alternative numeric symbols, the modifier is
    ignored.

========

[Item 40, based on PC-UK0031]

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 7.4.1.8
Title: make ispunct() true for basic punctuation characters
Detailed description:

In C89, including after the addition of NA1, the definition of ispunct()
was:

    The ispunct function tests for any printing character that is
    neither space (' ') nor a character for which isalnum is true.

At some time during the revision process this definition has been
changed; this is a Quiet Change with no obvious rationale. It also makes
it impossible to predict what will happen in the "C" locale. Preferably
this wording should be restored. Alternatively, wording should be
adopted that at least returns true for the required 29 punctuation
characters in the "C" locale. The following change uses wording
analogous to that in other functions and has the benefit that it clearly
defines the results in the "C" locale without leaving them up to the
implementation.

Replace 7.4.1.8p2 by:

    The /ispunct/ function tests for any character that is one of the
    29 graphic characters in the basic execution character set or is
    one of a locale-specific set of printing characters for which
    neither /isspace/ nor /isalnum/ is true. In the "C" locale it
    returns true only for the characters in the basic execution
    character set.

========

[Item 41]

Category: Editorial
Committee Draft subsection: D.1
Title: Minor edit to clarify interpretation
Detailed description:

In D.1p2, change "to follow" to "to follow exactly". The point is not
that the annex is normative, but that it is to be applied "as is" rather
than to the exact letter.

========

[Item 42]

Category: Inconsistency
Committee Draft subsection: D.5
Title: Minor correction to an example in Annex D
Detailed description:

Replace D.5p9 with:

    Clearly there is no undefined behavior.

[The existing text is clearly wrong.]
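
Returning to Item 40: the minimum that the item asks for, namely that /ispunct/ be true in the "C" locale for each of the 29 required punctuation characters, can be checked with a sketch like the following. The names basic_punctuation, count_basic_punctuation_accepted and ispunct_demo are hypothetical, and the sketch sets the /LC_CTYPE/ category to "C" itself, which changes program-wide state. It does not check the second half of the proposed wording (that nothing else is accepted).

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <string.h>

/* The 29 graphic characters required by 5.2.1p3, written out by hand. */
static const char basic_punctuation[] = "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~";

/* Returns how many of the listed characters ispunct accepts in the
   "C" locale. */
int count_basic_punctuation_accepted(void)
{
    int count = 0;
    size_t i;

    setlocale(LC_CTYPE, "C");
    for (i = 0; basic_punctuation[i] != '\0'; i++) {
        if (ispunct((unsigned char) basic_punctuation[i]))
            count++;
    }
    return count;
}

/* Usage check: the list has 29 entries and all of them are accepted. */
void ispunct_demo(void)
{
    int accepted = count_basic_punctuation_accepted();
    assert(strlen(basic_punctuation) == 29);
    assert(accepted == 29);
}
```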
========

[Item 43, based on PC-UK0066]

Category: Inconsistency
Committee Draft subsection: various
Title: The term "access" is not well defined.
Detailed description:

The term "access" is not well defined. From context, it is most often
used to mean "read or write the value" but sometimes to mean "read the
value". This ambiguity sometimes makes it hard to understand what is
actually meant.

It appears that some work on this has been done since CD1, and
6.7.3.1p5 makes it clear that "read or write" should be the meaning.
However, this does not obviously apply to the whole document; it ought
to be made clear and the remaining "read the value" uses changed.

Add a new subclause to clause 3:

    3.X access
    (in the context of execution-time actions) to read or modify the
    value of an object; expressions that are not evaluated do not
    access objects.

    NOTE 1 Where only one of these two actions is meant, the term
    "read" or "modify" is used.

    NOTE 2 The term "modify" includes the case where the new value of
    the object is the same as the previous value.

Delete the following words from 6.7.3.1p5:

    An access to a value means either fetching it or modifying it;
    expressions that are not evaluated do not access values.

The following uses of "access" or its inflections need to be changed:

    6.2.6.1p4   ("accessed" -> "read")
    6.5p2       ("accessed" -> "read")
    Footnote 68 ("accessing" -> "reading")
    6.5.16.1p3  ("accessed" -> "read")

and the corresponding bullets in annex K.