ISO/IEC JTC 1/SC22
Programming languages, their environments and system software interfaces
Secretariat: U.S.A. (ANSI)

ISO/IEC JTC 1/SC22 N2872

TITLE: Summary of Voting on Final CD Ballot for FCD 9899 - Information
technology - Programming languages - Programming Language C (Revision of
ISO/IEC 9899:1990)
DATE ASSIGNED: 1999-01-12
BACKWARD POINTER: N/A
DOCUMENT TYPE: Summary of Voting
PROJECT NUMBER: JTC 1.22.20.01
STATUS: WG14 is requested to prepare a Disposition of Comments Report
and make a recommendation on the further processing of the FCD.
ACTION IDENTIFIER: FYI
DUE DATE: N/A
DISTRIBUTION: Text
CROSS REFERENCE: SC22 N2794
DISTRIBUTION FORM: Def

Address reply to:
ISO/IEC JTC 1/SC22 Secretariat
William C. Rinehuls
8457 Rushing Creek Court
Springfield, VA 22153 USA
Telephone: +1 (703) 912-9680
Fax: +1 (703) 912-2973
email: rinehuls@access.digex.net

_________ end of title page; beginning of voting summary ___________

SUMMARY OF VOTING ON

Letter Ballot Reference No: SC22 N2794
Circulated by: JTC 1/SC22
Circulation Date: 1998-08-24
Closing Date: 1999-01-08

SUBJECT: Final CD Ballot for FCD 9899 - Information technology -
Programming languages - Programming Language C (Revision of ISO/IEC
9899:1990)

----------------------------------------------------------------------

The following responses have been received on the subject of approval:

"P" Members supporting approval without comments          8
"P" Members supporting approval with comments             4
"P" Members supporting approval with comments
    not yet received                                      1
"P" Members not supporting approval                       3
"P" Members abstaining                                    2
"P" Members not voting                                    4
"O" Members supporting approval without comments          1

--------------------------------------------------------------------

Secretariat Action: WG14 is requested to prepare a Disposition of
Comments Report and make a recommendation on the further processing of
the FCD.

The comment accompanying the abstention vote from Austria was:
"Lack of expert resources."
The comments accompanying the affirmative votes from Canada, France,
Norway and the United States of America are attached, along with the
comments accompanying the negative votes from Denmark, Japan and the
United Kingdom. Germany has advised that the comments accompanying
their affirmative vote "will follow within the next ten days". Upon
receipt, those comments will be distributed as a separate SC22 document.

_____ end of voting summary; beginning of detailed summary __________

ISO/IEC JTC1/SC22 LETTER BALLOT SUMMARY

PROJECT NO: JTC 1.22.20.01
SUBJECT: Final CD Ballot for FCD 9899 - Information technology -
Programming languages - Programming Language C (Revision of ISO/IEC
9899:1990)
Reference Document No: N2794      Ballot Document No: N2794
Circulation Date: 1998-08-24      Closing Date: 1999-01-08
Circulated To: SC22 P, O, L       Circulated By: Secretariat

            SUMMARY OF VOTING AND COMMENTS RECEIVED

                    Approve  Disapprove  Abstain  Comments  Not Voting
'P' Members
Australia             (X)       ( )        ( )      ( )        ( )
Austria               ( )       ( )        (X)      (X)        ( )
Belgium               ( )       ( )        ( )      ( )        (X)
Brazil                ( )       ( )        (X)      ( )        ( )
Canada                (X)       ( )        ( )      (X)        ( )
China                 (X)       ( )        ( )      ( )        ( )
Czech Republic        (X)       ( )        ( )      ( )        ( )
Denmark               ( )       (X)        ( )      (X)        ( )
Egypt                 ( )       ( )        ( )      ( )        (X)
Finland               (X)       ( )        ( )      ( )        ( )
France                (X)       ( )        ( )      (X)        ( )
Germany               (X)       ( )        ( )      (*)        ( )
Ireland               (X)       ( )        ( )      ( )        ( )
Japan                 ( )       (X)        ( )      (X)        ( )
Netherlands           (X)       ( )        ( )      ( )        ( )
Norway                (X)       ( )        ( )      (X)        ( )
Romania               ( )       ( )        ( )      ( )        (X)
Russian Federation    (X)       ( )        ( )      ( )        ( )
Slovenia              ( )       ( )        ( )      ( )        (X)
Ukraine               (X)       ( )        ( )      ( )        ( )
UK                    ( )       (X)        ( )      (X)        ( )
USA                   (X)       ( )        ( )      (X)        ( )

'O' Members Voting
Korea Republic        (X)       ( )        ( )      ( )        ( )

* The German Member Body has advised that "comments will follow within
the next ten days". Upon receipt, these comments will be distributed as
a separate SC22 document.

--------- end of detailed summary; beginning of Canada Comments _____

From: "Doug Langlotz" (dlanglots@scc.ca)

Document number FCD 9899 (JTC 1/SC22/N2794)

Canada APPROVES WITH COMMENTS.
Canada supports approval with the following comments.

Comments:

Comment #1
Category: Normative
Committee Draft Subsection: 6.8.4 and 6.8.5
Title: Inconsistent scoping rules for compound literals and control
statements
Description: In 6.8.5.3, the for statement was modified when
incorporating mixed declarations and code, to limit the scope of the
(possible) declaration in clause-1. However, this makes the behaviour
inconsistent for compound literals, which now have different scope
rules in a for statement than in the other control statements. From
example 8 in 6.5.2.5:

    struct s { int i; };

    int f(void)
    {
        struct s *p = 0, *q;
        int j = 0;
        while (j < 2)
            q = p, p = &((struct s){ j++ });
        return p == q && q->i == 1;
    }

Note that if a for loop were used instead of a while loop, the lifetime
of the unnamed object would be the body of the loop only, and on entry
next time around p would be pointing to an object which is no longer
guaranteed to exist, which would result in undefined behaviour. The
behaviour of compound literals should be made consistent by making all
of the control statements have the same scoping rules as for loops.

Comment #2
Category: Normative
Committee Draft Subsection: 6.5.2.5
Title: Compound literals constraint #2
Description: Constraint #2 in 6.5.2.5 seems to have an undesirable
interpretation. The constraint is, "No initializer shall attempt to
provide a value for an object not contained within the entire unnamed
object specified by the compound literal."
This seems to disallow the following (assume the Fred type has 3
members and the George type has 2, so in neither case are we going past
the end of the object):

    (Fred){1, 7, &((George){5, 6})}

when it was really meant to disallow:

    (int[2]){1, 2, 3}

Perhaps the rule could be broken down into more explicit cases: no
subscript (implicit or explicit) should be beyond the bounds of the
array object that it is applied to; no more fields in a struct object
should be initialized than there are fields in that struct; only names
of members of the struct or union object being initialized may be used
for designated initializers. Similar wording is also used in 6.7.8.

Comment #3
Category: Normative
Committee Draft Subsection: 6.3.1.3
Title: Converting to signed integral type (based on previous Canadian
comment)
Description:

Original Comment: Section 6.3.1.3 paragraph 3 describes the result of
converting a value with integral type to a signed type which cannot
represent the value. It says that the result is implementation-defined;
however, we believe that the result should be undefined, analogous to
the case where an operation yields a value which cannot be represented
by the result type. The purpose of this comment was to ensure that if a
value with integral type is converted to a signed type which cannot
represent the value, the implementation is allowed to terminate or
allowed to fail to translate.

Details from a note sent to the reflector:

I would claim that the use of "implementation defined" isn't
appropriate in 6.3.1.3 paragraph 3 for several reasons:

1. (possibly pedantic) The draft standard does not provide two or more
choices as required by 3.19. What is an implementation allowed to do?
Is termination legal? Is failure to translate ruled out?

2. I interpret section 4 paragraph 3 to forbid an implementation from
failing to translate because of an overflow during a conversion to a
signed integral type. Yet this would seem to be quite appropriate.
For example, an implementation should be *allowed* to treat the
following external definition as an error ("fail to translate"),
assuming that INT_MAX is not representable in short:

    short big = INT_MAX;

Contrast that with the following, for which the implementation *is*
allowed to fail to translate:

    short big = INT_MAX + 1;

3. An argument similar to 2 can be made that a run-time conversion
"problem" should be allowed to be treated as an error. It isn't clear
to me whether the definition of "implementation-defined" allows for
such an interpretation. There are strong hints in the draft standard
that lead me to think that it is not the committee's intent to allow
such an interpretation. I don't actually see that this is spelled out
in the draft standard.

4. In certain cases 6.3.1.3 paragraph 3 and 6.5 paragraph 5 actually
seem to conflict. For example, I would suggest that in the following
example, both apply.

    short big = (short) INT_MAX;

Clearly 6.3.1.3 paragraph 3 applies, since a conversion to a signed
integral type is being performed, and the value cannot be represented
in it. So the result must be implementation-defined. Clearly 6.5
paragraph 5 applies, since the value of the expression is not in the
range of representable values for its type. So the behavior is
undefined. This seems to me to be a contradiction. A simple fix is to
change 6.3.1.3 paragraph 3 to call for "undefined behavior".

5. Although the wording in the draft is very similar to that in the
previous standard, there is a difference. I interpreted the old
standard's looser definition of "implementation defined" to allow
"failure to translate". This latitude is no longer available in the
draft.

6. I feel that it is an error to try to represent an out-of-range value
in an integral type. Yet "implementation defined" implies that this is
*not* an error (see section 4 paragraph 3). It is in the interest of
users that implementations be allowed to treat it as an error.
I admit that requiring that it be "caught" at runtime would have a
serious performance impact for many implementations. I'm only asking
that the standard should allow this error to be caught. The standard
should require that constant expressions with signed integer overflow
be constraint violations. This has no runtime cost. If this isn't
acceptable, at least allow this situation to be treated as an error
("undefined behavior" accomplishes this).

Excerpts from 3. Terms and definitions:

3.11
1 implementation-defined behavior
unspecified behavior where each implementation documents how the choice
is made
2 EXAMPLE An example of implementation-defined behavior is the
propagation of the high-order bit when a signed integer is shifted
right.

3.19
1 unspecified behavior
behavior where this International Standard provides two or more
possibilities and imposes no requirements on which is chosen in any
instance
2 EXAMPLE An example of unspecified behavior is the order in which the
arguments to a function are evaluated.

Excerpt from 4. Conformance:

3 A program that is correct in all other aspects, operating on correct
data, containing unspecified behavior shall be a correct program and
act in accordance with 5.1.2.3. [5.1.2.3 is "Program execution"]

All of 6.3.1.3 Signed and unsigned integers:

1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
3 Otherwise, the new type is signed and the value cannot be represented
in it; the result is implementation-defined.
Excerpt from 6.5 Expressions:

5 If an exception occurs during the evaluation of an expression (that
is, if the result is not mathematically defined or not in the range of
representable values for its type), the behavior is undefined.

Comment #4
Category: Normative
Committee Draft Subsection: 6.2.5
Title: Restrictions on long long
Description: Proposal for a change to the Draft C standard (WG14/N843)

This proposal suggests a collection of small changes to the Draft C
Standard (WG14/N843) dated August 3, 1998. The changes are intended to
isolate long long int and implementation-defined extended integer types
from the common integer types. In particular, we wish to ensure that
size_t and ptrdiff_t must be one of the common integer types, rather
than long long or an implementation-defined extended integer type.
Also, we wish to ensure that no values are converted to long long or an
implementation-defined extended integer type, except when the
conversion is explicit. For example, on a system where integers have 32
bits, a constant like 0xFFFFFFFF should be converted to unsigned long
rather than long long. In order to implement this principle, we suggest
the following wording changes to various sections in the draft
document.

6.2.5 Types

4. There are four standard signed integer types, designated as signed
char, short int, int, and long int. (These and other types may be
designated in several additional ways, as described in 6.7.2.) There is
one standard extended signed integer type, designated as long long int.
There may be additional implementation-defined extended signed integer
types. The standard extended signed integer type and the
implementation-defined extended signed integer types are collectively
called the extended signed integer types. The standard and extended
signed integer types are collectively called signed integer types.
7.17 Common definitions <stddef.h>

None of the above three types shall be defined with an extended integer
type, whether standard or implementation-defined.

________ end of Canada Comments; beginning of Denmark Comments _________

From: Charlotte Laursen
Date: 1999-01-06

Danish vote on FCD 9899

Ballot comments on FCD 9899, C, SC22 N2794

The DS vote is negative. The vote can be changed into an affirmative
one if the following problems are resolved satisfactorily.

1. The two functions isblank() and iswblank() are added with the
description contained in CD 9899, SC22 N2620.

2. The external linkage limit is set to 32 *characters*, and a note is
added that a character may need to be represented by more than one
byte.

3. The character terminology and its use are tidied up and brought into
consistency with SC2 terminology, e.g. 3.5, 3.14. The ISO/IEC 9945
(POSIX) series of standards has done a related exercise.

_____ end of Denmark Comments; beginning of France Comments _________

From: ARNAUD.A.R.D.DIQUELOU@email.afnor.fr

TITLE: Ballot comments on SC22N2794 - FCD 9899 - C
STATUS: Approved

AFNOR comments

France votes YES to document SC22 N2794, with the following comments.

A. First, it should be noted that, with one (important) exception, the
points raised in the previous vote (N2690, answered by N2792) were
satisfactorily resolved. The overall impression is that the document
has been vastly improved in the ongoing process of revision.

B. Second, there is a technical point that we missed the first time,
about which we propose to drop all the new material in this area,
waiting for experts (who have already set up a working group) to design
a better solution. This point is detailed below.

C. Finally, there is long long. Our position on this subject does not
change: we feel this feature does not need to be included in C9X (see
the fully detailed analysis in AFNOR's previous ballot comments in SC22
N2690). The Committee answered PR at the preceding vote (i.e.
"has reaffirmed this decision on more than one occasion.") On the other
hand, we see this problem as being very minor when compared to the goal
of delivering a new revision of the C Standard, with the added
precisions and new features this draft is proposing. So we no longer
require this feature to be removed from the draft (but we would be very
happy if it happened).

AFNOR detailed comments

Comment 1.
Category: Feature that should be included
Committee Draft subsection: 7.23, 7.25, 7.26
Title: Useless library functions made deprecated
Detailed description: mktime (7.23.2.3) is entirely subsumed by mkxtime
(7.23.2.4). Similar cases occur with gmtime/localtime
(7.23.3.3/7.23.3.4) vs zonetime (7.23.3.7), strftime (7.23.3.5) vs
strfxtime (7.23.3.6), and wcsftime (7.24.5.1) vs wcsfxtime (7.24.5.2).
The former functions do not add significant value over the latter ones
(in particular, execution times are similar). So if the latter are to
be kept (that is, if the comment below is dropped), the former should
be put in the deprecated state, to avoid spurious specifications being
kept over the years.

Comment 2.
Category: Feature that should be removed
Committee Draft subsection: 7.23, 7.24.5.2
Title: Removal of struct tmx material in the library subclause
Detailed description:

a) The mechanism of tm_ext and tm_extlen is entirely new to the C
Standard, so attention should be paid to the use that can be made of
it. Unfortunately, the current text is very elliptical about this use,
particularly about the storage of the further members referred to by
7.23.1p5. In particular, it is impossible from the current wording to
know how to correctly free a struct tmx object whose tm_ext member is
not NULL, as in the following snippet:

    // This first function is OK (given a correct understanding on my
    // behalf).
    struct tmx *tmx_alloc(void)   // allocate a new struct tmx object
    {
        struct tmx *p = malloc(sizeof(struct tmx));
        if( p == NULL )
            handle_failed_malloc("tmx_alloc");
        memset(p, 0, sizeof(struct tmx));   // initialize all members to 0
        p->tm_isdst = -1;
        p->tm_version = 1;
        p->tm_ext = NULL;
        return p;
    }

    // This second function has a big drawback.
    void tmx_free(struct tmx *p)   // free a previously allocated object
    {
        if( p == NULL )
            return;   // nothing to do
        if( p->tm_ext ) {
            // Some additional members have been added by the
            // implementation, or by users' programs using a future
            // version of the Standard.
            // Since we do not know what to do, do nothing.
            ;
            // If the members were allocated, they are now impossible
            // to access, so might clobber the memory pool...
        }
        free(p);
        return;
    }

Various fixes might be thought of. Among these, I see:

- always require that allocation of the additional members be in the
control of the implementation; this way, programs should never "free()
tm_ext"; effectively, this gives these additional members the same
status as the additional members that may currently be (and are) part
of struct tm or struct tmx;

- always require these additional objects to be separately dynamically
allocated. This requires that copies between two struct tmx objects
dynamically allocate some memory to hold these objects. In effect, this
will require an additional example highlighting this (perhaps showing
what a tmxcopy(struct tmx*, const struct tmx*) function might be).

Both solutions have pros and cons. But it is clear that the current
state, which encompasses both, is not clear enough. Other examples of
potential pitfalls are highlighted below.

b) This extension mechanism might be difficult to use with
implementations that currently have additional members in struct tm
(_tm_zone, containing a pointer to a string naming the time zone, and
_tm_gmtoff, whose meaning is almost the same as tm_zone, except that it
is 60 times bigger).
This latter is particularly interesting, since it might need tricky
kludges to assure the internal consistency of the struct tmx object
(any change to either member should ideally be applied to the other,
yielding potential problems of rounding). Having additional members,
accessed through tm_ext, one of which, for example, duplicates the
_tm_zone behaviour, probably looks awful seen this way.

c) 7.23.1p5 states that a positive value for tm_zone means that the
represented broken-down time is ahead of UTC. In the case when the
relationship between the broken-down time and UTC is not known (thus
tm_zone should be equal to _LOCALTIME), it is therefore forbidden to be
positive. This might deserve a more explicit requirement in 7.23.1p2.

d) POSIX compatibility, as well as proper support of historical time
zones, will require tm_zone to be a count of seconds instead of a count
of minutes; this will in turn require tm_zone to be enlarged to long
(or to int_least32_t) to handle the minimum requirements properly.

e) POSIX compatibility might be defeated by the restriction set upon
Daylight Saving Time algorithms to actually *advance* the clocks. This
is a minor point, since there is no historical need, nor any perceived
real need, for such a "feature".

f) On implementations that support leap seconds, 7.23.2.2 (difftime)
does not specify whether the result should include (thus considering
calendar time to be effectively UTC) or disregard (thus considering
calendar time to be effectively TAI) leap seconds. This is unfortunate.

g) The requirement set up by 7.23.2.3p4 (a second call to mktime should
yield the same value and should not modify the broken-down time) is too
restrictive for mktime, because mktime does not allow complete
determination of the calendar time associated with a given broken-down
time.
Examples include the so-called "double daylight saving time" that was
in force in the past, or when the time zone associated with the time
changes relative to UTC. For example, in Sri Lanka, the clocks moved
back from 0:30 to 0:00 on 1996-10-26, permanently. So the timestamp
1996-10-26T00:15:00, tm_isdst=0 is ambiguous when given to mktime();
and widely deployed implementations exist that use caches, thus might
deliver either the former or the latter result on a random basis. This
specification will effectively disallow caching inside mktime, with a
big performance hit for users. This requirement (the entire paragraph)
should be withdrawn. Anyway, mktime is intended to be superseded by
mkxtime, so there is not much gain in trying to improve a function that
is to be declared deprecated.

h) The case where mktime or mkxtime is called with tm_zone set to
_LOCALTIME and tm_isdst negative (unknown), and where the input moment
of time is inside the "fall back", that is, between 1:00 am and 2:00 am
on the last Sunday in October (in the United States), leads to a
well-known ambiguity. Contrary to what might have been expected, this
ambiguity is not resolved by the additions of this revision of the
Standard (either result might be returned): it all boils down to the
sentence in 7.23.2.6, in the algorithm, saying

    // X2 is the appropriate offset from local time to UTC,
    // determined by the implementation, or [...]

Since there are two possible offsets in this case...

i) Assuming the implementation handles leap seconds, if broken-down
times standing in the future are passed (where leap seconds cannot de
facto be determined), 7.23.2.4p4 (effect of _NO_LEAP_SECONDS on
mkxtime), and in particular the sentence between parentheses, seems to
require that the count of leap seconds be assumed to be 0.
This would be ill advised; I would prefer it to be
implementation-defined, with the recommended practice (or requirement)
of being 0 for implementations that do not handle leap seconds.

j) Assuming the implementation handles leap seconds, the effect of
7.23.2.4p4 is that the "default" behaviour on successive calls to
mkxtime yields a new, strange scale of time that is neither UTC nor
TAI. For example (remember that a positive leap second will be
introduced at 1998-12-31T23:59:60Z in renewed ISO 8601 notation):

    struct tmx tmx = {
        .tm_year=99, .tm_mon=0, .tm_mday=1,
        .tm_hour=0, .tm_min=0, .tm_sec=0,
        .tm_version=1, .tm_zone=_LOCALTIME, .tm_ext=NULL,
        .tm_leapsecs=_NO_LEAP_SECONDS
    }, tmx0;
    time_t t1, t2;
    double delta, days, secs;
    char s[SIZEOF_BUFFER];

    t1 = mkxtime(&tmx);
    puts(ctime(&t1));
    if( tmx.tm_leapsecs == _NO_LEAP_SECONDS )
        printf("Unable to determine number of leap seconds applied.\n");
    else
        printf("tmx.tm_leapsecs = %d\n", tmx.tm_leapsecs);

    tmx0 = tmx;   // !!! may share the object pointed to by tmx.tm_ext...
    ++tmx.tm_year;
    t2 = mkxtime(&tmx);
    puts(ctime(&t2));
    delta = difftime(t2, t1);
    days = modf(delta, &secs);
    printf("With ++tm_year: %.7e s == %f days and %f s\n",
           delta, days, secs);
    if( tmx.tm_leapsecs == _NO_LEAP_SECONDS )
        printf("Unable to determine number of leap seconds applied.\n");
    else
        printf("tmx.tm_leapsecs = %d\n", tmx.tm_leapsecs);

    tmx = tmx0;   // !!! may yield problems if the content pointed to
                  // by tm_ext has been modified by the previous call...
    tmx.tm_hour += 24*365;
    tmx.tm_leapsecs = _NO_LEAP_SECONDS;
    t2 = mkxtime(&tmx);
    puts(ctime(&t2));
    delta = difftime(t2, t1);
    days = modf(delta, &secs);
    printf("With tm_hour+=24*365: %.7e s == %f days and %f s\n",
           delta, days, secs);
    if( tmx.tm_leapsecs == _NO_LEAP_SECONDS )
        printf("Unable to determine number of leap seconds applied.\n");
    else
        printf("tmx.tm_leapsecs = %d\n", tmx.tm_leapsecs);

Without leap seconds support, results should be consistent and
straightforward, like (for me in Metropolitan France):

    Thu Jan  1 01:00:00 1998
    Unable to determine number of leap seconds applied.
    Fri Jan  1 01:00:00 1999
    With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
    Unable to determine number of leap seconds applied.
    Fri Jan  1 01:00:00 1999
    With tm_hour+=24*365: 3.1536000e+07 s == 365 days and 0 s
    Unable to determine number of leap seconds applied.

Things may change with leap seconds support; assuming we are in a time
zone behind UTC (e.g. in the United States), the results might be:

    Wed Dec 31 21:00:00 1997
    tmx.tm_leapsecs = 31
    Thu Dec 31 21:00:00 1998
    With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
    tmx.tm_leapsecs = 31
    Thu Dec 31 21:00:00 1998
    With tm_hour+=24*365: 3.1536000e+07 s == 365 days and 0 s
    tmx.tm_leapsecs = 31

But with a time zone ahead of UTC, results might be:

    Thu Jan  1 01:00:00 1998
    tmx.tm_leapsecs = 31
    Thu Dec 31 00:59:59 1998
    With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
    tmx.tm_leapsecs = 31
    Fri Jan  1 01:00:00 1999
    With tm_hour+=24*365: 3.1536001e+07 s == 365 days and 1 s
    tmx.tm_leapsecs = 32

And if the time zone is set to UTC, results might be:

    Thu Jan  1 00:00:00 1998
    tmx.tm_leapsecs = 31
    Thu Dec 31 23:59:60 1998
    With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
    tmx.tm_leapsecs = 31
    Fri Jan  1 00:00:00 1999
    With tm_hour+=24*365: 3.1536001e+07 s == 365 days and 1 s
    tmx.tm_leapsecs = 32

or, for the three latter lines:

    Thu Dec 31 23:59:60 1998
    With tm_hour+=24*365: 3.1536000e+07 s == 365 days and 0 s
    tmx.tm_leapsecs = 31

The last result is
questionable, since both choices are allowed by the current text (the
result falls right on the leap second itself). Moreover,
implementations with caches might return either on a random basis...
Bottom line: the behaviour is surprising, to say the least.

k) 7.23.2.6p2 (maximum ranges on input to mkxtime) uses LONG_MAX
submultiples to constrain the members' values. Apart from the fact that
the limitations given may easily be made greater in general cases, this
has some defects:

- the constraint disallows the common use, on a POSIX box, of tm_sec as
the unique input member, set to a POSIX-style time_t value;

- the constraints are very awkward for implementations where long ints
are bigger than "normal" ints: on such platforms, all members must
first be converted to long before any operation takes place;

- since there are eight main input fields, plus a ninth (tm_zone) which
is further constrained to be between -1439 and +1439, the result might
nevertheless overflow, so special provision to handle this must be made
in any event.

l) There is an obvious (and already known) typo in the description of
D, regarding the handling of years that are multiples of 100. Also,
this definition should use QUOT and REM instead of / and %.

m) Footnote 252 introduces the use of these library functions with the
so-called proleptic Gregorian calendar, that is, the rules of the
Gregorian calendar applied to any year, even before Gregory's reform.
This seems to contradict 7.23.1p1, which says that calendar time's
dates are relative to the Gregorian calendar, thus tm_year should in
any case be greater than -318. If this is the intent, another footnote
in 7.23.1p1 might be worthwhile. Another way is to rewrite 7.23.1p1 as
something like "(according to the rules of the Gregorian calendar)".
See also point l) above.

n) The static status of the returned result of localtime and gmtime
(which is annoying, but that is another story) is clearly set up by
7.23.3p1.
However, this does not scale well to zonetime, given that this function
might in fact return two objects: a struct tmx, and an optional object
containing additional members, pointed to by tmx->tm_ext. If the latter
is to be static, this might yield problems with mkxtime as well, since
7.23.2.4p5 states that the broken-down time "produced" by mkxtime is
required to be identical to the result of zonetime. (This will
effectively require that the tm_ext member always point to a static
object held by the implementation; if that is the original intent,
please state it clearly.)

o) There is a direct contradiction between the algorithm given for
asctime in 7.23.3.1, which will overflow if tm_year is not in the range
-11899 to +8099, and the statements in 7.23.2.6 that intend to
encompass a broader range.

All of these points argue for a bigger overhaul of this part of the
library. Such a job has been initiated recently, as Technical Committee
J11 is aware. In the meantime, I suggest dropping all these new
features from the current revision of the Standard. In effect, this
means:

i) removing subclauses 7.23.2.4 (mkxtime), 7.23.2.6 (normalization),
7.23.3.6 (strfxtime), 7.23.3.7 (zonetime), and 7.24.5.2 (wcsfxtime);

ii) removing paragraphs 7.23.2.3p3 and 7.23.2.3p4 (references to
7.23.2.6);

iii) removing the macros _NO_LEAP_SECONDS and _LOCALTIME in 7.23.1p2,
as they become useless; the same holds for struct tmx in 7.23.1p3;

iv) removing 7.23.1p5 (definition of struct tmx), as it becomes useless
too.

__________ end of France Comments; beginning of Japan Comments ______

VOTE ON FCD 9899
JTC 1/SC22 N2794
Information technology - Programming languages - Programming Language C
(Revision of ISO/IEC 9899:1990)

----------------------------------------------------------

The National Body of Japan disapproves ISO/IEC JTC1/SC22 N2794, VOTE ON
FCD 9899: Information technology - Programming languages - Programming
Language C (revision of ISO/IEC 9899:1990).
If the following comments are satisfactorily resolved, Japan's vote
will be changed to "Yes".

1. 64-bit data type should be optional

Japan has been asking that the 64-bit data type be optional for the
freestanding environment, because it tends to increase the sizes of
program code and of the run-time library. This situation forces users
and vendors who do not need to handle any 64-bit data to implement and
use an unnecessarily fat conforming implementation. For example, for
16-bit microprocessors, compiler vendors and users do not need a 64-bit
data type in almost all cases.

Of course, Japan knows well that this issue has been discussed within
the WG14 committee for a long time, and recognizes that a majority of
the WG14 committee agrees to introduce the 64-bit data type as a
mandatory specification in the ISO C standard. So, Japan has decided to
agree to making the 64-bit data type a mandatory specification,
provided that the following requirements are accepted by the committee:

a) In the Rationale document, explicitly specify the logical reason why
the 64-bit data type needs to be introduced as a MANDATORY
specification in the ISO C standard; in other words, why the 64-bit
data type can NOT be OPTIONAL. Please understand that this is NOT
asking for the reason why the 64-bit data type is necessary. Japan
already understands the necessity of the 64-bit data type very well.
Japan needs the logical reason why the committee has rejected the
proposal of "OPTIONAL".

b) In the Rationale document, explicitly specify an effective example
of a conforming implementation which supports the mandatory 64-bit data
type and can reduce the code size of a program to a similar (never
larger) size as in an existing C90 conforming implementation, if the
program does not use 64-bit data at all.

c) By some appropriate means, publish the Rationale document that
includes the above two descriptions.

2.
"K.4 Locale-specific behavior" and the wcscoll function
@ Annex K.4

The wcscoll function, defined in sub-clause 7.24.4.4.2, has
locale-specific behavior; therefore, add a reference to the wcscoll
function in Annex K.4.

3. Modify the description of the conversion specifiers (e, f, g, a)
@ 7.19.6.1 The fprintf function
@ 7.24.2.1 The fwprintf function

The description of the conversion specifiers f, F, e, E, g, G, a and A
has the term "a (finite) floating-point number." This kind of
expression, using the parentheses "(finite)", is not appropriate in a
specification of a programming language standard. It should be changed
to a stricter description using well-defined terms. Japan's concrete
proposal was already presented in the comment attached to the vote on
CD Approval. Please refer to SC22 N 2790 (SoV of CD vote) and N 2792
(Disposition of CD vote comments).

[More explanation of this issue]

A floating-point number is defined in sub-clause "5.2.4.2.2
Characteristics of floating types". In "5.2.4.2.2", floating-point
numbers seem to be categorized as follows:

    floating-point number
      |
      +---- normalized floating-point number
      |
      +---- not normalized floating-point number
              |
              +---- subnormal floating-point number
              |
              +---- infinities
              |
              +---- NaNs
              |
              +---- ...

That is, infinities and NaNs can be interpreted as one category of
"floating-point number". (In the standard, there is no explicit
description stating that infinities and NaNs are NOT floating-point
numbers.) On the other hand, footnote 15 says that "although they are
stored in floating types, infinities and NaNs are not floating-point
numbers"; however, the footnote is NOT part of the normative text of
the standard. In order to make the standard clearer, this description
should be moved to the normative part of the standard. That way,
removing the word "(finite)" is enough of a change to the current
description of the conversion specifiers (e, f, g, a).
However, if the committee leaves footnote 15 as is, the description for e, f, g, and a should be changed as Japan has already proposed.

-----
Cf. SC22 N 2792 Disposition of Comments Report on CD Approval of CD 9899 - Information technology - Programming languages - Programming Language C (Revision of ISO/IEC 9899:1990)

> 3.8 A double argument for the conversion specifier
>
> > 2. Editorial Comments
> > 14) A double argument for the conversion specifier
> >
> > Sub clause 7.12.6.1 (page 232 - 233 in draft 9) and
> > sub clause 7.18.2.1 (page 308 - 309 in draft 9):
> >
> > In the description about the conversion specifiers f, F, e,
> > E and G of the function f[w]printf,
> > "a double argument representing a floating-point number
> > is..."
> > should be changed to
> > "a double argument representing a normalized
> > floating-point number is..."
> >    ^^^^^^^^^^
> > in order to clarify the range and the definition of the
> > double argument.
> >
> > WG14: The Committee discussed this comment, and came to the
> > consensus that this is not an editorial issue; some
> > floating-point arithmetics support denormal numbers and
> > infinities.
> > There will need to be a detailed proposal to support
> > this change.
>
> The original intention of the Japanese comment is to point out
> that the current description:
>
> "A double argument representing a floating number is
> converted to ...[-]ddd.ddd...
> A double argument representing an infinity is converted
> to ...[-]inf or [-]infinity
> A double argument representing a NaN is converted to ...
> [-]nan or [-]nan(n-char-sequence)..."
>
> is not appropriate as a strict language standard
> specification because "a floating-point number" (defined in
> "5.2.4.2.2 Characteristics of floating types"),
> as WG14 mentions above, may include an infinity and a NaN,
> so that the current description can be read as saying that
> an infinity can be converted to [-]ddd.ddd or [-]inf and
> also that a NaN can be converted to [-]ddd.ddd or [-]nan.
>
> Therefore, Japan re-proposes to change the above description
> to:
>
> "A double argument representing an infinity is converted
> to ...[-]inf or [-]infinity...
> A double argument representing a NaN is converted to ...
> [-]nan or [-]nan(n-char-sequence)...
> A double argument representing a floating-point number except
> an infinity and a NaN is converted to ...[-]ddd.ddd..."
>
> This change should be applied to the description about the
> conversion specifiers f, F, e, E and G of the function
> f[w]printf().
>
> WG14: Response Code: AL
-----

4. Add UCN to the categories of preprocessing token @ 6.4 Lexical elements, 1st paragraph in the semantics

Add UCN to the categories of preprocessing tokens, as described in the syntax.

5. Replace "zero code" with "null character" in mbstowcs @ 7.20.8.1 The mbstowcs function, Returns

"Terminating zero code" should be changed to the well-defined term "terminating null character" (defined in 5.2.1). Cf. 5.2.1 says "a byte with all bits set to 0, called the null character, ..."

6. Necessary rationale for the changes of some environmental limits @ 7.19.6.1 The fprintf function, Environmental limits

The minimum value for the maximum number of characters produced by any single conversion is changed from 509 (in the current ISO/IEC 9899:1990) to 4095, and some other environmental limits are also changed from ISO/IEC 9899:1990. Please provide a clear rationale for these changes. The above change request had already been presented as part of Japan's comments attached to the CD approval vote. The disposition of this comment by the committee was as follows: "This was accepted as an editorial change to the Rationale." (Please refer to SC22 N 2792.) However, the latest draft Rationale, SC22/WG14 N 850, does not contain any description of the change of environmental limits pointed out by Japan's comment. Therefore, Japan has decided to present the same comment as submitted at the CD approval ballot.

7.
Mathematical notations @ 7.12 Mathematics

Many kinds of mathematical notations are used in sub-clause 7.12, including unfamiliar ones. Are all of these notations defined in some ISO standard, or in any other standard? If so, please add the document name and number to "2. Normative references". If not, please add an explanation and definition of each mathematical notation in an annex.

8. The nearbyint function: Add reference to Annex F @ 7.12.9.3 The nearbyint functions, Description

Add the reference "(see F.9.6.3 and F.9.6.4)" to the description of the nearbyint functions, as is done for the rint functions.

9. Inappropriate sentences

@ 7.3.1 Introduction, 5th paragraph
[#5] Notwithstanding the provisions of 7.1.3, a program is permitted to undefine and perhaps then redefine the macros complex, imaginary, and I.

@ 7.16 Boolean type and values, 4th paragraph
[#4] Notwithstanding the provisions of 7.1.3, a program is permitted to undefine and perhaps then redefine the macros bool, true, and false.

The sentence "Notwithstanding... a program is permitted to undefine and perhaps then..." is not appropriate for a programming language standard. Please modify the description using well-defined verbs or auxiliary verbs, e.g. "shall", "may" and so on. For example, "a program may undefine and then redefine..." seems appropriate.

10. Change the definition of active position back to the original words of C90 @ 5.2.2 Character display semantics, 1st paragraph

France changed the description of the active position, in line with its comment, as follows:

[#1] The active position is that location on a display device where the next character output by the fputc or fputwc functions would appear. The intent of writing a printable character (as defined by the isprint or iswprint function) to a display device is to display a graphic representation of that character at the active position and then advance the active position to the next position on the current line.
This change lacks careful consideration of the treatment of characters, wide characters, byte-oriented streams and wide-oriented streams. It is necessary to carefully distinguish a character from a wide character, and a byte-oriented stream from a wide-oriented stream. It is also necessary to consider a stream that mixes byte-oriented and wide-oriented output. As a result of this hasty change, the above description contains incorrect statements. One example of an incorrect statement is "a printable character (as defined by ... iswprint function)": the iswprint function defines the printable *wide* character, not the printable character. Therefore, the sentence should be changed back to the original words by removing "fputwc" and "iswprint", as follows:

[#1] The active position is that location on a display device where the next character output by the fputc function would appear. The intent of writing a printable character (as defined by the isprint function) to a display device is to display a graphic representation of that character at the active position and then advance the active position to the next position on the current line.

If the committee wants to change the definition of the active position from C90, much deeper discussion of the character issues is necessary, as well as the agreement of all of the national member bodies.

____ end of Japan Comments; beginning of Norway Comments ___________
--------------------------------------------------------------------
ISO/IEC JTC 1/SC22
Title: Programming languages, their environments and system software interfaces
Secretariat: U.S.A. (ANSI)
---------------------------------------------------------------------
Please send this form, duly completed, to the secretariat indicated above.
---------------------------------------------------------------------
FCD 9899 Title: Information technology - Programming languages - Programming Language C (Revision of ISO/IEC 9899:1990)
-----------------------------------------------------------------------
Please put an "x" on the appropriate line(s)

Approval of draft
  _____ as presented
  __X__ with comments as given below (use separate page as annex, if necessary)
        _____ general  __X__ technical  _____ editorial
_____ Disapproval of the draft for reasons below (use separate page as annex, if necessary)
      _____ Acceptance of these reasons and appropriate changes in the text will change our vote to approval
_____ Abstention (for reasons below)
--------------------------------------------------------------------
P-member voting: NORWAY
Date: 1999-01-08
Signature (if mailed): ULF LEIRSTEIN
---------------------------------------------------------------------

Comment 1.
Category: Correction restoring original intent
Committee Draft subsection: 5.1.1.2
Title: Translation phases 2 and 4 should be allowed to produce \u....

In phases 2 and 4, behavior is explicitly undefined if new-line removal or token concatenation produces a sequence like \u0123, e.g.

    printf("\u\
    0123");

I assume this code is intended to be legal, since that \u0123 is not a universal character name. The same can happen with token concatenation in phase 4, though I'm not sure if otherwise well-defined code can produce such sequences. E.g.

    #define CAT(a,b) a##b
    CAT(a\u, 0123)

Suggested change (though I hope a more elegant wording is found): In phase 2 (and 4?), after "character sequence that matches the syntax of a universal character name" append "and is not immediately preceded by an uneven number of backslashes".

On the other hand, is it supposed to be OK for backslash-newline removal to turn a UCN-look-alike into a non-UCN?
    printf("\
    \u0123");

I don't know if it can cause trouble for a C compiler, but it can confuse tools that process C program lines in a "state-less" way.
------------------------------------------------------------------------
Comment 2.
Category: Normative change to intent of existing feature
Committee Draft subsection: 6.4.9
Title: Remove `//*' quiet change.

The `//*' quiet change in the Rationale, also illustrated by the last example of paragraph 6.4.9, is unnecessary:

    m = n//**/o + p;     // equivalent to m = n + p;

(This meant m = n / o + p; in C89.)

Suggested change to 6.4.9:
New constraint: The first (multibyte) character following the `//' in a `//'-style comment shall not be `*'.
Paragraph 2: Append "and to see if the initial character is `*'".
Paragraph 3, last example: Change the note to "// syntax error".

If this is rejected, a milder change could be a `Recommended Practice' section which recommends warning about `//*'.
------------------------------------------------------------------------
Comment 3.
Category: Feature that should be included
Committee Draft subsection: 6.10, 6.10.3
Title: Allow `...' anywhere in a macro parameter list.

Parameters after the `...' in a macro could be useful. The preprocessor does not need to forbid it, since the number of arguments is known when the macro is expanded. Example:

    extern void foo(char *arg, ...);   /* Argument list ends with NULL */
    #ifdef ENCLOSE
    # define foo(..., null) (foo)("{", __VA_ARGS__, "}", null)
    #endif

Suggested changes follow.

In 6.10 and A.2.3, add

    macro-parameter-list:
        identifier-list
        identifier-list , ...
        identifier-list , ... , identifier-list
        ...
        ... , identifier-list

and replace

    # define identifier lparen identifier-list-opt ) replacement-list new-line
    # define identifier lparen ... ) replacement-list new-line
    # define identifier lparen identifier-list , ... ) replacement-list new-line

with

    # define identifier lparen macro-parameter-list-opt ) replacement-list new-line

In 6.10.3 paragraph 4, replace
    If the identifier-list in the macro definition does not end with an ellipsis,
with
    If the macro-parameter-list does not contain an ellipsis,
and replace `identifier-list' with `macro-parameter-list'.

In paragraph 10, replace
    The parameters are specified by the optional list of identifiers, whose scope extends from their declaration in the identifier list
with
    The parameters are specified by the optional macro-parameter-list. Their scope extends from their declarations in the parameter list

In paragraph 12, replace
    If there is a ... in the identifier-list in the macro definition, then the trailing arguments,
with
    If there is a ... in the macro-parameter-list, then arguments from the position matching the ... ,

In the rest of 6.10.3, replace `identifier list' with `parameter list'.
------------------------------------------------------------------------
Comment 4.
Category: Feature that should be included
Committee Draft subsection: 6.10.3
Title: Support empty __VA_ARGS__ by adding __VA_COMMA__

Empty __VA_ARGS__ are not currently allowed, though there are things they could express that the current definition cannot handle. Example:

    #define Msg(type, ...) printf(format[type], __VA_ARGS__)

Suggested change:
1. Allow `...' to receive no arguments.
2. Define an implicit macro parameter __VA_COMMA__ (or maybe __VA_SEP__) which expands to `,' if there were arguments, and to nothing otherwise.

This allows a possibly-empty __VA_ARGS__ to be used both in front and at the end of a comma-separated list:

    #define SEND(...) Send( __VA_ARGS__ __VA_COMMA__ (void*)0 )
    #define Msg(type, ...) printf(format[type] __VA_COMMA__ __VA_ARGS__)

One negative effect is that in macros which need arguments to `...', the error check for whether there are arguments is lost. The programmer must supply an extra argument if he wants that check, and e.g.
replace `#__VA_ARGS__' with `#extra_arg #__VA_COMMA__ #__VA_ARGS__'. That does not seem important, since there is no error check on the rest of the arguments in any case. Besides, the error will usually cause a syntax error in translation phase 7. Still, a workaround could be to append a `?' to the `...' when the `...' may receive an empty argument list, or (uglier in my opinion) to only allow an empty argument list if there is a __VA_COMMA__ in the replacement list.

I do not know if `foo(EMPTY)' below should expand to nothing or to `,':

    #define foo(...) __VA_COMMA__
    #define EMPTY

This does not seem important, so it could probably be standardized as whatever is easiest to implement. I believe it should expand to `,'. (It should agree with the __VA_COUNT__ proposal, if that is included.)

Changes to the standard:

6.10.3 paragraph 4: Replace
    Otherwise, there shall be more arguments in the invocation than
with
    Otherwise, there shall be at least as many arguments in the invocation as

6.10.3 paragraph 5: Replace
    The identifier __VA_ARGS__
with
    The identifiers __VA_ARGS__ and __VA_COMMA__

6.10.3.1 - new paragraph 3:
    An identifier __VA_COMMA__ that occurs in the replacement list shall be treated as if it were a parameter. If no arguments were used to form the variable arguments, __VA_COMMA__ shall receive an empty argument. Otherwise, it shall receive one `,' token.

6.10.3.5p9: Add an example:

    #define run(cmd,...) execlp(cmd, cmd __VA_COMMA__ __VA_ARGS__, NULL)
    run("man", "cc");

results in

    execlp("man", "man" , "cc" , NULL);

------------------------------------------------------------------------
Comment 5.
Category: Correction, Request for information/clarification
Committee Draft subsection: various
Title: Clean up character and encoding concepts

The definitions of and relationships between characters, character encodings, character sets and C types are scattered throughout the Standard, and are difficult to figure out even after reading through it: Seven concepts for characters and their encodings (plus C types): character, extended character, multibyte character, generalized multibyte character, wide character, wide-character code/value, byte. Eight or nine "character sets": source/execution ch.sets, basic source/execution ch.sets, extended source/execution ch.sets, required source ch.set, encoding of physical source files, (encoding of generalized multibyte characters).

Also, the word "character" is used for different concepts: *encoded* in bytes (like UTF-8 characters), encoded as a single byte, and *enumerated* (as in ISO 10646). I'm not sure if there are also "abstract" characters (conceptual entities with a typical graphical form, which I believe is the correct meaning of "character") in the standard. This may be part of the reason for the confusion in discussions about character sets and universal character names.

Note that I don't know if my definitions above are quite correct. As far as I can tell, the basic, extended and maybe required character sets are enumerated, and the rest are encoded. Though if the source character set is encoded, I don't know why translation phase 1 needs to map another character set to that instead of to an enumerated set. (The fact that most required characters can be encoded in one byte doesn't mean the required set can be both encoded and enumerated - an entity can't be both a member of an encoded set and an enumerated set.)
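The encoded-versus-enumerated distinction drawn above can be made concrete with a small sketch (this is not part of the Norwegian comment; the helper name `decode` is invented): a multibyte character is a byte sequence belonging to an encoded set, a wide character is a single integer code, and the standard function mbstowcs maps the former to the latter.

```c
#include <stdlib.h>   /* mbstowcs, wchar_t, size_t */

/* Hypothetical helper: converts an encoded multibyte string into an
 * array of wide-character codes, returning the number of wide
 * characters produced (excluding the terminating null). */
size_t decode(const char *mb, wchar_t *wc, size_t n)
{
    return mbstowcs(wc, mb, n);
}
```

In the default "C" locale every basic character is a one-byte multibyte character, so decoding "abc" yields the three wide-character codes for a, b and c; in another locale the same byte count could yield fewer wide characters.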
The different character concepts should be spelled out in _one_ section (5.2.1?), the unqualified word "character" ought to be used for (at most) one of the concepts above, and at least the character sets that use other concepts should be renamed - e.g. source character set -> "encoded source character set" or "multibyte source character set". This would often lead to cumbersome text, though. The text could be simplified if this notation is added: "Character" prepended with "basic", "extended", "source", "execution", and/or "required" means a member of the corresponding character set. Thus, "basic source character" means "member of the basic source character set". If the "source/execution character sets" are renamed to "multibyte source/execution character sets", the same rule can apply to the word "multibyte" -- unless there exist multibyte characters that are not members of the source/execution character sets.

Since the basic/extended character sets contain integer codes for characters and the source/execution character sets contain multibyte representations, I suggest that "characters" and "extended characters" be clearly described as integers or numbered entities, and "bytes" and "multibyte characters" as encodings.

In any case, please describe all the character types and sets above in one section:
* which character concept they use,
* the relationship between them:
  - which types and sets map to each other,
  - which can be subsets or proper subsets of each other, and which may contain members that do not map to any member in others they relate to (e.g., I believe a byte may have a value which does not map to the source/execution character set),
  - maybe the character concepts' and character sets' relationship to char, unsigned char, wchar_t and wint_t.

The character-related definitions in section 3 and their descriptions/definitions in 5.2.1 should point to each other. The reference to 5.2.1 in 6.2.5p3 must be updated or removed.
5.2.1 contains a jumbled list of encoded characters (in the source/execution sets) and enumerated characters (in the basic source/execution sets). I believe the current definition should describe the basic/extended character sets, and possibly that the null character is encoded as a null byte in the execution character set. (The current words are incorrect, the basic execution character set doesn't contain a null *byte*.) 7.19.6.1p17 and 7.19.6.2p24 incorrectly refer to "multibyte members of the extended character set", and 7.20p3(MB_CUR_MAX) to "bytes in a multibyte character for the extended character set". They should refer to multibyte characters *representing* members of the extended character set, or multibyte members of the (encoded) source/exec. character set. One other detail: considering the definition of "byte" (3.4), the "multibyte character" definition (3.14) means "sequence of one or more addressable units...". I think the correct definition of byte is "either , or bit representation which fits into same". Some suggested changes: 3.4 byte either an addressable unit of data storage large enough to hold any member of the basic character set of the execution environment, or bit representation which fits exactly in a byte. NOTE A byte has no value as such, but it can *encode* a value or part of a value - e.g. a character. When the "value" of a byte is mentioned without a related encoding, one usually means encoding as a binary integer with no sign bit. [ "no sign bit" (or unsigned, but that indicates C unsigned semantics) so 0 can't mean "all bits 1" on 1's complement machines. ] 3.5 character code value (a binary encoded integer [with no sign bit?]) that fits in a byte [ this is equivalent to 7.1.1p4 "wide character", except I wasn't sure if "... that corresponds to a member of the basic character set" could be included. May a byte have a value which is not a valid character? May a character have a value which isn't in the basic character set? 
] or a multibyte character or wide character, if this is clear from the context. [ E.g. `the null character' and `space character' often means multibyte characters, in fact null character is defined that way in 5.2.1 - it is a member of an encoded character set. ] About source files: Translation phase 1 maps end-of-line in the physical source character set to newline in the source character set. Thus, the source character set (and the required source character set) in 5.2.1 must include newline. Though it's probably good to keep the reminder in 5.2.1 that physical source files may have some other representation of end-of-line. 5.2.1.2p1 says that for the source and execution character sets, -- A byte with all bits zero shall be interpreted as a null character independent of shift state. -- A byte with all bits zero shall not occur in the second or subsequent bytes of a multibyte character. I believe this means an 8-bit host must either map an UCS-4 physical source file to e.g. a UTF-8 source character set (the opposite of what a normal Unicode application would do), or define the compiler to be running on an emulated machine with 32-bit bytes. If so, why force the implementation through such games? However, I suspect the term "source file" in most places except 5.1.1.2 must be read as "the source after translation phase 1", or maybe "after mapping to the source character set" (possibly except the mapping of end-of-line to newline). Maybe the simplest fix is to replace most occurrences of "source file" with "source code", and define that as suggested above. And/or move 1st part of translation phase 1 to a new phase 0, and define the language in terms of phase 0 output. Usually one can't misunderstand without really trying to, but -- 5.2.1 defines the source character set as the encoding of characters in source files, this definition needs to be correct. 
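The 5.2.1.2 requirements quoted above are satisfied by a shift-free encoding such as UTF-8, where a zero byte can only be the one-byte encoding of the null character. A small sketch (not from the ballot comment; `utf8_encode` is an invented name, handling only code points below 0x800) shows why no continuation byte can be zero:

```c
/* Hypothetical UTF-8 encoder for code points < 0x800, for illustration:
 * every continuation byte has the form 10xxxxxx, i.e. is at least 0x80,
 * so a byte with all bits zero never occurs in the second or subsequent
 * bytes of a multibyte character. */
unsigned utf8_encode(unsigned long cp, unsigned char out[2])
{
    if (cp < 0x80) {                                /* one byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    out[0] = (unsigned char)(0xC0 | (cp >> 6));     /* lead byte: 110xxxxx */
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));   /* continuation: 10xxxxxx */
    return 2;
}
```

For example, encoding U+0123 (the \u0123 of Comment 1) gives the bytes 0xC4 0xA3, neither of which is zero.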
5.2.1.2p2 may contain an exception:

    For source files, the following shall hold:
    -- An identifier, comment, string literal, character constant, or header name shall begin and end in the initial shift state.

I'm not familiar with shift states, so I don't know whether this applies before or after phase 1, or whether that makes a difference.

5.2.1p3 ends with "If any other characters are encountered in a source file (except (...)), the behavior is undefined." At least this text should say something like "after translation phase 1" or "mapped to the source character set".

Strings: 7.1.1p1 is slightly incorrect: this is one place where "null character" means "character or multibyte character", since "string" can mean either character string or multibyte string. In 7.1.1p1's sentence

    The term multibyte string is sometimes used instead to emphasize special processing given to multibyte characters contained in the string or to avoid confusion with a wide string.

maybe "contained" should be replaced with "encoded". And to complement it, add:

    The term character string is sometimes used to emphasize that each byte in the string contains an integer character code (converted to char).

Some other matters: Source characters and multibyte characters are defined in terms of bytes, which are defined in terms of the execution environment. Thus, the compilation environment can't have more bits per byte than the execution environment. This looks strange; if it is intended, I think it must be emphasized.

The ending sentence "If any other characters are encountered in a source file..." in 5.2.1p3 is placed so that it actually means "other than in the required *execution* character set".

6.4.3p2/p3 refers to the "required character set"; that should be "required source character set".

The Index needs entries for "required source character set" and "generalized multibyte character".

6.2.5p5-p6, D.2.1p1, K.1p1 refer to the "values" of bytes.
Those can be replaced with the "contents" of bytes, depending on what you do with the suggested NOTE above about the values of bytes.

To make clear that e.g. a yen sign may be used for backslash in C, add to the 5.2.1p3 wording "Both the basic source and basic execution character sets shall have at least the following members:" the text "(though which glyphs these members correspond to on an output device is unspecified)".
------------------------------------------------------------------------
Comment 6.
Category: Change to existing feature, request for clarification
Committee Draft subsection: 5.1.1.2, 6.2.5, 6.4.4, 6.4.5
Title: Failure on source->execution character conversion

In translation phase 5, if a source character has no corresponding member in the execution character set, it is converted to an implementation-defined member.

1. Are _all_ source characters lacking a corresponding member in the execution character set converted to the _same_ character?

2. It should be allowed to let any use of the resulting character cause a compile-time error. (It should be legal to let compilation fail if the program may be translated differently than intended, or to let users request this.) Other places where an error must then be allowed:

6.4.5p5
    The value of a string literal containing a multibyte character or escape sequence not represented in the execution character set is implementation-defined.

6.2.5p3
    If any other character [than a required source character] is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.

6.4.4.4p2
    An integer character constant is a sequence of one or more multibyte characters (...) mapped in an implementation-defined manner to members of the execution character set.

[Incidentally, this says only what the encoding is, not what the integer value is.
This should be mentioned explicitly, in particular since 6.4.4.4p5-p6 (octal/hex character) does describe the value and not the encoding.] 6.4.4.4p10 The value of an integer character constant (...) containing a character or escape sequence not represented in the basic execution character set, is implementation-defined. 6.4.4.4p11 The value of a wide character constant (...) containing a multibyte character or escape sequence not represented in the extended execution character set, is implementation-defined. ------------------------------------------------------------------------ Comment 7. Category: Request for clarification, or change to existing feature Committee Draft subsection: 6.10.3.2 Title: Consistent stringification of UCNs 6.10.3.2p2 (The # operator) says: it is unspecified whether a \ character is inserted before the \ character beginning a universal character name. Please clarify: Is the implementation allowed to insert `\' in some circumstances but not others? May it do so in a single translation unit? A single preprocessing file? (Also update this point in K.1 Unspecified behavior). If yes: It would be useful if implementations are required to behave consistently in this regard, preferably so that a Configure program can test *once* e.g. how the compiler treats identifiers and how it treats strings, and #define a macro accordingly which will be used when a package is compiled. Unfortunately I'm not sure what kind of "intelligence" one might want to give compilers here, such as to create \\u... inside stringified string/character constants and \u... outside them. ------------------------------------------------------------------------ Comment 8. [Withdrawn] ------------------------------------------------------------------------ Comment 9. 
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.10.3.5
Title: Preprocessor examples

The examples in 6.10.3.5 (macro scopes) would be better placed in a new section 6.10.3.6, since only some of the examples are related to macro scopes. A few examples with national characters and with universal character names in non-string tokens would be useful.
------------------------------------------------------------------------
Comment 10.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.4.3
Title: Describe the specified Unicode ranges

_Describe_ the named characters (or print them if you invert part of the section so that below 000000A0 it says which UCNs are _allowed_), for the sake of people who don't know Unicode well. (I found that 0000D800 through 0000DFFF are "surrogates", but not what that is, or whether it's something a compiler accepting national characters in source files should consider.)
------------------------------------------------------------------------
Comment 11.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.4.4.4
Title: Add note that 'ab' is not a multibyte character

To avoid confusion in `Description' on page 63, add a note that 'ab' is not one multibyte character.
------------------------------------------------------------------------
Comment 12.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 7.20.2.2
Title: Use inttypes in example

Suggestion: Replace `unsigned long int' with uint_fast32_t in the rand() example. (Or better, remove the example. It's a poor rand() implementation.)
------------------------------------------------------------------------
Comment 13.
Category: Request for information/clarification
Committee Draft subsection: 5.2.4.1, 6.4.3
Title: Meaning of "character short" identifier

What does the term "character short" identifier mean? It is used in 5.2.4.1p1 and 6.4.3p2/p4.
If it is an ISO-10646 term, you could copy the definition from there.
------------------------------------------------------------------------
Comment 14.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.4.6, Index
Title: digraphs are not defined.

The index has a reference "digraphs, 6.4.6", but the word "digraph" does not occur anywhere else. It's a known word and people use it, so: In 6.4.6p3 after "these six tokens", add "(called _digraphs_)" or "(known as _digraphs_)", or say so in a note.
-----------------------------------------------------------------------
Comment 15.
Category: Request for clarification
Committee Draft subsection: 6.10.8
Title: Are __TIME__ and __DATE__ constant?

Can __TIME__ and __DATE__ change during the compilation of a single translation unit? If yes, I hope they may not change between sequence points. Otherwise printf("%s at %s", __DATE__, __TIME__) can be inconsistent at midnight.
-----------------------------------------------------------------------
Comment 16.
Category: Normative change to intent of existing feature
Committee Draft subsection: 7.19.4.2
Title: rename() of/to open files should be implementation-defined

On a UNIX filesystem, `rename' removes (or makes unnamed) the _destination_ file if it existed. OTOH, a filesystem where the file's basic handle is the filename may see `rename' as removing the _source_ file. So I believe rename() can have the same problem as remove() if either of the files is open, and in addition I imagine a FILE* opened to the destination file may end up pointing into the new file.

Suggested change: Add this to 7.19.4.2, similar to the text in 7.19.4.1:
    If either of the files is open, the behavior of the rename function is implementation-defined.
------------------------------------------------------------------------
Comment 17.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.7.3
Title: Removing restrict from prototypes

6.7.3p7 says

    deleting all instances of the [restrict] qualifier from a
    conforming program does not change its meaning

Correction: Replace "program" with "translation unit", or add "and
the headers it includes" after "program". (Otherwise one could remove
`restrict' below but keep `restrict' in string.h's declaration of
strcpy.)

    #include <string.h>
    char *(*fn)(char * restrict, const char * restrict) = strcpy;
------------------------------------------------------------------------
Comment 18.
Category: Feature that should be included
Committee Draft subsection: 6.3.2.3
Title: C++-style `T** -> T const *const *' conversion

Allow `T**' to be converted to `T const *const *' as in C++. This
text is based on section conv.qual in the C++ standard:
- Text about pointers to members and multi-level mixed pointers has
  been removed, as well as the definition of _cv-qualification
  signature_ which isn't used elsewhere.
- restrict has been added to the list of qualifiers.
- Paragraph 4 about converting functions' prototypes and return types
  has been added. Possibly this paragraph should be put elsewhere.

Editorial note: x{y} below is used as a typographical notation for x
subscripted by y. E.g. water would be H{2}O.

Suggested change: Replace 6.3.2.3p1-p2 with the following text:

    Qualification conversions

    In the following text, each qual{i} or qual{i,j} is zero or more
    of the qualifiers const, volatile, and restrict.

    [#1] The value of an expression of type "T qual{1} *" can be
    converted to type "T qual{2} *" if each qualifier in qual{1} is
    in qual{2}.

    [#2] NOTE Function types are never qualified; see [8.3.5 in
    C++.]
    [#3] A conversion can add qualifiers at levels other than the
    first in multi-level pointers, subject to the following
    rules:(->footnote)

    Two pointer types T{1} and T{2} are _similar_ if there exists a
    type T and integer n > 0 such that:

        T{1} is T qual{1,n} *qual{1,n-1} ... *qual{1,1} *qual{1,0}
    and
        T{2} is T qual{2,n} *qual{2,n-1} ... *qual{2,1} *qual{2,0}

    An expression of type T{1} can be converted to type T{2} if and
    only if the following conditions are satisfied:
    -- the pointer types are similar.
    -- for every j > 0, each qualifier in qual{1,j} is in qual{2,j}.
    -- if qual{1,j} and qual{2,j} are different, then const is in
       every qual{2,k} for 0 < k < j.

    [Note: if a program could assign a pointer of type T** to a
    pointer of type const T** (that is, if line //1 below was
    allowed), a program could inadvertently modify a const object
    (as is done on line //2). For example,

        int main() {
            const char c = 'c';
            char* pc;
            const char** pcc = &pc;  //1: not allowed
            *pcc = &c;
            *pc = 'C';               //2: modifies a const object
        }

    --end note]

    [#4] A function pointer of type FT{1} can be converted to a
    function pointer of type FT{2} if
    - their return types are compatible types, or similar pointer
      types where the return type of FT{1} can be converted to the
      return type of FT{2},
    - and their respective parameters are compatible types, or
      similar pointer types where the parameter of FT{2} can be
      converted to the type of the equivalent parameter of FT{1}.
    __________________
    (footnote) These rules ensure that const-safety is preserved by
    the conversion.
------------------------------------------------------------------------
Comment 19.
Category: Clarification, Request for clarification
Committee Draft subsection: various
Title: Terms like "unspecified _result_" are used but not defined

The standard uses undefined terms like `implementation-defined
value', `unspecified result', and `undefined conversion state'.
Clause 3 only defines unspecified, implementation-defined and
undefined *behavior*.
The term `indeterminate' is important, but the closest to a
definition of it is a mention in passing in 3.18 (undefined
behavior).

Suggested changes: Change

    3.19p1 _unspecified behavior_
           behavior where ...
    3.11p1 _implementation-defined behavior_
           unspecified behavior where ...

to

    3.19p1 _unspecified_
           aspect of the language where ...
    3.11p1 _implementation-defined_
           unspecified aspect where ...

However, keep `behavior' in the definition of `undefined behavior',
since that always *is* behavior.

Add a section 3.* _indeterminate_, and add `indeterminate' to the
index. Mention that an indeterminate value can be a trap
representation (and that a trap representation is considered a value,
as in 6.2.6.1p5).

6.2.6.1p4 seems to contradict the above: "Certain object
representations need not represent a value of the object type (...)
such a representation is called a trap representation." Replace
"value" above with "valid value".

In 7.8.1p6, change `may be left undefined' to `need not be #defined'
or change `undefined' to `un#defined'.

In the following places, remove "behavior" or change it to
"aspect(s)" (except when applied to `undefined' if you wish to be
that pedantic).

    3.11p2   EXAMPLE An example of implementation-defined behavior
    3.19p2   EXAMPLE An example of unspecified behavior
    4p3      A program (...) containing unspecified behavior
    4p5      output dependent on any unspecified, undefined, or
             implementation-defined behavior
    D.1p1    involves unspecified or undefined behavior
    D.2p1    involves undefined or unspecified behavior
    K.1      Unspecified behavior
    K.3      Implementation-defined behavior
    Index    implementation-defined behavior, 3.11, K.3
             unspecified behavior, 3.19, K.1
    Contents K.1 Unspecified behavior ......................... 566
             K.3 Implementation-defined behavior .............. 585

In the following places, change "undefined" to "indeterminate" or
change the text to say "undefined behavior":

    Note 58 to 6.5p2     This paragraph renders undefined statement
                         expressions such as
    Note 80 to 6.5.9p6   the effect of subsequent comparisons is
                         also undefined.
    7.12.3p1             The result is undefined if an argument is
                         not of real floating type.
    7.24.6.3.2p4, 7.24.6.3.3p4, 7.24.6.4.1p4 and 7.24.6.4.2p4:
                         the conversion state is undefined. [Maybe
                         this should be `unspecified'.]
    D.1p2                expressions that are not determined to be
                         undefined
    Note 288 to D.4.3p3  If it were undefined to write twice to the
                         flags
    D.5p27 and D.5p31    and so the expression is undefined.
------------------------------------------------------------------------
Comment 20.
Category: Request for clarification
Committee Draft subsection: various
Title: Some unspecified cases are unclear

In some unspecified and implementation-defined cases, it is unclear
which options the implementation has to choose from.

Suggested changes: In 3.19, add notes
- with reference to 4p3 and 5.1.2.3 that an unspecified aspect will
  not cause failure (i.e. produce a trap representation or undefined
  behavior). Even knowledgeable people missed that when I asked.
- that the implementation must make sensible (or maybe
  "unsurprising") choices for unspecified aspects.

What are the possible behaviors in the following places?

6.5.7p5
    The result of E1 >> E2 is E1 right-shifted E2 bit positions (...)
    If E1 has a signed type and a negative value, the resulting value
    is implementation-defined.
What are the choices here?

7.19.4.4p3
    If it [the tmpnam function] is called more than TMP_MAX times,
    the behavior is implementation-defined.
Are there any particular choices?

7.19.7.11p5
    the value of its file position indicator after a successful call
    to the ungetc function is unspecified
Are there any particular choices, except that it must be an fpos_t
which is not a trap representation?
5.2.2p1, 5.2.2p2 (\b, \t, \v)
    the behavior is unspecified
Is only the output unspecified, or may this also affect program
execution? E.g. may ferror() be set?

Other problems:

6.3.2.3p5
    An integer may be converted to any pointer type. The result is
    implementation-defined, might not be properly aligned, and might
    not point to an entity of the referenced type.
This is an unspecified aspect which may cause undefined behaviour,
which I'm told is not supposed to happen. I think it should be
something like this:

    If the integer resulted from converting a pointer [to void?] to
    intptr_t, uintptr_t [or a wider integer type?], and the
    pointed-to object is still live, the result is a pointer to that
    object (see 7.18.1.4). Otherwise the result is indeterminate,
    but the implementation shall document it (->footnote XXX).

    [footnote XXX] An integer converted to a pointer might not be
    properly aligned, and might not point to an entity of the
    referenced type.
------------------------------------------------------------------------
Comment 21.
Category: Defect/Request for clarification
Committee Draft subsection: various
Title: The standard implies that pointer = address = integer

This can be deduced as follows:

* C addresses are integers: 3.1p1 mentions "addresses that are
  particular multiples of a byte address". Except to mathematicians,
  only integers are "multiples".

* C addresses are pointers (most of the Standard seems to think so):
  6.5.3.2p3 says that the "address-of" operator returns a pointer.
  This operator's name is the clearest indication I can find of what
  the relationship is between pointers and addresses. 6.5.2.5p10
  says about "int *p; (...) p = (int [2]){*p}" that "p is assigned
  the address of the first element of an array (...)".

* The implementation's hardware address concept is the same as C's
  address concept: This is implied by omission, in particular in
  3.1p1.
Readers should not be expected to know that the hardware address
concept can be very different from the common integer-like address
concept, so they may not imagine any reason not to trust hardware
addresses to be like C addresses.

This must be corrected. C's `address' and `pointer' concepts need
explicit definitions and Index entries. The relationship between
them, and between C addresses and hardware addresses, must be
stated.

3.4p2 says "It is possible to express the address of each individual
byte of an object uniquely." Maybe this is the right wording:
Pointers _express_ addresses? Or maybe this wording too is wrong - I
prefer to think of pointers as handles to objects and nothing else,
and to think that anything about addresses just says something about
their implementation.
------------------------------------------------------------------------
Comment 22.
Category: Clarification/Correction restoring original intent
Committee Draft subsection: 5.1.1.2
Title: Pragmas are lost after translation phase 4

Pragmas are executed and then deleted in phase 4, so phases 4-6
haven't really got an approved way to pass on pragmas that take
effect in later phases. The same applies to e.g. letting phase 1
notice and pass on the character set name of the source file.

Suggested change: Add this paragraph to 5.1.1.2:

    The output from a translation phase may contain extra
    information which is transparent to the rules [or "to the
    grammar"?] described in this standard.
------------------------------------------------------------------------
Comment 23.
Category: Request for clarification/correction
Committee Draft subsection: 7.19.7.3, 7.19.7.11
Title: "character" arguments to fputc and ungetc

The range of valid arguments to fputc() and ungetc() is unclear.
They are only defined for _character_ arguments, so non-character
arguments produce undefined behaviour (7.19.7.3p2, 7.19.7.11p2/p4).
However, a character is a typeless bit representation according to
3.5.
The most natural interpretation of 3.5 is that the argument must be
an integer which fits in an unsigned char.

The behavior of fputc() should be defined at least for
- all `unsigned char' values, so one can fputc the result of fgetc.
- all `char' values, to avoid breaking a _lot_ of programs that do
  e.g. putchar(*string) instead of putchar((unsigned char)*string).
  fputc should convert char arguments to unsigned char as it does
  now.
I do not know if it should also be defined for all signed char
values, or all `int' values.

This does not work for ungetc, since a char value equal to EOF has a
special meaning. Yet ungetc too converts its argument to unsigned
char, so apparently ungetc(ch) where ch is a char is intended to
work. Please state the valid range of the character argument to
ungetc. Add a note that even though ungetc converts its argument to
unsigned char, the application should still convert any char
arguments to unsigned char "by hand", in case the char value == EOF.
------------------------------------------------------------------------
Comment 24.
Category: Request for clarification/Correction
Committee Draft subsection: 7.19.*, 5.2.4.2.1
Title: Problem with non-two's complement char

If INT_MAX - INT_MIN < UCHAR_MAX (e.g., one's complement with
sizeof(int) == 1), there is one byte value (UCHAR_MAX?) which cannot
be read and written to the stream:
- The function call `fputc(u_ch, f)' converts u_ch to int, and then
  fputc converts it back to unsigned char and writes that to the
  stream. This will typically convert the bit pattern for negative
  zero to zero; I'm not sure if it can do it with some other value.
- Similarly, fgetc(f) converts the unsigned char on the stream to
  int and returns that, and I'm not even sure it is at liberty to
  return "negative zero" so the application could do some hack to
  notice it.
- 7.19.2p3 may be taken to forbid this (Data read in from a binary
  stream shall compare equal to the data that were earlier written
  out to that stream), if it is read as `fwrite() the data, fread()
  it back, then memcmp() shall return 0'. However, the `data read'
  and `data written' can also be taken to mean the output from
  fgetc() and the input to fputc(), which have already been through
  the conversion to int.

The intent here must be clarified. If the intent is that
INT_MAX - INT_MIN >= UCHAR_MAX, 5.2.4.2.1 would be a better place to
state it. It seems strange to have to search the library section for
such basic restrictions on the language.

____ end of Norway Comments; beginning of United Kingdom comments ____

The UK votes NO on CD2/FCD1.

To change the UK vote to YES:

The following major issues must be addressed:
- side effects in VLAs
- write-once-read-many changes to restrict
- various issues with floating point

For each UK comment listed in column 1 of the table below, the
described change or an equivalent must be made.

For each UK comment listed in column 2 of the table below, the
described change or an equivalent must be made or the UK National
Body must be satisfied that there is good reason not to make the
change.

The UK comments listed in column 3 of the table below must be
properly considered and a reasonable response made. The changes in
these comments are not mandatory provided that such responses are
made.

The procedural matters described below must be addressed.
Column 1 - Changes that must be applied
    PC-UK0201 [1]  PC-UK0244  PC-UK0246  PC-UK0248  PC-UK0272
    PC-UK0273  PC-UK0277  PC-UK0278  PC-UK0286  PC-UK0287

Column 2 - Comments that must be addressed
    PC-UK0209 [2]  PC-UK0214 [3]  PC-UK0222 [4]  PC-UK0232
    PC-UK0245  PC-UK0254  PC-UK0274  PC-UK0275  PC-UK0279
    PC-UK0281  PC-UK0282  PC-UK0283  PC-UK0284  PC-UK0285
    PC-UK0261  PC-UK0262  PC-UK0269  PC-UK0270

Column 3 - Other comments
    PC-UK0227 [5]  PC-UK0249  PC-UK0251  PC-UK0252  PC-UK0256
    PC-UK0257  PC-UK0276  PC-UK0263  PC-UK0264  PC-UK0265

Notes
1. We do not believe that WG14 addressed the issue actually raised.
2. The UK does not consider this to be a new feature, but a minor
   though useful enhancement to an existing one.
3. The previous objection to this item was based on problems with
   realloc. Now that the latter has been redefined to create a new
   object, this item should be reconsidered.
4. The UK considers that code affected by this item is almost
   certain to be erroneous, and feels that it is important that it
   is addressed.
5. This item would clarify the meaning of bit-fields, and in
   particular that they cannot be wider than specified.

This response also assumes that the following items of SC22/WG14/N847
have been accepted as is or with editorial changes: 4, 8, 10, 19, 20,
21, 33, 36, 43. Otherwise these items should be added to column 1 of
the above table.

Procedural issues

WG14 failed to provide comprehensible responses to a number of
matters raised in the UK comments to CD1. To the extent that those
comments are not subsumed by other parts of this response, they are
required to be addressed in a way that allows the UK to determine
whether the WG14 responses are acceptable, and therefore form a part
of this submission.
The UK comments in question are:
    PC-UK0079  PC-UK0082  PC-UK0083  PC-UK0085  PC-UK0086  PC-UK0088
    PC-UK0089  PC-UK0090  PC-UK0091  PC-UK0092  PC-UK0093  PC-UK0094
    PC-UK0095  PC-UK0096  PC-UK0097  PC-UK0098  PC-UK0102  PC-UK0106
    PC-UK0108  PC-UK0112  PC-UK0114  PC-UK0117  PC-UK0118  PC-UK0120
    PC-UK0122  PC-UK0126  PC-UK0129  PC-UK0130  PC-UK0133  PC-UK0134
    PC-UK0135  PC-UK0137  PC-UK0138  PC-UK0141  PC-UK0142  PC-UK0143
    PC-UK0144  PC-UK0147  PC-UK0150  PC-UK0151  PC-UK0152  PC-UK0153
    PC-UK0154  PC-UK0155  PC-UK0156  PC-UK0158  PC-UK0159  PC-UK0161
    PC-UK0162  PC-UK0163  PC-UK0164  PC-UK0165  PC-UK0171

Side effects in VLAs

The UK requires the issue of side effects in variably-modified type
declarations and type names to be addressed. A number of proposals
have previously been produced to this end, such as those in
PC-UK0226 and PC-UK0250.

The minimum requirement is that, for any given piece of code, the
code either violates a constraint, or else all implementations
produce the same result (in the absence of any unspecified behaviour
in the code not related to the use of variably-modified types). In
particular, it is not acceptable for side effects to be optional.

The UK preference is to have side effects work normally in
variably-modified types. It would be acceptable for a constraint to
forbid certain operators (such as ++ and function call) within array
sizes within a sizeof expression.

Changes to restrict

There are four basic issues to be addressed:

1. The current specification of restrict disallows aliasing of
   unmodified objects, which renders some natural and useful
   programs undefined without promoting optimization. This is
   contrary to the prior art in Fortran.

2. If a restricted pointer points to an object with const-qualified
   type, the current specification allows casting away the const
   qualifier and modifying the object. Disallowing such
   modifications promotes optimization, as illustrated in example F
   below.
3. The current specification does not address the effect of
   accessing objects through pointers of various types, all based on
   a restricted pointer. In particular, these objects are supposed
   to determine an array, but the element type is not specified.

4. The specification of realloc now states that the old object is
   freed, and a new object is allocated. The old and new objects
   cannot, in general, be viewed as being members of an array of
   such objects. With the current specification, this appears to
   prohibit the use of the restrict qualifier for a pointer that
   points to an object that is realloc'd. There are also related
   issues for dynamically allocated linked lists.

The following changes would address these, though it is accepted
that further discussion may be able to improve them. In these
changes, new text is in bold and removed text in italics.

In 6.7.3.1, amend paragraph 1 to read:

    Let D be a declaration of an ordinary identifier that provides a
    means of designating an object P as a restrict-qualified pointer
    to objects of type T.

Change paragraph 4 to:

    During each execution of B, let A be the array object that is
    determined dynamically by all references through pointer
    expressions based on P. Then all references to values of A shall
    be through pointer expressions based on P. Let L(X,B) denote the
    set of all lvalues that are used to access object X during a
    particular execution of B. If T is const-qualified, and the
    address of one lvalue in L(X,B) is based on P, then X shall not
    be modified during the execution of B. If T is not
    const-qualified, the address of one lvalue in L(X,B) is based on
    P, and X is modified during the execution of B, then the
    addresses of all lvalues in L(X,B) shall be based on P.
    The requirement in the previous sentence applies recursively,
    with P in place of X, with each access of X through an lvalue in
    L(X,B) treated as if it modified the value of P, and with other
    restricted pointers, associated with B or with other blocks, in
    place of P. Furthermore, if P is assigned the value of a pointer
    expression E that is based on another restricted pointer object
    P2, associated with block B2, then either the execution of B2
    shall begin before the execution of B, or the execution of B2
    shall end prior to the assignment. If these requirements are not
    met, then the behavior is undefined.

Alternative wording for the last new sentence ("The requirement ...")
is:

    If X is modified, the requirement in the previous sentence
    applies recursively: P is treated as if it were itself modified
    and replaces X in L(X,B), then the same condition shall apply to
    other restricted pointers, associated with B or with other
    blocks, in place of P in the previous sentence.

Finally, WG14 may wish to consider the following additional change
(rationale is available separately). In 6.7.5.3 paragraph 6, change:

    A declaration of a parameter as "array of type" shall be
    adjusted to "pointer to type",

to:

    A declaration of a parameter as "array of type" shall be
    adjusted to "restrict-qualified pointer to type",

and add a new paragraph after paragraph 6:

    As far as the constraints of restrict-qualification are
    concerned (6.7.3.1), a parameter that is a complete array type
    shall be regarded as a pointer to an object of the complete
    array size; for all other purposes, its type shall be as
    described above.

Issues with floating point

Floating-point Unimplementabilities and Ambiguities

The UK comments on CD1 included a large number of comments that have
not been addressed in the FCD. Discussions on the reflector indicate
that many of the new features in the language are intended to make
sense only if Annex F or Annex H is in force.
This is not reasonable, not least because it makes the main body of
the standard meaningful only in the context of an informative annex.
It is not reasonable to claim that such problems do not matter
because they cannot be shown in strictly conforming programs. The
same applies to the new features in their entirety, because they are
defined only in certain implementation-defined cases. And the same
applies to almost all error and exception handling, even in C89.

The list of architectures which will have major trouble with the new
proposals includes the IBM 360/370/390 (including the Hitachi S-3600
and others), the NEC SX-4, the DEC VAX, the Cray C90 and T90, the
Hitachi SR2201, the DEC Alpha (to a certain extent) and almost
certainly many others. Implementors on these will interpret the
standard in many, varied and incompatible ways, because they CANNOT
implement the current wording in any way that makes sense.

For similar reasons, these new features are impossible to use in a
portable program, because it is not possible to determine what they
mean unless __STDC_IEC_559__ is set. This is not reasonable for
features defined in the main body of the standard.

The standard must be improved so that all such arithmetic-dependent
features are shielded in some suitable way: by a predefined
preprocessor macro, moved to an optional annex, defined so that all
reasonable implementations can support them, or defined to permit an
implementation to indicate that they are not supported. It does not
really matter which approach is adopted.

The following suggestions should remove the worst of the problems,
mostly using the last approach. In most cases, they are trivial
extensions that merely permit the implementor to return an error
indication if the feature cannot be provided, or wording to change
ill-defined behaviour into implementation-defined behaviour.
7.6 Floating-point environment

Item A

The C standard should not place constraints on a program that are
not determinable by the programmer, and the second and third items
of paragraph 2 do. Many systems use floating-point for some integer
operations or handle some integer exceptions as floating-point -
e.g. dividing by zero may raise SIGFPE, and integer multiplication
or division may actually be done by converting to floating-point and
back again.

Either the clause "or unless the function is known not to use
floating point" should be dropped in both cases, or a determinable
condition should be imposed, such as by adding the extra item:

    - any function call defined in the headers or defined elsewhere
      to access a floating-point object is assumed to have the
      potential for raising floating-point exceptions, unless the
      documentation states otherwise.

This requires most of the functions in <math.h> to handle exceptions
themselves, if they use floating-point, but that is assumed by much
existing code. It has the merit of at least being determinable,
which the existing wording isn't.

Item B

There is another serious problem, even on systems with IEEE
arithmetic, in that the interaction of the flag settings with
setjmp/longjmp is not well-defined. Should they have the value when
setjmp was invoked, when longjmp was called, or what? Worse still,
the current wording does not forbid setjmp to be invoked with
non-standard modes and longjmp called with default ones, which won't
work in general.

Another item of paragraph 2 should be added:

    - if the setjmp macro is invoked with non-default modes, the
      behaviour is undefined. The values of the exception flags on
      return from a setjmp macro invocation are unspecified.
Item C

A related one concerns the case where a function with FENV_ACCESS
set on calls one with FENV_ACCESS set off - the current wording
implies that the latter must set the flags precisely for the benefit
of the former, which is a major restriction on implementors and
makes a complete mockery of footnote 163. The second last sentence
of paragraph 2 should be changed to:

    If part of a program sets or tests flags or runs under
    non-default mode settings, ....

7.6.2 Exceptions

These specifications do not allow the implementation any way to
indicate failure. This is (just) adequate for strict IEEE
arithmetic, but is a hostage to fortune and prevents their use for
several IEEE-like arithmetics. All such implementations can do is to
not define the macros, thus implying that they cannot support the
functions, whereas they may be able to support all reasonable use of
the functions and merely fail in some perverse circumstances.

All of these functions (excluding fetestexcept) should be defined
with a return type of int, and to return zero if and only if the
call succeeded.

7.6.3.1 The fegetround function

What happens if the rounding mode is none of the ones defined above,
or is not determinable (as can occur)? The following should be added
to the end of paragraph 3:

    If the rounding mode does not match a rounding direction macro
    or is not determinable, a negative value is returned.

7.6.3.2 The fesetround function

Many existing C implementations and even more numerical libraries
have the property that they rely on the rounding mode they are
called with being the one they were built for. To use a different
rounding mode, the user must link with a separate library. The
standard should permit an implementation to reject a change if the
change is impossible, as distinct from when it does not support that
mode at all.

Paragraph 3 should be simplified to:

    The fesetround function returns a nonzero value if and only if
    the requested rounding direction has been established.
Note that this enables the example to make sense, which it doesn't
at present.

7.6.4 Environment

Exactly the same points apply as for 7.6.2 Exceptions above for all
the functions (excluding feholdexcept), and exactly the same
solution should be adopted.

7.12 Mathematics

Item A

A major flaw in paragraphs 4 and 5 is that there is no way of
specifying an infinity or a NaN for double or long double, unless
float supports them. While this is the common case, C9X does not
require it and it is not reasonable to do so. In particular, the NEC
SX-4 supports IEEE, Cray and IBM arithmetics, and there are a lot of
IEEE systems around which have non-IEEE long double; this case
cannot be fully supported, either.

'float' should be changed to 'double' in paragraph 4 and the
following should be added to it:

    The macros
        INFINITY_F
        INFINITY_L
    are respectively float and long double analogs of INFINITY.

'float' should be changed to 'double' in paragraph 5 and the
following should be added to it:

    The macros
        NAN_F
        NAN_L
    are respectively float and long double analogs of NAN.

Item B

The classification macros are inadequate to classify all numbers on
many architectures - for example, the IBM 370 has unnormalised
numbers and the DEC VAX has invalid ones (i.e. not NaNs). 5.2.4.2.2
and 7.6 permit other implementation-defined values, but this section
does not. The following should be appended to paragraph 6:

    Additional floating-point classifications, with macro
    definitions beginning with FP_ and an uppercase letter, may also
    be specified by the implementation.

7.12.1 Treatment of error conditions

I have no idea what "without extraordinary roundoff error" means,
and I have been involved in the implementation and validation of
mathematical functions for over 3 decades. My dictionary contains 5
definitions of "extraordinary", most of which might be applicable,
and I know at least 3 separate meanings of the term "roundoff error"
in the context of mathematical functions.
The following paragraph should be added:

    If a function produces a range error to avoid extraordinary
    roundoff error, the implementation shall define the conditions
    when this may occur.

7.12.3.1 The fpclassify macro

As mentioned above, the current wording forbids an implementation
from correctly classifying certain IEEE, IBM 370 and DEC VAX
numbers. The first sentence of paragraph 2 should have appended:

    ..., or into another implementation-defined category.

7.12.3.2 The signbit macro

The wording of this is seriously flawed. It says that it returns the
sign of a number, but it is clearly intended to test the sign bit,
and these are NOT equivalent. IEEE 754 states explicitly that it
does not interpret the sign of NaNs (see section 6.3), and the VAX
distinguishes zeroes from reserved operands (not NaNs, but something
much more illegal) by the sign bit. And there is nowhere else in C9X
that requires the sign of a floating-point number to be held as a
bit - surely people have not yet forgotten ones' and twos'
complement floating point?

Paragraphs 1 and 2 should be replaced by:

    For valid non-zero values (including infinities but not NaNs),
    the signbit macro returns nonzero if and only if the sign of its
    argument is negative. For zeroes and NaNs when __STDC_IEC_559__
    is defined, the signbit macro returns nonzero if and only if the
    sign bit of the value is set. For zeroes and NaNs when
    __STDC_IEC_559__ is not defined, the signbit macro returns
    nonzero for an implementation-defined set of values and zero
    otherwise.

7.12.11.1 The copysign functions

What does "treat negative zero consistently" mean? Does IBM 370
arithmetic do it? Does VAX? Does Intel? Does Cray? Does IEEE? The
sentence "On implementations ... the sign of zero as positive."
should be replaced by one or the other of the following:

    Unless __STDC_IEC_559__ is defined (see Annex F), it is
    implementation-defined whether any representations of zero are
    regarded as negative by this function and, if so, which.
or:

    The copysign functions shall interpret the sign of zeroes in the
    same way that the signbit macro (7.12.3.2) does.

Floating-point Incompatibilities with Other Standards

The UK comments on CD1 included a large number of comments that have
not been addressed in the FCD with regard to compatibility with IEC
60559 (IEEE 754) and ISO/IEC 10967-1 (LIA-1). It is not reasonable
to claim that such problems do not matter because they cannot be
shown in strictly conforming programs. The same applies to almost
all of the trickier aspects of C89 and C9X floating-point support.

The responses stated that the intention is compatibility only with a
subset of those standards, but those standards do not always allow
the subsetting required by C9X. Furthermore, the statement is not
true in all cases, and it is impossible for an implementation to
conform to both standards simultaneously.

The standard must be improved so that an implementation can
reasonably satisfy both standards simultaneously, in all aspects
where C9X claims that it is compatible with the other standards.
Where this is not possible, C9X should admit the fact in so many
words or provide a mechanism for alternate implementation.

There is also the major point that C9X can and should specify syntax
for such support, in cases where this would avoid implementations
providing it incompatibly. This will then reduce problems if C
wishes to support the feature properly at a later revision. A
precedent for this is signal handling, which effectively defines
syntax but leaves the semantics almost completely undefined.

Furthermore, there are many places where C9X makes it unnecessarily
difficult to satisfy the other standards, and where minor changes
would have major benefits. These should be improved, and the
forthcoming ISO/IEC 10967-2 (LIA-2) should also be considered in
this respect.

The following suggestions should remove the worst of the problems,
but are by no means a complete solution.
5.2.4.2.2 Characteristics of floating types

Paragraph 5 doesn't define precisely what the rounding mode terms mean, and there are many possible interpretations (especially of nearest rounding for numbers that fall precisely between two values). Note that this is specified by IEC 60559 but explicitly not by ISO/IEC 10967-1. However, the latter requires the rounding method to be documented in full (see section 8, paragraph f). The following should be added to the end of the last sentence:

Unless __STDC_IEC_559__ is defined (see Annex F), the precise definition of these rounding modes is implementation-defined.

7.3.2 Conventions

This comment is not strictly an incompatibility, but is about wording likely to cause such problems. The current description of errno handling is so confusing that it could be interpreted that errno is unpredictable on return from a complex function. This cannot be the intention. The second sentence should be replaced by:

An implementation may define domain and range errors, in which case it will set errno to EDOM or ERANGE respectively and set the result to an implementation-defined value, but it is not required to do so.

7.6 Floating-point environment
7.12.1 Treatment of error conditions
Annex F IEC 60559 floating-point arithmetic
Annex H Language Independent Arithmetic

One of the assumptions in the IEC 60559 model is that exception flags will eventually be either cleared or diagnosed, and this is required by ISO/IEC 10967-1. Fortran does not specify what may be written to 'standard error', but C does, and many vendors regard the standard as forbidding them from issuing diagnostics in this case. H.3.1.1 states that C permits an implementation to do this, but provides no hint as to how. Furthermore, there is no implication in the standard that floating-point exception flags must have any particular values at any time.
The following should be added to 7.6:

If any of the exception flags are set on normal termination after all calls to functions registered by the atexit function have been made (see 7.20.4.3), and stderr is still open, the implementation may write some diagnostics indicating the fact to stderr.

If this is not done, then Annex H must be corrected, or clarified to explain how such a diagnostic can be produced by a conforming implementation.

7.12 Mathematics
F.2.1 Infinities, signed zeroes and NaNs
F.3 Operators and functions

Section 7.12 paragraphs 5 and 6 and F.3 are seriously incompatible with the spirit of IEC 60559, and are in breach of IEEE 754 section 6.2, by not providing any way to define a signalling NaN or test whether a NaN is signalling or quiet. In particular, an implementation cannot extend the fpclassify function to conform to both standards, because C9X requires it to classify both signalling and quiet NaNs as FP_NAN, and IEC 60559 requires it to distinguish them. Furthermore, the current C9X situation does not allow a programmer to initialise data to signalling NaNs (as recommended by IEEE 754).

It is perfectly reasonable not to define the behaviour of signalling NaNs in general, but it is not reasonable to be unnecessarily hostile to IEC 60559. At the very least, there should be a macro NANSIG for creating one, which can be used in initialisers, and a macro FP_NANSIG for classifying one. There are also implementation difficulties with the wording of fpclassify as it stands, especially since it may need to copy its argument, and this is not always possible for signalling NaNs.

7.12 should have the extra paragraph:

The macro NANSIG is defined if and only if the implementation supports signalling NaNs for the double type. It expands to a constant expression of type double representing an implementation-defined signalling NaN.
If defined, NANSIG may be used as an initializer (6.7.8) for an object of semantic type double; no other semantics for NANSIG values are defined by this standard.

The macros NANSIG_F and NANSIG_L are respectively float and long double analogs of NANSIG. Note that it is not possible to have solely a float value of NANSIG, because of the constraints on copying signalling NaN values.

7.12 paragraph 6 should define the extra symbol:

FP_NANSIG

and add the extra sentence:

This standard does not specify whether the argument of fpclassify is copied or not, in the sense used by IEC 60559.

F.2.1 paragraph 1 needs replacing by:

The NAN, NAN_F, NAN_L, NANSIG, NANSIG_F, NANSIG_L, INFINITY, INFINITY_F and INFINITY_L macros provide designations for IEC 60559 NaNs and infinities.

F.3 last paragraph (starting "The signbit macro") should be simplified by the omission of the exclusion in brackets - i.e. "(except that fpclassify does not distinguish signalling from quiet NaNs)".

7.12.1 Treatment of error conditions

Paragraph 2 is in clear conflict with the stated intention of IEC 60559 and ISO/IEC 10967-1, and actually prevents an implementation from conforming to both C9X and the whole of ISO/IEC 10967-1 simultaneously. Despite this, H.3.1.2 paragraph 1 claims that the C standard allows "hard to ignore" trapping and diagnostics as an alternative form of notification (as required by ISO/IEC 10967-1), but it specifically FORBIDS this in many routines of the library that provide the ISO/IEC 10967-1 functionality (as described in H.2.3.2). This is unacceptable.

While there are many possible solutions, this problem is extremely pervasive, and most of them would involve extensive changes to C9X. However, SOMETHING needs to be done, and the following are possibilities:

1. To remove the erroneous claims of ISO/IEC 10967-1 support from Annex H.
2. To define a pragma to select between the mode where errno is returned and modes where ISO/IEC 10967-1 "hard to ignore" trapping and diagnostics are used. Unfortunately, the changes would be extensive.

3. To add the following to 7.12.1:

An implementation shall provide a mechanism for programs to be executed as described above. It may also provide a mechanism by which programs are executed in a mode in which some or all domain and range errors raise signals in an implementation-defined fashion.

Recommended practice

If domain errors raise a signal, the signal should be SIGILL. If range errors raise a signal, the signal should be SIGFPE. It should be possible for the program to run in a mode where domain errors and range errors that correspond to overflow raise signals, but range errors that correspond to underflow do not.

Alternatively, people might prefer to use SIGFPE for both classes of error; there are arguments both ways, and either choice is reasonable.

F.9 Mathematics

Paragraph 4 is seriously incompatible with the spirit of IEC 60559 and the wording of ISO/IEC 10967-1. Note that 7.12.1 paragraphs 2 and 3 permit an implementation to define additional domain and range error conditions, but this section does not. Paragraph 4 should be changed to:

The invalid exception will be raised whenever errno is set to EDOM. Subsequent subclauses of this annex specify other circumstances when the invalid or divide-by-zero exceptions are raised.

There is also a possible ambiguity in paragraphs 5 and 6, and a problem caused by cases where the implementation may define extra range errors as permitted by 7.12.1. It should be closed by adding the following:

Whenever errno is set to ERANGE, at least one of the divide-by-zero, overflow or underflow exceptions shall be raised.
H.3.1 Notification alternatives
H.3.1.2 Traps

ISO/IEC 10967-1 6.1.1 point (c) requires the ability to permit the programmer to specify code to compensate for exceptions if trap-and-resume exception handling is used. C does not permit such code, but H.3.1 paragraph 4 claims that it does. In particular, there is no way to return a corrected value after a numeric (SIGFPE) signal. Paragraphs 4 of H.3.1 and H.3.1.2 must be corrected, so that they do not claim that C9X supports ISO/IEC 10967-1 trap-and-resume exception handling.

H.3.1.2 paragraph 4 claims that the C standard allows trap-and-terminate as well as trap-and-resume. This is not true, either, as C9X stands. In particular, it does not permit it with exponentF and scaleF implemented using logb and scalbn etc. Either such termination must be permitted, or paragraphs 4 of H.3.1 and H.3.1.2 must be corrected; a suggestion is made for the former elsewhere in this proposal.

Details of PC-UK02xx issues

Category 1

PC-UK0201
Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 4
Title: Further requirements on the conformance documentation
Detailed description:

The Standard requires an implementation to be accompanied by documentation of various items. However, there is a subtle difference between the terms "implementation-defined" and "described by the implementation" which has been missed by this wording (this is partly due to the tightening up of the uses of this term between C89 and C9X - see for example subclause 6.10.6). As a result, the wording does not actually require the latter items to be documented. Change the paragraph to:

An implementation shall be accompanied by a document that describes all features that this International Standard requires to be described by the implementation, including all implementation-defined characteristics and all extensions.
PC-UK0244
Category: Inconsistency
Committee Draft subsection: 6.2.5, 6.5.3.4, 6.7
Title: Issues with prototypes and completeness.
Detailed description:

6.2.5p23 says "An array type of unknown size is an incomplete type". Is the type "int [*]" (which can only occur within a prototype) complete or incomplete? If it is complete, then what is its size? This can occur in the construct

int f (int a [sizeof (int [*][*])]);

If it is incomplete, then the type "int [*][*]" is not permitted, which is clearly wrong.

Now consider the prototype:

int g (int a []);

The parameter clearly has an incomplete type, but since a parameter is an object (see 3.16) this is forbidden by 6.7p7:

If an identifier for an object is declared with no linkage, the type for the object shall be complete by the end of its declarator, or by the end of its init-declarator if it has an initializer.

This is also clearly not what was intended.

One way to fix the first item would be to change 6.5.3.4p1 to read:

[#1] The sizeof operator shall not be applied to an expression that has function type or an incomplete type, to
|| an array type with unspecified size 72a),
to the parenthesized name of such a type, or to an lvalue that designates a bit-field object.

|| 72a) An array type with unspecified size occurs in function
|| prototypes when the notation [*] is used, as in "int [*][5][*]".

One way to fix the second item would be to change 6.7p7 to read:

If an identifier for an object is declared with no linkage, the type for the object shall be complete by the end of its declarator, or by the end of its init-declarator if it has an initializer;
|| in the case of function arguments (including in prototypes) this shall
|| be after making the adjustments of 6.7.5.3 (from array and function
|| types to pointer types).
PC-UK0246
Committee Draft subsection: 6.2.5, 6.7.2.2
Title: Circular definition of enumerated types
Detailed description:

6.7.2.2 para 4 says:

Each enumerated type shall be compatible with an integer type.

However, 6.2.5 para 17 says:

The type char, the signed and unsigned integer types, and the enumerated types are collectively called integer types.

Thus we have a circular definition. To fix this, change the former to one of:

Each enumerated type shall be compatible with a signed or unsigned integer type.

or:

Each enumerated type shall be compatible with a standard integer type or an extended integer type.

PC-UK0248
Committee Draft subsection: 6.3.2.3
Title: Null pointer constants should be castable to pointer types
Detailed description:

6.3.2.3p3 says that:

If a null pointer constant is assigned to or compared for equality to a pointer, the constant is converted to a pointer of that type. Such a pointer, called a null pointer,

However, this doesn't cover cases such as:

(char *) 0

which is neither an assignment nor a comparison. Therefore this does not yield a null pointer, but rather performs an implementation-defined conversion from an integer to a pointer. This is clearly an oversight and should be fixed. Either change:

If a null pointer constant is assigned to or compared for equality to a pointer, the constant is converted to a pointer of that type. Such a pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

to:

If a null pointer constant is assigned to or compared for equality to a pointer, the constant is converted to a pointer of that type. When a null pointer constant is converted to a pointer, the result (called a /null pointer/) is guaranteed to compare unequal to a pointer to any object or function.

or change:

If a null pointer constant is assigned to or compared for equality to a pointer, the constant is converted to a pointer of that type.
Such a pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
[#4] Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall compare equal.

to:

If a null pointer constant is assigned to or compared for equality to a pointer, the constant is converted to a pointer of that type. A /null pointer/ is a special value of any given pointer type that is guaranteed to compare unequal to a pointer to any object or function. Conversion of a null pointer constant to a pointer type, or of a null pointer to another pointer type, yields a null pointer of that type. Any two null pointers shall compare equal.

PC-UK0272
Committee Draft subsection: 6.5.9
Title: Tidy up of pointer comparison
Detailed description:

Clause 6.5.8, para 5-6: The original wording (6.3.8) contained a paragraph between these two. The first sentence of this paragraph has been moved to paragraph 5. The second sentence ("If two pointers to object or incomplete types compare equal, both point to the same object, or both point one past the last element of the same array object.") does not appear to occur in the FCD. The sentence needs to be in the FCD so that all cases are covered.

Append to 6.5.9 para 6:

Otherwise the pointers compare unequal.

PC-UK0273
Committee Draft subsection: 6.7.5.3
Title: Forbid incomplete types in prototypes
Detailed description:

Clause 6.7.5.3, para 8: This allows constructs such as:

void f1(void, char);
struct t;
void f2(struct t);

Allowing incomplete types in prototypes is only necessary to support [*] (another UK proposal would make this a complete type). If certain incomplete types are allowed in prototypes they need to be explicitly called out. Otherwise the behaviour should be made a constraint violation.

Remove the words "may have incomplete type and" from the cited paragraph.
Words should be added somewhere to make it clear that [*] arrays are complete types; for example, in 6.7.5.2 para 3 change:

... in declarations with function prototype scope.111) ...

to

... in declarations with function prototype scope 111); such arrays are nonetheless complete types ...

See also PC-UK0244.

PC-UK0277
Committee Draft subsection: 6.2.6.1, 6.5.2.3
Title: Effects on other members of assigning to a union member
Detailed description:

6.5.2.3p5 has wording concerning the storing of values into a union member:

With one exception, if the value of a member of a union object is used when the most recent store to the object was to a different member, the behavior is implementation-defined.

When this wording was written, "implementation-defined" was interpreted more loosely and there was no other relevant wording concerning the representation of values. Neither of these is the case any more. The requirement to be implementation-defined means that an implementation must ensure that all stored values are valid in the types of all the other members, and eliminates the possibility of them being trap representations. It also makes it practically impossible to have trap representations at all. This is not the intention of other parts of the Standard.

It turns out that the wording of 6.2.6.1 is sufficient to explain the behavior in these circumstances. Type punning by unions causes the same sequence of bytes to be interpreted according to different types. In general, 6.2.6.1 makes it clear that bit patterns can be trap values, and so the programmer can never be sure that the value is safe to use in a different type. One special case that should be considered is the following:

union {
    unsigned short us;
    signed int si;
    unsigned int ui;
} x;

If x.si is set to a positive value, the requirements of 6.2.5 and 6.2.6.1 will mean that x.ui holds the same value with the same representation. This appears to be a reasonable assumption.
A similar thing happens if x.ui is set to a value between 0 and INT_MAX. If x.si is set to a negative value, or x.ui to a value greater than INT_MAX, the other might be set to a trap representation if there are any padding bits; if there are none, then it must be the case that the extra bit in x.ui overlaps the sign bit of x.si. Finally, if x.ui is set to some value and x.us does not have any padding bits and does not overlap any padding bits of x.ui, then x.us will have some value determinable from the arrangement of bits in the two types (this might be the low bits of x.ui, the high bits, or some other combination). None of these cases should be particularly surprising.

The cited wording in 6.5.2.3 merely muddles the issue by implying that all possible members take sensible (non-trap) values. It should be removed; the rest of the paragraph can stand alone.

PC-UK0278
Committee Draft subsection: 7.19.8.1, 7.19.8.2
Title: Clarify the actions of fread and fwrite
Detailed description:

The exact behaviour of fread and fwrite is not well specified, particularly on text streams but in fact even on binary streams. For example, the wording does not require the bytes of the object to be output in the order they appear in memory, but would allow (for example) byte-swapping. It is reported that at least one implementation converts to a text form such as uuencoding on output to a text stream, converting back on input. And, finally, there is not even a requirement that data written out by fwrite and read back by an equivalent call to fread has the same value. These changes apply the obvious semantics.

In 7.19.8.1p2, add after the first sentence:

For each object, /size/ calls are made to the /fgetc/ function and the results stored, in the order read, in an array of /unsigned char/ exactly overlaying the object.
In 7.19.8.2p2, add after the first sentence:

For each object, /size/ calls are made to the /fputc/ function, taking the values (in order) from an array of /unsigned char/ exactly overlaying the object.

PC-UK0286
Committee Draft subsection: 7.6, 7.6.3
Title: Inconsistencies in fesetround
Detailed description:

It is not clear from the text whether an implementation is required to allow the floating-point rounding direction to be changed at runtime. The wording of 7.6p7 implies that those modes defined must be settable at runtime ("... supports getting and setting ..."), but if this is the case then 7.6.3.2p4 (example 1) would have no need to call assert on the results of the fesetround call, since that call could not fail (if FE_UPWARD is not defined the code is in error). On the other hand, 7.6.3.2p2 implies that setting succeeds whenever the argument is one of the defined FE_ macros, and in any case 7.6.3.2p3 is ambiguous.

Even if the mode cannot be set at runtime, it may be the case that the code is compiled under more than one mode, and it is therefore convenient to be able to retrieve the current mode.

Option A
--------
If the intention is that there may be rounding modes that can be in effect but cannot be set by the program, then make the following changes:

Change 7.6p7 to:

Each of the macros

FE_DOWNWARD
FE_TONEAREST
FE_TOWARDZERO
FE_UPWARD

is defined if the implementation supports this rounding direction and it might be returned by the fegetround function; it might be possible to set this direction with the fesetround function. Additional rounding directions ... [unchanged]

Change 7.6.3.2p2 to:

The fesetround function attempts to establish the rounding direction represented by its argument round. If the new rounding direction cannot be established, it is left unchanged.

Change 7.6.3.2p3 to:

The fesetround function returns a zero value if and only if the new rounding direction has been established.
Example 1 is valid in this option, though it may be desirable to replace the assert by some other action.

Option B
--------
If the intention is that an implementation must allow the mode to be set successfully to any FE_* macro that is defined, then make the following changes:

Change 7.6.3.2p3 to:

The fesetround function returns a zero value if and only if the argument is equal to a rounding direction macro defined in the header (that is, if and only if the requested rounding direction is one supported by the implementation).

In 7.6.3.2p4, change the lines:

setround_ok = fesetround (FE_UPWARD);
assert(setround_ok);

to:

assert(defined (FE_UPWARD));
fesetround(FE_UPWARD); // Can't fail

and delete the declaration of setround_ok.

PC-UK0287
Committee Draft subsection: 7.13, 7.13.2.1
Title: Clarify what the setjmp "environment" is
Detailed description:

Much of the state of the abstract machine is not stored in "objects" within the meaning of the Standard. It needs to be clear that such state is not saved and restored by the setjmp/longjmp combination.

Append to 7.13p2:

The environment of a call to the setjmp macro consists of information sufficient for a call to the longjmp function to return execution to the correct block and invocation of that block (if it were called recursively). It does not include the state of the floating-point flags, of open files, or of any other component of the abstract machine.

In 7.13.2.1p3, change:

All accessible objects have values as of the time ...

to

All accessible objects have values, and all other components of the abstract machine [*] have the same state as of the time ...

[*] This includes but is not limited to the floating-point flags and the state of open files.

It also needs to be clear that optimisers need to take care. Consider the following code:

jmp_buf env;
int v [N * 2];
for (int i = 0; i < N; i++) {
    v [2*i] = 0;
    if (setjmp (env))
        // ...
    v [2*i+1] = f ();  // might call longjmp; note i hasn't changed
}

This might be optimised as if written as:

jmp_buf env;
int v [N * 2];
for (ii = 0; ii < 2 * N; ) {
    v [ii] = 0;
    if (setjmp (env))
        // ...
    ii++;
    v [ii] = f ();     // might call longjmp
    ii++;
}

Such code would be allowed to make ii indeterminate after a longjmp, but the original code would not. It should be made clear, perhaps through an example, that such an optimisation must allow for the possibility of a longjmp happening.

Category 2

PC-UK0209
Committee Draft subsection: 6.10.3
Title: Add a __VA_COUNT__ facility for varargs macros
Detailed description:

Unlike with function calls, it is trivial for an implementation to determine the number of arguments that match the ... in a varargs macro. There are a number of useful things that can be done with this (at the least, providing argument counts to varargs functions). Therefore this information should be made available to the macro expansion.

In 6.10.3p5, change

The identifier /__VA_ARGS__/ ...

to:

The identifiers /__VA_ARGS__/ and /__VA_COUNT__/ ...

Append to 6.10.3.1p2:

An identifier /__VA_COUNT__/ that occurs in the replacement list shall be replaced by a single token which is the number of trailing arguments (as a decimal constant) that were merged to form the variable arguments.

PC-UK0214
Committee Draft subsection: 6.2.4, plus scattered other changes
Title: Better terminology for object lifetimes
Detailed description:

The term "lifetime" is used at a few places in the Standard but never defined. Meanwhile a number of places use circumlocutions such as "while storage is guaranteed to be reserved". These would be much easier to read if the term "lifetime" was defined and used.

Make the following changes to subclause 6.2.4. Delete paragraph 5 and insert a new paragraph between 1 and 2:

The /lifetime/ of an object is the portion of program execution during which storage is guaranteed to be reserved for that object.
An object exists and retains its last-stored value throughout its lifetime. Objects with static or automatic storage duration have a constant address throughout their lifetime.23) If an object is referred to outside its lifetime, the behavior is undefined. The value of a pointer is indeterminate after the end of the lifetime of the object it points to.

Change paragraphs 2 to 4 (which will become 3 to 5) to:

[#2] An object whose identifier is declared with external or internal linkage, or with the storage-class specifier static, has static storage duration. The lifetime of the object is the entire execution of the program. Its stored value is initialized only once.

[#3] An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration. For objects that do not have a variable length array type, the lifetime extends from entry into the block with which it is associated until execution of the block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively a new object is created each time. The initial value of the object is indeterminate; if an initialization is specified for the object, it is performed each time the declaration is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.

[#4] For objects that do have a variable length array type, the lifetime extends from the declaration of the object until execution of the program leaves the scope of that declaration.24) If the scope is entered recursively a new object is created each time. The initial value of the object is indeterminate.

Other changes:

In 5.1.2p1 change "in static storage" to "with static storage duration".
Change footnote 9 to:

9) In accordance with 6.2.4, a call to exit will remain within the lifetime of objects with automatic storage duration declared in main but a return from main will end their lifetime.

Delete 5.1.2.3p5 as it just duplicates material in 6.2.4p3-4.

Change the last portion of 6.5.2.5p17 to:

of the loop only, and on entry next time around p would be pointing to an object outside of its lifetime, which would result in undefined behavior.

Change the last portion of footnote 72 to:

and the address of an automatic storage duration object after the end of its lifetime.

Change the first sentence of 6.7.3.1p5 to:

Here an execution of B means the lifetime of a notional object with type /char/ and automatic storage duration associated with B.

Add to 7.20.3 a second paragraph:

The lifetime of an object allocated by the calloc, malloc, or realloc functions extends from the function call until the object is freed by the free or realloc functions. The object has a constant address throughout its lifetime except when moved by a call to the realloc function.

The last sentence of 7.20.3p1 is redundant and could be deleted. Relevant bullet points in annex K should also be changed.

PC-UK0222
Committee Draft subsection: 6.7.2.1
Title: Bitfields of unsupported types should require a diagnostic.
Detailed description:

If a bitfield is declared with a type other than /_Bool/ or plain, signed, or unsigned int, the behavior is undefined. Since this can easily be determined at compile time, a diagnostic should be required. It is reasonable to exempt other integer types that the implementation knows how to handle.

Add to the end of 6.7.2.1p3:

A bit-field shall have a type that is a qualified or unqualified version of /_Bool/, /signed int/ or /unsigned int/, or of some other implementation-defined integer type.

Delete the first sentence of 6.7.2.1p8.

Note that this wording allows additional implementation-defined bitfield types so long as they are integers.
If they are not, the behaviour would not be defined by the Standard, and so a diagnostic would still be required; an implementer can still allow non-integer bitfield types as an extension, but a diagnostic must be issued.

PC-UK0232
Committee Draft subsection: 7.19.2, 7.24.3.5, 7.24.6
Title: Better locale handling for wide oriented streams
Detailed description:

7.19.2p6 associates an /mbstate_t/ object with each stream, and 7.19.3p11-13 state that this is used with the various I/O functions. On the other hand, 7.24.6p3 places very strict restrictions on the use of such objects, restrictions that cannot be met through the functions provided in the Standard while allowing convenient use of wide formatted I/O.

Furthermore, an /mbstate_t/ object is tied to a single locale based on the first time it is used. This means that a wide oriented stream is tied to the locale in use the first time it is read or written. This will be surprising to many users of the Standard.

Therefore, at the very least these objects should be exempt from the restrictions of 7.24.6; the restrictions of 7.19 (for example, 7.19.2p5 bullet 2) are sufficient to prevent unreasonable behaviour. In addition, the locale used with the object should be fixed, not affected by the current locale. The most sensible way to do this is to use the locale in effect when the file is opened, but allow /fwide/ to override this.

In 7.19.2p6, add after the first sentence:

This object is not subject to the restrictions on direction of use and of locale that are given in subclause 7.24.6. All conversions using this object shall take place as if the /LC_CTYPE/ category setting of the current locale is the setting that was in effect when the orientation of the stream was set with the /fwide/ function or, if this has not been used, when the stream was opened with the /fopen/ or /freopen/ function.
In 7.24.3.5, add a new paragraph after paragraph 2:

If the stream is successfully made wide oriented, the /LC_CTYPE/ category that is used with the /mbstate_t/ object associated with the stream shall be set to that of the current locale.

In 7.24.6p3, append:

These restrictions do not apply to the /mbstate_t/ objects associated with streams.

PC-UK0245
Committee Draft subsection: 6.2.5, 6.7
Title: Problems with flexible array members
Detailed description:

Sometime after CD1 the following wording was added to 6.2.5p23:

A structure type containing a flexible array member is an incomplete type that cannot be completed.

Presumably this was done to eliminate some conceptual problems with structures that contain such members. However, this change makes almost all use of such structures forbidden, because it is no longer possible to take their size, and it is unclear what other operations are valid. This was also not the intent behind the original proposal. On the other hand, if such a structure is a complete type, there are a number of issues to be defined, such as what happens when the structure is copied or initialized. These need to be addressed.

The wording defining flexible array members is in 6.7.2.1p15:

[#15] As a special case, the last element of a structure with more than one named member may have an incomplete array type. This is called a flexible array member, and the size of the structure shall be equal to the offset of the last element of an otherwise identical structure that replaces the flexible array member with an array of unspecified length.95) When an lvalue whose type is a structure with a flexible array member is used to access an object, it behaves as if that member were replaced with the longest array, with the same element type, that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array.
If this array would have no elements, then it behaves as if it had one element, but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it. A solution to the problem is to leave the structure as complete but have the flexible member ignored in most contexts. To do this, delete the last sentence of 6.2.5p23, and change 6.7.2.1p15 as follows: [#15] As a special case, the last element of a structure with more than one named member may have an incomplete array type. This is called a flexible array member. With two exceptions the flexible array member is ignored. Firstly, the size of the structure shall be equal to the offset of the last element of an otherwise identical structure that replaces the flexible array member with an array of unspecified length.95) Secondly, when the . or -> operator has a left operand which is, or is a pointer to, a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array, with the same element type, that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, then it behaves as if it had one element, but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it. Finally, add further example text after 6.7.2.1p18: The assignment: *s1 = *s2; only copies the member n, and not any of the array elements.
Similarly:

    struct s t1 = { 0 };            // valid
    struct s t2 = { 2 };            // valid
    struct ss tt = { 1, { 4.2 }};   // valid
    struct s t3 = { 1, { 4.2 }};    // error; there is nothing
                                    // for the 4.2 to initialize
    t1.n = 4;                       // valid
    t1.d[0] = 4.2;                  // undefined behavior

PC-UK0254 Committee Draft subsection: 7.8 Title: Missing functions for intmax_t values Detailed description: Several utility functions have versions for types int and long int, and when long long was added corresponding versions were added. Then when intmax_t was added to C9X, further versions were provided for some of these functions. However, three cases were missed. For intmax_t to be useful to the same audience as other features of the Standard, these three functions should be added. Obviously they should be added to <inttypes.h>. Add a new subclause 7.8.3:

7.8.3 Miscellaneous functions

7.8.3.1 The atoimax function

Synopsis
        #include <inttypes.h>
        intmax_t atoimax(const char *nptr);

Description
The atoimax function converts the initial portion of the string pointed to by nptr to intmax_t representation. Except for the behaviour on error, it is equivalent to
        strtoimax(nptr, (char **)NULL, 10)
The function atoimax need not affect the value of the integer expression errno on an error. If the value of the result cannot be represented, the behavior is undefined.

Returns
The atoimax function returns the converted value.

7.8.3.2 The imaxabs function

Synopsis
        #include <inttypes.h>
        intmax_t imaxabs(intmax_t j);

Description
The imaxabs function computes the absolute value of an integer j. If the result cannot be represented, the behavior is undefined.

Returns
The imaxabs function returns the absolute value.

7.8.3.3 The imaxdiv function

Synopsis
        #include <inttypes.h>
        imaxdiv_t imaxdiv(intmax_t numer, intmax_t denom);

Description
The imaxdiv function computes numer/denom and numer%denom in a single operation.

Returns
The imaxdiv function returns a structure of type imaxdiv_t, comprising both the quotient and the remainder.
The structure shall contain (in either order) the members quot (the quotient) and rem (the remainder), each of which has the type intmax_t. If either part of the result cannot be represented, the behavior is undefined. 7.8 paragraph 2 will need consequential changes. PC-UK0274 Committee Draft subsection: 6.3.1.3 Title: Clarify the semantics of integer conversions Detailed description: Clause 6.3.1.3, para 2. The bit about converting the type, in C90 (6.2.1.2), has been deleted. What happens when an object of type int having a value of -1 is assigned to an unsigned long? If the adding and subtracting described in para 2 is performed on the int object the resulting value will be different than if the original value was first promoted to long (assuming int and long have a different width). Add footnote (see footnote 28 in C90) stating that the rules describe arithmetic on the mathematical value, not on the value of a given type of expression. PC-UK0275 Committee Draft subsection: 6.6 Title: Lacuna in sizeof/VLA interactions in constant expressions Detailed description: Clause 6.6, para 8: This paragraph does not contain the extra wording on sizeof "... whose results are integer constants, ..." found in para 6. This needs to be added. PC-UK0279 Committee Draft subsection: 6.2.6.2 Title: Remove or clarify one's complement and sign-and-magnitude Detailed description: Subclause 6.2.6.2p2 makes it clear that there are only three permitted representations for signed integers - two's complement, one's complement, and sign-and-magnitude. It is reported, however, that certain historical hardware using the latter two have problems with the "minus zero" representation. Software not written with minus zero in mind can also run into problems; for example, the expressions: 0 & 1 or (-2 + 2) & 1 might turn out to be true because a minus zero has happened (bit operators are defined to act on the bit patterns, so this is an issue).
It is inconvenient to have to code defensively around this problem, and most programmers are probably not even aware of it. However, enquiries have failed to identify any existing platform that does not use two's complement, and so the time may have come to require it as part of C. This approach is addressed in option A below. If WG14 is not willing to do this, the changes in option B deal with the issues of minus zero, by forbidding it from appearing unexpectedly.

Option A
--------
Change the last part of 6.2.6.2p2 from:
    If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, then the value shall be modified in one of the following ways:
    -- the corresponding value with sign bit 0 is negated;
    -- the sign bit has the value -2^N;
    -- the sign bit has the value 1-2^N.
to:
    The sign bit shall have the value -2^N.[*]
and add the footnote:
    [*] This is often known as 2's complement.
Consequential changes will be required in 5.2.4.2.1, 7.18.2, and 6.7.7, and possibly elsewhere.

Option B
--------
Change the last part of 6.2.6.2p2 from:
    If the sign bit is one, then the value shall be modified in one of the following ways:
    -- the corresponding value with sign bit 0 is negated;
    -- the sign bit has the value -2^N;
    -- the sign bit has the value 1-2^N.
to:
    If the sign bit is one, then the value shall be modified in one of the following ways:
    -- the corresponding value with sign bit 0 is negated (/sign and magnitude/);
    -- the sign bit has the value -2^N (/two's complement/);
    -- the sign bit has the value 1-2^N (/one's complement/).
    The implementation shall document which shall apply, and whether the value with sign bit 1 and all value bits 0 (for the first two), or with sign bit and all value bits 1 (for one's complement) is a trap representation or a normal value. In the case of sign and magnitude and one's complement, if this representation is a normal value it is called a /negative zero/.
and insert two new paragraphs immediately afterwards:
    If the implementation supports negative zeros, then they shall only be generated by:
    - the & | ^ ~ << and >> operators with appropriate arguments;
    - the + - * / and % operators where one argument is a negative zero and the result is zero;
    - compound assignment operators based on the above cases.
    It is unspecified whether these cases actually generate negative zero or normal zero, and whether a negative zero becomes a normal zero or remains a negative zero when stored in an object.
    If the implementation does not support negative zeros, the behavior of an & | ^ ~ << or >> operator with appropriate arguments is undefined.

PC-UK0281 Committee Draft subsection: 6.10 Title: Parsing ambiguity in preprocessing directives Detailed description: Consider parsing the following text during the preprocessing phase (translation phase 4):
    # if 0
    xxxx
    # else
    yyyy
    # endif
The third line fits the syntax for the first option of group-part, and thus generates two possible parsings. One of these will cause both text lines to be skipped, while the other only causes the second to be skipped. To fix this ambiguity, in the syntax in 6.10p1, change group-part to:
    group-part:
            non-directive new-line
            if-section
            control-line
and add:
    non-directive:
            pp-tokens/opt
Then add a new paragraph to the Constraints, after 6.10p3:
    If the first preprocessing-token (if any) in a non-directive is /#/, the second shall be an identifier other than one of those that appears in the syntax in this subclause. Such a non-directive shall only appear in a group that is skipped.
This change has two (deliberate) side-effects: unknown preprocessing directives require a diagnostic if not skipped, and in any case cannot affect the state of conditional inclusion. If such a directive is recognised by the implementation, it can still interpret it in any desired way after outputting the diagnostic.
PC-UK0282 Committee Draft subsection: 6.10.8 Title: provide a __STDC_HOSTED__ macro Detailed description: There is currently no way for a program to determine if the implementation is hosted or freestanding. A standard predefined macro should be provided. Add to the list in 6.10.8p1: __STDC_HOSTED__ The decimal constant 0 if the implementation is a freestanding one and the decimal constant 1 if it is a hosted one. Note: it has been suggested that this is difficult to provide when there is an independent preprocessor because it will not know what language the compiler is handling or what library is available, but the same points apply to the standard headers, to __STDC_VERSION__, to __STDC_IEC_559__, and so on; if these can be handled correctly by such an implementation, so can __STDC_HOSTED__. PC-UK0283 Committee Draft subsection: 7.14.1.1, 7.20.4 Title: _Exit function Detailed description: As part of a working paper (N789), I suggested that C provide an _exit() function like that in POSIX, and signal handlers should be allowed to call this function. The Menlo Park meeting agreed to add this function unless an unresolvable technical issue was found that would make it not conformant to POSIX. The Santa Cruz meeting decided not to include this function because they felt that there was a possibility of conflict with POSIX. The functionality is still needed, as without it there is no safe way to leave a signal handler, and so it is being resubmitted with a new name in the implementer's namespace.
In 7.14.1.1p5, change:
    or the signal handler calls any function in the standard library other than the /abort/ function or the /signal/ function
to:
    or the signal handler calls any function in the standard library other than the /abort/ function, the /_Exit/ function, or the /signal/ function
Add a new subclause within 7.20.4:

7.20.4.X The _Exit function

Synopsis
        #include <stdlib.h>
        void _Exit(int status);

Description
The /_Exit/ function causes normal program termination to occur, and control to be returned to the host environment. No functions registered by the /atexit/ function or signal handlers registered by the /signal/ function are called. The /_Exit/ function never returns to the caller. The status returned to the implementation is determined in the same manner as for the /exit/ function. It is implementation-defined whether open output streams are flushed, open streams closed, or temporary files removed.

PC-UK0284 Committee Draft subsection: 6.10.3 Title: Problems with extended characters in object-like macros Detailed description: When an object-like macro is #defined, there is no requirement for a delimiter between the macro identifier and the replacement list. This can be a problem when extended characters are involved - for example, some implementations view $ as valid in a macro identifier while others do not. Thus the line:
    #define THIS$AND$THAT(x) ((x)+42)
can be parsed in either of two ways:
    Identifier        Arguments   Replacement list
    THIS              -           $AND$THAT(x) ((x)+42)
    THIS$AND$THAT     x           ((x)+42)
TC1 addressed this by requiring the use of a space in certain circumstances so as to eliminate the ambiguity. However, this requirement has been removed in C9X for good reasons. Regrettably this reintroduces the original ambiguity. The proposed solution is to require that the macro identifier in the definition of object-like macros be followed by white space or by one of the basic graphic characters that is not ambiguous.
Most code already uses white space and such code will not be affected. Code such as: #define x+y z which actually means #define x +y z will also not be affected by the first option. The only cases that are affected will require a diagnostic, thus eliminating the ambiguity. Insert a new Constraint in 6.10.3, preferably: In the definition of an object-like macro there shall be white space between the identifier and the replacement list unless the latter begins with one of the 26 graphic characters in the basic character set other than ( _ or \. (or equivalent wording) or alternatively: In the definition of an object-like macro there shall be white space between the identifier and the replacement list. PC-UK0285 Committee Draft subsection: 7.19.5.1 Title: Clarify meaning of a failed fclose Detailed description: If a call to fclose() fails it is not clear whether: - it is still possible to access the stream; - whether fflush(NULL) will attempt to flush the stream; - whether it is safe to destroy a buffer allocated to the stream by setvbuf(). There are two possibilities: a failed fclose can leave the stream open as far as the program is concerned, or it can leave it closed (irrespective of the state of the underlying entity). The first case is a problem if the close fails part way through, as it might not be possible to reinstate the status of the stream. Therefore the second is better, because the implementation can always carry out those parts of the cleanup that are visible to the program (such as the second and third items above). The existing wording - read strictly - also requires the full list of actions to be carried out successfully whether or not the call fails. This is clearly an oversight. Change 7.19.5.1p2 to read: A successful call to the fclose function causes the stream pointed to by stream to be flushed and the associated file to be closed. 
Any unwritten buffered data for the stream are delivered to the host environment to be written to the file; any unread buffered data are discarded. Whether or not the call succeeds, the stream is disassociated from the file and any buffer set by the setbuf or setvbuf function is disassociated from the stream, and if the latter was automatically allocated, it is deallocated. Note that this does not require anything outside the control of the implementation to take place if the call fails, while still leaving the program in a safe state. PC-UK0261 Committee Draft subsection: 6.10.8 Title: Distinguishing C89 from C9X Detailed description: Because of the widespread and important changes between C89 and C9X, it is very important for an application to be able to determine which language the implementation is supporting. __STDC_VERSION__ may have been intended for this purpose, but is not entirely reliable and has the wrong properties. Amongst other faults, it is nowhere stated that it will be increased, or even continue to have integer values, so it is not possible to test in a program designed for long-term portability. There are two possible solutions to this. The first is to use the value of __STDC__ as an indicator of the C language variant, by adding wording like:
    __STDC__ shall be set to 2, to indicate the C language described in this document, rather than 1, which indicated the language described in ISO/IEC 9899:1990 and ISO/IEC 9899:AMD1:1995.
That is an entirely reliable indicator of whether an implementation conforms to C89 or C9X, and follows existing practice (many vendors use a value of 0 to indicate K&R/standard intermediates).
A second approach is to define the meaning of __STDC_VERSION__ more precisely, by adding wording like: It is the intention of this standard that the value of __STDC_VERSION__ may be used to determine which revision of the standard an implementation is conforming to, and that it will remain a constant of type long that is increased at each revision. That is not entirely reliable, but would probably do. PC-UK0262 Committee Draft subsection: 6.3.1.3 and 6.10.6 Title: Detecting C89/C9X incompatibilities Detailed description: Because of the change in the status of the long and unsigned long types, it is very important to be able to detect when an application was conforming in C89 and is undefined in C9X, or has a different effect in C89 and C9X. The following should be added to 6.3.1.3: 6.3.1.3.1 The C89_MIGRATION pragma The C89_MIGRATION pragma can be used to constrain (if the state is on) or permit (if the state is off) integer conversions from higher ranks than long or unsigned long to types that are explicitly declared as either long or unsigned long. Each pragma can occur either outside external declarations or preceding all explicit declarations and statements inside a compound statement. When outside external declarations, the pragma takes effect from its occurrence until another C89_MIGRATION pragma is encountered, or until the end of the translation unit. When inside a compound statement, the pragma takes effect from its occurrence until another C89_MIGRATION pragma is encountered (within a nested compound statement), or until the end of the compound statement; at the end of a compound statement the state for the pragma is restored to its condition just before the compound statement. If this pragma is used in any other context, the behavior is undefined. The default state (on or off) for the pragma is implementation-defined. 
For the purposes of the C89_MIGRATION pragma, a type is explicitly declared as long or unsigned long if its type category (6.2.5) is either long or unsigned long and either of the following conditions is true:
- The type specifier in the declaration or type name that defines the type is long or unsigned long, in any of the equivalent forms described in 6.7.2.
- The type specifier in the declaration or type name that defines the type is a typedef name which is not defined in a standard header and whose type satisfies the previous condition. This rule shall be applied recursively.

Constraints
If the state of the C89_MIGRATION pragma is on, no value with a signed integer type of higher integer conversion rank than long or with an unsigned integer type of higher integer conversion rank than unsigned long shall be converted to a type that is explicitly declared as either type long or unsigned long. If the state of the C89_MIGRATION pragma is on, no function with a type that does not include a prototype shall be called with an argument that has a signed integer type of higher integer conversion rank than long or an unsigned integer type of higher integer conversion rank than unsigned long.

Recommended practice
A similar constraint should also be applied to programs that use conversion specifiers associated with long or unsigned long (e.g. %ld or %lu) for integer values or variables of a higher rank, where this can be checked during compilation.

The following should be added to 6.10.6:
    #pragma STDC C89_MIGRATION on-off-switch

PC-UK0269 Committee Draft subsection: 5.1.2.3 Title: Ambiguity in what is meant by "storing" Detailed description: The standard assumes the concept of "storing" a value in many places (e.g. when floating-point values must be converted to their target type) but nowhere defines it. It is not obvious that argument passing is a storage operation.
Some wording like the following should be added after paragraph 3 in 5.1.2.3: The data model used in the abstract machine is that all objects are sequences of bytes in memory, and that assignment (including to register objects, argument passing etc.) consists of storing data in those bytes. PC-UK0270 Committee Draft subsection: 7.20.4.2, 7.20.4.3 Title: Ambiguity in when exit calls atexit functions Detailed description: It is unclear whether exit calls functions registered by atexit as if it were a normal function, or whether it may unwind the stack to the entry to main before doing so. This affects whether it is legal to call longjmp to leave an atexit function to return to a location set up by a call of setjmp before the call of exit. This should be clarified, which could include making it explicitly undefined. Category 3 PC-UK0227 Committee Draft subsection: 6.7.7 Title: Correct ranges of bitfields in an example Detailed description: In 6.7.7p6, example 3 describes the ranges of various bit-fields in terms of "at least the range". This is because C89 was not clear on what the permitted ranges of integer types were.
These ranges are now tightly specified by 6.2.6.2, and so the wording of this example should be altered accordingly:
- change "at least the range [-15, +15]" to "either the range [-15, +15] or the range [-16, +15]"
- change "values in the range [0, 31] or values in at least the range [-15, +15]" to "values in one of the ranges [0, 31], [-15, +15], or [-16, +15]"

PC-UK0249 Committee Draft subsection: 6.4 Title: UCNs as preprocessing-tokens Detailed description: In 6.4 the syntax for "preprocessing-token" includes:
    identifier
    each universal-character-name that cannot be one of the above
In 6.4.2.1 the syntax for "identifier" includes:
    identifier:
            identifier-nondigit
            identifier identifier-nondigit
            identifier digit
    identifier-nondigit:
            nondigit
            universal-character-name
            other implementation-defined characters
Therefore a universal-character-name is always a valid identifier preprocessing token, and so the second alternative can never apply. It is true that 6.4.2.3p3 makes certain constructs undefined, but this does not alter the tokenisation. There are two ways to fix this situation. The first is to delete the second alternative for preprocessing-token. The second would be to add text to 6.4p3, or as a footnote, along the following lines:
    The alternative "each universal-character-name that cannot be one of the above" can never occur in the initial tokenisation of a program in translation phase 3. However, if an identifier includes a universal-character-name that is not listed in Annex I, the implementation may choose to retokenise using this alternative.

PC-UK0251 Committee Draft subsection: 6.8.5 Title: Error in new for syntax Detailed description: C9X adds a new form of syntax for for statements:
    for ( declaration ; expr-opt ; expr-opt ) statement
However, 6.7 states that /declaration/ *includes* the trailing semicolon. The simplest solution is to remove the corresponding semicolon in 6.8.5 and not worry about the informal use of the term in 6.8.5.3p1.
Alternatively the syntax needs to be completely reviewed to allow the term to exclude the trailing semicolon.

PC-UK0252 Committee Draft subsection: 6.9 Title: References to sizeof not allowing for VLAs Detailed description: 6.9p3 and p5 use sizeof without allowing for VLAs. In each case, change the parenthetical remark:
    (other than as a part of the operand of a sizeof operator)
to:
    (other than as a part of the operand of a sizeof operator which is not evaluated)

PC-UK0256 Committee Draft subsection: 7.23.3.7 Title: Wrong time system notation used Detailed description: In 7.23.3.7p2, the expression "UTC-UT1" appears. This should read "TAI-UTC".

PC-UK0257 Committee Draft subsection: 7.25.2.1, 7.25.3 Title: ISO10646 to/from wchar_t conversion functions Detailed description: Often programs that manipulate C source code are themselves written in C. The purpose of these changes is to make it easier for such programs to handle universal character names, specified in input files not source files, portably. They can also be used for interpreting data files and suchlike, although the preferred way to do this is to use the appropriate locale; thus, there is no functionality for converting several wide characters at a time. The mapping functions below could be implemented by writing:
    wchar_t u2wc[] = { L'\U00000000', L'\U00000001', L'\U00000002', ... }
    wint_t toiso10646wc(long iso10646) { return u2wc[iso10646]; }
and the reverse for towciso10646, except that implementation limits will usually prohibit such a large array. The functions can be trivially defined to return -1 or WEOF always, although this is not recommended. This can happen, for instance, if the wide character set in use does not have any characters which have known equivalents in ISO10646. It may happen that even if a wide character does have an equivalent in ISO10646, it is unreasonable for the runtime library to know about it, and in such cases the functions may return -1 or WEOF (this is a quality-of-implementation issue).
The names of the functions are chosen to not tread on anybody's namespace. `long' is chosen because int_fast32_t need not be defined by wctype.h. I would have used (long)WEOF instead of -1 as the error return for towciso10646, but (long)WEOF might be a valid result: for instance, wchar_t is 64 bits, WEOF is 0xFFFFFFFF00000000ll, long is 32 bits. These changes apply to the committee draft of August 3, 1998. Add after section 7.25.2.1.11 "The iswxdigit function":

7.25.2.1.12 The iswiso10646 function

Synopsis
        #include <wctype.h>
        int iswiso10646(wint_t wc);

Description
The /iswiso10646/ function tests for those characters for which /towciso10646/ would not return -1.

Add to section 7.25.2.2.1 "The iswctype function":
    iswctype(wc, wctype("iso10646"))  // iswiso10646(wc)

Add after section 7.25.3.2.2 "The wctrans function":

7.25.3.3 Wide-character ISO10646 mapping functions

The function /towciso10646/ and the function /toiso10646wc/ convert wide characters to and from ISO10646 code points.

7.25.3.3.1 The towciso10646 function

Synopsis
        #include <wctype.h>
        long towciso10646(wint_t wc);

Description
The /towciso10646/ function returns the ISO10646:1993 code point corresponding to /wc/, or -1. If /towciso10646/ does not return -1, then /toiso10646wc(towciso10646(wc))/ returns /wc/.

Recommended Practice
/towciso10646(L'\Unnnnnnnn')/ returns /0xnnnnnnnnl/ when /\Unnnnnnnn/ is a universal character name that corresponds to a wide character. /towciso10646/ does not return -1 for wide characters corresponding to those required in the basic execution character set.

7.25.3.3.2 The toiso10646wc function

Synopsis
        #include <wctype.h>
        wint_t toiso10646wc(long iso10646);

Description
The toiso10646wc function returns the wide character corresponding to the ISO10646:1993 code point /iso10646/, or /WEOF/. If /toiso10646wc/ does not return /WEOF/, then /towciso10646(toiso10646wc(iso10646))/ returns /iso10646/.
PC-UK0276 Committee Draft subsection: various Title: Assorted editorial changes Detailed description: Each of these changes stands alone.
[A] Clause 3.11: Change to "unspecified behaviour where each implementation shall document the behaviour for that implementation."
[B] Clause 3.13: documents -> shall document
[C] Clause 6.3.2.2: When is an expression "evaluated as a void expression"? The original (6.2.2.2) wording is much clearer and should continue to be used.
[D] Clause 6.4.3, para 2: Change "... required character set." to "... required source character set."? But they may also apply to the execution character set. Which required character set is being required? [We believe that the term "required" is being replaced.]
[E] Clause 6.1.1.2, para 5. Remove. The standard is not in the business of specifying quality of implementation diagnostics.
[F] Clause 6.5, para 4: Change "... are required to have ..." back to "... shall have ...".
[G] Clause 6.5.2.2, para 4: Change "An argument may be ..." to "An argument shall be ...".
[H] Footnote 85: Delete. It appears to be a glorified forward reference.
[I] Clause 6.8.4.2, para 3: Change "... expression and no two ..." back to "... expression. No two ...".

PC-UK0263 Committee Draft subsection: 7.18.3 and 7.19.1 Title: Support for data management Detailed description: Because of the change in the status of the long and unsigned long types, there is a need for an efficient data type that can be used to perform calculations on mixed file sizes and data object sizes. The obvious candidate is to define an off_t that is compatible with POSIX, but which makes sense on non-POSIX systems.
The following should be added to 7.19.1 after paragraph 2: off_t which is a signed integer type of integer conversion rank not less than that of any of long, size_t or ptrdiff_t, and capable of holding both the size of any object allocated by the malloc function (7.20.3.3) and the maximum number of bytes that a conforming application can write to a file opened with mode "wb+" and read back again in an arbitrary order. Recommended practice The off_t type should be capable of holding the total size of all accessible objects at any instant during the execution of a program. The reason for the above (apparently contorted) wording is to allow off_t to be a 32-bit type on a system where long, size_t, ptrdiff_t and the size of files are all 32-bit quantities (e.g. traditional Unix), but to require it to be longer if any of those are larger or if any single object larger than 2 GB can actually be allocated. Note that Unix pipes and similar objects have never had a definite limit on their size. The following should be added to 7.18.3: OFF_MIN -2147483647 OFF_MAX +2147483647 Note that POSIX defines off_t to be only a signed arithmetic type, and not an integer one, but traditional practice (and consistency) requires that it be integral. To the best of my knowledge, no implementation that conforms with POSIX has ever defined off_t to be a floating type. The above specification is therefore what existing POSIX practice is, though framed in words that are not specific to POSIX. This type should also have a flag character defined for use in conversion specifiers in 7.19.6.1, 7.19.6.2, 7.24.2.1, 7.24.2.2. As the obvious letters were all already taken in C89, it does not matter much what it is. PC-UK0264 Committee Draft subsection: 7.8.2 Title: functions for intmax_t and uintmax_t Detailed description: Because of the change in the status of the long type, it is necessary to change many or all uses of that type in many important C89 programs to other types (often intmax_t). 
It is highly desirable that it should be straightforward to do this by automatic textual processing, but should still produce an efficient result; one obstacle to this is the lack of equivalents to the labs and ldiv functions for the maximum length types. The following should be added to 7.8, after paragraph 1:
    It declares the type maxdiv_t which is a structure type that is the type of the value returned by the maxdiv function.
And the following should be added to 7.8.2:

7.8.2.3 The maxabs function

Synopsis
[#1]    #include <inttypes.h>
        intmax_t maxabs(intmax_t j);

Description
[#2] The maxabs function computes the absolute value of an integer j. If the result cannot be represented, the behavior is undefined.

Returns
[#3] The maxabs function returns the absolute value.

7.8.2.4 The maxdiv function

Synopsis
[#1]    #include <inttypes.h>
        maxdiv_t maxdiv(intmax_t numer, intmax_t denom);

Description
[#2] The maxdiv function computes numer / denom and numer % denom in a single operation.

Returns
[#3] The maxdiv function returns a structure of type maxdiv_t, comprising both the quotient and the remainder. The structure shall contain (in either order) the members quot (the quotient) and rem (the remainder), each of which has the same type as the arguments numer and denom. If either part of the result cannot be represented, the behavior is undefined.

PC-UK0265 Committee Draft subsection: 7.19.6.1, 7.19.6.2, 7.24.2.1, 7.24.2.2 Title: Use a better flag character for intmax_t and uintmax_t Detailed description: In C89 Future Library Directions, it was said that "Lower-case letters may be added to the conversion specifiers in fprintf and fscanf. Other characters may be used in extensions." Some implementors have ignored this, which is only to be expected. However, C9X has bent over backwards to support such gratuitously perverse implementations. In addition to establishing a very bad precedent, this is a long term drain on resources.
Some of us have to teach C and assist with the debugging of other people's code; every important but non-mnemonic facility takes extra time and increases errors. Because of the importance of conversions using the intmax_t and uintmax_t types, their flag character should be memorable. If the committee really feels that 'm' is unacceptable, then it should be 'z' and the specifier for size_t should be 'j'. 'z' is at least a common convention for the final item of a sequence.

There should also be a flag character defined for the off_t type, because of its importance in data manipulation code and as a migration path. As the obvious letters were all already taken in C89, it does not matter much what it is.

____ end of United Kingdom Comments; beginning of USA Comments ____

From: Matthew Deane

The US National Body votes to Approve with comments ISO/IEC FCD 9899, Information Technology - Programming languages - Programming Language C (Revision of ISO/IEC 9899:1990). See comments below.

===========================================

Author: Douglas Walls

Comment 1.
Category: Inconsistency
Committee Draft subsection: 7.20.4.2 The atexit function, 7.20.4.3 The exit function
Title: atexit call after exit should be undefined
Detailed description:
The description of the atexit function (7.20.4.2p2) states that the atexit function registers the function pointed to by its argument, to be called without arguments at normal program termination. The description of the exit function (7.20.4.3p3) states that first, all functions registered by the atexit function are called, in the reverse order of their registration. Neither the description of atexit nor that of exit adequately defines how the following program should behave:

   void f1(void) {}
   void f2(void) { atexit(f1); }
   int main(void) { atexit(f2); exit(0); }

Suggested fix: Add a sentence to 7.20.4.3p3, the exit function, stating: If a call to atexit occurs after a call to exit, the behavior is undefined.

John Hauser

Comment 1.
Category: Normative change to existing feature retaining the original intent?
Committee Draft subsection: F.9.1.4
Title: atan2 of infinite and zero magnitude vectors
Detailed description:
(All numbers in this comment are floating-point.)

Problem: Section F.9.1.4 defines

   atan2(+0,+0)                -> +0
   atan2(+0,-0)                -> +pi
   atan2(-0,+0)                -> -0
   atan2(-0,-0)                -> -pi
   atan2(+infinity,+infinity)  -> +pi/4
   atan2(+infinity,-infinity)  -> +3*pi/4
   atan2(-infinity,+infinity)  -> -pi/4
   atan2(-infinity,-infinity)  -> -3*pi/4

Unfortunately, all of these results, while plausible, are in no way determinate. Note, for example, that any value from +0 to +pi/2 is an equally plausible result for atan2(+infinity,+infinity). Defining atan2 as above is tantamount to the decision made for APL that 0/0 would be 1. The hope presumably is that these values will be innocuous for most uses. In contrast, the IEEE Standard makes it a rule that indeterminate cases signal an invalid exception and return a NaN. The APL decision has since been regretted, and this one may be too if it prevails.

Fix: Define the above cases as domain errors.

----------------------------------------------------------------

Comment 2.
Category: Normative change to existing feature retaining the original intent?
Committee Draft subsection: F.9.4.4, 7.12.8.4, and F.9.5.4
Title: Infinite results from pow and tgamma
Detailed description:
(All numbers in this comment are floating-point.)

Problem: Section F.9.4.4 defines all of the following to return +infinity:

   pow(x,+infinity)   x < -1
   pow(x,-infinity)   -1 < x < 0
   pow(-0,y)          y < 0 and y not an integer
   pow(-infinity,y)   y > 0 and y not an integer

Consider, for example, pow(-3,+infinity). We can infer that this value has infinite magnitude, but unless we can assume the +infinity exponent is an even or odd integer, we can't say anything conclusive about the sign of the result. All the cases above share the property that the correct result is at best ambiguous between +infinity and -infinity.
Currently, CD2 makes the dubious choice of forcing the sign of the result positive, rather than taking the safer route of returning a NaN. It's not clear that these cases are important enough to warrant bending the rules. (Does pow(-3,+infinity) really come up so often that efficiency is more important than correctness?)

Interestingly, the opposite choice was made for the tgamma function. For all nonpositive integers x, tgamma(x) could be taken as either +infinity or -infinity, with no way to choose between them, just as for pow above. However, Section 7.12.8.4 makes this case an explicit domain error, and Section F.9.5.4 confirms this decision.

Possible fixes: Make the pow cases listed above domain errors, consistent with tgamma's treatment of infinite results. Doing so would also make pow(x,0.5) almost identical to sqrt(x). (The remaining difference would be the IEEE mistake that has sqrt(-0) -> -0.) If that isn't possible, change tgamma to be consistent with pow, returning +infinity for the cases now listed as domain errors.

------------------------------------------------

Comment 3.
Category: Normative change to existing feature retaining the original intent?
Committee Draft subsection: F.9.9.1 and 7.12.12.1
Title: fdim of two infinities
Detailed description:
(All numbers in this comment are floating-point.)

Problem: As defined by Section 7.12.12.1, fdim(x,y) returns x-y if that's positive, and otherwise returns +0. Unfortunately, as written the definition gives

   fdim(+infinity,+infinity)  -> +0
   fdim(-infinity,-infinity)  -> +0

which is at odds with the fact that x-x is indeterminate if x is infinite. As with atan2 (in another comment), it appears that a plausible but essentially arbitrary result has been selected for convenience, in violation of the IEEE Standard precept that indeterminate cases return NaN and signal the invalid exception.

Fix: Define the above cases as domain errors.

------------------------------------------------------------------

Comment 4.
Category: Normative change to intent of existing feature
Committee Draft subsection: F.9.9.2, F.9.9.3, 7.12.12.2, and 7.12.12.3
Title: fmax and fmin of NaN arguments
Detailed description:
Problem: Sections F.9.9.2 and F.9.9.3 state that if exactly one argument to fmax or fmin is a NaN, the NaN argument is to be ignored and the other argument returned. A footnote in Section 7.12.12.2 further clarifies that NaN arguments are treated as missing data.

This treatment of NaNs contradicts the semantics given by the IEEE Standard, under which a NaN may represent an indeterminate value, the calculation of which has failed due to an invalid exception. In such cases, the NaN cannot necessarily be dismissed as "missing data." For reasons of correctness, all IEEE operations -- and for that matter all other CD2 functions -- require that any NaN argument be propagated as the result of the function or operation if possible. (It will not be possible if the function result type is not floating-point. An exception is made for bit-manipulating functions such as copysign, and for cases when the function result is fully determined by the other arguments.) It is both risky and unnecessarily complicating for fmax and fmin to deviate from this convention.

Fix: Define fmax and fmin to propagate NaNs in the usual way. The only cases for which NaNs don't propagate are

   fmax(+infinity,NaN)  -> +infinity
   fmin(-infinity,NaN)  -> -infinity

and the same with the arguments reversed.

David R Tribble

Comment 1.
Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 7.18
Title: [u]intN_t names
Detailed description:
Section 7.18 describes the exact-, minimum-, and fastest-width integer types. While I agree that these types are very useful, I feel that their names are misleading. It can be argued that programmers are more likely to use the 'intN_t' names than the other names, if for no other reason than because they are short.
This has the potential of creating problems for programmers who are compiling their programs on machines that do not provide efficient N-bit representations or do not provide them at all. It can also be argued that programmers who use the 'intN_t' types more than likely really meant to use the 'int_leastN_t' types. Such programmers probably want a type with a guaranteed minimum number of bits rather than an exact number of bits. It can be argued further that use of the 'intN_t' names is not portable, because they may not exist in all implementations. (An implementation is not required to provide these types.)

The main point behind all these arguments is that a short name such as 'int8_t' should represent the most common and the most useful integer "width" type, and that the "exact width" meaning is inappropriate for it. For these reasons, the names of the exact-width and least-width types should be changed. Instead of 'int8_t', we should have 'int_exact8_t', and instead of 'int_least8_t', we should have 'int8_t' (and so forth for the other type names). The standard integer type names then become:

   Exact-width     Minimum-width   Fastest minimum-width
   -------------   -------------   ---------------------
   int_exact8_t    int8_t          int_fast8_t
   int_exact16_t   int16_t         int_fast16_t
   int_exact32_t   int32_t         int_fast32_t
   int_exact64_t   int64_t         int_fast64_t
   uint_exact8_t   uint8_t         uint_fast8_t
   uint_exact16_t  uint16_t        uint_fast16_t
   uint_exact32_t  uint32_t        uint_fast32_t
   uint_exact64_t  uint64_t        uint_fast64_t

The benefits of these names over the current names are:

1. The 'intN_t' types always exist in conforming implementations.

2. The 'intN_t' types have a more intuitive meaning; use of them indicates the need for integers of well-known minimum widths.

3. Use of the 'intN_t' types covers the most common use of such types, and thus the existence of a short, convenient name is reasonable.

4.
Use of the 'int_exactN_t' types indicates a real need for integers with exactly known widths; this is probably a rare need, and thus the existence of a bulkier type name is acceptable.

For consistency, the macros in sections 7.18.2.1 and 7.18.2.2 should also be renamed accordingly:

7.18.2.1 Limits of exact-width integer types

   INT_EXACT8_MIN    INT_EXACT8_MAX    UINT_EXACT8_MAX
   INT_EXACT16_MIN   INT_EXACT16_MAX   UINT_EXACT16_MAX
   INT_EXACT32_MIN   INT_EXACT32_MAX   UINT_EXACT32_MAX
   INT_EXACT64_MIN   INT_EXACT64_MAX   UINT_EXACT64_MAX

7.18.2.2 Limits of minimum-width integer types

   INT8_MIN    INT8_MAX    UINT8_MAX
   INT16_MIN   INT16_MAX   UINT16_MAX
   INT32_MIN   INT32_MAX   UINT32_MAX
   INT64_MIN   INT64_MAX   UINT64_MAX

------------------------------------------------------------------

Eric Rudd

Comment 1.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 7.12.4.4
Title: atan2 function is unclearly specified
Detailed description:

> [#2] The atan2 functions compute the principal value of the
> arc tangent of y/x, using the signs of both arguments to
> determine the quadrant of the return value.

The principal value of the arc tangent has a range from -pi/2 to +pi/2, but atan2() returns a value that has a range from -pi to +pi. The definition in N2794 has further complications in the case where x=0 and the implementation does not support infinities. It is not specified in the existing definition how to use the signs of the arguments to determine the quadrant of the return value. Of course, the underlying assumption is that a vector angle is really being computed. Thus, I would propose replacing the above sentence with the following:

"The atan2 functions compute the angle of the vector (x,y) with respect to the +x axis, reckoning the counterclockwise direction as positive."

Comment 2.
Category: Normative change to intent of existing feature
Committee Draft subsection: 7.12.4.4, 7.12.1
Title: atan2 indeterminate cases
Detailed description:

> A domain error may occur if both arguments are zero.

atan2(0., 0.) and atan2(INF, INF) are both mathematically indeterminate, so, by 7.12.1, a domain error is *required*. Thus, I would propose the following wording:

"A domain error occurs if both arguments are zero, or if both arguments are infinite."

Comment 3.
Category: Request for information/clarification
Committee Draft subsection: 7.12.4.4
Title: atan2 range
Detailed description:
There is a minor problem with the statement of the range of the atan2 function if signed zeros do not exist in an implementation. If |y|=0. and x<0., should the value of atan2(y, x) be -pi or +pi? The range is stated in 7.12.4.4 to be [-pi,+pi], but I don't see how the range can be inclusive at both ends unless signed zeros exist in the implementation.

Comment 4.
Category: Normative change to intent of existing feature
Committee Draft subsection: 7.12.1, Annex F.9.1.4, F.9.4.4
Title: Annex F considered harmful
Detailed description:
Annex F attempts to define numerical return values for the math functions in cases where the underlying mathematical functions are indeterminate. I regard this as a disastrous mistake, especially in light of 7.12.1:

   [#2] For all functions, a domain error occurs if an input
   argument is outside the domain over which the mathematical
   function is defined.

Thus, F.9.1.4 should read

   -- atan2(+/-0, x) returns +/-0, for x>0.
   -- atan2(+/-0, +/-0) returns NaN.
   -- atan2(+/-0, x) returns +/-pi, for x<0.
   -- atan2(y, +/-0) returns pi/2 for y>0.
   -- atan2(y, +/-0) returns -pi/2 for y<0.
   -- atan2(+/-y, INF) returns +/-0, for finite y>0.
   -- atan2(+/-INF, x) returns +/-pi/2, for finite x.
   -- atan2(+/-y, -INF) returns +/-pi, for finite y>0.
   -- atan2(+/-INF, +/-INF) returns NaN.
   -- atan2(y, NaN) returns NaN for any y.
   -- atan2(NaN, x) returns NaN for any x.
and F.9.4.4 should read

   [#1]
   -- pow(NaN, x) returns NaN for any x, even 0.
   -- pow(x, NaN) returns NaN for any x.
   -- pow(+/-0, +/-0) returns NaN.
   -- pow(x, +INF) returns +INF for x>1.
   -- pow(1., +/-INF) returns NaN.
   -- pow(x, +INF) returns +0 for 0<x<1.
   -- pow(x, -INF) returns +0 for x>1.
   -- pow(x, -INF) returns +INF for 0<x<1.
   -- pow(+INF, y) returns +INF for y>0.
   -- pow(+INF, +/-0) returns NaN.
   -- pow(+INF, y) returns +0 for y<0.
   -- pow(-INF, y) returns NaN for any y.

I know that the committee has rebuffed earlier suggestions of this sort, but I hope that the committee will reconsider, since the laws of mathematics *must* take precedence over mere software standards. There is also the common-sense rule that when you don't know the answer, the only responsible reply is "I don't know" (that is, NaN) rather than making up an answer, however plausible.

Comment 5.
Category: Normative change to intent of existing feature
Committee Draft subsection: 7.12.7.4, 7.12.1
Title: pow domain errors
Detailed description:

> [#2] The pow functions compute x raised to the power y. A
> domain error occurs if x is negative and y is finite and not
> an integer value. A domain error occurs if the result
> cannot be represented when x is zero and y is less than or
> equal to zero. A range error may occur.

When x and y are both zero, the result is mathematically undefined, so, by 7.12.1, a domain error is *required*. Thus, I would recommend changing 7.12.7.4 paragraph 2 to:

   [#2] The pow functions compute x raised to the power y. A
   domain error occurs if x is negative and y is finite and not
   an integer value. A domain error occurs if x and y are both
   zero, or if the result cannot be represented when x is zero
   and y is less than zero. A range error may occur.

Comment 6.
Category: Request for information/clarification
Committee Draft subsection: 5.2.4.2.2, Annex F
Title: Rounding behavior in case of ties
Detailed description:
In paragraph 6, rounding is discussed. However, the behavior of round-to-nearest in case of ties has not been specified.
Should it be stated that this is implementation-specified? Annex F refers to IEC 60559, but does not quote from it, so one is left wondering what is to be done in case of ties. I think that the C standard should be reasonably self-contained, and a brief mention of rounding in case of ties would do the trick.

Comment 7.
Category: Request for information/clarification
Committee Draft subsection: 5.2.4.2.2, 6.5, 7.6
Title: "Exception" inadequately defined
Detailed description:
In subsection 5.2.4.2.2 paragraph #3 the term "exception" is used for the first time, without definition or forward reference. Is this a "floating-point exception"? Subsection 6.5, paragraph #5 only confuses things further:

   [#5] If an exception occurs during the evaluation of an
   expression (that is, if the result is not mathematically
   defined or not in the range of representable values for its
   type), the behavior is undefined.

since in 7.6 some attempt is made to define the behavior for floating-point exceptions. A clarification of these terms is needed.

Comment 8.
Category: Request for information/clarification
Committee Draft subsection: 7.12.1
Title: Backward compatibility issue with floating-point exceptions
Detailed description:
Programs written for ISO 9899-1990 do not have access to the floating-point exception mechanism. It was safe to pass any argument to the math functions and test the results later, since, according to 7.5.1 of that document, "Each function shall execute as if it were a single operation, without generating any externally visible exceptions." However, that sentence has disappeared from FCD 9899, which raises compatibility issues. It's OK to have an exception mechanism, but there needs to be a guarantee that programs written for ISO 9899-1990 will not crash because of an exception which is now externally visible (and unhandled).

Comment 9.
Category: Request for information/clarification
Committee Draft subsection: 7.12.1
Title: Domain and range errors
Detailed description:
In paragraphs #2 and #3, the requirement was dropped that errno be set to EDOM or ERANGE to reflect a domain or range error. I am curious as to what use there is in defining domain and range errors, since it appears that there is no longer any specified means for a program to determine whether such an error has occurred. This omission creates compatibility problems for programs conforming to ISO 9899-1990, since they may rely on errno to detect problems occurring during evaluation of math functions.

Lawrence J. Jones

Comment 1.
Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 7.19.4.4
Title: tmpnam
Detailed description:
tmpnam is required to return at least TMP_MAX different names that are not the same as the names of any existing files, but there is no way to ensure that every possible name isn't already in use by existing files. Since there is no license for tmpnam to fail until it has been called more than TMP_MAX times (and, since the behavior in that case is implementation-defined, it may not even be valid then), it seems that tmpnam must not return in this case. This is not very useful behavior. I suggest the following changes:

1) TMP_MAX should be described as the number of possible names that tmpnam can generate (note that any or all of them might match existing file names).

2) tmpnam should be allowed to return a null pointer if no acceptable name can be generated.

3) Calling tmpnam more than TMP_MAX times should not produce implementation-defined behavior. Rather, it should be implementation-defined (or unspecified) whether tmpnam simply fails unconditionally after exhausting all possible names or resets in an attempt to re-use previously generated names.

------------------------------------------------------------------------

Comment 2.
Category: Request for information/clarification
Committee Draft subsection: 3.18, 6.2.4, 6.2.6, 6.7.8, others
Title: Unspecified and indeterminate values
Detailed description:
The draft refers to both "unspecified" and "indeterminate" values. It is not clear to me whether these are intended to be synonyms, or if there is some subtle difference between them. If there is a difference, my guess is that indeterminate values are allowed to be trap representations but unspecified values are not. In either case, the draft should be clarified.

------------------------------------------------------------------------

Comment 3.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.2.5, 6.3.1.1, 6.7.2.2
Title: Enumerated type compatibility
Detailed description:
According to 6.7.2.2, "each enumerated type shall be compatible with an integer type", but this is vacuously true since 6.2.5p17 says that enumerated types *are* integer types. I believe that 6.7.2.2 meant to say that each enumerated type shall be compatible with a signed integer type or an unsigned integer type (or, perhaps, with char). If a change is made to 6.7.2.2, a similar change needs to be made to 6.3.1.1.

------------------------------------------------------------------------

Comment 4.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.3.2.3
Title: Conversion of null pointer constants
Detailed description:
6.3.2.3p3 states that assignment and comparison cause null pointer constants to be converted to null pointers, implying that conversions in other contexts do not, and thus result in implementation-defined results as per p5. I believe the intent was that any conversion of a null pointer constant to a pointer type should result in a null pointer, with assignment and comparison being but examples.

------------------------------------------------------------------------

Comment 5.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.4
Title: UCNs as preprocessing tokens
Detailed description:
Since the syntax and constraints allow any UCN to appear in an identifier, I believe the syntax rule allowing a UCN to be a preprocessing token is vacuous and should be removed.

====================

David H. Thornley

Comment 1.
Category: Feature that should be removed
Committee Draft subsection: 7.19.7.7
Title: Declare gets() obsolescent
Detailed description:
The gets() function can very rarely be used safely, and is the source of numerous problems, including the Great Internet Worm of 1988. It should be declared obsolescent. Unfortunately, it is widely used, and cannot simply be removed.

------------------------------------------------------------------

Comment 2.
Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 7.18
Title: Renaming of types in <stdint.h>
Detailed description:
Traditionally, C integer types were assigned similarly to the *fast* designations: int has been a fast type of at least 16 bits, while long has been a fast type of at least 32 bits. It is desirable to keep this principle, and so it is desirable to ensure that the shortest, easiest-to-use integer type names are fast types. Further, the shortest names are going to be the most used, except among very careful programmers, and are going to be troublesome if assigned to exact-width types, which may or may not exist.

The fix: Rename the types as follows:

   int_*t       becomes int_exact*t
   int_least*t  stays the same
   int_fast*t   becomes int_*t

------------------------------------------------------------------

Comment 3.
Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 7.18.1.1
Title: The meaning of "exact" in <stdint.h>
Detailed description:
The meaning of "exact" in 7.18.1.1 is unclear.
A computer with 36-bit words could define a 36-bit integer type as int_32t, since the only way to tell the difference would be to cause overflow, which is undefined. Since it is impossible to tell the difference without causing undefined behavior, it is difficult to see what portable use the int_32t type would be. The main apparent use for exact types is communicating with I/O devices and programs, and this would be compromised if the types were inexact.

The fix: Exact types may not have padding bits. (Alternately, exact signed types may not have padding bits. The same difficulty does not exist with exact unsigned types, since the overflow behavior of unsigned types is well defined.)

----------------------------------------------------------------

Comment 4.
Category: Feature that should be removed
Committee Draft subsection: 6.2.4#4
Title: Remove the long long integer types
Detailed description:
The increased size of computer systems has created a problem in that many object sizes cannot be expressed with only 32-bit integers. On the other hand, many existing programs use "long" as meaning, specifically, a 32-bit integer. There is also an unknown number of programs that rely on "long" being the longest integer type. There is no good solution for this, so we must find the least bad one.

In the previous C standard, there were four integer types, with clear meanings. "char" was a type capable of holding a single character. "short" was a space-conserving type capable of holding an integer of frequently useful size. "int" was a general-purpose integer type, being some sort of natural size (but at least 16 bits), and "long" was the longest integer type. It is now proposed to remove the meaning of "long" by adding "long long". This leaves the meaning of "long" as something like int_fast32t, and "long long" as int_fast64t. The addition of "long long" therefore destroys meaning in favor of introducing redundancy.
The primary reason for not allowing "long long" as a standard integer type is that it introduces a serious incompatibility between the previous C standard and the proposed one. In the previous C standard, it was guaranteed that any type based on a standard integer (such as size_t) could be converted into a string by casting it to long or unsigned long and using sprintf() or a similar function. DR 067 essentially said that this was standard and portable. With the proposed standard, it is no longer portable, since size_t and other such types can be longer than long. (POSIX defines some such types as being standard C integer or arithmetic types, and these are likely more important than size_t and ptrdiff_t.)

The difference between this change and all others proposed is that it produces a fundamental discrepancy between the previous C standard and the proposed one. Certainly, the conversion of size_t or similar types to character strings is a reasonable thing to do. There is a way to do it in C89, and an entirely different way to do it in the proposed standard. There is no way to write a program doing such a conversion that will run in both versions of C, except by using the preprocessor for conditional compilation. I do not know how many programs will be affected, or in what way.

In a recent posting to comp.std.c, message ID <6vfcd6$7ut$1@pegasus.csx.cam.ac.uk>, Nick Maclaren described an experiment he had performed:

"I took a copy of gcc and hacked it around enough to produce diagnostics for some of the problem cases, where C9X introduces a quiet change over C89 in the area of 'long' and 'long long'. However, this hack has the following properties:

1) It flags only some traps.
2) It produces a large number of false positives.
3) It requires header hacks, and produces broken code.

"I then ran it on a range of widely-used and important public-domain codes, taken from the Debian 1.3.1 CD-ROM set.
Many of these are effectively the same codes that are shipped with commercial systems, and others are relied on heavily by many sites.

"Most of the codes used "long" to hold object and file positions, or as a way of printing an unknown integer type. The ones that I have marked as "Yes" will almost certainly invoke undefined behaviour if faced with a C9X compiler where ptrdiff_t is longer than "long", and probably will if off_t is. The ones that I have marked "Maybe" could well have checks to prevent this, or were too spaghettified to investigate.

"Only 4 had any reference to "long long" whatsoever, and it was in a single non-default #if'd out section in 3 of them; one of those defined a symbol that was never referred to, another was solely for Irix 6 file positions, and the last could trivially have been replaced by double. The ONLY program that either had any reference to "long long" by default, or used it seriously, was gcc itself."

               Loss of data          printf fails   Uses long long
               ------------          ------------   --------------
   apache      Yes                   Yes            No
   bison       No                    No             No
   bash        Maybe                 Yes            No
   cpio        Yes                   No             Effectively not
   csh         Yes                   No             No
   diff        Maybe                 No             No
   elm         Build process failed                 No
   exim        Yes                   No             No
   fileutils   Yes                   No             Effectively not
   findutils   Yes                   Yes            No
   flex        No                    No             No
   gawk        Yes                   Yes            No
   gcc         Build process failed                 Yes
   gnuplot     Maybe                 No             No
   gzip        Yes                   No             No
   icon        Yes                   No             No
   inn         Build process failed                 No
   nvi         Maybe                 Yes            No
   pari        Maybe                 No             No
   perl        Build process failed                 Effectively not
   sendmail    Yes                   Yes            For Irix 6
   trn         Maybe                 No             No
   wu-ftpd     No                    Yes            No
   zip         Yes                   Yes            No

The problems will show up only when dealing with sufficiently long data objects, but I see no reason why any of those programs should not eventually be applied to a file of more than four gigabytes. If so, the program will fail in odd ways, likely corrupting data. Assuming that the programs selected are representative, somewhere between one-third and one-half of all large, heavily-used C programs are likely to mishandle large files or memory objects.
Since many of these programs are not commercially supported, the task of changing them to be valid C will fall to volunteers. Even with commercial software, the difficulties involved in sorting out all possible problems mean that these programs will be untrustworthy. It is also desirable to keep fseek() and ftell() usable as is, rather than creating more functions, changing the return types of the current ones, or resorting to the less useful fgetpos() and fsetpos(). Also, pointers were guaranteed to fit in some integral type in the previous standard, and many programs may have taken advantage of that.

Since the proposed "long long" type creates an unbridgeable discrepancy between C89 and C9X, and since it renders a large and unknown number of programs untrustworthy in unknown but probably dangerous ways, I think it a very bad idea. It is not necessary to require that "long" become a 64-bit type, but merely to require that all appropriate "_t" types be representable as long or unsigned long. I see no obvious need for a 64-bit type unless required by these types, but there is certainly no reason not to have one. It is possible to require a conforming implementation to have "int_fast64t", "int_least64t", and the corresponding unsigned types. It is possible to list "long long" as a common extension, in which case it does no harm except the aesthetic.

Let us consider how programs will be affected by the changes. First, there is no need for long to exceed 32 bits unless object size may exceed that. Many current systems will have no problem with it. Second, new programs can use int_least32t in place of long, and int_fast64t in place of "long long". This will provide compatibility with existing ABIs using "long" and "long long" types. Third, compiler manufacturers will undoubtedly provide various ways to bridge the gap. For a long time to come, compilers will doubtless provide options to restrict "long" to 32 bits, for use in compiling older programs.
There is a class of programs that will be seriously affected: those that need "long" to be 32 bits, and need "size_t" or "off_t" or some such to be more than 32 bits. Many of these programs are therefore irretrievably broken. Some of them have been written with "long long", with such types as size_t being "unsigned long long". The authors of these programs have already demonstrated a willingness to discard portability, in that the programs will only work on certain nonstandard implementations. I hope that they have been careful to record which uses of "long" should not exceed 32 bits and which uses should be the longest possible type. I believe this class of programs is much smaller than the class that would be adversely affected by "long long"; most of the programs surveyed above do not in fact use "long long", while a frightening number will suffer quiet breakage if "long long" is a standard C integer type.

Fix: Remove "long long" and "unsigned long long" from the list of standard C integer types. "long long" may be listed as a common extension, and int_fast64t and int_least64t may be required if desired.

------------------------------------------------------------------

Comment 5.
Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 7.19.6.1, 7.19.6.2, 7.24.2.1, 7.24.2.2
Title: New type declarators for printf() and scanf()
Detailed description:
The conversion specifier characters in the format string listed in <inttypes.h> are clumsy, awkward, and difficult to remember. I propose a somewhat different syntax for them, involving length modifiers rather than conversion specifier characters. We need to add specifiers for int_*t, int_fast*t, int_least*t, and intmax_t, as well as their unsigned counterparts. We can use a notation with, for example, :32f as a length modifier for int_fast32t types. This has the further advantage that it limits the need for using more letters.
Note that this is not an actual parameterized implementation, although it could certainly be made so in a later version of the standard.

The fix: Add to the end of 7.19.6.1 [#7]:

:   Specifies an extended integer type. It is followed by a number and
    letter, or by a string of letters. If followed by a number and
    letter, the number is the number of bits in the type, and the
    letter is x for an exact type, f for a fast type, and l for a
    least type. The number and letter shall match an existing extended
    integer type in <stdint.h>. Alternately, the colon shall be
    followed by "max", "size", or "ptr", specifying a length of
    intmax_t, size_t, or ptrdiff_t respectively.

Add to the end of 7.19.6.2 [#11]:

:   Specifies a pointer to an extended integer type. It is followed by
    a number and letter, or by a string of letters. If followed by a
    number and letter, the number is the number of bits in the type,
    and the letter is x for an exact type, f for a fast type, and l
    for a least type. The number and letter shall match an existing
    extended integer type in <stdint.h>. Alternately, the colon shall
    be followed by "max", "size", or "ptr", specifying a length of
    intmax_t, size_t, or ptrdiff_t respectively.

The same should be appended to 7.24.2.1 and 7.24.2.2 respectively.

------------------------------------------

Robert Corbett

Comment 1.

Category: Request for information/clarification
Committee Draft subsections: F.9, G.5
Title: the sign convention should be explained

Detailed description: The sign convention used in Sections F.9 and G.5 should be described explicitly. For example, I assume that "asin(±0) returns ±0" means asin(-0) returns -0 and asin(+0) returns +0, but I could imagine someone thinking that either -0 or +0 could be returned for asin(-0) or asin(+0).

=============================================================

Comment 2. 
Category: Normative change to intent of existing feature
Committee Draft subsection: F.9.4.4
Title: pow(+1, ±inf) should return +1

Detailed description: The function call pow(1, ±inf) should return 1 and should not raise any exceptions. The mathematical basis for changing this special case is stronger than the basis for defining pow(x, ±0) to be 1. This special case is less important than the pow(x, ±0) case, but it is useful for essentially the same reasons.

=============================================================

Comment 3.

Category: Normative change to intent of existing feature
Committee Draft subsection: G.5, paragraph 7
Title: cpow(0, z) should not always raise exceptions

Detailed description: The FCD defines cpow(z, c) to be equivalent to cexp(c * clog(z)). No special cases are given in Section G.5.4. Since clog(0+i0) raises the divide-by-zero exception, cpow(0+i0, z) must raise the divide-by-zero exception regardless of the value of z. In the case where z = x+i0, cpow is required to raise both the divide-by-zero exception and the invalid exception. There is no reasonable basis for raising these exceptions.

Recommendation: The function cpow(0+i0, z) should not raise the divide-by-zero exception unless the result is an infinity. It should not raise the invalid exception unless the result is a NaN.

=============================================================

Comment 4.

Category: Normative change to intent of existing feature
Committee Draft subsection: 6.5.3.4, paragraph 2
Title: the operand of sizeof should not be a VLA

Detailed description: One of the properties of C that makes it suitable for writing systems software is that all operations are explicit. The sizeof operation as defined for operands that are VLAs introduces implicit operations for the first time. The implicit operations are particularly problematic if the bounds expression has side effects. 
It is not clear from the current draft whether the bounds expression will be re-evaluated when the sizeof operator is evaluated.

Recommendation: Add a constraint prohibiting the operand of sizeof from being a VLA or a VLA type.

=============================================================

Comment 5.

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 6.2.2
Title: there should explicitly be no linkage between different identifiers

Detailed description: Many programmers believe that

    #include <stdio.h>

    const int i = 0;
    const int j = 0;

    int main(void)
    {
        printf("%d\n", &i == &j);
        return (0);
    }

is allowed to print 1 (followed by a newline). The standard does not explicitly state that no linkage exists between the objects designated by different identifiers. While the intent of the committee is clear to me, I find it hard to convince others based on the current text of the standard. If this issue is not addressed, I suspect a defect report will be needed to resolve it.

Recommendation: The standard should explicitly state that no linkage exists between different identifiers.

=============================================================

Comment 6.

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 6.4.4.2
Title: conversion of floating-point constants

Detailed description: The first edition of the C standard does not require translation-time conversion of floating-point constants. The current draft does not quite manage to state that floating-point constants must be converted at translation time, but I believe that the committee intended it to so stipulate. I know of one C compiler that provides an option to have the values of floating-point constants depend on the rounding mode in effect at the time the containing expression is evaluated. 
I know of another compiler that raises overflow or underflow exceptions at execution time if the value of a floating-point constant overflowed or underflowed when it was converted at translation time. These compilers are products of major systems vendors (not Sun in either case).

Recommendation: The standard should explicitly state that floating-point constants must be converted at translation time. It should state that decimal floating-point constants of the same type and mathematical value must produce the same value in all contexts. Similarly, it should state that hexadecimal floating-point constants of the same type and mathematical value must produce the same value in all contexts. The normative Section F.7.2 should state that evaluation of floating-point constants is not allowed to raise execution-time exceptions.

John Hauser

Comment 1.

Category: Inconsistency
Committee Draft subsection: 7.12.6.4 and F.9.3.4
Title: frexp infinity result not allowed

Detailed description:

Problem: Section 7.12.6.4 specifies that frexp returns a value with ``magnitude in the interval [1/2, 1) or zero'', but Section F.9.3.4 insists that frexp(x,p) returns x if x is an infinity.

Fix: In Section 7.12.6.4, allow frexp to return infinity if its first argument is infinite.

---------------------------------------------

Comment 2.

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: F.9.3.10
Title: log2(1) should equal log(1) and log10(1)

Detailed description: (All numbers in this comment are floating-point.)

Problem: Sections F.9.3.7 and F.9.3.8 mandate that log(1) and log10(1) each return +0, while Section F.9.3.10 omits this requirement for log2(1). There is no reason for base-2 logarithms to differ from base-10 and natural logarithms in this detail.

Fix: Add to Section F.9.3.10 the rule:

    -- log2(1) returns +0.

------------------------------------------------------------------

Comment 3. 
Category: Inconsistency
Committee Draft subsection: 7.12.7.4 and F.9.4.4
Title: Inconsistent pow domain errors

Detailed description: (All numbers in this comment are floating-point.)

Problem: Section 7.12.7.4 states for pow(x,y):

    A domain error occurs if x is negative and y is finite and not an
    integer value.

Under the IEEE Standard (IEC 60559), a domain error is signaled by raising the invalid exception flag, with a NaN returned from the function if the result type is floating-point. But Section F.9.4.4 defines

    pow(-infinity, y) -> +0          for y < 0 and y not an integer
    pow(-infinity, y) -> +infinity   for y > 0 and y not an integer

and does not permit these cases to raise the invalid exception flag, even though 7.12.7.4 calls them domain errors.

In a slightly different vein, Section 7.12.7.4 also says that

    A domain error occurs if the result cannot be represented when x
    is zero and y is less than or equal to zero.

whereas F.9.4.4 defines

    pow(+0,+0) = pow(+0,-0) = pow(-0,+0) = pow(-0,-0) -> 1

The definition in Section F.9.4.4 strongly implies that pow(0,0) should return 1 in _every_ implementation, because nothing ever prevents it. Since a result of 1 can certainly always ``be represented'', the wording in Section 7.12.7.4 is incongruous with Annex F for the case pow(0,0).

Suggested fix: Adjust the function description in Section 7.12.7.4 to say:

    A domain error occurs if x is finite and negative and y is finite
    and not an integer value. A domain error occurs if the result
    cannot be represented when x is zero and y is less than zero.

------------------------------------------------------------------

Comment 4.

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: F.9.5.3
Title: lgamma(1) and lgamma(2) should return +0

Detailed description: (All numbers in this comment are floating-point.) 
Problem: When a floating-point function returns a zero result and the sign of this zero result is indeterminate (neither +0 nor -0 is a more legitimate result than the other), Annex F usually mandates that the function return +0 and not -0. For example, Annex F specifies that acos(1), acosh(1), log(1), and log10(1) each return +0. This policy is consistent with the IEEE Standard's (IEC 60559's) requirement that x-x always return +0 and not -0 (assuming finite x and the usual round-to-nearest rounding mode), even though the sign on the zero result in this case is wholly indeterminate for any x. Section F.9.5.3 neglects to require that lgamma(1) and lgamma(2) each return +0, consistent with the other such cases.

Fix: Add to Section F.9.5.3 the rules:

    -- lgamma(1) returns +0.
    -- lgamma(2) returns +0.

------------------------------------------------------

Thomas MacDonald

Comment 1.

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 6.8.5.3 The for statement
Title: definition of the for statement

Detailed description: The current description of the for statement is in terms of a syntactic rewrite into a while loop. This causes problems, as demonstrated by the following example:

    enum {a,b};
    for (;; i += sizeof(enum {b,a}))
        j += b;

is not equivalent to

    enum {a,b};
    while (1) {
        j += b;
        i += sizeof(enum {b,a});
    }

because a different b is added to j.

Change the description of the for statement to:

    6.8.5.3 The for statement

    The following statement

        for ( clause-1 ; expression-2 ; expression-3 ) statement

    behaves as follows. The expression expression-2 is the controlling
    expression. If clause-1 is an expression, it is evaluated as a
    void expression before the first evaluation of the controlling
    expression. The evaluation of the controlling expression takes
    place before each execution of the loop body. The evaluation of
    expression-3 takes place after each execution of the loop body as
    a void expression. Both clause-1 and expression-3 can be omitted. 
    An omitted expression-2 is replaced by a nonzero constant.

Also, delete the "forward reference" to the continue statement.

----------------------------------------------------------------

Comment 2.

Category: Normative change to intent of existing feature
Committee Draft subsection: 5.2.4.1, 6.2.1, 6.8, 6.8.2, 6.8.4, 6.8.5
Title: define selection and iteration statements to be blocks.

Detailed description: A common coding convention is to allow { } for all iteration and selection statements. When compound literals were introduced, this convention was inadvertently compromised. Consider the following example:

    struct tag { int m1, m2; } *p;

    while (flag)
        flag = func( p = & (struct tag) { 1, 2 } );
    p->m1++;

If { } are introduced as follows:

    while (flag) {
        flag = func( p = & (struct tag) { 1, 2 } );
    }
    p->m1++;   // Error - compound literal went out of scope

then the example has undefined behavior, because the compound literal goes out of scope after the while loop finishes execution.

The recommended change is:

5.2.4.1 (page 17) Translation limits:
    127 nesting levels of blocks

6.2.1 (page 25) Scopes of identifiers:
    ... the identifier has "block scope," which terminates at the end
    of the associated block.

6.5.2.5 (pages 67-68):
    change example 8

6.8 (page 119) Statements:
    A "block" allows a set ... (move the 6.8.2 semantics to 6.8)

6.8.2 (page 120) Compound statement:
    Semantics: change the semantics to: A "compound statement" is a
    block.

6.8.4 (page 121) Selection statements:
    (add to semantics) A selection statement is a block whose scope is
    a strict subset of its enclosing block. All associated
    substatements are blocks whose scopes are strict subsets of the
    associated selection statement block.

6.8.5 (page 123) Iteration statements:
    (add to semantics) An iteration statement is a block whose scope
    is a strict subset of its enclosing block. The associated loop
    body is a block whose scope is a strict subset of the associated
    iteration statement block. 
Also, forward references need to be checked.

-----------------------------------------------------------------------

Comment 3.

Category: Normative change to intent of existing feature
Committee Draft subsection: 6.7.5.2
Title: VLA side-effects

Detailed description: Currently 6.7.5.2 states that it is unspecified whether side effects are produced when the size expression of a VLA declarator is evaluated. Consider the following complicated sizeof expression:

    void *p;
    sizeof( ** (int (* (*) [++m]) [++n]) p )

In this example, the sizeof operand is an expression. The pointer "p" is cast to a complicated type:

    pointer to array of pointers to array of int

which can be shown in a flattened-out view (read right to left):

    int [++n] * [++m] *

The sizeof operand is an expression containing two deref operators. Once these deref operators are applied, the final type of the expression is:

    int [++n] *

The translator does not need to evaluate the "++n" size expression to determine the pointer size. However, it's not difficult for the translator to do so anyway. It's a common implementation technique to ignore unused portions of the expression tree when sizeof is involved. One way to view the sizeof expression is the following parse tree:

    sizeof    (TYPE: size_t)
      |
      *       (TYPE: int [++n] *)
      |         <-- below not needed, so discard?
      *       (TYPE: int [++n] * [++m])
      |
    (cast)    (TYPE: int [++n] * [++m] *)
      |
      p       (TYPE: void *)

When "sizeof" is encountered, the translator need not look any further than its immediate operand to determine the size. When looking at the immediate operand of "sizeof", the only side effect noticed is "++n", and this means the translator must look all the way down into the type "int [++n] *" to find the side effect (which is straightforward). If C9X requires implementations to evaluate "++m" also, then it gets harder, because it's not always obvious where to find the "++m" expression. More elaborate machinery is needed to find the unevaluated VLA expressions. 
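The run-time character of sizeof applied to a VLA, which underlies the question above, can be seen in a short sketch (assuming a C9X translator with VLA support; the function name is illustrative):

```c
#include <stddef.h>

/* With a VLA operand, sizeof is computed at execution time from the
   current value of the size expression, not at translation time. */
size_t vla_bytes(int n)
{
    int a[n];           /* variable length array */
    return sizeof a;    /* evaluated at run time: n * sizeof(int) */
}
```

Here the size expression is just "n" and has no side effects; the dispute above is about what happens when it does.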
If a translator is designed from scratch, this can be built in. However, most of us live with existing technology. The compromise reached by WG14 here is that we do not require side effects to be evaluated inside a VLA size expression.

Some have objected to that compromise. The presence of mixed code and declarations eliminates more of the distinctions between declarations and statements: they can appear in the same places now. The critics of the compromise have more ammo here, since the "no required side effects" clause is more noticeable with mixed code and declarations.

There are other places where an evaluation is not strictly needed:

    void f(int n; int a[++n])
    {
        // "a" is converted to a pointer anyway
    }

    int (*p)[n++];   // don't need array size unless doing bounds checking

I'm sure there are others. In general, if a VLA declarator can be written as an incomplete type, as in:

    void f(int n; int a[]) { }

    int (*p)[];

the side effect might not be evaluated by the translator (because the size isn't needed). For an example such as:

    int (*p)[++n] = malloc(size);
    p++;

the intent is that the side effect be evaluated, since the value is needed to increment the pointer p. However, if all references to p are of the form:

    (*p)[i]

then the value produced by the side effect is never needed. The wording is tricky, because the side effects are only evaluated when the size is needed.

-----------------------------------------------------------------------

Comment 4.

Category: Normative change to intent of existing feature
Committee Draft subsection: 6.7.3
Title: property M for restricted pointers

Detailed description:

Summary
-------

A 'pure extension' to the current specification of the 'restrict' qualifier is proposed. It would make some additional, natural usages of the qualifier conforming without imposing an additional burden on compilers. This proposal was developed in response to PC-UK0118, and, possibly after further refinement, is intended to become part of a public comment. 
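As background, the basic aliasing guarantee that restrict provides, and that the refinement discussed here tries to preserve, can be sketched as follows (illustrative function, not from the proposal):

```c
/* Because x and y are restrict-qualified, the translator may assume
   the two arrays do not overlap, and may therefore vectorize or
   reorder the loads and stores in this loop. */
void add_in_place(int n, int * restrict x, const int * restrict y)
{
    int i;
    for (i = 0; i < n; i++)
        x[i] += y[i];
}
```

The question the proposal addresses is when calls that make the two pointers refer to (parts of) the same object should remain defined.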
The problem with the current specification
------------------------------------------

The issue addressed here was raised in part 2) of PC-UK0118 and concerns the use of two different restricted pointers to reference the same object when the value of that object is not modified (by any means). The same issue can arise for a library function with one or more restrict-qualified parameters when it is called with string literal arguments.

Example A:

    typedef struct { int n; double * restrict v; } vector;

    void addA(vector a, vector b, vector c)
    {
        int i;
        for (i = 0; i < a.n; i++)
            a.v[i] = b.v[i] + c.v[i];
    }

Consider also a function that modifies one operand in place:

    void addC(vector *x, vector *y)
    {
        int i;
        for (i = 1; i < x->n; i++)
            x->v[i] += y->v[i-1];
    }

There is a problem for a call of the form addC(&z,&z), which the current specification gives undefined behavior. The PC-UK0118 proposal appears to give this call defined behavior, and so renders the restrict qualifiers on the parameters ineffective in promoting optimization of the loop. In particular, because z.v is not modified, it can be referenced as both x->v and y->v within addC. It follows that although z.v[i] is modified, it is referenced through only one restricted pointer object, z.v (designated first as x->v and then as y->v within addC). Thus there is no undefined behavior, and so optimization of the loop in addC is inhibited by the possibility that x->v[i] and y->v[i] refer to the same object.

Consider also an example motivated by the semantics of Fortran dummy arguments:

Example D:

    void addD(int n, int * restrict x, int * restrict y)
    {
        int i;
        for (i = 0; i < n; i++)
            x[i] = y[i];
    }

Under the proposed rules, since x->v[i] is modified in addC, and &(x->v[i]), or (x->v)+i, is based on the restricted pointer x->v, which is a subobject of *x, it follows that *x has property M. Therefore all references to *x must be through a pointer based on x. In particular, y is not based on x, and so *y must not be the same object as *x, as it would be for the call addC(&z,&z). 
For Example D, the call of the form addD(100,z,z+100) has defined behavior, as desired, because although the arrays referenced through the two restricted pointer parameters overlap, the elements in the overlap are not modified.

Conclusion
----------

Compared to either the current specification or the PC-UK0118 proposal, the changes proposed here do a better job of specifying undefined behavior only where there is a real opportunity for better optimization. They admittedly result in a more complicated specification, but they do not impose an additional burden on an implementation.

-----------------------------------------------------------------------

Comment 5.

Category: Normative change to intent of existing feature
Committee Draft subsection: 6.7.3
Title: allow restricted pointers to point to multiple objects

Detailed description: Currently, restricted pointers can only point to a single object (dynamically determined) during their lifetime. Recently the committee changed C9X such that "realloc" always allocates a new object. This means that using "realloc" with restricted pointers is no longer permitted:

    int * restrict p = malloc(init_size);
    ...
    p = realloc(p, new_size);   // error - new object

Change the specification of "restrict" to allow restricted pointers to point to multiple objects during their lifetime.

-----------------------------------------------------------------------

Comment 6.

Category: Normative change to intent of existing feature
Committee Draft subsection: 6.7.3
Title: allow type qualifiers inside [] for params.

Detailed description: One impediment to effective usage of restricted pointers is that they are not permitted in array parameter declarations such as:

    void func(int n, int a[restrict n][n]);

Instead, the user must write the declaration as:

    void func(int n, int (* restrict a)[n]);

This form of declaration is far more cryptic and, therefore, harder to understand. 
The recommendation is that all type qualifiers be allowed in the ``top type'' of a parameter declared with an array type.

---------

Peter Seebach

Comment 1.

Category: Feature that should be included
Committee Draft subsection:
Title: strsep() function

Detailed description: ...

The strsep() function was proposed for the standard. It was not voted in because there was a lack of general interest at the meeting in question. However, it's a trivial feature, and a very useful one, and I'd like to see it added. A complete thread-safe implementation follows:

    /*
     * Get next token from string *stringp, where tokens are possibly-empty
     * strings separated by characters from delim.
     *
     * Writes NULs into the string at *stringp to end tokens.
     * delim need not remain constant from call to call.
     * On return, *stringp points past the last NUL written (if there might
     * be further tokens), or is NULL (if there are definitely no more tokens).
     *
     * If *stringp is NULL, strsep returns NULL.
     */
    char *
    strsep(char **stringp, const char *delim)
    {
        char *s;
        const char *spanp;
        int c, sc;
        char *tok;

        if ((s = *stringp) == NULL)
            return (NULL);
        for (tok = s;;) {
            c = *s++;
            spanp = delim;
            do {
                if ((sc = *spanp++) == c) {
                    if (c == 0)
                        s = NULL;
                    else
                        s[-1] = 0;
                    *stringp = s;
                    return (tok);
                }
            } while (sc != 0);
        }
        /* NOTREACHED */
    }

Comment 2.

Category: Inconsistency
Committee Draft subsection: several; primarily 3.x
Title: Indeterminate values and trap representations.

Detailed description: It is unclear whether or not our intent is that an access to an indeterminately valued object of character type invokes undefined behavior. In C89, the only argument which allows an implementation to abort upon access to an indeterminately valued object of any type is the argument that, since the definition of undefined behavior includes "access to indeterminately valued objects", all such access invokes undefined behavior. 
There is no other way to reach that conclusion, but clearly it is the one we want - thus, we meant to assert that all access to indeterminately valued objects yields undefined behavior. However, it has become clear that many people want a guarantee that access through lvalues of character type cannot be undefined behavior. This is for two reasons: struct padding, and memcmp. A discussion at the Santa Cruz meeting came to the conclusion that we would like for this to be permitted.

The material on "trap values" appears to provide adequate guidance as to when access to an object *may* yield undefined behavior. It may be that simply removing the reference to indeterminately valued objects from the definition of undefined behavior will fix this. However, it is too complicated to suggest that any change would be merely editorial. Additionally, whatever our intent may have been, it seems fairly clear (to me, anyway) that the current wording renders access to indeterminately valued objects through lvalues of character type undefined behavior, and if we wish to change this, the change is normative, even if it's what we thought we meant all along.

Comment 3.

Category: Clarification
Committee Draft subsection: Section 7
Title: Modification of non-const qualified arguments by library functions

Detailed description: In several places, library functions take non-const-qualified pointers as arguments, to indicate that the argument may be modified. It is unclear whether or not this gives the implementor license to modify the argument in all cases, or only as described in the semantics for a function. A particular example is strtok("a", "b"); a careful reading of the description of strtok makes it clear that there is no point at which the first argument is modified. However, there are implementations which abort execution when they reach this call, because they *do* modify the first argument, and it's a string literal, so this introduces undefined behavior. 
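The strtok case above can be demonstrated with a mutable buffer, so that the call itself stays well defined; on an implementation that modifies only as described in the semantics, the buffer is untouched (the helper name is illustrative):

```c
#include <string.h>

/* Demonstrates the case discussed above: tokenizing "a" with the
   delimiter set "b".  Since "a" contains no 'b', the whole string is
   the token, and strtok never needs to write a terminating NUL.
   Returns 1 if strtok returned the whole string and left the buffer
   unmodified. */
int strtok_leaves_intact(void)
{
    char buf[] = "a";
    char *tok = strtok(buf, "b");
    return tok == buf && strcmp(buf, "a") == 0;
}
```

An implementation that unconditionally writes to its first argument would break exactly the literal-argument calls the comment describes.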
A discussion at Santa Cruz came to the conclusion that it's probably intended that all library functions may "modify" any object they have a non-const-qualified pointer to, as long as there are no visible modifications not described in the text. Wording for this ought to be put into the description of library functions in general.

------------------

Andrew Josey

Comment 1.

Category: Feature that should be included
Committee Draft subsection: 7.19.8.2
Title: fwrite when size==0

Detailed description: Comment on 7.19.8.2 The fwrite function #3.

The C standard is inconsistent regarding fread() and fwrite() when size==0. In fread(), there is an explicit sentence which specifies that if either size or nmemb is 0, the return value is zero. I believe there should be an equivalent sentence in fwrite(). All UNIX implementations we know of (and the Single UNIX Specification) return 0 from both functions when either size or nmemb is zero, because they essentially turn into calls to read() or write() of size*nmemb bytes. The C standard was supposed to codify existing practice, and the text does not match this here.

-----

Lawrence J. Jones

Comment 1.

Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 7.1.4
Title: Library macro requirements

Detailed description: Library functions are allowed to be implemented as function-like macros, but it is not entirely clear how closely such macros must match the behavior of the actual function. In particular, it is clear that a macro must be usable in the same contexts as a function call and accept similar kinds of expressions as arguments, but it need not have the same sequence points. What is not clear is whether it must accept exactly the same argument types as the function; that is, whether it must perform the same argument type checking and conversion that a call to a prototyped function would. 
I believe that reliable use of such macros requires a guarantee that this argument type checking and conversion occur, and I suggest changing the draft appropriately. Note that this has some impact on implementations, but not a lot. The conversion is easily handled by a judicious use of casts. Handling the type checking is not as obvious, but is equally simple: at most, all that is required is to add a call to the real function in a context where it will not be evaluated, such as the argument of sizeof (the value of which can then be discarded). The committee should also carefully consider adding these requirements for library macros where a complete prototype is given.

------------------------------------------------------------------------

Comment 2.

Category: Inconsistency
Committee Draft subsection: 7.19.6.2, 7.24.2.2
Title: Input failures

Detailed description: Paragraph 15 of the description of fscanf talks about what happens when end-of-file is encountered; the last part says that if valid input for the current directive is immediately followed by end-of-file, "execution of the following directive (other than %n, if any) is terminated with an input failure." This is better than in C90, where the existence of %n was ignored, but it is still incomplete: like %n, a white-space directive also does not require any input and thus should not suffer an input failure in this case. As far as I can tell, the handling of end-of-file is specified (correctly) in paragraphs 5-9, so I suggest deleting paragraph 15. (A parallel situation exists for fwscanf.)

----------

Randy Meyers

Comment 1.

Category: Editorial change
Committee Draft subsection: 6.11
Title: Conforming implementations should be allowed to provide additional floating-point types

Detailed description: One of the contentious issues in C89 and C9X is whether an implementation, or even a new standard, may provide additional integer types not explicitly mentioned in C89. 
C9X has now definitively stated that such types may be supplied. However, C9X fails to make any statement about implementation-defined floating-point types, or even to warn programmers that a future standard may include such types.

The Future Language Directions subclause (6.11) should contain a new subclause stating:

    Future standardization may include additional floating-point
    types, including those with greater range, precision, or both,
    than long double.

In addition, the committee should grant conforming C9X implementations a license to add such types. This might be done by adding a statement to 6.2.5 that "While long double has at least the range and precision of double and float, other floating point types may provide greater range or greater precision than long double." (This wording needs work.)

------------------------------------------------------------------

Comment 2.

Category: Editorial change
Committee Draft subsection: 7.20.3.4
Title: Rewrite the description of realloc to state it returns a new object with the same value.

Detailed description: Objects in C lack expected properties because realloc goes out of its way to state that it returns a pointer to the same object, even if the object has been moved to a new address. There are places in the FCD that discuss objects in ways that incorrectly imply that an object has a fixed address. The easiest way to correct this problem is for the description of realloc in 7.20.3.4 to state that it returns a pointer to a new object. Here is the suggested new wording (note that the wording does not change the behavior of realloc):

    The *realloc* function deallocates the old object pointed to by
    *ptr* and returns a pointer to a new object that has the size
    specified by *size*. The contents of the new object shall be the
    same as those of the old object before deallocation, up to the
    lesser of the size of the old object and *size*. Any bytes in the
    new object beyond the size of the old object have indeterminate
    values. 
    If *ptr* is a null pointer, the *realloc* function behaves like
    the *malloc* function for the specified size. Otherwise, if *ptr*
    does not match a pointer earlier returned by the *calloc*,
    *malloc*, or *realloc* function, or if the space has been
    deallocated by a call to the *free* or *realloc* function, the
    behavior is undefined. If memory for the new object cannot be
    allocated, the old object is not deallocated and its value is
    unchanged.

    Returns

    The *realloc* function returns a pointer to the new object, which
    may have the same value as *ptr*, or a null pointer if the new
    object could not be allocated.

------------------------------------------------------------------

Antoine Leca

Comment 1.

Category: Feature that should be included
Committee Draft subsection: 7.23, 7.25, 7.26
Title: Useless library functions made deprecated

Detailed description: mktime (7.23.2.3) is entirely subsumed by mkxtime (7.23.2.4). Similar cases occur with gmtime/localtime (7.23.3.3/7.23.3.4) vs zonetime (7.23.3.7), strftime (7.23.3.5) vs strfxtime (7.23.3.6), and wcsftime (7.24.5.1) vs wcsfxtime (7.24.5.2). The former functions do not add significant value over the latter ones (in particular, execution times are similar). So if the latter are to be kept (that is, if the comment below is dropped), the former should be put in the deprecated state, to avoid spurious specifications being kept over the years.

Comment 2.

Category: Feature that should be removed
Committee Draft subsection: 7.23, 7.24.5.2
Title: Removal of struct tmx material in the library subclause

Detailed description: a) The mechanism of tm_ext and tm_extlen is entirely new to the C Standard, so attention should be paid to the uses that can be made of it. Unfortunately, the current text is very elliptic about this use, particularly about the storage of the further members referred to by 7.23.1p5. 
In particular, it is impossible from the current wording to know how to correctly free a struct tmx object whose tm_ext member is not NULL, as in the following snippet:

// This first function is OK (provided my understanding is correct).
struct tmx *tmx_alloc(void)   // allocate a new struct tmx object
{
    struct tmx *p = malloc(sizeof(struct tmx));
    if( p == NULL )
        handle_failed_malloc("tmx_alloc");
    memset(p, 0, sizeof(struct tmx));   // initialize all members to 0
    p->tm_isdst = -1;
    p->tm_version = 1;
    p->tm_ext = NULL;
    return p;
}

// This second function has a big drawback.
void tmx_free(struct tmx *p)   // free a previously allocated object
{
    if( p == NULL )
        return;   // nothing to do
    if( p->tm_ext ) {
        // Some additional members have been added by the implementation,
        // or by users' programs using a future version of the Standard.
        // Since we do not know what to do, do nothing.
        ;
        // If the members were allocated, they are now impossible to
        // access, so we might clobber the memory pool...
    }
    free(p);
    return;
}

Various fixes might be thought of. Among these, I see:
- always require that allocation of the additional members be under the control of the implementation; this way, programs should never "free() tm_ext"; effectively, this gives these additional members the same status as the additional members that may currently be (and are) part of struct tm or struct tmx;
- always require these additional objects to be separately dynamically allocated. This requires that copies between two struct tmx objects dynamically allocate some memory to hold these objects. In effect, this will require an additional example highlighting this (perhaps showing what a tmxcopy(struct tmx*, const struct tmx*) function might look like).

Both solutions have pros and cons. But it is clear that the current state, which encompasses both, is not clear enough. Other examples of potential pitfalls are highlighted below.
b) This extension mechanism might be difficult to use with implementations that currently have additional members in struct tm (_tm_zone, containing a pointer to a string giving the name of the time zone, and _tm_gmtoff, whose meaning is almost the same as that of tm_zone, except that it is 60 times bigger). The latter is particularly interesting, since it might need tricky kludges to assure the internal consistency of the struct tmx object (any change to either member should ideally be applied to the other, yielding potential problems of rounding). Having additional members, accessed through tm_ext, for example one whose effect is to duplicate the _tm_zone behaviour, probably looks awful seen this way.

c) 7.23.1p5 states that a positive value for tm_zone means that the represented broken-down time is ahead of UTC. In the case where the relationship between the broken-down time and UTC is not known (thus tm_zone should be equal to _LOCALTIME), it is therefore forbidden to be positive. This might deserve a more explicit requirement in 7.23.1p2.

d) POSIX compatibility, as well as proper support of historical time zones, will require tm_zone to be a count of seconds instead of a count of minutes; this will in turn require tm_zone to be enlarged to long (or to int_least32_t) to handle the minimum requirements properly.

e) POSIX compatibility might be defeated by the restriction set upon Daylight Saving Time algorithms to actually *advance* the clocks. This is a minor point, since there is no historical need, nor any perceived real need, for such a "feature".

f) On implementations that support leap seconds, 7.23.2.2 (difftime) does not specify whether the result should include (thus considering calendar time to be effectively UTC) or disregard (thus considering calendar time to be effectively TAI) leap seconds. This is unfortunate.
g) The requirement set up by 7.23.2.3p4 (a second call to mktime should yield the same value and should not modify the broken-down time) is too restrictive for mktime, because mktime does not allow complete determination of the calendar time associated with a given broken-down time. Examples include the so-called "double daylight saving time" that was in force in the past, or cases where the time zone associated with the time changes relative to UTC. For example, in Sri Lanka, the clocks moved back from 0:30 to 0:00 on 1996-10-26, permanently. So the timestamp 1996-10-26T00:15:00, tm_isdst=0, is ambiguous when given to mktime(); and widely deployed implementations exist that use caches, and thus might deliver either the former or the latter result on a random basis; this specification will effectively disallow caching inside mktime, with a big performance hit for users. This requirement (the entire paragraph) should be withdrawn. Anyway, mktime is intended to be superseded by mkxtime, so there is not much gain in trying to improve a function that is to be declared deprecated.

h) The case where mktime or mkxtime is called with tm_zone set to _LOCALTIME and tm_isdst negative (unknown), and where the input moment of time is inside the "fall back", that is, between 1:00 am and 2:00 am on the last Sunday in October (in the United States), leads to a well-known ambiguity. Contrary to what might have been expected, this ambiguity is not resolved by the additions in the revision of the Standard (either result might be returned): it all boils down to the sentence in 7.23.2.6, in the algorithm, saying

// X2 is the appropriate offset from local time to UTC,
// determined by the implementation, or [...]

Since there are two possible offsets in this case...
i) Assuming the implementation handles leap seconds, if broken-down times lying in the future are passed (where leap seconds cannot de facto be determined), 7.23.2.4p4 (effect of _NO_LEAP_SECONDS on mkxtime), and in particular the parenthesized sentence, seems to require that the count of leap seconds be assumed to be 0. This would be ill advised; I would prefer it to be implementation-defined, with the recommended practice (or requirement) of being 0 for implementations that do not handle leap seconds.

j) Assuming the implementation handles leap seconds, the effect of 7.23.2.4p4 is that the "default" behaviour on successive calls to mkxtime yields a new, strange scale of time that is neither UTC nor TAI. For example (remember that a positive leap second will be introduced at 1998-12-31T23:59:60Z, in renewed ISO 8601 notation):

struct tmx tmx = {
    .tm_year=99, .tm_mon=0, .tm_mday=1,
    .tm_hour=0, .tm_min=0, .tm_sec=0,
    .tm_version=1, .tm_zone=_LOCALTIME, .tm_ext=NULL,
    .tm_leapsecs=_NO_LEAP_SECONDS
}, tmx0;
time_t t1, t2;
double delta, days, secs;

t1 = mkxtime(&tmx);
puts(ctime(&t1));
if( tmx.tm_leapsecs == _NO_LEAP_SECONDS )
    printf("Unable to determine number of leap seconds applied.\n");
else
    printf("tmx.tm_leapsecs = %d\n", tmx.tm_leapsecs);

tmx0 = tmx;   // !!! may share the object pointed to by tmx.tm_ext...
++tmx.tm_year;
t2 = mkxtime(&tmx);
puts(ctime(&t2));
delta = difftime(t2, t1);
secs = modf(delta / 86400.0, &days) * 86400.0;
printf("With ++tm_year: %.7e s == %f days and %f s\n", delta, days, secs);
if( tmx.tm_leapsecs == _NO_LEAP_SECONDS )
    printf("Unable to determine number of leap seconds applied.\n");
else
    printf("tmx.tm_leapsecs = %d\n", tmx.tm_leapsecs);

tmx = tmx0;   // !!! may yield problems if the contents pointed to by
              // tm_ext have been modified by the previous call...
tmx.tm_hour += 24*365;
tmx.tm_leapsecs = _NO_LEAP_SECONDS;
t2 = mkxtime(&tmx);
puts(ctime(&t2));
delta = difftime(t2, t1);
secs = modf(delta / 86400.0, &days) * 86400.0;
printf("With tm_hour+=24*365: %.7e s == %f days and %f s\n", delta, days, secs);
if( tmx.tm_leapsecs == _NO_LEAP_SECONDS )
    printf("Unable to determine number of leap seconds applied.\n");
else
    printf("tmx.tm_leapsecs = %d\n", tmx.tm_leapsecs);

Without leap seconds support, the results should be consistent and straightforward; like (for me in Metropolitan France):

Thu Jan 1 01:00:00 1998
Unable to determine number of leap seconds applied.
Fri Jan 1 01:00:00 1999
With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
Unable to determine number of leap seconds applied.
Fri Jan 1 01:00:00 1999
With tm_hour+=24*365: 3.1536000e+07 s == 365 days and 0 s
Unable to determine number of leap seconds applied.

Things may change with leap seconds support; assuming we are in a time zone behind UTC (e.g. in the United States), the results might be:

Wed Dec 31 21:00:00 1997
tmx.tm_leapsecs = 31
Thu Dec 31 21:00:00 1998
With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31
Thu Dec 31 21:00:00 1998
With tm_hour+=24*365: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31

But with a time zone ahead of UTC, the results might be:

Thu Jan 1 01:00:00 1998
tmx.tm_leapsecs = 31
Thu Dec 31 00:59:59 1998
With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31
Fri Jan 1 01:00:00 1999
With tm_hour+=24*365: 3.1536001e+07 s == 365 days and 1 s
tmx.tm_leapsecs = 32

And if the time zone is set to UTC, the results might be:

Thu Jan 1 00:00:00 1998
tmx.tm_leapsecs = 31
Thu Dec 31 23:59:60 1998
With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31
Fri Jan 1 00:00:00 1999
With tm_hour+=24*365: 3.1536001e+07 s == 365 days and 1 s
tmx.tm_leapsecs = 32

or, for the last three lines:

Thu Dec 31 23:59:60 1998
With tm_hour+=24*365: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31

The last result is
questionable, since both choices are allowed by the current text (the result falls right inside the leap second itself). Moreover, implementations with caches might return either on a random basis... Bottom line: the behaviour is surprising, to say the least.

k) 7.23.2.6p2 (maximum ranges on input to mkxtime) uses LONG_MAX submultiples to constrain the members' values. Apart from the fact that the limitations given may easily be made greater in the general case, this has some defects:
- the constraints disallow the common use, on POSIX boxes, of tm_sec as a unique input member set to a POSIX-style time_t value;
- the constraints are very awkward for implementations where long ints are bigger than "normal" ints: on such platforms, all members must first be converted to long before any operation takes place;
- since there are eight main input fields, plus a ninth (tm_zone) which is further constrained to be between -1439 and +1439, the result might nevertheless overflow, so special provision to handle overflow must be made in any event.

l) There is an obvious (and already known) typo in the description of D, regarding the handling of years that are multiples of 100. Also, this definition should use QUOT and REM instead of / and %.

m) Footnote 252 introduces the use of these library functions with the so-called proleptic Gregorian calendar, that is, the rules of the Gregorian calendar applied to any year, even before Gregory's reform. This seems to contradict 7.23.1p1, which says that calendar time's dates are relative to the Gregorian calendar, so tm_year should in any case be greater than -318. If that is the intent, another footnote in 7.23.1p1 might be worthwhile. Another way is to rewrite 7.23.1p1 to read something like "(according to the rules of the Gregorian calendar)". See also point l) above.

n) The static status of the returned result of localtime and gmtime (which is annoying, but that is another story) is clearly set up by 7.23.3p1.
However, this does not scale well to zonetime, given that this function might in fact return two objects: a struct tmx, and an optional object containing additional members, pointed to by tmx->tm_ext. If the latter is to be static, this might yield problems with mkxtime as well, since 7.23.2.4p5 states that the broken-down time "produced" by mkxtime is required to be identical to the result of zonetime. (This will effectively require that the tm_ext member always point to a static object held by the implementation; if that is the original intent, please state it clearly.)

o) There is a direct contradiction between the algorithm given for asctime in 7.23.3.1, which will overflow if tm_year is not in the range -11899 to +8099, and the statements in 7.23.2.6 that intend to encompass a broader range.

All of these points militate in favor of a larger overhaul of this part of the library. Such a job has been initiated recently, as Technical Committee J11 is aware. In the meantime, I suggest dropping all these new features from the current revision of the Standard. It means in effect:
i) removing subclauses 7.23.2.4 (mkxtime), 7.23.2.6 (normalization), 7.23.3.6 (strfxtime), 7.23.3.7 (zonetime), and 7.24.5.2 (wcsfxtime);
ii) removing paragraphs 7.23.2.3p3 and 7.23.2.3p4 (references to 7.23.2.6);
iii) removing the macros _NO_LEAP_SECONDS and _LOCALTIME in 7.23.1p2, as they become useless; the same holds for struct tmx in 7.23.1p3;
iv) removing 7.23.1p5 (definition of struct tmx), as it becomes useless too.

----------
Douglas A. Gwyn (IST)

Comment 1. Category: Editorial change/non-normative contribution Committee Draft subsection: 4 Title: Clarify ``shall accept any strictly conforming program'' Detailed description: There seems to be widespread confusion over the meaning of the phrase ``shall accept any strictly conforming program''.
I suggest the following changes: 1) In paragraph 6, change the two occurrences of: accept any strictly conforming program to: successfully translate any strictly conforming program that does not exceed its translation limits (Editorially, two commas should also be added in the list of qualifications after the second instance of this replacement.) Note that this does not imply that the implementation's translation limits are exactly the set listed in subsection 5.2.4.1. 2) Add a forward reference: translation limits (5.2.4.1). ------------------------------------------------------------------------ Comment 2. Category: Feature that should be removed Committee Draft subsection: 4 Title: Conforming program is useless concept Detailed description: A ``conforming program'' is one that just happens to work under some implementation. This concept is of no value in a standards context. I suggest the following changes: 1) Delete paragraph 7: A *conforming program* is one that is acceptable to a conforming implementation. 2) Delete the second sentence of footnote 4: Conforming programs may depend upon nonportable features of a conforming implementation. 3) Move the remaining sentence of footnote 4 into footnote 2, before the existing text in footnote 2: 2) Strictly conforming programs are intended to be maximally portable among conforming implementations. A strictly conforming program can use conditional features ... ------------------------------------------------------------------------ Comment 3. Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 5.1.2.2.1, 5.1.2.2.3 Title: Clean up interface to main() Detailed description: There is no need for the confusion caused by there being two distinct interfaces for the function ``main''. The ``one true interface'' should be the one that is specified. Implementations can always support alternate interfaces in addition to whatever is specified. 
Also, it would help in answering questions about the execution context for the function ``main'', for example in connection with atexit-registered functions, if main were specified as a normal C function invoked in a well-defined manner. I suggest the following changes:

1) In 5.1.2.2.1 paragraph 1, change the final sentence to: It shall be defined with a return type of int and two parameters (referred to here as argc and argv, though any names may be used): int main(int argc, char *argv[]) { /* ... */ } or equivalent.(8)

2) In 5.1.2.2.1 paragraph 2, delete: If they are declared, (Editorially, the new first word of that sentence should be capitalized: ``The parameters ...'')

3) Replace 5.1.2.2.3 paragraph 1 with: After its arguments are set up at program startup, the main function is called by the execution environment as follows: exit(main(argc, argv)); Thus the value returned by the main function is used as the termination status argument to the exit function.

Note that main is now a normal function, so a missing return value is not automatically supplied as 0. This is good, as it discourages sloppiness (and reflects how many implementations have historically behaved in such situations).

------------------------------------------------------------------------
Comment 4. Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 6.3.2.3 Title: Support writing explicit null pointer values Detailed description: A construct such as (int *)0 should represent a null pointer value of the obvious type. I suggest the following change: 1) In paragraph 3, change: assigned to or compared for equality to to: assigned to, converted to, or compared for equality to
------------------------------------------------------------------------
Comment 5.
Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 6.4.2.2 Title: Define __func__ outside a function Detailed description: There is no specification for __func__ when not lexically enclosed by any function. I suggest the following change: 1) Append to paragraph 1: If there is no lexically-enclosing function, an empty *function-name* is used. ------------------------------------------------------------------------ Comment 6. Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 6.8.5.3 Title: Fix ``for'' statement specification Detailed description: There are several problems with the current specification of the ``for'' statement, as discussed in the recent Santa Cruz meeting. For example, the ``equivalent'' sequence of statements cannot be considered a simple textual substitution, and the introduction of additional levels of braces (compound statements) raises issues in connection with the translation limit minimum requirements in subsection 5.2.4.1. I believe other committee members have devised a suitable correction; this comment is merely to ensure that the issue is not overlooked. ------------------------------------------------------------------------ Comment 7. Category: Editorial change/non-normative contribution Committee Draft subsection: 6.10.3.3, 6.10.3.5 Title: Placemarker preprocessing tokens are temporary Detailed description: The reason that placemarker preprocessing tokens do not appear in the formal grammar is that they are temporary markers created by the implementation. This could be made clearer. Also, the introductory sentence to EXAMPLE 5 is misleading. 
I suggest the following changes:

1) At the end of 6.10.3.3 paragraph 2, attach a footnote: (134.5) Placemarker preprocessing tokens do not appear in the formal grammar, because they are temporary entities created by the implementation during translation phase 4 which vanish before translation phase 5.

2) In EXAMPLE 5 after subsection 6.10.3.5, replace: To illustrate the rules for placemarker ## placemarker the sequence with: to illustrate the rules for placemarkers, the sequence

------------------------------------------------------------------------
Comment 8. Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 7.2.1.1 Title: Argument to assert is not necessarily _Bool Detailed description: There is a general problem with specifying library macros via Synopses that pretend they are functions within the type system, but it is worst for the assert macro, so that's what I most want to see fixed. I suggest the following changes: 1) In paragraph 1 (Synopsis), remove: _Bool 2) In the second sentence of paragraph 2, change: if expression is false (that is, compares equal to 0) to: if expression compares equal to 0
------------------------------------------------------------------------
Comment 9. Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 7.18.1.1 Title: Specify size of exact-width integer types Detailed description: The utility of the exact-width integer types in strictly conforming programs would be significantly enhanced, especially for the signed variety, if certain additional properties were specified. The following suggestion seems to match the actual uses to which these types are typically put. I suggest the following change: 1) Append to the first sentence of paragraph 1: and contains no padding bits
------------------------------------------------------------------------
Comment 10.
Category: Normative change to intent of existing feature Committee Draft subsection: 7.18.1.1 Title: Specify twos-complement for exact-width integer types Detailed description: The utility of the exact-width integer types in strictly conforming programs would be significantly enhanced, especially for the signed variety, if certain additional properties were specified. The following suggestion seems to match the actual uses to which these types are typically put. I suggest the following change: 1) Insert the following after the first sentence of paragraph 1: Further, the sign bit of an exact-width signed integer type shall represent the value -2^n. NOTE: This says it must have a twos-complement representation, which is most convenient for implementing multiple-precision arithmetic and for matching most externally-imposed formats. ------------------------------------------------------------------------ Comment 11. Category: Normative change to intent of existing feature Committee Draft subsection: 7.18.1.1 Title: Require universal support for exact-width integer types Detailed description: The utility of the exact-width integer types in strictly conforming programs would be significantly enhanced if they were always available. I suggest the following change: 1) In paragraph 2, change: (These types need not exist in an implementation.) to: These types shall exist in every conforming implementation. NOTE: This will impose a burden on some implementations, for example one where all integer types larger than char are 64 bits wide. The justification for this is that any strictly conforming program that needs one of these types would otherwise have to allow for its nonexistence, requiring considerable code to simulate the missing type. But that's exactly the sort of thing that a compiler ought to take care of. ------------------------------------------------------------------------ Comment 12. 
Category: Editorial change/non-normative contribution Committee Draft subsection: 7.20.2.2 Title: Delete example implementation of rand function Detailed description: The portable implementation of the rand and srand functions has been criticised as being of inferior quality and as appearing to be a recommendation for the actual implementation. I suggest the following change: 1) Remove paragraph 5 (EXAMPLE). NOTE: The example should be moved into the Rationale document. ------------------------------------------------------------------------ Comment 13. Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 7.20.4.2 Title: Forbid atexit or exit call from registered function Detailed description: Some implementations could have trouble supporting registration of new functions during processing of the registered-function list at normal program termination. Also, atexit-registered functions ought not to (recursively) invoke the exit function. I suggest the following change: 1) Append to paragraph 2: If a registered function executes a call to the atexit or exit functions, the behavior is undefined. ------------------------------------------------------------------------ Comment 14. Category: Normative change to intent of existing feature Committee Draft subsection: 7.23.1, 7.23.2.4, 7.23.2.6 Title: Remove struct tmx and recipe for normalizing broken-down times Detailed description: There are several problems with the current specification of broken- down times, as discussed in the recent Santa Cruz meeting. There are apparently errors in the algorithm given in subsection 7.23.2.6 paragraph 3, and groups working in this general area have complained that struct tmx is incomplete and not consistent with the approach they are developing. I believe the committee has in principle already agreed to revert to the existing standard in this area; this comment is merely to ensure that the issue is not overlooked. 
---------- John Hauser Comment 1. Category: Normative change to existing feature retaining the original intent Committee Draft subsection: 7.12.8.4 and F.9.5.4 Title: tgamma(0) should equal 1/0 Detailed description: Problem: Section F.9.5.4 states: -- tgamma(x) returns a NaN and raises the invalid exception if x is a negative integer or zero. However, when x is zero, tgamma(x) should equal 1/x, which is not ambiguous under IEEE Standard (IEC 60559) arithmetic. Fix: Modify Section F.9.5.4 to specify: -- tgamma(+/-0) returns +/-infinity. -- tgamma(x) returns a NaN and raises the invalid exception if x is a negative integer. Correspondingly, adjust the description of tgamma in Section 7.12.8.4 to say: A domain error occurs if x is a negative integer or if the result cannot be represented when x is zero. ------------------------------------------------------------------ Comment 2. Category: Editorial change/non-normative contribution ? Committee Draft subsection: 7.3.4 Title: ``usual mathematical formulas'' vague Detailed description: Problem: In Section 7.3.4 it is explained that when the state of the CX_LIMITED_RANGE pragma is on, an implementation may use the ``usual mathematical formulas'' to implement complex multiply, divide, and absolute value. Unfortunately, the term ``usual mathematical formulas'' is unnecessarily vague. A footnote specifies the intended formulas, but footnotes are not normative. Fix: Fold Footnote 151 into the text by stating explicitly that when the state of the CX_LIMITED_RANGE macro is on, complex multiply, divide, and absolute value may be implemented by the specified formulas. ------------------------------------------------------------------ Comment 3. Category: Editorial change/non-normative contribution Committee Draft subsection: 7.3.5.4 Title: ``function'' should be ``functions'' Detailed description: In Section 7.3.5.4 (The ccos functions), change ``function computes'' to ``functions compute''. 
------------------------------------------------------------------
Comment 4. Category: Editorial change/non-normative contribution Committee Draft subsection: G.5.2.2 Title: Missing plus sign Detailed description: In Section G.5.2.2 (The casinh functions), change ``casinh(infinity+i*infinity)'' to ``casinh(+infinity+i*infinity)''.
------------------------------------------------------------------
Comment 5. Category: Normative change to existing feature retaining the original intent Committee Draft subsection: G.5.2.3 Title: catanh(1+i0) not specified Detailed description: Problem: Section G.5.2.3 neglects to specify the result for catanh(1+i0). Possible fix: Add to Section G.5.2.3 the rule:
-- catanh(1+i0) returns +infinity+i*NaN and raises the invalid exception.
------------------------------------------------------------------
Comment 6. Category: Normative change to existing feature retaining the original intent Committee Draft subsection: G.5.2.4 Title: ccosh(+inf+i*inf) inconsistent with other functions Detailed description: Problem: Section G.5.2.4 specifies that
-- ccosh(+infinity+i*infinity) returns +infinity+i*NaN and raises the invalid exception.
But ccosh(+infinity+i*infinity) is not necessarily in the positive real half-plane; any complex infinity is an equally plausible result. For a nearly identical case, Section G.5.2.5 requires:
-- csinh(+infinity+i*infinity) returns +/-infinity+i*NaN (where the sign of the real part of the result is unspecified) and raises the invalid exception.
Other such cases are similarly stipulated throughout Section G.5. Fix: In Section G.5.2.4, specify the ccosh case to be like csinh (and others):
-- ccosh(+infinity+i*infinity) returns +/-infinity+i*NaN (where the sign of the real part of the result is unspecified) and raises the invalid exception.
------------------------------------------------------------------
Comment 7.
Category: Normative change to existing feature retaining the original intent Committee Draft subsection: G.5.2.6 Title: Special cases of ctanh incorrect Detailed description: Problem: Section G.5.2.6 specifies:
-- ctanh(+infinity+i*y) returns 1+i0, for all positive-signed numbers y.
-- ctanh(x+i*infinity) returns NaN+i*NaN and raises the invalid exception, for finite x.
However, working from the formula

    tanh(x+iy) = sinh(2x)/(cosh(2x)+cos(2y)) + i*sin(2y)/(cosh(2x)+cos(2y))

it appears that the above cases have not been defined with the same care applied to other functions in Section G.5. In particular, ctanh(+infinity+i*y) should be 1+i0*sin(2y), and ctanh(+0+i*infinity) can be more accurately defined as +0+i*NaN. Fix: Modify Section G.5.2.6 to say:
-- ctanh(+infinity+i*y) returns 1+i0*sin(2y), for positive-signed finite y.
-- ctanh(+infinity+i*infinity) returns 1+/-i0 (where the sign of the imaginary part of the result is unspecified).
-- ctanh(+0+i*infinity) returns +0+i*NaN and raises the invalid exception.
-- ctanh(x+i*infinity) returns NaN+i*NaN and raises the invalid exception, for finite nonzero x.
-- ctanh(+0+i*NaN) returns +0+i*NaN.
-- ctanh(x+i*NaN) returns NaN+i*NaN and optionally raises the invalid exception, for finite nonzero x.
------------------------------------------------------------------
Comment 8. Category: Editorial change/non-normative contribution Committee Draft subsection: G.5.3.2 Title: Missing minus sign Detailed description: In Section G.5.3.2 (The clog functions), change ``clog(0+i0)'' to ``clog(-0+i0)''.
------------------------------------------------------------------
Comment 9.
Category: Normative change to intent of existing feature Committee Draft subsection: G.5 Title: Directions of complex zeros and infinities Detailed description: Problem: Throughout Section G.5 it is assumed that the following complex zero and infinity values have these directions in the complex plane:

-0-i0                  -> -pi
-0+i0                  -> +pi
+0-i0                  -> -0
+0+i0                  -> +0
-infinity-i*infinity   -> -3pi/4
-infinity+i*infinity   -> +3pi/4
+infinity-i*infinity   -> -pi/4
+infinity+i*infinity   -> +pi/4

Unfortunately, none of these directions is mathematically determinate, and the consequence is that calculations are made a little more susceptible to returning incorrect results without warning. The effect of this set of assumptions on the properties of the complex functions is similar to what would result from assuming, for example, that +infinity/+infinity = 1. (In fact, the assumption that the direction of +infinity+i*infinity is +pi/4 is virtually equivalent to the assumption that +infinity/+infinity = 1.)

It is often claimed that the explanation for these choices can be found in W. Kahan's paper, ``Branch Cuts for Complex Elementary Functions'', published in the Proceedings of the Joint IMA/SIAM Conference on the State of the Art in Numerical Analysis, 1987. But after correctly observing that the direction of +infinity+i*infinity is not determinate (it's ``some fixed but unknown theta strictly between 0 and pi/2''), Kahan turns around and merely _asserts_ the directions listed above, without justification. The paper on which the choices above are probably based supplies _no_argument_ for them.

The risks caused by these choices might be tolerable if they allowed information about angle to be usefully preserved through at least some calculations. Unfortunately, a somewhat tedious examination of the basic operations and functions (addition, multiplication, clog, etc.) shows that there is almost no benefit.
The ``convenient values'' either quickly disappear into NaNs (harmless) or they soon become arbitrary (not so harmless). There is rarely a case where they preserve any useful information. Since the benefits do not appear to be worth the risks, this approach ought not to be adopted by the C Standard.

Preferred fix: As has been independently proposed in other public comments, first fix the atan2 function so that a domain error occurs for atan2(x,y) when x and y are both zero or both an infinity. In Section G.5.1.1, substitute the rule:
-- cacos(+/-infinity+i*infinity) returns NaN-i*infinity and raises the invalid exception.
In Section G.5.2.1, substitute:
-- cacosh(+/-infinity+i*infinity) returns +infinity+i*NaN and raises the invalid exception.
In Section G.5.2.2, substitute:
-- casinh(+infinity+i*infinity) returns +infinity+i*NaN and raises the invalid exception.
And in Section G.5.3.2, substitute the rules:
-- clog(+/-0+i0) returns -infinity+i*NaN and raises the invalid exception.
-- clog(+/-infinity+i*infinity) returns +infinity+i*NaN and raises the invalid exception.

Alternative fix: [This proposal may look complicated, but it actually involves fairly benign substitutions in the Standard.] If the preferred fix cannot be adopted, the safer form should at least be permitted. In Annex G, make it implementation-defined whether the carg function returns NaN or a finite value for zeros and infinities. If it returns a finite value then

carg(-0-i0)                 -> -pi
carg(-0+i0)                 -> +pi
carg(+0-i0)                 -> -0
carg(+0+i0)                 -> +0
carg(-infinity-i*infinity)  -> -3pi/4
carg(-infinity+i*infinity)  -> +3pi/4
carg(+infinity-i*infinity)  -> -pi/4
carg(+infinity+i*infinity)  -> +pi/4

as currently required. Otherwise, carg returns NaN and raises the invalid exception for these cases.

In Section G.5.1.1, replace the four rules:
-- cacos(-infinity+i*infinity) returns 3pi/4-i*infinity.
-- cacos(+infinity+i*infinity) returns pi/4-i*infinity.
-- cacos(x+i*infinity) returns pi/2-i*infinity, for finite x.
-- cacos(NaN+i*infinity) returns NaN-i*infinity.

with the single rule:

-- cacos(x+i*infinity) evaluates as carg(x+i*infinity)-i*infinity,
   for all x (including NaN).

In Section G.5.2.1, replace:

-- cacosh(-infinity+i*infinity) returns +infinity+i*3pi/4.
-- cacosh(+infinity+i*infinity) returns +infinity+i*pi/4.
-- cacosh(x+i*infinity) returns +infinity+i*pi/2, for finite x.
-- cacosh(NaN+i*infinity) returns +infinity+i*NaN.

with:

-- cacosh(x+i*infinity) evaluates as +infinity+i*carg(x+i*infinity)
   for all x (including NaN).

In Section G.5.2.2, replace:

-- casinh(+infinity+i*infinity) returns +infinity+i*pi/4.
-- casinh(x+i*infinity) returns +infinity+i*pi/2, for
   positive-signed finite x.

with:

-- casinh(x+i*infinity) evaluates as +infinity+i*carg(x+i*infinity)
   for all positive-signed numbers x.

And in Section G.5.3.2, replace:

-- clog(-0+i0) returns -infinity+i*pi and raises the divide-by-zero
   exception.
-- clog(+0+i0) returns -infinity+i0 and raises the divide-by-zero
   exception.
-- clog(-infinity+i*infinity) returns +infinity+i*3pi/4.
-- clog(+infinity+i*infinity) returns +infinity+i*pi/4.
-- clog(x+i*infinity) returns +infinity+i*pi/2, for finite x.

with:

-- clog(-0+i0) evaluates as -infinity+i*carg(-0+i0) and, if the
   imaginary part of the result is not NaN, also raises the
   divide-by-zero exception.
-- clog(+0+i0) evaluates as -infinity+i*carg(+0+i0) and, if the
   imaginary part of the result is not NaN, also raises the
   divide-by-zero exception.
-- clog(x+i*infinity) evaluates as +infinity+i*carg(x+i*infinity)
   for all x (including NaN).

----------
Paul Eggert

Comment 1.

Category: Normative change to existing feature retaining the
original intent
Committee Draft subsection: 6.10.3.2
Title: Stringizing a string containing a UCN should have specified
behavior
Detailed description:

The semantics of UCNs have improved greatly in the latest draft, but
I noticed one glitch.
Section 6.10.3.2 says that ``it is unspecified whether a \ character
is inserted before the \ character beginning a universal character
name''. This lack of specification means that one cannot reliably
convert a program using an extended source character set to one using
only UCNs (or vice versa), as this might change the program's
behavior.

For example, suppose @ represents MICRO SIGN (code 00B5). Then
`assert (strcmp (unit, "@") != 0)' might change its behavior if the
program is converted to UCN form `assert (strcmp (unit, "\u00B5") !=
0)', since (if the assertion fails) the former's output will contain
"@", but for the latter it is unspecified whether the output will
contain "@" or the six characters "\u00B5".

Problems like these might be manageable if the programmer has control
over (or at least can inspect) all macro definitions. But they become
unmanageable if code is intended to be portable to all implementations
and/or libraries that supply macros that might stringize their
arguments, as there's no convenient way for the programmer to
determine whether the macros might change their behavior if the
program is converted to or from UCN form. For example, the Solaris 2.6
header files for tracing threads have macros that stringize their
arguments, but most users of these macros don't know this fact. With
6.10.3.2's current wording, such users are at risk when they convert
their program to or from UCN form.

A consequence of the current wording is that careful programmers will
have to avoid passing strings containing UCNs or extended source
characters to any macro (or possible macro) not under the programmer's
control. This is error-prone and restrictive. I realize that the
current lack of specification is to allow implementations that keep
UCNs (or convert all extended characters to UCNs) during processing,
but for such implementations it's a very small overhead to avoid
prepending \ to UCNs when stringizing strings.
It's well worth doing this to standardize behavior and make it easier
to port programs.

Suggestion: In 6.10.3.2 paragraph 2, change ``it is unspecified
whether a \ character is inserted before the \ character beginning a
universal character name'' to ``a \ character is not inserted before
the \ character beginning a universal character name''.

------------------------------------------------------------------

Comment 2.

Category: Feature that should be removed
Committee Draft subsection: 7.23
Title: Remove struct tmx and associated functions
Detailed description:

Comment 14 in US0011 (1998-03-04) discusses several problems in the
struct-tmx-related changes made by Committee Draft 1 (CD 1) to
<time.h>. Unfortunately, many of these problems remain in the final
committee draft (FCD), and I've since learned of other problems. I
summarize these remaining problems in Appendix 1 below.

Also, Clive Feather (who I understand is responsible for most of the
changes in CD 1 and FCD) has proposed that a new section be written to
address these problems. I welcome this proposal, and would like to
contribute. However, I believe that it's too late in the
standardization process to introduce major improvements to <time.h>,
as there will be insufficient time to gain implementation experience
with these changes, experience that is needed for proper review.

Instead, I propose that <time.h>'s problems be fixed by removing the
struct-tmx-related changes to <time.h>, reverting to the current ISO C
standard (C89); we can then come up with a better <time.h> for the
next standard (C0x). In other words, I propose the following:

* Change <time.h> to define only the types and functions that were
  defined in C89's <time.h>, and to remove a new requirement on
  mktime. Appendix 2 gives the details.

* Work with Clive Feather and other interested parties to write and
  test a revised <time.h> suitable for inclusion in C0x.

Please let me know of any way that I can further help implement this
proposal.
------------------------------------------------------------

Appendix 1. Problems in the struct-tmx-related part of <time.h>

Here is a summary of technical problems in the struct-tmx-related
part of FCD <time.h> (1998-08-03), section 7.23. The problems fall
into two basic areas:

* struct tmx is not headed in the right direction.

  The struct-tmx-related changes do not address several well-known
  problems with C89 <time.h>, and do not form a good basis for
  addressing these problems. These problems include the following.

  - Lack of precision. The standard does not require precise
    timekeeping; typically, time_t has only 1-second precision.

  - Inability to determine properties of time_t. There's no portable
    way to determine the precision or range of time_t.

  - Poor arithmetic support for the time_t type. difftime is not
    enough for many practical applications.

  - The new interface is not reentrant. A common extension to C89 is
    the support of reentrant versions of functions like localtime.
    This extension is part of POSIX.1. There's no good reason (other
    than historical practice) for time-related functions to rely on
    global state; any new extensions should be reentrant.

  - No control over time zones. There's no portable way for an
    application to inquire about the time in New York, for example,
    even if the implementation supports this.

  - Missing conversions. There's no way to convert between UTC and
    TAI, or between times in different time zones, or to determine
    which time zone is in use.

  - No reliable interval time scale. If the clock is adjusted to keep
    in sync with UTC, there's no reliable way for a program to ignore
    this change.

  - One cannot apply strftime to the output of gmtime, as the %Z and
    %z formats may be misinterpreted.

  (Credit: I've borrowed many of the above points from discussions by
  Clive Feather and Markus Kuhn.)

* struct tmx has several technical problems of its own.
  Even on its own terms, struct tmx has several technical problems
  that would need to be fixed before being made part of a standard.
  These problems include the following.

  - In 7.23.1 paragraph 5, struct tmx's tm_zone member counts
    minutes. This disagrees with common practice, which is to extend
    struct tm by adding a new member tm_gmtoff that is the UTC offset
    in seconds. The extra precision is needed to support historical
    time stamps -- UTC offsets that were not a multiple of one minute
    used to be quite common, and in at least one locale this practice
    did not die out until 1972.

  - The tm_leapsecs member defined by 7.23.1 paragraph 5 is an
    integer, but it is supposed to represent TAI - UTC, and this
    value is not normally an integer for time stamps before 1972.
    Also, it's not clear what this value should be for historical
    time stamps before the introduction of TAI in the 1950s.

  - The tm_ext and tm_extlen members defined by 7.23.1 paragraph 5
    use a new method to allow for future extensions. This method has
    never before been tried in the C Standard, and is likely to lead
    to problems in practice. For example, the draft makes no
    requirement on the lifetime of the storage addressed by tm_ext.
    This means that an application cannot reliably dereference the
    pointer returned by zonetime, because it has no way of knowing
    when the tm_ext member points to freed storage.

  - 7.23.2.3 paragraph 4 adds the following requirement for mktime
    not present in C89:

        If the call is successful, a second call to the mktime
        function with the resulting struct tm value shall always
        leave it unchanged and return the same value as the first
        call.

    This requirement was inspired by the struct-tmx-related changes
    to <time.h>, but it requires changes to existing practice, and it
    cannot be implemented without hurting performance or breaking
    binary compatibility. For example, suppose I am in Sri Lanka, and
    invoke mktime on the equivalent of 1996-10-26 00:15:00 with
    tm_isdst==0.
    There are two distinct valid time_t values for this input, since
    Sri Lanka moved the clock back from 00:30 to 00:00 that day,
    permanently. There is no way to select the time_t by inspecting
    tm_isdst, since both times are standard time. On examples like
    these, C89 allows mktime to return different time_t values for
    the same input at different times during the execution of the
    program. This is common existing practice, but it is prohibited
    by this new requirement.

    It's possible to satisfy this new requirement by adding a new
    struct tm member, which specifies the UTC offset. However, this
    would break binary compatibility. It's also possible to satisfy
    this new requirement by always returning the earlier time_t value
    in ambiguous cases. However, this can greatly hurt performance,
    as it's not easy for some implementations to determine that the
    input is ambiguous; it would require scouting around each
    candidate returned value to see whether the value might be
    ambiguous, and this step would be expensive.

  - The limits on ranges for struct tmx members in 7.23.2.6
    paragraph 2 are unreasonably tight. For example, they disallow
    the following program on a POSIX.1 host with a 32-bit `long',
    since `time (0)' currently returns values above 900000000 on
    POSIX.1 hosts, which is well above the limit LONG_MAX/8 ==
    268435455 imposed by 7.23.2.6.

        #include <time.h>
        #include <stdio.h>

        struct tmx tm;

        int main()
        {
          char buf[1000];

          /* Add current time to POSIX.1 epoch, using mkxtime. */
          tm.tm_version = 1;
          tm.tm_year = 1970 - 1900;
          tm.tm_mday = 1;
          tm.tm_sec = time (0);
          if (mkxtime (&tm) == (time_t) -1)
            return 1;
          strfxtime (buf, sizeof buf, "%Y-%m-%d %H:%M:%S", &tm);
          puts (buf);
          return 0;
        }

    The limits in 7.23.2.6 are not needed. A mktime implementation
    need not check for overflow on every internal arithmetic
    operation; instead, it can cheaply check for overflow by doing a
    relatively simple test at the end of its calculation.

  - 7.23.2.6 paragraph 3 contains several technical problems:
    . In some cases, it requires mkxtime to behave as if each day
      contains 86400 seconds, even if the implementation supports
      leap seconds. For example, if the host supports leap seconds
      and uses Japan time, then using mkxtime to add 1 day to
      1999-01-01 00:00:00 must yield 1999-01-01 23:59:59, because
      there's a leap second at 08:59:60 that day in Japan. This is
      not what most programmers will want or expect.

    . The explanation starts off with ``Values S and D shall be
      determined as follows'', but the code that follows does not
      _determine_ S and D; it consults an oracle to find X1 and X2,
      which means that the code merely places _constraints_ on S and
      D. A non-oracular implementation cannot in general determine
      X1 and X2 until it knows S and D, so the code, if interpreted
      as a definition, is a circular one.

    . The code suffers from arithmetic overflow problems. For
      example, suppose tm_hour == INT_MAX && INT_MAX == 32767. Then
      tm_hour*3600 overflows, even though tm_hour satisfies the
      limits of paragraph 2.

    . The code does not declare the types of SS, M, Y, Z, D, or S,
      thus leading to confusion. Clearly these values cannot be of
      type `int', due to potential overflow problems like the one
      discussed above. It's not clear what type would suffice.

    . The definition for QUOT yields numerically incorrect results
      if either (b)-(a) or (b)-(a)-1 overflows. Similarly, REM
      yields incorrect results if (b)*QUOT(a,b) overflows.

    . The expression Y*365 + (Z/400)*97 + (Z%400)/4 doesn't match
      the Gregorian calendar, which has special rules for years that
      are multiples of 100.

    . The code is uncommented, so it's hard to understand and
      evaluate. For example, the epoch (D=0, S=0) is not described;
      it appears to be (-0001)-12-31 Gregorian, but this should be
      cleared up.

  - 7.23.3.7 says that the number of leap seconds is the ``UTC-UT1
    offset''. It should say ``UTC - TAI''.

------------------------------------------------------------

Appendix 2.
Details of proposed change to <time.h>

Here are the details about my proposed change to <time.h>. This
change reverts the <time.h> part of the standard to define only the
types, functions, and macros that were defined in C89's <time.h>. It
also removes the hard-to-implement requirement in 7.23.2.3
paragraph 4.

* 7.23.1 paragraph 2. Remove the macros _NO_LEAP_SECONDS and
  _LOCALTIME.

* 7.23.1 paragraph 3. Remove the type `struct tmx'.

* 7.23.1 paragraph 5 (struct tmx). Remove this paragraph.

* 7.23.2.3 paragraph 3 (mktime normalization). Remove this
  paragraph.

* 7.23.2.3 paragraph 4. Remove the phrase ``and return the same
  value''. It's not feasible to return the same value in some cases;
  see the discussion of 7.23.2.3 paragraph 4 above.

* 7.23.2.4 (mkxtime). Remove this section.

* 7.23.2.6 (normalization of broken-down times). Remove this
  section; this means footnote 252 will be removed.

* 7.23.3 paragraph 1. Remove the reference to strfxtime.

* 7.23.3.6 (strfxtime). Remove this section.

* 7.23.3.7 (zonetime). Remove this section.

-- end of USA comments

____________________ end of SC22 N2872 ________________________________