JTC1/SC22/WG14
N868
ISO/IEC JTC 1/SC22
Programming languages, their environments and system software interfaces
Secretariat: U.S.A. (ANSI)
ISO/IEC JTC 1/SC22
N2872
TITLE:
Summary of Voting on Final CD Ballot for FCD 9899 - Information technology
- Programming languages - Programming Language C (Revision of ISO/IEC
9899:1990)
DATE ASSIGNED:
1999-01-12
BACKWARD POINTER:
N/A
DOCUMENT TYPE:
Summary of Voting
PROJECT NUMBER:
JTC 1.22.20.01
STATUS:
WG14 is requested to prepare a Disposition of Comments Report and make a
recommendation on the further processing of the FCD.
ACTION IDENTIFIER:
FYI
DUE DATE:
N/A
DISTRIBUTION:
Text
CROSS REFERENCE:
SC22 N2794
DISTRIBUTION FORM:
Def
Address reply to:
ISO/IEC JTC 1/SC22 Secretariat
William C. Rinehuls
8457 Rushing Creek Court
Springfield, VA 22153 USA
Telephone: +1 (703) 912-9680
Fax: +1 (703) 912-2973
email: rinehuls@access.digex.net
_________ end of title page; beginning of voting summary ___________
SUMMARY OF VOTING ON
Letter Ballot Reference No: SC22 N2794
Circulated by: JTC 1/SC22
Circulation Date: 1998-08-24
Closing Date: 1999-01-08
SUBJECT: Final CD Ballot for FCD 9899 - Information technology -
Programming languages - Programming Language C (Revision of
ISO/IEC 9899:1990)
----------------------------------------------------------------------
The following responses have been received on the subject of approval:
"P" Members supporting approval
without comments 8
"P" Members supporting approval
with comments 4
"P" Members supporting approval
with comments not yet received 1
"P" Members not supporting approval 3
"P" Members abstaining 2
"P" Members not voting 4
"O" Members supporting approval
without comments 1
--------------------------------------------------------------------
Secretariat Action:
WG14 is requested to prepare a Disposition of Comments Report and make a
recommendation on the further processing of the FCD.
The comment accompanying the abstention vote from Austria was: "Lack of
expert resources."
The comments accompanying the affirmative votes from Canada, France,
Norway and the United States of America are attached, along with the
comments accompanying the negative votes from Denmark, Japan and the
United Kingdom.
Germany has advised that the comments accompanying their affirmative vote
"will follow within the next ten days". Upon receipt, those comments will
be distributed as a separate SC22 document.
_____ end of voting summary; beginning of detailed summary __________
ISO/IEC JTC1/SC22 LETTER BALLOT SUMMARY
PROJECT NO: JTC 1.22.20.01
SUBJECT: Final CD Ballot for FCD 9899 - Information technology - Programming
languages - Programming Language C (Revision of ISO/IEC 9899:1990)
Reference Document No: N2794 Ballot Document No: N2794
Circulation Date: 1998-08-24 Closing Date: 1999-01-08
Circulated To: SC22 P, O, L Circulated By: Secretariat
SUMMARY OF VOTING AND COMMENTS RECEIVED
Approve Disapprove Abstain Comments Not Voting
'P' Members
Australia (X) ( ) ( ) ( ) ( )
Austria ( ) ( ) (X) (X) ( )
Belgium ( ) ( ) ( ) ( ) (X)
Brazil ( ) ( ) (X) ( ) ( )
Canada (X) ( ) ( ) (X) ( )
China (X) ( ) ( ) ( ) ( )
Czech Republic (X) ( ) ( ) ( ) ( )
Denmark ( ) (X) ( ) (X) ( )
Egypt ( ) ( ) ( ) ( ) (X)
Finland (X) ( ) ( ) ( ) ( )
France (X) ( ) ( ) (X) ( )
Germany (X) ( ) ( ) (*) ( )
Ireland (X) ( ) ( ) ( ) ( )
Japan ( ) (X) ( ) (X) ( )
Netherlands (X) ( ) ( ) ( ) ( )
Norway (X) ( ) ( ) (X) ( )
Romania ( ) ( ) ( ) ( ) (X)
Russian Federation (X) ( ) ( ) ( ) ( )
Slovenia ( ) ( ) ( ) ( ) (X)
Ukraine (X) ( ) ( ) ( ) ( )
UK ( ) (X) ( ) (X) ( )
USA (X) ( ) ( ) (X) ( )
'O' Members Voting
Korea Republic (X) ( ) ( ) ( ) ( ) ( )
* The Germany Member Body has advised that "comments will follow within
the next ten days". Upon receipt, these comments will be distributed
as a separate SC22 document.
--------- end of detailed summary; beginning of Canada Comments _____
From: "Doug Langlotz" (dlanglots@scc.ca)
Document number FCD 9899 (JTC 1/SC22/N2794)
Canada APPROVES WITH COMMENTS.
Canada supports approval with the following comments.
Comments:
Comment #1
Category: Normative
Committee Draft Subsection: 6.8.4 and 6.8.5
Title:
Inconsistent scoping rules for compound literals and control
statements
Description:
In 6.8.5.3, the for statement was modified when incorporating mixed
declarations and code to limit the scope of the (possible)
declaration in clause-1. However, this makes the behaviour
inconsistent for compound literals, which now have different scope
rules in a for statement than in the other control statements.
From example 8 in 6.5.2.5:
struct s { int i; };
int f (void)
{
struct s *p = 0, *q;
int j = 0;
while (j < 2)
q = p, p = &((struct s){j++});
return p == q && q->i == 1;
}
Note that if a for loop were used instead of a while loop, the
lifetime of the unnamed object would be the body of the loop
only, and on entry next time around p would be pointing to an
object which is no longer guaranteed to exist, which would result
in undefined behaviour.
The behaviour of compound literals should be made consistent by
making all of the control statements have the same scoping rules
as for loops.
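Whichever scoping rule is adopted, a program can avoid the question by copying the value out of the compound literal instead of carrying its address into the next iteration. The following is a minimal sketch of that pattern; the function name last_value is hypothetical and does not appear in the draft.

```c
struct s { int i; };

/* Hypothetical helper: keeps a copy of the value, never a pointer that
   would have to outlive one iteration of the loop body. */
int last_value(void)
{
    struct s saved = { 0 };
    for (int j = 0; j < 2; j++) {
        struct s *p = &(struct s){ j };  /* unnamed object lives in this block */
        saved = *p;                      /* copy while the object still exists */
    }
    return saved.i;
}
```

Because only the copy survives the iteration, the result does not depend on how long the unnamed object itself lives.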
Comment #2
Category: Normative
Committee Draft Subsection: 6.5.2.5
Title:
Compound literals constraint #2
Description:
Constraint #2 in 6.5.2.5 seems to have an undesirable interpretation.
The constraint is, "No initializer shall attempt to provide a value
for an object not contained within the entire unnamed object specified
by the compound literal." This seems to disallow the following
(assume the Fred type has 3 members and the George type has 2 so in
neither case are we going past the end of the object):
(Fred){1, 7, &((George){5, 6})}
when it was really meant to disallow:
(int[2]){1, 2, 3}
Perhaps the rule could be broken down into more explicit cases: no
subscript (implicit or explicit) should be beyond the bounds of the
array object that it is applied to; no more fields in a struct object
should be initialized than there are fields in that struct; only
names of members of the struct or union object being initialized may
be used for designated initializers.
Similar wording is also used in 6.7.8.
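For reference, the first form can be written out as a complete fragment. This is only a sketch: the definitions of struct Fred and struct George and the function sum_members are assumptions made for illustration, since the comment uses the names Fred and George without defining them.

```c
/* Hypothetical definitions matching the assumption in the comment:
   Fred has 3 members, George has 2. */
struct George { int a, b; };
struct Fred   { int x, y; struct George *g; };

int sum_members(void)
{
    /* The nested compound literal is a separate unnamed object; its
       address initializes the third member of the outer one. */
    struct Fred f = (struct Fred){ 1, 7, &(struct George){ 5, 6 } };
    return f.x + f.y + f.g->a + f.g->b;
}
```

Neither initializer list here goes past the end of the object it initializes, which is the property the constraint was presumably meant to protect.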
Comment #3
Category: Normative
Committee Draft Subsection: 6.3.1.3
Title:
Converting to signed integral type (based on previous Canadian
comment)
Description:
Original Comment:
Section 6.3.1.3 paragraph 3 describes the result of converting a
value with integral type to a signed type which cannot represent the
value.
It says that the result is implementation defined, however, we
believe that the result should be undefined, analogous to the case
where an operation yields a value which cannot be represented
by the result type.
The purpose of this comment was to ensure that, when converting a value
with integral type to a signed type which cannot represent the value,
the implementation is allowed to terminate or to fail to
translate.
Details from a note sent to the reflector:
I would claim that the use of "implementation defined" isn't
appropriate in 6.3.1.3 paragraph 3 for several reasons:
1. (possibly pedantic) The draft standard does not provide two or
more choices as required by 3.19. What is an implementation
allowed to do? Is termination legal? Failure to translate
ruled out.
2. I interpret section 4 paragraph 3 to forbid an implementation from
failing to translate because of an overflow during a conversion to
a signed integral type. Yet this would seem to be quite
appropriate. For example, an implementation should be *allowed*
to treat the following external definition as an error ("fail to
translate"), assuming that INT_MAX is not representable in short:
short big = INT_MAX;
Contrast that with the following, for which the implementation
*is* allowed to fail to translate:
short big = INT_MAX + 1;
3. An argument similar to 2 can be made that a run-time conversion
"problem" should be allowed to be treated as an error. It isn't
clear to me whether the definition of "implementation-defined"
allows for such an interpretation. There are strong hints in the
draft standard that lead me to think that it is not the
committee's intent to allow such an interpretation. I don't
actually see that this is spelled out in the draft standard.
4. In certain cases 6.3.1.3 paragraph 3 and 6.5 paragraph 5 actually
seem to conflict. For example, I would suggest that in the
following example, both apply.
short big = (short) INT_MAX;
Clearly 6.3.1.3 paragraph 3 applies, since a conversion to a
signed integral type is being performed, and the value cannot be
represented in it. So the result must be implementation-defined.
Clearly 6.5 paragraph 5 applies since the value of the expression
is not in the range of representable values for its type. So the
behavior is undefined.
This seems to me to be a contradiction. A simple fix is to
change 6.3.1.3 paragraph 3 to call for "undefined behavior".
5. Although the wording in the draft is very similar to that in the
previous standard, there is a difference. I interpreted the old
standard's looser definition of "implementation defined" to allow
"failure to translate". This latitude is no longer available
in the draft.
6. I feel that it is an error to try to represent an out of range
value in an integral type. Yet "implementation defined" implies
that this is *not* an error (see section 4 paragraph 3). It is
in the interest of users that implementations be allowed to treat
it as an error. I admit that requiring that it be "caught" at
runtime would have a serious performance impact for many
implementations.
I'm only asking that the standard should allow this error to be
caught.
The standard should require that constant expressions with signed
integer overflow should be constraint violations. This has no
runtime cost. If this isn't acceptable, at least allow this
situation to be treated as an error ("undefined behavior"
accomplishes this).
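Until the question is settled, a program that wants the out-of-range case treated as an error can test the range itself before converting. The sketch below assumes a hypothetical helper named to_short_checked; it is not proposed wording, only an illustration of the check a user would otherwise want the implementation to be allowed to make.

```c
#include <limits.h>

/* Hypothetical helper: converts only when the value is representable,
   so 6.3.1.3 paragraph 3 is never reached. Returns 1 on success. */
int to_short_checked(long value, short *out)
{
    if (value < SHRT_MIN || value > SHRT_MAX)
        return 0;             /* caller decides how to report the error */
    *out = (short)value;      /* value is representable: it is unchanged */
    return 1;
}
```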
Excerpts from 3. Terms and definitions:
3.11
1 implementation-defined behavior
unspecified behavior where each implementation documents how
the choice is made
2 EXAMPLE An example of implementation-defined behavior is the
propagation of the high-order bit when a signed integer is
shifted right.
3.19
1 unspecified behavior
behavior where this International Standard provides two or
more possibilities and imposes no requirements on which is
chosen in any instance
2 EXAMPLE An example of unspecified behavior is the order in
which the arguments to a function are evaluated.
Excerpt from 4. Conformance:
3 A program that is correct in all other aspects, operating on
correct data, containing unspecified behavior shall be a correct
program and act in accordance with 5.1.2.3.
[5.1.2.3 is "Program execution"]
All of 6.3.1.3 Signed and unsigned integers:
1 When a value with integer type is converted to another integer
type other than _Bool, if the value can be represented by the
new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted
by repeatedly adding or subtracting one more than the maximum
value that can be represented in the new type until the value
is in the range of the new type.
3 Otherwise, the new type is signed and the value cannot be
represented in it; the result is implementation-defined.
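The rule in paragraph 2 can be modelled directly. The sketch below does so for a target type of unsigned char; the function name wrap_to_uchar is hypothetical, and the code assumes UCHAR_MAX is smaller than LONG_MAX, which holds on common implementations but is not guaranteed by the draft.

```c
#include <limits.h>

/* Models 6.3.1.3 paragraph 2 for a target type of unsigned char:
   add or subtract (UCHAR_MAX + 1) until the value is in range.
   Assumes UCHAR_MAX < LONG_MAX. */
long wrap_to_uchar(long value)
{
    const long modulus = (long)UCHAR_MAX + 1;
    while (value < 0)
        value += modulus;
    while (value > UCHAR_MAX)
        value -= modulus;
    return value;
}
```

The result agrees with what a cast to unsigned char produces, which is the well-defined half of the conversion rules; only the signed case in paragraph 3 is at issue in the comment.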
Excerpt from 6.5 Expressions:
5 If an exception occurs during the evaluation of an expression
(that is, if the result is not mathematically defined or not in
the range of representable values for its type), the behavior
is undefined.
Comment #4
Category: Normative
Committee Draft Subsection: 6.2.5
Title:
Restrictions on long long
Description:
Proposal for a change to the Draft C standard (WG14/N843)
This proposal suggests a collection of small changes to the Draft C
Standard (WG14/N843) dated August 3, 1998. The changes are intended
to isolate long long int and implementation-defined extended integer
types from the common integer types. In particular, we wish to
ensure that size_t and ptrdiff_t must be one of the common integer
types, rather than long long or an implementation-defined extended
integer type. Also, we wish to ensure that no values are converted to
long long or an implementation-defined extended integer type,
except when the conversion is explicit. For example, on a system
where integers have 32 bits, a constant like 0xFFFFFFFF should be
converted to unsigned long rather than long long.
In order to implement this principle, we suggest the following
wording changes to various sections in the draft document.
6.2.5 Types
4. There are four standard signed integer types, designated as
signed char, short int, int, and long int. (These and other types
may be designated in several additional ways, as described in
6.7.2.) There is one standard extended signed integer
type designated as long long int. There may be additional
implementation-defined extended signed integer types. The standard
extended signed integer type and the implementation-defined
extended signed integer types are collectively called the extended
signed integer types. The standard and extended signed integer
types are collectively called signed integer types.
7.17 Common definitions <stddef.h>
<<Add to end of paragraph 2>>None of the above three types shall be
defined with an extended integer type, whether standard or
implementation-defined.
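Whether a given implementation already behaves as this proposal wishes can be checked at compile time. The sketch below uses the _Generic selection from the later C11 revision, which is not part of this draft; both function names are hypothetical.

```c
#include <stddef.h>

/* 1 if the constant 0xFFFFFFFF has type long long on this implementation.
   Uses C11 _Generic, which is NOT part of the draft under ballot. */
int hex_constant_is_long_long(void)
{
    return _Generic(0xFFFFFFFF, long long: 1, default: 0);
}

/* 1 if size_t is one of the unsigned standard types other than
   unsigned long long, 0 otherwise. */
int size_t_is_common_type(void)
{
    return _Generic((size_t)0,
                    unsigned char: 1, unsigned short: 1,
                    unsigned int: 1, unsigned long: 1,
                    default: 0);
}
```

The second function may legitimately return 0 on platforms where size_t is wider than unsigned long, which is exactly the situation the proposal wants to rule out.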
________end of Canada Comments; beginning of Denmark Comments_________
From: Charlotte Laursen <cl@ds.dk>
Date: 1999-01-06
Danish vote on FCD 9899
Ballot comments on FCD 9899, C, SC22 N2794
The DS vote is negative
The vote can be changed into an affirmative one if the following
problems are resolved satisfactorily.
1. Two functions isblank(), iswblank() are added with the description
contained in CD 9899, SC22 N2620.
2. The external linkage limit is set to 32 *characters*, and a note
is added that a character may need to be represented by more
than one byte.
3. The character terminology and its use are tidied up and brought
into line with SC2 terminology, e.g. 3.5, 3.14.
The ISO/IEC 9945 (POSIX) series of standards have done a related exercise.
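As an illustration of the first request, the sketch below shows how isblank() would typically be used, assuming a library that provides it with the semantics described in the CD (true for space and horizontal tab in the "C" locale). The function name count_leading_blanks is hypothetical.

```c
#include <ctype.h>

/* Counts leading blanks (space or horizontal tab in the "C" locale).
   The cast avoids passing a negative char value to isblank(). */
int count_leading_blanks(const char *s)
{
    int n = 0;
    while (isblank((unsigned char)s[n]))
        n++;
    return n;
}
```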
_____ end of Denmark Comments; beginning of France Comments _________
From: ARNAUD.A.R.D.DIQUELOU@email.afnor.fr
TITLE: Ballot comments on SC22N2794 - FCD 9899 - C
STATUS: Approved AFNOR comments
France votes YES to document SC22 N2794, with the following comments.
A. First, it should be noted that, with one (important) exception,
the points raised in the previous vote (N2690, answered by N2792)
were satisfactorily resolved. The overall impression is that the
document has been vastly improved in the ongoing process of revision.
B. Then, there is a technical point related to <time.h> that we
missed the first time, and for which we propose to drop all
the new material in this area, waiting for experts (who have
already set up a working group) to design a better solution.
This point is detailed below.
C. Finally, there is long long.
Our position on this subject has not changed: we feel this feature
does not need to be included in C9X (see the fully detailed analysis
in AFNOR's previous ballot comments in SC22 N2690).
The Committee answered PR at the preceding vote (i.e. "has reaffirmed
this decision on more than one occasion.")
On the other hand, we see this problem as being very minor when
compared to the goal of delivering a new revision of the C Standard,
with the added precisions and new features this draft is proposing.
So we no longer require this feature to be removed from
the draft (but we would be very happy if it were).
AFNOR detailed comments related to <time.h>
Comment 1.
Category: Feature that should be included
Committee Draft subsection: 7.23, 7.25, 7.26
Title: Useless library functions made deprecated
Detailed description:
mktime (7.23.2.3) is entirely subsumed by mkxtime (7.23.2.4).
Similar cases occur with gmtime/localtime (7.23.3.3/7.23.3.4)
vs zonetime (7.23.3.7), strftime (7.23.3.5) vs strfxtime
(7.23.3.6), and wcsftime (7.24.5.1) vs wcsfxtime (7.24.5.2).
The former functions do not add significant value over the latter
ones (in particular, execution times are similar). So if the
latter are to be kept (that is, if the comment below is dropped),
the former should be deprecated, to avoid
spurious specifications being kept for years.
Comment 2.
Category: Feature that should be removed
Committee Draft subsection: 7.23, 7.24.5.2
Title: Removal of struct tmx material in the <time.h> library subclause
Detailed description:
a) The mechanism of tm_ext and tm_extlen is entirely new to the
C Standard, so attention should be paid to the use that can be
made of it. Unfortunately, the current text is very elliptical about
this use, particularly about the storage of the further members
referred to by 7.23.1p5.
In particular, it is impossible from the current wording to know how
to correctly free a struct tmx object whose tm_ext member is
not NULL, as in the following snippet:
// This first function is OK (provided I have understood correctly).
struct tmx *tmx_alloc(void) // alloc a new struct tmx object
{
struct tmx *p = malloc(sizeof(struct tmx));
if( p == NULL ) handle_failed_malloc("tmx_alloc");
memset(p, 0, sizeof(struct tmx)); // initialize to 0 all members
p->tm_isdst = -1;
p->tm_version = 1;
p->tm_ext = NULL;
return p;
}
// This second function has a big drawback
void tmx_free(struct tmx *p) // free a previously allocated object
{
if( p == NULL ) return; // nothing to do
if( p->tm_ext ) {
// some additional members have been added by the implementation
// or by users' programs using a future version of the Standard
// since we do not know what to do, do nothing.
;
// If the members were allocated, they are now impossible to
// access, so might clobber the memory pool...
}
free(p);
return;
}
Various fixes might be thought of. Among these, I see:
- always require that allocation of the additional members be in
control of the implementation; this way, programs should never
"free() tm_ext"; effectively, this makes these additional members
of the same status as are currently the additional members that
may be (and are) part of struct tm or struct tmx
- always require these additional objects to be separately dynamically
allocated. This requires that copies between two struct tmx objects
should dynamically allocate some memory to keep these objects.
In effect, this will require an additional example to highlight this
(perhaps showing what a tmxcopy(struct tmx*, const struct tmx*)
function might be).
Both solutions have pros and cons. But it is clear that the current
state, that encompasses both, is not clear enough.
Other examples of potential pitfalls are highlighted below.
b) This extension mechanism might be difficult to use with
implementations that currently have additional members to struct tm
(_tm_zone, containing a pointer to a string giving the name of the time
zone, and _tm_gmtoff, whose meaning is almost the same as tm_zone,
except that it is 60 times bigger). The latter is particularly
interesting, since it might need tricky kludges to ensure the internal
consistency of the struct tmx object (any change to either member
should ideally be applied to the other, yielding potential
rounding problems). Having additional members, accessed through tm_ext,
for example one whose effect is to duplicate _tm_zone behaviour,
is probably awful when seen this way.
c) 7.23.1p5 states that a positive value for tm_zone means that the
represented broken-down time is ahead of UTC. In the case where
the relationship between the broken-down time and UTC is not known
(thus tm_zone should be equal to _LOCALTIME), it is therefore
forbidden to be positive. This might deserve a more explicit
requirement in 7.23.1p2.
d) POSIX compatibility, as well as proper support of historical
time zones, will require tm_zone to be a count of seconds instead
of a count of minutes; this will in turn require tm_zone to be
enlarged to long (or to int_least32_t), to handle properly
the minimum requirements.
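The arithmetic behind this point can be made explicit. The sketch below takes the bound of 1439 minutes that the draft places on tm_zone (quoted in point k) and expresses it in seconds; the macro and function names are hypothetical.

```c
/* Largest offset magnitude, in minutes, that the draft allows for
   tm_zone (between -1439 and +1439). */
#define MAX_ZONE_MINUTES 1439L

/* The same bound expressed in seconds. */
long max_zone_seconds(void)
{
    return MAX_ZONE_MINUTES * 60L;
}

/* 1 if that value is larger than 32767, the smallest INT_MAX an
   implementation is permitted to have. */
int exceeds_minimum_int_range(void)
{
    return max_zone_seconds() > 32767L;
}
```

Since the bound in seconds does not fit in the minimum guaranteed range of int, a count of seconds needs long or int_least32_t, as stated above.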
e) POSIX compatibility might be defeated with the restriction
set upon Daylight Saving Times algorithms to actually *advance*
the clocks. This is a minor point, since there is no historical
need, nor any perceived real need, for such a "feature".
f) On implementations that support leap seconds, 7.23.2.2
(difftime) does not specify whether the result should include
(thus considering calendar time to be effectively UTC) or
disregard (thus considering calendar time to be effectively
TAI) leap seconds. This is unfortunate.
g) The requirement set up by 7.23.2.3p4 (a second call to
mktime should yield the same value and should not modify the
broken-down time) is too restrictive for mktime, because
mktime does not allow complete determination of the calendar
time associated with a given broken-down time. Examples
include the so-called "double daylight saving time" that
was in force in the past, or cases where the time zone associated
with the time changes relative to UTC.
For example, in Sri Lanka, the clocks moved back from 0:30
to 0:00 on 1996-10-26, permanently. So the timestamp
1996-10-26T00:15:00, tm_isdst=0 is ambiguous when given to
mktime(); and widely deployed implementations exist that use
caches, and thus might deliver either the former or the latter
result on a random basis; this specification will effectively
disallow caching inside mktime, with a big performance hit
for users.
This requirement (the entire paragraph) should be withdrawn.
Anyway, mktime is intended to be superseded by mkxtime, so
there is not much gain trying to improve a function that is
to be declared deprecated.
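The property required by 7.23.2.3p4 can be tested for a particular timestamp as sketched below. The function name mktime_is_stable is hypothetical, and the timestamp is deliberately an ordinary midday value; the sketch says nothing about ambiguous times such as the Sri Lankan example above, where the two calls may legitimately differ.

```c
#include <time.h>

/* Returns 1 if two successive mktime() calls on the same structure
   agree for an ordinary midday timestamp, 0 if not, -1 on failure. */
int mktime_is_stable(void)
{
    struct tm tm = { 0 };
    tm.tm_year = 99;      /* 1999 */
    tm.tm_mon  = 0;       /* January */
    tm.tm_mday = 15;
    tm.tm_hour = 12;
    tm.tm_isdst = -1;     /* let the implementation decide */

    time_t first = mktime(&tm);
    if (first == (time_t)-1)
        return -1;
    struct tm after_first = tm;      /* as normalized by the first call */
    time_t second = mktime(&tm);

    return first == second
        && tm.tm_hour == after_first.tm_hour
        && tm.tm_mday == after_first.tm_mday;
}
```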
h) The case where mktime or mkxtime is called with tm_zone set
to _LOCALTIME and tm_isdst being negative (unknown), and when
the input moment of time is inside the "Fall back", that is
between 1:00 am and 2:00 am on the last Sunday in October
(in the United States), leads to a well known ambiguity.
Contrary to what might have been expected, this ambiguity
is not resolved by the additions in the revision of the Standard
(either result might be returned): it all boils down to the
sentence in the algorithm in 7.23.2.6 saying
// X2 is the appropriate offset from local time to UTC,
// determined by the implementation, or [...]
Since there are two possible offsets in this case...
i) Assuming the implementation handles leap seconds, if broken-down
times lying in the future are passed (where leap seconds
cannot de facto be determined), 7.23.2.4p4 (effect of
_NO_LEAP_SECONDS on mkxtime), and in particular the sentence in
parentheses, seems to require that the count of leap seconds
should be assumed to be 0. This would be ill advised; I would
prefer it to be implementation-defined, with the recommended
practice (or requirement) of being 0 for implementations that
do not handle leap seconds.
j) Assuming the implementation handles leap seconds, the effect
of 7.23.2.4p4 is that the "default" behaviour on successive
calls to mkxtime yields a new, strange time scale that is
neither UTC nor TAI. For example (remember that a positive
leap second will be introduced at 1998-12-31T23:59:60Z
in renewed ISO 8601 notation):
struct tmx tmx = {
.tm_year=98, .tm_mon=0, .tm_mday=1, .tm_hour=0, .tm_min=0, .tm_sec=0,
.tm_version=1, .tm_zone=_LOCALTIME, .tm_ext=NULL,
.tm_leapsecs=_NO_LEAP_SECONDS }, tmx0;
time_t t1, t2;
double delta, days, secs;
char s[SIZEOF_BUFFER];
t1 = mkxtime(&tmx);
puts(ctime(&t1));
if( tmx.tm_leapsecs == _NO_LEAP_SECONDS )
printf("Unable to determine number of leap seconds applied.\n");
else
printf("tmx.tm_leapsecs = %d\n", tmx.tm_leapsecs);
tmx0 = tmx; // !!! may share the object pointed to by tmx.tm_ext...
++tmx.tm_year;
t2 = mkxtime(&tmx);
puts(ctime(&t2));
delta = difftime(t2, t1);
secs = fmod(delta, 86400.0); days = (delta - secs) / 86400.0;
printf("With ++tm_year: %.7e s == %f days and %f s\n", delta, days, secs);
if( tmx.tm_leapsecs == _NO_LEAP_SECONDS )
printf("Unable to determine number of leap seconds applied.\n");
else
printf("tmx.tm_leapsecs = %d\n", tmx.tm_leapsecs);
tmx = tmx0; // !!! may yield problems if the content pointed to by
// tm_ext have been modified by the previous call...
tmx.tm_hour += 24*365;
tmx.tm_leapsecs = _NO_LEAP_SECONDS;
t2 = mkxtime(&tmx);
puts(ctime(&t2));
delta = difftime(t2, t1);
secs = fmod(delta, 86400.0); days = (delta - secs) / 86400.0;
printf("With tm_hour+=24*365: %.7e s == %f days and %f s\n", delta, days, secs);
if( tmx.tm_leapsecs == _NO_LEAP_SECONDS )
printf("Unable to determine number of leap seconds applied.\n");
else
printf("tmx.tm_leapsecs = %d\n", tmx.tm_leapsecs);
Without leap seconds support, results should be consistent and
straightforward; for example (for me in Metropolitan France):
Thu Jan 1 01:00:00 1998
Unable to determine number of leap seconds applied.
Fri Jan 1 01:00:00 1999
With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
Unable to determine number of leap seconds applied.
Fri Jan 1 01:00:00 1999
With tm_hour+=24*365: 3.1536000e+07 s == 365 days and 0 s
Unable to determine number of leap seconds applied.
Things may change with leap seconds support; assuming we are in a
time zone behind UTC (e.g. in the United States), the results might be:
Wed Dec 31 21:00:00 1997
tmx.tm_leapsecs = 31
Thu Dec 31 21:00:00 1998
With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31
Thu Dec 31 21:00:00 1998
With tm_hour+=24*365: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31
But with a time zone ahead of UTC, results might be
Thu Jan 1 01:00:00 1998
tmx.tm_leapsecs = 31
Thu Dec 31 00:59:59 1998
With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31
Fri Jan 1 01:00:00 1999
With tm_hour+=24*365: 3.1536001e+07 s == 365 days and 1 s
tmx.tm_leapsecs = 32
And if the time zone is set to UTC, results might be
Thu Jan 1 00:00:00 1998
tmx.tm_leapsecs = 31
Thu Dec 31 23:59:60 1998
With ++tm_year: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31
Fri Jan 1 00:00:00 1999
With tm_hour+=24*365: 3.1536001e+07 s == 365 days and 1 s
tmx.tm_leapsecs = 32
or, for the last three lines
Thu Dec 31 23:59:60 1998
With tm_hour+=24*365: 3.1536000e+07 s == 365 days and 0 s
tmx.tm_leapsecs = 31
The last result is questionable, since both choices are allowed by
the current text (the result is right into the leap second itself).
Moreover, implementations with caches might return either on a
random basis...
Bottom line: the behaviour is surprising at least.
k) 7.23.2.6p2 (maximum ranges on input to mkxtime) uses LONG_MAX
submultiples to constrain the members' values. Apart from the fact
that the limits given could easily be made greater in general
cases, it has some defects:
- the constraint disallows the common use on POSIX systems of tm_sec
as the only input member, set to a POSIX-style time_t value;
- the constraints are very awkward for implementations where
long ints are bigger than "normal" ints: on such platforms, all
members should first be converted to long before any operation
takes place;
- since there are eight main input fields, plus a ninth (tm_zone)
which is further constrained to be between -1439 and +1439, the
result might nevertheless overflow, so special provision to
handle this is needed in any event.
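For comparison, the existing mktime already accepts members outside their normal ranges and normalizes them, which is the behaviour these input limits are meant to bound. The sketch below shows that with plain mktime, since mkxtime exists only in the draft; the function name normalize_jan_32 is hypothetical.

```c
#include <time.h>

/* Fills *out with the normalized form of "January 32nd, 1999, noon".
   Returns 1 on success, 0 if mktime() cannot represent the time. */
int normalize_jan_32(struct tm *out)
{
    struct tm tm = { 0 };
    tm.tm_year = 99;      /* 1999 */
    tm.tm_mon  = 0;       /* January */
    tm.tm_mday = 32;      /* deliberately out of range */
    tm.tm_hour = 12;
    tm.tm_isdst = -1;

    if (mktime(&tm) == (time_t)-1)
        return 0;
    *out = tm;            /* members now hold in-range values */
    return 1;
}
```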
l) There is an obvious (and already known) typo in the description
of D, regarding the handling of years that are multiples of 100. Also
this definition should use QUOT and REM instead of / and %.
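The draft's formula for D is not reproduced here. The sketch below only illustrates the Gregorian century rule that the typo concerns; the function name is_leap_year is hypothetical.

```c
/* Gregorian rule: every 4th year is a leap year, except years that are
   multiples of 100, unless they are also multiples of 400. */
int is_leap_year(long year)
{
    if (year % 400 == 0)
        return 1;
    if (year % 100 == 0)
        return 0;
    return year % 4 == 0;
}
```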
m) Footnote 252 introduces the use of these library functions with
the so-called proleptic Gregorian calendar, that is the rules for
the Gregorian calendar applied to any year, even before Gregory's
reform. This seems to contradict 7.23.1p1, which says that calendar
time's dates are relative to the Gregorian calendar, thus tm_year
should in any case be greater than -318. If that is the intent, another
footnote in 7.23.1p1 might be worthwhile. Another way is to rewrite
7.23.1p1 something like "(according to the rules of the Gregorian
calendar)". See also point l) below.
n) The static status of the returned result of localtime and gmtime
(which is annoying, but that is another story) is clearly set up
by 7.23.3p1. However, this does not scale well to zonetime, given
that this function might in fact return two objects: a struct tmx,
and an optional object containing additional members, pointed to
by tmx->tm_ext.
If the latter is to be static, this might yield problems with mkxtime
as well, since 7.23.2.4p5 states that the broken-down time "produced"
by mkxtime is required to be identical to the result of zonetime
(This will effectively require that the tm_ext member should always
point to a static object held by the implementation; if that is the
original intent, please state it clearly).
o) There is a direct contradiction between the algorithm given for
asctime in 7.23.3.1, which will overflow if tm_year is not in the
range -11899 to +8099, and the statements in 7.23.2.6 that are intended
to encompass a broader range.
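The overflow arises because the asctime algorithm writes into a fixed 26-byte buffer. A checked variant can be sketched as follows, assuming snprintf is available and that tm_wday and tm_mon are already in range; the function name asctime_checked is hypothetical and this is not proposed wording.

```c
#include <stdio.h>
#include <time.h>

/* Checked variant of the asctime layout: returns 1 and fills buf
   (26 bytes, as in the asctime algorithm) only if the text fits.
   Assumes 0 <= tm_wday <= 6 and 0 <= tm_mon <= 11. */
int asctime_checked(const struct tm *t, char buf[26])
{
    static const char wday[][4] = { "Sun", "Mon", "Tue", "Wed",
                                    "Thu", "Fri", "Sat" };
    static const char mon[][4]  = { "Jan", "Feb", "Mar", "Apr", "May", "Jun",
                                    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
    int n = snprintf(buf, 26, "%.3s %.3s%3d %.2d:%.2d:%.2d %ld\n",
                     wday[t->tm_wday], mon[t->tm_mon], t->tm_mday,
                     t->tm_hour, t->tm_min, t->tm_sec,
                     1900L + t->tm_year);
    return n >= 0 && n < 26;   /* n counts characters excluding the null */
}
```

With a four-digit year the text is exactly 25 characters plus the terminating null; a fifth digit no longer fits, which is the upper limit of +8099 mentioned above.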
All of these points argue for a more thorough overhaul of this part
of the library. Such work has been initiated recently, as
Technical Committee J11 is aware. In the meantime, I suggest
dropping all these new features from the current revision of
the Standard.
It means in effect:
i) removing subclauses 7.23.2.4 (mkxtime), 7.23.2.6 (normalization),
7.23.3.6 (strfxtime), 7.23.3.7 (zonetime), 7.24.5.2 (wcsfxtime).
ii) removing paragraphs 7.23.2.3p3 and 7.23.2.3p4 (references to
7.23.2.6),
iii) the macros _NO_LEAP_SECONDS and _LOCALTIME in 7.23.1p2 should be
removed, as they are becoming useless. Same holds for struct tmx
in 7.23.1p3.
iv) 7.23.1p5 (definition of struct tmx) should also be removed, as it
is becoming useless too.
__________ end of France Comments; beginning of Japan Comments ______
VOTE ON FCD 9899
JTC 1/SC22 N2794
Information technology - Programming languages -
Programming Language C (Revision of ISO/IEC 9899:1990)
----------------------------------------------------------
The National Body of Japan disapproves ISO/IEC JTC1/SC22 N 2794, VOTE ON
FCD 9899: Information technology - Programming languages - Programming
Language C (revision of ISO/IEC 9899:1990.)
If the following comments are satisfactorily resolved, Japan's vote will
be changed to "Yes".
1. 64-bit data type should be optional
Japan has been arguing that a 64-bit data type should be optional for
freestanding environments because it tends to increase the sizes of
program code and the run-time library. This situation forces users and
vendors who do not need to handle any 64-bit data to implement and use
an unnecessarily fat conforming implementation. For example, compiler
vendors and users of 16-bit microprocessors do not need a 64-bit data
type in almost all cases.
Of course, Japan knows well that this issue has been discussed among WG14
committee for a long time, and recognizes that a majority of WG14
committee agrees to introduce 64-bit data type as a mandatory
specification to ISO C standard. So, Japan decided to agree to make 64-bit
data type as a mandatory specification, provided that the following
requirements are accepted by the committee:
a) In the Rationale document, explicitly specify the logical reason why
the 64-bit data type needs to be introduced as a MANDATORY specification
to the ISO C standard, in other words, why the 64-bit data type can NOT
be OPTIONAL. Please understand that this is NOT requiring the reason why
64-bit data type is necessary. Japan already understands the necessity of
64-bit data type very well. Japan needs the logical reason why the
committee has denied the proposal of "OPTIONAL".
b) In the Rationale document, explicitly specify an effective example of
a conforming implementation which supports the mandatory 64-bit data type
and can reduce the code size of a program to a size similar to (never
larger than) that in an existing C90 conforming implementation if the
program does not use 64-bit data at all.
c) In some appropriate manner, publish the Rationale document that
includes the above two descriptions.
2. "K.4 Locale-specific behavior" and the wcscoll function
@ Annex K.4
The wcscoll function, defined in sub-clause 7.24.4.4.2, has a
locale-specific behavior, therefore, add the reference of the wcscoll
function to Annex K.4.
3. Modify description for the conversion specifiers (e, f, g, a)
@ 7.19.6.1 fprintf function
@ 7.24.2.1 fwprintf function
The description of the conversion specifiers f, F, e, E, g, G, a and A
uses the term "a (finite) floating-point number." This kind of expression
using parentheses, "(finite)", is not appropriate in the specification
of a programming language standard. It should be changed to a stricter
description using well-defined terms. Japan's concrete proposal was
already presented in the comment attached to the vote on CD Approval.
Please refer to SC22 N 2790 (SoV of CD vote) and N 2792 (Disposition of
CD vote comments).
[More explanation of this issue]
A floating-point number is defined in sub-clause "5.2.4.2.2
characteristics of floating types <float.h>". In "5.2.4.2.2", the
floating-point number seems to be categorized as follows:
floating-point number
+
|
+---- normalized floating-point number
|
+---- not normalized floating-point number
+
|
+---- subnormal floating-point number
|
+---- infinities
|
+---- NaNs
|
+---- ...
That is, infinities and NaNs can be interpreted as one category of
"floating-point number". (In the standard, there is no explicit
description describing that infinities and NaNs are NOT floating-point
numbers.)
On the other hand, footnote 15 says that "although they are stored
in floating types, infinities and NaNs are not floating-point numbers";
however, the footnote is NOT a normative part of the standard. In order
to make the standard clearer, this description should be moved to the
normative part of the standard. In that case, removing the word
"(finite)" is a sufficient change to the current description of the
conversion specifiers (e, f, g, a). However, if the committee leaves
footnote 15 as is, the description for e, f, g, and a should be changed
as Japan has already proposed.
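The three cases at issue can be observed directly. The following is a
minimal sketch, assuming a hosted implementation that provides snprintf
and the INFINITY and NAN macros of <math.h>; the helper name format_f is
hypothetical and does not appear in the draft.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical helper: formats x with the f conversion specifier
   into buf and returns buf for convenience. */
const char *format_f(char *buf, size_t size, double x)
{
    snprintf(buf, size, "%f", x);
    return buf;
}
```

A finite argument such as 1.5 yields the [-]ddd.ddd style, an infinity
yields one of the [-]inf or [-]infinity styles, and a NaN yields one of
the [-]nan styles; which of the permitted spellings appears is left to
the implementation.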
-----
Cf. SC22 N 2792
Disposition of Comments Report on CD Approval of CD
9899 - Information technology - Programming languages
- Programming Language C (Revision of ISO/IEC
9899:1990)
> 3.8 A double argument for the conversion specifier
>
> > 2. Editorial Comments
> > 14) A double argument for the conversion specifier
> >
> > Sub clause 7.12.6.1 (page 232 - 233 in draft 9) and
> > sub clause 7.18.2.1 (page 308 - 309 in draft 9):
> >
> > In the description about the conversion specifiers f, F, e,
> > E and G of the function f[w]printf,
> > "a double argument representing a floating-point number
> > is..."
> > should be changed to
> > "a double argument representing a normalized
> > floating-point number is..." ^^^^^^^^^^
> > in order to clarify the range and the definition of the
> > double argument.
> >
> > WG14: The Committee discussed this comment, and came to the
> > consensus that this is not an editorial issue, some
> > floating point arithmetic support denormal numbers and
> > infinities.
> > There will need to be a detailed proposal to support
> > this change.
>
> The original intention of Japanese comment is to point out
> that the current description:
>
> "A double argument representing a floating number is
> converted to ...[-]ddd.ddd...
> A double argument representing an infinity is converted
> to ...[-]inf or [-]infinity
> A double argument representing a NaN is converted to ...
> [-]nan or [-]nan(n-char-sequence)..."
>
> is not appropriate as a strict language standard
> specification because "a floating-point number" (defined in
> "5.2.4.2.2 Characteristics of floating types <float.h>"),
> as WG14 mentions above, may include an infinity and a NaN
> so that the current description can be read as saying that an
> infinity can be converted to [-]ddd.ddd or [-]inf and also that a NaN
> can be converted to [-]ddd.ddd or [-]nan.
>
> Therefore, Japan re-proposes to change the above description
> to:
>
> "A double argument representing an infinity is converted
> to ...[-]inf or [-]infinity...
> A double argument representing a NaN is converted to ...
> [-]nan or [-]nan(n-char-sequence)..."
> A double argument representing a floating number except
> an infinity and a NaN is converted to ...[-]ddd.ddd..."
>
> This change should be applied to the description about the
> conversion specifier f, F, e, E and G of the function
> f[w]printf().
>
> WG14: Response Code: AL
-----
4. Add UCN to the categories of preprocessing token
@ 6.4 Lexical elements, 1st paragraph in the semantics
Add UCN to the categories of the preprocessing token as described in the
syntax.
5. Replace zero code with null character in mbstowcs
@ 7.20.8.1 The mbstowcs function, Returns
"Terminating zero code" should be changed to a well-defined term:
"Terminating null character (defined in 5.2.1)"
Cf. "5.2.1" says "a byte with all bits set to 0, called the null
character, ..."
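A minimal sketch of the return value in question, assuming the default
"C" locale and a destination array with room for the terminator; the
function name converted_length is hypothetical.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts the multibyte string mbs into dst
   (at most n elements) and returns the number of elements modified,
   which does not count the terminating null wide character.
   Returns (size_t)-1 if an invalid multibyte character is met. */
size_t converted_length(wchar_t *dst, size_t n, const char *mbs)
{
    return mbstowcs(dst, mbs, n);
}
```

For the string "abc" and an array of eight elements, three elements are
counted and the fourth holds the terminator, which is the item the
"Returns" paragraph excludes from the count.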
6. Necessary rationale for the changes of some environmental limits
@ "7.13.6.1 The fprintf function" Environmental limit
The minimum value for the maximum number of characters produced by any
single conversion has been changed from 509 (in the current ISO/IEC
9899:1990) to 4095. Some other environmental limits have also been
changed from ISO/IEC 9899:1990. Please describe a clear rationale for
these changes.
The above change request had already been presented as part of Japan's
comments attached to the CD approval vote. The disposition of this
comment by the committee was as follows:
"This was accepted as an editorial change to the Rationale." (Please
refer to SC22 N 2792.) However, the latest draft Rationale, SC22/WG14 N
850, does not contain any description of the changes to the environmental
limits pointed out in Japan's comment. Therefore, Japan has decided to
present the same comment as was submitted at the CD approval ballot.
7. Mathematical notations
@ 7.12 Mathematics <math.h>
Many kinds of mathematical notation are used in sub-clause 7.12,
including unfamiliar ones. Are all of these notations defined in some ISO
standard, or in any other standard? If so, please add the document name
and number to "2. Normative references." If not, please add explanations
and definitions of each mathematical notation to an annex.
8. The nearbyint function: Add reference to Annex F
@ 7.12.9.3 The nearbyint functions, Description
Add the reference "(see F.9.6.3 and F.9.6.4)" to the description of the
nearbyint functions, as is done for the rint functions.
9. Inappropriate sentences
@ 7.3.1 Introduction, 5th paragraph
[#5] Notwithstanding the provisions of 7.1.3, a program is permitted to
undefine and perhaps then redefine the macros complex, imaginary, and I.
@ 7.16 Boolean type and values <stdbool.h>, 4th paragraph
[#4] Notwithstanding the provisions of 7.1.3, a program is permitted to
undefine and perhaps then redefine the macros bool, true, and false.
The sentence "Notwithstanding... a program is permitted to undefine and
perhaps then..." is not appropriate wording for a programming language
standard. Please modify the description to use well-defined verbs or
auxiliary verbs, e.g. "shall", "may" and so on. For example, "a program
may undefine and then redefine..." seems to be appropriate.
10. Change the definition of active position into the original words of
C90
@ 5.2.2 Character display semantics, 1st paragraph
France changed the description of the active position along the comment
as follows:
[#1] The active position is that location on a display device where the
next character output by the fputc or fputwc functions would appear.
The intent of writing a printable character (as defined by the isprint or
iswprint function) to a display device is to display a graphic
representation of that character at the active position and then advance
the active position to the next position on the current line.
This change lacks careful consideration of the treatment of a
character, a wide character, a byte-oriented stream and a wide-oriented
stream. It is necessary to distinguish carefully a character from a wide
character, and a byte-oriented stream from a wide-oriented stream. It is
also necessary to consider a stream that mixes byte-oriented and
wide-oriented use. As a result of this hasty change, the above
description contains incorrect statements. One example is "a printable
character (as defined by ... iswprint function.)" The iswprint function
defines the printable *wide* character, not the printable character.
Therefore, the sentence should be changed back to the original wording by
removing "fputwc" and "iswprint" as follows:
[#1] The active position is that location on a display device where the
next character output by the fputc function would appear. The intent of
writing a printable character (as defined by the isprint function) to a
display device is to display a graphic representation of that character
at the active position and then advance the active position to the next
position on the current line.
If the committee wants to change the definition of the active position
from C90, deeper discussion of the character issues is necessary, as is
agreement among all of the national member bodies.
____ end of Japan Comments; beginning of Norway Comments ___________
--------------------------------------------------------------------
ISO/IEC JTC 1/SC22
Title: Programming languages, their environments and system software
interfaces
Secretariat: U.S.A. (ANSI)
---------------------------------------------------------------------
Please send this form, duly completed, to the secretariat indicated above.
---------------------------------------------------------------------
FCD 9899
Title: Information technology - Programming languages - Programming
Language C (Revision of ISO/IEC 9899:1990)
-----------------------------------------------------------------------
Please put an "x" on the appropriate line(s)
Approval of draft
_____ as presented
__X__ with comments as given below (use separate page as annex, if
necessary)
_____ general
__X__ technical
_____ editorial
_____ Disapproval of the draft for reasons below (use separate page as
annex, if necessary)
_____ Acceptance of these reasons and appropriate changes in the text
will change our vote to approval
_____ Abstention (for reasons below)
--------------------------------------------------------------------
P-member voting: NORWAY
Date: 1999-01-08
Signature (if mailed): ULF LEIRSTEIN
---------------------------------------------------------------------
Comment 1.
Category: Correction restoring original intent
Committee Draft subsection: 5.1.1.2
Title: Translation phases 2 and 4 should be allowed to produce \\u....
In phase 2 and 4, behavior is explicitly undefined if new-line removal
or token concatenation produces a sequence like \\u0123, e.g.
printf("\\u\
0123");
I assume this code is intended to be legal, since the \u0123 in it is
not a universal character name.
The same can happen with token concatenation in phase 4, though I'm not
sure if otherwise well-defined code can produce such sequences. E.g.
#define CAT(a,b) a##b
CAT(a\\u, 0123)
Suggested change (though I hope a more elegant wording is found):
In phase 2 (and 4?), after
character sequence that matches the syntax of a universal
character name
append
and is not immediately preceded by an odd number of
backslashes
On the other hand, is it supposed to be OK for backslash-newline
removal to turn a UCN-look-alike into a non-UCN?
printf("\\
\u0123");
I don't know if it can cause trouble for a C compiler, but it can
confuse tools that process C program lines in a "state-less" way.
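The suggested wording can be checked mechanically. The following is a
minimal sketch of such a check, as a tool might apply it to one line of
already spliced text; the function name starts_ucn is hypothetical, and
only the \uXXXX and \UXXXXXXXX forms of 6.4.3 are assumed.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the text at s[i] has the form of
   a universal character name and the backslash at s[i] is not itself
   escaped, i.e. it is preceded by an even number (possibly zero) of
   backslashes. s must be a null-terminated string and i < strlen(s). */
int starts_ucn(const char *s, size_t i)
{
    size_t run = 0, k, ndigits;

    if (s[i] != '\\')
        return 0;
    if (s[i + 1] == 'u')
        ndigits = 4;
    else if (s[i + 1] == 'U')
        ndigits = 8;
    else
        return 0;
    /* The terminating null is not a hex digit, so this loop never
       reads past the end of the string. */
    for (k = 0; k < ndigits; k++)
        if (!isxdigit((unsigned char)s[i + 2 + k]))
            return 0;
    /* Count the backslashes immediately before position i. */
    while (run < i && s[i - 1 - run] == '\\')
        run++;
    return run % 2 == 0;
}
```

With this rule the six characters \u0123 are reported as a universal
character name, while the same six characters preceded by one more
backslash are not.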
------------------------------------------------------------------------
Comment 2.
Category: Normative change to intent of existing feature
Committee Draft subsection: 6.4.9
Title: Remove `//*' quiet change.
The `//*' quiet change in the Rationale, also illustrated by the last
example of paragraph 6.4.9, is unnecessary:
m = n//**/o
+ p; // equivalent to m = n + p;
(This meant m = n / o + p; in C89.)
Suggested change to 6.4.9:
New constraint: The first (multibyte) character following the `//'
in a `//'-style comment shall not be `*'.
Paragraph 2: Append .."and to see if the initial character is `*'".
Paragraph 3, last example: Change the note to " // syntax error".
If this is rejected, a milder change could be a `Recommended Practice'
section which recommends a warning about `//*'.
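The quiet change can be reproduced with a small function. This is a
minimal sketch, assuming a translator that treats `//' as the start of a
comment; the function name quiet_change is hypothetical.

```c
/* Hypothetical demonstration of the quiet change. When `//' starts a
   comment, the rest of the first line of the assignment is discarded
   and the result is n + p. A strict C89 translator would instead see
   an empty comment between two operators and compute n / o + p. */
int quiet_change(int n, int o, int p)
{
    int m;

    (void)o;    /* o is otherwise unused when `//' starts a comment */
    m = n//**/o
        + p;
    return m;
}
```

With n = 12, o = 3 and p = 1 the two readings give 13 and 5
respectively, which shows how silently the meaning changes.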
------------------------------------------------------------------------
Comment 3.
Category: Feature that should be included
Committee Draft subsection: 6.10, 6.10.3
Title: Allow `...' anywhere in a macro parameter list.
Parameters after the `...' in a macro could be useful. The preprocessor
does not need to forbid it, since the number of arguments is known when
the macro is expanded. Example:
extern void foo(char *arg, ...); /* Argument list ends with NULL */
#ifdef ENCLOSE
# define foo(..., null) (foo)("{", __VA_ARGS__, "}", null)
#endif
Suggested changes follow.
In 6.10 and A.2.3, add
macro-parameter-list:
identifier-list
identifier-list , ...
identifier-list , ... , identifier-list
...
... , identifier-list
and replace
# define identifier lparen identifier-list-opt )
replacement-list new-line
# define identifier lparen ... ) replacement-list new-line
# define identifier lparen identifier-list , ... )
replacement-list new-line
with
# define identifier lparen macro-parameter-list-opt )
replacement-list new-line
In 6.10.3 paragraph 4, replace
If the identifier-list in the macro definition does not
end with an ellipsis,
with
If the macro-parameter-list does not contain an ellipsis,
and replace `identifier-list' with `macro-parameter-list'.
In paragraph 10, replace
The parameters are
specified by the optional list of identifiers, whose scope
extends from their declaration in the identifier list
with
The parameters are
specified by the optional macro-parameter-list. Their scope
extends from their declarations in the parameter list
In paragraph 12, replace
If there is a ... in the identifier-list in the macro
definition, then the trailing arguments,
with
If there is a ... in the macro-parameter-list, then
arguments from the position matching the ... ,
In the rest of 6.10.3, replace `identifier list' with `parameter list'.
------------------------------------------------------------------------
Comment 4.
Category: Feature that should be included
Committee Draft subsection: 6.10.3
Title: Support empty __VA_ARGS__ by adding __VA_COMMA__
Empty __VA_ARGS__ are not currently allowed, though there are things
they could express that the current definition cannot handle. Example:
#define Msg(type, ...) printf(format[type], __VA_ARGS__)
Suggested change:
1. Allow `...' to receive no arguments.
2. Define an implicit macro parameter __VA_COMMA__ or maybe __VA_SEP__
which expands to `,' if there were arguments and <empty> otherwise.
This allows a possibly-empty __VA_ARGS__ to be used both in front
and at the end of a comma-separated list:
#define SEND(...) Send( __VA_ARGS__ __VA_COMMA__ (void*)0 )
#define Msg(type, ...) printf(format[type] __VA_COMMA__ __VA_ARGS__)
One negative effect is that in macros which need arguments to `...', the
error check for whether there are arguments is lost. The programmer
must supply an extra argument if he wants that check, and e.g. replace
`#__VA_ARGS__' with `#extra_arg #__VA_COMMA__ #__VA_ARGS__'.
That does not seem important, since there is no error check on the rest
of the arguments in any case. Besides, the error will usually cause a
syntax error in translation phase 7.
Still, a workaround could append a `?' to the `...' when the `...' may
receive an empty argument list, or (uglier in my opinion) only allow an
empty argument list if there is a __VA_COMMA__ in the replacement list.
I do not know if `foo(EMPTY)' below should expand to <empty> or `,':
#define foo(...) __VA_COMMA__
#define EMPTY
This does not seem important, so it could probably be standardized as
whatever is easiest to implement. I believe it should expand to `,'.
(It should agree with the __VA_COUNT__ proposal, if that is included.)
Changes to the standard:
6.10.3 paragraph 4: Replace
Otherwise, there shall be more arguments in the invocation than
with
Otherwise, there shall be at least as many arguments in the
invocation as
6.10.3 paragraph 5: Replace
The identifier __VA_ARGS__
with
The identifiers __VA_ARGS__ and __VA_COMMA__
6.10.3.1 - new paragraph 3:
An identifier __VA_COMMA__ that occurs in the replacement list shall
be treated as if it were a parameter. If no arguments were used to
form the variable arguments, __VA_COMMA__ shall receive an empty
argument. Otherwise, it shall receive one `,' token.
6.10.3.5p9: Add an example
#define run(cmd,...) execlp(cmd, cmd __VA_COMMA__ __VA_ARGS__, NULL)
run("man", "cc");
results in
execlp("man", "man" , "cc" , NULL);
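Until something like __VA_COMMA__ exists, one workaround that stays
within the rule that `...' must receive at least one argument is to let
the ellipsis absorb the last fixed parameter. The following is a minimal
sketch of that idea; the helper msg_to, the buffer parameters and the
contents of the format table are all hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message table indexed by message type. */
static const char *const format[] = {
    "plain message\n",
    "value = %d\n"
};

/* Hypothetical helper: formats message `type' into buf. */
static int msg_to(char *buf, size_t size, int type, ...)
{
    va_list ap;
    int n;

    va_start(ap, type);
    n = vsnprintf(buf, size, format[type], ap);
    va_end(ap);
    return n;
}

/* `type' is passed through `...', so __VA_ARGS__ is never empty
   even when the message takes no further arguments. */
#define Msg(buf, size, ...) msg_to(buf, size, __VA_ARGS__)
```

The cost of this workaround is that the macro can no longer name `type'
in its replacement list, which is the expressiveness the proposal is
meant to restore.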
------------------------------------------------------------------------
Comment 5.
Category: Correction, Request for information/clarification
Committee Draft subsection: various
Title: Clean up character and encoding concepts
The definitions of and relationships between characters, character
encodings, character sets and C types are scattered throughout the
Standard, and are difficult to figure out even after reading through it:
Seven concepts for characters and their encodings (plus C types):
character, extended character, multibyte character, generalized
multibyte character, wide character, wide-character code/value, byte.
Eight or nine "character sets":
source/execution ch.sets, basic source/execution ch.sets, extended
source/execution ch.sets, required source ch.set, encoding of physical
source files, (encoding of generalized multibyte characters).
Also, the word "character" is used for different concepts: *Encoded*
in bytes (like UTF-8 characters), encoded as a single byte, and
*enumerated* (as in ISO-10646). I'm not sure if there are also
"abstract" characters (conceptual entities with a typical graphical
form, which I believe is the correct meaning of "character") in the
standard. This may be part of the reason for the confusion in
discussions about character sets and universal character names.
Note that I don't know if my definitions above are quite correct.
As far as I can tell, the basic, extended and maybe required character
sets are enumerated, and the rest are encoded. Though if the source
character set is encoded, I don't know why translation phase 1 needs to
map another character set to that instead of to an enumerated set.
(The fact that most required characters can be encoded in one byte
doesn't mean the required set can be both encoded and enumerated - an
entity can't be both member of an encoded set and an enumerated set.)
The different character concepts should be spelled out in _one_ section
(5.2.1?), the unqualified word "character" ought to be used for (at
most) one of the concepts above, and at least the character sets that
use other concepts should be renamed - e.g. source character set ->
"encoded source character set" or "multibyte source character set".
This would often lead to cumbersome text, though. The text could be
simplified if this notation is added:
"Character" prepended with "basic", "extended", "source", "execution",
and/or "required" means member of the corresponding character set.
Thus, "basic source character" means "member of the basic source
character set".
If the "source/execution character sets" are renamed to "multibyte
source/execution character sets", the same rule can apply to the word
"multibyte" -- unless there exist multibyte characters that are not
members of the source/execution character sets.
Since the basic/extended character sets contain integer codes for
characters and the source/execution character sets contain multibyte
representations, I suggest that "characters" and "extended characters"
are clearly described as integers or numbered entities, and "bytes" and
"multibyte characters" as encodings.
In any case, please describe all the character types and sets above in
one section:
* which character concept they use,
* the relationship between them:
- which types and sets map to each other,
- which can be subsets or proper subsets of each other, and
which may contain members that do not map to any member in
others they relate to (e.g., I believe a byte may have a value
which does not map to the source/execution character set),
- maybe the character concepts' and character sets' relationship to
char, unsigned char, wchar_t and wint_t.
The character-related definitions in section 3 and their
descriptions/definitions in 5.2.1 should point to each other.
The reference to 5.2.1 in 6.2.5p3 must be updated or removed.
5.2.1 contains a jumbled list of encoded characters (in the
source/execution sets) and enumerated characters (in the basic
source/execution sets). I believe the current definition should
describe the basic/extended character sets, and possibly that the null
character is encoded as a null byte in the execution character set.
(The current words are incorrect, the basic execution character set
doesn't contain a null *byte*.)
7.19.6.1p17 and 7.19.6.2p24 incorrectly refer to "multibyte members of
the extended character set", and 7.20p3(MB_CUR_MAX) to "bytes in a
multibyte character for the extended character set". They should refer
to multibyte characters *representing* members of the extended character
set, or multibyte members of the (encoded) source/exec. character set.
One other detail: considering the definition of "byte" (3.4), the
"multibyte character" definition (3.14) means "sequence of one or more
addressable units...". I think the correct definition of byte is
"either <the current definition>, or bit representation which fits into
same".
Some suggested changes:
3.4 byte
either an addressable unit of data storage large enough to hold any
member of the basic character set of the execution environment,
or bit representation which fits exactly in a byte.
NOTE A byte has no value as such, but it can *encode* a value or
part of a value - e.g. a character. When the "value" of a byte is
mentioned without a related encoding, one usually means encoding
as a binary integer with no sign bit.
[ "no sign bit" (or unsigned, but that indicates C unsigned semantics)
so 0 can't mean "all bits 1" on 1's complement machines. ]
3.5 character
code value (a binary encoded integer [with no sign bit?]) that fits
in a byte
[ this is equivalent to 7.1.1p4 "wide character", except I wasn't sure
if "... that corresponds to a member of the basic character set" could
be included. May a byte have a value which is not a valid character?
May a character have a value which isn't in the basic character set? ]
or a multibyte character or wide character, if this is clear from
the context.
[ E.g. `the null character' and `space character' often means multibyte
characters, in fact null character is defined that way in 5.2.1 -
it is a member of an encoded character set. ]
About source files:
Translation phase 1 maps end-of-line in the physical source character
set to newline in the source character set. Thus, the source character
set (and the required source character set) in 5.2.1 must include
newline. Though it's probably good to keep the reminder in 5.2.1 that
physical source files may have some other representation of end-of-line.
5.2.1.2p1 says that for the source and execution character sets,
-- A byte with all bits zero shall be interpreted as a
null character independent of shift state.
-- A byte with all bits zero shall not occur in the second
or subsequent bytes of a multibyte character.
I believe this means an 8-bit host must either map a UCS-4 physical
source file to e.g. a UTF-8 source character set (the opposite of what a
normal Unicode application would do), or define the compiler to be
running on an emulated machine with 32-bit bytes. If so, why force the
implementation through such games?
However, I suspect the term "source file" in most places except 5.1.1.2
must be read as "the source after translation phase 1", or maybe "after
mapping to the source character set" (possibly except the mapping of
end-of-line to newline).
Maybe the simplest fix is to replace most occurrences of "source file"
with "source code", and define that as suggested above.
And/or move 1st part of translation phase 1 to a new phase 0, and define
the language in terms of phase 0 output.
Usually one can't misunderstand without really trying to, but --
5.2.1 defines the source character set as the encoding of characters
in source files; this definition needs to be correct.
5.2.1.2p2 may contain an exception:
For source files, the following shall hold:
-- An identifier, comment, string literal, character
constant, or header name shall begin and end in the
initial shift state.
I'm not familiar with shift states, so I do not know whether this
applies before or after phase 1, or whether that makes a difference.
5.2.1p3 ends with
If any other characters are encountered in a source file
(except (...)), the behavior is undefined.
At least this text should say something like "after translation phase
1" or "mapped to the source character set".
Strings:
7.1.1p1 is slightly incorrect: this is one place where "null character"
means "character or multibyte character", since "string" can mean either
character string or multibyte string.
In 7.1.1p1's sentence
The term multibyte string is sometimes used instead to emphasize
special processing given to multibyte characters contained in the
string or to avoid confusion with a wide string.
maybe "contained" should be replaced with "encoded".
And to complement it, add:
The term character string is sometimes used to emphasize that
each byte in the string contains an integer character code
(converted to char).
Some other matters:
Source characters and multibyte characters are defined in terms of
bytes, which are defined in terms of the execution environment. Thus,
the compilation environment can't have more bits per byte than the
execution environment. This looks strange; if it is intended I think it
must be emphasized.
The ending sentence "If any other characters are encountered in a source
file..." in 5.2.1p3 is placed so it actually means "other than in the
required *execution* character set".
6.4.3p2/p3 refers to the "required character set"; that should be
"required source character set".
The Index needs entries for "required source character set" and
"generalized multibyte character".
6.2.5p5-p6, D.2.1p1, K.1p1 refer to the "values" of bytes. That can be
replaced with the "contents" of bytes, depending on what you do
with the suggested NOTE above about values of bytes.
To make clear that e.g. a yen sign may be used for backslash in C, add
to this wording of 5.2.1p3:
Both the basic source and basic execution character sets shall
have at least the following members:
the text
(though which glyphs these members correspond to on an output
device is unspecified)
------------------------------------------------------------------------
Comment 6.
Category: Change to existing feature, request for clarification
Committee Draft subsection: 5.1.1.2, 6.2.5, 6.4.4, 6.4.5
Title: Failure on source->execution character conversion
In translation phase 5, if a source character has no corresponding
member in the execution character set, it is converted to an
implementation-defined member.
1. Are _all_ source characters lacking a corresponding member in the
execution character set converted to the _same_ character?
2. It should be allowed to let any use of the resulting character cause
a compile-time error. (It should be legal to let compilation fail if
the program may be translated differently than intended, or to let
users request this.)
Other places where error must then be allowed:
6.2.5p3
If any other character [than a required source character] is
stored in a char object, the resulting value is
implementation-defined but shall be within the range of values
that can be represented in that type.
6.4.5p5
The value of a string literal containing a multibyte character or
escape sequence not represented in the execution character set is
implementation-defined.
6.4.4.4p2
An integer character constant is a sequence of one or more
multibyte characters (...) mapped in an implementation-defined
manner to members of the execution character set.
[Incidentally, this says only what the encoding is, not what the
integer value is. This should be mentioned explicitly, in particular
since 6.4.4.4p5-p6 (octal/hex character) does describe the value and
not the encoding.]
6.4.4.4p10
The value of an integer character constant (...) containing a
character or escape sequence not represented in the basic
execution character set, is implementation-defined.
6.4.4.4p11
The value of a wide character constant (...) containing a
multibyte character or escape sequence not represented in the
extended execution character set, is implementation-defined.
------------------------------------------------------------------------
Comment 7.
Category: Request for clarification, or change to existing feature
Committee Draft subsection: 6.10.3.2
Title: Consistent stringification of UCNs
6.10.3.2p2 (The # operator) says:
it is unspecified whether a \ character is inserted before
the \ character beginning a universal character name.
Please clarify: Is the implementation allowed to insert `\' in some
circumstances but not others? May it do so in a single translation
unit? A single preprocessing file?
(Also update this point in K.1 Unspecified behavior).
If yes: It would be useful if implementations are required to behave
consistently in this regard, preferably so that a Configure program can
test *once* e.g. how the compiler treats identifiers and how it treats
strings, and #define a macro accordingly which will be used when a
package is compiled. Unfortunately I'm not sure what kind of
"intelligence" one might want to give compilers here, such as to create
\\u... inside stringified string/character constants and \u... outside
them.
------------------------------------------------------------------------
Comment 8. [Withdrawn]
------------------------------------------------------------------------
Comment 9.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.10.3.5
Title: Preprocessor examples
The examples in 6.10.3.5 (macro scopes) would be better placed in a new
section 6.10.3.6, since only some examples are related to macro scopes.
A few examples with national characters and with universal character
names in non-string tokens would be useful.
------------------------------------------------------------------------
Comment 10.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.4.3
Title: Describe the specified Unicode ranges
_Describe_ the named characters (or print them if you invert part of the
section so that below 000000A0 it says which UCNs are _allowed_), for
the sake of people who don't know Unicode well. (I found that 0000D800
through 0000DFFF are "surrogates", but not what that is, or whether it's
something a compiler accepting national characters in source files
should consider.)
------------------------------------------------------------------------
Comment 11.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.4.4.4
Title: Add note that 'ab' is not a multibyte character
To avoid confusion in `Description' at page 63, add a note that the
<'ab'> is not one multibyte character.
------------------------------------------------------------------------
Comment 12.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 7.20.2.2
Title: Use inttypes in example
Suggestion: Replace `unsigned long int' with uint_fast32_t in the
rand() example.
(Or better, remove the example. It's a poor rand() implementation.)
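A sketch of the suggested replacement, assuming <stdint.h> is available.
The names my_rand and my_srand are hypothetical, chosen so that the
sketch does not clash with the library functions; the constants are
those of the draft's example.

```c
#include <stdint.h>

/* Generator state; at least 32 bits wide on every implementation. */
static uint_fast32_t next = 1;

/* RAND_MAX assumed to be 32767, as in the draft's example. Only bits
   16 to 30 of the state are returned, so the sequence does not depend
   on how much wider than 32 bits uint_fast32_t happens to be. */
int my_rand(void)
{
    next = next * 1103515245 + 12345;
    return (int)((next / 65536) % 32768);
}

void my_srand(unsigned int seed)
{
    next = seed;
}
```

The type change affects only the declared width of the state; the
quality of the generator is unchanged, which is the point of the
parenthetical remark above.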
------------------------------------------------------------------------
Comment 13.
Category: Request for information/clarification
Committee Draft subsection: 5.2.4.1, 6.4.3
Title: Meaning of "character short" identifier
What does the term "character short" identifier mean? It is used
in 5.2.4.1p1 and 6.4.3p2/p4. If it is an ISO-10646 term, you could
copy the definition from there.
------------------------------------------------------------------------
Comment 14.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.4.6, Index
Title: digraphs are not defined.
The index has a reference "digraphs, 6.4.6", but the word "digraph"
does not occur anywhere else. It's a known word and people use it, so:
In 6.4.6p3 after "these six tokens", add "(called _digraphs_)" or
"(known as _digraphs_)", or say so in a note.
-----------------------------------------------------------------------
Comment 15.
Category: Request for clarification
Committee Draft subsection: 6.10.8
Title: Are __TIME__ and __DATE__ constant?
Can __TIME__ and __DATE__ change during the compilation of a single
translation unit?
If yes, I hope they may not change between sequence points. Otherwise
printf("%s at %s", __DATE__, __TIME__) can be inconsistent at midnight.
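One way a program might narrow the window today is to expand both macros
at a single point and let adjacent string literal concatenation join
them. This is only a sketch; the names build_stamp and get_build_stamp
are invented here, and whether the two expansions are guaranteed to be
mutually consistent is exactly the question asked above.

```c
/* Both macros are expanded at this one point in the translation unit;
   the three string literals are then concatenated into one. */
static const char build_stamp[] = __DATE__ " at " __TIME__;

/* Returns the combined date and time string. */
const char *get_build_stamp(void)
{
    return build_stamp;
}
```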
------------------------------------------------------------------------
Comment 16.
Category: Normative change to intent of existing feature
Committee Draft subsection: 7.19.4.2
Title: rename() of/to open files should be implementation-defined
On a UNIX filesystem, `rename' removes (or makes unnamed) the
_destination_ file if it existed. OTOH, a filesystem where the file's
basic handle is the filename may see `rename' as removing the _source_
file. So I believe rename() can have the same problem as remove() if
either of the files are open, and in addition I imagine a FILE* opened
to the destination file may end up pointing into the new file.
Suggested change:
Add this to 7.19.4.2, similar to the text in 7.19.4.1:
If either of the files are open, the behavior of the rename
function is implementation-defined.
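Under such wording, a portable program would presumably close any stream
attached to a file before renaming it. A minimal sketch, in which
close_then_rename is a hypothetical helper and not a proposed library
function:

```c
#include <stdio.h>

/* Hypothetical helper: closes the stream (if any) attached to old_name,
   then renames the file. Returns 0 on success and nonzero on failure,
   like rename() itself. */
int close_then_rename(FILE *stream, const char *old_name,
                      const char *new_name)
{
    if (stream != NULL && fclose(stream) != 0)
        return -1;
    return rename(old_name, new_name);
}
```

The sketch does not cover a stream that is open on the destination
file; the caller would have to close that one as well.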
------------------------------------------------------------------------
Comment 17.
Category: Editorial change/non-normative contribution
Committee Draft subsection: 6.7.3
Title: Removing restrict from prototypes
6.7.3p7 says
deleting all instances of the [restrict] qualifier from a
conforming program does not change its meaning
Correction: Replace "program" with "translation unit", or add "and the
headers it includes" after "program".
(Otherwise one could remove `restrict' below but keep `restrict' in
string.h's declaration of strcpy.)
#include <string.h>
char *(*fn) (char * restrict, const char * restrict) = strcpy;
------------------------------------------------------------------------
Comment 18.
Category: Feature that should be included
Committee Draft subsection: 6.3.2.3
Title: C++-style `T** -> T const *const *' conversion
Allow `T**' to be converted to `T const *const *' as in C++.
This text is based on section conv.qual in the C++ standard:
- Text about pointers to members and multi-level mixed pointers has been
removed, as well as the definition of _qv-qualification signature_
which isn't used elsewhere.
- restrict has been added to the list of qualifiers.
- Paragraph 4 about converting functions' prototypes and return types
has been added. Possibly this paragraph should be put elsewhere.
Editorial note: x{y} below is used as a typographical notation for x
subscripted by y. E.g. water would be H{2}O.
Suggested change: Replace 6.3.2.3p1-p2 with the following text:
Qualification conversions
In the following text, each qual{i} or qual{i,j} is zero or more of
the qualifiers const, volatile, and restrict.
[#1] The value of an expression of type "T qual{1} *" can be converted
to type "T qual{2} *" if each qualifier in qual{1} is in qual{2}.
[#2] NOTE Function types are never qualified; see [8.3.5 in C++.]
[#3] A conversion can add qualifiers at levels other than the first
in multi-level pointers, subject to the following rules:(->footnote)
Two pointer types T{1} and T{2} are _similar_ if there exists a type T
and integer n > 0 such that:
T{1} is T qual{1,n} *qual{1,n-1} ... *qual{1,1} *qual{1,0}
and
T{2} is T qual{2,n} *qual{2,n-1} ... *qual{2,1} *qual{2,0}
An expression of type T{1} can be converted to type T{2} if and only if
the following conditions are satisfied:
-- the pointer types are similar.
-- for every j > 0, each qualifier in qual{1,j} is in qual{2,j}.
-- if qual{1,j} and qual{2,j} are different, then const is in every
qual{2,k} for 0 < k < j.
[Note: if a program could assign a pointer of type T** to a pointer of
type const T** (that is, if line //1 below was allowed), a program
could inadvertently modify a const object (as it is done on line //2).
For example,
int main() {
const char c = 'c';
char* pc;
const char** pcc = &pc; //1: not allowed
*pcc = &c;
*pc = 'C'; //2: modifies a const object
}
--end note]
[#4] A function pointer of type FT{1} can be converted to a function
pointer of type FT{2} if
- their return types are compatible types, or similar pointer types
where the return type of FT{1} can be converted to the return type
of FT{2},
- and their respective parameters are compatible types, or similar
pointer types where the parameter of FT{2} can be converted to the
type of the equivalent parameter to FT{1}.
__________________
(footnote) These rules ensure that const-safety is preserved by the
conversion.
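To illustrate the case the proposed rules are meant to cover, the sketch
below passes a `char **' to a function that takes `const char *const *'.
The function names are invented for the example. Without the proposed
change a cast is needed; with it, the argument could presumably be
passed directly, as in C++.

```c
#include <stddef.h>
#include <string.h>

/* Reads the strings; modifies neither the pointers nor the characters. */
size_t total_length(const char *const *strings, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += strlen(strings[i]);
    return total;
}

/* Caller holding a plain char **. */
size_t total_length_of_mutable(char **strings, size_t count)
{
    /* The cast is what the proposal would make unnecessary. */
    return total_length((const char *const *)strings, count);
}
```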
------------------------------------------------------------------------
Comment 19.
Category: Clarification, Request for clarification
Committee Draft subsection: various
Title: Terms like "unspecified _result_" are used but not defined
The standard uses undefined terms like `implementation-defined value',
`unspecified result', and `undefined conversion state'. Clause 3 only
defines unspecified, implementation-defined and undefined *behavior*.
The term `indeterminate' is important, but the closest to a definition
of it is a mention in passing in 3.18 (undefined behavior).
Suggested changes:
Change
3.19p1 _unspecified behavior_
behavior where ...
and
3.11p1 _implementation-defined behavior_
unspecified behavior where ...
to
3.19p1 _unspecified_
aspect of the language where ...
and
3.11p1 _implementation-defined_
unspecified aspect where ...
However, keep `behavior' in the definition of `undefined behavior',
since that always *is* behavior.
Add a section 3.* _indeterminate_, and add `indeterminate' to the index.
Mention that an indeterminate value can be a trap representation (and
that a trap representation is considered a value, as in 6.2.6.1p5).
6.2.6.1p4 seems to contradict the above:
"Certain object representations need not represent a value of
the object type (...) such a representation is called a trap
representation."
Replace "value" above with "valid value".
In 7.8.1p6, change `may be left undefined' to `need not be #defined'
or change `undefined' to `un#defined'.
In the following places, remove "behavior" or change it to "aspect(s)"
(except when applied to `undefined' if you wish to be that pedantic).
3.11p2
EXAMPLE An example of implementation-defined behavior
3.19p2
EXAMPLE An example of unspecified behavior
4p3
A program (...) containing unspecified behavior
4p5
output dependent on any unspecified, undefined, or
implementation-defined behavior
D.1p1
involves unspecified or undefined behavior
D.2p1
involves undefined or unspecified behavior
K.1 Unspecified behavior
K.3 Implementation-defined behavior
Index
implementation-defined behavior, 3.11, K.3
unspecified behavior, 3.19, K.1
Contents
K.1 Unspecified behavior ......................... 566
K.3 Implementation-defined behavior .............. 585
In the following places, change "undefined" to "indeterminate" or change
the text to say "undefined behavior":
Note 58 to 6.5p2
This paragraph renders undefined statement expressions such as
Note 80 to 6.5.9p6
the effect of subsequent comparisons is also undefined.
7.12.3p1
The result is undefined if an argument is not of real floating
type.
7.24.6.3.2p4, 7.24.6.3.3p4, 7.24.6.4.1p4 and 7.24.6.4.2p4:
the conversion state is undefined.
[Maybe this should be `unspecified'.]
D.1p2
expressions that are not determined to be undefined
Note 288 to D.4.3p3
If it were undefined to write twice to the flags
D.5p27 and D.5p31
and so the expression is undefined.
------------------------------------------------------------------------
Comment 20.
Category: Request for clarification
Committee Draft subsection: various
Title: Some unspecified cases are unclear
In some unspecified and implementation-defined cases, it is unclear
which options the implementation has to choose from.
Suggested changes:
In 3.19, add notes
- with reference to 4p3 and 5.1.2.3 that an unspecified aspect will
not cause failure (i.e. produce a trap representation or undefined
behavior). Even knowledgeable people missed that when I asked.
- that the implementation must make sensible (or maybe "unsurprising")
choices for unspecified aspects.
What are the possible behaviors in the following places?
6.5.7p5
The result of E1 >> E2 is E1 right-shifted E2 bit positions (...)
If E1 has a signed type and a negative value, the resulting value
is implementation-defined.
What are the choices here?
7.19.4.4p3
If it [the tmpnam function] is called more than TMP_MAX
times, the behavior is implementation-defined.
Are there any particular choices?
7.19.7.11p5
the value of its file position indicator after a successful call
to the ungetc function is unspecified
Are there any particular choices, except it must be an fpos_t which is
not a trap representation?
5.2.2p1, 5.2.2p2(\b,\t,\v)
the behavior is unspecified
Is only the output unspecified, or may this also affect program
execution? E.g. may ferror() be set?
Other problems:
6.3.2.3p5
An integer may be converted to any pointer type. The result is
implementation-defined, might not be properly aligned, and might
not point to an entity of the referenced type.
This is an unspecified aspect which may cause undefined behaviour,
which I'm told is not supposed to happen. I think it should be
something like this:
If the integer resulted from converting a pointer [to void?] to
intptr_t, uintptr_t [or a wider integer type?], and the
pointed-to object is still live, the result is a pointer to that
object (see 7.18.1.4). Otherwise the result is indeterminate,
but the implementation shall document it (->footnote XXX).
[footnote XXX] An integer converted to a pointer might not be
properly aligned, and might not point to an entity of the
referenced type.
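The one case the suggested wording singles out, a round trip through
uintptr_t while the object is still live, can be sketched as follows.
The function name is invented for the example, and uintptr_t is an
optional type, so the sketch assumes an implementation that provides it.

```c
#include <stdint.h>

/* Converts a pointer to an integer and back. The conversions go through
   void *, which is the round trip described in 7.18.1.4. */
int *round_trip_through_integer(int *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;
    return (int *)(void *)bits;
}
```

Any other integer value converted to a pointer falls under the
"otherwise" branch of the suggested text.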
------------------------------------------------------------------------
Comment 21.
Category: Defect/Request for clarification
Committee Draft subsection: various
Title: The standard implies that pointer = address = integer
This can be deduced as follows:
* C addresses are integers:
3.1p1 mentions "addresses that are particular multiples of a byte
address". Except to mathematicians, only integers are "multiples".
* C addresses are pointers (most of the Standard seems to think so):
6.5.3.2p3 says that "address-of" operator returns a pointer.
This operator's name is the clearest indication I can find of what the
relationship is between pointers and addresses.
6.5.2.5p10 says about "int *p; (...) p = (int [2]){*p}" that
"p is assigned the address of the first element of an array (...)".
* The implementation's hardware address concept is the same as C's
address concept:
This is implied by omission, in particular in 3.1p1. Readers should
not be expected to know that the hardware address concept can be very
different from the common integer-like address concept, so they may
not imagine any reason to not trust hardware addresses to be like C
addresses.
This must be corrected. C's `address' and `pointer' concepts need
explicit definitions and Index entries. The relationship between them,
and between C addresses and hardware addresses, must be stated.
3.4p2 says "It is possible to express the address of each individual
byte of an object uniquely." Maybe this is the right wording: Pointers
_express_ addresses? Or maybe this wording too is wrong - I prefer to
think of pointers as handles to objects and nothing else, and that
anything about addresses just says something about their implementation.
------------------------------------------------------------------------
Comment 22.
Category: Clarification/Correction restoring original intent
Committee Draft subsection: 5.1.1.2
Title: Pragmas are lost after translation phase 4
Pragmas are executed and then deleted in phase 4, so phases 4-6 haven't
really got an approved way to pass on pragmas that take effect in later
phases. The same applies to e.g. letting phase 1 notice and pass on the
character set name of the source file.
Suggested change:
Add this paragraph to 5.1.1.2:
The output from a translation phase may contain extra information
which is transparent to the rules [or "to the grammar"?]
described in this standard.
------------------------------------------------------------------------
Comment 23.
Category: Request for clarification/correction
Committee Draft subsection: 7.19.7.3, 7.19.7.11
Title: "character" arguments to fputc and ungetc
The range of valid arguments to fputc() and ungetc() is unclear. They
are only defined for _character_ arguments, so non-character arguments
produce undefined behaviour (7.19.7.3p2, 7.19.7.11p2/p4). However, a
character is a typeless bit representation according to 3.5. The most
natural interpretation of 3.5 is that the argument must be an integer
which fits in an unsigned char.
The behavior of fputc() should be defined at least for
- all `unsigned char' values, so one can fputc the result of fgetc.
- all `char' values, to avoid breaking a _lot_ of programs that do e.g.
putchar(*string) instead of putchar((unsigned char)*string).
fputc should convert char arguments to unsigned char as it does now.
I do not know if it should also be defined for all signed char values,
or all `int' values.
This does not work for ungetc, since a char value equal to EOF has a
special meaning. Yet ungetc too converts its argument to unsigned char,
so apparently ungetc(ch) where ch is a char is intended to work.
Please state the valid range of the character argument to ungetc. Add a
note that even though ungetc converts its argument to unsigned char, the
application should still convert any char arguments to unsigned char "by
hand", in case the char value == EOF.
------------------------------------------------------------------------
Comment 24.
Category: Request for clarification/Correction
Committee Draft subsection: 7.19.*, 5.2.4.2.1
Title: Problem with non-two's complement char
If INT_MAX - INT_MIN < UCHAR_MAX (e.g., one's complement with
sizeof(int) == 1), there is one byte value (UCHAR_MAX?) which cannot be
read and written to the stream:
- The function call `fputc(u_ch, f)' converts u_ch to int, and then fputc
converts it back to unsigned char and writes that to the stream. This
will typically convert the bit pattern for negative zero to zero; I'm
not sure if it can do it with some other value.
- Similarly, fgetc(f) converts the unsigned char on the stream to int and
returns that, and I'm not even sure it is at liberty to return
"negative zero" so the application could do some hack to notice it.
- 7.19.2p3 may be taken to forbid this (Data read in from a binary
stream shall compare equal to the data that were earlier written out
to that stream), if it is read as 'fwrite() the data, fread() it back,
then memcmp() shall return 0'. However, the 'data read' and 'data
written' can also be taken to mean the output from fgetc() and the
input to fputc(), which have already been through the conversion to
int.
The intent here must be clarified.
If the intent is that INT_MAX - INT_MIN >= UCHAR_MAX, 5.2.4.2.1 would
be a better place to state it. It seems strange to have to search the
library section for such basic restrictions on the language.
____ end of Norway Comments; beginning of United Kingdom comments ____
The UK votes NO on CD2/FCD1.
To change the UK vote to YES:
The following major issues must be addressed:
side effects in VLAs
write-once-read-many changes to restrict
various issues with floating point
For each UK comment listed in column 1 of the table below, the described
change or an equivalent must be made.
For each UK comment listed in column 2 of the table below, the described
change or an equivalent must be made or the
UK National Body must be satisfied that there is good reason to not make
the change.
The UK comments listed in column 3 of the table below must be properly
considered and a reasonable response made.
The changes in these comments are not mandatory provided that such
responses are made.
The procedural matters described below must be addressed.
Column 1
Changes that must be applied
PC-UK0201 [1]
PC-UK0244
PC-UK0246
PC-UK0248
PC-UK0272
PC-UK0273
PC-UK0277
PC-UK0278
PC-UK0286
PC-UK0287
Column 2
Comments that must be addressed
PC-UK0209 [2]
PC-UK0214 [3]
PC-UK0222 [4]
PC-UK0232
PC-UK0245
PC-UK0254
PC-UK0274
PC-UK0275
PC-UK0279
PC-UK0281
PC-UK0282
PC-UK0283
PC-UK0284
PC-UK0285
PC-UK0261
PC-UK0262
PC-UK0269
PC-UK0270
Column 3
Other comments
PC-UK0227 [5]
PC-UK0249
PC-UK0251
PC-UK0252
PC-UK0256
PC-UK0257
PC-UK0276
PC-UK0263
PC-UK0264
PC-UK0265
Notes
1 We do not believe that WG14 addressed the issue actually raised.
2 The UK does not consider this to be a new feature, but a minor
though useful enhancement to an existing one.
3 Previous objection to this item was based on problems with realloc.
Now that the latter has been redefined to create a new object, this
item should be reconsidered.
4 The UK considers that code affected by this item is almost certain
to be erroneous, and feels that it is important that it is
addressed.
5 This item would clarify the meaning of bit-fields, and in
particular that they cannot be wider than specified.
This response also assumes that the following items of SC22/WG14/N847
have been accepted as is or with editorial changes:
4, 8, 10, 19, 20, 21, 33, 36, 43. Otherwise these items should be added
to column 1 of the above table.
Procedural issues
WG14 failed to provide comprehensible responses to a number of matters
raised in the UK comments to CD1. To the extent that those comments are
not subsumed by other parts of this response, they are required to be
addressed in a way that allows the UK to determine whether the WG14
responses are acceptable, and therefore form a part of this submission.
The UK comments in question are:
PC-UK0079 PC-UK0082 PC-UK0083 PC-UK0085 PC-UK0086 PC-UK0088 PC-UK0089
PC-UK0090 PC-UK0091 PC-UK0092 PC-UK0093 PC-UK0094 PC-UK0095 PC-UK0096
PC-UK0097 PC-UK0098 PC-UK0102 PC-UK0106 PC-UK0108 PC-UK0112 PC-UK0114
PC-UK0117 PC-UK0118 PC-UK0120 PC-UK0122 PC-UK0126 PC-UK0129 PC-UK0130
PC-UK0133 PC-UK0134 PC-UK0135 PC-UK0137 PC-UK0138 PC-UK0141 PC-UK0142
PC-UK0143 PC-UK0144 PC-UK0147 PC-UK0150 PC-UK0151 PC-UK0152 PC-UK0153
PC-UK0154 PC-UK0155 PC-UK0156 PC-UK0158 PC-UK0159 PC-UK0161 PC-UK0162
PC-UK0163 PC-UK0164 PC-UK0165 PC-UK0171
Side effects in VLAs
The UK requires the issue of side effects in variably-modified type
declarations and type names to be addressed. A number of proposals have
previously been produced to this end, such as those in PC-UK0226 and
PC-UK0250.
The minimum requirement is that, for any given piece of code, the code
either violates a constraint, or else all implementations produce the same
result (in the absence of any unspecified behaviour in the code not
related to the use of variably-modified types). In particular, it is not
acceptable for side effects to be optional.
The UK preference is to have side effects work normally in
variably-modified types. It would be acceptable for a constraint to
forbid certain operators (such as ++ and function call) within array
sizes within a sizeof expression.
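A minimal sketch of the kind of code at issue, with an invented function
name: the array size inside the sizeof operand contains a side effect.
The size itself is determined by the value of the expression; what the
requirement above asks for is that all implementations agree on whether
the increment happens.

```c
#include <stddef.h>

/* Returns the size of an array of *n ints. The array size expression
   has a side effect on *n; whether that side effect is performed is the
   point the comment wants settled. */
size_t vla_size_with_side_effect(int *n)
{
    return sizeof(int[(*n)++]);
}
```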
Changes to restrict
There are four basic issues to be addressed:
1. The current specification of restrict disallows aliasing of unmodified
objects, which renders some natural and useful programs undefined
without promoting optimization. This is contrary to the prior art in
Fortran.
2. If a restricted pointer points to an object with const-qualified type,
the current specification allows casting away the const qualifier, and
modifying the object. Disallowing such modifications promotes
optimization as illustrated in example F below.
3. The current specification does not address the effect of accessing
objects through pointers of various types, all based on a restricted
pointer. In particular, these objects are supposed to determine an array,
but the element type is not specified.
4. The specification of realloc now states that the old object is freed,
and a new object is allocated. The old and new objects cannot, in
general, be viewed as being members of an array of such objects. With the
current specification, this appears to prohibit the use of the restrict
qualifier for a pointer that points to an object that is realloc'd. There
are also related issues for dynamically allocated linked lists.
The following changes would address these, though it is accepted that
further discussion may be able to improve them. In these changes, new text
is in bold and removed text in italics.
In 6.7.3.1, amend paragraph 1 to read:
Let D be a declaration of an ordinary identifier that provides a
means of designating an object P as a restrict-qualified pointer to
objects of type T.
Change paragraph 4 to:
During each execution of B, let A be the array object that is determined
dynamically by all references through pointer expressions based on P.
Then all references to values of A shall be through pointer expressions
based on P. Let L(X,B) denote the set of all lvalues that are used to
access object X during a particular execution of B. If T is
const-qualified, and the address of one lvalue in L(X,B) is based on P,
then X shall not be modified during the execution of B. If T is not
const-qualified, the address of one lvalue in L(X,B) is based on P, and X
is modified during the execution of B, then the addresses of all lvalues
in L(X,B) shall be based on P. The requirement in the previous sentence
applies recursively, with P in place of X, with each access of X through
an lvalue in L(X,B) treated as if it modified the value of P, and with
other restricted pointers, associated with B or with other blocks, in
place of P. Furthermore, if P is assigned the value of a pointer
expression E that is based on another restricted pointer object P2,
associated with block B2, then either the execution of B2 shall begin
before the execution of B, or the execution of B2 shall end prior to the
assignment. If these requirements are not met, then the behavior is
undefined.
Alternative wording for the last new sentence ("The requirement ...") is:
If X is modified, the requirement in the previous sentence applies
recursively: P is treated as if it were itself modified and replaces
X in L(X,B), then the same condition shall apply to other restricted
pointers, associated with B or with other blocks, in place of P in
the previous sentence.
Finally, WG14 may wish to consider the following additional change
(rationale is available separately). In 6.7.5.3 paragraph 6, change:
A declaration of a parameter as "array of type" shall be adjusted to
"pointer to type",
to:
A declaration of a parameter as "array of type" shall be adjusted to
"restrict-qualified pointer to type",
and add a new paragraph after paragraph 6:
As far as the constraints of restrict-qualification are concerned
(6.7.3.1), a parameter that is a complete array type shall be
regarded as a pointer to an object of the complete array size; for
all other purposes, its type shall be as described above.
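The first of the four restrict issues listed above (aliasing of
unmodified objects) can be illustrated by a sketch such as the
following. The function names are invented for the example. The two
input pointers refer to the same array, which is only read; as that
issue describes it, the current wording leaves the call undefined,
while the amended paragraph 4 is intended to permit it.

```c
#include <stddef.h>

/* Element-wise sum. Only out[] is modified; a[] and b[] are read. */
void vector_add(size_t n, double *restrict out,
                const double *restrict a, const double *restrict b)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Doubles x[] into out[] by passing x for both inputs, so that a and b
   alias the same unmodified array. */
void double_into(size_t n, double *out, const double *x)
{
    vector_add(n, out, x, x);
}
```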
Issues with floating point
Floating-point Unimplementabilities and Ambiguities
The UK comments on CD1 included a large number of comments that
have not been addressed in the FCD. Discussions on the reflector indicate
that many of the new features in the language are intended to make sense
only if Annex F or Annex H are in force. This is not reasonable, not least
because it makes the main body of the standard meaningful only in the
context of an informative annex.
It is not reasonable to claim that such problems do not matter because
they cannot be shown in strictly conforming programs.
The same applies to the new features in their entirety, because they are
defined only in certain implementation defined cases.
And the same applies to almost all error and exception handling, even in
C89.
The list of architectures which will have major trouble with the new
proposals includes the IBM 360/370/390 (including the Hitachi S-3600 and
others), the NEC SX-4, the DEC VAX, the Cray C90 and T90, the Hitachi
SR2201, the DEC Alpha (to a certain extent) and almost certainly many
others. Implementors on these will interpret the standard in many, varied
and incompatible ways, because they CANNOT implement the current wording
in any way that makes sense.
For similar reasons, these new features are impossible to use in a
portable program, because it is not possible to determine what they mean,
unless __STD_IEC_559__ is set. This is not reasonable for features
defined in the main body of the standard.
The standard must be improved so that all such arithmetic-dependent
features are shielded in some suitable way: by a predefined preprocessor
macro, moved to an optional annex, defined so that all reasonable
implementations can support them, or defined to permit an implementation
to indicate that they are not supported. It does not really matter which
approach is adopted.
The following suggestions should remove the worst of the problems, mostly
using the last approach. In most cases, they are trivial extensions that
merely permit the implementor to return an error indication if the
feature cannot be provided, or wording to change ill-defined behaviour
into implementation-defined behaviour.
7.6 Floating-point environment <fenv.h>
Item A
The C standard should not place constraints on a program that are not
determinable by the programmer, and the second and third items of
paragraph 2 do. Many systems use floating-point for some integer
operations or handle some integer exceptions as floating-point - e.g.
dividing by zero may raise SIGFPE, and integer multiplication or division
may actually be done by converting to floating-point and back again.
Either the clause "or unless the function is known not to use floating
point" should be dropped in both cases, or a determinable condition should
be imposed, such as by adding the extra item:
- any function call defined in the headers <complex.h> or <math.h> or
defined elsewhere to access a floating-point object is assumed
to have the potential for raising floating-point exceptions,
unless the documentation states otherwise.
This requires most of the functions in <stdlib.h> to handle exceptions
themselves, if they use floating-point, but that is assumed by much
existing code. It has the merit of at least being determinable, which the
existing wording isn't.
Item B
There is another serious problem, even on systems with IEEE arithmetic,
in that the interaction of the flag settings with setjmp/longjmp is not
well-defined. Should they have the value when setjmp was invoked, when
longjmp was called, or what?
Worse still, the current wording does not forbid setjmp to be invoked
with non-standard modes and longjmp called with default ones, which won't
work in general.
Another item of paragraph 2 should be added:
- if the setjmp macro is invoked with non-default modes, the
behaviour is undefined. The values of the exception flags on return
from a setjmp macro invocation are unspecified.
Item C
A related one concerns the case where a function with FENV_ACCESS set on
calls one with FENV_ACCESS set off - the current wording implies that the
latter must set the flags precisely for the benefit of the former, which
is a major restriction on implementors and makes a complete mockery of
footnote 163.
The second last sentence of paragraph 2 should be changed to:
If part of a program sets or tests flags or runs under non-default
mode settings, ....
7.6.2 Exceptions
These specifications do not allow the implementation any way to indicate
failure. This is (just) adequate for strict IEEE arithmetic, but is a
hostage to fortune and prevents their use for several IEEE-like
arithmetics. All such implementations can do is to not define the macros,
thus implying that they cannot support the functions, whereas they may be
able to support all reasonable use of the functions and merely fail in
some perverse circumstances.
All of these functions (excluding fetestexcept) should be defined with a
return type of int, and to return zero if and only if the call succeeded.
7.6.3.1 The fegetround function
What happens if the rounding mode is none of the ones defined above, or
is not determinable (as can occur)? The following should be added to the
end of paragraph 3:
If the rounding mode does not match a rounding direction macro or is
not determinable, a negative value is returned.
7.6.3.2 The fesetround function
Many existing C <math.h> implementations and even more numerical
libraries have the property that they rely on the rounding mode they are
called with being the one they were built for. To use a different
rounding mode, the user must link with a separate library. The standard
should permit an implementation to reject a change if the change is
impossible, as distinct from when it does not support that mode at all.
Paragraph 3 should be simplified to:
The fesetround function returns a nonzero value if and only if the
requested rounding direction has been established.
Note that this enables the example to make sense, which it doesn't at
present.
7.6.4 Environment
Exactly the same points apply as for 7.6.2 Exceptions above for all the
functions (excluding feholdexcept), and exactly the same solution should
be adopted.
7.12 Mathematics <math.h>
Item A
A major flaw in paragraphs 4 and 5 is that there is no way of specifying
an infinity or a NaN for double or long double, unless float supports
them. While this is the common case, C9X does not require it and it is
not reasonable to do so. In particular, the NEC SX-4 supports IEEE, Cray
and IBM arithmetics, and there are a lot of IEEE systems around which
have non-IEEE long double, and this case cannot be fully supported,
either.
'float' should be changed to 'double' in paragraph 4 and the following
should be added to it:
The macros
INFINITY_F
INFINITY_L
are respectively float and long double analogs of INFINITY.
'float' should be changed to 'double' in paragraph 5 and the following
should be added to it:
The macros
NAN_F
NAN_L
are respectively float and long double analogs of NAN.
Item B
The classification macros are inadequate to classify all numbers on many
architectures - for example, the IBM 370 has unnormalised numbers and the
DEC VAX has invalid ones (i.e. not NaNs.) 5.2.4.2.2 and 7.6 permit other
implementation-defined values, but this section does not.
The following should be appended to paragraph 6:
Additional floating-point classifications, with macro definitions
beginning with FP_ and an uppercase letter, may also be specified by
the implementation.
7.12.1 Treatment of error conditions
I have no idea what "without extraordinary roundoff error" means, and I
have been involved in the implementation and validation of mathematical
functions over 3 decades. My dictionary contains 5 definitions of
"extraordinary", most of which might be applicable, and I know at least 3
separate meanings of the term "roundoff error" in the context of
mathematical functions.
The following paragraph should be added:
If a function produces a range error to avoid extraordinary roundoff
error, the implementation shall define the conditions when this may
occur.
7.12.3.1 The fpclassify macro
As mentioned above, the current wording forbids an implementation from
correctly classifying certain IEEE, IBM 370 and DEC VAX numbers. The first
sentence of paragraph 2 should have appended:
..., or into another implementation-defined category.
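With the appended wording, portable code that switches on the result of
fpclassify would presumably want a default branch for any additional
category. A sketch, in which describe_class is an invented name:

```c
#include <math.h>

/* Maps the classification of x to a short description. The default
   branch covers any implementation-defined category beyond the five
   standard ones. */
const char *describe_class(double x)
{
    switch (fpclassify(x)) {
    case FP_NAN:       return "NaN";
    case FP_INFINITE:  return "infinite";
    case FP_ZERO:      return "zero";
    case FP_SUBNORMAL: return "subnormal";
    case FP_NORMAL:    return "normal";
    default:           return "implementation-defined category";
    }
}
```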
7.12.3.2 The signbit macro
The wording of this is seriously flawed. It says that it returns the sign
of a number, but is clearly intended to test the sign bit, and these are
NOT equivalent. IEEE 754 states explicitly that it does not interpret the
sign of NaNs (see section 6.3), and the VAX distinguishes zeroes from
reserved operands (not NaNs, but something much more illegal) by the sign
bit.
And there is nowhere else in C9X that requires the sign of a
floating-point number to be held as a bit - surely people have not
yet forgotten ones' and twos' complement floating point?
Paragraphs 1 and 2 should be replaced by:
For valid non-zero values (including infinities but not NaNs), the
signbit macro returns nonzero if and only if the sign of its argument
is negative.
For zeroes and NaNs when __STD_IEC_559__ is defined, the signbit
macro returns nonzero if and only if the sign bit of the value is set.
For zeroes and NaNs when __STD_IEC_559__ is not defined, the signbit
macro returns nonzero for an implementation-defined set of values
and zero otherwise.
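The difference between "the sign of a number" and "the sign bit" can be
sketched as follows. The two helper names are hypothetical. The
behaviour noted for negative zero assumes an IEC 60559 implementation
(one that defines the feature-test macro, spelled __STDC_IEC_559__ in
the published standard); on other arithmetics the answer for zeroes is
the implementation-defined case discussed above.

```c
#include <math.h>

/* Sign as an arithmetic property: true only for values below zero. */
int negative_by_comparison(double x)
{
    return x < 0.0;
}

/* Sign as a representation property: tests the sign bit itself. */
int negative_by_signbit(double x)
{
    return signbit(x) != 0;
}
```

For ordinary non-zero values the two agree. For -0.0 on an IEC 60559
system, negative_by_comparison yields 0 while negative_by_signbit
yields 1, which is the case the first proposed paragraph deliberately
leaves out.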
7.12.11.1 The copysign functions
What does "treat negative zero consistently" mean? Does IBM 370 arithmetic
do it? Does VAX? Does Intel? Does Cray? Does IEEE?
The sentence "On implementations ... the sign of zero as positive."
should be replaced by one or the other of the following:
Unless __STD_IEC_559__ is defined (see Annex F), it is
implementation-defined whether any representations of zero are
regarded as negative by this function and, if so, which.
or:
The copysign functions shall interpret the sign of zeroes in the
same way that the signbit macro (7.12.3.2) does.
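The second alternative amounts to a consistency property that can be
checked in a few lines. This is a sketch only; the function name is
hypothetical, and whether the property holds for zeroes on a non-IEC
60559 implementation is precisely what the current wording leaves open.

```c
#include <math.h>

/* Returns 1 if copysign and signbit interpret the sign of x the same
 * way, 0 otherwise.  NaN arguments are not considered here. */
int copysign_agrees_with_signbit(double x)
{
    int by_copysign = copysign(1.0, x) < 0.0;
    int by_signbit = signbit(x) != 0;
    return by_copysign == by_signbit;
}
```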
Floating-point Incompatibilities with Other Standards
The UK comments on CD1 included a large number that have not been
addressed in the FCD with regard to compatibility with IEC
60559 (IEEE 754) and ISO/IEC 10967-1 (LIA-1). It is not reasonable to
claim that such problems do not matter because they cannot be shown in
strictly conforming programs. The same applies to almost all of the
trickier aspects of C89 and C9X floating-point support.
The responses stated that the intention is compatibility only with a
subset of those standards, but those standards do not always allow the
subsetting required by C9X. Furthermore, the statement is not true in all
cases, and it is impossible for an implementation to conform to both
standards simultaneously.
The standard must be improved so that an implementation can reasonably
satisfy both standards simultaneously, in all aspects where C9X claims
that it is compatible with the other standards. Where this is not
possible, C9X should admit the fact in so many words or provide a
mechanism for alternate implementation.
There is also the major point that C9X can and should specify syntax for
such support, in cases where this would avoid implementations providing it
incompatibly. This will then reduce problems if C wishes to support the
feature properly at a later revision. A precedent for this is the signal
handling, which effectively defines syntax but leaves the semantics
almost completely undefined.
Furthermore, there are many places where C9X makes it unnecessarily
difficult to satisfy the other standards, and where minor changes would
have major benefits. These should be improved, and the forthcoming
ISO/IEC 10967-2 (LIA-2) should also be considered in this respect.
The following suggestions should remove the worst of the problems, but
are by no means a complete solution.
5.2.4.2.2 Characteristics of floating types <float.h>
Paragraph 5 doesn't define precisely what the rounding mode terms mean,
and there are many possible interpretations (especially of nearest
rounding for numbers that fall precisely between two values.) Note that
this is specified by IEC 60559 but explicitly not by ISO/IEC 10967-1.
However, the latter requires the rounding method to be documented in full
(see section 8, paragraph f.)
The following should be added to the end of the last sentence:
Unless __STD_IEC_559__ is defined (see Annex F), the precise
definition of these rounding modes is implementation-defined.
7.3.2 Conventions
This comment is not strictly an incompatibility, but is about wording
likely to cause such problems. The current description of errno handling
is so confusing that it could be interpreted that errno is unpredictable
on return from a complex function. This cannot be the intention. The
second sentence should be replaced by:
An implementation may define domain and range errors, when it will
set errno to EDOM and ERANGE and the result to an implementation
defined value, but is not required to.
7.6 Floating-point environment <fenv.h>
7.12.1 Treatment of error conditions
Annex F IEC 60559 floating-point arithmetic
Annex H Language Independent Arithmetic
One of the assumptions in the IEC 60559 model is that exception flags
will eventually be either cleared or diagnosed, and this is required by
ISO/IEC 10967-1. Fortran does not specify what may be written to
'standard error', but C does, and many vendors regard the standard as
forbidding them from issuing diagnostics in this case. H.3.1.1 states
that C permits an implementation to do this, but provides no hint as to
how. Furthermore, there is no implication in the standard that
floating-point exception flags must have any particular values at any time.
The following should be added to 7.6:
If any of the exception flags are set on normal termination after
all calls to functions registered by the atexit function have
been made (see 7.20.4.3), and stderr is still open, the
implementation may write some diagnostics indicating the fact to
stderr.
If this is not done, then Annex H must be corrected, or clarified to
explain how such a diagnostic can be produced by a conforming
implementation.
7.12 Mathematics <math.h>
F.2.1 Infinities, signed zeroes and NaNs
F.3 Operators and functions
Section 7.12 paragraphs 5 and 6 and F.3 are seriously incompatible with
the spirit of IEC 60559, and are in breach of IEEE 754 section 6.2, by not
providing any way to define a signalling NaN or test for whether a NaN is
signalling or quiet. In particular, an implementation cannot extend the
fpclassify function to conform to both standards, because C9X requires it
to classify both signalling and quiet NaNs as FP_NAN, and IEC 60559
requires it to distinguish them.
Furthermore, the current C9X situation does not allow a programmer to
initialise his data to signalling NaNs (as recommended by IEEE 754). It is
perfectly reasonable not to define the behaviour of signalling NaNs in
general, but it is not reasonable to be unnecessarily hostile to IEC
60559. At the very least, there should be a macro NANSIG for creating
one, which can be used in initialisers, and a macro FP_NANSIG for flagging
one.
There are also implementation difficulties with the wording of fpclassify
as it stands, especially since it may need to copy its argument, and this
is not always possible for signalling NaNs.
7.12 should have the extra paragraph:
The macro
NANSIG
is defined if and only if the implementation supports signalling
NaNs for the double type. It expands to a constant expression of type
double representing an implementation-defined signalling NaN. If
defined, NANSIG may be used as an initializer (6.7.8) for an object
of semantic type double; no other semantics for NANSIG values are
defined by this standard.
The macros
NANSIG_F
NANSIG_L
are respectively float and long double analogs of NANSIG.
Note that it is not possible to have solely a float value of NANSIG,
because of the constraints on copying signalling NaN values.
7.12 paragraph 6 should define the extra symbol:
FP_NANSIG
and add the extra sentence:
This standard does not specify whether the argument of fpclassify is
copied or not, in the sense used by IEC 60559.
F.2.1 paragraph 1 needs replacing by:
The NAN, NAN_F, NAN_L, NANSIG, NANSIG_F, NANSIG_L, INFINITY,
INFINITY_F and INFINITY_L macros in <math.h> provide designations for
IEC 60559 NaNs and infinities.
F.3 last paragraph (starting "The signbit macro") should be simplified by
the omission of the exclusion in brackets - i.e. "(except that fpclassify
does not distinguish signalling from quiet NaNs)".
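To make the copying problem concrete: without something like FP_NANSIG,
a program that wants to detect a signalling NaN has to inspect the
representation, and has to do so through a pointer so that the value is
never loaded as a floating-point operand. The sketch below is
illustration only. It assumes a 64-bit IEC 60559 double whose most
significant fraction bit is the "quiet" bit (the common convention, but
not universal), and both function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Store a raw 64-bit pattern into a double without any floating-point
 * operation, so a signalling NaN is not quieted on the way. */
void double_from_bits(double *out, uint64_t bits)
{
    if (sizeof *out == sizeof bits)
        memcpy(out, &bits, sizeof bits);
}

/* Assumption: binary64 layout, quiet bit = top fraction bit (bit 51).
 * Takes a pointer because passing the value itself might copy it. */
int is_signalling_nan_binary64(const double *p)
{
    uint64_t bits;
    if (sizeof *p != sizeof bits)
        return 0;                       /* layout assumption not met */
    memcpy(&bits, p, sizeof bits);
    uint64_t exponent = (bits >> 52) & 0x7FFu;
    uint64_t fraction = bits & 0x000FFFFFFFFFFFFFull;
    uint64_t quiet_bit = bits & 0x0008000000000000ull;
    return exponent == 0x7FFu && fraction != 0 && quiet_bit == 0;
}
```

None of this is portable, which is the point of the comment: a
standard macro would let the implementation supply the knowledge that
the sketch has to assume.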
7.12.1 Treatment of error conditions
Paragraph 2 is in clear conflict with the stated intention of IEC 60559
and ISO/IEC 10967-1, and actually prevents an implementation from
conforming to both C9X and the whole of ISO/IEC 10967-1 simultaneously.
Despite this, H.3.1.2 Paragraph 1 claims that the C standard allows "hard
to ignore" trapping and diagnostics as an alternative form of
notification (as required by ISO/IEC 10967-1), but it specifically FORBIDS
this in many routines of the library that provide the ISO/IEC 10967-1
functionality (as described in H.2.3.2).
This is unacceptable. While there are many possible solutions, this
problem is extremely pervasive, and most of them would involve extensive
changes to C9X. However, SOMETHING needs to be done, and the following
are possibilities:
1. To remove the erroneous claims of ISO/IEC 10967-1 support from Annex
H.
2. To define a pragma to select between the mode where errno is
returned and modes where ISO/IEC 10967-1 "hard to ignore" trapping
and diagnostics are used. Unfortunately, the changes would be
extensive.
3. To add the following to 7.12.1:
An implementation shall provide a mechanism for programs to be
executed as described above. It may also provide a mechanism by
which programs are executed in a mode in which some or all
domain and range errors raise signals in an
implementation-defined fashion.
Recommended practice
If domain errors raise a signal, the signal should be SIGILL.
If range errors raise a signal, the signal should be SIGFPE. It
should be possible for the program to run in a mode where
domain errors and range errors that correspond to overflow raise
signals, but range errors that correspond to underflow do not.
Alternatively, people might prefer to use SIGFPE for both classes of
error; there are arguments both ways, and either choice is
reasonable.
F.9 Mathematics <math.h>
Paragraph 4 is seriously incompatible with the spirit of IEC 60559 and
the wording of ISO/IEC 10967-1. Note that 7.12.1 paragraphs 2 and 3
permit an implementation to define additional domain and range error
conditions, but this section does not.
Paragraph 4 should be changed to:
The invalid exception will be raised whenever errno is set to EDOM.
Subsequent subclauses of this annex specify other circumstances when
the invalid or divide-by-zero exceptions are raised.
There is also a possible ambiguity in paragraphs 5 and 6, and a problem
caused by cases where the implementation may define extra range errors as
permitted by 7.12.1. It should be closed by adding the following:
Whenever errno is set to ERANGE, at least one of the divide-by-zero,
overflow or underflow exceptions shall be raised.
H.3.1 Notification alternatives
H.3.1.2 Traps
ISO/IEC 10967-1 6.1.1 point (c) requires the ability to permit the
programmer to specify code to compensate for exceptions if
trap-and-resume exception handling is used. C does not permit such code,
but H.3.1 paragraph 4 claims that it does. In particular, there is no way
to return a corrected value after a numeric (SIGFPE) signal. Paragraphs 4
of H.3.1 and H.3.1.2 must be corrected, so that they do not claim that C9X
supports ISO/IEC 10967-1 trap-and-resume exception handling.
H.3.1.2 paragraph 4 claims that the C standard allows trap-and-terminate
as well as trap-and-resume. This is not true, either, as C9X stands. In
particular, it does not permit it with exponentF and scaleF implemented
using logb and scalbn etc. Either such termination must be permitted, or
paragraphs 4 of H.3.1 and H.3.1.2 must be corrected; a suggestion is made
for the former elsewhere in this proposal.
Details of PC-UK02xx issues
Category: 1
PC-UK0201
Category: Normative change to existing feature retaining the original
intent
Committee Draft subsection: 4
Title: Further requirements on the conformance documentation
Detailed description:
The Standard requires an implementation to be accompanied by
documentation of various items. However, there is a subtle difference
between the terms "implementation-defined" and "described by the
implementation" which has been missed by this wording (this is partly due
to the tightening up of the uses of this term between C89 and C9X - see
for example subclause 6.10.6).
As a result, the wording does not actually require the latter items to be
documented.
Change the paragraph to:
An implementation shall be accompanied by a document that describes
all features that this International Standard requires to be described
by the implementation, including all implementation-defined
characteristics and all extensions.
PC-UK0244
Category: Inconsistency
Committee Draft subsection: 6.2.5, 6.5.3.4, 6.7
Title: Issues with prototypes and completeness.
Detailed description:
6.2.5p23 says "An array type of unknown size is an incomplete type". Is
the type "int [*]" (which can only occur within a prototype) complete or
incomplete ?
If it is complete, then what is its size ? This can occur in the
construct
int f (int a [sizeof (int [*][*])]);
If it is incomplete, then the type "int [*][*]" is not permitted, which
is clearly wrong.
Now consider the prototype:
int g (int a []);
The parameter clearly has an incomplete type, but since a parameter is an
object (see 3.16) this is forbidden by 6.7p7:
If an identifier for an object is declared with no linkage, the type
for the object shall be complete by the end of its declarator, or by
the end of its init-declarator if it has an initializer.
This is also clearly not what was intended.
One way to fix the first item would be to change 6.5.3.4p1 to read:
[#1] The sizeof operator shall not be applied to an
expression that has function type or an incomplete type, to
|| an array type with unspecified size 72a), to
the parenthesized name of such a type, or to an lvalue that
designates a bit-field object.
|| 72a) An array type with unspecified size occurs in function
|| prototypes when the notation [*] is used, as in "int [*][5][*]".
One way to fix the second item would be to change 6.7p7 to read:
If an identifier for an object is declared with no linkage, the type
for the object shall be complete by the end of its declarator, or by
the end of its init-declarator if it has an initializer;
|| in the case of function arguments (including in prototypes) this shall
|| be after making the adjustments of 6.7.5.3 (from array and function
|| types to pointer types).
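The adjustment that the second fix relies on can be seen in ordinary
code. The sketch below is illustration only, with hypothetical function
names: both an open array parameter and a [*] parameter are adjusted to
pointers, so each function is compatible with a pointer-taking
prototype. It deliberately avoids the contested sizeof (int [*][*])
construct.

```c
#include <stddef.h>

/* [*] is only permitted in a prototype that is not a definition. */
long sum_vla(size_t n, const int a[*]);
/* An array parameter of unknown size, incomplete before adjustment. */
long sum_open(const int a[], size_t n);

long sum_vla(size_t n, const int a[n])
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

long sum_open(const int a[], size_t n)
{
    return sum_vla(n, a);
}

/* These initialisations are accepted because, after the adjustments of
 * 6.7.5.3, both parameters have type pointer to const int. */
long (*const sum_open_ptr)(const int *, size_t) = sum_open;
long (*const sum_vla_ptr)(size_t, const int *) = sum_vla;
```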
PC-UK0246
Committee Draft subsection: 6.2.5, 6.7.2.2
Title: Circular definition of enumerated types
Detailed description:
6.7.2.2 para 4 says:
Each enumerated type shall be compatible with an integer type.
However, 6.2.5 para 17 says:
The type char, the signed and unsigned integer types,
and the enumerated types are collectively called integer types.
Thus we have a circular definition. To fix this, change the former to one
of:
Each enumerated type shall be compatible with a signed or unsigned
integer type.
or:
Each enumerated type shall be compatible with a standard integer type
or an extended integer type.
PC-UK0248
Committee Draft subsection: 6.3.2.3
Title: Null pointer constants should be castable to pointer types
Detailed description:
6.3.2.3p3 says that:
If a null pointer constant is assigned
to or compared for equality to a pointer, the constant is
converted to a pointer of that type. Such a pointer, called
a null pointer,
However, this doesn't cover cases such as:
(char *) 0
which is neither an assignment nor a comparison. Therefore this is not a
null pointer constant, but rather an implementation-defined conversion from
an integer to a pointer. This is clearly an oversight and should be fixed.
Either change:
If a null pointer constant is assigned
to or compared for equality to a pointer, the constant is
converted to a pointer of that type. Such a pointer, called
a null pointer, is guaranteed to compare unequal to a
pointer to any object or function.
to:
If a null pointer constant is assigned
to or compared for equality to a pointer, the constant is
converted to a pointer of that type. When a null pointer
constant is converted to a pointer, the result (called a /null
pointer/) is guaranteed to compare unequal to a pointer to any
object or function.
or change:
If a null pointer constant is assigned
to or compared for equality to a pointer, the constant is
converted to a pointer of that type. Such a pointer, called
a null pointer, is guaranteed to compare unequal to a
pointer to any object or function.
[#4] Conversion of a null pointer to another pointer type
yields a null pointer of that type. Any two null pointers
shall compare equal.
to:
If a null pointer constant is assigned
to or compared for equality to a pointer, the constant is
converted to a pointer of that type.
A /null pointer/ is a special value of any given pointer type
that is guaranteed to compare unequal to a pointer to any
object or function. Conversion of a null pointer constant to a
pointer type, or of a null pointer to another pointer type,
yields a null pointer of that type. Any two null pointers
shall compare equal.
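The behaviour that either rewording is meant to guarantee is what
programmers already rely on. A minimal sketch, with hypothetical
function names:

```c
#include <stddef.h>

/* A cast of the constant 0 is expected to give the same null pointer
 * as assignment of 0 or use of NULL. */
int cast_zero_is_null(void)
{
    char *by_cast = (char *)0;   /* the case the comment highlights */
    char *by_assignment = 0;     /* covered by the existing wording */
    return by_cast == by_assignment && by_cast == NULL;
}

/* A null pointer compares unequal to a pointer to any object. */
int null_differs_from_object(void)
{
    char object = 'x';
    return (char *)0 != &object;
}
```

Under a strict reading of the FCD wording, only by_assignment is
guaranteed to be a null pointer; the proposed text makes the guarantee
cover by_cast as well.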
PC-UK0272
Committee Draft subsection: 6.5.9
Title: Tidy up of pointer comparison
Detailed description:
Clause 6.5.8, para 5-6: The original wording (6.3.8) contained
a paragraph between these two. The first sentence of this paragraph
has been moved to paragraph 5. The second sentence ("If two pointers
to object or incomplete types compare equal, both point to the same
object, or both point one past the last element of the same array
object.") does not appear to occur in the FCD. The sentence needs
to be in the FCD so that all cases are covered.
Append to 6.5.9 para 6:
Otherwise the pointers compare unequal.
PC-UK0273
Committee Draft subsection: 6.7.5.3
Title: Forbid incomplete types in prototypes
Detailed description:
Clause 6.7.5.3, para 8: This allows constructs such as:
void f1(void, char);
struct t;
void f2(struct t);
Allowing incomplete types in prototypes is only necessary to support
[*] (another UK proposal would make this a complete type). If
certain incomplete types are allowed in prototypes they need to
be explicitly called out. Otherwise the behaviour should be
made a constraint violation.
Remove the words "may have incomplete type and" from the cited paragraph.
Words should be added somewhere to make it clear that [*] arrays are
complete types; for example, in 6.7.5.2 para 3 change:
... in declarations with function prototype scope.111) ...
to
... in declarations with function prototype scope 111); such
arrays are nonetheless complete types ...
See also PC-UK0244.
PC-UK0277
Committee Draft subsection: 6.2.6.1, 6.5.2.3
Title: Effects on other members of assigning to a union member
Detailed description:
6.5.2.3p5 has wording concerning the storing of values into a union
member:
With one exception, if the value of a member of a union object is
used when the most recent store to the object was to a different
member, the behavior is implementation-defined.
When this wording was written, "implementation-defined" was interpreted
more loosely and there was no other relevant wording concerning the
representation of values. Neither of these is the case anymore.
The requirement to be implementation-defined means that an implementation
must ensure that all stored values are valid in the types of all the other
members, and eliminates the possibility of them being trap representations.
It also makes it practically impossible to have trap representations at all.
This is not the intention of other parts of the Standard.
It turns out that the wording of 6.2.6.1 is sufficient to explain the
behavior in these circumstances. Type punning by unions causes the same
sequence of bytes to be interpreted according to different types. In
general, 6.2.6.1 makes it clear that bit patterns can be trap values, and
so the programmer can never be sure that the value is safe to use in a
different type.
One special case that should be considered is the following:
union {
unsigned short us;
signed int si;
unsigned int ui;
} x;
If x.si is set to a positive value, the requirements of 6.2.5 and 6.2.6.1
will mean that x.ui holds the same value with the same representation.
This appears to be a reasonable assumption. A similar thing happens if
x.ui is set to a value between 0 and INT_MAX. If x.si is set to a negative
value, or x.ui to a value greater than INT_MAX, the other might be set to
a trap representation if there are any padding bits; if there are none,
then it must be the case that the extra bit in x.ui overlaps the sign bit
of x.si. Finally, if x.ui is set to some value and x.us does not have any
padding bits and does not overlap any padding bits of x.ui, then x.us will
have some value determinable from the arrangement of bits in the two types
(this might be the low bits of x.ui, the high bits, or some other
combination).
None of these cases should be particularly surprising.
The cited wording in 6.5.2.3 merely muddles the issue by implying that
all possible members take sensible (non-trap) values. It should be removed;
the rest of the paragraph can stand alone.
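The well-behaved part of the special case can be written out as code.
This is a sketch with hypothetical function names; it exercises only
the values for which the argument above says the result is determined
(non-negative int values and unsigned values up to INT_MAX).

```c
#include <limits.h>

union punned {
    unsigned short us;
    signed int si;
    unsigned int ui;
};

/* Store through si, read through ui.  For a non-negative argument the
 * two members share value and representation. */
unsigned int ui_after_storing_si(int nonnegative)
{
    union punned x;
    x.si = nonnegative;
    return x.ui;
}

/* Store through ui, read through si.  Only meaningful for arguments in
 * the range 0..INT_MAX. */
int si_after_storing_ui(unsigned int small)
{
    union punned x;
    x.ui = small;
    return x.si;
}
```

Reading x.us after either store is the remaining case: it yields some
value determined by the layout of the two types, which the sketch does
not attempt to predict.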
PC-UK0278
Committee Draft subsection: 7.19.8.1, 7.19.8.2
Title: Clarify the actions of fread and fwrite
Detailed description:
The exact behaviour of fread and fwrite is not well specified, particularly
on text streams but in actuality even on binary streams. For example, the
wording does not require the bytes of the object to be output in the order
they appear in memory, but would allow (for example) byte-swapping. It is
reported that at least one implementation converts to a text form such as
uuencoding on output to a text stream, converting back on input. And,
finally, there is not even a requirement that data written out by fwrite
and read back by an equivalent call to fread even has the same value.
These changes apply the obvious semantics.
In 7.19.8.1p2, add after the first sentence:
For each object, /size/ calls are made to the /fgetc/ function and
the results stored, in the order read, in an array of /unsigned char/
exactly overlaying the object.
In 7.19.8.2p2, add after the first sentence:
For each object, /size/ calls are made to the /fputc/ function, taking
the values (in order) from an array of /unsigned char/ exactly
overlaying the object.
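The proposed semantics can be modelled directly in terms of fgetc and
fputc. The following is a sketch of that model, not of any library's
actual implementation; the names fread_bytewise and fwrite_bytewise are
hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Model of fwrite: for each object, size calls to fputc, taking the
 * bytes in memory order.  Returns the number of complete objects
 * written. */
size_t fwrite_bytewise(const void *ptr, size_t size, size_t nmemb,
                       FILE *stream)
{
    const unsigned char *bytes = ptr;
    size_t done;

    if (size == 0)
        return 0;
    for (done = 0; done < nmemb; done++) {
        for (size_t i = 0; i < size; i++) {
            if (fputc(bytes[done * size + i], stream) == EOF)
                return done;
        }
    }
    return done;
}

/* Model of fread: for each object, size calls to fgetc, storing the
 * results in the order read.  Returns the number of complete objects
 * read. */
size_t fread_bytewise(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    unsigned char *bytes = ptr;
    size_t done;

    if (size == 0)
        return 0;
    for (done = 0; done < nmemb; done++) {
        for (size_t i = 0; i < size; i++) {
            int c = fgetc(stream);
            if (c == EOF)
                return done;
            bytes[done * size + i] = (unsigned char)c;
        }
    }
    return done;
}
```

With this model, data written by one call and read back by the
equivalent call on a binary stream necessarily has the same
representation, which is the round-trip guarantee the comment asks for.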
PC-UK0286
Committee Draft subsection: 7.6, 7.6.3
Title: Inconsistencies in fesetround
Detailed description:
It is not clear from the text whether an implementation is required to
allow the floating-point rounding direction to be changed at runtime. The
wording of 7.6p7 implies that those modes defined must be settable at
runtime ("... supports getting and setting ..."), but if this is the case
then 7.6.3.2p4 (example 1) would have no need to call assert on the results
of the fesetround call, since that call could not fail (if FE_UPWARD is not
defined the code is in error). On the other hand, 7.6.3.2p2 implies that
setting succeeds whenever the argument is one of the defined FE_ macros,
and in any case 7.6.3.2p3 is ambiguous.
Even if the mode cannot be set at runtime, it may be the case that the
code is compiled under more than one mode, and it is therefore convenient
to be able to retrieve the current mode.
Option A
--------
If the intention is that there may be rounding modes that can be in effect
but cannot be set by the program, then make the following changes:
Change 7.6p7 to:
Each of the macros
FE_DOWNWARD
FE_TONEAREST
FE_TOWARDZERO
FE_UPWARD
is defined if the implementation supports this rounding direction
and it might be returned by the fegetround function; it might be
possible to set this direction with the fesetround function.
Additional rounding directions ... [unchanged]
Change 7.6.3.2p2 to:
The fesetround function attempts to establish the rounding direction
represented by its argument round. If the new rounding direction cannot
be established, it is left unchanged.
Change 7.6.3.2p3 to:
The fesetround function returns a zero value if and only if the
new rounding direction has been established.
Example 1 is valid in this option, though it may be desirable to replace
the assert by some other action.
Option B
--------
If the intention is that an implementation must allow the mode to be set
successfully to any FE_* macro that is defined, then make the following
changes:
Change 7.6.3.2p3 to:
The fesetround function returns a zero value if and only if the
argument is equal to a rounding direction macro defined in the
header (that is, if and only if the requested rounding direction
is one supported by the implementation).
In 7.6.3.2p4, change the lines:
setround_ok = fesetround (FE_UPWARD);
assert(setround_ok);
to:
assert(defined (FE_UPWARD));
fesetround(FE_UPWARD); // Can't fail
and delete the declaration of setround_ok.
PC-UK0287
Committee Draft subsection: 7.13, 7.13.2.1
Title: Clarify what the setjmp "environment" is
Detailed description:
Much of the state of the abstract machine is not stored in "objects"
within the meaning of the Standard. It needs to be clear that such state
is not saved and restored by the setjmp/longjmp combination.
Append to 7.13p2:
The environment of a call to the setjmp macro consists of
information sufficient for a call to the longjmp function to return
execution to the correct block and invocation of that block (if it
were called recursively). It does not include the state of the
floating-point flags, of open files, or of any other component of
the abstract machine.
In 7.13.2.1p3, change:
All accessible objects have values as of the time ...
to
All accessible objects have values, and all other components of the
abstract machine [*] have the same state as of the time ...
[*] This includes but is not limited to the floating-point flags and
the state of open files.
It also needs to be clear that optimisers need to take care. Consider
the following code:
jmp_buf env;
int v [N * 2];
for (int i = 0; i < N; i++)
{
v [2*i] = 0;
if (setjmp (env))
// ...
v [2*i+1] = f (); // might call longjmp; note i hasn't changed
}
This might be op