ISO/IEC JTC1/SC22/WG14 N739

                General wording issues (clauses 1 to 6)
                            First Revision
                          Clive D.W. Feather


Abstract
========

This document is an attempt to identify all the minor issues I can find
in clauses 1 to 6 of the Standard. This revision is an update to use
draft 10 pre 1 as the starting point.

Where issues are still open or are undiscussed, I have added material
and the original wording.

=======================================================================

Item 1:

The term "access" is not well defined. From context, it sometimes
appears to mean "read the value", and sometimes "read or write the
value". This ambiguity sometimes makes it hard to understand what is
actually meant.

There needs to be a definition in clause 3, and all uses of the term
need to be checked for the read-only / read-write problem. Probably the
best approach is to define it as "read or write", and to find and fix
the places where "read" is meant.

An example of the "read" usage is 6.3.2.3 paragraph 5:

    With one exception, if a member of a union object is accessed after
    a value has been stored in a different member of the object, the
    behaviour is implementation-defined.

where writing is clearly meant to be excluded.

An example of the "read or write" usage is 6.3 paragraph 6:

    ... If a value is stored into an object ... the type of the lvalue
    becomes the effective type of the object for that access and for
    subsequent accesses ...

where writing is clearly meant to be included.

An example where this causes problems with interpreting the Standard is
6.5.3. Paragraph 11 reads:

    A reference to a value means either an access to or a modification
    of the value.

So "access" presumably means read, but not write. But then paragraph 6
reads:

    What constitutes an access to an object that has volatile-qualified
    type is implementation-defined.

So what constitutes a write to a volatile object is *not*
implementation-defined?

There are other instances; this is the first one that comes to mind.

====

Item 2:

Change the first part of paragraph 1 of subclause 5.1.2.2.1 to:

    The function called at program startup is named /main/. The
    implementation declares no prototype for this function. It shall be
    defined either with no parameters:

    ...

        int main (int argc, char *argv[]) { /* ... */ }

    or equivalent [*], or in some other implementation-defined manner.

    [*] Thus /int/ can be replaced by a typedef-name defined as /int/,
    or the type of argv can be written as /char **argv/, and so on.

This will make it clear that, while these are the only permitted
strictly conforming alternatives, extensions are allowed but must be
documented.

====

Item 3:

Examples 2 and 6 in subclause 5.1.2.3 need rewording. At present they
use the term "exception" to mean something like a visible overflow trap,
whereas 6.3 makes it clear that an "exception" occurs on overflow even
when the result is silently wrapped.

In example 2, change:

    Provided the addition of two /chars/ can be done without creating
    an overflow exception, ...

to:

    Provided the addition of two /chars/ can be done without overflow,
    or with overflow wrapping silently to produce the correct result, ...

In example 6, change:

    On a machine in which overflows produce an exception ...

to:

    On a machine in which overflows produce an explicit trap ...

and change:

    However on a machine in which overflows do not produce an exception
    and in which the results of overflows are reversible,

to:

    However, on a machine in which overflow silently generates some
    value and where positive and negative overflows cancel,
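
The condition "positive and negative overflows cancel" in the proposed
wording for example 6 can be sketched as follows. This is only an
illustration under an assumption of mine: it uses unsigned operands,
for which wrapping is well defined, so that the sketch itself stays
portable. The function names are hypothetical and do not appear in the
Standard.

```c
#include <limits.h>

/* Sum evaluated as written: a + (b + c). */
unsigned int sum_as_written(unsigned int a, unsigned int b, unsigned int c)
{
    return a + (b + c);
}

/* Sum regrouped as (a + b) + c, as an implementation might rearrange it. */
unsigned int sum_regrouped(unsigned int a, unsigned int b, unsigned int c)
{
    return (a + b) + c;
}
```

With a = 10, b = UINT_MAX and c = 1, each grouping wraps once at a
different intermediate step, yet both yield 10.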

====

Item 4:

In 5.2.1 paragraph 2, delete the final "literal". The zero character
terminates strings, but does not occur in a string literal (which is a
syntactic construct).

Add a forward reference to "string" in 7.1.1.

====

Item 5:

Subclause 6.1.2 treats the term "identifier" as representing the
sequence of characters. On the other hand, subclause 6.1.2.1 treats the
term as representing that sequence within a given scope. Thus in:

    {
        int fred;       /* fred-1 */
        {
            int fred;   /* fred-2 */
        }
    }

6.1.2 paragraph 8 treats fred-1 and fred-2 as being the same identifier,
while 6.1.2.1 treats them as different.

In 6.1.2 paragraph 4, change:

    An identifier denotes an object ... or a macro parameter.

to:

    An identifier can denote an object ... or a macro parameter.
    The same identifier can denote different entities at different
    points in the program.

In 6.1.2.1 paragraph 1, change:

    An identifier is /visible/ (i.e. can be used) only within a region
    of program text called its scope.

to:

    For each different entity that an identifier designates, the
    identifier is /visible/ (i.e. can be used) only within a region of
    program text called its scope. Different entities designated by
    the same identifier either have non-overlapping scopes, or are
    in different name spaces.

In paragraph 3, change:

    If an outer declaration of a lexically identical identifier
    exists in the same name space, it is hidden until the current
    scope terminates, after which it again becomes visible.

to:

    If an identifier designates two different entities in the same name
    space, the scopes might overlap. If so, the scope of one entity
    (the /inner scope/) will be a strict subset of the scope of the
    other entity (the /outer scope/). Within the inner scope, the
    identifier designates the entity declared in the inner scope; the
    entity declared in the outer scope is /hidden/ (and not visible)
    within the inner scope.

Insert a new paragraph between paragraphs 3 and 4:

    Each occurrence of an identifier designates the entity in the
    relevant name space whose declaration is visible at the point that
    the identifier occurs. Unless explicitly stated otherwise, where
    this International Standard uses the term "identifier" to refer to
    some entity (as opposed to the syntactic construct), it is that
    entity that is referred to.

In 7.1.3 paragraph 2, change:

    If the program declares or defines an identifier with the same
    name as an identifier reserved in that context ...

to:

    If the program declares or defines an identifier that is reserved
    in that context ...
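
The distinction drawn in this item between one identifier and the
several entities it can designate might be illustrated by the sketch
below. The function name is hypothetical; the identifier fred is reused
as a tag, a member, and two ordinary identifiers with nested scopes.

```c
struct fred { int fred; };      /* a tag and a member: two separate name spaces */

int count_freds(void)
{
    int fred = 1;               /* ordinary identifier, outer scope */
    struct fred s = { 10 };     /* the tag is still visible here */
    {
        int fred = 2;           /* inner scope: hides the outer fred */
        s.fred += fred;         /* member fred becomes 12 */
    }
    return s.fred + fred;       /* outer fred is visible again: 12 + 1 */
}
```

The tag and the member never conflict with the ordinary identifiers
because they are in different name spaces; the two ordinary
identifiers coexist only because one scope is a strict subset of the
other.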

====

Item 6a:

In 6.1.2.5, append to paragraph 11:

    The implementation shall define /char/ to have the same range,
    representation, and behaviour as one of /signed char/ and /unsigned
    char/. [*]

    [*] CHAR_MIN, defined in <limits.h>, will have one of the values 0
    or SCHAR_MIN, and this can be used to distinguish the two options.
    Irrespective of the choice made, /char/ is a separate type from the
    other two, and is not compatible with either.

This clarifies that there are only two differently-behaving types, not
three.
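
A minimal sketch of the test described in the proposed footnote
follows. The function name is hypothetical; only CHAR_MIN and
SCHAR_MIN from <limits.h> are taken from the Standard.

```c
#include <limits.h>

/* Returns 1 if plain char behaves like signed char, 0 if like unsigned char. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Whichever value is returned, char remains a distinct type from both
signed char and unsigned char.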

====

Item 6b:

In 6.1.2.5, change the last sentence of paragraph 2 from:

    If other quantities are stored in a /char/ object, the behaviour
    is undefined; the values are treated as either signed or nonnegative
    integers.

to:

    If any other character is stored in a /char/ object, the resulting
    value is implementation-defined but shall be within the range of
    values that can be represented in that type.

====

Item 7:

The rules for composite type handle an incomplete array meeting a
complete one, but not the equivalent situation with an incomplete
structure or union.

Replace subclause 6.1.2.6 paragraph 3, first bullet point, with:

    - If one type is complete and the other type is incomplete, the
      composite type is a complete type.

====

Item 8:

Add the following to the end of subclause 6.2.2.3:

    An integer may be converted to any pointer type. The result is
    implementation-defined, and might not be a pointer to an object
    of that type. [59]

    Any pointer type may be converted to an integral type; the result is
    implementation-defined, and need not be in the range of values of
    any integral type. If the resulting value cannot be represented in
    the destination type, the behaviour is undefined. [*]

    [*] Thus if the conversion is to /unsigned int/ but yields a
    negative value, the behaviour is undefined.

    A pointer to a complete or incomplete object type may be converted
    to a pointer to a different complete or incomplete object type. If
    the resulting pointer is not correctly aligned for the pointed-to
    type, the behaviour is undefined. Otherwise, when converted back
    again, the result shall compare equal to the original pointer. [*]

    [*] All pointers to character types are correctly aligned. In
    general, the concept "correctly aligned" is transitive: if a pointer
    to type A is correctly aligned for a pointer to type B, which in
    turn is correctly aligned for a pointer to type C, then a pointer to
    type A is correctly aligned for a pointer to type C.

    A pointer to a function ... [this paragraph, taken from 6.3.4,
    remains unchanged].

Delete 6.3.4 paragraph 4, and add the following paragraph to the
constraints (after paragraph 2):

    Conversions that involve pointers, other than where permitted by the
    constraints of 6.3.16.1, shall be specified by means of an explicit
    cast.
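
The round-trip guarantee in the proposed third paragraph might be
exercised as in the sketch below. The function name is hypothetical,
and the choice of unsigned char as the intermediate type relies on the
footnote's statement that all pointers to character types are
correctly aligned.

```c
/* Returns 1 if converting int * to unsigned char * and back preserves the pointer. */
int round_trip_preserves_pointer(void)
{
    int object = 0;
    int *original = &object;
    unsigned char *as_bytes = (unsigned char *)original;  /* always aligned */
    int *restored = (int *)as_bytes;                      /* explicit cast */
    return restored == original;
}
```

Both conversions are written with explicit casts, as the proposed
constraint would require.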

====

Item 9a:

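Under the current wording, the following code technically has
implementation-defined behaviour, because u.i is accessed after a value
has been stored in u.f, even though the most recent store was to u.i:

  union { int i; float f; } u;
  u.f = 1.0;
  u.i = 42;
  printf ("%d", u.i);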

In 6.3.2.3 paragraph 5, replace:

    With one exception, if a member of a union object is accessed after
    a value has been stored in a different member of the object, the
    behaviour is implementation-defined. [53] One special guarantee is
    made ...

with:

 |  With one exception, if the value of a member of a union object is
 |  used when the most recent store to the object was to a different
 |  member, the behaviour is implementation-defined. [53] One special
    guarantee is made ...

This item ignores the issue of what implementation-defined means;
item 9b deals with that part.

====

Item 9b:

If a union is read from a member other than the one last stored into,
the result is currently implementation-defined. Because the result might
cause a trap of some kind (e.g. invalid pointer), it should be undefined
behaviour in most circumstances; the wording should broadly follow 6.3
on this matter.

In 6.3.2.3, replace paragraph 5 (either the original or the replacement
from item 9a) with:

 |  With two exceptions, if the value of a member of a union object is
 |  used when the most recent store to the object was to a member whose
 |  type does not have the same alignment and representation, the
 |  behaviour is undefined. If either member has character type or is an
 |  array of character type, the behaviour is implementation-defined. [53]
 |  Furthermore, a special guarantee is made ...
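
The character-type exception in the proposed wording covers the common
practice of inspecting an object's bytes through a union. The sketch
below is an illustration only: the type and function names are
hypothetical, and whether the comparison succeeds would remain
implementation-defined under the wording above.

```c
#include <string.h>

union float_view {
    float value;
    unsigned char bytes[sizeof(float)];   /* array of character type */
};

/* Returns 1 if reading bytes after storing value sees the float's representation. */
int bytes_match_representation(float f)
{
    union float_view v;
    unsigned char copy[sizeof(float)];

    v.value = f;                          /* most recent store: the float member */
    memcpy(copy, &f, sizeof copy);        /* reference copy of the representation */
    return memcmp(v.bytes, copy, sizeof copy) == 0;
}
```

Had the second member been, say, a pointer rather than an array of
character type, the same read would fall under the undefined case.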

====

Item 10:

Replace subclause 6.5.2 paragraph 4 by:

    Each of the comma-separated sets designates the same type, except
    that for bit-fields, it is implementation-defined whether the
    specifier /int/ is the same type as /signed int/ or is the same
    type as /unsigned int/.

Replace subclause 6.5.2.1 paragraph 8 by:

    A bit-field shall have a type that is a qualified or unqualified
    version of /signed int/ or /unsigned int/. A bit-field is
    interpreted as a signed or unsigned integral type consisting of the
    specified number of bits. [*]

    [*] As specified in 6.5.2 above, if the actual type specifier used
    is /int/, or there is no type specifier, or the specifier is a
    typedef-name defined using either of these, then it is
    implementation-defined whether the bit-field is signed or unsigned.

This eliminates the duplicate wording in these two places, and also
makes it clear that there is not a potential third signedness of
bit-field.

If my proposals for representation of types are accepted, there may need
to be further wording adjustments in the second alteration.
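
The two possible behaviours of a plain /int/ bit-field can be
distinguished at run time, as in the sketch below. The structure and
function names are hypothetical; the value -1 is assigned through a
variable merely to avoid compile-time diagnostics about the conversion.

```c
struct fields {
    int          plain : 3;   /* signedness is implementation-defined */
    signed int   s     : 3;
    unsigned int u     : 3;
};

/* Returns 1 if a plain int bit-field is signed on this implementation. */
int plain_bit_field_is_signed(void)
{
    struct fields f = { 0, 0, 0 };
    int minus_one = -1;
    f.plain = minus_one;      /* stays -1 if signed, becomes 7 if unsigned */
    return f.plain < 0;
}

/* Returns 1 if the explicitly signed and unsigned fields behave as declared. */
int explicit_fields_behave(void)
{
    struct fields f = { 0, 0, 0 };
    int minus_one = -1;
    f.s = minus_one;
    f.u = minus_one;          /* reduced modulo 8, giving 7 */
    return f.s == -1 && f.u == 7;
}
```

The plain field always matches one of the other two; there is no third
behaviour for it to exhibit.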

====

Item 11:

In subclause 6.5.2.1, change paragraph 3 from:

    The expression that specifies the width of a bit-field shall be
    an integral constant expression that has nonnegative value that
    shall not exceed the number of bits in an ordinary object of
    compatible type. If the value is zero, the declaration shall have
    no declarator.

to:

    The expression that specifies the width of a bit-field shall be
    an integral constant expression that has nonnegative value that
 |  shall not exceed the number of bits in an object of the type
 |  that would be specified if the colon and expression had been
 |  omitted. If the value is zero, the declaration shall have
    no declarator.

The current wording doesn't say *what* the type is compatible with.
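
As a sketch of the limiting case under the revised wording, the
bit-field below has a width equal to the number of bits in an object of
the type that remains when the colon and expression are omitted. The
names are hypothetical, and the exact round trip assumes that
/unsigned int/ has no padding bits.

```c
#include <limits.h>

struct widest {
    /* Width equal to the number of bits in an unsigned int object. */
    unsigned int all : sizeof(unsigned int) * CHAR_BIT;
};

/* Stores v in the full-width bit-field and reads it back. */
unsigned int through_bit_field(unsigned int v)
{
    struct widest w;
    w.all = v;
    return w.all;
}
```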

====

Item 12:

Subclause 6.5.2.2 allows an enumerated type (say /enum e/) to be
compatible with /long/ or even /unsigned long long/. On the other hand,
subclause 6.2.1.1 states that the type converts to /int/ or /unsigned
int/ as part of the integral promotions. This produces the apparent
contradiction that two compatible types promote differently!

There are two alternative approaches to solving this.

(A) Change subclause 6.5.2.2 paragraph 4 from:

    Each enumerated type shall be compatible with an integer type.
    The choice of type is implementation-defined, but shall be capable
    of representing the values of all the members of the enumeration.

to:

 |  Each enumerated type shall be compatible with one of the following
 |  types:
 |      signed char             unsigned char
 |      signed short            unsigned short
 |      signed int              unsigned int
    The choice of type is implementation-defined, but shall be capable
    of representing the values of all the members of the enumeration.

(B) Change subclause 6.2.1.1 paragraph 1 from:

    A /char/, a /short int/, or an /int/ bit-field, or their signed or
    unsigned versions, or an enumeration type, may be used in an
    expression wherever an /int/ or /unsigned int/ may be used. If an
    /int/ can represent all values of the original type, the value is
    converted to an /int/; otherwise, it is converted to an /unsigned
    int/. These are called the /integral promotions/.[37] All other
    arithmetic types are unchanged by the integral promotions.

to:

    A /char/, a /short int/, or an /int/ bit-field, or their signed or
 |  unsigned versions, may be used in an
    expression wherever an /int/ or /unsigned int/ may be used. If an
    /int/ can represent all values of the original type, the value is
    converted to an /int/; otherwise, it is converted to an /unsigned
 |  int/. These are called the /integral promotions/.[37]

 |  An enumeration type may be used in an expression wherever the type
 |  that it is compatible with may be used. The integral promotions
 |  cause the value to be converted in the same way as that compatible
 |  type would be.

    All other arithmetic types are unchanged by the integral promotions.

and in subclause 6.5.2.2, change the first sentence of paragraph 4 from:

    Each enumerated type shall be compatible with an integer type.

to:

 |  Each enumerated type shall be compatible with some signed or
 |  unsigned integral type.

[At present, enumerated types *are* integer types; the intent is to make
them clearly compatible with one of the 10 types named in 6.1.2.5.]
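
Whether an implementation has chosen a type other than /int/ for a
given enumerated type can be observed indirectly, as in the sketch
below. The enumeration and function names are hypothetical; the sketch
relies only on the existing rule that enumeration constants have type
/int/.

```c
#include <stddef.h>

enum colour { RED, GREEN, BLUE };

/* Size of the implementation-chosen type the enumeration is compatible with. */
size_t enumerated_type_size(void)
{
    return sizeof(enum colour);
}

/* Enumeration constants themselves have type int. */
size_t enumeration_constant_size(void)
{
    return sizeof RED;
}
```

If the two sizes differ, the enumerated type is compatible with
something other than /int/ or /unsigned int/, which is the case where
the promotion rules need to be clear.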

====

Item 13:

Change 6.5.7 paragraph 12 from:

    ... the first member of a union. ...

to:

    ... the first named member of a union. ...

[This isn't strictly necessary, but makes things clearer.]

====

Item 14:

Now that implicit int has been removed from the Standard, there is no
longer a good rationale for allowing functions with an object return
type to execute a return statement without an expression.

Change subclause 6.6.6.4 as follows. Append to the Constraints:

    A /return/ statement without an expression shall only appear in a
    function whose return type is /void/.

Change paragraph 2, last sentence, from:

    A function may have any number of /return/ statements, with and
    without expressions.

to:

    A function may have any number of /return/ statements.

There are two alternative approaches to the remainder of the changes
(the above changes are to be made in either case):

(A) Change 6.6.6.4 paragraph 4 from:

    If a /return/ statement without an expression is executed, and the
    value of the function call is used by the caller, the behaviour is
    undefined. Reaching the } that terminates a function is equivalent
    to executing a /return/ statement without an expression.

to:

    If the } that terminates a function is reached, and the value of the
    function call is used by the caller, the behaviour is undefined.

and change the last sentence of subclause 5.1.2.2.3 from:

    If the /main/ function executes a return that specifies no value,
    the termination status returned to the host environment is undefined.

to:

    If the /main/ function executes a return that specifies no value,
 |  the termination status returned to the host environment is unspecified.

[The concept of undefined value is carefully avoided elsewhere.]

(B) Delete 6.6.6.4 paragraph 4 entirely, and insert the following
Constraint at the end of 6.6.6.4:

    In a function whose return type is not /void/, the last statement
    before the terminating } shall have one of the following forms:
    - a /return/ statement with an expression;
    - a /goto/ statement;
    - a block in which the last statement before the terminating } is,
      recursively, one of these forms;
    - an /if/ statement with an /else/, in which each substatement is,
      recursively, one of these forms;
    - a /switch/ statement which is not the smallest enclosing /switch/
      or iteration statement of a /break/ statement, and in which the
      switch body is, recursively, one of these forms;
    - an iteration statement which is not the smallest enclosing
      /switch/ or iteration statement of a /break/ statement, and in
      which the controlling expression (/expression-2/ for a /for/
      statement) is, or is replaced by, a non-zero constant expression.

and delete the last sentence of subclause 5.1.2.2.3:

    If the /main/ function executes a return that specifies no value,
    the termination status returned to the host environment is undefined.
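
Two functions that I believe would satisfy the Constraint proposed in
alternative (B) are sketched below; the function names are
hypothetical. The first ends in an /if/ statement with an /else/ whose
substatements are all of the permitted forms; the second ends in a
/for/ statement whose omitted /expression-2/ is replaced by a non-zero
constant and which contains no /break/.

```c
#include <limits.h>

/* Last statement is an if with an else; every branch ends in a return. */
int sign_of(int x)
{
    if (x < 0)
        return -1;
    else if (x > 0)
        return 1;
    else
        return 0;
}

/* Last statement is a for loop with no controlling expression and no break. */
unsigned int power_of_two_at_least(unsigned int n)
{
    unsigned int p = 1;
    for (;;) {
        if (p >= n || p > UINT_MAX / 2)
            return p;
        p *= 2;
    }
}
```

In neither function can the terminating } be reached, so the question
of what value the caller sees never arises.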