ISO/IEC JTC1/SC22/WG14 N691


                    Document Number:  WG14 N691/X3J11 97-054

                        C9X Revision Proposal
                        =====================

Title: Representation of values
Author: Clive D.W. Feather
Author Affiliation: Demon Internet Ltd
Postal Address: 322 Regents Park Road, London N3 2QQ, UK
E-mail Address: clive@demon.net
Telephone Number: +44 181 371 1138
Fax Number: +44 181 371 1037
Date: 1997-05-20
Sponsor: BSI/WG14
Proposal Category:
   __ Editorial change/non-normative contribution
   XX Correction
   __ New feature
   __ Addition to obsolescent feature list
   __ Addition to Future Directions
   __ Other (please specify)  ______________________________
Area of Standard Affected:
   __ Environment
   XX Language
   __ Preprocessor
   __ Library
      __ Macro/typedef/tag name
      __ Function
      __ Header
   __ Other (please specify)  ______________________________
Prior Art: n/a
Target Audience: all

Related Documents (if any): none

Proposal Attached: Yes

Abstract:
The Standard is extremely terse on the subject of the representation of
values, and in particular how integers are represented. Nonetheless
there are a number of behaviours (such as the bitwise operators) which
depend on the representation. This proposal is intended to clarify these
matters.



Representation of values
========================


Abstract:

The Standard is extremely terse on the subject of the representation of
values, and in particular how integers are represented. Nonetheless
there are a number of behaviours (such as the bitwise operators) which
depend on the representation. This proposal is intended to clarify these
matters.

The basic approach is to define the representation of an object by
overlaying it on to an array of unsigned char, and then giving
properties for the bytes of the array. In most cases very little is
said, but the behaviour of integral types is spelled out in detail.


Details:

Add to the end of subclause 5.2.4.2.1:

    The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT.
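
As an illustration only (not part of the proposed wording), the
requirement can be checked on a given implementation with a sketch such
as the following. The function name is hypothetical. The check counts
the one bits of UCHAR_MAX rather than computing 2**CHAR_BIT directly, so
that it does not rely on any wider type being able to hold that value.

```c
#include <limits.h>

/* Returns 1 if UCHAR_MAX == 2**CHAR_BIT - 1, that is, if UCHAR_MAX
   consists of exactly CHAR_BIT one bits; returns 0 otherwise. */
int uchar_max_matches_char_bit(void)
{
    unsigned long v = UCHAR_MAX;
    int bits = 0;

    while (v != 0) {
        if ((v & 1UL) == 0)
            return 0;       /* a zero bit below the top: not 2**k - 1 */
        v >>= 1;
        bits++;
    }
    return bits == CHAR_BIT;
}
```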

Replace 6.1.2.5 paragraph 16 by:

    The type char, the signed and unsigned integer types, the integer
    bitfield types, and the enumerated types are collectively called
    /integer types/. The term /integral/ is equivalent to /integer/ in
    this International Standard.

Delete footnote 25.

Add a new subclause 6.1.2.7:

    6.1.2.7  Representations of types.

    The representations of all types are unspecified except as stated
    in this subclause.

    6.1.2.7.1  General.

    Values of type unsigned char shall be represented using a pure
    binary notation [*].

    [*] A positional representation for integers ... [existing FN 25]. A
    byte contains CHAR_BIT bits, and the values of type unsigned char
    range from 0 to 2**CHAR_BIT-1.

    When stored in objects of any other object type, values of that type
    consist of N*CHAR_BIT bits, where N is the size of objects of that
    type, in bytes. The value may be copied into an object of type
    /unsigned char [N]/ (e.g. by memcpy); the resulting set of bytes is
    called the /object representation/ of the value. Two values with the
    same object representation shall compare equal, but values that
    compare equal might have different object representations.
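
A minimal sketch of what "object representation" means in practice,
assuming nothing beyond memcpy and memcmp from <string.h>. The function
names are hypothetical. The bytes of an int are copied out into an
array of unsigned char and back again; the restored object has the same
object representation and therefore compares equal to the original.

```c
#include <string.h>

/* Returns 1 if the n bytes at a and at b are identical, i.e. the two
   objects have the same object representation. */
int same_object_representation(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}

/* Copies the object representation of value into an array of
   unsigned char, then rebuilds an int from those bytes. */
int round_trip_through_bytes(int value)
{
    unsigned char bytes[sizeof(int)];
    int restored;

    memcpy(bytes, &value, sizeof value);        /* value -> bytes */
    memcpy(&restored, bytes, sizeof restored);  /* bytes -> value */
    return restored;
}
```

The converse test is deliberately not offered: a bytewise mismatch does
not show that two values differ, since values that compare equal might
have different object representations.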

    Certain object representations might not represent a value of that
    type. If such a representation is accessed due to evaluation of an
    object, or if such a representation is produced by a side effect
    that stores into all or any part of the object using an lvalue of
    that type, then the behaviour is undefined [*]. Such representations
    are called /trap representations/.

    [*] Thus an automatic variable can be initialized to a trap
    representation without causing undefined behaviour, but the value
    of the variable cannot be used until a proper value is stored in it.

    When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object representation
    that correspond to any padding bytes take unspecified values [*].
    The values of padding bytes shall not affect whether the value of
    such an object is a trap representation. Those bits of a structure
    or union object that are in the same byte as a bitfield member, but
    are not part of that member, shall similarly not affect whether the
    value of such an object is a trap representation.

    [*] Thus structure assignment may be implemented element-at-a-time
    or via memcpy.
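
One practical consequence, sketched below under the assumption of a
hypothetical structure type with likely padding after its first member:
because padding bytes take unspecified values, two structures should be
compared member by member rather than with memcmp.

```c
#include <string.h>

/* Hypothetical structure; most implementations insert padding bytes
   between tag and count. */
struct sample {
    char tag;
    long count;
};

/* Member-wise comparison: not affected by the padding bytes. */
int sample_equal(const struct sample *a, const struct sample *b)
{
    return a->tag == b->tag && a->count == b->count;
}

/* Bytewise comparison: also compares any padding bytes, whose values
   are unspecified, so it may report a difference between two
   structures whose members are all equal. */
int sample_same_bytes(const struct sample *a, const struct sample *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

After the assignment b = a, sample_equal(&a, &b) is true however the
assignment is implemented; sample_same_bytes(&a, &b) need not be.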

    When a value is stored in a member of an object of union type, the
    bytes of the object representation that do not correspond to that
    member but do correspond to other members take unspecified values,
    but the value of the union object shall not thereby become a trap
    representation.

    Where an operator is applied to a value whose object representation 
    includes padding bits but which is not a trap representation, the
    operator shall ignore those bits for the purpose of determining the
    value of the result. If the result is stored in an object that has
    padding bits, it is unspecified how those padding bits are generated
    - they might not be related to the padding bits of the operands -
    but a trap representation shall not be generated.

    6.1.2.7.2  Integral types.

[[Editorial note: the following wording assumes that integral types can
have "illegal" values. This is a conservative assumption. An alternative
would be to require all object representations to be acceptable, with
padding bits ignored.]]

    For unsigned integral types other than /unsigned char/, the bits of
    the object representation shall be divided into two groups: value
    bits and padding bits (there need not be any of the latter). If
    there are N value bits, each bit shall represent a different power
    of 2 between 1 and 2**(N-1), so that objects of that type shall be
    capable of representing values from 0 to 2**N-1 using a pure binary
    representation; this shall be known as the value representation. The
    values of any padding bits are unspecified [*].

    [*] Some combinations of padding bits might generate trap
    representations; for example, if one padding bit is a parity bit.
    Nonetheless, no arithmetic operation on valid values can generate a
    trap representation other than as part of an exception such as an
    overflow, and this cannot occur with unsigned types.
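
The split into value bits and padding bits can be observed from within
a program. The following is a sketch only, with hypothetical function
names: since UINT_MAX is 2**N-1 for N value bits, counting the bits of
UINT_MAX gives N, and whatever remains of the sizeof(unsigned int) *
CHAR_BIT bits of the object representation is padding.

```c
#include <limits.h>

/* Number of value bits in unsigned int: UINT_MAX is 2**N - 1, so it
   takes exactly N right shifts by one to reduce it to zero. */
int uint_value_bits(void)
{
    unsigned int v = UINT_MAX;
    int n = 0;

    while (v != 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Number of padding bits: bits of the object representation that are
   not value bits. Zero on most implementations. */
int uint_padding_bits(void)
{
    return (int)(sizeof(unsigned int) * CHAR_BIT) - uint_value_bits();
}
```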

    For signed integral types, the bits of the object representation
    shall be divided into three groups: value bits, padding bits, and
    the sign bit. There need not be any padding bits; there shall be
    exactly one sign bit. Each bit that is a value bit shall have the
    same value as the same bit in the object representation of the
    corresponding unsigned type (if there are M value bits in the signed
    type and N in the unsigned type, then M <= N). If the sign bit is
    zero, it shall not affect the resulting value. If the sign bit is
    one, then the value shall be modified in one of the following ways:
    - the corresponding value with sign bit 0 is negated;
    - the sign bit has some value between -1 and -2**N inclusive.
    The values of any padding bits are unspecified [*, same as above].
    A valid (non-trap) object representation of a signed integral type
    where the sign bit is zero is a valid object representation of the
    corresponding unsigned type, and shall represent the same value.
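
The last sentence above can be illustrated as follows. This is a sketch
with a hypothetical function name; it relies on int and unsigned int
having the same size, and the caller is assumed to pass only
non-negative values, for which the sign bit is zero.

```c
#include <string.h>

/* Reinterprets the object representation of a non-negative int as an
   unsigned int. Under the proposed wording the result has the same
   value as the argument. The caller must ensure value >= 0. */
unsigned int reinterpret_nonnegative(int value)
{
    unsigned int u;

    memcpy(&u, &value, sizeof u);   /* copy bytes, no conversion */
    return u;
}
```

No corresponding guarantee is illustrated for negative values, because
the effect of the sign bit depends on which of the permitted schemes
the implementation uses.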

    Bitfield types shall have no padding bits; an N-bit bitfield shall
    have N value bits if treated as unsigned, and N-1 value bits plus a
    sign bit if treated as signed.
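
For example, with N = 3 an unsigned bitfield has three value bits and
holds 0 to 7, while a signed bitfield has two value bits plus a sign
bit and holds at least -3 to 3. The structure and function names below
are hypothetical, chosen only for this sketch.

```c
struct fields {
    unsigned int u : 3;   /* 3 value bits: 0 .. 7 */
    signed int   s : 3;   /* 2 value bits plus a sign bit */
};

/* Stores v in the unsigned 3-bit field and reads it back; values out
   of range are reduced modulo 2**3 by the conversion. */
unsigned int store_unsigned_3(unsigned int v)
{
    struct fields f;

    f.u = v;
    return f.u;
}

/* Stores v in the signed 3-bit field and reads it back. Only values
   from -3 to 3 are representable under every permitted scheme. */
int store_signed_3(int v)
{
    struct fields f;

    f.s = v;
    return f.s;
}
```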

Modify subclause 6.3.7 (the << and >> operators) as follows:

* In paragraph 3, change "the width in bits" to "the number of value and
sign bits in the object representation".

Insert a new paragraph between paragraphs 3 and 4:

    The shift shall be done in terms of the value, irrespective of the
    object representation. The value bits are arranged in order of
    magnitude, with the bit with value 1 at the right hand end. If the
    type of the promoted left operand is signed, the sign bit shall take
    part in the shift, and is placed immediately to the left of the
    value bits. A shift does not overflow, and a trap representation
    shall not be generated.
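
"In terms of the value" means that, for unsigned operands, a shift is
equivalent to repeated multiplication or division by 2, whatever order
the bytes occupy in storage. The following sketch, with hypothetical
function names, checks that equivalence. The caller is assumed to pass
a shift count smaller than the number of value bits of unsigned int.

```c
/* Returns 1 if x << n equals x multiplied by 2, n times over, with
   the usual modulo reduction of unsigned arithmetic. */
int left_shift_is_multiply(unsigned int x, unsigned int n)
{
    unsigned int product = x;
    unsigned int i;

    for (i = 0; i < n; i++)
        product *= 2u;      /* unsigned arithmetic wraps, never traps */
    return (x << n) == product;
}

/* Returns 1 if x >> n equals x divided by 2, n times over,
   discarding remainders. */
int right_shift_is_divide(unsigned int x, unsigned int n)
{
    unsigned int quotient = x;
    unsigned int i;

    for (i = 0; i < n; i++)
        quotient /= 2u;
    return (x >> n) == quotient;
}
```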

Change the last sentence of paragraph 5 to:

    If E1 has a signed type and a negative value, the values of the sign
    bit and of the E2-1 most significant value bits are
    implementation-defined.
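
Since the result of right-shifting a negative value is
implementation-defined, code that needs a particular result has to
avoid applying >> to negative operands. One possible approach, offered
as a sketch with a hypothetical function name, computes the floor of
x / 2**n by shifting only non-negative values. The caller is assumed to
pass n smaller than the number of value bits of int.

```c
/* Returns the floor of x / 2**n without ever right-shifting a
   negative value. For x < 0, -(x + 1) is non-negative and cannot
   overflow, and floor(x / 2**n) == -((-(x + 1)) >> n) - 1. */
int floor_shift_right(int x, unsigned int n)
{
    if (x >= 0)
        return x >> n;
    return -((-(x + 1)) >> n) - 1;
}
```

For instance, floor_shift_right(-7, 1) yields -4 on every
implementation, whereas -7 >> 1 may yield different results on
different implementations.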

Note: there is no need to make any changes in the & | and ^ operators,
which act on the corresponding bits from the two operands. If these had
different types initially, the integral promotions will bring them to
the same type, clearly defining how the values are affected. Padding
bits cannot alter this; nor can trap representations be generated.