Document SC22/WG14/N669 (X3J11/97-032)

Comments on N641 and an outline proposal for secondary integral types.

Clive D.W. Feather

ABSTRACT

This paper consists of two parts. The first is an informal critique of N641, explaining what I think is wrong with it. The second is an outline proposal for a concept of "secondary integral type". If there is interest in it, I can expand it to a formal proposal.

N641, AND WHAT'S WRONG WITH IT

I have to completely disagree with Randy's position on this paper. Randy summarizes his position as "that the previous interpretation has bad consequences, that it is unreasonable, and the revised interpretation is a reasonable alternative". I think he's wrong in all three.

In section 2 of his paper, Randy says: "The previous interpretation probably renders non-conforming any implementation with an extension type." This is purely and simply wrong. To introduce an extension type into a program, the programmer must do something that is not strictly conforming (include a non-Standard header, or use a typedef like __int16). Once this has happened, all bets are off. In particular, provided *one* diagnostic like "non-Standard type name seen" has been generated, there is no need to generate a diagnostic every time it's used. So this claim looks awfully like scaremongering to me.

In section 3, he talks about "a fairly artificial distinction between the Standard types and extension types". Despite the waffle about Zen koans, the distinction is that between the Standard and any extension. Such types are a form of extension that is not strictly compatible with the Standard (just like long long used to be). Once a diagnostic (only one, note) has appeared, there's no problem with using them. It is not "wrong to shift this implementation extension type, or add it, or ..."; it is wrong to use the type and believe that you are still strictly conforming.

In section 4, he first (correctly) shows that extra types don't affect strictly conforming programs, but then claims this means that "there is no harm in letting the extension type also be a member of a Standard type category". Here he makes the mistake that first caused me to submit DR 067: treating random other types as being integral types adds a large amount of semantic baggage to them. Allowing __uint16 to be an integral type *automatically* allows it to be used for size_t, and this then produces a whole range of undescribed behaviour in an apparently strictly conforming program. This looks to me like begging the question.

Finally, in section 4.2, he appears to assume that the only significant problem is the "biggest type" issue. To address his numbered points:

(1) The Standard never discusses integer-holding types larger than unsigned long. Therefore, other than on esoteric systems (and *all* systems using long long were esoteric at the time as far as I was concerned) it is the largest type, and it is the largest type I was ever likely to come across. Thus it *is* a useful interpretation.

(2) This is a major argument against long long.

(3) Sloppy programmers abound. When unsigned long was the largest possible type, we careful programmers at least had a workable idiom. Randy (and long long) takes it away from us (though I'm going to submit a separate proposal on this).

(5) I disagree that inttypes.h is cleaner.

Randy totally ignores the other issues with using unknown types. For example, what are the relevant promotion and conversion rules? What is the result of (sizeof(V)+0)? Things like that.

However, this message is not intended solely as an attack on N641. Curiously enough, I agree with the basic ideas behind it - I just feel they've been handled badly. So here I present a rough set of ideas which can be worked up into a formal proposal if people are actually interested.

PROPOSAL - SECONDARY INTEGRAL TYPES

Introduce a concept of "secondary integral type".
A secondary integral type is a type which has the basic properties of integral types, but is not one of the types so-named in the Standard. The secondary integral types provided by an implementation are implementation-defined.

The following is intended to be an *exhaustive* list of areas that need to be addressed, and what the action is.

[6.1.2.5]

(1) SITs always appear in pairs - a signed and an unsigned version. The two members of the pair have the same storage and alignment requirements, the representation of the common range of values is the same, MAX_U_type >= MAX_S_type, and arithmetic in the unsigned type is always modulo MAX_U_type+1.

(2) The range of values of all SITs is at least as great as that of [un]signed char, and is no more than [u]intmax_t.

(3) SITs are integral types (and are each either signed integral types or unsigned integral types) and types such as wchar_t, size_t, and ptrdiff_t can be SITs. SITs use pure binary notation. They take part in the derivation and qualification processes in the normal way.

[6.1.2.6]

(4) Two different SITs are never compatible types, nor are they ever compatible with the primary integral types.

[6.1.3.2]

(5) Integer constants can only be SITs if they have a larger value than can be represented by unsigned long long.

[6.2.1.1]

(6) SITs divide into "small" SITs and "large" SITs. A SIT is "small" if MAX_S_type <= INT_MAX and MAX_U_type <= UINT_MAX, and "large" otherwise. The corresponding signed and unsigned types of a pair shall always be both small or both large.

(7) Small SITs may be used in an expression wherever an int or unsigned int can. The integral promotions apply to them in the same way. Large SITs are unaffected by the integral promotions.

[6.2.1.2]

(8) With four (or now five) named types, the concept of "larger" is obvious. Adding SITs requires a rethink. I propose:

- Corresponding signed and unsigned types are "the same size".
- If two types of the same signedness have different maximum representable values, then the "larger" type is the one with the greater maximum.
- If two types have the same maximum representable value, then:
  * if both are primary types, use the "natural" order;
  * a SIT is "larger" than char and short, but "smaller" than int, long, and long long;
  * two SITs have an implementation-defined order which must be transitive among SITs of the same maximum representable value.
  [I don't see why the last two cases should happen, but let's play safe.]

The number of bytes occupied by an object of a smaller type is less than or equal to the number occupied by an object of larger type.

(8) The rules of 6.2.1.2 apply, except that the concepts of "size" should be read as "maximum value that can be represented" (a change that is worth making anyway).

[6.2.1.7]

(9) The question of the usual arithmetic conversions is the knottiest one. The following approach seems the cleanest to me - it is compatible with the present wording, it is fairly easy to explain, and it works even with SITs larger than long long.

Replace the text following:

    Otherwise the integral promotions are performed on both operands. Then the following rules are applied:

with:

    If both operands have signed types, or the operand with the larger type has an unsigned type, the operand with the smaller type is converted to the type of the other operand.

    If one operand has a signed type and the other has the corresponding unsigned type, the former is converted to the type of the latter.

    Otherwise, if the larger (signed) type can represent all the values of the smaller (unsigned) type, the operand with the smaller type is converted to the larger type.

    Otherwise both operands are converted to the unsigned type corresponding to the larger (signed) type.

[6.3 onwards]

(10) An expression required to have an integral type may be a (large) SIT, and behaves like any other integral type.

(11) 6.3.7 paragraph 4 will require rewording.
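
As an illustration only, the replacement wording proposed in item (9) can be modelled as a small C function that picks the common type of two already-promoted operands. Everything named here is an assumption made for the sketch and not part of the proposal: the itype descriptor, the rank field (which stands in for the "larger" ordering of [6.2.1.2], with a signed type and its corresponding unsigned type sharing one rank), the 32-bit int and long, and the invented 48-bit SIT pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor for an integral type; not taken from the paper. */
typedef struct {
    const char *name;
    bool is_signed;
    uintmax_t max;   /* maximum representable value */
    int rank;        /* place in the "larger" ordering; a signed type and
                        its corresponding unsigned type share a rank */
    int partner;     /* index of the corresponding type of other signedness */
} itype;

enum { T_INT, T_UINT, T_LONG, T_ULONG, T_S48, T_U48, T_COUNT };

/* Assumed implementation: 32-bit int and long, plus one invented
   48-bit large SIT pair. */
static const itype types[T_COUNT] = {
    [T_INT]   = { "int",              true,  UINTMAX_C(0x7FFFFFFF),     1, T_UINT  },
    [T_UINT]  = { "unsigned int",     false, UINTMAX_C(0xFFFFFFFF),     1, T_INT   },
    [T_LONG]  = { "long",             true,  UINTMAX_C(0x7FFFFFFF),     2, T_ULONG },
    [T_ULONG] = { "unsigned long",    false, UINTMAX_C(0xFFFFFFFF),     2, T_LONG  },
    [T_S48]   = { "signed __int48",   true,  UINTMAX_C(0x7FFFFFFFFFFF), 3, T_U48   },
    [T_U48]   = { "unsigned __int48", false, UINTMAX_C(0xFFFFFFFFFFFF), 3, T_S48   },
};

/* Common type of two operands that have already undergone the
   integral promotions, following the wording proposed in item (9). */
static int common_type(int a, int b)
{
    const itype *ta = &types[a], *tb = &types[b];

    if (a == b)
        return a;

    /* Corresponding signed/unsigned pair: neither is "larger", and the
       signed operand is converted to the unsigned type. */
    if (ta->rank == tb->rank) {
        assert(ta->is_signed != tb->is_signed);
        return ta->is_signed ? b : a;
    }

    int big = (ta->rank > tb->rank) ? a : b;
    int small = (big == a) ? b : a;

    /* Both signed, or the larger type is unsigned: convert to the larger. */
    if ((ta->is_signed && tb->is_signed) || !types[big].is_signed)
        return big;

    /* Larger type is signed and smaller is unsigned: use the larger type
       if it can hold every value of the smaller one ... */
    if (types[big].max >= types[small].max)
        return big;

    /* ... otherwise use the unsigned type corresponding to the larger. */
    return types[big].partner;
}
```

Under these assumptions, common_type(T_UINT, T_LONG) gives unsigned long, as the present rules do when long cannot hold every unsigned int value, while common_type(T_ULONG, T_S48) gives the signed 48-bit type because it can represent every unsigned long value.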

[6.5.2]

(12) There are no type specifiers that explicitly generate SITs. An implementation may define typedef names, additional type specifiers that might be combined with existing ones, or additional combinations of existing type specifiers, that specify SITs. All such additional identifiers shall be in an appropriate reserved namespace or require inclusion of a non-Standard header.

My intent here is to allow an implementation to provide concepts like

    signed __int24
    unsigned __int24
    long char
    int { 32767 }
    int __atleast __bigendian : 12

as well as simple typedefed names.

[6.8.1]

(13) All SITs smaller than long long shall be treated as long long in preprocessor arithmetic. Alternatively, require preprocessor arithmetic to use [u]intmax_t.

[7.4]

(14) All the types provided may be SITs. However, if there is a primary integral type that meets the criterion, it must be used instead of a SIT. The implementation is not required to provide any SITs for these purposes.

(15) Reiterate that no SIT can be larger than [u]intmax_t. However, the latter might be SITs.

--
Clive D.W. Feather    | Associate Director  | Director
Tel: +44 181 371 1138 | Demon Internet Ltd. | CityScape Internet Services Ltd.
Fax: +44 181 371 1037 |                     |