Some constants are literally literals

Jens Gustedt, INRIA and ICube, France

2023-12-13

document history

document number | date   | comment
n3189           | 202312 | this paper, original proposal

license

CC BY, see https://creativecommons.org/licenses/by/4.0

1 Introduction and Overview

The C standard differentiates two terms that have surprisingly different meanings:

This even leads to confusion in the standard itself, because the definitions of sizeof and alignof cyclically refer to the definitions of “integer constant” and “integer constant expression”. In other places the term “integer constant” is seemingly used with a meaning different from its definition, namely where “constant of integer type” would be the appropriate term.

The goal of this paper is to rename the terms “integer constant”, “floating constant” and “character constant” to “integer literal”, “floating literal” and “character literal”. A table summarizing these systematic changes can be found towards the end.

We also rely on some terminology changes introduced by paper n3187, in particular the change to speak of “constant sizeof expression”. This can evidently be changed if that proposal is not adopted.

2 The text for constant expressions and arithmetic expressions is confusing

This is the case because the long list of cases mixes cases that talk about lexical concepts with others that talk about semantic properties of certain subexpressions. We propose to make these distinctions clearer by talking consistently about “literals” when we address a lexical feature, and about “constants” as a semantic concept that is attached to certain grammatical entities, but which is in general not deducible from local syntactic features.

We propose to make the following changes.

p8 An integer constant expression130) shall have integer type and shall only have operands that are integer [-constants-]{+literals+}, named constants, and compound literal constants of integer type, character [-constants-]{+literals+}, constant sizeof expressions whose results are integer constants, alignof expressions, and floating literals, named constants, or compound literal constants of arithmetic type that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the typeof operators, sizeof operator, or alignof operator.

p10 An arithmetic constant expression shall have arithmetic type and shall only have operands that are integer [-constants-]{+literals+}, floating [-constants-]{+literals+}, named constants or compound literal constants of arithmetic type, character [-constants-]{+literals+}, constant sizeof expressions whose results are integer constants, and alignof expressions. Cast operators in an arithmetic constant expression shall only convert arithmetic types to arithmetic types, except as part of an operand to the typeof operators, sizeof operator, or alignof operator.
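The distinction between the lexical and the semantic notion can be illustrated with a small sketch. All identifiers below are hypothetical and chosen for illustration only. Each array bound must be an integer constant expression, but only the first is a single integer literal.

```c
#include <stddef.h>

/* Every bound below is an integer constant expression. Only the
   first consists of a single integer literal; the others are
   constants in the semantic sense, formed from several tokens. */
char from_literal[42];                /* integer literal               */
char from_char['\0' + 3];             /* character literal as operand  */
char from_sizeof[sizeof(int)];        /* sizeof expression             */
char from_cast[(int)2.5];             /* floating literal that is the
                                         immediate operand of a cast   */
char from_alignof[_Alignof(double)];  /* alignof expression            */

/* -1 is not a literal: it is unary minus applied to the literal 1. */
_Static_assert(-1 < 0, "operator applied to an integer literal");

size_t bound_from_literal(void) { return sizeof from_literal; }
size_t bound_from_char(void)    { return sizeof from_char; }
size_t bound_from_cast(void)    { return sizeof from_cast; }
```

Whether a token sequence is a literal can be decided by the lexer alone, whereas recognizing the other bounds as constants requires type information.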

3 Enumerations

In 6.2.5

p21 An enumeration comprises a set of named [-integer constant values-]{+constants of integer type+}. …
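As a brief illustration, with hypothetical names: enumeration constants are identifiers, not literals, yet they are constants of integer type and can appear wherever an integer constant expression is required.

```c
/* red, green and blue are identifiers, not literals. */
enum color { red, green = 5, blue };  /* blue has the value 6 */

/* An enumeration constant is usable in an integer constant expression. */
_Static_assert(blue == 6, "enumeration constant in a constant expression");
char per_color[blue + 1];

int blue_value(void) { return blue; }
unsigned long table_size(void) { return (unsigned long)sizeof per_color; }
```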

4 Preprocessing

The current text of 6.4.8 (Preprocessing numbers) misses that some pp-number tokens are already interpreted as integer literals during preprocessing itself.

3 Preprocessing number tokens lexically include all floating and integer [-constant tokens-]{+literals+}.

Semantics

4 A preprocessing number does not have type or a value; it acquires both after a successful conversion (as part of translation phase 7) to a floating [-constant token-]{+literal+} or an integer [-constant token-]{+literal+}. This notwithstanding, for the evaluation of expressions within conditional source inclusion (6.10.1) and to determine the limit parameter for binary resource inclusion (6.10.3), preprocessing numbers shall have the form of an integer literal and are interpreted as such. For the determination of a line number in a #line directive (6.10.5), digit sequences that also match the requirements for a preprocessing number are interpreted as numbers as well, only that the interpretation is of a decimal integer, even if the leading digit is 0.
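The two interpretations described here can be observed with the same spelling 010. The following is a minimal sketch; the enumeration and function names are hypothetical.

```c
/* In #if, the pp-number 010 is interpreted as an integer literal,
   so the leading 0 makes it octal and its value is 8. */
#if 010 == 8
enum { if_sees_octal = 1 };
#else
enum { if_sees_octal = 0 };
#endif

/* In #line, the digit sequence 010 is read as a decimal integer
   despite the leading 0: the next source line gets number 10. */
#line 010
enum { line_after_directive = __LINE__ };

int if_reads_octal(void)     { return if_sees_octal; }
int line_reads_decimal(void) { return line_after_directive; }
```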

5 Character literals

Already in the existing text the term “integer character constant” is a misnomer, because in fact all character constants have integer type. With the proposed changes “integer character constant” would now become “integer character literal”, which is equally confusing. We propose to change this term to “simple character literal”.

In many places, the term “integer character constant” is currently used as if it also included UTF-8 character constants/literals. In fact, it seems that the text that speaks of “integer character constant” was not properly updated when UTF-8 character literals were introduced. Therefore we propose to introduce a new term “narrow character literal” that comprises these two cases.

A [-integer-]{+simple+} character [-constant-]{+literal+} is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'. A UTF-8 character [-constant-]{+literal+} is the same, except prefixed by u8. Together simple and UTF-8 character literals are narrow character literals. A wchar_t character [-constant-]{+literal+} is prefixed by the letter L. A UTF-16 character [-constant-]{+literal+} is prefixed by the letter u. A UTF-32 character [-constant-]{+literal+} is prefixed by the letter U. Collectively, wchar_t, UTF-16, and UTF-32 character [-constants-]{+literals+} are called wide character [-constants-]{+literals+}. …
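The following sketch shows the five spellings side by side. The function names are hypothetical. It assumes that the UTF-8 form is only available in C23 mode and that it then has a one-byte type, and therefore guards that check with a version test.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* wchar_t */

/* A simple character literal has type int in C. */
_Static_assert(sizeof('x') == sizeof(int), "simple character literal");

/* Wide character literals, distinguished by their prefix. */
_Static_assert(sizeof(L'x') == sizeof(wchar_t), "wchar_t literal");
_Static_assert(sizeof(u'x') * CHAR_BIT >= 16, "UTF-16 literal");
_Static_assert(sizeof(U'x') * CHAR_BIT >= 32, "UTF-32 literal");

/* UTF-8 character literal; assumed here to need a C23 compiler mode. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ > 201710L
_Static_assert(sizeof(u8'x') == 1, "UTF-8 literal");
#endif

int simple_literal_is_int(void) { return _Generic('x', int: 1, default: 0); }
size_t wide_literal_size(void)  { return sizeof(L'x'); }
```

All of these literals have integer type, which is why the qualifier “integer” does not single out the unprefixed form.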

6 Macros in <stdint.h>

There are inconsistencies in 7.22.4 that seem to stem from a confusion of the concepts of “integer constant” and “integer constant expression”.

7.22.4 Macros for [-integer constants-]{+constants of integer type+}

1 The following function-like macros expand to integer constant expressions suitable for initializing objects that have integer types corresponding to types defined in <stdint.h>. Each macro name corresponds to a similar type name in 7.22.1.2 or 7.22.1.5.

2 The argument in any instance of these macros shall be an unsuffixed integer [-constant-]{+literal+} (as defined in 6.4.4.1) with a value that does not exceed the limits for the corresponding type.

3 …

7.22.4.1 Macros for minimum-width [-integer constants-]{+constants of integer type+}

7.22.4.2 Macros for greatest-width [-integer constants-]{+constants of integer type+}
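A short sketch of how these macros are typically used, with hypothetical names. The argument is an unsuffixed integer literal, while the expansion is an integer constant expression that need not be a single literal token (implementations commonly paste a suffix or add a cast).

```c
#include <stdint.h>

/* The expansion is an integer constant expression, so it can
   initialize an enumeration constant ... */
enum { small = INT8_C(5) };

/* ... appear in a static assertion ... */
_Static_assert(INT64_C(9223372036854775807) > INT32_C(2147483647),
               "expansions are integer constant expressions");

/* ... and be evaluated in a #if directive. */
#if INT32_C(70000) > 65535
enum { usable_in_if = 1 };
#else
enum { usable_in_if = 0 };
#endif

int small_value(void)         { return small; }
int macro_usable_in_if(void)  { return usable_in_if; }
int_least64_t big_value(void) { return INT64_C(9223372036854775807); }
```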

7 Changing terminology

First apply the changes as indicated above. Then apply the following replacements, in the order shown:

C23                        | C2y                      |
integer character constant | simple character literal | as above and in 6.4.4.4 p11 and p18
integer character constant | narrow character literal | otherwise
character constant         | character literal        | otherwise
integer constant           | integer literal          | other than in integer constant expression
floating constant          | floating literal         |