Some constants are literally literals

Jens Gustedt, INRIA and ICube, France

2024-04-14

document history

document number   date     comment
n3189             202310   original proposal
n3239             202404   this paper
                           → ordinary {character¦string} literal
                           precise lists of changes
                           proposed git branch for LaTeX sources

license

CC BY, see https://creativecommons.org/licenses/by/4.0

1 Introduction and Overview

The C standard differentiates two terms that have surprisingly different meanings:

This even leads to confusion in the standard itself, because the definitions of sizeof and alignof cyclically refer to the definitions of “integer constant” and “integer constant expression”. In other places the term “integer constant” is seemingly used with a meaning different from its definition, namely where speaking of a constant of integer type would be appropriate.

The goal of this paper is to rename the terms “integer constant”, “floating constant”, “character constant” to “integer literal”, “floating literal” and “character literal”. A table summarizing these systematic changes can be found towards the end.

2 The text for constant expressions and arithmetic expressions is confusing

The text is confusing because its long list of cases mixes lexical concepts with semantic properties of certain subexpressions. We propose to make these distinctions clearer by consistently using “literals” when we address a lexical feature, and “constants” for a semantic concept that is attached to certain grammatical entities but is in general not deducible from local syntactic features.
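As an illustration of this distinction, the following minimal sketch contrasts a literal, which is recognizable from its spelling alone, with constants, whose status depends on a declaration elsewhere. It is not part of the proposed wording, and all identifiers (width, height, grid, grid_length, grid_height) are invented for the example.

```c
#include <assert.h>  /* provides static_assert as a macro before C23 */
#include <stddef.h>  /* size_t */

/* `8` is an integer literal: its category follows from the spelling alone. */
enum { width = 8 };           /* `width` is a named constant of integer type */
static const int height = 8;  /* `height` is a const-qualified object, not a constant */

/* An array length at file scope needs an integer constant expression.
   `width` qualifies; `height` does not, although both are spelled as
   plain identifiers, so using `height` here would not be portable. */
int grid[width];

/* Both operands are constants, but only `8` is a literal. */
static_assert(width == 8, "enumeration constant has the value of its literal");

/* Illustrative helpers (hypothetical names). */
size_t grid_length(void) { return sizeof grid / sizeof grid[0]; }
int grid_height(void) { return height; }
```

Whether an identifier such as width denotes a constant cannot be decided from the token itself; the declaration has to be consulted, which is what is meant by "not deducible from local syntactic features".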

We propose to make the following changes.

p8 An integer constant expression130) shall have integer type and shall only have operands that are integer [-constants-]{+literals+}, named constants, and compound literal constants of integer type, character [-constants-]{+literals+}, sizeof expressions whose results are integer [-constants-]{+constant expressions+}, alignof expressions, and floating literals, named constants, or compound literal constants of arithmetic type that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the typeof operators, sizeof operator, or alignof operator.

p10 An arithmetic constant expression shall have arithmetic type and shall only have operands that are integer [-constants-]{+literals+}, floating [-constants-]{+literals+}, named constants or compound literal constants of arithmetic type, character [-constants-]{+literals+}, sizeof expressions whose results are integer [-constants-]{+constant expressions+}, and alignof expressions. Cast operators in an arithmetic constant expression shall only convert arithmetic types to arithmetic types, except as part of an operand to the typeof operators, sizeof operator, or alignof operator.
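The operand kinds listed in p8 and p10 can be sketched as follows. This is only an illustration under the assumption of a C11 or later compiler; the names three_quarters and scaled are hypothetical.

```c
#include <assert.h>  /* provides static_assert as a macro before C23 */

/* Integer constant expressions (p8). */
static_assert(10 + 0x10 == 26, "integer literals as operands");
static_assert('a' - 'a' == 0, "character literals as operands");
static_assert(sizeof(char) == 1, "sizeof expression as operand");
static_assert((int)3.75 == 3, "floating literal as immediate operand of a cast");

/* Arithmetic constant expression (p10): floating literals may appear as
   operands directly, so the expression can initialize a static object.
   By contrast, (int)(1.5 + 2.25) is not an integer constant expression in
   the strict sense, because the floating literals are not the immediate
   operands of the cast; some compilers accept it as an extension. */
static const double three_quarters = 1.5 / 2.0;

/* Illustrative helper (hypothetical name). */
double scaled(int n) { return n * three_quarters; }
```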

3 Preprocessing

The current text of 6.4.8 (Preprocessing numbers) misses that some pp-number tokens are already interpreted as integer literals during preprocessing itself.

3 Preprocessing number tokens lexically include all floating and integer [-constant tokens-]{+literals+}.

Semantics

4 A preprocessing number does not have type or a value; it acquires both after a successful conversion (as part of translation phase 7) to a floating [-constant token-] or integer [-constant token-]{+literal+}. This notwithstanding, for the evaluation of expressions within conditional source inclusion (6.10.1) and to determine the limit parameter for binary resource inclusion (6.10.3), preprocessing numbers have the form of an integer literal and are interpreted as such. For the determination of a line number in a #line directive (6.10.5), digit sequences that also match the requirements for a preprocessing number are interpreted as numbers as well, except that the interpretation is that of a decimal integer, even if the leading digit is 0.
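The two interpretations during preprocessing can be observed with a small sketch. The enumeration names are invented for the example, and binary resource inclusion is left out because it needs a C23 compiler. Some compilers emit a warning for the leading 0 in the #line directive.

```c
#include <assert.h>

/* In #if, the pp-number 010 is interpreted as an integer literal,
   so the leading 0 makes it octal: 010 == 8. */
#if 010 == 8
enum { if_reads_octal = 1 };
#else
enum { if_reads_octal = 0 };
#endif

/* In #line, the same digit sequence is read as a decimal integer:
   the source line following the directive is numbered 10, not 8. */
#line 010
enum { line_after_directive = __LINE__ };
```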

4 Character literals

Already in the existing text, the term “integer character constant” is a misnomer, because in fact all character constants have integer type. With the proposed changes, “integer character constant” would become “integer character literal”, which is equally confusing. In alignment with C++, we propose to change this term to “ordinary character literal”.

In some occurrences, the term “integer character constant” is currently used as if it could also include UTF-8 character constants/literals. Therefore we propose to introduce a new term, “narrow character literal”, that comprises these two cases. The changes are in 6.4.4.5 (Character constants), p2:

An [-integer-]{+ordinary+} character [-constant-]{+literal+} is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'. A UTF-8 character [-constant-]{+literal+} is the same, except prefixed by u8. Together, ordinary and UTF-8 character literals are narrow character literals. A wchar_t character [-constant-]{+literal+} is prefixed by the letter L. A UTF-16 character [-constant-]{+literal+} is prefixed by the letter u. A UTF-32 character [-constant-]{+literal+} is prefixed by the letter U. Collectively, wchar_t, UTF-16, and UTF-32 character [-constants-]{+literals+} are called wide character [-constants-]{+literals+}. …
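The following sketch shows the prefixes in use and checks the size of each literal's type at compile time. It assumes a C11 implementation that provides <uchar.h>. The UTF-8 form u8'x' is omitted because it requires a C23 compiler. The helper name ordinary_literal_size is hypothetical.

```c
#include <assert.h>  /* provides static_assert as a macro before C23 */
#include <stddef.h>  /* wchar_t, size_t */
#include <uchar.h>   /* char16_t, char32_t */

/* Ordinary character literal (narrow): has type int in C. */
static_assert(sizeof 'x' == sizeof(int), "ordinary character literal");

/* Wide character literals, distinguished by their prefix. */
static_assert(sizeof L'x' == sizeof(wchar_t), "L prefix: wchar_t");
static_assert(sizeof u'x' == sizeof(char16_t), "u prefix: UTF-16");
static_assert(sizeof U'x' == sizeof(char32_t), "U prefix: UTF-32");

/* Illustrative helper (hypothetical name). */
size_t ordinary_literal_size(void) { return sizeof 'x'; }
```

All four literals have an integer type, which is why the qualifier "integer" does not single out the unprefixed form.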

5 String literals

Similarly, the existing term “character string literal” is very confusing, because all string literals consist of characters. Therefore we propose to rename it to “ordinary string literal”, which is consistent with the new naming for character literals.
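A short sketch of the string literal forms, again assuming a C11 implementation with <uchar.h>, shows that every form is an array of characters of some element type. The helper name ordinary_string_size is hypothetical.

```c
#include <assert.h>  /* provides static_assert as a macro before C23 */
#include <stddef.h>  /* wchar_t, size_t */
#include <uchar.h>   /* char16_t, char32_t */

/* Ordinary string literal: three characters plus the terminating null. */
static_assert(sizeof "abc" == 4, "ordinary string literal");

/* Prefixed forms: same element count, different element types. */
static_assert(sizeof u8"abc" == 4, "UTF-8 string literal");
static_assert(sizeof L"abc" == 4 * sizeof(wchar_t), "wchar_t string literal");
static_assert(sizeof u"abc" == 4 * sizeof(char16_t), "UTF-16 string literal");
static_assert(sizeof U"abc" == 4 * sizeof(char32_t), "UTF-32 string literal");

/* Illustrative helper (hypothetical name). */
size_t ordinary_string_size(void) { return sizeof "abc"; }
```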

6 Strings

Unfortunately, the terminology for strings that the Library section introduces seems to have very similar problems. These problems will be addressed in a separate paper.

7 Changing terminology

First apply the changes as indicated above. Then apply the following replacements, in the order shown:

C23                              C2y
integer character constant       ordinary character literal      as above and in 6.4.4.5 p11, p17 and p18, in 6.10.10.3, 7.21 p3, I.2, J.3.5 (9)
integer character constant       narrow character literal        6.4.4.5 p5 and p6, 6.4.5 p4
character constant               character literal               otherwise
integer constant                 integer constant expression     6.5.4.4 p2, 6.9.1 p3, 7.22.4 p1, 7.29.1 p2
integer constant                 constant of integer type        7.22.4 title, 7.22.4.1 title, 7.22.4.2 title
named integer constant value     named constant of integer type  6.2.5 p21
integer constant token           integer literal                 6.4 p5, 6.4.8 p3 and p4
floating constant token          floating literal                6.4 p5, 6.4.8 p3 and p4
integer constant                 integer literal                 other than in integer constant expression and M.5 p1
decimal constant                 decimal literal
hexadecimal constant             hexadecimal literal
octal constant                   octal literal
binary constant                  binary literal
floating constant                floating literal
decimal floating constant        decimal floating literal
hexadecimal floating constant    hexadecimal floating literal
fractional constant              fractional literal
hexadecimal fractional constant  hexadecimal fractional literal
character string literal         ordinary string literal

8 Changes to the LaTeX source – notes to reviewers and editors

The branch “literals” in the WG14 git repository contains an implementation of the proposed changes to the LaTeX document.

Since it has changes all over the place we organized them in a special way