______________________________________________________________________
2 Lexical conventions [lex]
______________________________________________________________________
1 The text of the program is kept in units called source files in this
International Standard. A source file together with all the headers
(_lib.headers_) and source files included (_cpp.include_) via the pre-
processing directive #include, less any source lines skipped by any of
the conditional inclusion (_cpp.cond_) preprocessing directives, is
called a translation unit. [Note: a C++ program need not all be
translated at the same time. ]
2 [Note: previously translated translation units and instantiation units
can be preserved individually or in libraries. The separate transla-
tion units of a program communicate (_basic.link_) by (for example)
calls to functions whose identifiers have external linkage, manipula-
tion of objects whose identifiers have external linkage, or manipula-
tion of data files. Translation units can be separately translated and
then later linked to produce an executable program (_basic.link_). ]
2.1 Phases of translation [lex.phases]
1 The precedence among the syntax rules of translation is specified by
the following phases.1)
1 Physical source file characters are mapped, in an implementation-
defined manner, to the source character set (introducing new-line
characters for end-of-line indicators) if necessary. Trigraph
sequences (_lex.trigraph_) are replaced by corresponding single-
character internal representations. Any source file character not
in the basic source character set (_lex.charset_) is replaced by
the universal-character-name that designates that character.2)
2 Each instance of a new-line character and an immediately preceding
backslash character is deleted, splicing physical source lines to
form logical source lines. If, as a result, a character sequence
that matches the syntax of a universal-character-name is produced,
the behavior is undefined. If a source file that is not empty
does not end in a new-line character, or ends in a new-line char-
acter immediately preceded by a backslash character, the behavior
is undefined.
_________________________
1) Implementations must behave as if these separate phases occur, al-
though in practice different phases might be folded together.
2) The process of handling extended characters is specified in terms
of mapping to an encoding that uses only the basic source character
set, and, in the case of character literals and strings, further map-
ping to the execution character set. In practical terms, however, any
internal encoding may be used, so long as an actual extended character
encountered in the input, and the same extended character expressed in
the input as a universal-character-name (i.e. using the \uXXXX
notation), are handled equivalently.
3 The source file is decomposed into preprocessing tokens
(_lex.pptoken_) and sequences of white-space characters (including
comments). A source file shall not end in a partial preprocessing
token or partial comment3). Each comment is replaced by one space
character. New-line characters are retained. Whether each
nonempty sequence of white-space characters other than new-line is
retained or replaced by one space character is implementation-
defined. The process of dividing a source file's characters into
preprocessing tokens is context-dependent. [Example: see the han-
dling of < within a #include preprocessing directive. ]
4 Preprocessing directives are executed and macro invocations are
expanded. If a character sequence that matches the syntax of a
universal-character-name is produced by token concatenation
(_cpp.concat_), the behavior is undefined. A #include preprocess-
ing directive causes the named header or source file to be pro-
cessed from phase 1 through phase 4, recursively.
5 Each source character set member, escape sequence, or universal-
character-name in character literals and string literals is con-
verted to a member of the execution character set.
6 Adjacent ordinary string literal tokens are concatenated. Adja-
cent wide string literal tokens are concatenated.
7 White-space characters separating tokens are no longer significant.
Each preprocessing token is converted into a token (_lex.token_).
The resulting tokens are syntactically and semantically analyzed and
translated. [Note: Source files, translation
units and translated translation units need not necessarily be
stored as files, nor need there be any one-to-one correspondence
between these entities and any external representation. The
description is conceptual only, and does not specify any particu-
lar implementation. ]
_________________________
3) A partial preprocessing token would arise from a source file ending
in the first portion of a multi-character token that requires a termi-
nating sequence of characters, such as a header-name that is missing
the closing " or >. A partial comment would arise from a source file
ending with an unclosed /* comment.
8 Translated translation units and instantiation units are combined
as follows: [Note: some or all of these may be supplied from a
library. ] Each translated translation unit is examined to pro-
duce a list of required instantiations. [Note: this may include
instantiations which have been explicitly requested
(_temp.explicit_). ] The definitions of the required templates
are located. It is implementation-defined whether the source of
the translation units containing these definitions is required to
be available. [Note: an implementation could encode sufficient
information into the translated translation unit so as to ensure
the source is not required here. ] All the required instantia-
tions are performed to produce instantiation units. [Note: these
are similar to translated translation units, but contain no refer-
ences to uninstantiated templates and no template definitions. ]
The program is ill-formed if any instantiation fails.
9 All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains
information needed for execution in its execution environment.
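As an informal illustration of phases 2 and 6 (not part of the normative text), the sketch below relies on line splicing to continue a macro definition across two physical source lines, and on the concatenation of adjacent ordinary string literals. The names SUM, greeting, spliced_sum, and check_phases are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Phase 2: the backslash and the new-line that follows it are deleted,
// so the two physical lines below form one logical source line.
#define SUM(a, b) \
    ((a) + (b))

// Phase 6: the two adjacent ordinary string literals become one.
inline const char* greeting() { return "Hello, " "world"; }

inline int spliced_sum() { return SUM(2, 3); }

// Self-check; call from main() or a test driver.
inline void check_phases() {
    assert(spliced_sum() == 5);
    assert(std::strcmp(greeting(), "Hello, world") == 0);
}
```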
2.2 Basic source character set [lex.charset]
1 The basic source character set consists of 96 characters: the space
character, the control characters representing horizontal tab, verti-
cal tab, form feed, and new-line, plus the following 91 graphical
characters:
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9
_ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
2 The universal-character-name construct provides a way to name other
characters.
hex-quad:
hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
universal-character-name:
\u hex-quad
\U hex-quad hex-quad
The character designated by the universal-character-name \UNNNNNNNN is
that character whose encoding in ISO/IEC 10646 is the hexadecimal
value NNNNNNNN; the character designated by the universal-character-
name \uNNNN is that character whose encoding in ISO/IEC 10646 is the
hexadecimal value 0000NNNN.
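The mapping from the spelling of a universal-character-name to the value it designates can be sketched as a small parser. This is only an illustration of the grammar above, operating on a std::string at run time; it is not how a translator is required to work, and the function name ucn_value is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Parses the spelling of a universal-character-name: a backslash, then
// either u and one hex-quad, or U and two hex-quads. On success stores
// the designated ISO/IEC 10646 value in `value` and returns true.
inline bool ucn_value(const std::string& text, unsigned long& value) {
    if (text.size() < 2 || text[0] != '\\') return false;
    std::size_t digits = 0;
    if (text[1] == 'u') digits = 4;
    else if (text[1] == 'U') digits = 8;
    if (digits == 0 || text.size() != 2 + digits) return false;

    unsigned long result = 0;
    for (std::size_t i = 2; i < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (!std::isxdigit(c)) return false;
        // Assumes the letters a through f are contiguous in the
        // execution character set.
        const int digit = std::isdigit(c) ? c - '0'
                                          : std::tolower(c) - 'a' + 10;
        result = result * 16 + static_cast<unsigned long>(digit);
    }
    value = result;
    return true;
}
```

With this sketch, the four-digit form with digits 00E9 yields the value 0xE9, and the eight-digit form with digits 0001F600 yields 0x1F600.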
2.3 Trigraph sequences [lex.trigraph]
1 Before any other processing takes place, each occurrence of one of the
following sequences of three characters ("trigraph sequences") is
replaced by the single character indicated in Table 1.
Table 1--trigraph sequences
+-----------------------+------------------------+------------------------+
|trigraph replacement | trigraph replacement | trigraph replacement |
+-----------------------+------------------------+------------------------+
| ??= # | ??( [ | ??< { |
+-----------------------+------------------------+------------------------+
| ??/ \ | ??) ] | ??> } |
+-----------------------+------------------------+------------------------+
| ??' ^ | ??! | | ??- ~ |
+-----------------------+------------------------+------------------------+
2 [Example:
??=define arraycheck(a,b) a??(b??) ??!??! b??(a??)
becomes
#define arraycheck(a,b) a[b] || b[a]
--end example]
3 [Note: no other trigraph sequence exists. Each ? that does not begin
one of the trigraphs listed above is not changed. ]
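The replacement described by Table 1 can be modelled on an ordinary string, which avoids depending on whether a given translator still performs trigraph replacement. The sketch below is an illustration only; trigraph_replacement, replace_trigraphs, and check_trigraphs are hypothetical names. The test strings spell each question-mark pair with the \? escape so that the translator compiling the sketch never sees a trigraph itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Maps the third character of a candidate trigraph to its replacement
// (Table 1), or returns '\0' if it does not complete a trigraph.
inline char trigraph_replacement(char third) {
    switch (third) {
        case '=':  return '#';
        case '(':  return '[';
        case '<':  return '{';
        case '/':  return '\\';
        case ')':  return ']';
        case '>':  return '}';
        case '\'': return '^';
        case '!':  return '|';
        case '-':  return '~';
        default:   return '\0';
    }
}

// Scans left to right; replaced characters are appended to the output
// and never re-examined.
inline std::string replace_trigraphs(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (i + 2 < in.size() && in[i] == '?' && in[i + 1] == '?') {
            const char r = trigraph_replacement(in[i + 2]);
            if (r != '\0') {
                out += r;
                i += 3;
                continue;
            }
        }
        out += in[i++];
    }
    return out;
}

inline void check_trigraphs() {
    assert(replace_trigraphs("?\?=") == "#");
    assert(replace_trigraphs("a?\?(b?\?)") == "a[b]");
    assert(replace_trigraphs("what?") == "what?");
}
```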
4 Trigraph replacement is done left to right, so that when two sequences
which could represent trigraphs overlap, only the first sequence is
replaced. Characters that result from trigraph replacement are never
part of a subsequent trigraph. [Example: The sequence "???=" becomes
"?=", not "?#". The sequence "?????????" becomes "???", not "?".
--end example]
2.4 Preprocessing tokens [lex.pptoken]
preprocessing-token:
header-name
identifier
pp-number
character-literal
string-literal
preprocessing-op-or-punc
each non-white-space character that cannot be one of the above
1 Each preprocessing token that is converted to a token (_lex.token_)
shall have the lexical form of a keyword, an identifier, a literal, an
operator, or a punctuator.
2 A preprocessing token is the minimal lexical element of the language
in translation phases 3 through 6. The categories of preprocessing
token are: header names, identifiers, preprocessing numbers, character
literals, string literals, preprocessing-op-or-punc, and single non-
white-space characters that do not lexically match the other prepro-
cessing token categories. If a ' or a " character matches the last
category, the behavior is undefined. Preprocessing tokens can be sep-
arated by white space; this consists of comments (_lex.comment_), or
white-space characters (space, horizontal tab, new-line, vertical tab,
and form-feed), or both. As described in Clause _cpp_, in certain
circumstances during translation phase 4, white space (or the absence
thereof) serves as more than preprocessing token separation. White
space can appear within a preprocessing token only as part of a header
name or between the quotation characters in a character literal or
string literal.
3 If the input stream has been parsed into preprocessing tokens up to a
given character, the next preprocessing token is the longest sequence
of characters that could constitute a preprocessing token, even if
that would cause further lexical analysis to fail.
4 [Example: The program fragment 1Ex is parsed as a preprocessing number
token (one that is not a valid floating or integer literal token),
even though a parse as the pair of preprocessing tokens 1 and Ex might
produce a valid expression (for example, if Ex were a macro defined as
+1). Similarly, the program fragment 1E1 is parsed as a preprocessing
number (one that is a valid floating literal token), whether or not E
is a macro name. ]
5 [Example: The program fragment x+++++y is parsed as x ++ ++ + y,
which, if x and y are of built-in types, violates a constraint on
increment operators, even though the parse x ++ + ++ y might yield a
correct expression. ]
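A shorter, well-formed relative of the x+++++y example shows the same longest-match rule at work. In the sketch below (hypothetical names munch_demo and check_munch), x+++y is tokenized as x ++ + y, so x is post-incremented and y is left alone.

```cpp
#include <cassert>

// x+++y is tokenized as x ++ + y (longest match first), not x + ++y.
inline int munch_demo(int& x, int y) {
    return x+++y;
}

inline void check_munch() {
    int x = 1;
    const int sum = munch_demo(x, 2);
    assert(sum == 3);  // old value of x (1) plus y (2)
    assert(x == 2);    // x was post-incremented
}
```

Had the tokens been x + ++y instead, the sum would be 4 and x would remain 1.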
2.5 Alternative tokens [lex.digraph]
1 Alternative token representations are provided for some operators and
punctuators4).
2 In all respects of the language, each alternative token behaves the
same, respectively, as its primary token, except for its spelling5).
The set of alternative tokens is defined in Table 2.
_________________________
4) These include "digraphs" and additional reserved words. The term
"digraph" (token consisting of two characters) is not perfectly de-
scriptive, since one of the alternative preprocessing-tokens is %:%:
and of course several primary tokens contain two characters. Nonethe-
less, those alternative tokens that aren't lexical keywords are collo-
quially known as "digraphs".
5) Thus the "stringized" values (_cpp.stringize_) of [ and <: will be
different, maintaining the source spelling, but the tokens can other-
wise be freely interchanged.
Table 2--alternative tokens
+----------------------+-----------------------+-----------------------+
|alternative primary | alternative primary | alternative primary |
+----------------------+-----------------------+-----------------------+
| <% { | and && | and_eq &= |
+----------------------+-----------------------+-----------------------+
| %> } | bitor | | or_eq |= |
+----------------------+-----------------------+-----------------------+
| <: [ | or || | xor_eq ^= |
+----------------------+-----------------------+-----------------------+
| :> ] | xor ^ | not ! |
+----------------------+-----------------------+-----------------------+
| %: # | compl ~ | not_eq != |
+----------------------+-----------------------+-----------------------+
| %:%: ## | bitand & | |
+----------------------+-----------------------+-----------------------+
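The following sketch writes a few small functions entirely with alternative tokens from Table 2. It assumes a translator that accepts the alternative tokens without extra options (some require a conforming mode to be selected); the function names are hypothetical.

```cpp
#include <cassert>

// Written with alternative tokens; equivalent to
//   bool in_range(int v, int lo, int hi) { return !(v < lo) && !(v > hi); }
inline bool in_range(int v, int lo, int hi) <%
    return not (v < lo) and not (v > hi);
%>

// <: and :> stand for [ and ]; <% and %> stand for { and }.
inline int second_element() <%
    int a<:3:> = <% 10, 20, 30 %>;
    return a<:1:>;
%>

inline void check_alternative_tokens() <%
    assert(in_range(5, 1, 10));
    assert(not in_range(11, 1, 10));
    assert(second_element() == 20);
    assert((6 bitand 3) == 2);
    assert((6 bitor 1) == 7);
    assert((6 xor 3) == 5);
%>
```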
2.6 Tokens [lex.token]
token:
identifier
keyword
literal
operator
punctuator
1 There are five kinds of tokens: identifiers, keywords, literals,6)
operators, and other separators. Blanks, horizontal and vertical
tabs, newlines, formfeeds, and comments (collectively, "white space"),
as described below, are ignored except as they serve to separate
tokens. Some white space is required to separate otherwise adjacent
identifiers, keywords, and literals.
2.7 Comments [lex.comment]
1 The characters /* start a comment, which terminates with the charac-
ters */. These comments do not nest. The characters // start a com-
ment, which terminates with the next new-line character. If there is a
form-feed or a vertical-tab character in such a comment, only white-
space characters shall appear between it and the new-line that termi-
nates the comment; no diagnostic is required. [Note: The comment
characters //, /*, and */ have no special meaning within a // comment
and are treated just like other characters. Similarly, the comment
characters // and /* have no special meaning within a /* comment. ]
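A minimal sketch of the note above, with the hypothetical function name comment_demo: each comment contains the opening characters of the other comment form, and in both cases they are treated as ordinary comment text.

```cpp
#include <cassert>

inline int comment_demo() {
    int n = 0;
    /* Inside this comment, // does not start a second comment. */ n += 1;
    // Inside this comment, /* does not start a second comment.
    n += 2;
    return n;
}
```

Both statements are executed, so the function returns 3. If either inner sequence had started a comment, part of the function body would have been swallowed and the sketch would not translate.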
_________________________
6) Literals include strings and character and numeric literals.
2.8 Header names [lex.header]
header-name:
<h-char-sequence>
"q-char-sequence"
h-char-sequence:
h-char
h-char-sequence h-char
h-char:
any member of the source character set except
new-line and >
q-char-sequence:
q-char
q-char-sequence q-char
q-char:
any member of the source character set except
new-line and "
1 Header name preprocessing tokens shall only appear within a #include
preprocessing directive (_cpp.include_). The sequences in both forms
of header-names are mapped in an implementation-defined manner to
headers or to external source file names as specified in
_cpp.include_.
2 If either of the characters ' or \, or either of the character
sequences /* or // appears in a q-char-sequence or a h-char-sequence,
or the character " appears in a h-char-sequence, the behavior is
undefined.7)
2.9 Preprocessing numbers [lex.ppnumber]
pp-number:
digit
. digit
pp-number digit
pp-number nondigit
pp-number e sign
pp-number E sign
pp-number .
1 Preprocessing number tokens lexically include all integral literal
tokens (_lex.icon_) and all floating literal tokens (_lex.fcon_).
2 A preprocessing number does not have a type or a value; it acquires
both after a successful conversion (as part of translation phase 7,
_lex.phases_) to an integral literal token or a floating literal
token.
_________________________
7) Thus, sequences of characters that resemble escape sequences cause
undefined behavior.
2.10 Identifiers [lex.name]
identifier:
nondigit
identifier nondigit
identifier digit
nondigit: one of
universal-character-name
_ a b c d e f g h i j k l m
n o p q r s t u v w x y z
A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z
digit: one of
0 1 2 3 4 5 6 7 8 9
1 An identifier is an arbitrarily long sequence of letters and digits.
Each universal-character-name in an identifier shall designate a char-
acter whose encoding in ISO/IEC 10646 falls into one of the ranges
specified in _extendid_. Upper- and lower-case letters are different. All
characters are significant.8)
2 In addition, identifiers containing a double underscore (__) or begin-
ning with an underscore and an upper-case letter are reserved for use
by C++ implementations and standard libraries and shall not be used
otherwise; no diagnostic is required.
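Because no diagnostic is required for a reserved identifier, a tool may want to check for the two patterns named in paragraph 2. The sketch below tests only those two patterns and nothing else; the name is_reserved_identifier is hypothetical, and the function assumes its argument is already a valid identifier.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// True if `id` contains a double underscore, or begins with an
// underscore followed by an upper-case letter.
inline bool is_reserved_identifier(const std::string& id) {
    if (id.find("__") != std::string::npos) return true;
    return id.size() >= 2 && id[0] == '_' &&
           std::isupper(static_cast<unsigned char>(id[1])) != 0;
}

inline void check_reserved_identifiers() {
    assert(is_reserved_identifier("__count"));
    assert(is_reserved_identifier("my__value"));
    assert(is_reserved_identifier("_Count"));
    assert(!is_reserved_identifier("_count"));
    assert(!is_reserved_identifier("my_value"));
}
```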
2.11 Keywords [lex.key]
1 The identifiers shown in Table 3 are reserved for use as keywords
(that is, they are unconditionally treated as keywords in phase 7):
_________________________
8) On systems in which linkers cannot accept extended characters, an
encoding of the universal-character-name may be used in forming valid
external identifiers. For example, some otherwise unused character or
sequence of characters may be used to encode the \u in a universal-
character-name. Extended characters may produce a long external iden-
tifier, but C++ does not place a translation limit on significant
characters for external identifiers. In C++, upper- and lower-case
letters are considered different for all identifiers, including exter-
nal identifiers.
Table 3--keywords
+--------------------------------------------------------------------------+
|asm do inline short typeid |
|auto double int signed typename |
|bool dynamic_cast long sizeof union |
|break else mutable static unsigned |
|case enum namespace static_cast using |
|catch explicit new struct virtual |
|char extern operator switch void |
|class false private template volatile |
|const float protected this wchar_t |
|const_cast for public throw while |
|continue friend register true |
|default goto reinterpret_cast try |
|delete if return typedef |
+--------------------------------------------------------------------------+
2 Furthermore, the alternative representations shown in Table 4 for cer-
tain operators and punctuators (_lex.digraph_) are reserved and shall
not be used otherwise:
Table 4--alternative representations
+------------------------------------------------+
|and and_eq bitand bitor compl not |
|not_eq or or_eq xor xor_eq |
+------------------------------------------------+
2.12 Operators and punctuators [lex.operators]
1 The lexical representation of C++ programs includes a number of pre-
processing tokens which are used in the syntax of the preprocessor or
are converted into tokens for operators and punctuators:
preprocessing-op-or-punc: one of
{ } [ ] # ## ( )
<: :> <% %> %: %:%: ; : ...
new delete ? :: . .*
+ - * / % ^ & | ~
! = < > += -= *= /= %=
^= &= |= << >> >>= <<= == !=
<= >= && || ++ -- , ->* ->
and and_eq bitand bitor compl not not_eq or or_eq
xor xor_eq
Each preprocessing-op-or-punc is converted to a single token in trans-
lation phase 7 (_lex.phases_).
2.13 Literals [lex.literal]
1 There are several kinds of literals.9)
literal:
integer-literal
character-literal
floating-literal
string-literal
boolean-literal
2.13.1 Integer literals [lex.icon]
integer-literal:
decimal-literal integer-suffixopt
octal-literal integer-suffixopt
hexadecimal-literal integer-suffixopt
decimal-literal:
nonzero-digit
decimal-literal digit
octal-literal:
0
octal-literal octal-digit
hexadecimal-literal:
0x hexadecimal-digit
0X hexadecimal-digit
hexadecimal-literal hexadecimal-digit
nonzero-digit: one of
1 2 3 4 5 6 7 8 9
octal-digit: one of
0 1 2 3 4 5 6 7
hexadecimal-digit: one of
0 1 2 3 4 5 6 7 8 9
a b c d e f
A B C D E F
integer-suffix:
unsigned-suffix long-suffixopt
long-suffix unsigned-suffixopt
unsigned-suffix: one of
u U
long-suffix: one of
l L
1 An integer literal is a sequence of digits that has no period or expo-
nent part. An integer literal may have a prefix that specifies its
base and a suffix that specifies its type. The lexically first digit
of the sequence of digits is the most significant. A decimal integer
literal (base ten) begins with a digit other than 0 and consists of a
sequence of decimal digits. An octal integer literal (base eight)
begins with the digit 0 and consists of a sequence of octal digits.10)
A hexadecimal integer literal (base sixteen) begins with 0x or 0X and
consists of a sequence of hexadecimal digits, which include the deci-
mal digits and the letters a through f and A through F with decimal
values ten through fifteen. [Example: the number twelve can be writ-
ten 12, 014, or 0XC. ]
_________________________
9) The term "literal" generally designates, in this International
Standard, those tokens that are called "constants" in ISO C.
10) The digits 8 and 9 are not octal digits.
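The example for the number twelve can be checked directly. The sketch below (hypothetical name check_integer_bases) compares the three spellings and adds two further cases that show the effect of the prefix.

```cpp
#include <cassert>

inline void check_integer_bases() {
    assert(12 == 014);    // decimal and octal spellings of twelve
    assert(014 == 0XC);   // octal and hexadecimal spellings of twelve
    assert(010 == 8);     // a leading 0 selects base eight
    assert(0x10 == 16);   // a leading 0x selects base sixteen
}
```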
2 The type of an integer literal depends on its form, value, and suffix.
If it is decimal and has no suffix, it has the first of these types in
which its value can be represented: int, long int, unsigned long
int.11) If it is octal or hexadecimal and has no suffix, it has the
first of these types in which its value can be represented: int,
unsigned int, long int, unsigned long int. If it is suffixed by u or
U, its type is the first of these types in which its value can be rep-
resented: unsigned int, unsigned long int. If it is suffixed by l or
L, its type is the first of these types in which its value can be rep-
resented: long int, unsigned long int. If it is suffixed by ul, lu,
uL, Lu, Ul, lU, UL, or LU, its type is unsigned long int.
3 A program is ill-formed if one of its translation units contains an
integer literal that cannot be represented by any of the allowed
types.
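One way to observe the type chosen for a literal is overload resolution. The sketch below uses only small values, for which the first type in each list above applies; the overload set literal_type is hypothetical and covers just the four types named in this subclause.

```cpp
#include <cassert>
#include <cstring>

inline const char* literal_type(int)           { return "int"; }
inline const char* literal_type(unsigned int)  { return "unsigned int"; }
inline const char* literal_type(long)          { return "long int"; }
inline const char* literal_type(unsigned long) { return "unsigned long int"; }

inline void check_literal_types() {
    assert(std::strcmp(literal_type(12), "int") == 0);
    assert(std::strcmp(literal_type(0x7FFF), "int") == 0);
    assert(std::strcmp(literal_type(12u), "unsigned int") == 0);
    assert(std::strcmp(literal_type(12L), "long int") == 0);
    assert(std::strcmp(literal_type(12UL), "unsigned long int") == 0);
    assert(std::strcmp(literal_type(12lu), "unsigned long int") == 0);
}
```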
2.13.2 Character literals [lex.ccon]
character-literal:
'c-char-sequence'
L'c-char-sequence'
c-char-sequence:
c-char
c-char-sequence c-char
c-char:
any member of the source character set except
the single-quote ', backslash \, or new-line character
escape-sequence
universal-character-name
escape-sequence:
simple-escape-sequence
octal-escape-sequence
hexadecimal-escape-sequence
simple-escape-sequence: one of
\' \" \? \\
\a \b \f \n \r \t \v
octal-escape-sequence:
\ octal-digit
\ octal-digit octal-digit
\ octal-digit octal-digit octal-digit
hexadecimal-escape-sequence:
\x hexadecimal-digit
hexadecimal-escape-sequence hexadecimal-digit
_________________________
11) A decimal integer literal with no suffix never has type unsigned
int. Otherwise, for example, on an implementation where unsigned int
values have 16 bits and unsigned long values have strictly more than
17 bits, we would have -30000<0, -50000>0 (because 50000 would have
type unsigned int), and -70000<0 (because 70000 would have type long).
1 A character literal is one or more characters enclosed in single
quotes, as in 'x', optionally preceded by the letter L, as in L'x'. A
character literal that does not begin with L is an ordinary character
literal, also referred to as a narrow-character literal. An ordinary
character literal that contains a single c-char has type char, with
value equal to the numerical value of the encoding of the c-char in
the execution character set. An ordinary character literal that con-
tains more than one c-char is a multicharacter literal. A multichar-
acter literal has type int and implementation-defined value.
2 A character literal that begins with the letter L, such as L'x', is a
wide-character literal. A wide-character literal has type wchar_t.12)
The value of a wide-character literal containing a single c-char is
equal to the numerical value of the encoding of the c-char in the
execution wide-character set. The value of a wide-character literal
containing multiple c-chars is implementation-defined.
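The types of ordinary and wide-character literals can be observed with a small overload set. This is a sketch with the hypothetical name char_kind; multicharacter literals are left out because their value is implementation-defined.

```cpp
#include <cassert>
#include <cstring>

inline const char* char_kind(char)    { return "char"; }
inline const char* char_kind(wchar_t) { return "wchar_t"; }
inline const char* char_kind(int)     { return "int"; }

inline void check_character_literals() {
    assert(std::strcmp(char_kind('x'), "char") == 0);
    assert(std::strcmp(char_kind(L'x'), "wchar_t") == 0);
    assert(sizeof('x') == 1);               // type char, not int
    assert(sizeof(L'x') == sizeof(wchar_t));
}
```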
3 Certain nongraphic characters, the single quote ', the double quote ",
the question mark ?, and the backslash \, can be represented according
to Table 5.
Table 5--escape sequences
+----------------------------------+
|new-line NL (LF) \n |
|horizontal tab HT \t |
|vertical tab VT \v |
|backspace BS \b |
|carriage return CR \r |
|form feed FF \f |
|alert BEL \a |
|backslash \ \\ |
|question mark ? \? |
|single quote ' \' |
|double quote " \" |
|octal number ooo \ooo |
|hex number hhh \xhhh |
+----------------------------------+
The double quote " and the question mark ? can be represented as
themselves or by the escape sequences \" and \? respectively, but the
single quote ' and the backslash \ shall be represented by the escape
sequences \' and \\ respectively. If the character following a back-
slash is not one of those specified, the behavior is undefined. An
escape sequence specifies a single character.
_________________________
12) They are intended for character sets where a character does not
fit into a single byte.
4 The escape \ooo consists of the backslash followed by one, two, or
three octal digits that are taken to specify the value of the desired
character. The escape \xhhh consists of the backslash followed by x
followed by one or more hexadecimal digits that are taken to specify
the value of the desired character. There is no limit to the number
of digits in a hexadecimal sequence. A sequence of octal or hexadeci-
mal digits is terminated by the first character that is not an octal
digit or a hexadecimal digit, respectively. The value of a character
literal is implementation-defined if it falls outside of the implemen-
tation-defined range defined for char (for ordinary literals) or
wchar_t (for wide literals).
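The termination rules for octal and hexadecimal escapes can be seen from the sizes of a few string literals. The sketch below is illustrative, with hypothetical function names; the sizes include the terminating null character.

```cpp
#include <cassert>
#include <cstddef>

// An octal escape takes at most three digits, so the 4 is a separate
// character: the literal holds two characters and the terminator.
inline std::size_t octal_stops_after_three() { return sizeof("\1234"); }

// A hexadecimal escape ends at the first character that is not a
// hexadecimal digit, here the letter g.
inline std::size_t hex_stops_at_non_hex() { return sizeof("\x41g"); }

inline void check_numeric_escapes() {
    assert(octal_stops_after_three() == 3);
    assert(hex_stops_at_non_hex() == 3);
    assert('\101' == '\x41');  // the same value written in base 8 and 16
    assert('\x41' == 65);
}
```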
5 A universal-character-name is translated to the encoding, in the exe-
cution character set, of the character named. If there is no such
encoding, the universal-character-name is translated to an implementa-
tion-defined encoding. [Note: in translation phase 1, a universal-
character-name is introduced whenever an actual extended character is
encountered in the source text. Therefore, all extended characters
are described in terms of universal-character-names. However, the
actual compiler implementation may use its own native character set,
so long as the same results are obtained. ]
2.13.3 Floating literals [lex.fcon]
floating-literal:
fractional-constant exponent-partopt floating-suffixopt
digit-sequence exponent-part floating-suffixopt
fractional-constant:
digit-sequenceopt . digit-sequence
digit-sequence .
exponent-part:
e signopt digit-sequence
E signopt digit-sequence
sign: one of
+ -
digit-sequence:
digit
digit-sequence digit
floating-suffix: one of
f l F L
1 A floating literal consists of an integer part, a decimal point, a
fraction part, an e or E, an optionally signed integer exponent, and
an optional type suffix. The integer and fraction parts both consist
of a sequence of decimal (base ten) digits. Either the integer part
or the fraction part (not both) can be omitted; either the decimal
point or the letter e (or E) and the exponent (not both) can be omit-
ted. The integer part, the optional decimal point and the optional
fraction part form the significant part of the floating literal. The
exponent, if present, indicates the power of 10 by which the signifi-
cant part is to be scaled. If the scaled value is in the range of
representable values for its type, the result is the scaled value if
representable, else the larger or smaller representable value nearest
the scaled value, chosen in an implementation-defined manner. The
type of a floating literal is double unless explicitly specified by a
suffix. The suffixes f and F specify float, the suffixes l and L
specify long double. If the scaled value is not in the range of
representable values for its type, the program is ill-formed.
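The suffix rules and the optional parts of a floating literal can be illustrated as follows. The overload set floating_kind is hypothetical, and the equality checks use only values that are exactly representable, so they do not depend on the implementation-defined choice of nearest value.

```cpp
#include <cassert>
#include <cstring>

inline const char* floating_kind(float)       { return "float"; }
inline const char* floating_kind(double)      { return "double"; }
inline const char* floating_kind(long double) { return "long double"; }

inline void check_floating_literals() {
    assert(std::strcmp(floating_kind(1.5), "double") == 0);
    assert(std::strcmp(floating_kind(1.5f), "float") == 0);
    assert(std::strcmp(floating_kind(1.5L), "long double") == 0);
    assert(.5 == 0.5);       // integer part omitted
    assert(5. == 5.0);       // fraction part omitted
    assert(1e3 == 1000.0);   // decimal point omitted, exponent present
    assert(25e-1 == 2.5);    // signed exponent scales by a power of 10
}
```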
2.13.4 String literals [lex.string]
string-literal:
"s-char-sequenceopt"
L"s-char-sequenceopt"
s-char-sequence:
s-char
s-char-sequence s-char
s-char:
any member of the source character set except
the double-quote ", backslash \, or new-line character
escape-sequence
universal-character-name
1 A string literal is a sequence of characters (as defined in
_lex.ccon_) surrounded by double quotes, optionally beginning with the
letter L, as in "..." or L"...". A string literal that does not begin
with L is an ordinary string literal, also referred to as a narrow
string literal. An ordinary string literal has type "array of n const
char" and static storage duration (_basic.stc_), where n is the size
of the string as defined below, and is initialized with the given
characters. A string literal that begins with L, such as L"asdf", is
a wide string literal. A wide string literal has type "array of n
const wchar_t" and has static storage duration, where n is the size of
the string as defined below, and is initialized with the given charac-
ters.
2 Whether all string literals are distinct (that is, are stored in
nonoverlapping objects) is implementation-defined. The effect of
attempting to modify a string literal is undefined.
3 In translation phase 6 (_lex.phases_), adjacent narrow string literals
are concatenated and adjacent wide string literals are concatenated.
If a narrow string literal token is adjacent to a wide string literal
token, the behavior is undefined. Characters in concatenated strings
are kept distinct. [Example:
"\xA" "B"
contains the two characters '\xA' and 'B' after concatenation (and not
the single hexadecimal character '\xAB'). ]
4 After any necessary concatenation, in translation phase 7
(_lex.phases_), '\0' is appended to every string literal so that pro-
grams that scan a string can find its end.
5 Escape sequences and universal-character-names in string literals have
the same meaning as in character literals (_lex.ccon_), except that
the single quote ' is representable either by itself or by the escape
sequence \', and the double quote " shall be preceded by a \. In a
narrow string literal, a universal-character-name may map to more than
one char element due to multibyte encoding. The size of a wide string
literal is the total number of escape sequences, universal-character-
names, and other characters, plus one for the terminating L'\0'. The
size of a narrow string literal is the total number of escape
sequences and other characters, plus at least one for the multibyte
encoding of each universal-character-name, plus one for the terminat-
ing '\0'.
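The size rules, and the example of "\xA" "B" from paragraph 3, can be checked with sizeof. The sketch below uses hypothetical function names and contains no universal-character-names, so the sizes do not depend on a multibyte encoding.

```cpp
#include <cassert>
#include <cstddef>

// Three characters plus the terminating '\0'.
inline std::size_t narrow_size() { return sizeof("abc"); }

// Three wide characters plus the terminating L'\0', counted in elements.
inline std::size_t wide_elements() { return sizeof(L"abc") / sizeof(wchar_t); }

// Characters in concatenated strings are kept distinct: the result holds
// '\xA' and 'B', not the single character '\xAB'.
inline bool concatenation_keeps_characters_distinct() {
    const char s[] = "\xA" "B";
    return sizeof(s) == 3 && s[0] == '\xA' && s[1] == 'B' && s[2] == '\0';
}

inline void check_string_literals() {
    assert(narrow_size() == 4);
    assert(wide_elements() == 4);
    assert(concatenation_keeps_characters_distinct());
    assert(sizeof("") == 1);  // only the terminating '\0'
}
```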
2.13.5 Boolean literals [lex.bool]
boolean-literal:
false
true
1 The Boolean literals are the keywords false and true. Such literals
have type bool. They are not lvalues.