  ______________________________________________________________________

  2   Lexical conventions                                [lex]

  ______________________________________________________________________

1 The text of the program is kept in units called source files in this
  International Standard.  A source file together with all the headers
  (_lib.headers_) and source files included (_cpp.include_) via the
  preprocessing directive #include, less any source lines skipped by any
  of the conditional inclusion (_cpp.cond_) preprocessing directives, is
  called a translation unit.  [Note: a C++ program need not all be
  translated at the same time.  ]

2 [Note: previously translated translation units and instantiation units
  can be preserved individually or in libraries.  The separate
  translation units of a program communicate (_basic.link_) by (for
  example) calls to functions whose identifiers have external linkage,
  manipulation of objects whose identifiers have external linkage, or
  manipulation of data files.  Translation units can be separately
  translated and then later linked to produce an executable program
  (_basic.link_).  ]

  2.1  Phases of translation                                [lex.phases]

1 The  precedence  among the syntax rules of translation is specified by
  the following phases.1)

    1 Physical source file characters are mapped to the source character
      set (introducing new-line characters for  end-of-line  indicators)
      if necessary.  Trigraph sequences (_lex.trigraph_) are replaced by
      corresponding  single-character  internal  representations.    Any
      source  file  character  not  in  the  basic  source character set
      (_lex.charset_) is replaced by the  universal-character-name  that
      designates that character.2)

    2 Each instance of a new-line character and an immediately preceding
      backslash character is deleted, splicing physical source lines to
      form logical source lines.  A source file that is not empty shall
      end in a new-line character, which shall not be immediately
      preceded by a backslash character.
  _________________________
  1) Implementations must behave as if these separate phases occur,
  although in practice different phases might be folded together.
  2) The process of handling extended characters is specified in terms
  of mapping to an encoding that uses only the basic source character
  set, and, in the case of character literals and strings, further
  mapping to the execution character set.  In practical terms, however,
  any internal encoding may be used, so long as an actual extended
  character encountered in the input, and the same extended character
  expressed in the input as a universal-character-name (i.e. using the
  \uXXXX notation), are handled equivalently.

    3 The source file is decomposed into preprocessing tokens
      (_lex.pptoken_) and sequences of white-space characters (including
      comments).  A source file shall not end in a partial preprocessing
      token or partial comment.3)  Each comment is replaced by one space
      character.  New-line characters are retained.  Whether each
      nonempty sequence of white-space characters other than new-line is
      retained or replaced by one space character is
      implementation-defined.  The process of dividing a source file's
      characters into preprocessing tokens is context-dependent.
      [Example: see the handling of < within a #include preprocessing
      directive.  ]

    4 Preprocessing  directives  are  executed and macro invocations are
      expanded.  A #include preprocessing  directive  causes  the  named
      header  or  source file to be processed from phase 1 through phase
      4, recursively.

    5 Each source character set member, escape sequence, or
      universal-character-name in character literals and string literals
      is converted to a member of the execution character set.

    6 Adjacent character string literal tokens are concatenated.
      Adjacent wide string literal tokens are concatenated.

    7 White-space characters separating tokens are no longer
      significant.  Each preprocessing token is converted into a token
      (_lex.token_).  The resulting tokens are syntactically and
      semantically analyzed and translated.

    8 Translated translation units and instantiation units are combined
      as follows: [Note: some or all of these may be supplied from a
      library.  ] Each translated translation unit is examined to
      produce a list of required instantiations.  [Note: this may include
      instantiations which have been explicitly requested
      (_temp.explicit_).  ] The definitions of the required templates
      are located.  It is implementation-defined whether the source of
      the translation units containing these definitions is required to
      be available.  [Note: an implementation could encode sufficient
      information into the translated translation unit so as to ensure
      the source is not required here.  ] All the required
      instantiations are performed to produce instantiation units.
      [Note: these are similar to translated translation units, but
      contain no references to uninstantiated templates and no template
      definitions.  ] The program is ill-formed if any instantiation
      fails.
  _________________________
  3) A partial preprocessing token would arise from a source file ending
  in one or more characters of a multi-character token followed by a
  "line-splicing" backslash.  A partial comment would arise from a
  source file ending with an unclosed /* comment, or a // comment line
  that ends with a "line-splicing" backslash.

    9 All external object and function references are resolved.  Library
      components  are linked to satisfy external references to functions
      and objects not defined  in  the  current  translation.  All  such
      translator output is collected into a program image which contains
      information needed for execution in its execution environment.
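
  As an illustration of phases 2 and 6, the following is a minimal
  sketch rather than normative text.  The names GREETING, greeting and
  check_greeting are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstring>

// Phase 2: the backslash-newline pair below is deleted, so the macro
// definition spans two physical lines but one logical line.
#define GREETING "hello, " \
                 "world"

// Phase 6: the two adjacent string literals are concatenated.
inline const char* greeting() { return GREETING; }

// Usage check: the result is one 12-character string.
inline void check_greeting() {
    assert(std::strcmp(greeting(), "hello, world") == 0);
    assert(std::strlen(greeting()) == 12);
}
```

  Calling check_greeting() from any function exercises both phases; the
  splice happens before the macro is even recognized as a directive.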

  +-------                 BEGIN BOX 1                -------+
    Corfield: The inclusion model  for  template  compilation  does  not
    require  the  two  sentences  in  phase  8:  The  definitions of the
    required templates are located. It is implementation-defined whether
    the  source of the translation units containing these definitions is
    required to be available.
  +-------                  END BOX 1                 -------+

  +-------                 BEGIN BOX 2                -------+
    What about shared libraries?
  +-------                  END BOX 2                 -------+

  2.2  Basic source character set                          [lex.charset]

1 The basic source character set consists of 96 characters: the space
  character, the control characters representing horizontal tab,
  vertical tab, form feed, and new-line, plus the following 91 graphical
  characters:
          a b c d e f g h i j k l m n o p q r s t u v w x y z
          A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
          0 1 2 3 4 5 6 7 8 9
          _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '

2 The  universal-character-name  construct  provides a way to name other
  characters.
          hex-quad:
                  hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit

          universal-character-name:
                  \u hex-quad
                  \U hex-quad hex-quad
  The character designated by the universal-character-name \UNNNNNNNN
  is that character whose encoding in ISO/IEC 10646 is the hexadecimal
  value NNNNNNNN; the character designated by the
  universal-character-name \uNNNN is that character whose encoding in
  ISO/IEC 10646 is the hexadecimal value 0000NNNN.
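
  The equivalence of the four-digit and eight-digit forms can be
  sketched as follows, assuming a compiler that accepts the backslash
  spelling of universal-character-names.  The function names are
  hypothetical.

```cpp
#include <cassert>

// Both spellings designate the ISO/IEC 10646 character 000000E9, so the
// two wide-character literals must compare equal. The numeric value
// itself depends on the execution wide-character set.
inline bool same_character() {
    return L'\u00E9' == L'\U000000E9';
}

inline void check_same_character() { assert(same_character()); }
```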

  2.3  Trigraph sequences                                 [lex.trigraph]

1 Before any other processing takes place, each occurrence of one of the
  following  sequences  of  three  characters  ("trigraph sequences") is
  replaced by the single character indicated in Table 1.

                       Table 1--trigraph sequences

  +-----------------------+------------------------+------------------------+
  |trigraph   replacement | trigraph   replacement | trigraph   replacement |
  +-----------------------+------------------------+------------------------+
  |  ??=           #      |   ??(           [      |   ??<           {      |
  +-----------------------+------------------------+------------------------+
  |  ??/           \      |   ??)           ]      |   ??>           }      |
  +-----------------------+------------------------+------------------------+
  |  ??'           ^      |   ??!           |      |   ??-           ~      |
  +-----------------------+------------------------+------------------------+

2 [Example:
          ??=define arraycheck(a,b) a??(b??) ??!??! b??(a??)
  becomes
          #define arraycheck(a,b) a[b] || b[a]
   --end example]

3 [Note: no other trigraph sequence exists.  Each ?  that does not begin
  one of the trigraphs listed above is not changed.  ]
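
  The replacement in Table 1 can be modelled by a small routine.  This
  is a sketch of the mapping only, not of how any translator implements
  phase 1; trigraph_replacement and replace_trigraphs are hypothetical
  names.  The test strings are written with the \? escape so that the
  example itself contains no trigraphs.

```cpp
#include <cassert>
#include <string>

// Returns the replacement for the trigraph made of two question marks
// followed by c, or '\0' if that sequence is not one of the nine
// trigraphs in Table 1.
inline char trigraph_replacement(char c) {
    switch (c) {
        case '=':  return '#';
        case '/':  return '\\';
        case '\'': return '^';
        case '(':  return '[';
        case ')':  return ']';
        case '!':  return '|';
        case '<':  return '{';
        case '>':  return '}';
        case '-':  return '~';
        default:   return '\0';
    }
}

// Replaces every trigraph in `in`, scanning left to right. A question
// mark that does not begin a trigraph is copied unchanged.
inline std::string replace_trigraphs(const std::string& in) {
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (i + 2 < in.size() && in[i] == '?' && in[i + 1] == '?') {
            const char r = trigraph_replacement(in[i + 2]);
            if (r != '\0') {
                out += r;
                i += 2;  // skip the two characters just consumed
                continue;
            }
        }
        out += in[i];
    }
    return out;
}

inline void check_trigraphs() {
    assert(replace_trigraphs("a?\?(b?\?)") == "a[b]");
    assert(replace_trigraphs("what?") == "what?");
}
```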

  2.4  Preprocessing tokens                                [lex.pptoken]
          preprocessing-token:
                  header-name
                  identifier
                  pp-number
                  character-literal
                  string-literal
                  preprocessing-op-or-punc
                  each non-white-space character that cannot be one of the above

1 Each  preprocessing  token  that is converted to a token (_lex.token_)
  shall have the lexical form of a keyword, an identifier, a literal, an
  operator, or a punctuator.

2 A preprocessing token is the minimal lexical element of the language
  in translation phases 3 through 6.  The categories of preprocessing
  token are: header names, identifiers, preprocessing numbers, character
  literals, string literals, preprocessing-op-or-punc, and single
  non-white-space characters that do not lexically match the other
  preprocessing token categories.  If a ' or a " character matches the
  last category, the behavior is undefined.  Preprocessing tokens can be
  separated by white space; this consists of comments (_lex.comment_),
  or white-space characters (space, horizontal tab, new-line, vertical
  tab, and form-feed), or both.  As described in Clause _cpp_, in
  certain circumstances during translation phase 4, white space (or the
  absence thereof) serves as more than preprocessing token separation.
  White space can appear within a preprocessing token only as part of a
  header name or between the quotation characters in a character literal
  or string literal.

3 If  the input stream has been parsed into preprocessing tokens up to a
  given character, the next preprocessing token is the longest  sequence
  of  characters  that  could  constitute a preprocessing token, even if
  that would cause further lexical analysis to fail.

4 [Example: The program fragment 1Ex is parsed as a preprocessing number
  token  (one  that  is  not a valid floating or integer literal token),
  even though a parse as the pair of preprocessing tokens 1 and Ex might
  produce a valid expression (for example, if Ex were a macro defined as
  +1).  Similarly, the program fragment 1E1 is parsed as a preprocessing
  number  (one that is a valid floating literal token), whether or not E
  is a macro name.  ]

5 [Example: The program fragment x+++++y is parsed  as  x  ++  ++  +  y,
  which,  if  x  and  y  are of built-in types, violates a constraint on
  increment operators, even though the parse x ++ + ++ y might  yield  a
  correct expression.  ]
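
  A shorter, well-formed variant of the same rule is sketched below.
  The function names are hypothetical.

```cpp
#include <cassert>

// The characters "x+++y" form the tokens x, ++, +, y: the longest token
// is taken first, so this is (x++) + y and never x + (++y).
inline int munch(int& x, int& y) {
    return x+++y;
}

inline void check_munch() {
    int x = 1, y = 10;
    assert(munch(x, y) == 11);  // old value of x plus y
    assert(x == 2 && y == 10);  // only x was incremented
}
```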

  2.5  Alternative tokens                                  [lex.digraph]

1 Alternative token representations are provided for some operators and
  punctuators.4)

2 In all respects of the language, each alternative token behaves the
  same, respectively, as its primary token, except for its spelling.5)
  The set of alternative tokens is defined in Table 2.

  _________________________
  4) These include "digraphs" and additional reserved words.  The term
  "digraph" (token consisting of two characters) is not perfectly
  descriptive, since one of the alternative preprocessing-tokens is %:%:
  and of course several primary tokens contain two characters.
  Nonetheless, those alternative tokens that aren't lexical keywords are
  colloquially known as "digraphs".
  5) Thus [ and <: behave differently when "stringized"
  (_cpp.stringize_), but can otherwise be freely interchanged.

                       Table 2--alternative tokens

  +----------------------+-----------------------+-----------------------+
  |alternative   primary | alternative   primary | alternative   primary |
  +----------------------+-----------------------+-----------------------+
  |    <%           {    |     and         &&    |   and_eq        &=    |
  +----------------------+-----------------------+-----------------------+
  |    %>           }    |    bitor         |    |    or_eq        |=    |
  +----------------------+-----------------------+-----------------------+
  |    <:           [    |     or          ||    |   xor_eq        ^=    |
  +----------------------+-----------------------+-----------------------+
  |    :>           ]    |     xor          ^    |     not          !    |
  +----------------------+-----------------------+-----------------------+
  |    %:           #    |    compl         ~    |   not_eq        !=    |
  +----------------------+-----------------------+-----------------------+
  |   %:%:         ##    |   bitand         &    |                       |
  +----------------------+-----------------------+-----------------------+
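
  A function written with the alternative spellings might look like the
  sketch below.  It assumes a translator that accepts alternative tokens
  without extra options; both_set and check_both_set are hypothetical
  names.

```cpp
#include <cassert>

// Every brace, bracket, and logical or bitwise operator below uses its
// alternative spelling from Table 2.
inline bool both_set(unsigned flags) <%
    const unsigned masks<:2:> = <% 0x1u, 0x4u %>;
    return (flags bitand masks<:0:>) not_eq 0
       and (flags bitand masks<:1:>) not_eq 0;
%>

inline void check_both_set() <%
    assert(both_set(0x5u));
    assert(not both_set(0x1u));
%>
```

  Replacing each alternative token with its primary token yields a
  function with the same meaning.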

  2.6  Tokens                                                [lex.token]
          token:
                  identifier
                  keyword
                  literal
                  operator
                  punctuator

1 There are five kinds of  tokens:  identifiers,  keywords,  literals,6)
  operators,  and  other  separators.   Blanks,  horizontal and vertical
  tabs, newlines, formfeeds, and comments (collectively, "white space"),
  as  described  below,  are  ignored  except  as they serve to separate
  tokens.  Some white space is required to separate  otherwise  adjacent
  identifiers, keywords, and literals.

  2.7  Comments                                            [lex.comment]

1 The characters /* start a comment, which terminates with the
  characters */.  These comments do not nest.  The characters // start a
  comment, which terminates with the next new-line character.  If there
  is a form-feed or a vertical-tab character in such a comment, only
  white-space characters shall appear between it and the new-line that
  terminates the comment; no diagnostic is required.  [Note: The comment
  characters //, /*, and */ have no special meaning within a // comment
  and are treated just like other characters.  Similarly, the comment
  characters // and /* have no special meaning within a /* comment.  ]
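
  The interaction of the two comment forms can be sketched as follows.
  The function names are hypothetical, and some translators may warn
  about the comment characters that appear inside the comments.

```cpp
#include <cassert>

inline int comment_demo() {
    int n = 0;
    // A /* inside a line comment does not open a block comment.
    n += 1;
    /* A // inside a block comment does not hide the rest of the line. */ n += 2;
    /* Block comments do not nest: a second /* here is plain text. */
    n += 4;
    return n;
}

inline void check_comment_demo() {
    assert(comment_demo() == 7);  // all three additions were compiled
}
```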

  _________________________
  6) Literals include strings and character and numeric literals.

  2.8  Header names                                         [lex.header]
          header-name:
                  <h-char-sequence>
                  "q-char-sequence"
          h-char-sequence:
                  h-char
                  h-char-sequence h-char
          h-char:
                  any member of the source character set except
                          new-line and >
          q-char-sequence:
                  q-char
                  q-char-sequence q-char
          q-char:
                  any member of the source character set except
                          new-line and "

1 Header  name  preprocessing tokens shall only appear within a #include
  preprocessing directive (_cpp.include_).  The sequences in both  forms
  of  header-names  are  mapped  in  an implementation-defined manner to
  external source file names as specified in _cpp.include_.

2 If the characters ', \, ", or /* appear in the sequence between the <
  and > delimiters, or between the " delimiters, the behavior is
  undefined.7)

  2.9  Preprocessing numbers                              [lex.ppnumber]
          pp-number:
                  digit
                  . digit
                  pp-number digit
                  pp-number nondigit
                  pp-number e sign
                  pp-number E sign
                  pp-number .

1 Preprocessing  number  tokens  lexically  include all integral literal
  tokens (_lex.icon_) and all floating literal tokens (_lex.fcon_).

2 A preprocessing number does not have a type or a  value;  it  acquires
  both  after  a  successful conversion (as part of translation phase 7,
  _lex.phases_) to an integral  literal  token  or  a  floating  literal
  token.
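
  Because a preprocessing number is a single preprocessing token, an
  identifier-like tail such as the Ex in 1Ex is never seen as a macro
  name.  The sketch below makes this visible with the stringizing
  operator; the macros Ex, STR and XSTR and the function names are
  hypothetical.

```cpp
#include <cassert>
#include <cstring>

#define Ex +1            // hypothetical macro, as in the 1Ex example
#define STR(x) #x
#define XSTR(x) STR(x)   // macro-expands x before stringizing it

// 1Ex is one preprocessing number, so Ex is not expanded inside it.
inline const char* one_token() { return XSTR(1Ex); }

// With white space, 1 and Ex are separate tokens and Ex is expanded.
inline const char* two_tokens() { return XSTR(1 Ex); }

inline void check_pp_number() {
    assert(std::strcmp(one_token(), "1Ex") == 0);
    assert(std::strchr(two_tokens(), '+') != 0);
}
```

  The preprocessing number 1Ex is never converted to a token here, so
  the fact that it is not a valid literal does not matter.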

  2.10  Identifiers                                           [lex.name]
          identifier:
                  nondigit
                  identifier nondigit
                  identifier digit
          nondigit: one of
                  universal-character-name
                  _ a b c d e f g h i j k l m
                    n o p q r s t u v w x y z
                    A B C D E F G H I J K L M
                    N O P Q R S T U V W X Y Z
          digit: one of
                  0 1 2 3 4 5 6 7 8 9

  _________________________
  7) Thus, sequences of characters that resemble escape sequences cause
  undefined behavior.

1 An identifier is an arbitrarily long sequence of letters and digits.
  Each universal-character-name in an identifier shall designate a
  character whose encoding in ISO/IEC 10646 falls into one of the ranges
  specified in _extendid_.  Upper- and lower-case letters are different.
  All characters are significant.8)

2 In addition, identifiers containing a double underscore (__) or
  beginning with an underscore and an upper-case letter are reserved for
  use by C++ implementations and standard libraries and shall not be
  used otherwise; no diagnostic is required.

  2.11  Keywords                                               [lex.key]

1 The identifiers shown in Table 3 are  reserved  for  use  as  keywords
  (that is, they are unconditionally treated as keywords in phase 7):

  _________________________
  8) On systems in which linkers cannot accept extended characters, an
  encoding of the universal-character-name may be used in forming valid
  external identifiers.  For example, some otherwise unused character or
  sequence of characters may be used to encode the \u in a
  universal-character-name.  Extended characters may produce a long
  external identifier, but C++ does not place a translation limit on
  significant characters for external identifiers.  In C++, upper- and
  lower-case letters are considered different for all identifiers,
  including external identifiers.

                            Table 3--keywords

  +--------------------------------------------------------------------------+
  |asm          do             inline             short         typeid       |
  |auto         double         int                signed        typename     |
  |bool         dynamic_cast   long               sizeof        union        |
  |break        else           mutable            static        unsigned     |
  |case         enum           namespace          static_cast   using        |
  |catch        explicit       new                struct        virtual      |
  |char         extern         operator           switch        void         |
  |class        false          private            template      volatile     |
  |const        float          protected          this          wchar_t      |
  |const_cast   for            public             throw         while        |
  |continue     friend         register           true                       |
  |default      goto           reinterpret_cast   try                        |
  |delete       if             return             typedef                    |
  +--------------------------------------------------------------------------+

2 Furthermore, the alternative representations shown in Table 4 for
  certain operators and punctuators (_lex.digraph_) are reserved and
  shall not be used otherwise:

                   Table 4--alternative representations

            +------------------------------------------------+
            |and      and_eq   bitand   bitor   compl    not |
            |not_eq   or       or_eq    xor     xor_eq       |
            +------------------------------------------------+

  2.12  Operators and punctuators                        [lex.operators]

1 The lexical representation of C++ programs includes a number of
  preprocessing tokens which are used in the syntax of the preprocessor
  or are converted into tokens for operators and punctuators:
          preprocessing-op-or-punc: one of
          {       }       [       ]       #       ##      (       )
          <:      :>      <%      %>      %:      %:%:    ;       :       ...
          new     delete  ?       ::      .       .*
          +       -       *       /       %       ^       &       |       ~
          !       =       <       >       +=      -=      *=      /=      %=
          ^=      &=      |=      <<      >>      >>=     <<=     ==      !=
          <=      >=      &&      ||      ++      --      ,       ->*     ->
          and     and_eq  bitand  bitor   compl   not     not_eq  or      or_eq
          xor     xor_eq

  Each preprocessing-op-or-punc is converted to a single token in
  translation phase 7 (_lex.phases_).

  2.13  Literals                                           [lex.literal]

1 There are several kinds of literals.9)
          literal:
                  integer-literal
                  character-literal
                  floating-literal
                  string-literal
                  boolean-literal

  2.13.1  Integer literals                                    [lex.icon]
          integer-literal:
                  decimal-literal integer-suffixopt
                  octal-literal integer-suffixopt
                  hexadecimal-literal integer-suffixopt
          decimal-literal:
                  nonzero-digit
                  decimal-literal digit
          octal-literal:
                  0
                  octal-literal octal-digit
          hexadecimal-literal:
                  0x hexadecimal-digit
                  0X hexadecimal-digit
                  hexadecimal-literal hexadecimal-digit
          nonzero-digit: one of
                  1  2  3  4  5  6  7  8  9
          octal-digit: one of
                  0  1  2  3  4  5  6  7
          hexadecimal-digit: one of
                  0  1  2  3  4  5  6  7  8  9
                  a  b  c  d  e  f
                  A  B  C  D  E  F
          integer-suffix:
                  unsigned-suffix long-suffixopt
                  long-suffix unsigned-suffixopt
          unsigned-suffix: one of
                  u  U
          long-suffix: one of
                  l  L

1 An integer literal is a sequence of digits that has no period or
  exponent part.  An integer literal may have a prefix that specifies
  its base and a suffix that specifies its type.  The lexically first
  digit of the sequence of digits is the most significant.  A decimal
  integer literal (base ten) begins with a digit other than 0 and
  consists of a sequence of decimal digits.  An octal integer literal
  (base eight) begins with the digit 0 and consists of a sequence of
  octal digits.10)  A hexadecimal integer literal (base sixteen) begins
  with 0x or 0X and consists of a sequence of hexadecimal digits, which
  include the decimal digits and the letters a or A through f or F with
  decimal values ten through fifteen.  [Example: the number twelve can
  be written 12, 014, or 0XC.  ]
  _________________________
  9) The term "literal" generally designates, in this International
  Standard, those tokens that are called "constants" in ISO C.
  10) The digits 8 and 9 are not octal digits.

2 The type of an integer literal depends on its form, value, and suffix.
  If it is decimal and has no suffix, it has the first of these types in
  which its value can be represented: int, long int, unsigned long
  int.11)  If it is octal or hexadecimal and has no suffix, it has the
  first of these types in which its value can be represented: int,
  unsigned int, long int, unsigned long int.  If it is suffixed by u or
  U, its type is the first of these types in which its value can be
  represented: unsigned int, unsigned long int.  If it is suffixed by l
  or L, its type is the first of these types in which its value can be
  represented: long int, unsigned long int.  If it is suffixed by ul,
  lu, uL, Lu, Ul, lU, UL, or LU, its type is unsigned long int.

3 A  program  is  ill-formed if one of its translation units contains an
  integer literal that cannot be  represented  by  any  of  the  allowed
  types.
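
  The effect of base prefixes and suffixes can be observed with
  overload resolution.  This is a sketch using a small value that fits
  in every candidate type; the overload set kind and the checking
  function are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Overloads that report which type an argument has.
inline const char* kind(int)           { return "int"; }
inline const char* kind(unsigned int)  { return "unsigned int"; }
inline const char* kind(long)          { return "long int"; }
inline const char* kind(unsigned long) { return "unsigned long int"; }

inline void check_integer_literals() {
    assert(12 == 014 && 014 == 0XC);  // twelve in base ten, eight, sixteen
    assert(std::strcmp(kind(12), "int") == 0);
    assert(std::strcmp(kind(12u), "unsigned int") == 0);
    assert(std::strcmp(kind(12L), "long int") == 0);
    assert(std::strcmp(kind(12UL), "unsigned long int") == 0);
    assert(std::strcmp(kind(12lu), "unsigned long int") == 0);
}
```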

  2.13.2  Character literals                                  [lex.ccon]
          character-literal:
                  'c-char-sequence'
                  L'c-char-sequence'
          c-char-sequence:
                  c-char
                  c-char-sequence c-char
          c-char:
                  any member of the source character set except
                          the single-quote ', backslash \, or new-line character
                  escape-sequence
                  universal-character-name
          escape-sequence:
                  simple-escape-sequence
                  octal-escape-sequence
                  hexadecimal-escape-sequence
          simple-escape-sequence: one of
                  \'  \"  \?  \\
                  \a  \b  \f  \n  \r  \t  \v
          octal-escape-sequence:
                  \ octal-digit
                  \ octal-digit octal-digit
                  \ octal-digit octal-digit octal-digit
          hexadecimal-escape-sequence:
                  \x hexadecimal-digit
                  hexadecimal-escape-sequence hexadecimal-digit

  _________________________
  11) A decimal integer literal with no suffix never has type unsigned
  int.  Otherwise, for example, on an implementation where unsigned int
  values have 16 bits and unsigned long values have strictly more than
  17 bits, we would have -30000<0, -50000>0 (because 50000 would have
  type unsigned int), and -70000<0 (because 70000 would have type long).

1 A character literal is one or more characters enclosed in single
  quotes, as in 'x', optionally preceded by the letter L, as in L'x'.
  Single character literals that do not begin with L have type char,
  with value equal to the numerical value of the character in the
  execution character set.  Multicharacter literals that do not begin
  with L have type int and implementation-defined value.

2 A character literal that begins with the letter L, such as L'ab', is a
  wide-character literal.  Wide-character literals have type wchar_t.12)
  Wide-character literals have implementation-defined values, regardless
  of the number of characters in the literal.

3 Certain nongraphic characters, the single quote ', the double quote ",
  the question mark ?, and the backslash \, can be represented according
  to Table 5.

                        Table 5--escape sequences

                   +----------------------------------+
                   |new-line          NL (LF)   \n    |
                   |horizontal tab    HT        \t    |
                   |vertical tab      VT        \v    |
                   |backspace         BS        \b    |
                   |carriage return   CR        \r    |
                   |form feed         FF        \f    |
                   |alert             BEL       \a    |
                   |backslash         \         \\    |
                   |question mark     ?         \?    |
                   |single quote      '         \'    |
                   |double quote      "         \"    |
                   |octal number      ooo       \ooo  |
                   |hex number        hhh       \xhhh |
                   +----------------------------------+
  The double quote " and the question mark ? can be represented as
  themselves or by the escape sequences \" and \? respectively, but the
  single quote ' and the backslash \ shall be represented by the escape
  sequences \' and \\ respectively.  If the character following a
  backslash is not one of those specified, the behavior is undefined.
  An escape sequence specifies a single character.

4 The escape \ooo consists of the backslash followed by one, two, or
  three octal digits that are taken to specify the value of the desired
  character.  The escape \xhhh consists of the backslash followed by x
  followed by one or more hexadecimal digits that are taken to specify
  the value of the desired character.  There is no limit to the number
  of digits in a hexadecimal sequence.  A sequence of octal or
  hexadecimal digits is terminated by the first character that is not an
  octal digit or a hexadecimal digit, respectively.  The value of a
  character literal is implementation-defined if it falls outside of the
  implementation-defined range defined for char (for ordinary literals)
  or wchar_t (for wide literals).
  _________________________
  12) They are intended for character sets where a character does not
  fit into a single byte.

5 A universal-character-name is translated to the encoding, in the
  execution character set, of the character named.  If there is no such
  encoding, the universal-character-name is translated to an
  implementation-defined encoding.  [Note: in translation phase 1, a
  universal-character-name is introduced whenever an actual extended
  character is encountered in the source text.  Therefore, all extended
  characters are described in terms of universal-character-names.
  However, the actual compiler implementation may use its own native
  character set, so long as the same results are obtained.  ]
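
  The escape sequences of Table 5 and the octal and hexadecimal forms
  described above can be checked with a few comparisons.  This is a
  sketch; check_escapes is a hypothetical name.

```cpp
#include <cassert>

inline void check_escapes() {
    assert('\101' == '\x41');   // octal 101 and hexadecimal 41 are both 65
    assert('\x41' == 65);
    assert('\?' == '?');        // the escape is optional for ?
    assert('\"' == '"');        // and for " inside a character literal
    assert(sizeof('x') == 1);   // an ordinary character literal has type char
    assert(sizeof(L'x') == sizeof(wchar_t));
}
```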

  2.13.3  Floating literals                                   [lex.fcon]
          floating-literal:
                  fractional-constant exponent-partopt floating-suffixopt
                  digit-sequence exponent-part floating-suffixopt
          fractional-constant:
                  digit-sequenceopt . digit-sequence
                  digit-sequence .
          exponent-part:
                  e signopt digit-sequence
                  E signopt digit-sequence
          sign: one of
                  +  -
          digit-sequence:
                  digit
                  digit-sequence digit
          floating-suffix: one of
                  f  l  F  L

1 A floating literal consists of an integer part, a decimal point, a
  fraction part, an e or E, an optionally signed integer exponent, and
  an optional type suffix.  The integer and fraction parts both consist
  of a sequence of decimal (base ten) digits.  Either the integer part
  or the fraction part (not both) can be omitted; either the decimal
  point or the letter e (or E) and the exponent (not both) can be
  omitted.  The integer part, the optional decimal point and the
  optional fraction part form the significant part of the floating
  literal.  The exponent, if present, indicates the power of 10 by which
  the significant part is to be scaled.  If the scaled value is in the
  range of representable values for its type, the result is either the
  nearest representable value, or the larger or smaller representable
  value immediately adjacent to the nearest representable value, chosen
  in an implementation-defined manner.  The type of a floating literal
  is double unless explicitly specified by a suffix.  The suffixes f and
  F specify float, the suffixes l and L specify long double.  If the
  scaled value is not in the range of representable values for its type,
  the program is ill-formed.
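
  The optional parts and the suffixes can be sketched as follows.  The
  values used are exactly representable in binary floating point, and
  the comparisons assume that the implementation converts such literals
  exactly.  The overload set kind and the checking function are
  hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Overloads that report which floating type an argument has.
inline const char* kind(float)       { return "float"; }
inline const char* kind(double)      { return "double"; }
inline const char* kind(long double) { return "long double"; }

inline void check_floating_literals() {
    assert(1e3 == 1000.0);   // the exponent scales by a power of 10
    assert(.5 == 0.5);       // integer part omitted
    assert(2. == 2.0);       // fraction part omitted
    assert(25e-2 == .25);    // decimal point omitted, exponent present
    assert(std::strcmp(kind(1.0), "double") == 0);
    assert(std::strcmp(kind(1.0f), "float") == 0);
    assert(std::strcmp(kind(1.0L), "long double") == 0);
}
```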

  2.13.4  String literals                                   [lex.string]
          string-literal:
                  "s-char-sequenceopt"
                  L"s-char-sequenceopt"
          s-char-sequence:
                  s-char
                  s-char-sequence s-char
          s-char:
                  any member of the source character set except
                          the double-quote ", backslash \, or new-line character
                  escape-sequence
                  universal-character-name

1 A  string  literal  is  a  sequence  of  characters  (as  defined   in
  _lex.ccon_) surrounded by double quotes, optionally beginning with the
  letter L, as in "..." or L"...".  A string literal that does not begin
  with  L  has  type  "array  of  n  char"  and  static storage duration
  (_basic.stc_), where n is the size of the string as defined below, and
  is  initialized  with  the  given  characters.   A string literal that
  begins with L, such as L"asdf", is a  wide  string  literal.   A  wide
  string  literal  has  type "array of n wchar_t" and has static storage
  duration, where n is the size of the string as defined below,  and  is
  initialized with the given characters.

2 Whether  all  string  literals  are  distinct  (that is, are stored in
  nonoverlapping objects)  is  implementation-defined.   The  effect  of
  attempting to modify a string literal is undefined.

3 In  translation  phase  6 (_lex.phases_), adjacent string literals are
  concatenated and adjacent wide string literals are concatenated.  If a
  string  literal  token is adjacent to a wide string literal token, the
  behavior is undefined.  Characters in concatenated  strings  are  kept
  distinct.  [Example:
          "\xA" "B"
  contains the two characters '\xA' and 'B' after concatenation (and not
  the single hexadecimal character '\xAB').  ]

4 After any necessary concatenation, in translation phase 7
  (_lex.phases_), '\0' is appended to every string literal so that
  programs that scan a string can find its end.

5 Escape sequences and universal-character-names in string literals have
  the same meaning as in character literals (_lex.ccon_), except that
  the single quote ' is representable either by itself or by the escape
  sequence \', and the double quote " shall be preceded by a \.  In a
  non-wide string literal, a universal-character-name may map to more
  than one char element.  The size of a wide string literal is the total
  number of escape sequences, universal-character-names, and other
  characters, plus one for the terminating L'\0'.  The size of a
  non-wide string literal is the total number of escape sequences and
  other characters, plus at least one for the multibyte encoding of each
  universal-character-name, plus one for the terminating '\0'.
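
  The size rules and the concatenation example can be checked with
  sizeof.  This is a sketch; check_string_sizes is a hypothetical name.

```cpp
#include <cassert>

inline void check_string_sizes() {
    assert(sizeof("abc") == 4);            // three characters plus '\0'
    assert(sizeof("\xA" "B") == 3);        // '\xA', 'B', '\0'; not '\xAB'
    assert(sizeof("") == 1);               // only the terminating '\0'
    assert(sizeof(L"ab") == 3 * sizeof(wchar_t));
    const char* s = "\xA" "B";
    assert(s[0] == '\xA' && s[1] == 'B' && s[2] == '\0');
}
```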

  2.13.5  Boolean literals                                    [lex.bool]
          boolean-literal:
                  false
                  true

1 The Boolean literals are the keywords false and true.   Such  literals
  have type bool.  They are not lvalues.