______________________________________________________________________

1 General  [intro]
______________________________________________________________________

1.1 Scope  [intro.scope]

1 This International Standard specifies requirements for processors of the C++ programming language. The first such requirement is that they implement the language, and so this Standard also defines C++. Other requirements and relaxations of the first requirement appear at various places within the Standard.

2 C++ is a general purpose programming language based on the C programming language as described in ISO/IEC 9899 (1.2). In addition to the facilities provided by C, C++ provides additional data types, classes, templates, exceptions, inline functions, operator overloading, function name overloading, references, free store management operators, function argument checking and type conversion, and additional library facilities. These extensions to C are summarized in C.1. The differences between C++ and ISO C1) are summarized in C.2. The extensions to C++ since 1985 are summarized in C.1.2.

1.2 Normative references  [intro.refs]

1 The following standards contain provisions which, through reference in this text, constitute provisions of this International Standard. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent editions of the standards indicated below. Members of IEC and ISO maintain registers of currently valid International Standards.

- ANSI X3/TR-1-82:1982, American National Dictionary for Information Processing Systems.

- ISO/IEC 9899:1990, C Standard

- ISO/IEC xxxx:199x Amendment 1 to C Standard

+------- BEGIN BOX 1 -------+
This last title must be filled in when Amendment 1 is approved. The other titles have not been checked for accuracy.
+------- END BOX 1 -------+

1-2 General DRAFT: 25 January 1994 1.3 Definitions

1.3 Definitions  [intro.defs]

1 For the purposes of this International Standard, the definitions given in ANSI X3/TR-1-82 and the following definitions apply.

- argument: An expression in the comma-separated list bounded by the parentheses in a function call expression, a sequence of preprocessing tokens in the comma-separated list bounded by the parentheses in a function-like macro invocation, the operand of throw, or an expression in the comma-separated list bounded by the angle brackets in a template instantiation. Also known as an actual argument or actual parameter.

- diagnostic message: A message belonging to an implementation-defined subset of the implementation's message output.

- dynamic type: The dynamic type of an expression is determined by its current value and may change during the execution of a program. If a pointer (8.3.1) whose static type is pointer to class B is pointing to an object of class D, derived from B (10), the dynamic type of the pointer is pointer to D. References (8.3.2) are treated similarly.

- implementation-defined behavior: Behavior, for a correct program construct and correct data, that depends on the implementation and that each implementation shall document. The range of possible behaviors is delineated by the standard.

- implementation limits: Restrictions imposed upon programs by the implementation.

- locale-specific behavior: Behavior that depends on local conventions of nationality, culture, and language that each implementation shall document.

- multibyte character: A sequence of one or more bytes representing a member of the extended character set of either the source or the execution environment. The extended character set is a superset of the basic character set.

- parameter: An object or reference declared as part of a function declaration or definition or the catch clause of an exception handler that acquires a value on entry to the function or handler, an identifier from the comma-separated list bounded by the parentheses immediately following the macro name in a function-like macro definition, or a template-parameter. A function may be said to take arguments or to have parameters. Parameters are also known as formal arguments or formal parameters.

- signature: The signature of a function is the information about that function that participates in overload resolution (13.2): the types of its parameters and, if the function is a non-static member of a class, the CV-qualifiers (if any) on the function itself and whether the function is a direct member of its class or inherited from a base class.

- static type: The static type of an expression is the type (3.8) resulting from analysis of the program without consideration of execution semantics. It depends only on the form of the program and does not change.

- undefined behavior: Behavior, upon use of an erroneous program construct, of erroneous data, or of indeterminately valued objects, for which the standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Note that many erroneous program constructs do not engender undefined behavior. They are required to be diagnosed.

- unspecified behavior: Behavior, for a correct program construct and correct data, that depends on the implementation.
  The range of possible behaviors is delineated by the standard. The implementation is not required to document which behavior occurs.

1.4 Syntax notation  [syntax]

1 In the syntax notation used in this manual, syntactic categories are indicated by italic type, and literal words and characters in constant width type. Alternatives are listed on separate lines except in a few cases where a long set of alternatives is presented on one line, marked by the phrase one of. An optional terminal or nonterminal symbol is indicated by the subscript opt, so

        { expressionopt }

indicates an optional expression enclosed in braces.

2 Names for syntactic categories have generally been chosen according to the following rules:

- X-name is a use of an identifier in a context that determines its meaning (e.g. class-name, typedef-name).

- X-id is an identifier with no context-dependent meaning (e.g. qualified-id).

- X-seq is one or more X's without intervening delimiters (e.g. declaration-seq is a sequence of declarations).

- X-list is one or more X's separated by intervening commas (e.g. expression-list is a sequence of expressions separated by commas).

__________________________
1) Function signatures do not include return type, because that does not participate in overload resolution.

1.5 The C++ memory model  [intro.memory]

1 The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set and is composed of a contiguous sequence of bits, the number of which is implementation-defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit. The memory accessible to a C++ program is comprised of one or more contiguous sequences of bytes. Each byte (except perhaps registers) has a unique address.
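
As an informal illustration of the byte as the fundamental storage unit, the following sketch queries the implementation-defined number of bits in a byte. It assumes the CHAR_BIT macro from the ISO C limits header, which this subclause does not name, and the helper names bits_per_byte, bits_in_object, and adjacent_chars_are_one_byte_apart are hypothetical.

```cpp
#include <cassert>
#include <climits>   // CHAR_BIT (assumption: inherited from ISO C)
#include <cstddef>

// Number of bits in a byte on this implementation (implementation-defined).
inline int bits_per_byte() { return CHAR_BIT; }

// Total number of bits occupied by an object of type T.
template <class T>
std::size_t bits_in_object() { return sizeof(T) * CHAR_BIT; }

// Each byte has its own address: adjacent char elements of an array are
// exactly one byte apart and their addresses differ.
inline bool adjacent_chars_are_one_byte_apart() {
    char buf[2] = {'a', 'b'};
    return (&buf[1] - &buf[0]) == 1 && &buf[0] != &buf[1];
}
```

A caller might compare bits_in_object<int>() across implementations to see how the parameter varies; the text above fixes no particular value.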

2 The constructs in a C++ program create, refer to, access, and manipulate objects in memory. Each object (except bit-fields) occupies one or more contiguous bytes. Objects are created by definitions (3.1) and new-expressions (5.3.4). Each object has a type determined by the construct that creates it. The type in turn determines the number of bytes that the object occupies and the interpretation of their contents. Objects may contain other objects, called sub-objects (9.2, 10). An object that is not a sub-object of any other object is called a complete object. For every object x, there is some object called the complete object of x, determined as follows:

- If x is a complete object, then x is the complete object of x.

- Otherwise, the complete object of x is the complete object of the (unique) object that contains x.

3 C++ provides a variety of built-in types and several ways of composing new types from existing types.

4 Certain types have alignment restrictions. An object of one of those types may appear only at an address that is divisible by a particular integer.

1.6 Processor compliance  [intro.compliance]

1 Every conforming C++ processor shall, within its resource limits, accept and correctly execute well-formed C++ programs, and shall issue at least one diagnostic error message when presented with any ill-formed program that contains a violation of any rule that is identified as diagnosable in this Standard or of any syntax rule, except as noted herein.

2 Well-formed C++ programs are those that are constructed according to the syntax rules, semantic rules identified as diagnosable, and the One Definition Rule (3.1). If a program is not well-formed but does not contain any diagnosable errors, this Standard places no requirement on processors with respect to that program.

1.7 Program execution  [intro.execution]

1 The semantic descriptions in this Standard define a parameterized nondeterministic abstract machine. This Standard places no requirement on the structure of conforming processors. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming processors are required to emulate (only) the observable behavior of the abstract machine as explained below.

2 Certain aspects and operations of the abstract machine are described in this Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects, which documentation defines the instance of the abstract machine that corresponds to that implementation (referred to as the ``corresponding instance'' below).

3 Certain other aspects and operations of the abstract machine are described in this Standard as unspecified (for example, order of evaluation of arguments to a function). In each case the Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine. An instance of the abstract machine may thus have more than one possible execution sequence for a given program and a given input.

4 Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer).

5 A conforming processor executing a well-formed program shall produce the same observable behavior as one of the possible execution sequences of the corresponding instance of the abstract machine with the same program and the same input.
However, if any such execution sequence contains an undefined operation, this Standard places no requirement on the processor executing that program with that input (not even with regard to operations previous to the first undefined operation).

6 The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.2)

__________________________
2) An implementation can offer additional library I/O functions as an extension. Implementations that do so should treat calls to those functions as ``observable behavior'' as well.

______________________________________________________________________

2 Lexical conventions  [lex]
______________________________________________________________________

1 A C++ program need not all be translated at the same time. The text of the program is kept in units called source files in this standard. A source file together with all the headers (17.1.2) and source files included (16.2) via the preprocessing directive #include, less any source lines skipped by any of the conditional inclusion (16.1) preprocessing directives, is called a translation unit. Previously translated translation units may be preserved individually or in libraries. The separate translation units of a program communicate (3.4) by (for example) calls to functions whose identifiers have external linkage, manipulation of objects whose identifiers have external linkage, or manipulation of data files. Translation units may be separately translated and then later linked to produce an executable program (3.4).

2.1 Phases of translation  [lex.phases]

1 The precedence among the syntax rules of translation is specified by the following phases.3)

    1 Physical source file characters are mapped to the source character set (introducing new-line characters for end-of-line indicators) if necessary. Trigraph sequences (2.2) are replaced by corresponding single-character internal representations.

    2 Each instance of a new-line character and an immediately preceding backslash character is deleted, splicing physical source lines to form logical source lines. A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character.

    3 The source file is decomposed into preprocessing tokens (2.3) and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or comment. Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined. The process of dividing a source file's characters into preprocessing tokens is context-dependent. For example, see the handling of < within a #include preprocessing directive.

__________________________
3) Implementations must behave as if these separate phases occur, although in practice different phases may be folded together.

    4 Preprocessing directives are executed and macro invocations are expanded. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively.

    5 Each source character set member and escape sequence in character constants and string literals is converted to a member of the execution character set.

    6 Adjacent character string literal tokens are concatenated and adjacent wide string literal tokens are concatenated.

    7 White-space characters separating tokens are no longer significant. Each preprocessing token is converted into a token. (See 2.4). The resulting tokens are syntactically and semantically analyzed and translated. The result of this process starting from a single source file is called a translation unit.

    8 The translation units that will form a program are combined.
      All external object and function references are resolved.

+------- BEGIN BOX 2 -------+
What about shared libraries?
+------- END BOX 2 -------+

      Library components are linked to satisfy external references to functions and objects not defined in the current translation. All such translator output is collected into a program image which contains information needed for execution in its execution environment.

2.2 Trigraph sequences  [lex.trigraph]

1 Before any other processing takes place, each occurrence of one of the following sequences of three characters (trigraph sequences) is replaced by the single character indicated in Table 1.

Table 1-trigraph sequences

    trigraph  replacement    trigraph  replacement    trigraph  replacement
    ----------------------------------------------------------------------
    ??=       #              ??(       [              ??<       {
    ??/       \              ??)       ]              ??>       }
    ??'       ^              ??!       |              ??-       ~

2 For example,

        ??=define arraycheck(a,b) a??(b??) ??!??! b??(a??)

becomes

        #define arraycheck(a,b) a[b] || b[a]

2.3 Preprocessing tokens  [lex.pptoken]

preprocessing-token:
        header-name
        identifier
        pp-number
        character-constant
        string-literal
        operator
        digraph
        punctuator
        each non-white-space character that cannot be one of the above

1 Each preprocessing token that is converted to a token (2.5) shall have the lexical form of a keyword, an identifier, a constant, a string literal, an operator, a digraph, or a punctuator.

2 A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6.
The categories of preprocessing token are: header names, identifiers, preprocessing numbers, character constants, string literals, operators, punctuators, digraphs, and single non-white-space characters that do not lexically match the other preprocessing token categories. If a ' or a " character matches the last category, the behavior is undefined. Preprocessing tokens can be separated by white space; this consists of comments (2.6), or white-space characters (space, horizontal tab, new-line, vertical tab, and form-feed), or both. As described in Clause 16, in certain circumstances during translation phase 4, white space (or the absence thereof) serves as more than preprocessing token separation. White space may appear within a preprocessing token only as part of a header name or between the quotation characters in a character constant or string literal.

3 If the input stream has been parsed into preprocessing tokens up to a given character, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token.

4 The program fragment 1Ex is parsed as a preprocessing number token (one that is not a valid floating or integer constant token), even though a parse as the pair of preprocessing tokens 1 and Ex might produce a valid expression (for example, if Ex were a macro defined as +1). Similarly, the program fragment 1E1 is parsed as a preprocessing number (one that is a valid floating constant token), whether or not E is a macro name.

5 The program fragment x+++++y is parsed as x ++ ++ + y, which violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression.

2.4 Digraph sequences  [lex.digraph]

1 Alternate representations are provided for the operators and punctuators whose primary representations use the national characters. These include digraphs and additional reserved words.

digraph:
        <%
        %>
        <:
        :>
        %%

2 In translation phase 3 (2.1) the digraphs are recognized as preprocessing tokens. Then in translation phase 7 the digraphs and the additional identifiers listed below are converted into tokens identical to those from the corresponding primary representations, as shown in Table 2.

Table 2-identifiers that are treated as operators

    alternate  primary    alternate  primary    alternate  primary
    --------------------------------------------------------------
    <%         {          and        &&         and_eq     &=
    %>         }          bitor      |          or_eq      |=
    <:         [          or         ||         xor_eq     ^=
    :>         ]          xor        ^          not        !
    %%         #          compl      ~          not_eq     !=
    bitand     &

2.5 Tokens  [lex.token]

token:
        identifier
        keyword
        literal
        operator
        punctuator

1 There are five kinds of tokens: identifiers, keywords, literals (which include strings and character and numeric constants), operators, and other separators. Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments (collectively, white space), as described below, are ignored except as they serve to separate tokens. Some white space is required to separate otherwise adjacent identifiers, keywords, and literals.

2 If the input stream has been parsed into tokens up to a given character, the next token is taken to be the longest string of characters that could possibly constitute a token.
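
The longest-match rule can be observed in ordinary expressions. The sketch below is illustrative only, and the function names munch_sum and munch_diff are hypothetical. Under the rule, x+++y is tokenized as x ++ + y rather than x + ++ y, so x is post-incremented and y is left alone.

```cpp
#include <cassert>

// x+++y is tokenized as x ++ + y: the longest token "++" is taken first.
// The sum uses the old value of x; x is then incremented and y is unchanged.
inline int munch_sum(int& x, int& y) {
    return x+++y;
}

// a---b is likewise tokenized as a -- - b.
inline int munch_diff(int& a, int& b) {
    return a---b;
}
```

Writing the expression with explicit white space, as in x++ + y, gives the same tokens and is usually easier to read.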

2.6 Comments  [lex.comment]

1 The characters /* start a comment, which terminates with the characters */. These comments do not nest. The characters // start a comment, which terminates at the next new-line character. If there is a form-feed or a vertical-tab character in such a comment, only white-space characters may appear between it and the new-line that terminates the comment; no diagnostic is required. The comment characters //, /*, and */ have no special meaning within a // comment and are treated just like other characters. Similarly, the comment characters // and /* have no special meaning within a /* comment.

2.7 Identifiers  [lex.name]

identifier:
        nondigit
        identifier nondigit
        identifier digit

nondigit: one of
        _ a b c d e f g h i j k l m n o p q r s t u v w x y z
        A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

digit: one of
        0 1 2 3 4 5 6 7 8 9

1 An identifier is an arbitrarily long sequence of letters and digits. The first character must be a letter; the underscore _ counts as a letter. Upper- and lower-case letters are different. All characters are significant.
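
A brief sketch of the identifier rules follows. The function name case_demo and the variable names are hypothetical, chosen only to show that case is significant and that the underscore counts as a letter.

```cpp
#include <cassert>

// Upper- and lower-case letters are different, so total, Total, and _total
// are three distinct identifiers naming three distinct objects.
inline int case_demo() {
    int total = 1;
    int Total = 2;
    int _total = 3;   // the underscore counts as a letter and may come first
    return total * 100 + Total * 10 + _total;
}
```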

2.8 Keywords  [lex.key]

1 The identifiers shown in Table 3 are reserved for use as keywords, and may not be used otherwise in phases 7 and 8:

Table 3-keywords

    asm         delete        if         reinterpret_cast  true
    auto        do            inline     return            try
    bool        double        int        short             typedef
    break       dynamic_cast  long       signed            typeid
    case        else          mutable    sizeof            union
    catch       enum          namespace  static            unsigned
    char        extern        new        static_cast       using
    class       false         operator   struct            virtual
    const       float         private    switch            void
    const_cast  for           protected  template          volatile
    continue    friend        public     this              wchar_t
    default     goto          register   throw             while

2 Furthermore, the alternate representations shown in Table 4 for certain operators and punctuators (2.4) are reserved and may not be used otherwise:

Table 4-alternate representations

    bitand  and    bitor   or   xor     compl
    and_eq  or_eq  xor_eq  not  not_eq

3 In addition, identifiers containing a double underscore (__) are reserved for use by C++ implementations and standard libraries and should be avoided by users; no diagnostic is required.

4 The ASCII representation of C++ programs uses as operators or for punctuation the characters shown in Table 5.

Table 5-operators and punctuation characters

    !  %  ^  &  *  (  )  -  +  _  {  }  |  ~
    [  ]  \  ;  '  :  "  <  >  ?  ,  .  /

Table 6 shows the character combinations that are used as operators.

Table 6-character combinations used as operators

    ->  ++  --  .*  ->*  <<   >>   <=  >=  ==  !=  &&
    ||  *=  /=  %=  +=   -=   <<=  >>=  &=  ^=  |=  ::

Each is converted to a single token in translation phase 7 (2.1).

5 Table 7 shows character combinations that are used as alternative representations for certain operators and punctuators (2.4).

Table 7-digraphs

    <%  %>  <:  :>  %%

Each of these is also recognized as a single token in translation phases 3 and 7.

6 Table 8 shows additional tokens that are used by the preprocessor.

Table 8-preprocessing tokens

    #  ##  %%  %%%%

7 Certain implementation-dependent properties, such as the type of a sizeof (5.3.3) and the ranges of fundamental types (3.8.1), are defined in the standard header files (16.2) These headers are part of the ISO C standard. In addition the headers define the types of the most basic library functions. The last two headers are part of the ISO C standard; is C++ specific.

2.9 Literals  [lex.literal]

1 There are several kinds of literals (often referred to as constants).

literal:
        integer-literal
        character-literal
        floating-literal
        string-literal
        boolean-literal

2.9.1 Integer literals  [lex.icon]

integer-literal:
        decimal-literal integer-suffixopt
        octal-literal integer-suffixopt
        hexadecimal-literal integer-suffixopt

decimal-literal:
        nonzero-digit
        decimal-literal digit

octal-literal:
        0
        octal-literal octal-digit

hexadecimal-literal:
        0x hexadecimal-digit
        0X hexadecimal-digit
        hexadecimal-literal hexadecimal-digit

nonzero-digit: one of
        1 2 3 4 5 6 7 8 9

octal-digit: one of
        0 1 2 3 4 5 6 7

hexadecimal-digit: one of
        0 1 2 3 4 5 6 7 8 9
        a b c d e f
        A B C D E F

integer-suffix:
        unsigned-suffix long-suffixopt
        long-suffix unsigned-suffixopt

unsigned-suffix: one of
        u U

long-suffix: one of
        l L

1 An integer literal consisting of a sequence of digits is taken to be decimal (base ten) unless it begins with 0 (digit zero). A sequence of digits starting with 0 is taken to be an octal integer (base eight). The digits 8 and 9 are not octal digits. A sequence of digits preceded by 0x or 0X is taken to be a hexadecimal integer (base sixteen). The hexadecimal digits include a or A through f or F with decimal values ten through fifteen. For example, the number twelve can be written 12, 014, or 0XC.

2 The type of an integer literal depends on its form, value, and suffix. If it is decimal and has no suffix, it has the first of these types in which its value can be represented: int, long int, unsigned long int. If it is octal or hexadecimal and has no suffix, it has the first of these types in which its value can be represented: int, unsigned int, long int, unsigned long int. If it is suffixed by u or U, its type is the first of these types in which its value can be represented: unsigned int, unsigned long int.
If it is suffixed by l or L, its type is the first of these types in which its value can be represented: long int, unsigned long int. If it is suffixed by ul, lu, uL, Lu, Ul, lU, UL, or LU, its type is unsigned long int.

3 A program is ill-formed if it contains an integer literal that cannot be represented by any of the allowed types.

2.9.2 Character literals  [lex.ccon]

character-literal:
        'c-char-sequence'
        L'c-char-sequence'

c-char-sequence:
        c-char
        c-char-sequence c-char

c-char:
        any member of the source character set except
                the single-quote ', backslash \, or new-line character
        escape-sequence

escape-sequence:
        simple-escape-sequence
        octal-escape-sequence
        hexadecimal-escape-sequence

simple-escape-sequence: one of
        \'  \"  \?  \\
        \a  \b  \f  \n  \r  \t  \v

octal-escape-sequence:
        \ octal-digit
        \ octal-digit octal-digit
        \ octal-digit octal-digit octal-digit

hexadecimal-escape-sequence:
        \x hexadecimal-digit
        hexadecimal-escape-sequence hexadecimal-digit

1 A character literal is one or more characters enclosed in single quotes, as in 'x', optionally preceded by the letter L, as in L'x'. Single character literals that do not begin with L have type char, with value equal to the numerical value of the character in the machine's character set. Multicharacter literals that do not begin with L have type int and implementation-defined value.

2 A character literal that begins with the letter L, such as L'ab', is a wide-character literal. Wide-character literals have type wchar_t. They are intended for character sets where a character does not fit into a single byte.

3 Certain nongraphic characters, the single quote ', the double quote ", ?, and the backslash \, may be represented according to Table 9.

Table 9-escape sequences

    new-line         NL (LF)  \n
    horizontal tab   HT       \t
    vertical tab     VT       \v
    backspace        BS       \b
    carriage return  CR       \r
    form feed        FF       \f
    alert            BEL      \a
    backslash        \        \\
    question mark    ?        \?
    single quote     '        \'
    double quote     "        \"
    octal number     ooo      \ooo
    hex number       hhh      \xhhh

If the character following a backslash is not one of those specified, the behavior is undefined. An escape sequence specifies a single character.

4 The escape \ooo consists of the backslash followed by one, two, or three octal digits that are taken to specify the value of the desired character. The escape \xhhh consists of the backslash followed by x followed by a sequence of hexadecimal digits that are taken to specify the value of the desired character. There is no limit to the number of hexadecimal digits in the sequence. A sequence of octal or hexadecimal digits is terminated by the first character that is not an octal digit or a hexadecimal digit, respectively. The value of a character literal is implementation dependent if it exceeds that of the largest char.

2.9.3 Floating literals  [lex.fcon]

floating-constant:
        fractional-constant exponent-partopt floating-suffixopt
        digit-sequence exponent-part floating-suffixopt

fractional-constant:
        digit-sequenceopt . digit-sequence
        digit-sequence .

exponent-part:
        e signopt digit-sequence
        E signopt digit-sequence

sign: one of
        + -

digit-sequence:
        digit
        digit-sequence digit

floating-suffix: one of
        f l F L

1 A floating literal consists of an integer part, a decimal point, a fraction part, an e or E, an optionally signed integer exponent, and an optional type suffix. The integer and fraction parts both consist of a sequence of decimal (base ten) digits.
Either the integer part or the fraction part (not both) may be missing; either the decimal point or the letter e (or E) and the exponent (not both) may be missing. The type of a floating literal is double unless explicitly specified by a suffix. The suffixes f and F specify float, the suffixes l and L specify long double.

2.9.4 String literals  [lex.string]

string-literal:
        "s-char-sequenceopt"
        L"s-char-sequenceopt"

s-char-sequence:
        s-char
        s-char-sequence s-char

s-char:
        any member of the source character set except
                the double-quote ", backslash \, or new-line character
        escape-sequence

1 A string literal is a sequence of characters (as defined in 2.9.2) surrounded by double quotes, optionally beginning with the letter L, as in "..." or L"...". A string literal that does not begin with L has type array of char and storage class static (3.7), and is initialized with the given characters. Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation dependent. The effect of attempting to modify a string literal is undefined.

2 A string literal that begins with L, such as L"asdf", is a wide-character string. A wide-character string is of type array of wchar_t. Concatenation of ordinary and wide-character string literals is undefined.

+------- BEGIN BOX 3 -------+
Should this render the program ill-formed? Or is it deliberately undefined to encourage extensions?
+------- END BOX 3 -------+

3 Adjacent string literals are concatenated. Characters in concatenated strings are kept distinct. For example,

        "\xA" "B"

contains the two characters '\xA' and 'B' after concatenation (and not the single hexadecimal character '\xAB').

4 After any necessary concatenation '\0' is appended so that programs that scan a string can find its end. The size of a string is the number of its characters including this terminator.
  Within a string, the double quote character " must be preceded by
  a \.

2.9.5 Boolean literals                                       [lex.bool]

        boolean-literal:
                false
                true

1 The Boolean literals are the keywords false and true.  Such literals
  have type bool and the given values.  They are not lvalues.

______________________________________________________________________

3 Basic concepts                                                [basic]
______________________________________________________________________

1 This clause presents the basic concepts of the C++ language.  It
  explains the difference between an object and a name and how they
  relate to the notion of an lvalue.  It introduces the concepts of a
  declaration and a definition and presents C++'s notion of type, scope,
  linkage, and storage class.  The mechanisms for starting and
  terminating a program are discussed.  Finally, this clause presents
  the fundamental types of the language and lists the ways of
  constructing derived types from these.

2 This clause does not cover concepts that affect only a single part of
  the language.  Such concepts are discussed in the relevant clauses.

3 An entity is a value, object, subobject, base class subobject, array
  element, variable, function, set of functions, instance of a function,
  enumerator, type, class member, template, or namespace.

4 A name is a use of an identifier (2.7) that denotes an entity or
  label (6.6.4, 6.1).

5 Every name that denotes an entity is introduced by a declaration.
  Every name that denotes a label is introduced either by a goto
  statement (6.6.4) or a labeled-statement (6.1).  Every name is
  introduced in some contiguous portion of program text called a
  declarative region (3.3), which is the largest part of the program in
  which that name can possibly be valid.  In general, each particular
  name is valid only within some possibly discontiguous portion of
  program text called its scope (3.3).
  To determine the scope of a declaration, it is sometimes convenient to
  refer to the potential scope of a declaration.  The scope of a
  declaration is the same as its potential scope unless the potential
  scope contains another declaration of the same name.  In that case,
  the potential scope of the declaration in the inner (contained)
  declarative region is excluded from the scope of the declaration in
  the outer (containing) declarative region.

6 For example, in
        int j = 24;
        main()
        {
                int i = j, j;
                j = 42;
        }
  the identifier j is declared twice as a name (and used twice).  The
  declarative region of the first j includes the entire example.  The
  potential scope of the first j begins immediately after that j and
  extends to the end of the program, but its (actual) scope excludes the
  text between the , and the }.  The declarative region of the second
  declaration of j (the j immediately before the semicolon) includes
  all the text between { and }, but its potential scope excludes the
  declaration of i.  The scope of the second declaration of j is the
  same as its potential scope.

7 Some names denote types, classes, or templates.  In general, it is
  necessary to determine whether or not a name denotes one of these
  entities before parsing the program that contains it.  The process
  that determines this is called name lookup.

8 An identifier used in more than one translation unit may potentially
  refer to the same entity in these translation units depending on the
  linkage (3.4) specified in the translation units.

9 An object is a region of storage (3.9).  In addition to giving it a
  name, declaring an object gives the object a storage class (3.7),
  which determines the object's lifetime.
  Some objects are polymorphic; the implementation generates information
  carried in each such object that makes it possible to determine that
  object's type during program execution.  For other objects, the
  meaning of the values found therein is determined by the type of the
  expressions used to access them.

  +------- BEGIN BOX 4 -------+
  Most of this section needs more work.
  +------- END BOX 4 -------+

3.1 Declarations and definitions                            [basic.def]

1 A declaration (7) introduces one or more names into a program and
  gives each name a meaning.

2 A declaration is a definition unless it declares a function without
  specifying the function's body (8.4), it contains the extern
  specifier (7.1.1) and neither an initializer nor a function-body, it
  declares a static data member in a class declaration (9.5), it is a
  class name declaration (9.1), or it is a typedef declaration (7.1.3),
  a using declaration (7.3.3), or a using directive (7.3.4).

3 The following, for example, are definitions:
        int a;                          // defines a
        extern const int c = 1;         // defines c
        int f(int x) { return x+a; }    // defines f
        struct S { int a; int b; };     // defines S
        struct X {                      // defines X
                int x;                  // defines nonstatic data member x
                static int y;           // declares static data member y
                X(): x(0) { }           // defines a constructor of X
        };
        int X::y = 1;                   // defines X::y
        enum { up, down };              // defines up and down
        namespace N { int d; }          // defines N and N::d
        namespace N1 = N;               // defines N1
        X anX;                          // defines anX
  whereas these are just declarations:
        extern int a;                   // declares a
        extern const int c;             // declares c
        int f(int);                     // declares f
        struct S;                       // declares S
        typedef int Int;                // declares Int
        extern X anotherX;              // declares anotherX
        using N::d;                     // declares N::d

4 In some circumstances, C++ implementations generate definitions
  automatically.
  These definitions include default constructors, copy constructors,
  assignment operators, and destructors.  For example, given
        struct C {
                string s;   // string is the standard library class (17.5.1.1)
        };
        main()
        {
                C a;
                C b=a;
                b=a;
        }
  the implementation will generate functions to make the definition of
  C equivalent to
        struct C {
                string s;
                C(): s() { }
                C(const C& x): s(x.s) { }
                C& operator=(const C& x) { s = x.s; return *this; }
                ~C() { }
        };

3.2 One definition rule                                 [basic.def.odr]

  +------- BEGIN BOX 5 -------+
  This is still very much under review by the Committee.
  +------- END BOX 5 -------+

1 No translation unit shall contain more than one definition of any
  variable, function, named class or enumeration type.

2 A function is used if it is called, its address is taken, or it is a
  virtual member function that is not pure.  Every program shall contain
  at least one definition of every function that is used in that
  program.  That definition may appear explicitly in the program, it may
  be found in the standard or a user-defined library, or (when
  appropriate) the implementation may generate it.  If a non-virtual
  function is not defined, a diagnostic is required only if an attempt
  is actually made to call that function.

  +------- BEGIN BOX 6 -------+
  This says nothing about user-defined libraries.  Probably it
  shouldn't, but perhaps it should be more explicit that it isn't
  discussing it.
  +------- END BOX 6 -------+

3 Exactly one definition in a program is required for a non-local
  variable with static storage duration, unless it has a builtin type or
  is an aggregate and also is unused or used only as the operand of the
  sizeof operator.

  +------- BEGIN BOX 7 -------+
  This is still uncertain.
  +------- END BOX 7 -------+

4 At least one definition of a class is required in a translation unit
  if the class is used other than in the formation of a pointer type.

  +------- BEGIN BOX 8 -------+
  This is not quite right, because it is possible to declare a function
  that returns a class object without first defining the class.
  +------- END BOX 8 -------+

  +------- BEGIN BOX 9 -------+
  There may be other situations that do not require a class to be
  defined: extern declarations (i.e. "extern X x;"), declaration of
  static members, others???
  +------- END BOX 9 -------+

  For example the following complete translation unit is well-formed,
  even though it never defines X:
        struct X;       // declare X is a struct type
        struct X* x1;   // use X in pointer formation
        X* x2;          // use X in pointer formation

5 There may be more than one definition of a named enumeration type in a
  program provided that each definition appears in a different
  translation unit and the values of the enumerators are the same.

  +------- BEGIN BOX 10 -------+
  This will need to be revisited when the ODR is made more precise
  +------- END BOX 10 -------+

6 There may be more than one definition of a class type in a program
  provided that each definition appears in a different translation unit
  and the definitions describe the same type.  No diagnostic is required
  for a violation of this ODR rule.

  +------- BEGIN BOX 11 -------+
  This will need to be revisited when the ODR is made more precise
  +------- END BOX 11 -------+

3.3 Declarative regions and scopes                        [basic.scope]

3.3.1 Local scope                                   [basic.scope.local]

1 A name declared in a block (6.3) is local to that block.  Its scope
  begins at its point of declaration (3.3.10) and ends at the end of its
  declarative region.
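
The local scope rule of 3.3.1 paragraph 1 can be illustrated with a small sketch. It is written in present-day C++ and is not part of this draft; the function name scope_demo is hypothetical. It shows an inner declaration taking effect only from its point of declaration to the end of its block.

```cpp
// Hypothetical helper illustrating local scope; not part of the draft text.
int scope_demo()
{
    int total = 0;
    int i = 1;          // outer i: in scope until the end of scope_demo
    {
        total += i;     // still the outer i (value 1)
        int i = 10;     // inner i: its scope begins at this declaration
        total += i;     // the inner i hides the outer one (value 10)
    }                   // the scope of the inner i ends with its block
    total += i;         // the outer i is visible again (value 1)
    return total;       // 1 + 10 + 1
}
```

Calling scope_demo() would be expected to return 12, since the outer i contributes before and after the nested block and the inner i contributes once.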

2 Names of parameters of a function are local to the function and shall
  not be redeclared in the outermost block of that function.

3 The name in a catch exception-declaration is local to the handler and
  shall not be redeclared in the outermost block of the handler.

4 Names in a declaration in the condition part of an if, while, for,
  do, or switch statement are local to the controlled statement and
  shall not be redeclared in the outermost block of that statement.

3.3.2 Function prototype scope                      [basic.scope.proto]

1 In a function declaration, names of parameters (if supplied) have
  function prototype scope, which terminates at the end of the function
  declarator.

3.3.3 Function scope

1 Labels (6.1) can be used anywhere in the function in which they are
  declared.  Only labels have function scope.

3.3.4 File scope                                     [basic.file.scope]

1 A name declared outside all named namespaces (_namespace_), blocks
  (6.3) and classes (9) has file scope.  The potential scope of such a
  name begins at its point of declaration (3.3.10) and ends at the end
  of the translation unit that is its declarative region.  Names
  declared with file scope are said to be global.

2 File scope can be treated as a special case of namespace scope (3.3.5)
  by viewing an entire translation unit as an unnamed namespace called
  the global namespace.

3.3.5 Namespace scope                           [basic.scope.namespace]

1 A name declared in a namespace (_namespace_) has namespace scope.  Its
  potential scope includes its namespace from the name's point of
  declaration (3.3.10) onwards, as well as the potential scope of any
  using directive (7.3.4) that nominates its namespace.

3.3.6 Class scope                                   [basic.scope.class]

1 The name of a class member is local to its class and can be used only
  in a member of that class (9.4) or a class derived from that class,
  after the .
  operator applied to an expression of the type of its class (5.2.4) or
  a class derived from (10) its class, after the -> operator applied to
  a pointer to an object of its class (5.2.4) or a class derived from
  (10) its class, after the :: scope resolution operator (5.1) applied
  to the name of its class or a class derived from its class, or after a
  using directive as described above.

  +------- BEGIN BOX 12 -------+
  What does: "can be used only in a member of that class" mean?  It
  should be phrased to include: body of member functions, ctor-init-
  list, static initializers.
  +------- END BOX 12 -------+

3.3.7 Name hiding                                  [basic.scope.hiding]

1 A name may be hidden by an explicit declaration of that same name in a
  nested declarative region or derived class.

2 A class name (9.1) may be hidden by the name of an object, function,
  or enumerator declared in the same scope.  If a class and an object,
  function, or enumerator are declared in the same scope (in any order)
  with the same name the class name is hidden.

3 If a name is in scope and is not hidden it is said to be visible.

4 The region in which a name is visible is called the reach of the name.

  +------- BEGIN BOX 13 -------+
  The term 'reach' is defined here but never used.  More work is needed
  with the "descriptive terminology".
  +------- END BOX 13 -------+

3.3.8 Explicit qualification                       [basic.scope.exqual]

1 A hidden name can still be used when it is qualified by its class or
  namespace name using the :: operator (5.1, 9.5, 10).  A hidden file
  scope name can still be used when it is qualified by the unary ::
  operator (5.1).

3.3.9 Elaborated type specifier                      [basic.scope.elab]

1 A class name hidden by a name of an object, function, or enumerator in
  local or class scope can still be used when appropriately (7.1.5)
  prefixed with class, struct, or union, or when followed by the ::
  operator.
  Similarly, a hidden enumeration name can be used when appropriately
  (7.1.5) prefixed with enum.  For example:
        class A {
        public:
                static int n;
        };
        main()
        {
                int A;
                A::n = 42;      // OK
                class A a;      // OK
                A b;            // ill-formed: A does not name a type
        }
  The scope rules are summarized in 10.5.

3.3.10 Point of declaration                         [basic.scope.pdecl]

1 The point of declaration for a name is immediately after its complete
  declarator (8) and before its initializer (if any), except as noted
  below.  For example,
        int x = 12;
        { int x = x; }

2 Here the second x is initialized with its own (unspecified) value.

3 For the point of declaration for an enumerator, see 7.2.

4 The point of declaration of a function with the extern or friend
  specifier is in the innermost enclosing namespace just after the
  outermost nested scope containing it which is contained in the
  namespace.

  +------- BEGIN BOX 14 -------+
  The terms "just after the outermost nested scope" imply name
  injection.  We avoided introducing the concept of name injection in
  the working paper up until now.  We should probably continue to do
  without.
  +------- END BOX 14 -------+

5 The point of declaration of a class first declared in an elaborated-
  type-specifier is immediately after the identifier.

6 A nonlocal name remains visible up to the point of declaration of the
  local name that hides it.  For example,
        const int i = 2;
        { int i[i]; }
  declares a local array of two integers.

3.4 Program and linkage                                    [basic.link]

1 A program consists of one or more translation units (2) linked
  together.  A translation unit consists of a sequence of declarations.
        translation unit:
                declaration-seqopt

2 A name which has internal linkage is local to its translation unit.
  Names with internal linkage are: variables or function members of a
  namespace that are explicitly declared static; function members of a
  namespace that are explicitly declared inline and not explicitly
  declared extern; variable members of a namespace that are explicitly
  declared const and not explicitly declared extern; members of an
  unnamed namespace.

3 The name of a class that has not been used in the declaration of an
  object, function, or class that has external linkage and has no static
  members (9.5) and no noninline member functions (9.4.2) has internal
  linkage.

4 Every declaration of a particular name of namespace scope that is not
  declared to have internal linkage in one of these ways shall refer to
  the same variable (3.9), function (8.3.5), or class (9) in every
  translation unit in which it appears.  Such names are said to have
  external linkage.

5 A name which is declared in an unnamed namespace has internal linkage
  and such name does not refer to another entity with the same name
  declared in another translation unit.

6 Typedef names (7.1.3), enumerators (7.2), and template names (14) do
  not have external linkage.

  +------- BEGIN BOX 15 -------+
  How are the bodies of templates linked to their declarations?
  +------- END BOX 15 -------+

7 Static class members (9.5) have external linkage.

8 Noninline class member functions have external linkage.  Inline class
  member functions must have exactly one definition in a program.

  +------- BEGIN BOX 16 -------+
  To be reworked when the ODR is clarified.
  +------- END BOX 16 -------+

9 Local names (3.3) explicitly declared extern have external linkage
  unless already declared static (7.1.1).
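
The linkage categories listed in 3.4 can be sketched in a single translation unit. This is only an illustration in present-day C++, whose linkage rules differ from this draft in places (for example, for inline functions), so it uses only static, const, unnamed-namespace and extern declarations. All names (call_count, limit, helper, shared_limit, bump) are hypothetical.

```cpp
// Hypothetical names throughout; a single-translation-unit sketch.

static int call_count = 0;              // explicitly static: internal linkage

const int limit = 3;                    // const and not extern: internal linkage

namespace {                             // members of an unnamed namespace are
    int helper(int x) { return x + 1; } // local to this translation unit
}

extern const int shared_limit = 5;      // explicitly extern: external linkage

int bump()                              // ordinary function: external linkage
{
    // Counts calls, but never past the translation-unit-local limit.
    if (call_count < limit)
        call_count = helper(call_count);
    return call_count;
}
```

Another translation unit could declare `extern const int shared_limit;` and `int bump();` to refer to the same entities, whereas its own call_count, limit, or helper would name different entities.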

10 After all adjustments of types (during which typedefs (7.1.3) are
  replaced by their definitions), the types specified by all
  declarations of a particular external name must be identical, except
  that such types may differ by the presence or absence of a major array
  bound (8.3.4).  A violation of this rule does not require a
  diagnostic.

11 A function may be defined only in namespace or class scope.

12 Linkage to non-C++ declarations can be achieved using a linkage-
  specification (7.5).

3.5 Start and termination                                 [basic.start]

3.5.1 Main function                                  [basic.start.main]

1 A program shall contain a function called main, which is the
  designated start of the program.

2 This function is not predefined by the compiler, it cannot be
  overloaded, and its type is implementation dependent.  The two
  examples below are allowed on any implementation.  It is recommended
  that any further (optional) parameters be added after argv.  The
  function main() may be defined as
        int main() { /* ... */ }
  or
        int main(int argc, char* argv[]) { /* ... */ }
  In the latter form argc shall be the number of arguments passed to the
  program from an environment in which the program is run.  If argc is
  nonzero these arguments shall be supplied as zero-terminated strings
  in argv[0] through argv[argc-1] and argv[0] shall be the name used to
  invoke the program or "".  It is guaranteed that argv[argc]==0.

3 The function main() shall not be called from within a program.  The
  linkage (3.4) of main() is implementation dependent.  The address of
  main() shall not be taken and main() shall not be declared inline or
  static.

4 Calling the function
        void exit(int);
  declared in <stdlib.h> (17.2.4.4) terminates the program without
  leaving the current block and hence without destroying any local
  variables (12.4).  The argument value is returned to the program's
  environment as the value of the program.
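
The argument conventions of 3.5.1 paragraph 2 can be illustrated by a small helper that relies only on the guarantee argv[argc]==0. This is a sketch and not part of the draft; the name count_arguments is hypothetical.

```cpp
// Hypothetical helper: counts the entries of argv up to the terminating
// null pointer, relying on the guarantee that argv[argc] == 0.
int count_arguments(char* argv[])
{
    int n = 0;
    while (argv[n] != 0)    // stop at the null pointer that follows the last argument
        ++n;
    return n;
}
```

Inside `int main(int argc, char* argv[])`, the value of count_arguments(argv) would be expected to equal argc.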

5 A return statement in main() has the effect of leaving the main
  function (destroying any local variables) and calling exit() with the
  return value as the argument.  If control reaches the end of main
  without encountering a return statement, the effect is that of
  executing
        return 0;

3.5.2 Initialization of non-local objects            [basic.start.init]

  +------- BEGIN BOX 17 -------+
  This is still under active discussion by the committee.
  +------- END BOX 17 -------+

1 The initialization of nonlocal static objects (3.7) in a translation
  unit is done before the first use of any function or object defined in
  that translation unit.  Such initializations (8.5, 9.5, 12.1, 12.6.1)
  may be done before the first statement of main() or deferred to any
  point in time before the first use of a function or object defined in
  that translation unit.  The default initialization of all static
  objects to zero (8.5) is performed before any dynamic (that is, run-
  time) initialization.  No further order is imposed on the
  initialization of objects from different translation units.  The
  initialization of local static objects is described in 6.7.

3.6 Termination                                      [basic.start.term]

1 Destructors (12.4) for initialized static objects are called when
  returning from main() and when calling exit() (17.2.4.4).
  Destruction is done in reverse order of initialization.  The function
  atexit() from <stdlib.h> can be used to specify that a function must
  be called at exit.  If atexit() is to be called, objects initialized
  before an atexit() call may not be destroyed until after the function
  specified in the atexit() call has been called.

2 Where a C++ implementation coexists with a C implementation, any
  actions specified by the C implementation to take place after the
  atexit() functions have been called take place after all destructors
  have been called.

3 Calling the function
        void abort();
  declared in <stdlib.h> terminates the program without executing
  destructors for static objects and without calling the functions
  passed to atexit().

3.7 Storage duration                                        [basic.stc]

1 The storage duration of an object determines its lifetime.

2 The storage class specifiers static, auto, and mutable are related
  to storage duration as described below.

3.7.1 Static storage duration                        [basic.stc.static]

1 All non-local variables have static storage duration; such variables
  are created and destroyed as described in 3.5 and _stmt.decl_.

2 Note that if an object of static storage class has a constructor or a
  destructor with side effects, it shall not be eliminated even if it
  appears to be unused.

  +------- BEGIN BOX 18 -------+
  This awaits committee action on the ``as-if'' rule.
  +------- END BOX 18 -------+

3 The keyword static may be used to declare a local variable with
  static storage duration; for a description of initialization and
  destruction of local variables, see 6.7.

4 The keyword static applied to a class variable in a class definition
  also determines that it has static storage duration.

3.7.2 Automatic storage duration                       [basic.stc.auto]

1 Local objects not declared static or explicitly declared auto have
  automatic storage duration and are associated with an invocation of a
  block.

2 Each object with automatic storage duration is initialized (12.1) each
  time the control flow reaches its definition and destroyed (12.4)
  whenever control passes from within the scope of the object to outside
  that scope (6.6).

3 A named automatic object with a constructor or destructor with side
  effects may not be destroyed before the end of its block, nor may it
  be eliminated even if it appears to be unused.
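
The behaviour described in 3.7.2 paragraphs 2 and 3 can be observed with a class whose constructor and destructor have side effects. The following is a minimal sketch in present-day C++, not part of this draft; event_log, Tracer and run_blocks are hypothetical names.

```cpp
#include <string>

std::string event_log;      // hypothetical global used to record events

// Hypothetical class whose constructor and destructor have side effects.
struct Tracer {
    char id;
    explicit Tracer(char c) : id(c) { event_log += '+'; event_log += id; }
    ~Tracer() { event_log += '-'; event_log += id; }
};

void run_blocks()
{
    Tracer a('a');          // initialized when control reaches this definition
    for (int i = 0; i < 2; ++i) {
        Tracer b('b');      // initialized on each pass through the block
    }                       // b is destroyed each time control leaves the block
}                           // a is destroyed when run_blocks returns
```

After one call to run_blocks() on an empty log, event_log would be expected to hold "+a+b-b+b-b-a": b is created and destroyed once per iteration, and a outlives both.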

3.7.3 Dynamic storage class                         [basic.stc.dynamic]

1 Objects may be created and destroyed dynamically, using operator new,
  operator new[], operator delete, or operator delete[].

2 In addition, an explicit destructor call may destroy an object.

  +------- BEGIN BOX 19 -------+
  This section requires much more work.
  +------- END BOX 19 -------+

3.7.4 Duration of sub-objects                       [basic.stc.inherit]

1 The storage duration of class subobjects, base class subobjects and
  array elements is that of their complete object (1.5).

3.7.5 The mutable keyword                           [basic.stc.mutable]

1 The keyword mutable is grammatically a storage class specifier but is
  unrelated to the storage duration (lifetime) of the class member it
  describes.  Modifying a class member declared mutable is deemed not
  to be modifying the value of the object that contains that member.
  Therefore, mutable members of const objects are not const.

3.7.6 Reference duration                                [basic.stc.ref]

1 Except in the case of a local reference declaration initialised by an
  rvalue, a reference may be used to name an existing object denoted by
  an lvalue.

2 The reference has static duration if it is declared non-locally,
  automatic duration if declared locally including as a function
  parameter, and inherited duration if declared in a class.

3 References may or may not require storage.

4 The duration of a reference is distinct from the duration of the
  object it refers to except in the case of a local reference
  declaration initialized by an rvalue.

5 Access through a reference to an object which no longer exists or has
  not yet been constructed yields undefined behaviour.

  +------- BEGIN BOX 20 -------+
  Can references be declared auto or static?  This section probably does
  not belong here.
  +------- END BOX 20 -------+

3.8 Types                                                 [basic.types]

1 There are two kinds of types: fundamental types and compound types.
  Types may describe objects, references (8.3.2), or functions (8.3.5).

2 Arrays of unknown size and classes that have been declared but not
  defined are called incomplete types because the size and structure of
  an instance of the type are unknown.  Also, the void type represents
  an empty set of values, so that no objects of type void ever exist;
  void is an incomplete type.  The term incompletely-defined object type
  is a synonym for incomplete type; the term completely-defined object
  type is a synonym for complete type.

3 A class type (such as class X) may be incomplete at one point in a
  translation unit and complete later on; the type class X is the same
  type at both points.  The declared type of an array may be incomplete
  at one point in a translation unit and complete later on; the array
  types at those two points (array of unknown bound of T and array of N
  T) are different types.  However, the type of a pointer to array of
  unknown size cannot be completed.

4 Variables that have incomplete type are prohibited in some contexts.
  For example:
        class X;                // X is an incomplete type
        extern X* xp;           // xp is a pointer to an incomplete type
        extern int arr[];       // the type of arr is incomplete
        typedef int UNKA[];     // UNKA is an incomplete type
        UNKA* arrp;             // arrp is a pointer to an incomplete type
        UNKA** arrpp;
        void foo()
        {
                xp++;           // ill-formed: X is incomplete
                arrp++;         // ill-formed: incomplete type
                arrpp++;        // okay: sizeof UNKA* is known
        }
        struct X { int i; };    // now X is a complete type
        int arr[10];            // now the type of arr is complete
        X x;
        void bar()
        {
                xp = &x;        // okay; type is ``pointer to X''
                arrp = &arr;    // ill-formed: different types
                xp++;           // okay: X is complete
                arrp++;         // ill-formed: UNKA can't be completed
        }

3.8.1 Fundamental types                             [basic.fundamental]

1 There are several fundamental types.  The standard header <limits.h>
  specifies the largest and smallest values of each for an
  implementation.

2 Objects declared as characters (char) are large enough to store any
  member of the implementation's basic character set.  If a character
  from this set is stored in a character variable, its value is
  equivalent to the integer code of that character.  Characters may be
  explicitly declared unsigned or signed.  Plain char, signed char,
  and unsigned char are three distinct types.  A char, a signed char,
  and an unsigned char consume the same amount of space.

3 An enumeration comprises a set of named integer constant values.
  Each distinct enumeration constitutes a different enumerated type.
  Each constant has the type of its enumeration.

4 There are four signed integer types: signed char, short int, int, and
  long int.  In this list, each type provides at least as much storage
  as those preceding it in the list, but the implementation may
  otherwise make any of them equal in storage size.
  Plain ints have the natural size suggested by the machine
  architecture; the other signed integer types are provided to meet
  special needs.

5 Type wchar_t is a distinct type whose values can represent distinct
  codes for all members of the largest extended character set specified
  among the supported locales (_lib.locale_).  Type wchar_t has the
  same size, signedness, and alignment requirements (1.5) as one of the
  other integral types, called its underlying type.

6 For each of the signed integer types, there exists a corresponding
  (but different) unsigned integer type: unsigned char, unsigned short
  int, unsigned int, and unsigned long int, each of which occupies
  the same amount of storage and has the same alignment requirements
  (1.5) as the corresponding signed integer type.4)  An alignment
  requirement is an implementation-dependent restriction on the value of
  a pointer to an object of a given type (5.4, 1.5).

7 Unsigned integers, declared unsigned, obey the laws of arithmetic
  modulo 2^n where n is the number of bits in the representation of that
  particular size of integer.  This implies that unsigned arithmetic
  does not overflow.

__________________________
4) See 7.1.5.2 regarding the correspondence between types and the
   sequences of type-specifiers that designate them.
5) Using a bool value in ways described by this International Standard
   as ``undefined,'' such as by examining the value of an uninitialized
   automatic variable, might cause it to behave as if it is neither true
   nor false.

8 Values of type bool can be either true or false.5)  There are no
  signed, unsigned, short, or long bool types or values.  As described
  below, bool values behave as integral types.  Thus, for example, they
  participate in integral promotions (4.1, 5.2.3).
  Although values of type bool generally behave as signed integers, for
  example by promoting (4.1) to int instead of unsigned int, a bool
  value can successfully be stored in a bit-field of any (nonzero) size.

9 There are three floating types: float, double, and long double.
  The type double provides at least as much precision as float, and
  the type long double provides at least as much precision as double.
  Each implementation defines the characteristics of the fundamental
  floating point types in the standard header <float.h>.

10 Types bool, char, and the signed and unsigned integer types are
  collectively called integral types.  A synonym for integral type is
  integer type.  Enumerations (7.2) are not integral, but they can be
  promoted (4.1) to signed or unsigned int.  Integral and floating types
  are collectively called arithmetic types.

11 The void type specifies an empty set of values.  It is used as the
  return type for functions that do not return a value.  No object of
  type void may be declared.  Any expression may be explicitly converted
  to type void (5.4); the resulting expression may be used only as an
  expression statement (6.2), as the left operand of a comma expression
  (5.18), or as a second or third operand of ?: (5.16).
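
Several of the properties stated in 3.8.1 (paragraphs 2, 4, 7 and 8) can be checked directly. The following sketch is written in present-day C++ and is not part of this draft; all function names are hypothetical, and UINT_MAX is assumed to come from the <climits> form of <limits.h>.

```cpp
#include <climits>

// Hypothetical helpers; each boolean check is expected to hold on a
// conforming implementation.

bool unsigned_wraps()
{
    unsigned int zero = 0;
    unsigned int max = UINT_MAX;
    // Unsigned arithmetic is modulo 2^n, so neither operation overflows.
    return (max + 1u == 0u) && (zero - 1u == UINT_MAX);
}

bool char_types_same_size()
{
    // char, signed char and unsigned char occupy the same amount of space.
    return sizeof(char) == sizeof(signed char)
        && sizeof(char) == sizeof(unsigned char);
}

bool storage_is_ordered()
{
    // Each signed integer type provides at least as much storage as
    // those preceding it in the list.
    return sizeof(signed char) <= sizeof(short)
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long);
}

int promoted_sum()
{
    bool t = true;
    return t + t;   // each bool promotes to int (value 1) before the addition
}
```

The three boolean helpers would be expected to return true, and promoted_sum() to return 2, because the operands are promoted to int rather than remaining bool.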

3.8.2 Compound types                                   [basic.compound]

1 There is a conceptually infinite number of compound types constructed
  from the fundamental types in the following ways:

  - arrays of objects of a given type, 8.3.4;

  - functions, which have parameters of given types and return objects
    of a given type, 8.3.5;

  - pointers to objects or functions (including static members of
    classes) of a given type, 8.3.1;

  - references to objects or functions of a given type, 8.3.2;

  - constants, which are values of a given type, 7.1.5;

  - classes containing a sequence of objects of various types (9), a
    set of functions for manipulating these objects (9.4), and a set of
    restrictions on the access to these objects and functions, 11;

  - structures, which are classes without default access restrictions,
    11;

  - unions, which are classes capable of containing objects of
    different types at different times, 9.6;

  - pointers to non-static6) class members, which identify members of a
    given type within objects of a given class, 8.3.3.

2 In general, these methods of constructing types can be applied
  recursively; restrictions are mentioned in 8.3.1, 8.3.4, 8.3.5, and
  8.3.2.

3 Any type so far mentioned is an unqualified type.  Each unqualified
  type has three corresponding qualified versions of its type:7) a
  const-qualified version, a volatile-qualified version, and a const-
  volatile-qualified version (see 7.1.5).  The cv-qualified or
  unqualified versions of a type are distinct types that belong to the
  same category and have the same representation and alignment
  requirements.8)  A compound type is not cv-qualified (3.8.3) by the
  cv-qualifiers (if any) of the type from which it is compounded.

4 A pointer to objects of a type T is referred to as a pointer to T.
  For example, a pointer to an object of type int is referred to as
  pointer to int and a pointer to an object of class X is called a
  pointer to X.
Pointers to incomplete types are allowed although there are restrictions on what can be done with them (3.8).

5 Objects of cv-qualified (3.8.3) or unqualified type void* (pointer to void) can be used to point to objects of unknown type. A void* must have enough bits to hold any object pointer.

6 Except for pointers to static members, text referring to pointers does not apply to pointers to members.

3.8.3 CV-qualifiers [basic.type.qualifier]

1 There are two cv-qualifiers, const and volatile. When applied to an object, const means the program may not change the object, and volatile has an implementation-defined meaning.9) An object may have both cv-qualifiers.

2 There is a (partial) ordering on cv-qualifiers, so that one object or pointer may be said to be more cv-qualified than another. Table 10 shows the relations that constitute this ordering.

     Table 10: relations on const and volatile
      ______________________________________
     |  no cv-qualifier  <  const           |
     |  no cv-qualifier  <  volatile        |
     |  no cv-qualifier  <  const volatile  |
     |  const            <  const volatile  |
     |  volatile         <  const volatile  |
     |______________________________________|

__________________________
6) Static class members are objects or functions, and pointers to them are ordinary pointers to objects or functions.
7) See 8.3.4 and 8.3.5 regarding cv-qualified array and function types.
8) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
9) Roughly, volatile means the object may change of its own accord (that is, the processor may not assume that the object continues to hold a previously held value).

3 A pointer or reference to cv-qualified type (sometimes called a cv-qualified pointer or reference) need not actually point to a cv-qualified object, but it is treated as if it does.
For example, a pointer to const int may point to an unqualified int, but a well-formed program may not attempt to change the pointed-to object through that pointer even though it may change the same object through some other access path. CV-qualifiers are supported by the type system so that a cv-qualified object or cv-qualified access path to an object may not be subverted without casting (5.4). For example:

     void f()
     {
         int i = 2;          // not cv-qualified
         const int ci = 3;   // cv-qualified (initialized as required)
         ci = 4;             // error: attempt to modify const
         const int* cip;     // pointer to const int
         cip = &i;           // okay: cv-qualified access path to unqualified
         *cip = 4;           // error: attempt to modify through ptr to const
         int* ip;
         ip = cip;           // error: attempt to convert const int* to int*
     }

3.8.4 Type names [basic.type.name]

1 Fundamental and compound types can be given names by the typedef mechanism (7.1.3), and families of types and functions can be specified and named by the template mechanism (14).

3.9 Lvalues and rvalues [basic.lval]

1 Every expression is either an lvalue or rvalue.

2 An lvalue refers to an object or function. Some rvalue expressions, those of class or cv-qualified class type, also refer to objects.10)

__________________________
10) Expressions such as invocations of constructors and of functions that return a class type do in some sense refer to an object, and the implementation may invoke a member function upon such objects, but the expressions are not lvalues.

3 Some builtin operators and function calls yield lvalues. For example, if E is an expression of pointer type, then *E is an lvalue expression referring to the object or function to which E points. As another example, the function

     int& f();

yields an lvalue, so the call f() is an lvalue expression.

4 Some builtin operators expect lvalue operands, for example the builtin assignment operators all expect their left hand operands to be lvalues. Other builtin operators yield rvalues, and some expect them. For example the unary and binary + operators expect rvalue arguments and yield an rvalue result. Constructor invocations and calls to functions that do not return references are always rvalues.

5 The discussion of each builtin operator in 5 indicates whether it expects lvalue operands and whether it yields an lvalue. The discussion of reference initialization in 8.5.3 indicates the behavior of lvalues and rvalues in other significant contexts.

6 User defined operators are functions, and whether such operators expect or yield lvalues is determined by their type.

7 Rvalues may be of qualified type; however, the unqualified type is used unless the rvalue is of class type and a member function is called on the rvalue.

8 Whenever an lvalue that refers to a non-array11) non-class object appears in a context where an lvalue is not expected, the value contained in the referenced object is used. When this occurs, the value has the unqualified type of the lvalue. For example:

     const int* cip;
     int i = *cip;   // "*cip" has type int

If this type is incomplete, the program is ill-formed.

+------- BEGIN BOX 21 -------+
In C this is undefined.
+------- END BOX 21 -------+

For example:

     struct X;
     X* xp;
     xp;    // okay: pointer to incomplete type
     *xp;   // error: incomplete type

However, when an lvalue is used as the operand of sizeof the value contained in the referenced object is not accessed, since that operator does not evaluate its operand.

__________________________
11) An lvalue that refers to an array object is usually converted to a (rvalue) pointer to the initial element of the array (4.6).
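The distinction drawn in paragraphs 3 through 8 can be observed with a small sketch. It is hedged in two ways: decltype, static_assert, and <type_traits> are later additions to the language and are used here only as a way to inspect value categories, and the names counter, bump, snapshot, ci, and size_only are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

int counter = 0;

// Returns a reference, so a call to bump() is an lvalue expression.
int& bump() { ++counter; return counter; }

// Does not return a reference, so a call to snapshot() is an rvalue.
int snapshot() { return counter; }

// decltype((e)) yields a reference type exactly when e is an lvalue.
static_assert(std::is_lvalue_reference<decltype((bump()))>::value,
              "call to a function returning int& is an lvalue");
static_assert(!std::is_reference<decltype((snapshot()))>::value,
              "call to a function returning int is an rvalue");

// When the value of a const int lvalue is used, it has the unqualified
// type int: unary + expects an rvalue operand and yields an int here.
const int ci = 3;
static_assert(std::is_same<decltype(+ci), int>::value,
              "value read from a const int has type int");

// sizeof does not evaluate its operand, so bump() is never called here.
std::size_t size_only() { return sizeof(bump()); }
```

Because bump() is an lvalue it may appear on the left of an assignment, as in bump() = 5, while size_only() leaves counter untouched.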
9 An lvalue or rvalue of class type can also be used to modify its referent under certain circumstances.

+------- BEGIN BOX 22 -------+
Provide example cross-reference.
+------- END BOX 22 -------+

10 Functions cannot be modified, but pointers to functions may be modifiable.

11 An expression of incomplete type cannot be used to modify an object, but a pointer to such an object may be modifiable and the object itself may be modifiable at some point in the program where its type is complete.

12 Array objects cannot be modified, but their elements may be modifiable.

13 The referent of a const-qualified expression shall not be modified (through that expression), except that if it is of class type and has a mutable component, that component may be modified.

14 If an expression can be used to modify its object, it is called modifiable. A program that attempts to modify an object through a nonmodifiable lvalue or rvalue expression is ill-formed.

______________________________________________________________________

4 Standard conversions [conv]

______________________________________________________________________

1 Some operators may, depending on their operands, cause conversion of the value of an operand from one type to another. This section summarizes the conversions demanded by most ordinary operators and explains the result to be expected from such conversions; it will be supplemented as required by the discussion of each operator. These conversions are also used in initialization (8.5, 8.5.3, 12.8, 12.1). 12.3 and 13.2 describe user-defined conversions and their interaction with standard conversions. The result of a conversion is an lvalue only if the result is a reference (8.3.2).

4.1 Integral promotions [conv.prom]

1 A char, wchar_t, bool, short int, enumerator, object of enumeration type (7.2), or an int bit-field (9.7) (in both their signed and unsigned varieties) may be used wherever an integer rvalue may be used.
In contexts where a constant integer is required, the bool, char, wchar_t, short int, object of enumeration type (7.2), or bit-field must be constant. (Enumerators are always constant.)

2 Except for enumerators, objects of enumeration type, and type wchar_t, if an int can represent all the values of the original type, the value is converted to int; otherwise it is converted to unsigned int.

3 For enumerators, objects of enumeration type, and type wchar_t, if an int can represent all the values of the underlying type, the value is converted to an int; otherwise if an unsigned int can represent all the values, the value is converted to an unsigned int; otherwise, if a long can represent all the values, the value is converted to a long; otherwise it is converted to unsigned long.

4 A Boolean value may be converted to int, taking false to zero and true to one.

5 This process is called integral promotion.

4.2 Integral conversions [conv.integral]

1 An integer rvalue may be converted to any integral type. If the target type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern.

2 When an integer is converted to a signed type, the value is unchanged if it can be represented in the new type; otherwise the value is implementation dependent.

3 When an integer is converted to bool, see 4.9.

4.3 Float and double [conv.double]

1 Single-precision floating point arithmetic may be used for float expressions. When a less precise floating value is converted to an equally or more precise floating type, the value is unchanged.
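The rules of 4.1, 4.2, and the widening case of 4.3 stated so far can be checked with a short sketch. The function names to_unsigned and widening_is_exact are hypothetical, and static_assert, decltype, <limits>, and <type_traits> are later language and library features assumed here only to make the checks visible.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// 4.1: int can represent every value of short and signed char,
// so both promote to int.
static_assert(std::is_same<decltype(+static_cast<short>(1)), int>::value,
              "short promotes to int");
static_assert(std::is_same<decltype(+static_cast<signed char>(1)), int>::value,
              "signed char promotes to int");

// 4.1 paragraph 4: false converts to zero and true to one.
static_assert(+false == 0 && +true == 1, "bool converts to 0 or 1");

// 4.2 paragraph 1: conversion to an unsigned type yields the least unsigned
// value congruent to the source modulo 2^n, n being the width of the target.
unsigned int to_unsigned(int v) {
    return static_cast<unsigned int>(v);
}

// 4.3: converting a float to the more precise type double leaves the value
// unchanged, so converting back compares equal (for values other than NaN).
bool widening_is_exact(float f) {
    double d = f;
    return static_cast<float>(d) == f;
}
```

For instance, to_unsigned(-1) is congruent to -1 modulo 2^n and is therefore the largest unsigned int value, whatever the width of unsigned int on the implementation.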
When a more precise floating value is converted to a less precise floating type and the value is within representable range, the result may be either the next higher or the next lower representable value. If the result is out of range, the behavior is undefined.

4.4 Floating and integral [conv.float]

1 Conversion of a floating value to an integral type truncates; that is, the fractional part is discarded. Such conversions are machine dependent; for example, the direction of truncation of negative numbers varies from machine to machine. The result is undefined if the value cannot be represented in the integral type.

2 Conversions of integral values to floating type are as mathematically correct as the hardware allows. Loss of precision occurs if an integral value cannot be represented exactly as a value of the floating type.

4.5 Arithmetic conversions [conv.arith]

1 Many binary operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions.

2
  - If either operand is of type long double, the other is converted to long double.
  - Otherwise, if either operand is double, the other is converted to double.
  - Otherwise, if either operand is float, the other is converted to float.
  - Otherwise, the integral promotions (4.1) are performed on both operands.
  - Then, if either operand is unsigned long the other is converted to unsigned long.
  - Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int is converted to a long int; otherwise both operands are converted to unsigned long int.
  - Otherwise, if either operand is long, the other is converted to long.
  - Otherwise, if either operand is unsigned, the other is converted to unsigned.
  - Otherwise, both operands are int.

4.6 Pointer conversions [conv.ptr]

1 The following conversions may be performed wherever pointers (8.3.1) are assigned, initialized, compared, or otherwise used:

  - A constant expression (5.19) that evaluates to zero (the null pointer constant) when assigned to, compared with, alternated with (5.16), or used as an initializer of an operand of pointer type is converted to a pointer of that type. It is guaranteed that this value will produce a pointer distinguishable from a pointer to any object or function.

  - A pointer to a cv-qualified or unqualified object type may be converted to a pointer to the same type with greater cv-qualifications (3.8.3). That is, for any unqualified type T, a T* may be converted to a const T*, a volatile T*, or a const volatile T*; a const T* may be converted to a const volatile T*; or a volatile T* may be converted to a const volatile T*.

  - A pointer to any object type may be converted to a void* with greater or equal cv-qualifications. That is, for any unqualified type T, a T* may be converted to a void*, a const void*, a volatile void*, or a const volatile void*; a const T* may be converted to a const void* or a const volatile void*; a volatile T* may be converted to a volatile void* or a const volatile void*; and a const volatile T* may be converted to a const volatile void*.

  - Two pointer types T1 and T2 are similar if there exists a type T and integer n>0 such that:

         T1 is T cv_{1,n} * . . . cv_{1,1} * cv_{1,0}

    and

         T2 is T cv_{2,n} * . . . cv_{2,1} * cv_{2,0}

    where each cv_{i,j} is const, volatile, const volatile, or nothing. An expression of type T1 may be converted to type T2 if and only if the following conditions are satisfied:

      - the pointer types are similar.

      - for every j>0, if const is in cv_{1,j} then const is in cv_{2,j}, and similarly for volatile.

      - if the cv_{1,j} and cv_{2,j} are different, then const is in every cv_{2,k} for 0