______________________________________________________________________ 1 General [intro] ______________________________________________________________________ 1.1 Scope [intro.scope] 1 This International Standard specifies requirements for implementations of the C++ programming language. The first such requirement is that they implement the language, and so this International Standard also defines C++. Other requirements and relaxations of the first require- ment appear at various places within the Standard. 2 C++ is a general purpose programming language based on the C program- ming language as described in ISO/IEC 9899:1990 Programming Languages C (_intro.refs_). In addition to the facilities provided by C, C++ provides additional data types, classes, templates, exceptions, inline functions, operator overloading, function name overloading, refer- ences, free store management operators, function argument checking and type conversion, and additional library facilities. These extensions to C are summarized in _diff.c_. The differences between C++ and ISO C1) are summarized in _diff.iso_. The extensions to C++ since 1985 are summarized in _diff.c++_. 3 Clauses _lib.library_ through _lib.input.output_ (the library clauses) describe the Standard C++ library, which provides definitions for the following kinds of entities: macros (_cpp.replace_), values (_basic_), types (_dcl.name_, _dcl.meaning_), templates (_temp_), classes (_class_), functions (_dcl.fct_), and objects (_dcl.dcl_). 4 For classes and class templates, the library clauses specify partial definitions. Private members (_class.access_) are not specified, but each implementation shall supply them to complete the definitions according to the description in the library clauses. 5 For functions, function templates, objects, and values, the library clauses specify declarations. Implementations shall supply defini- tions consistent with the descriptions in the library clauses. 
6 The names defined in the library have namespace scope (_basic.namespace_). A C++ translation unit (_lex.phases_) obtains access to these names by including the appropriate standard library header (_cpp.include_). 7 The templates, classes, functions, and objects in the library have external linkage (_basic.link_). The implementation provides defini- tions for standard library entities, as necessary, while combining translation units to form a complete C++ program (_lex.phases_). 1.2 Normative references [intro.refs] 1 The following standards contain provisions which, through reference in this text, constitute provisions of this International Standard. At the time of publication, the editions indicated were valid. All stan- dards are subject to revision, and parties to agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent editions of the standards indicated below. Members of IEC and ISO maintain registers of currently valid Interna- tional Standards. --ISO/IEC 2382 Dictionary for Information Processing Systems. --ISO/IEC 9899:1990 Programming Languages - C --ISO/IEC:1990 Programming Languages - C AMENDMENT 1: C Integrity 2 The library described in Clause 7 of the C Standard and Clause 7 of Amendment 1 to the C Standard is hereinafter called the Standard C Library.1) 1.3 Implementation compliance [intro.compliance] 1 The set of "diagnosable semantic rules" consists of all semantic rules in this International Standard except for those rules containing an explicit notation that "no diagnostic is required." 2 Every conforming C++ implementation shall, within its resource limits, accept and correctly execute well-formed C++ programs, and shall issue at least one diagnostic message when presented with any ill-formed program that contains a violation of any diagnosable semantic rule or of any syntax rule. 
3 If an ill-formed program contains no violations of diagnosable semantic rules, this International Standard places no requirement on implementations with respect to that program.

4 Two kinds of implementations are defined: hosted and freestanding. For a hosted implementation, this International Standard defines the set of available libraries. A freestanding implementation is one in which execution may take place without the benefit of an operating system, and has an implementation-defined set of libraries that includes certain language-support libraries (_lib.compliance_).

5 Although this International Standard states only requirements on C++ implementations, those requirements are often easier to understand if they are phrased as requirements on programs, parts of programs, or execution of programs. Such requirements have the following meaning:

--Whenever this International Standard places a requirement on the form of a program (that is, the characters, tokens, syntactic elements, and types that make up the program), and a program does not meet that requirement, the program is ill-formed and the implementation shall issue a diagnostic message when processing that program.

--Whenever this International Standard places a requirement on the execution of a program (that is, the values of data that are used as part of program execution) and the data encountered during execution do not meet that requirement, the behavior of the program is undefined and this International Standard places no requirements at all on the behavior of the program.

_________________________
1) With the qualifications noted in clauses _lib.library_ through _lib.input.output_, and in _diff.library_, the Standard C library is a subset of the Standard C++ library.

6 In this International Standard, a term is italicized when it is first defined. In this International Standard, the examples, the notes, the footnotes, and the non-normative annexes are not part of the normative Standard.
Each example is introduced by "[Example:" and terminated by "]". Each note is introduced by "[Note:" and terminated by "]".

7 A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any well-formed program. One example of such an extension is allowing identifiers to contain characters outside the basic source character set. Implementations are required to diagnose programs that use such extensions that are ill-formed according to this Standard. Having done so, however, they can compile and execute such programs.

1.4 Definitions [intro.defs]

1 For the purposes of this International Standard, the definitions given in ISO/IEC 2382 and the following definitions apply.

--argument: An expression in the comma-separated list bounded by the parentheses in a function call expression, a sequence of preprocessing tokens in the comma-separated list bounded by the parentheses in a function-like macro invocation, the operand of throw, or an expression in the comma-separated list bounded by the angle brackets in a template instantiation. Also known as an "actual argument" or "actual parameter."

--diagnostic message: A message belonging to an implementation-defined subset of the implementation's output messages.

--dynamic type: The dynamic type of an lvalue expression is the type of the most derived object (_intro.object_) to which the lvalue refers. [Example: if a pointer (_dcl.ptr_) p whose static type is "pointer to class B" is pointing to an object of class D, derived from B (_class.derived_), the dynamic type of the expression *p is "D." References (_dcl.ref_) are treated similarly. ] The dynamic type of an rvalue expression is its static type.

--ill-formed program: input to a C++ implementation that is not a well-formed program (q. v.).
--implementation-defined behavior: Behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation shall document. --implementation limits: Restrictions imposed upon programs by the implementation. --locale-specific behavior: Behavior that depends on local conventions of nationality, culture, and language that each implementation shall document. --multibyte character: A sequence of one or more bytes representing a member of the extended character set of either the source or the execution environment. The extended character set is a superset of the basic character set. --parameter: an object or reference declared as part of a function declaration or definition, or in the catch clause of an exception handler that acquires a value on entry to the function or handler; an identifier from the comma-separated list bounded by the parenthe- ses immediately following the macro name in a function-like macro definition; or a template-parameter. Parameters are also known as "formal arguments" or "formal parameters." --signature: The signature of a function is the information about that function that participates in overload resolution (_over.match_): the types of its parameters and, if the function is a class member, the cv- qualifiers (if any) on the function itself and the class in which the member function is declared.2) The signature of a template function specialization includes the types of its template arguments (_temp.over.link_). --static type: The static type of an expression is the type (_basic.types_) resulting from analysis of the program without con- sideration of execution semantics. It depends only on the form of the program and does not change while the program is executing. --undefined behavior: Behavior, such as might arise upon use of an erroneous program construct or of erroneous data, for which the Standard imposes no requirements. 
Undefined behavior may also be expected when the standard omits the description of any explicit definition of behavior. [Note: permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Note that many erroneous program constructs do not engender undefined behav- ior; they are required to be diagnosed. ] _________________________ 2) Function signatures do not include return type, because that does not participate in overload resolution. --unspecified behavior: Behavior, for a well-formed program construct and correct data, that depends on the implementation. The implemen- tation is not required to document which behavior occurs. [Note: usually, the range of possible behaviors is delineated by the Stan- dard. ] --well-formed program: a C++ program constructed according to the syn- tax rules, diagnosable semantic rules, and the One Definition Rule (_basic.def.odr_). 2 Clause _lib.definitions_ defines additional terms that are used only in the library clauses (_lib.library_-_lib.input.output_). 1.5 Syntax notation [syntax] 1 In the syntax notation used in this International Standard, syntactic categories are indicated by italic type, and literal words and charac- ters in constant width type. Alternatives are listed on separate lines except in a few cases where a long set of alternatives is pre- sented on one line, marked by the phrase "one of." An optional termi- nal or nonterminal symbol is indicated by the subscript "opt," so { expressionopt } indicates an optional expression enclosed in braces. 2 Names for syntactic categories have generally been chosen according to the following rules: --X-name is a use of an identifier in a context that determines its meaning (e.g. 
class-name, typedef-name).

--X-id is an identifier with no context-dependent meaning (e.g. qualified-id).

--X-seq is one or more X's without intervening delimiters (e.g. declaration-seq is a sequence of declarations).

--X-list is one or more X's separated by intervening commas (e.g. expression-list is a sequence of expressions separated by commas).

1.6 The C++ memory model [intro.memory]

1 The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set and is composed of a contiguous sequence of bits, the number of which is implementation-defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit. The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address.3)

2 [Note: the representation of types is described in _basic.types_. ]

_________________________
3) The implementation is free to disregard this, or any other, requirement as long as doing so has no perceptible effect on the execution of the program. Thus, for example, an implementation is free to place any variable in an internal register that does not have an address as long as the program does not do anything that depends on the address of the variable.

1.7 The C++ object model [intro.object]

1 The constructs in a C++ program create, refer to, access, and manipulate objects. An object is a region of storage. An object is created by a definition (_basic.def_), by a new-expression (_expr.new_) or by the implementation (_class.temporary_) when needed. The properties of an object are determined when the object is created. An object can have a name (_basic_). An object has a storage duration (_basic.stc_) which influences its lifetime (_basic.life_). An object has a type (_basic.types_). The term object type refers to the type with which the object is created. The object's type determines the number of bytes that the object occupies and the interpretation of its content. Some objects are polymorphic (_class.virtual_); the implementation generates information carried in each such object that makes it possible to determine that object's type during program execution. For other objects, the interpretation of the values found therein is determined by the type of the expressions (_expr_) used to access them.

2 Objects can contain other objects, called sub-objects. A sub-object can be a member sub-object (_class.mem_) or a base class sub-object (_class.derived_). An object that is not a sub-object of any other object is called a complete object.

3 For every object x, there is some object called the complete object of x, determined as follows:

--If x is a complete object, then x is the complete object of x.

--Otherwise, the complete object of x is the complete object of the (unique) object that contains x.

If a complete object, a nonstatic data member (_class.mem_), or an array element is of class type, its type is considered the most derived class, to distinguish it from the class type of any base class subobject; an object of a most derived class type is called a most derived object.

4 Unless it is a bit-field (_class.bit_), a most derived object shall have a non-zero size and shall occupy one or more bytes of storage. Base class sub-objects may have zero size. An object of POD type (_basic.types_) shall occupy contiguous bytes of storage.

5 [Note: C++ provides a variety of built-in types and several ways of composing new types from existing types (_basic.types_). ]

1.8 Program execution [intro.execution]

1 The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine.
Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below. 2 Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects. Such documenta- tion shall define the instance of the abstract machine that corre- sponds to that implementation (referred to as the ``corresponding instance'' below). 3 Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, the Standard defines a set of allowable behaviors. These define the non- deterministic aspects of the abstract machine. An instance of the abstract machine can thus have more than one possible execution sequence for a given program and a given input. 4 Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. ] 5 A conforming implementation executing a well-formed program shall pro- duce the same observable behavior as one of the possible execution sequences of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution sequence contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations previous to the first undefined operation). 
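The nondeterminism described in paragraph 3 can be observed directly. The following is a minimal sketch, not part of the normative text; the names trace, mark, take_two, and argument_order are hypothetical and chosen only for illustration. It records the order in which the two arguments of a call are evaluated. Because that order is unspecified, either recorded order is an allowable behavior, and a program should not depend on one of them.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers for illustration; not taken from the Standard.
static std::string trace;   // records the order in which arguments are evaluated

// Appends a tag to the trace as a side effect and returns a dummy value.
int mark(char tag) {
    trace += tag;
    return 0;
}

// Takes two arguments; the order in which they are evaluated is unspecified.
int take_two(int, int) {
    return 0;
}

// Returns "ab" or "ba", depending on the order the implementation chose.
std::string argument_order() {
    trace.clear();
    take_two(mark('a'), mark('b'));
    return trace;
}
```

A caller can check only that the result is one of the allowable outcomes, for instance with assert(argument_order() == "ab" || argument_order() == "ba"), since the corresponding instance of the abstract machine admits both execution sequences.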
6 The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.4)

7 Accessing an object designated by a volatile lvalue (_basic.lval_), modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression might produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.5)

_________________________
4) An implementation can offer additional library I/O functions as an extension. Implementations that do so should treat calls to those functions as ``observable behavior'' as well.

8 Once the execution of a function begins, no expressions from the calling function are evaluated until execution of the called function has completed.6)

9 In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).

10 When the processing of the abstract machine is interrupted by receipt of a signal, only the values of objects as of the previous sequence point may be relied on. Objects that may be modified between the previous sequence point and the next sequence point need not have received their correct values yet.

11 An instance of each object with automatic storage duration (_basic.stc.auto_) is associated with each entry into its block. Such an object exists and retains its last-stored value during the execution of the block and while the block is suspended (by a call of a function or receipt of a signal).
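The rule that each entry into a block has its own instance of every automatic object can be illustrated with a recursive function. This is a minimal sketch, not part of the normative text; the function name sum_down is hypothetical. Each recursive entry creates a separate instance of local, and that instance keeps its last-stored value while its block is suspended by the nested call.

```cpp
#include <cassert>

// Hypothetical function for illustration; not taken from the Standard.
// Every entry into the function body creates a fresh instance of `local`.
int sum_down(int n) {
    int local = n;                // instance belonging to this entry only
    if (n == 0) {
        return 0;
    }
    int rest = sum_down(n - 1);   // this block is suspended during the call
    return local + rest;          // `local` still holds its last-stored value
}
```

For example, sum_down(4) adds the values 4, 3, 2, and 1 held by four distinct instances of local, so the call yields 10.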
12The least requirements on a conforming implementation are: --At sequence points, volatile objects are stable in the sense that previous evaluations are complete and subsequent evaluations have not yet occurred. --At program termination, all data written into files shall be identi- cal to one of the possible results that execution of the program according to the abstract semantics would have produced. --The input and output dynamics of interactive devices shall take place in such a fashion that prompting messages actually appear prior to a program waiting for input. What constitutes an interac- tive device is implementation-defined. [Note: more stringent correspondences between abstract and actual semantics may be defined by each implementation. ] _________________________ 5) Note that some aspects of sequencing in the abstract machine are unspecified; the preceding restriction upon side effects applies to that particular execution sequence in which the actual code is gener- ated. 6) In other words, function executions do not interleave with each other. 13A full-expression is an expression that is not a subexpression of another expression. 14[Note: certain contexts in C++ cause the evaluation of a full- expression that results from a syntactic construct other than expres- sion (_expr.comma_). For example, in _dcl.init_ one syntax for ini- tializer is ( expression-list ) but the resulting construct is a function call upon a constructor function with expression-list as an argument list; such a function call is a full-expression. For example, in _dcl.init_, another syntax for initializer is = initializer-clause but again the resulting construct might be a function call upon a con- structor function with one assignment-expression as an argument; again, the function call is a full-expression. ] 15[Note: that the evaluation of a full-expression can include the evalu- ation of subexpressions that are not lexically part of the full- expression. 
For example, subexpressions involved in evaluating default argument expressions (_dcl.fct.default_) are considered to be created in the expression that calls the function, not the expression that defines the default argument. ]

16 [Note: operators can be regrouped according to the usual mathematical rules only where the operators really are associative or commutative.7) For example, in the following fragment
          int a, b;
          /*...*/
          a = a + 32760 + b + 5;
the expression statement behaves exactly the same as
          a = (((a + 32760) + b) + 5);
due to the associativity and precedence of these operators. Thus, the result of the sum (a + 32760) is next added to b, and that result is then added to 5 which results in the value assigned to a. On a machine in which overflows produce an exception and in which the range of values representable by an int is [-32768,+32767], the implementation cannot rewrite this expression as
          a = ((a + b) + 32765);
since if the values for a and b were, respectively, -32754 and -15, the sum a + b would produce an exception while the original expression would not; nor can the expression be rewritten either as
          a = ((a + 32765) + b);
or
          a = (a + (b + 32765));
since the values for a and b might have been, respectively, 4 and -8 or -17 and 12. However on a machine in which overflows do not produce an exception and in which the results of overflows are reversible, the above expression statement can be rewritten by the implementation in any of the above ways because the same result will occur. ]

_________________________
7) Overloaded operators are never assumed to be associative or commutative.

17 There is a sequence point at the completion of evaluation of each full-expression8).

18 When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body.
There is also a sequence point after the copying of a returned value and before the execution of any expres- sions outside the function9). Several contexts in C++ cause evalua- tion of a function call, even though no corresponding function call syntax appears in the translation unit. [Example: evaluation of a new expression invokes one or more allocation and constructor functions; see _expr.new_. For another example, invocation of a conversion func- tion (_class.conv.fct_) can arise in contexts in which no function call syntax appears. ] The sequence points at function-entry and function-exit (as described above) are features of the function calls as evaluated, whatever the syntax of the expression that calls the function might be. 19In the evaluation of each of the expressions a && b a || b a ? b : c a , b using the built-in meaning of the operators in these expressions (_expr.log.and_, _expr.log.or_, _expr.cond_, _expr.comma_) there is a sequence point after the evaluation of the first expression10). +------- BEGIN BOX 1 -------+ The Working Group is still discussing whether there is a sequence point after the operand of dynamic-cast is evaluated; this is a con- text from which an exception might be thrown, even though no function call is performed. This has not yet been voted upon by the Working Group, and it may be redundant with the sequence point at function- exit. +------- END BOX 1 -------+ _________________________ 8) As specified in _class.temporary_, after the "end-of-full- expression" sequence point, a sequence of zero or more invocations of destructor functions for temporary objects takes place, in reverse or- der of the construction of each temporary object. 9) The sequence point at the function return is not explicitly speci- fied in ISO C, and can be considered redundant with sequence points at full-expressions, but the extra clarity is important in C++. 
In C++, there are more ways in which a called function can terminate its exe- cution, such as the throw of an exception. 10) The operators indicated in this paragraph are the built-in opera- tors, as described in Clause _expr_. When one of these operators is overloaded (_over_) in a valid context, thus designating a user- defined operator function, the expression designates a function invo- cation, and the operands form an argument list, without an implied se- quence point between them. ______________________________________________________________________ 2 Lexical conventions [lex] ______________________________________________________________________ 1 The text of the program is kept in units called source files in this International Standard. A source file together with all the headers (_lib.headers_) and source files included (_cpp.include_) via the pre- processing directive #include, less any source lines skipped by any of the conditional inclusion (_cpp.cond_) preprocessing directives, is called a translation unit. [Note: a C++ program need not all be translated at the same time. ] 2 [Note: previously translated translation units and instantiation units can be preserved individually or in libraries. The separate transla- tion units of a program communicate (_basic.link_) by (for example) calls to functions whose identifiers have external linkage, manipula- tion of objects whose identifiers have external linkage, or manipula- tion of data files. Translation units can be separately translated and then later linked to produce an executable program. (_basic.link_). ] 2.1 Phases of translation [lex.phases] 1 The precedence among the syntax rules of translation is specified by the following phases.11) 1 Physical source file characters are mapped to the source character set (introducing new-line characters for end-of-line indicators) if necessary. Trigraph sequences (_lex.trigraph_) are replaced by corresponding single-character internal representations. 
Any source file character not in the basic source character set (_lex.charset_) is replaced by the universal-character-name that designates that character.12)

2 Each instance of a new-line character and an immediately preceding backslash character is deleted, splicing physical source lines to form logical source lines. A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character.

_________________________
11) Implementations must behave as if these separate phases occur, although in practice different phases might be folded together.
12) The process of handling extended characters is specified in terms of mapping to an encoding that uses only the basic source character set, and, in the case of character literals and strings, further mapping to the execution character set. In practical terms, however, any internal encoding may be used, so long as an actual extended character encountered in the input, and the same extended character expressed in the input as a universal-character-name (i.e. using the \uXXXX notation), are handled equivalently.

3 The source file is decomposed into preprocessing tokens (_lex.pptoken_) and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or partial comment13). Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined. The process of dividing a source file's characters into preprocessing tokens is context-dependent. [Example: see the handling of < within a #include preprocessing directive. ]

4 Preprocessing directives are executed and macro invocations are expanded. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively.
5 Each source character set member, escape sequence, or universal- character-name in character literals and string literals is con- verted to a member of the execution character set. 6 Adjacent character string literal tokens are concatenated. Adja- cent wide string literal tokens are concatenated. 7 White-space characters separating tokens are no longer signifi- cant. Each preprocessing token is converted into a token. (_lex.token_). The resulting tokens are syntactically and semanti- cally analyzed and translated. 8 Translated translation units and instantiation units are combined as follows: [Note: some or all of these may be supplied from a library. ] Each translated translation unit is examined to pro- duce a list of required instantiations. [Note: this may include instantiations which have been explicitly requested (_temp.explicit_). ] The definitions of the required templates are located. It is implementation-defined whether the source of the translation units containing these definitions is required to be available. [Note: an implementation could encode sufficient information into the translated translation unit so as to ensure the source is not required here. ] All the required instantia- tions are performed to produce instantiation units. [Note: these are similar to translated translation units, but contain no refer- ences to uninstantiated templates and no template definitions. ] The program is ill-formed if any instantiation fails. _________________________ 13) A partial preprocessing token would arise from a source file end- ing in one or more characters of a multi-character token followed by a "line-splicing" backslash. A partial comment would arise from a source file ending with an unclosed /* comment, or a // comment line that ends with a "line-splicing" backslash. 9 All external object and function references are resolved. Library components are linked to satisfy external references to functions and objects not defined in the current translation. 
All such translator output is collected into a program image which contains information needed for execution in its execution environment.

+------- BEGIN BOX 2 -------+
Corfield: The inclusion model for template compilation does not require the two sentences in phase 8:
     The definitions of the required templates are located. It is implementation-defined whether the source of the translation units containing these definitions is required to be available.
+------- END BOX 2 -------+

+------- BEGIN BOX 3 -------+
What about shared libraries?
+------- END BOX 3 -------+

2.2 Basic source character set [lex.charset]

1 The basic source character set consists of 96 characters: the space character, the control characters representing horizontal tab, vertical tab, form feed, and new-line, plus the following 91 graphical characters:
          a b c d e f g h i j k l m n o p q r s t u v w x y z
          A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
          0 1 2 3 4 5 6 7 8 9
          _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '

2 The universal-character-name construct provides a way to name other characters.
          hex-quad:
                  hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit

          universal-character-name:
                  \u hex-quad
                  \U hex-quad hex-quad
The character designated by the universal-character-name \UNNNNNNNN is that character whose encoding in ISO/IEC 10646 is the hexadecimal value NNNNNNNN; the character designated by the universal-character-name \uNNNN is that character whose encoding in ISO/IEC 10646 is the hexadecimal value 0000NNNN.

2.3 Trigraph sequences [lex.trigraph]

1 Before any other processing takes place, each occurrence of one of the following sequences of three characters ("trigraph sequences") is replaced by the single character indicated in Table 1.

                       Table 1--trigraph sequences
+-----------------------+------------------------+------------------------+
|trigraph  replacement  | trigraph  replacement  | trigraph  replacement  |
+-----------------------+------------------------+------------------------+
|  ??=          #       |   ??(          [       |   ??<          {       |
+-----------------------+------------------------+------------------------+
|  ??/          \       |   ??)          ]       |   ??>          }       |
+-----------------------+------------------------+------------------------+
|  ??'          ^       |   ??!          |       |   ??-          ~       |
+-----------------------+------------------------+------------------------+

2 [Example:
        ??=define arraycheck(a,b) a??(b??) ??!??! b??(a??)
  becomes
        #define arraycheck(a,b) a[b] || b[a]
  --end example]

3 [Note: no other trigraph sequence exists. Each ? that does not begin one of the trigraphs listed above is not changed. ]

2.4 Preprocessing tokens [lex.pptoken]

        preprocessing-token:
                header-name
                identifier
                pp-number
                character-literal
                string-literal
                preprocessing-op-or-punc
                each non-white-space character that cannot be one of the above

1 Each preprocessing token that is converted to a token (_lex.token_) shall have the lexical form of a keyword, an identifier, a literal, an operator, or a punctuator.

2 A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6. The categories of preprocessing token are: header names, identifiers, preprocessing numbers, character literals, string literals, preprocessing-op-or-punc, and single non-white-space characters that do not lexically match the other preprocessing token categories. If a ' or a " character matches the last category, the behavior is undefined. Preprocessing tokens can be separated by white space; this consists of comments (_lex.comment_), or white-space characters (space, horizontal tab, new-line, vertical tab, and form-feed), or both.
As described in Clause _cpp_, in certain circumstances during translation phase 4, white space (or the absence thereof) serves as more than preprocessing token separation. White space can appear within a preprocessing token only as part of a header name or between the quotation characters in a character literal or string literal.

3 If the input stream has been parsed into preprocessing tokens up to a given character, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token, even if that would cause further lexical analysis to fail.

4 [Example: The program fragment 1Ex is parsed as a preprocessing number token (one that is not a valid floating or integer literal token), even though a parse as the pair of preprocessing tokens 1 and Ex might produce a valid expression (for example, if Ex were a macro defined as +1). Similarly, the program fragment 1E1 is parsed as a preprocessing number (one that is a valid floating literal token), whether or not E is a macro name. ]

5 [Example: The program fragment x+++++y is parsed as x ++ ++ + y, which, if x and y are of built-in types, violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression. ]

2.5 Alternative tokens [lex.digraph]

1 Alternative token representations are provided for some operators and punctuators14).

2 In all respects of the language, each alternative token behaves the same, respectively, as its primary token, except for its spelling15). The set of alternative tokens is defined in Table 2.

_________________________
14) These include "digraphs" and additional reserved words. The term "digraph" (token consisting of two characters) is not perfectly descriptive, since one of the alternative preprocessing-tokens is %:%: and of course several primary tokens contain two characters. Nonetheless, those alternative tokens that aren't lexical keywords are colloquially known as "digraphs".
15) Thus [ and <: behave differently when "stringized" (_cpp.stringize_), but can otherwise be freely interchanged.

                      Table 2--alternative tokens
+----------------------+-----------------------+-----------------------+
|alternative  primary  | alternative  primary  | alternative  primary  |
+----------------------+-----------------------+-----------------------+
|    <%          {     |    and          &&    |   and_eq        &=    |
+----------------------+-----------------------+-----------------------+
|    %>          }     |    bitor        |     |   or_eq         |=    |
+----------------------+-----------------------+-----------------------+
|    <:          [     |    or           ||    |   xor_eq        ^=    |
+----------------------+-----------------------+-----------------------+
|    :>          ]     |    xor          ^     |   not           !     |
+----------------------+-----------------------+-----------------------+
|    %:          #     |    compl        ~     |   not_eq        !=    |
+----------------------+-----------------------+-----------------------+
|    %:%:        ##    |    bitand       &     |                       |
+----------------------+-----------------------+-----------------------+

2.6 Tokens [lex.token]

        token:
                identifier
                keyword
                literal
                operator
                punctuator

1 There are five kinds of tokens: identifiers, keywords, literals,16) operators, and other separators. Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments (collectively, "white space"), as described below, are ignored except as they serve to separate tokens. Some white space is required to separate otherwise adjacent identifiers, keywords, and literals.

2.7 Comments [lex.comment]

1 The characters /* start a comment, which terminates with the characters */. These comments do not nest. The characters // start a comment, which terminates with the next new-line character. If there is a form-feed or a vertical-tab character in such a comment, only white-space characters shall appear between it and the new-line that terminates the comment; no diagnostic is required. [Note: The comment characters //, /*, and */ have no special meaning within a // comment and are treated just like other characters.
Similarly, the comment characters // and /* have no special meaning within a /* comment. ]

_________________________
16) Literals include strings and character and numeric literals.

2.8 Header names [lex.header]

        header-name:
                <h-char-sequence>
                "q-char-sequence"
        h-char-sequence:
                h-char
                h-char-sequence h-char
        h-char:
                any member of the source character set except new-line and >
        q-char-sequence:
                q-char
                q-char-sequence q-char
        q-char:
                any member of the source character set except new-line and "

1 Header name preprocessing tokens shall only appear within a #include preprocessing directive (_cpp.include_). The sequences in both forms of header-names are mapped in an implementation-defined manner to external source file names as specified in _cpp.include_.

2 If the characters ', \, ", or /* appear in the sequence between the < and > delimiters, or between the " delimiters, the behavior is undefined.17)

_________________________
17) Thus, sequences of characters that resemble escape sequences cause undefined behavior.

2.9 Preprocessing numbers [lex.ppnumber]

        pp-number:
                digit
                . digit
                pp-number digit
                pp-number nondigit
                pp-number e sign
                pp-number E sign
                pp-number .

1 Preprocessing number tokens lexically include all integral literal tokens (_lex.icon_) and all floating literal tokens (_lex.fcon_).

2 A preprocessing number does not have a type or a value; it acquires both after a successful conversion (as part of translation phase 7, _lex.phases_) to an integral literal token or a floating literal token.

2.10 Identifiers [lex.name]

        identifier:
                nondigit
                identifier nondigit
                identifier digit
        nondigit: one of
                universal-character-name
                _ a b c d e f g h i j k l m
                  n o p q r s t u v w x y z
                  A B C D E F G H I J K L M
                  N O P Q R S T U V W X Y Z
        digit: one of
                0 1 2 3 4 5 6 7 8 9

1 An identifier is an arbitrarily long sequence of letters and digits. Each universal-character-name in an identifier shall designate a character whose encoding in ISO 10646 falls into one of the ranges specified in _extendid_.
Upper- and lower-case letters are different. All characters are significant.18)

2 In addition, identifiers containing a double underscore (__) or beginning with an underscore and an upper-case letter are reserved for use by C++ implementations and standard libraries and shall not be used otherwise; no diagnostic is required.

2.11 Keywords [lex.key]

1 The identifiers shown in Table 3 are reserved for use as keywords (that is, they are unconditionally treated as keywords in phase 7):

_________________________
18) On systems in which linkers cannot accept extended characters, an encoding of the universal-character-name may be used in forming valid external identifiers. For example, some otherwise unused character or sequence of characters may be used to encode the \u in a universal-character-name. Extended characters may produce a long external identifier, but C++ does not place a translation limit on significant characters for external identifiers. In C++, upper- and lower-case letters are considered different for all identifiers, including external identifiers.

                             Table 3--keywords
+--------------------------------------------------------------------------+
|asm           do              inline              short         typeid    |
|auto          double          int                 signed        typename  |
|bool          dynamic_cast    long                sizeof        union     |
|break         else            mutable             static        unsigned  |
|case          enum            namespace           static_cast   using     |
|catch         explicit        new                 struct        virtual   |
|char          extern          operator            switch        void      |
|class         false           private             template      volatile  |
|const         float           protected           this          wchar_t   |
|const_cast    for             public              throw         while     |
|continue      friend          register            true                    |
|default       goto            reinterpret_cast    try                     |
|delete        if              return              typedef                 |
+--------------------------------------------------------------------------+

2 Furthermore, the alternative representations shown in Table 4 for certain operators and punctuators (_lex.digraph_) are reserved and shall not be used otherwise:

              Table 4--alternative representations
+------------------------------------------------+
|and      and_eq   bitand   bitor   compl   not  |
|not_eq   or       or_eq    xor     xor_eq       |
+------------------------------------------------+

2.12 Operators and punctuators [lex.operators]

1 The lexical representation of C++ programs includes a number of preprocessing tokens which are used in the syntax of the preprocessor or are converted into tokens for operators and punctuators:

        preprocessing-op-or-punc: one of
                {       }       [       ]       #       ##      (       )
                <:      :>      <%      %>      %:      %:%:    ;       :       ...
                new     delete  ?       ::      .       .*
                +       -       *       /       %       ^       &       |       ~
                !       =       <       >       +=      -=      *=      /=      %=
                ^=      &=      |=      <<      >>      >>=     <<=     ==      !=
                <=      >=      &&      ||      ++      --      ,       ->*     ->
                and     and_eq  bitand  bitor   compl   not     not_eq  or      or_eq
                xor     xor_eq

  Each preprocessing-op-or-punc is converted to a single token in translation phase 7 (_lex.phases_).
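
The equivalence between primary tokens and the alternative tokens of Table 2 can be illustrated with a short sketch. The function names and test values below are hypothetical and chosen only for this illustration; the sketch assumes an implementation that accepts the alternative tokens without extra headers or options.

```cpp
#include <cassert>

// Hypothetical helpers, written once with primary tokens and once with
// the alternative tokens of Table 2. Names and values are illustrative.

bool in_range_primary(int v, int lo, int hi) {
    return !(v < lo) && !(v > hi);
}

// <% and %> stand for { and }; not and and stand for ! and &&.
bool in_range_alternative(int v, int lo, int hi) <%
    return not (v < lo) and not (v > hi);
%>

unsigned mix_primary(unsigned a, unsigned b) {
    return (a & b) | (~a ^ b);
}

// bitand, bitor, compl and xor stand for &, |, ~ and ^.
unsigned mix_alternative(unsigned a, unsigned b) {
    return (a bitand b) bitor (compl a xor b);
}

// <: and :> stand for [ and ].
int second_element() {
    int table<:3:> = <% 10, 20, 30 %>;
    return table<:1:>;
}

// Checks that both spellings produce the same results.
void check_alternative_tokens() {
    for (int v = -2; v <= 12; ++v)
        assert(in_range_primary(v, 0, 10) == in_range_alternative(v, 0, 10));
    assert(mix_primary(0xF0u, 0x3Cu) == mix_alternative(0xF0u, 0x3Cu));
    assert(second_element() == 20);
}
```

Calling check_alternative_tokens() from a program's main function exercises both spellings side by side; the only difference between them is the one noted in footnote 15, namely their spelling when stringized.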

2.13 Literals [lex.literal]

1 There are several kinds of literals.19)

        literal:
                integer-literal
                character-literal
                floating-literal
                string-literal
                boolean-literal

2.13.1 Integer literals [lex.icon]

        integer-literal:
                decimal-literal integer-suffixopt
                octal-literal integer-suffixopt
                hexadecimal-literal integer-suffixopt
        decimal-literal:
                nonzero-digit
                decimal-literal digit
        octal-literal:
                0
                octal-literal octal-digit
        hexadecimal-literal:
                0x hexadecimal-digit
                0X hexadecimal-digit
                hexadecimal-literal hexadecimal-digit
        nonzero-digit: one of
                1 2 3 4 5 6 7 8 9
        octal-digit: one of
                0 1 2 3 4 5 6 7
        hexadecimal-digit: one of
                0 1 2 3 4 5 6 7 8 9
                a b c d e f
                A B C D E F
        integer-suffix:
                unsigned-suffix long-suffixopt
                long-suffix unsigned-suffixopt
        unsigned-suffix: one of
                u U
        long-suffix: one of
                l L

1 An integer literal is a sequence of digits that has no period or exponent part. An integer literal may have a prefix that specifies its base and a suffix that specifies its type. The lexically first digit of the sequence of digits is the most significant. A decimal integer literal (base ten) begins with a digit other than 0 and consists of a sequence of decimal digits. An octal integer literal (base eight) begins with the digit 0 and consists of a sequence of octal digits.20) A hexadecimal integer literal (base sixteen) begins with 0x or 0X and consists of a sequence of hexadecimal digits which include the decimal digits and the letters a or A through f or F with decimal values ten through fifteen. [Example: the number twelve can be written 12, 014, or 0XC. ]

_________________________
19) The term "literal" generally designates, in this International Standard, those tokens that are called "constants" in ISO C.
20) The digits 8 and 9 are not octal digits.

2 The type of an integer literal depends on its form, value, and suffix.
If it is decimal and has no suffix, it has the first of these types in which its value can be represented: int, long int, unsigned long int.21) If it is octal or hexadecimal and has no suffix, it has the first of these types in which its value can be represented: int, unsigned int, long int, unsigned long int. If it is suffixed by u or U, its type is the first of these types in which its value can be represented: unsigned int, unsigned long int. If it is suffixed by l or L, its type is the first of these types in which its value can be represented: long int, unsigned long int. If it is suffixed by ul, lu, uL, Lu, Ul, lU, UL, or LU, its type is unsigned long int.

3 A program is ill-formed if one of its translation units contains an integer literal that cannot be represented by any of the allowed types.

_________________________
21) A decimal integer literal with no suffix never has type unsigned int. Otherwise, for example, on an implementation where unsigned int values have 16 bits and unsigned long values have strictly more than 17 bits, we would have -30000<0, -50000>0 (because 50000 would have type unsigned int), and -70000<0 (because 70000 would have type long).

2.13.2 Character literals [lex.ccon]

        character-literal:
                'c-char-sequence'
                L'c-char-sequence'
        c-char-sequence:
                c-char
                c-char-sequence c-char
        c-char:
                any member of the source character set except the single-quote ', backslash \, or new-line character
                escape-sequence
                universal-character-name
        escape-sequence:
                simple-escape-sequence
                octal-escape-sequence
                hexadecimal-escape-sequence
        simple-escape-sequence: one of
                \'  \"  \?  \\
                \a  \b  \f  \n  \r  \t  \v
        octal-escape-sequence:
                \ octal-digit
                \ octal-digit octal-digit
                \ octal-digit octal-digit octal-digit
        hexadecimal-escape-sequence:
                \x hexadecimal-digit
                hexadecimal-escape-sequence hexadecimal-digit

1 A character literal is one or more characters enclosed in single quotes, as in 'x', optionally preceded by the letter L, as in L'x'.
Single character literals that do not begin with L have type char, with value equal to the numerical value of the character in the execution character set. Multicharacter literals that do not begin with L have type int and implementation-defined value.

2 A character literal that begins with the letter L, such as L'ab', is a wide-character literal. Wide-character literals have type wchar_t.22) Wide-character literals have implementation-defined values, regardless of the number of characters in the literal.

3 Certain nongraphic characters, the single quote ', the double quote ", the question mark ?, and the backslash \, can be represented according to Table 5.

      Table 5--escape sequences
+----------------------------------+
|new-line          NL (LF)   \n    |
|horizontal tab    HT        \t    |
|vertical tab      VT        \v    |
|backspace         BS        \b    |
|carriage return   CR        \r    |
|form feed         FF        \f    |
|alert             BEL       \a    |
|backslash         \         \\    |
|question mark     ?         \?    |
|single quote      '         \'    |
|double quote      "         \"    |
|octal number      ooo       \ooo  |
|hex number        hhh       \xhhh |
+----------------------------------+

  The double quote " and the question mark ? can be represented as themselves or by the escape sequences \" and \? respectively, but the single quote ' and the backslash \ shall be represented by the escape sequences \' and \\ respectively. If the character following a backslash is not one of those specified, the behavior is undefined. An escape sequence specifies a single character.

_________________________
22) They are intended for character sets where a character does not fit into a single byte.

4 The escape \ooo consists of the backslash followed by one, two, or three octal digits that are taken to specify the value of the desired character. The escape \xhhh consists of the backslash followed by x followed by one or more hexadecimal digits that are taken to specify the value of the desired character. There is no limit to the number of digits in a hexadecimal sequence.
A sequence of octal or hexadecimal digits is terminated by the first character that is not an octal digit or a hexadecimal digit, respectively. The value of a character literal is implementation-defined if it falls outside of the implementation-defined range defined for char (for ordinary literals) or wchar_t (for wide literals).

5 A universal-character-name is translated to the encoding, in the execution character set, of the character named. If there is no such encoding, the universal-character-name is translated to an implementation-defined encoding. [Note: in translation phase 1, a universal-character-name is introduced whenever an actual extended character is encountered in the source text. Therefore, all extended characters are described in terms of universal-character-names. However, the actual compiler implementation may use its own native character set, so long as the same results are obtained. ]

2.13.3 Floating literals [lex.fcon]

        floating-literal:
                fractional-constant exponent-partopt floating-suffixopt
                digit-sequence exponent-part floating-suffixopt
        fractional-constant:
                digit-sequenceopt . digit-sequence
                digit-sequence .
        exponent-part:
                e signopt digit-sequence
                E signopt digit-sequence
        sign: one of
                + -
        digit-sequence:
                digit
                digit-sequence digit
        floating-suffix: one of
                f l F L

1 A floating literal consists of an integer part, a decimal point, a fraction part, an e or E, an optionally signed integer exponent, and an optional type suffix. The integer and fraction parts both consist of a sequence of decimal (base ten) digits. Either the integer part or the fraction part (not both) can be omitted; either the decimal point or the letter e (or E) and the exponent (not both) can be omitted. The integer part, the optional decimal point and the optional fraction part form the significant part of the floating literal. The exponent, if present, indicates the power of 10 by which the significant part is to be scaled.
If the scaled value is in the range of representable values for its type, the result is either the nearest representable value, or the larger or smaller representable value immediately adjacent to the nearest representable value, chosen in an implementation-defined manner. The type of a floating literal is double unless explicitly specified by a suffix. The suffixes f and F specify float, the suffixes l and L specify long double. If the scaled value is not in the range of representable values for its type, the program is ill-formed.

2.13.4 String literals [lex.string]

        string-literal:
                "s-char-sequenceopt"
                L"s-char-sequenceopt"
        s-char-sequence:
                s-char
                s-char-sequence s-char
        s-char:
                any member of the source character set except the double-quote ", backslash \, or new-line character
                escape-sequence
                universal-character-name

1 A string literal is a sequence of characters (as defined in _lex.ccon_) surrounded by double quotes, optionally beginning with the letter L, as in "..." or L"...". A string literal that does not begin with L has type "array of n char" and static storage duration (_basic.stc_), where n is the size of the string as defined below, and is initialized with the given characters. A string literal that begins with L, such as L"asdf", is a wide string literal. A wide string literal has type "array of n wchar_t" and has static storage duration, where n is the size of the string as defined below, and is initialized with the given characters.

2 Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined.

3 In translation phase 6 (_lex.phases_), adjacent string literals are concatenated and adjacent wide string literals are concatenated. If a string literal token is adjacent to a wide string literal token, the behavior is undefined. Characters in concatenated strings are kept distinct.
[Example:
        "\xA" "B"
  contains the two characters '\xA' and 'B' after concatenation (and not the single hexadecimal character '\xAB'). ]

4 After any necessary concatenation, in translation phase 7 (_lex.phases_), '\0' is appended to every string literal so that programs that scan a string can find its end.

5 Escape sequences and universal-character-names in string literals have the same meaning as in character literals (_lex.ccon_), except that the single quote ' is representable either by itself or by the escape sequence \', and the double quote " shall be preceded by a \. In a non-wide string literal, a universal-character-name may map to more than one char element. The size of a wide string literal is the total number of escape sequences, universal-character-names, and other characters, plus one for the terminating L'\0'. The size of a non-wide string literal is the total number of escape sequences and other characters, plus at least one for the multibyte encoding of each universal-character-name, plus one for the terminating '\0'.

2.13.5 Boolean literals [lex.bool]

        boolean-literal:
                false
                true

1 The Boolean literals are the keywords false and true. Such literals have type bool. They are not lvalues.

______________________________________________________________________

3 Basic concepts [basic]

______________________________________________________________________

1 [Note: this clause presents the basic concepts of the C++ language. It explains the difference between an object and a name and how they relate to the notion of an lvalue. It introduces the concepts of a declaration and a definition and presents C++'s notion of type, scope, linkage, and storage duration. The mechanisms for starting and terminating a program are discussed. Finally, this clause presents the fundamental types of the language and lists the ways of constructing compound types from these.

2 This clause does not cover concepts that affect only a single part of the language.
Such concepts are discussed in the relevant clauses. ]

3 An entity is a value, object, subobject, base class subobject, array element, variable, function, set of functions, instance of a function, enumerator, type, class member, template, or namespace.

4 A name is a use of an identifier (_lex.name_) that denotes an entity or label (_stmt.goto_, _stmt.label_). A variable is introduced by the declaration of an object. The variable's name denotes the object.

5 Every name that denotes an entity is introduced by a declaration. Every name that denotes a label is introduced either by a goto statement (_stmt.goto_) or a labeled-statement (_stmt.label_).

6 Some names denote types, classes, enumerations, or templates. In general, it is necessary to determine whether or not a name denotes one of these entities before parsing the program that contains it. The process that determines this is called name lookup (_basic.lookup_).

7 Two names are the same if

  --they are identifiers composed of the same character sequence; or

  --they are the names of overloaded operator functions formed with the same operator; or

  --they are the names of user-defined conversion functions formed with the same type.

8 An identifier used in more than one translation unit can potentially refer to the same entity in these translation units depending on the linkage (_basic.link_) of the identifier specified in each translation unit.

3.1 Declarations and definitions [basic.def]

1 A declaration (_dcl.dcl_) introduces names into a translation unit or redeclares names introduced by previous declarations. A declaration specifies the interpretation and attributes of these names.
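
As a minimal sketch of a declaration introducing a name and later declarations redeclaring it, consider the following. The names counter and next_id are hypothetical and do not come from this document; the last two declarations are also definitions in the sense of the paragraph that follows.

```cpp
#include <cassert>

extern int counter;     // introduces the name counter
extern int counter;     // redeclares counter; same interpretation and attributes

int next_id();          // introduces the name next_id
int next_id();          // redeclares next_id

int counter = 0;        // redeclares counter and also defines it

int next_id() {         // redeclares next_id and also defines it
    return ++counter;
}
```

Every declaration of a given name here specifies the same type, so all of them refer to one entity.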

2 A declaration is a definition unless it declares a function without specifying the function's body (_dcl.fct.def_), it contains the extern specifier (_dcl.stc_) and neither an initializer nor a function-body, it declares a static data member in a class declaration (_class.static_), it is a class name declaration (_class.name_), or it is a typedef declaration (_dcl.typedef_), a using declaration (_namespace.udecl_), or a using directive (_namespace.udir_).

3 [Example: all but one of the following are definitions:
        int a;                          // defines a
        extern const int c = 1;         // defines c
        int f(int x) { return x+a; }    // defines f and defines x
        struct S { int a; int b; };     // defines S
        struct X {                      // defines X
            int x;                      // defines nonstatic data member x
            static int y;               // declares static data member y
            X(): x(0) { }               // defines a constructor of X
        };
        int X::y = 1;                   // defines X::y
        enum { up, down };              // defines up and down
        namespace N { int d; }          // defines N and N::d
        namespace N1 = N;               // defines N1
        X anX;                          // defines anX
  whereas these are just declarations:
        extern int a;                   // declares a
        extern const int c;             // declares c
        int f(int);                     // declares f
        struct S;                       // declares S
        typedef int Int;                // declares Int
        extern X anotherX;              // declares anotherX
        using N::d;                     // declares N::d
  --end example]

4 [Note: in some circumstances, C++ implementations implicitly define the default constructor (_class.ctor_), copy constructor (_class.copy_), assignment operator (_class.copy_), or destructor (_class.dtor_) member functions. [Example: given
        struct C {
            string s;    // string is the standard library class (_lib.string_)
        };

        int main()
        {
            C a;
            C b = a;
            b = a;
        }
  the implementation will implicitly define functions to make the definition of C equivalent to
        struct C {
            string s;
            C(): s() { }
            C(const C& x): s(x.s) { }
            C& operator=(const C& x) { s = x.s; return *this; }
            ~C() { }
        };
  --end example] --end note]

5 [Note: a class name can also be implicitly declared by an elaborated-type-specifier (_basic.scope.pdecl_).
]

6 A program is ill-formed if the definition of any object gives the object an incompletely-defined object type (_basic.types_).

3.2 One definition rule [basic.def.odr]

1 No translation unit shall contain more than one definition of any variable, function, class type, enumeration type or template.

2 A function is used if it is called, its address is taken, it is used to form a pointer to member, or it is a virtual member function that is not pure (_class.abstract_). Every program shall contain at least one definition of every function that is used in that program. That definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see _class.ctor_, _class.dtor_ and _class.copy_). If a non-virtual function is not defined, a diagnostic is required only if an attempt is actually made to call that function. If a virtual function is not defined and it is neither called nor used to form a pointer to member, no diagnostic is required.

+------- BEGIN BOX 4 -------+
This says nothing about user-defined libraries. Probably it shouldn't, but perhaps it should be more explicit that it isn't discussing it.
+------- END BOX 4 -------+

3 A non-local variable with static storage duration shall have exactly one definition in a program unless the variable either has a built-in type or is an aggregate and unless it is either unused or used only as the operand of the sizeof operator.

+------- BEGIN BOX 5 -------+
This is still uncertain.
+------- END BOX 5 -------+

4 Exactly one definition of a class is required in a translation unit if the class is used in a way that requires the class type to be complete.
[Example: the following complete translation unit is well-formed, even though it never defines X:
        struct X;       // declare X as a struct type
        struct X* x1;   // use X in pointer formation
        X* x2;          // use X in pointer formation
  --end example] [Note: the rules for declarations and expressions describe in which contexts complete class types are required. A class type T must be complete if:

  --an object of type T is defined (_basic.def_, _expr.new_), or

  --an lvalue-to-rvalue conversion is applied to an lvalue referring to an object of type T (_conv.lval_), or

  --an expression is converted (either implicitly or explicitly) to type T (_conv_, _expr.type.conv_, _expr.dynamic.cast_, _expr.static.cast_, _expr.cast_), or

  --an expression is converted to the type pointer to T or reference to T using an implicit conversion (_conv_), a dynamic_cast (_expr.dynamic.cast_) or a static_cast (_expr.static.cast_), or

  --a class member access operator is applied to an object expression of type T (_expr.ref_), or

  --the typeid operator (_expr.typeid_) or the sizeof operator (_expr.sizeof_) is applied to an operand of type T, or

  --a function with a return type of type T is called (_expr.call_), or

  --an lvalue of type T is assigned to (_expr.ass_). ]

5 There can be more than one definition of a class type (_class_), enumeration type (_dcl.enum_), inline function with external linkage (_dcl.fct.spec_), class template (_temp_), non-static function template (_temp.fct_), static data member of a class template (_temp.static_), member function template (_temp.mem.func_), or template specialization for which some template parameters are not specified (_temp.spec_, _temp.class.spec_) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements.
Given such an entity named D defined in more than one translation unit, then

  --each definition of D shall consist of the same sequence of tokens; and

  --in each definition of D, corresponding names, looked up according to _basic.lookup_, shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution (_over.match_) and after matching of partial template specialization (_temp.over_), except that a name can refer to a const object with internal or no linkage if the object has the same integral or enumeration type in all definitions of D, and the object is initialized with a constant expression (_expr.const_), and the value (but not the address) of the object is used, and the object has the same value in all definitions of D; and

  --in each definition of D, the overloaded operators referred to, the implicit calls to conversion operators, constructors, operator new functions and operator delete functions, shall refer to the same function, or to a function defined within the definition of D; and

  --in each definition of D, a default argument used by an (implicit or explicit) function call is treated as if its token sequence were present in the definition of D; that is, the default argument is subject to the three requirements described above (and, if the default argument has sub-expressions with default arguments, this requirement applies recursively).23)

  --if D is a class with an implicitly-declared constructor (_class.ctor_), it is as if the constructor was implicitly defined in every translation unit where it is used, and the implicit definition in every translation unit shall call the same constructor for a base class or a class member of D.
    [Example:
        // translation unit 1:
        struct X {
            X(int);
            X(int, int);
        };
        X::X(int = 0) { }
        class D: public X { };
        D d2;                   // X(int) called by D()

        // translation unit 2:
        struct X {
            X(int);
            X(int, int);
        };
        X::X(int = 0, int = 0) { }
        class D: public X { };  // X(int, int) called by D();
                                // D()'s implicit definition
                                // violates the ODR
    --end example]

  If D is a template, and is defined in more than one translation unit, then the last four requirements from the list above shall apply to names from the template's enclosing scope used in the template definition (_temp.nondep_), and also to dependent names at the point of instantiation (_temp.dep_). If the definitions of D satisfy all these requirements, then the program shall behave as if there were a single definition of D. If the definitions of D do not satisfy these requirements, then the behavior is undefined.

_________________________
23) _dcl.fct.default_ describes how default argument names are looked up.

3.3 Declarative regions and scopes [basic.scope]

1 Every name is introduced in some portion of program text called a declarative region, which is the largest part of the program in which that name is valid, that is, in which that name may be used as an unqualified name to refer to the same entity. In general, each particular name is valid only within some possibly discontiguous portion of program text called its scope. To determine the scope of a declaration, it is sometimes convenient to refer to the potential scope of a declaration. The scope of a declaration is the same as its potential scope unless the potential scope contains another declaration of the same name. In that case, the potential scope of the declaration in the inner (contained) declarative region is excluded from the scope of the declaration in the outer (containing) declarative region.

2 [Example: in
        int j = 24;
        int main()
        {
            int i = j, j;
            j = 42;
        }
  the identifier j is declared twice as a name (and used twice).
The declarative region of the first j includes the entire example. The potential scope of the first j begins immediately after that j and extends to the end of the program, but its (actual) scope excludes the text between the , and the }. The declarative region of the second declaration of j (the j immediately before the semicolon) includes all the text between { and }, but its potential scope excludes the decla- ration of i. The scope of the second declaration of j is the same as its potential scope. ] 3 The names declared by a declaration are introduced into the scope in which the declaration occurs, except that the presence of a friend specifier (_class.friend_), certain uses of the elaborated-type- specifier (_basic.scope.pdecl_), and using-directives (_names- pace.udir_) alter this general behavior. 4 [Note: the name look up rules are summarized in _basic.lookup_. ] 3.3.1 Point of declaration [basic.scope.pdecl] 1 The point of declaration for a name is immediately after its complete declarator (_dcl.decl_) and before its initializer (if any), except as noted below. [Example: int x = 12; { int x = x; } Here the second x is initialized with its own (indeterminate) value. ] 2 [Note: a nonlocal name remains visible up to the point of declaration of the local name that hides it. [Example: const int i = 2; { int i[i]; } declares a local array of two integers. ] ] 3 The point of declaration for an enumerator is immediately after its enumerator-definition. [Example: const int x = 12; { enum { x = x }; } Here, the enumerator x is initialized with the value of the constant x, namely 12. ] 4 After the point of declaration of a class member, the member name can be looked up in the scope of its class. [Note: this is true even if the class is an incomplete class. 
For example, struct X { enum E { z = 16 }; int b[X::z]; //ok }; --end note] 5 The point of declaration of a class first declared in an elaborated- type-specifier is as follows: --if the elaborated-type-specifier has the form: class-key identifier ; the elaborated-type-specifier declares the identifier to be a class- name in the scope that contains the declaration, otherwise --if the elaborated-type-specifier has the form class-key identifier ... the identifier is declared as a class-name in the smallest non- class, non-function prototype scope that contains the declaration. [Note: except for the friend class declaration case mentioned below, any other form of elaborated-type-specifier must refer to an already declared class-name or enum-name; see _basic.lookup.elab_. ] 6 A class declared as a friend with a declaration of the form: friend class-key identifier ; and not previously declared is introduced in the smallest enclosing non-class scope that contains the friend declaration. A function declared as a friend and not previously declared, is introduced in the smallest enclosing non-class scope that contains the friend declara- tion. [Note: when looking for a prior declaration of a class or func- tion introduced by a friend declaration, scopes outside of the inner- most enclosing namespace scope are not considered; see _names- pace.memdef_. ] 7 [Note: For point of instantiation of a template, see _temp.inst_. ] 3.3.2 Local scope [basic.scope.local] 1 A name declared in a block (_stmt.block_) is local to that block. Its potential scope begins at its point of declaration (_basic.scope.pdecl_) and ends at the end of its declarative region. 2 The potential scope of a function parameter name in a function defini- tion (_dcl.fct.def_) begins at its point of declaration and ends at the end of the outermost block of the function definition. A parame- ter name shall not be redeclared in the outermost block of the func- tion definition. 
3 The name in a catch exception-declaration is local to the handler and shall not be redeclared in the outermost block of the handler. 4 Names declared in the for-init-statement, and in the condition of if, while, for, and switch statements are local to the if, while, for, or switch statement (including the controlled statement), and shall not be redeclared in a subsequent condition of that statement nor in the outermost block (or, for the if statement, any of the outermost blocks) of the controlled statement; see _stmt.select_. 3.3.3 Function prototype scope [basic.scope.proto] 1 In a function declaration, or in any function declarator except the declarator of a function definition (_dcl.fct.def_), names of parame- ters (if supplied) have function prototype scope, which terminates at the end of the nearest enclosing function declarator. 3.3.4 Function scope 1 Labels (_stmt.label_) have function scope and may be used anywhere in the function in which they are declared. Only labels have function scope. 3.3.5 Namespace scope [basic.scope.namespace] 1 The declarative region of a namespace-definition is its namespace- body. The potential scope denoted by an original-namespace-name is the concatenation of the declarative regions established by each of the namespace-definitions in the same declarative region with that original-namespace-name. Entities declared in a namespace-body are said to be members of the namespace, and names introduced by these declarations into the declarative region of the namespace are said to be member names of the namespace. A namespace member name has names- pace scope. Its potential scope includes its namespace from the name's point of declaration (_basic.scope.pdecl_) onwards, as well as the potential scope of any using directive (_namespace.udir_) that nominates its namespace. 
[Example:
        namespace N {
                int i;
                int g(int a) { return a; }
                int k();
                void q();
        }
        namespace { int l=1; }
        // the potential scope of l is from its point of declaration
        // to the end of the translation unit
        namespace N {
                int g(char a)         // overloads N::g(int)
                {
                        return l+a;   // l is from unnamed namespace
                }
                int i;                // error: duplicate definition
                int k();              // ok: duplicate function declaration
                int k()               // ok: definition of N::k()
                {
                        return g(i);  // calls N::g(int)
                }
                int q();              // error: different return type
        }
--end example]

2 A namespace member can also be referred to after the :: scope resolution operator (_expr.prim_) applied to the name of its namespace; see _namespace.qual_.

3 A name declared outside all named or unnamed namespaces (_basic.namespace_), blocks (_stmt.block_) and classes (_class_) has global namespace scope (also called global scope). The potential scope of such a name begins at its point of declaration (_basic.scope.pdecl_) and ends at the end of the translation unit that is its declarative region. Names declared in the global namespace scope are said to be global.

3.3.6 Class scope [basic.scope.class]

1 The following rules describe the scope of names declared in classes.

  1)The potential scope of a name declared in a class consists not only of the declarative region following the name's declarator, but also of all function bodies, default arguments, and constructor ctor-initializers in that class (including such things in nested classes).

  2)A name N used in a class S shall refer to the same declaration when re-evaluated in its context and in the completed scope of S.

  3)If reordering member declarations in a class yields an alternate valid program under (1) and (2), the program is ill-formed, no diagnostic is required.

  4)A name declared within a member function hides a declaration of the same name whose scope extends to or past the end of the member function's class.
  5)The potential scope of a declaration that extends to or past the end of a class definition also extends to the regions defined by its member definitions, even if the members are defined lexically outside the class (this includes static data member definitions, nested class definitions and member function definitions (that is, the parameter-declaration-clause including default arguments (_dcl.fct.default_), the member function body and, for constructor functions (_class.ctor_), the ctor-initializer (_class.base.init_))).

[Example:
        typedef int c;
        enum { i = 1 };
        class X {
                char v[i];                     // error: 'i' refers to ::i
                                               // but when reevaluated is X::i
                int f() { return sizeof(c); }  // okay: X::c
                char c;
                enum { i = 2 };
        };
        typedef char* T;
        struct Y {
                T a;             // error: 'T' refers to ::T
                                 // but when reevaluated is Y::T
                typedef long T;
                T b;
        };
        struct Z {
                int f(const R);  // error: 'R' is parameter name
                                 // but swapping the two declarations
                                 // changes it to a type
                typedef int R;
        };
--end example]

2 The name of a class member shall only be used as follows:

  --in the scope of its class (as described above) or a class derived (_class.derived_) from its class,

  --after the . operator applied to an expression of the type of its class (_expr.ref_) or a class derived from its class,

  --after the -> operator applied to a pointer to an object of its class (_expr.ref_) or a class derived from its class,

  --after the :: scope resolution operator (_expr.prim_) applied to the name of its class or a class derived from its class,

  --or after a using declaration (_namespace.udecl_).

3 [Note: the scope of names introduced by friend declarations is described in _basic.scope.pdecl_. ]

3.3.7 Name hiding [basic.scope.hiding]

1 A name can be hidden by an explicit declaration of that same name in a nested declarative region or derived class (_class.member.lookup_).
2 A class name (_class.name_) or enumeration name (_dcl.enum_) can be hidden by the name of an object, function, or enumerator declared in the same scope. If a class or enumeration name and an object, func- tion, or enumerator are declared in the same scope (in any order) with the same name, the class or enumeration name is hidden wherever the object, function, or enumerator name is visible. 3 In a member function definition, the declaration of a local name hides the declaration of a member of the class with the same name; see _basic.scope.class_. The declaration of a member in a derived class (_class.derived_) hides the declaration of a member of a base class of the same name; see _class.member.lookup_. 4 If a name is in scope and is not hidden it is said to be visible. 3.4 Name look up [basic.lookup] 1 The name look up rules apply uniformly to all names (including type- def-names (_dcl.typedef_), namespace-names (_basic.namespace_) and class-names (_class.name_)) wherever the grammar allows such names in the context discussed by a particular rule. Name look up associates the use of a name with a declaration (_basic.def_) of that name. Name look up shall find an unambiguous declaration for the name (see _class.member.lookup_). Name look up may associate more than one dec- laration with a name if it finds the name to be a function name; the declarations are said to form a set of overloaded functions (_over.load_). Overload resolution (_over.match_) takes place after name look up has succeeded. The access rules (_class.access_) are considered only once name look up and function overload resolution (if applicable) have succeeded. Only after name look up, function over- load resolution (if applicable) and access checking have succeeded are the attributes introduced by the name's declaration used further in expression processing (_expr_). 2 A name "looked up in the context of an expression" is looked up as an unqualified name in the scope where the expression is found. 
3 [Note: _basic.link_ discusses linkage issues. The notions of scope, point of declaration and name hiding are discussed in _basic.scope_. ]

3.4.1 Unqualified name look up [basic.lookup.unqual]

1 In all the cases listed in this subclause, the scopes are searched for a declaration in the order listed in each of the respective categories; name look up ends as soon as a declaration is found for the name. If no declaration is found, the program is ill-formed.

2 The declarations from the namespace nominated by a using-directive become visible in a namespace enclosing the using-directive; see _namespace.udir_. For the purpose of the unqualified name look up rules described in this subclause, the declarations from the namespace nominated by the using-directive are considered members of the enclosing namespace.

3 A name used in global scope, outside of any function, class or user-declared namespace, shall be declared before its use in global scope.

4 A name used in a user-declared namespace outside of the definition of any function or class shall be declared before its use in that namespace or before its use in a namespace enclosing its namespace.

5 A name used in the definition of a function24) that is a member of namespace N (where, only for the purpose of exposition, N could represent the global scope) shall be declared before its use in the block in which it is used or in one of its enclosing blocks (_stmt.block_) or, shall be declared before its use in namespace N or, if N is a nested namespace, shall be declared before its use in one of N's enclosing namespaces.
[Example: namespace A { namespace N { void f(); } } void A::N::f() { i = 5; // The following scopes are searched for a declaration of i: // 1) function scope of A::N::f, before the use of i // 2) scope of namespace N // 3) scope of namespace A // 4) global scope, before the definition of A::N::f } --end example] 6 A name used in the definition of a class X outside of a member func- tion body or nested class definition25) shall be declared in one of the following ways: --before its use in class X or be a member of a base class of X (_class.member.lookup_), or _________________________ 24) This refers to unqualified names following the function declara- tor; such a name may be used as a type or as a default argument name in the parameter-declaration-clause, or may be used in the function body. 25) This refers to unqualified names following the class name; such a name may be used in the base-clause or may be used in the class defi- nition. --if X is a nested class of class Y (_class.nest_), before the defini- tion of X in Y, or shall be a member of a base class of Y (this look up applies in turn to Y's enclosing classes, starting with the innermost enclosing class),26) or --if X is a local class (_class.local_) or is a nested class of a local class, before the definition of class X in a block enclosing the definition of class X, or --if X is a member of namespace N, or is a nested class of a class that is a member of N, or is a local class or a nested class within a local class of a function that is a member of N, before the defi- nition of class X in namespace N or in one of N's enclosing names- paces. 
[Example: namespace M { class B { }; } namespace N { class Y : public M::B { class X { int a[i]; }; }; } // The following scopes are searched for a declaration of i: // 1) scope of class N::Y::X, before the use of i // 2) scope of class N::Y, before the definition of N::Y::X // 3) scope of N::Y's base class M::B // 4) scope of namespace N, before the definition of N::Y // 5) global scope, before the definition of N --end example] [Note: when looking for a prior declaration of a class or function introduced by a friend declaration, scopes outside of the innermost enclosing namespace scope are not considered; see _names- pace.memdef_. ] [Note: _basic.scope.class_ further describes the restrictions on the use of names in a class definition. _class.nest_ further describes the restrictions on the use of names in nested class definitions. _class.local_ further describes the restrictions on the use of names in local class definitions. ] 7 A name used in the definition of a function that is a member function (_class.mfct_)27) of class X shall be declared in one of the following _________________________ 26) This look up applies whether the definition of X is nested within Y's definition or whether X's definition appears in a namespace scope enclosing Y's definition (_class.nest_). 27) That is, an unqualified name following the function declarator; such a name may be used as a type or as a default argument name in the parameter-declaration-clause, or may be used in the function body, or, if the function is a constructor, may be used in the expression of a mem-initializer. 
ways: --before its use in the block in which it is used or in an enclosing block (_stmt.block_), or --shall be a member of class X or be a member of a base class of X (_class.member.lookup_), or --if X is a nested class of class Y (_class.nest_), shall be a member of Y, or shall be a member of a base class of Y (this look up applies in turn to Y's enclosing classes, starting with the inner- most enclosing class),28) or --if X is a local class (_class.local_) or is a nested class of a local class, before the definition of class X in a block enclosing the definition of class X, or --if X is a member of namespace N, or is a nested class of a class that is a member of N, or is a local class or a nested class within a local class of a function that is a member of N, before the member function definition, in namespace N or in one of N's enclosing named namespaces. [Example: class B { }; namespace M { namespace N { class X : public B { void f(); }; } } void M::N::X::f() { i = 16; } // The following scopes are searched for a declaration of i: // 1) function scope of M::N::X::f, before the use of i // 2) scope of class M::N::X // 3) scope of M::N::X's base class B // 4) scope of namespace M::N // 5) scope of namespace M // 6) global scope, before the definition of M::N::X::f --end example] [Note: _class.mfct_ and _class.static_ further describe the restrictions on the use of names in member function defi- nitions. _class.nest_ further describes the restrictions on the use of names in the scope of nested classes. _class.local_ further describes the restrictions on the use of names in local class defini- tions. ] _________________________ 28) This look up applies whether the member function is defined within the definition of class X or whether the member function is defined in a namespace scope enclosing X's definition. 
8 Name look up for a name used in the definition of a friend function (_class.friend_) defined inline in the class granting friendship shall proceed as described for look up in member function definitions. If the friend function is not defined in the class granting friendship, name look up in the friend function definition shall proceed as described for look up in namespace member function definitions. 9 A name used in a function parameter-declaration-clause as a default argument (_dcl.fct.default_) or used in the expression of a mem- initializer (_class.base.init_) is looked up as if the name were used in the outermost block of the function definition. In particular, the function parameter names are visible for name look up in default argu- ments and in mem-initializers. [Note: _dcl.fct.default_ further describes the restrictions on the use of names in default arguments. _class.base.init_ further describes the restrictions on the use of names in a ctor-initializer. ] 10A name used in the definition of a static member of class X (_class.static.data_) (after the qualified-id of the static member) is looked up as if the name was used in a member function of X. [Note: _class.static.data_ further describes the restrictions on the use of names in the definition of a static data member. ] 11A name used in the handler for a function-try-block (_except_) is looked up as if the name was used in the outermost block of the func- tion definition. In particular, the function parameter names shall not be redeclared in the exception-declaration or in the outermost block of a handler for the function-try-block. Names declared in the outermost block of the function definition are not found when looked up in the scope of a handler for the function-try-block. 12[Note: the rules for name look up in template definitions are described in _temp.res_. 
] 3.4.2 Qualified name look up [basic.lookup.qual] 1 The name of a class or namespace member can be referred to after the :: scope resolution operator (_expr.prim_) applied to a nested-name- specifier that nominates its class or namespace. During the look up for a name preceding the :: scope resolution operator, object, func- tion, and enumerator names are ignored. If the name found is not a class-name (_class_) or namespace-name (_namespace.def_), the program is ill-formed. [Example: class A { public: static int n; }; int main() { int A; A::n = 42; // OK A b; // ill-formed: A does not name a type } --end example] 2 [Note: Multiply qualified names, such as N1::N2::N3::n, can be used to refer to members of nested classes (_class.nest_) or members of nested namespaces. ] 3 In a declaration in which the declarator-id is a qualified-id, names used before the qualified-id being declared are looked up in the defining namespace scope; names following the qualified-id are looked up in the scope of the member's class or namespace. [Example: class X { }; class C { class X { }; static const int number = 50; static X arr[number]; }; X C::arr[number]; // ill-formed: // equivalent to: ::X C::arr[C::number]; // not to: C::X C::arr[C::number]; --end example] 4 A name prefixed by the unary scope operator :: (_expr.prim_) is looked up in global scope, in the translation unit where it is used. The name shall be declared in global namespace scope or shall be a name whose declaration is visible in global scope because of a using direc- tive (_namespace.qual_). The use of :: allows a global name to be referred to even if its identifier has been hidden (_basic.scope.hiding_). 5 A nested-name-specifier that names a scalar type, followed by ::, fol- lowed by ~type-name is a pseudo-destructor-name for a scalar type (_expr.pseudo_). The type-name is looked up as a type in the scope of the nested-name-specifier. 
[Example: struct A { typedef int I; }; typedef int I1, I2; extern int* p; extern int* q; p->A::I::~I(); // I is looked up in the scope of A q->I1::~I2(); // I2 is looked up in the scope of // the postfix-expression --end example] [Note: _basic.lookup.classref_ describes how name look up proceeds after the . and -> operators. ] 3.4.2.1 Class members [class.qual] 1 If the nested-name-specifier of a qualified-id nominates a class, the name specified after the nested-name-specifier is looked up in the scope of the class. The name shall represent a member of that class or a member of one of its base classes (_class.derived_). [Note: a class member can be referred to using a qualified-id as soon as the member point of declaration (_basic.scope.pdecl_) in the class member- specification has been encountered. ] [Note: _class.member.lookup_ describes how name look up proceeds in class scope. ] 2 A class member name hidden by a name in a nested declarative region or by the name of a derived class member can still be found if qualified by the name of its class followed by the :: operator. 3.4.2.2 Namespace members [namespace.qual] 1 If the nested-name-specifier of a qualified-id nominates a namespace, the name specified after the nested-name-specifier is looked up in the scope of the namespace. 2 Given X::m, where X is a namespace, if m is declared directly in X, let S be the set of all such declarations of m. Else if there are no using-directives in X, S is the empty set. Else let S be the union of all sets of declarations of m found in the namespaces designated by the using-directives in X. If m is declared directly in these names- paces, let S be the set of all such declarations of m. Else if these namespaces do not contain any using-directives, S is the empty set. Else, this search is applied recursively to the namespaces designated by the using-directives in these namespaces. No namespace is searched more than once in the lookup of a name. 
If S is the empty set the program is ill-formed, otherwise S is the required set of declarations of m. If S has exactly one member then X::m refers to that member. Otherwise if the use of m is not one that allows a unique declaration to be chosen from S, the program is ill-formed. [Note: the choice could be made by overload resolution (_over.match_) or resolution between class names and non-class names (_class.name_). For example:

        int x;
        namespace Y {
                void f(float);
                void h(int);
        }
        namespace Z {
                void h(double);
        }
        namespace A {
                using namespace Y;
                void f(int);
                void g(int);
                int i;
        }
        namespace B {
                using namespace Z;
                void f(char);
                int i;
        }
        namespace AB {
                using namespace A;
                using namespace B;
                void g();
        }
        void h()
        {
                AB::g();      // g is declared directly in AB,
                              // therefore S is { AB::g() } and AB::g() is chosen
                AB::f(1);     // f is not declared directly in AB so the rules are
                              // applied recursively to A and B;
                              // namespace Y is not searched and Y::f(float)
                              // is not considered;
                              // S is { A::f(int), B::f(char) } and overload
                              // resolution chooses A::f(int)
                AB::f('c');   // as above but resolution chooses B::f(char)
                AB::x++;      // x is not declared directly in AB, and
                              // is not declared in A or B, so the rules are
                              // applied recursively to Y and Z,
                              // S is { } so the program is ill-formed
                AB::i++;      // i is not declared directly in AB so the rules are
                              // applied recursively to A and B,
                              // S is { A::i, B::i } so the use is ambiguous
                              // and the program is ill-formed
                AB::h(16.8);  // h is not declared directly in AB and
                              // not declared directly in A or B so the rules are
                              // applied recursively to Y and Z,
                              // S is { Y::h(int), Z::h(double) } and overload
                              // resolution chooses Z::h(double)
        }

3 The same declaration found more than once is not an ambiguity (because it is still a unique declaration).
For example: namespace A { int a; } namespace B { using namespace A; } namespace C { using namespace A; } namespace BC { using namespace B; using namespace C; } void f() { BC::a++; // ok: S is { A::a, A::a } } namespace D { using A::a; } namespace BD { using namespace B; using namespace D; } void g() { BD::a++; // ok: S is { A::a, A::a } } 4 Since each referenced namespace is searched at most once, the follow- ing is well-defined: namespace B { int b; } namespace A { using namespace B; int a; } namespace B { using namespace A; } void f() { A::a++; // ok: a declared directly in A, S is { A::a } B::a++; // ok: both A and B searched (once), S is { A::a } A::b++; // ok: both A and B searched (once), S is { B::b } B::b++; // ok: b declared directly in B, S is { B::b } } --end note] 5 During the look up of a qualified namespace member name, if the look up finds more than one declaration of the member, and if one declara- tion introduces a class name or enumeration name and the other decla- rations either introduce the same object, the same enumerator or a set of functions, the non-type name hides the class or enumeration name if and only if the declarations are from the same namespace; otherwise (the declarations are from different namespaces), the program is ill- formed. [Example: namespace A { struct x { }; int x; int y; } namespace B { struct y {}; } namespace C { using namespace A; using namespace B; int i = C::x; // ok, A::x (of type 'int') int j = C::y; // ambiguous, A::y or B::y } --end example] 6 In a declaration for a namespace member in which the declarator-id is a qualified-id, given that the qualified-id for the namespace member has the form nested-name-specifier unqualified-id the unqualified-id shall name a member of the namespace designated by the nested-name-specifier. 
[Example:
        namespace A {
                namespace B {
                        void f1(int);
                }
                using namespace B;
        }
        void A::f1(int){}  // ill-formed, f1 is not a member of A
--end example]

However, in such namespace member declarations, the nested-name-specifier may rely on using-directives to implicitly provide the initial part of the nested-name-specifier. [Example:
        namespace A {
                namespace B {
                        void f1(int);
                }
        }
        namespace C {
                namespace D {
                        void f1(int);
                }
        }
        using namespace A;
        using namespace C::D;
        void B::f1(int){}  // okay, defines A::B::f1(int)
        void f1(int){}     // okay, defines C::D::f1(int)
--end example]

3.4.3 Elaborated type specifiers [basic.lookup.elab]

1 An elaborated-type-specifier may be used to refer to a previously declared class-name or enum-name even though the name has been hidden by an object, function, or enumerator declaration (_basic.scope.hiding_). The class-name or enum-name in the elaborated-type-specifier may either be a simple identifier or be a qualified-id.

2 If the name in the elaborated-type-specifier is a simple identifier, and unless the elaborated-type-specifier has the following form:
        class-key identifier ;
the identifier is looked up according to _basic.lookup.unqual_ but ignoring any objects, functions or enumerators that have been declared. If this name look up finds a typedef-name, the elaborated-type-specifier is ill-formed. If the elaborated-type-specifier refers to an enum-name and this look up does not find a previously declared enum-name, the elaborated-type-specifier is ill-formed. If the elaborated-type-specifier refers to a class-name and this look up does not find a previously declared class-name, or if the elaborated-type-specifier has the form:
        class-key identifier ;
the elaborated-type-specifier is a declaration that introduces the class-name as described in _basic.scope.pdecl_.
3 If the name is a qualified-id, the name is looked up according to its qualifications, as described in _basic.lookup.qual_, but ignoring any objects, functions or enumerators that have been declared. If this name look up finds a typedef-name, the elaborated-type-specifier is ill-formed. If this name look up does not find a previously declared class-name or enum-name, the elaborated-type-specifier is ill-formed. [Example:
        struct Node {
                struct Node* Next;       // ok: Refers to Node at global scope
                struct Data* Data;       // ok: Declares type Data
                                         // at global scope and member Data
        };
        struct Data {
                struct Node* Node;       // ok: Refers to Node at global scope
                friend struct ::Glob;    // error: Glob is not declared
                                         // cannot introduce a qualified type
                friend struct Glob;      // ok: Declares Glob in global scope
                /* ... */
        };
        struct Base {
                struct Data;                  // ok: Declares nested Data
                struct ::Data* thatData;      // ok: Refers to ::Data
                struct Base::Data* thisData;  // ok: Refers to nested Data
                friend class ::Data;          // ok: global Data is a friend
                friend class Data;            // ok: nested Data is a friend
                struct Data { /* ... */ };    // Defines nested Data
                struct Data;                  // ok: Redeclares nested Data
        };
        struct Data;               // ok: Redeclares Data at global scope
        struct ::Data;             // error: cannot introduce a qualified type
        struct Base::Data;         // error: cannot introduce a qualified type
        struct Base::Datum;        // error: Datum undefined
        struct Base::Data* pBase;  // ok: refers to nested Data
--end example]

3.4.4 Class member access [basic.lookup.classref]

1 If the id-expression in a class member access (_expr.ref_) is an unqualified-id, and the type of the object expression is of a class type C (or of pointer to a class type C), the unqualified-id is looked up in the scope of class C. If the type of the object expression is of pointer to scalar type, the unqualified-id is looked up in the scope of the object expression.
2 If the unqualified-id is ~type-name, and the type of the object expression is of a class type C (or of pointer to a class type C), the type-name is looked up in the context of the entire postfix-expression and in the scope of class C. The type-name shall refer to a class- name. If type-name is found in both contexts, the name shall refer to the same class type. If the type of the object expression is of scalar type, the type-name is looked up in the scope of the object expression (_expr.pseudo_). 3 If the id-expression in a class member access is a qualified-id of the form class-name-or-namespace-name::... the class-name-or-namespace-name following the . or -> operator is looked up both in the context of the entire postfix-expression and in the scope of the class of the object expression. If the name is found only in the scope of the class of the object expression, the name shall refer to a class-name. If the name is found only in the context of the entire postfix-expression, the name shall refer to a class-name or namespace-name. If the name is found in both contexts, the class- name-or-namespace-name shall refer to the same entity. [Note: because the name of a class is inserted in its class scope (_class_), the name of a class is also considered a nested member of that class. ] 4 If the qualified-id has the form ::class-name-or-namespace-name::... the class-name-or-namespace-name is looked up in global scope as a class-name or namespace-name. 5 If the nested-name-specifier contains a class template-id (_temp.names_), its template-arguments are evaluated in the context in which the entire postfix-expression occurs. 6 If the id-expression is a conversion-function-id, its conversion-type- id shall denote the same type in both the context in which the entire postfix-expression occurs and in the context of the class of the object expression (or the class pointed to by the pointer expression). 
3.4.5 Using directives and namespace aliases [basic.lookup.udir] 1 When looking up a namespace-name in a using-directive or namespace- alias-definition, only namespace names are considered. 3.5 Program and linkage [basic.link] 1 A program consists of one or more translation units (_lex_) linked together. A translation unit consists of a sequence of declarations. translation unit: declaration-seqopt 2 A name is said to have linkage when it might denote the same object, reference, function, type, template, namespace or value as a name introduced by a declaration in another scope: --When a name has external linkage, the entity it denotes can be referred to by names from scopes of other translation units or from other scopes of the same translation unit. --When a name has internal linkage, the entity it denotes can be referred to by names from other scopes in the same translation unit. --When a name has no linkage, the entity it denotes cannot be referred to by names from other scopes. 3 A name having namespace scope (_basic.scope.namespace_) has internal linkage if it is the name of --an object that is explicitly declared static or, is explicitly declared const and neither explicitly declared extern nor previously declared to have external linkage; or --a function that is explicitly declared static or, is explicitly declared inline and neither explicitly declared extern nor previ- ously declared to have external linkage; or --a function template that is explicitly declared static or, is explicitly declared inline; or --the name of a data member of an anonymous union. 
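The rules of paragraph 3 for objects and functions can be sketched as follows. This is an informal illustration only; the names call_count, limit, shared_limit and bump are hypothetical. Linkage cannot be observed from within a single translation unit, so the comments state what the rules above imply and the code shows only that the declarations are usable where they are defined.

```cpp
#include <cassert>

static int call_count = 0;           // explicitly static: internal linkage
const int limit = 3;                 // const and not extern: internal linkage
extern const int shared_limit = 5;   // explicitly extern: external linkage

// Explicitly static function: internal linkage, so another translation
// unit may define its own, unrelated bump().
static int bump() {
    return ++call_count;
}
```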
4 A name having namespace scope has external linkage if it is the name of --an object, unless it has internal linkage; or --a function, unless it has internal linkage; or --a named class (_class_), or an unnamed class defined in a typedef declaration in which the class has the typedef name for linkage pur- poses (_dcl.typedef_); or --a named enumeration (_dcl.enum_), or an unnamed enumeration defined in a typedef declaration in which the enumeration has the typedef name for linkage purposes (_dcl.typedef_); or --an enumerator belonging to an enumeration with external linkage; or --a template, unless it is a function template that has internal link- age (_temp_); or --a namespace (_basic.namespace_), unless it is declared within an unnamed namespace. 5 In addition, a name of class scope has external linkage if the name of the class has external linkage. 6 The name of a function declared in a block scope has linkage. If the block scope function declaration matches a prior visible declaration of the same function, the function name receives the linkage of the previous declaration; otherwise, it receives external linkage. The name of an object declared by a block scope extern declaration has linkage. If the block scope declaration matches a prior visible dec- laration of the same object, the name introduced by the block scope declaration receives the linkage of the previous declaration; other- wise, it receives external linkage. [Example: static void f(); static int i = 0; void g() { extern void f(); // internal linkage int i; // 'i' has no linkage { extern void f(); // internal linkage extern int i; // external linkage } } --end example] 7 Names not covered by these rules have no linkage. Moreover, except as noted, a name declared in a local scope (_basic.scope.local_) has no linkage. A name with no linkage (notably, the name of a class or enu- meration declared in a local scope (_basic.scope.local_)) shall not be used to declare an entity with linkage. 
If a declaration uses a type- def name, it is the linkage of the type name to which the typedef refers that is considered. [Example: void f() { struct A { int x; }; // no linkage extern A a; // ill-formed typedef A B; extern B b; // ill-formed } --end example] This implies that names with no linkage cannot be used as template arguments (_temp.arg_). 8 Two names that are the same (_basic_) and that are declared in differ- ent scopes shall denote the same object, reference, function, type, enumerator, template or namespace if --both names have external linkage or else both names have internal linkage and are declared in the same translation unit; and --both names refer to members of the same namespace or to members, not by inheritance, of the same class; and --when both names denote functions, the function types are identical for purposes of overloading; and --when both names denote function templates, the signatures (_temp.over.link_) are the same. 9 After all adjustments of types (during which typedefs (_dcl.typedef_) are replaced by their definitions), the types specified by all decla- rations of a particular external name shall be identical, except that declarations for an array object can specify array types that differ by the presence or absence of a major array bound (_dcl.array_), and declarations for functions with the same name can specify different numbers and types of parameters (_dcl.fct_). A violation of this rule on type identity does not require a diagnostic. 10[Note: linkage to non-C++ declarations can be achieved using a link- age-specification (_dcl.link_). ] 3.6 Start and termination [basic.start] 3.6.1 Main function [basic.start.main] 1 A program shall contain a global function called main, which is the designated start of the program. It is implementation-defined whether a program in a freestanding environment is required to define a main function. 
[Note: in a freestanding environment, start-up and termination are implementation-defined; start-up contains the execution of constructors for objects of namespace scope with static storage duration; termination contains the execution of destructors for objects with static storage duration. ]
2 An implementation shall not predefine the main function. This function shall not be overloaded. It shall have a return type of type int, but otherwise its type is implementation-defined. All implementations shall allow both of the following definitions of main:
    int main() { /* ... */ }
and
    int main(int argc, char* argv[]) { /* ... */ }
In the latter form argc shall be the number of arguments passed to the program from the environment in which the program is run. If argc is nonzero these arguments shall be supplied in argv[0] through argv[argc-1] as pointers to the initial characters of null-terminated multibyte strings (NTMBSs) and argv[0] shall be the pointer to the initial character of a NTMBS that represents the name used to invoke the program or "". The value of argc shall be nonnegative. The value of argv[argc] shall be 0. [Note: it is recommended that any further (optional) parameters be added after argv. ]
3 The function main shall not be called from within a program. The linkage (_basic.link_) of main is implementation-defined. A program that takes the address of main, or declares it inline or static is ill-formed. The name main is not otherwise reserved. [Example: member functions, classes, and enumerations can be called main, as can entities in other namespaces. ]
4 Calling the function
    void exit(int);
declared in <cstdlib> (_lib.support.start.term_) terminates the program without leaving the current block and hence without destroying any objects with automatic storage duration (_class.dtor_). If exit is called to end a program during the destruction of an object with static storage duration, the program has undefined behavior.
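The layout of argv described in paragraph 2 can be sketched informally as follows. The helper names count_arguments and argv_is_consistent are hypothetical; they rely only on the stated guarantees that argc is nonnegative and that argv[argc] is 0.

```cpp
#include <cassert>

// Hypothetical helper: counts the entries of argv by walking to the
// terminating null pointer, which paragraph 2 places at argv[argc].
int count_arguments(char* argv[]) {
    int n = 0;
    while (argv[n] != 0)
        ++n;
    return n;
}

// Hypothetical helper: checks that argc agrees with the null-terminated array.
bool argv_is_consistent(int argc, char* argv[]) {
    return argc >= 0 && count_arguments(argv) == argc;
}
```

A definition of main of the second form could pass its own parameters to argv_is_consistent.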
5 A return statement in main has the effect of leaving the main function (destroying any objects with automatic storage duration) and calling exit with the return value as the argument. If control reaches the end of main without encountering a return statement, the effect is that of executing return 0; 3.6.2 Initialization of non-local objects [basic.start.init] 1 The storage for objects with static storage duration (_basic.stc.static_) shall be zero-initialized (_dcl.init_) before any other initialization takes place. Objects of POD types (_basic.types_) with static storage duration initialized with constant expressions (_expr.const_) shall be initialized before any dynamic initialization takes place. Objects of namespace scope with static storage duration defined in the same translation unit and dynamically initialized shall be initialized in the order in which their defini- tion appears in the translation unit. [Note: _dcl.init.aggr_ describes the order in which aggregate members are initialized. The initialization of local static objects is described in _stmt.dcl_. ] 2 An implementation is permitted to perform the initialization of an object of namespace scope with static storage duration as a static initialization even if such initialization is not required to be done statically, provided that --the dynamic version of the initialization does not change the value of any other object of namespace scope with static storage duration prior to its initialization, and --the static version of the initialization produces the same value in the initialized object as would be produced by the dynamic initial- ization if all objects not required to be initialized statically were initialized dynamically. 
[Note: as a consequence, if the initialization of an object obj1 refers to an object obj2 of namespace scope with static storage duration potentially requiring dynamic initialization and defined later in the same translation unit, it is unspecified whether the value of obj2 used will be the value of the fully initialized obj2 (because obj2 was statically initialized) or will be the value of obj2 merely zero-initialized. For example,
    inline double fd() { return 1.0; }
    extern double d1;
    double d2 = d1;    // unspecified:
                       // may be statically initialized to 0.0 or
                       // dynamically initialized to 1.0
    double d1 = fd();  // may be initialized statically to 1.0
--end note]
3 It is implementation-defined whether the dynamic initialization (_dcl.init_, _class.static_, _class.ctor_, _class.expl.init_) of an object of namespace scope with static storage duration is done before the first statement of main or deferred to any point in time after the first statement of main but before the first use of a function or object defined in the same translation unit. [Example:
    // -- File 1 --
    #include "a.h"
    #include "b.h"
    B b;
    A::A(){
        b.Use();
    }
    // -- File 2 --
    #include "a.h"
    A a;
    // -- File 3 --
    #include "a.h"
    #include "b.h"
    extern A a;
    extern B b;
    int main() {
        a.Use();
        b.Use();
    }
It is implementation-defined whether a is defined before main is entered or whether its definition is delayed until a is first used in main. It is implementation-defined whether b is defined before main is entered or whether its definition is delayed until b is first used in main. In particular, if a is defined before main is entered, it is not guaranteed that b will be initialized before it is used by the initialization of a, that is, before A::A is called. ]
4 If construction or destruction of a non-local static object ends in throwing an uncaught exception, the result is to call terminate (_lib.terminate_).
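The ordering rule of paragraph 1 (dynamically initialized objects of namespace scope in one translation unit are initialized in the order of their definitions) can be sketched informally as follows. The namespace sketch, the class Tracer and the function order_log are hypothetical. The log itself is a local static object so that it is initialized on first use, independently of the objects that write to it.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Local static object: initialized the first time control passes through
// its declaration, so it exists before any Tracer appends to it.
std::string& order_log() {
    static std::string log;
    return log;
}

// Each Tracer records one character when it is constructed.
struct Tracer {
    explicit Tracer(char c) { order_log() += c; }
};

// All three require dynamic initialization and are defined in the same
// translation unit, so they are initialized in this order.
Tracer first('a');
Tracer second('b');
Tracer third('c');

} // namespace sketch
```

No such ordering is implied for objects defined in different translation units, which is the situation described in the example of paragraph 3.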
3.6.3 Termination [basic.start.term]
1 Destructors (_class.dtor_) for initialized objects of static storage duration (declared at block scope or at namespace scope) are called when returning from main and when calling exit (_lib.support.start.term_). These objects are destroyed in the reverse order of the completion of their constructors. For an object of array or class type, all subobjects of that object are destroyed before any local object with static storage duration initialized during the construction of the subobjects is destroyed.
2 If a function contains a local object of static storage duration that has been destroyed and the function is called during the destruction of an object with static storage duration, the program has undefined behavior if the flow of control passes through the definition of the previously destroyed local object.
3 If a function is registered with atexit (see <cstdlib>, _lib.support.start.term_) then following the call to exit, any objects with static storage duration initialized prior to the registration of that function will not be destroyed until the registered function is called from the termination process and has completed. For an object with static storage duration constructed after a function is registered with atexit, then following the call to exit, the registered function is not called until the execution of the object's destructor has completed.
4 Where a C++ implementation coexists with a C implementation, any actions specified by the C implementation to take place after the atexit functions have been called take place after all destructors have been called.
5 Calling the function
    void abort();
declared in <cstdlib> terminates the program without executing destructors for objects of automatic or static storage duration and without calling the functions passed to atexit().
3.7 Storage duration [basic.stc]
1 Storage duration is the property of an object that defines the minimum potential lifetime of the storage containing the object.
The storage duration is determined by the construct used to create the object and is one of the following: --static storage duration --automatic storage duration --dynamic storage duration 2 Static and automatic storage durations are associated with objects introduced by declarations (_basic.def_). The dynamic storage dura- tion is associated with objects created with operator new (_expr.new_). 3 The storage class specifiers static and auto are related to storage duration as described below. 4 References (_dcl.ref_) might or might not require storage; however, the storage duration categories apply to references as well. 3.7.1 Static storage duration [basic.stc.static] 1 All non-local objects have static storage duration. The storage for these objects shall last for the duration of the program (_basic.start.init_, _basic.start.term_). 2 If an object of static storage duration has initialization or a destructor with side effects, it shall not be eliminated even if it appears to be unused. 3 The keyword static can be used to declare a local variable with static storage duration. [Note: _stmt.dcl_ describes the initialization of local static variables; _basic.start.term_ describes the destruction of local static variables. ] 4 The keyword static applied to a class data member in a class defini- tion gives the data member static storage duration. 3.7.2 Automatic storage duration [basic.stc.auto] 1 Local objects explicitly declared auto or register or not explicitly declared static or extern have automatic storage duration. The stor- age for these objects lasts until the block in which they are created exits. 2 [Note: these objects are initialized and destroyed as described _stmt.dcl_. ] 3 If a named automatic object has initialization or a destructor with side effects, it shall not be destroyed before the end of its block, nor shall it be eliminated as an optimization even if it appears to be unused. 
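The difference between a local object of static storage duration (_basic.stc.static_, paragraph 3) and one of automatic storage duration (_basic.stc.auto_, paragraph 1) can be sketched informally as follows. The function names next_id and fresh_count are hypothetical.

```cpp
#include <cassert>

// The local static object is initialized once and its storage lasts for
// the duration of the program, so its value persists across calls.
int next_id() {
    static int id = 0;
    return ++id;
}

// The automatic object is created anew on each entry to the block and its
// storage lasts only until the block exits.
int fresh_count() {
    int count = 0;
    return ++count;
}
```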
3.7.3 Dynamic storage duration [basic.stc.dynamic]
1 Objects can be created dynamically during program execution (_intro.execution_), using new-expressions (_expr.new_), and destroyed using delete-expressions (_expr.delete_). A C++ implementation provides access to, and management of, dynamic storage via the global allocation functions operator new and operator new[] and the global deallocation functions operator delete and operator delete[].
2 The global allocation and deallocation functions are always implicitly declared. The library provides default definitions for them (_lib.new.delete_). A C++ program shall provide at most one definition of any of the functions
    ::operator new(size_t)
    ::operator new(size_t, void*)
    ::operator new(size_t, const std::nothrow&)
    ::operator new[](size_t)
    ::operator new[](size_t, void*)
    ::operator new[](size_t, const std::nothrow&)
    ::operator delete(void*)
    ::operator delete(void*, void*)
    ::operator delete(void*, const std::nothrow&)
    ::operator delete[](void*)
    ::operator delete[](void*, void*)
    ::operator delete[](void*, const std::nothrow&)
Any such function definitions replace the default versions. This replacement is global and takes effect upon program startup (_basic.start_). Allocation and/or deallocation functions can also be declared and defined for any class (_class.free_).
3 Any allocation and/or deallocation functions defined in a C++ program shall conform to the semantics specified in this subclause.
3.7.3.1 Allocation functions [basic.stc.dynamic.allocation]
1 Allocation functions shall be class member functions or global functions; a program is ill-formed if allocation functions are declared in a namespace scope other than global scope or declared static in global scope. They can be overloaded, but the return type shall always be void* and the first parameter type shall always be size_t (_expr.sizeof_), an implementation-defined integral type defined in the standard header <cstddef> (_lib.language.support_).
For these functions, parameters other than the first can have associated default arguments (_dcl.fct.default_). 2 The function shall return the address of a block of available storage at least as large as the requested size. The order, contiguity, and initial value of storage allocated by successive calls to an alloca- tion function is unspecified. The pointer returned is suitably aligned so that it can be assigned to a pointer of any type and then used to access such an object or an array of such objects in the stor- age allocated (until the storage is explicitly deallocated by a call to a corresponding deallocation function). The pointer returned points to the start (lowest byte address) of the allocated storage. If the size of the space requested is zero, the value returned shall not be a null pointer value (_conv.ptr_) and shall not point to or within any other currently allocated storage. The results of derefer- encing a pointer returned as a request for zero size are undefined.29) 3 If an allocation function is unable to obtain an appropriate block of storage, it can invoke the currently installed new_handler30) and/or throw an exception (_except_) of class bad_alloc (_lib.bad.alloc_) or a class derived from bad_alloc. 4 If the allocation function returns the null pointer the result is implementation-defined. 3.7.3.2 Deallocation functions [basic.stc.dynamic.deallocation] 1 Deallocation functions shall be class member functions or global func- tions; a program is ill-formed if deallocation functions are declared in a namespace scope other than global scope or declared static in global scope. 2 Each deallocation function shall return void and its first parameter shall be void*. For class member deallocation functions, a second parameter of type size_t (_lib.support.types_) may be added. 
If both versions are declared in the same class, the one-parameter form is the usual deallocation function and the two-parameter form is used for placement delete (_expr.new_). If the second version is declared but not the first, it is the usual deallocation function, not placement delete.
_________________________
29) The intent is to have operator new() implementable by calling malloc() or calloc(), so the rules are substantially the same. C++ differs from C in requiring a zero request to return a non-null pointer.
30) A program-supplied allocation function can obtain the address of the currently installed new_handler (_lib.new.handler_) using the set_new_handler() function (_lib.set.new.handler_).
3 The value of the first parameter supplied to a deallocation function shall be a null pointer value, or refer to storage allocated by the corresponding allocation function (even if that allocation function was called with a zero argument). If the value of the first argument is a null pointer value, the call to the deallocation function has no effect. If the value of the first argument refers to a pointer already deallocated, the effect is undefined.
4 If the argument given to a deallocation function is a pointer that is not the null pointer value (_conv.ptr_), the deallocation function will deallocate the storage referenced by the pointer and render the pointer invalid. The value of a pointer that refers to deallocated storage is indeterminate. The effect of using the value of a pointer to deallocated storage is undefined.31)
3.7.4 Duration of sub-objects [basic.stc.inherit]
1 The storage duration of member subobjects, base class subobjects and array elements is that of their complete object (_intro.object_).
3.8 Object Lifetime [basic.life]
1 The lifetime of an object is a runtime property of the object.
The lifetime of an object of type T begins when: --storage with the proper alignment and size for type T is obtained, and --if T is a class type with a non-trivial constructor (_class.ctor_), the constructor call has completed. The lifetime of an object of type T ends when: --if T is a class type with a non-trivial destructor (_class.dtor_), the destructor call starts, or --the storage which the object occupies is reused or released. 2 [Note: the lifetime of an array object or of an object of POD types (_basic.types_) starts as soon as storage with proper size and align- ment is obtained, and its lifetime ends when the storage which the array or object occupies is reused or released. _class.base.init_ describes the lifetime of base and member subobjects. ] _________________________ 31) On some implementations, it causes a system-generated runtime fault. 3 The properties ascribed to objects throughout this International Stan- dard apply for a given object only during its lifetime. [Note: in particular, before the lifetime of an object starts and after its lifetime ends there are significant restrictions on the use of the object, as described below, in _class.base.init_ and in _class.cdtor_. Also, the behavior of an object under construction and destruction might not be the same as the behavior of an object whose lifetime has started and not ended. _class.base.init_ and _class.cdtor_ describe the behavior of objects during the construction and destruction phases. ] 4 A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. 
For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (_expr.delete_) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior. 5 Before the lifetime of an object has started but after the storage which the object will occupy has been allocated32) or, after the life- time of an object has ended and before the storage which the object occupied is reused or released, any pointer that refers to the storage location where the object will be or was located may be used but only in limited ways. Such a pointer refers to allocated storage (_basic.stc.dynamic.deallocation_), and using the pointer as if the pointer were of type void*, is well-defined. Such a pointer may be dereferenced (to initialize a reference, for example) but converting the resulting lvalue to an rvalue (_conv.lval_) results in undefined behavior. If the object will be or was of a class type with a non- trivial destructor, and the pointer is used as the operand of a delete-expression, the program has undefined behavior. If the object will be or was of a non-POD class type, the program has undefined behavior if: --the pointer is used to access a non-static data member or call a non-static member function of the object, or --the pointer is implicitly converted (_conv.ptr_) to a pointer to a base class type, or --the pointer is used as the operand of a static_cast (_expr.static.cast_) (except when the conversion is to void* or char*) --the pointer is used as the operand of a dynamic_cast (_expr.dynamic.cast_). [Example: _________________________ 32) For example, before the construction of a global object of non-POD class type (_class.cdtor_). 
struct B { virtual void f(); void mutate(); virtual ~B(); }; struct D1 : B { void f(); }; struct D2 : B { void f(); }; void B::mutate() { new (this) D2; // reuses storage - ends the lifetime of '*this' f(); // undefined behavior ... = this; // ok, 'this' points to valid memory } void g() { void* p = malloc(sizeof(D1) + sizeof(D2)); B* pb = new (p) D1; pb->mutate(); &pb; // ok: pb points to valid memory void* q = pb; // ok: pb points to valid memory pb->f(); // undefined behavior, lifetime of *pb has ended } --end example] 6 Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any reference to the original object may be used but only in limited ways. Such a reference refers to allocated storage (_basic.stc.dynamic.deallocation_), and using the reference as an lvalue (to initialize another reference, for example) is well-defined. If an lvalue-to-rvalue conversion (_conv.lval_) is applied to such a reference, the program has undefined behavior; if the original object will be or was of a non-POD class type, the pro- gram has undefined behavior if: --the reference is used to access a non-static data member or call a non-static member function of the object, or --the reference is used as the operand of a static_cast (_expr.static.cast_) (except when the conversion is to char&), or --the reference is used as the operand of a dynamic_cast (_expr.dynamic.cast_) or as the operand of typeid. 
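The beginning and end of an object's lifetime, as described in paragraphs 1 and 4 of this subclause, can be sketched informally as follows. The class Counted and the function lifetime_demo are hypothetical. Storage is obtained and released with the global allocation and deallocation functions, so the object's lifetime is visibly distinct from the duration of its storage.

```cpp
#include <cassert>
#include <new>

// Hypothetical class with a non-trivial constructor and destructor that
// counts the objects whose lifetime has started and not yet ended.
struct Counted {
    static int live;
    Counted()  { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

int lifetime_demo() {
    void* storage = ::operator new(sizeof(Counted)); // storage only, no object yet
    Counted* p = new (storage) Counted;  // lifetime begins once the constructor completes
    int during = Counted::live;          // one live object
    p->~Counted();                       // lifetime ends when the destructor call starts
    int after = Counted::live;           // no live object, storage still allocated
    ::operator delete(storage);          // storage released
    return during * 10 + after;
}
```

After the explicit destructor call, p still refers to allocated storage and may be used only in the limited ways listed in paragraph 5.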
7 If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is cre- ated at the storage location which the original object occupied, a pointer that pointed to the original object or, a reference that referred to the original object or, the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if: --the storage for the new object exactly overlays the storage location which the original object occupied, and --the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and --the original object was a most derived object (_intro.object_) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects). [Example: struct C { int i; void f(); const C& operator=( const C& ); }; const C& C::operator=( const C& other) { if ( this != &other ) { this->~C(); // lifetime of '*this' ends new (this) C(other); // new object of type C created f(); // well-defined } return *this; } C c1; C c2; c1 = c2; // well-defined c1.f(); // well-defined; c1 refers to a new object of type C --end example] 8 If a program ends the lifetime of an object of type T with static (_basic.stc.static_) or automatic (_basic.stc.auto_) storage duration and if T has a non-trivial destructor,33) the program must ensure that an object of the original type occupies that same storage location when the implicit destructor call takes place; otherwise the behavior of the program is undefined. This is true even if the block is exited with an exception. 
[Example:
    class T { };
    struct B {
        ~B();
    };
    void h() {
        B b;
        new (&b) T;
    }   // undefined behavior at block exit
--end example]
_________________________
33) that is, an object for which a destructor will be called implicitly, either upon exit from the block for an object with automatic storage duration or upon exit from the program for an object with static storage duration.
9 Creating a new object at the storage location that a const object with static or automatic storage duration occupies or, at the storage location that such a const object used to occupy before its lifetime ended results in undefined behavior. [Example:
    struct B {
        B();
        ~B();
    };
    const B b;
    void h() {
        b.~B();
        new (&b) const B;   // undefined behavior
    }
--end example]
3.9 Types [basic.types]
1 [Note: these clauses impose requirements on implementations regarding the representation of types. There are two kinds of types: fundamental types and compound types. Types describe objects (_intro.object_), references (_dcl.ref_), or functions (_dcl.fct_). ]
2 For any object type T, whether or not the object holds a valid value of type T, the underlying bytes (_intro.memory_) making up the object can be copied into an array of char or unsigned char.34) If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value. [Example:
    #define N sizeof(T)
    char buf[N];
    T obj;                  // obj initialized to its original value
    memcpy(buf, &obj, N);   // between these two calls to memcpy,
                            // obj might be modified
    memcpy(&obj, buf, N);   // at this point, each subobject of obj of scalar type
                            // holds its original value
--end example]
3 For any POD type T, if two pointers to T point to distinct T objects obj1 and obj2, if the value of obj1 is copied into obj2, using the memcpy library function, obj2 shall subsequently hold the same value as obj1. [Example:
    T* t1p;
    T* t2p;
    // provided that t2p points to an initialized object ...
memcpy(t1p, t2p, sizeof(T)); // at this point, every subobject of scalar type in *t1p // contains the same value as the corresponding subobject in // *t2p --end example] _________________________ 34) By using, for example, the library functions (_lib.headers_) mem- cpy or memmove. 4 The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T). The value representation of an object is the sequence of bits that hold the value of type T. For POD types, the value repre- sentation is a sequence of bits in the object representation that determines a value, which is one discrete element of an implementa- tion-defined set of values.35) 5 Object types have alignment requirements (_basic.fundamental_, _basic.compound_). The alignment of an object type is an implementa- tion-defined integer value representing a number of bytes; an object is allocated at an address that meets the alignment requirements of its object type. 6 A class that has been declared but not defined or, an array of unknown size or of incomplete element type is an incomplete type.36) Also, the void type is an incomplete type (_basic.fundamental_). Objects shall not be defined to have an incomplete type. The term incompletely- defined object type is a synonym for incomplete type; the term com- pletely-defined object type is a synonym for complete type. 7 A class type (such as "class X") might be incomplete at one point in a translation unit and complete later on; the type "class X" is the same type at both points. The declared type of an array object might be an array of incomplete class type and therefore incomplete; if the class type is completed later on in the translation unit, the array type becomes complete; the array type at those two points is the same type. 
The declared type of an array object might be an array of unknown size and therefore be incomplete at one point in a translation unit and complete later on; the array types at those two points ("array of unknown bound of T" and "array of N T") are different types. The type of a pointer to array of unknown size, or of a type defined by a typedef declaration to be an array of unknown size, cannot be completed. [Example:
    class X;               // X is an incomplete type
    extern X* xp;          // xp is a pointer to an incomplete type
    extern int arr[];      // the type of arr is incomplete
    typedef int UNKA[];    // UNKA is an incomplete type
    UNKA* arrp;            // arrp is a pointer to an incomplete type
    UNKA** arrpp;
    void foo() {
        xp++;              // ill-formed: X is incomplete
        arrp++;            // ill-formed: incomplete type
        arrpp++;           // okay: sizeof UNKA* is known
    }
    struct X { int i; };   // now X is a complete type
    int arr[10];           // now the type of arr is complete
    X x;
    void bar() {
        xp = &x;           // okay; type is "pointer to X"
        arrp = &arr;       // ill-formed: different types
        xp++;              // okay: X is complete
        arrp++;            // ill-formed: UNKA can't be completed
    }
--end example]
_________________________
35) The intent is that the memory model of C++ is compatible with that of ISO/IEC 9899 Programming Language C.
36) The size and layout of an instance of an incomplete type is unknown.
8 [Note: the rules for declarations and expressions describe in which contexts incomplete types are prohibited. ]
9 Arithmetic types (_basic.fundamental_), enumeration types, pointer types, and pointer to member types (_basic.compound_), and cv-qualified versions of these types (_basic.type.qualifier_) are collectively called scalar types. Scalar types, POD class types, POD union types (_class_), arrays of such types and cv-qualified versions of these types (_basic.type.qualifier_) are collectively called POD types.
10 If two types T1 and T2 are the same type, then T1 and T2 are layout-compatible types.
[Note: Layout-compatible enumerations are described in _dcl.enum_. Layout-compatible POD-structs and POD-unions are described in _class.mem_. ]

3.9.1 Fundamental types [basic.fundamental]

1 Objects declared as characters (char) shall be large enough to store any member of the implementation's basic character set. If a character from this set is stored in a character object, its value shall be equivalent to the integer code of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types. A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (_basic.types_); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

2 There are four signed integer types: "signed char", "short int", "int", and "long int." In this list, each type provides at least as much storage as those preceding it in the list. Plain ints have the natural size suggested by the architecture of the execution environment37); the other signed integer types are provided to meet special needs.
_________________________
37) that is, large enough to contain any value in the range of INT_MIN and INT_MAX, as defined in the header <climits>.
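
A minimal, non-normative sketch of how a program might check a few of the properties stated in paragraphs 1 and 2 above. The names same_type, char_types_are_distinct, char_types_share_size, storage_is_ordered, and plain_char_is_signed are hypothetical helpers introduced only for this illustration; they are not part of the library described by this International Standard.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: reports at compile time whether two types are identical.
template <class T, class U> struct same_type       { static const bool value = false; };
template <class T>          struct same_type<T, T> { static const bool value = true;  };

// Plain char, signed char, and unsigned char are three distinct types ...
inline bool char_types_are_distinct()
{
    return !same_type<char, signed char>::value
        && !same_type<char, unsigned char>::value
        && !same_type<signed char, unsigned char>::value;
}

// ... that nevertheless occupy the same amount of storage.
inline bool char_types_share_size()
{
    return sizeof(char) == sizeof(signed char)
        && sizeof(char) == sizeof(unsigned char);
}

// Each signed integer type provides at least as much storage as
// those preceding it in the list of paragraph 2.
inline bool storage_is_ordered()
{
    return sizeof(signed char) <= sizeof(short int)
        && sizeof(short int)   <= sizeof(int)
        && sizeof(int)         <= sizeof(long int);
}

// Whether plain char can hold negative values is implementation-defined;
// CHAR_MIN from <climits> reveals the choice made by the implementation.
inline bool plain_char_is_signed()
{
    return CHAR_MIN < 0;
}

// Convenience wrapper that asserts the portable properties together.
inline void check_character_and_integer_properties()
{
    assert(char_types_are_distinct());
    assert(char_types_share_size());
    assert(storage_is_ordered());
}
```

Only plain_char_is_signed() may legitimately differ from one implementation to another; the other three checks are expected to hold everywhere.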
3 For each of the signed integer types, there exists a corresponding (but different) unsigned integer type: "unsigned char", "unsigned short int", "unsigned int", and "unsigned long int," each of which occupies the same amount of storage and has the same alignment requirements (_basic.types_) as the corresponding signed integer type38); that is, each signed integer type has the same object representation as its corresponding unsigned integer type. The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the value representation of each corresponding signed/unsigned type shall be the same.
_________________________
38) See _dcl.type.simple_ regarding the correspondence between types and the sequences of type-specifiers that designate them.

4 Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the representation of that particular size of integer.39)

5 Type wchar_t is a distinct type whose values can represent distinct codes for all members of the largest extended character set specified among the supported locales (_lib.locale_). Type wchar_t shall have the same size, signedness, and alignment requirements (_intro.memory_) as one of the other integral types, called its underlying type.

6 Values of type bool are either true or false.40) [Note: there are no signed, unsigned, short, or long bool types or values. ] As described below, bool values behave as integral types. Values of type bool participate in integral promotions (_conv.prom_).

7 Types bool, char, wchar_t, and the signed and unsigned integer types are collectively called integral types.41) A synonym for integral type is integer type. The representations of integral types shall define values by use of a pure binary numeration system.42) [Example: this
International Standard permits 2's complement, 1's complement and signed magnitude representations for integral types. ]
_________________________
39) This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.
40) Using a bool value in ways described by this International Standard as ``undefined,'' such as by examining the value of an uninitialized automatic variable, might cause it to behave as if it is neither true nor false.
41) Therefore, enumerations (_dcl.enum_) are not integral; however, enumerations can be promoted to int, unsigned int, long, or unsigned long, as specified in _conv.prom_.
42) A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral power of 2, except perhaps for the bit with the highest position. (Adapted from the American National Dictionary for Information Processing Systems.)

8 There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double. The value representation of floating-point types is implementation-defined. Integral and floating types are collectively called arithmetic types. Specializations of the standard template numeric_limits (_lib.support.limits_) shall specify the maximum and minimum values of each arithmetic type for an implementation.

9 The void type has an empty set of values. The void type is an incomplete type that cannot be completed.
It is used as the return type for functions that do not return a value. Any expression can be explicitly converted to type void (_expr.cast_); the resulting expression shall be used only as an expression statement (_stmt.expr_), as the left operand of a comma expression (_expr.comma_), or as a second or third operand of ?: (_expr.cond_).

10 [Note: even if the implementation defines two or more basic types to have the same value representation, they are nevertheless different types. ]

3.9.2 Compound types [basic.compound]

1 Compound types can be constructed from the fundamental types in the following ways:

--arrays of objects of a given type, _dcl.array_;

--functions, which have parameters of given types and return void or references or objects of a given type, _dcl.fct_;

--pointers to void or objects or functions (including static members of classes) of a given type, _dcl.ptr_;

--references to objects or functions of a given type, _dcl.ref_;

--constants, which are values of a given type, _dcl.type_;

--classes containing a sequence of objects of various types (_class_), a set of functions for manipulating these objects (_class.mfct_), and a set of restrictions on the access to these objects and functions, _class.access_;

--unions, which are classes capable of containing objects of different types at different times, _class.union_;

--enumerations, which comprise a set of named constant values. Each distinct enumeration constitutes a different enumerated type, _dcl.enum_;

--pointers to non-static43) class members, which identify members of a given type within objects of a given class, _dcl.mptr_.

2 These methods of constructing types can be applied recursively; restrictions are mentioned in _dcl.ptr_, _dcl.array_, _dcl.fct_, and _dcl.ref_.

3 A pointer to objects of type T is referred to as a "pointer to T." [Example: a pointer to an object of type int is referred to as "pointer to int" and a pointer to an object of class X is called a "pointer to X."
] Except for pointers to static members, text referring to "pointers" does not apply to pointers to members. Pointers to incomplete types are allowed although there are restrictions on what can be done with them (_basic.types_). The value representation of pointer types is implementation-defined. Pointers to cv-qualified and cv-unqualified versions (_basic.type.qualifier_) of layout-compatible types shall have the same value representation and alignment requirements (_basic.types_).

4 Objects of cv-qualified (_basic.type.qualifier_) or cv-unqualified type void* (pointer to void), can be used to point to objects of unknown type. A void* shall be able to hold any object pointer. A cv-qualified or cv-unqualified (_basic.type.qualifier_) void* shall have the same representation and alignment requirements as a cv-qualified or cv-unqualified char*.

3.9.3 CV-qualifiers [basic.type.qualifier]

1 A type mentioned in _basic.fundamental_ and _basic.compound_ is a cv-unqualified type. Each cv-unqualified fundamental type (_basic.fundamental_) has three corresponding cv-qualified versions of its type: a const-qualified version, a volatile-qualified version, and a const-volatile-qualified version. The term object type (_intro.object_) includes the cv-qualifiers specified when the object is created. The presence of a const specifier in a decl-specifier-seq declares an object of const-qualified object type; such object is called a const object. The presence of a volatile specifier in a decl-specifier-seq declares an object of volatile-qualified object type; such object is called a volatile object. The presence of both cv-qualifiers in a decl-specifier-seq declares an object of const-volatile-qualified object type; such object is called a const volatile object.
The cv-qualified or cv-unqualified versions of a type are distinct types; however, they shall have the same representation and alignment requirements (_basic.types_).44)
_________________________
43) Static class members are objects or functions, and pointers to them are ordinary pointers to objects or functions.
44) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.

2 A compound type (_basic.compound_) is not cv-qualified by the cv-qualifiers (if any) of the type from which it is compounded. Any cv-qualifiers that appear in an array declaration apply to the array element type, not the array type (_dcl.array_).

3 Each non-function, non-static, non-mutable member of a const-qualified class object is const-qualified, each non-function, non-static member of a volatile-qualified class object is volatile-qualified and similarly for members of a const-volatile class. See _dcl.fct_ and _class.this_ regarding cv-qualified function types.

4 There is a (partial) ordering on cv-qualifiers, so that a type can be said to be more cv-qualified than another. Table 6 shows the relations that constitute this ordering.

            Table 6--relations on const and volatile

              +------------------------------------+
              | no cv-qualifier  <  const          |
              | no cv-qualifier  <  volatile       |
              | no cv-qualifier  <  const volatile |
              | const            <  const volatile |
              | volatile         <  const volatile |
              +------------------------------------+

5 In this International Standard, the notation cv (or cv1, cv2, etc.), used in the description of types, represents an arbitrary set of cv-qualifiers, i.e., one of {const}, {volatile}, {const, volatile}, or the empty set. Cv-qualifiers applied to an array type attach to the underlying element type, so the notation "cv T," where T is an array type, refers to an array whose elements are so-qualified. Such array types can be said to be more (or less) cv-qualified than other types based on the cv-qualification of the underlying element types.
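
The following non-normative sketch illustrates several of the cv-qualification rules above: members of a const object are const-qualified unless declared mutable, const in an array declaration applies to the elements, and a pointer can be converted to point to a more cv-qualified type. The class Counter and the functions which, demo_const_object, demo_array_elements, and demo_more_qualified are hypothetical names chosen for this illustration only.

```cpp
#include <cassert>

// Hypothetical class used only for illustration.
struct Counter {
    int value;          // const-qualified when accessed through a const Counter
    mutable int reads;  // mutable: remains modifiable in a const Counter

    Counter() : value(0), reads(0) {}

    // Inside a const member function, *this is const-qualified.
    int get() const
    {
        ++reads;        // okay: reads is mutable
        // ++value;     // ill-formed: value is const-qualified here
        return value;
    }
};

// int and const int are distinct types, so overload resolution can
// tell an lvalue of one from an lvalue of the other.
inline int which(int&)       { return 0; }
inline int which(const int&) { return 1; }

inline int demo_const_object()
{
    const Counter c;    // a const object
    c.get();
    c.get();
    return c.reads;     // 2: the mutable member was modified twice
}

inline int demo_array_elements()
{
    const int a[3] = { 1, 2, 3 };  // const applies to the element type
    // a[0] = 5;                   // ill-formed: each element is a const int
    return which(a[0]);            // selects which(const int&), returns 1
}

inline int demo_more_qualified()
{
    int i = 7;
    int* p = &i;
    const int* q = p;   // okay: const int is more cv-qualified than int
    // int* r = q;      // ill-formed: would discard the const qualifier
    return *q;
}
```

The commented-out lines mark the operations that an implementation is expected to reject; removing the comment markers should produce a diagnostic.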
3.10 Lvalues and rvalues [basic.lval]

1 Every expression is either an lvalue or an rvalue.

2 An lvalue refers to an object or function. Some rvalue expressions--those of class or cv-qualified class type--also refer to objects.45)

3 [Note: some built-in operators and function calls yield lvalues. [Example: if E is an expression of pointer type, then *E is an lvalue expression referring to the object or function to which E points. As another example, the function
    int& f();
yields an lvalue, so the call f() is an lvalue expression. ] ]
_________________________
45) Expressions such as invocations of constructors and of functions that return a class type refer to objects, and the implementation can invoke a member function upon such objects, but the expressions are not lvalues.

4 [Note: some built-in operators expect lvalue operands. [Example: built-in assignment operators all expect their left hand operands to be lvalues. ] Other built-in operators yield rvalues, and some expect them. [Example: the unary and binary + operators expect rvalue arguments and yield rvalue results. ] The discussion of each built-in operator in clause _expr_ indicates whether it expects lvalue operands and whether it yields an lvalue. ]

5 Constructor invocations and calls to functions that do not return references are always rvalues. User defined operators are functions, and whether such operators expect or yield lvalues is determined by their type.

6 Whenever an lvalue appears in a context where an rvalue is expected, the lvalue is converted to an rvalue; see _conv.lval_, _conv.array_, and _conv.func_.

7 The discussion of reference initialization in _dcl.init.ref_ and of temporaries in _class.temporary_ indicates the behavior of lvalues and rvalues in other significant contexts.

8 Class rvalues can have cv-qualified types; non-class rvalues always have cv-unqualified types.
Rvalues shall always have complete types or the void type; in addition to these types, lvalues can also have incomplete types.

9 An lvalue for an object is necessary in order to modify the object except that an rvalue of class type can also be used to modify its referent under certain circumstances. [Example: a member function called for an object (_class.mfct_) can modify the object. ]

10 Functions cannot be modified, but pointers to functions can be modifiable.

11 A pointer to an incomplete type can be modifiable. At some point in the program when the pointed to type is complete, the object at which the pointer points can also be modified.

12 The referent of a const-qualified expression shall not be modified (through that expression), except that if it is of class type and has a mutable component, that component can be modified (_dcl.type.cv_).

13 If an expression can be used to modify the object to which it refers, the expression is called modifiable. A program that attempts to modify an object through a nonmodifiable lvalue or rvalue expression is ill-formed.

14 If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined46):

--the dynamic type of the object,

--a cv-qualified version of the dynamic type of the object,

--a type that is the signed or unsigned type corresponding to the dynamic type of the object,

--a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

--an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),

--a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

--a char or unsigned char type.
_________________________
46) The intent of this list is to specify those circumstances in which an object may or may not be aliased.
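
As a non-normative sketch of the access rules in paragraph 14 and of the lvalue-yielding function call described in paragraph 3, the functions below read an int through lvalues of two of the permitted types and assign through a call that returns a reference. The names differing_bytes, read_as_unsigned, and element are hypothetical, and the use of reinterpret_cast to obtain the aliasing lvalues is an assumption of this illustration rather than something prescribed by the text above.

```cpp
#include <cassert>
#include <cstddef>

// Counts the bytes of the object representation in which two ints differ.
// Both objects are read through lvalues of type unsigned char, which is
// one of the types through which any object may be accessed.
inline std::size_t differing_bytes(const int& a, const int& b)
{
    const unsigned char* pa = reinterpret_cast<const unsigned char*>(&a);
    const unsigned char* pb = reinterpret_cast<const unsigned char*>(&b);
    std::size_t count = 0;
    for (std::size_t i = 0; i < sizeof(int); ++i)
        if (pa[i] != pb[i])
            ++count;
    return count;
}

// Reads an int through the unsigned type corresponding to its dynamic
// type, which is also on the list of permitted access types.
inline unsigned int read_as_unsigned(const int& x)
{
    return *reinterpret_cast<const unsigned int*>(&x);
}

// A function returning a reference yields an lvalue, so a call to it
// can appear as the left operand of an assignment.
inline int& element(int* a, std::size_t i)
{
    return a[i];
}

// By contrast, reading an int object through an lvalue of type float is
// not on the list, so a function such as the following would have
// undefined behavior:
//
//     float bad(const int& x) { return *reinterpret_cast<const float*>(&x); }
```

For a nonnegative argument, read_as_unsigned is expected to return the same value, since the nonnegative range of a signed integer type is a subrange of the corresponding unsigned type with the same value representation.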
______________________________________________________________________

4 Standard conversions [conv]
______________________________________________________________________

1 Standard conversions are implicit conversions defined for built-in types. The full set of such conversions is enumerated in this clause. A standard conversion sequence is a sequence of standard conversions in the following order:

--Zero or one conversion from the following set: lvalue-to-rvalue conversion, array-to-pointer conversion, and function-to-pointer conversion.

--Zero or one conversion from the following set: integral promotions, floating point promotion, integral conversions, floating point conversions, floating-integral conversions, pointer conversions, pointer to member conversions, and boolean conversions.

--Zero or one qualification conversion.

[Note: a standard conversion sequence can be empty, i.e., it can consist of no conversions. ] A standard conversion sequence will be applied to an expression if necessary to convert it to a required destination type.

2 [Note: expressions with a given type will be implicitly converted to other types in several contexts:

--When used as operands of operators. The operator's requirements for its operands dictate the destination type. See _expr_.

--When used in the condition of an if statement or iteration statement (_stmt.select_, _stmt.iter_). The destination type is bool.

--When used in the expression of a switch statement. The destination type is integral (_stmt.select_).

--When used as the source expression for an initialization (which includes use as an argument in a function call and use as the expression in a return statement). The type of the entity being initialized is (generally) the destination type. See _dcl.init_, _dcl.init.ref_.

--end note]

3 An expression e can be implicitly converted to a type T if and only if the declaration "T t=e;" is well-formed, for some invented temporary variable t (_dcl.init_).
The effect of the implicit conversion is the same as performing the declaration and initialization and then using the temporary variable as the result of the conversion. The result is an lvalue if T is a reference type (_dcl.ref_), and an rvalue otherwise. The expression e is used as an lvalue if and only if the declaration uses it as an lvalue.

4 [Note: For user-defined types, user-defined conversions are considered as well; see _class.conv_. In general, an implicit conversion sequence (_over.best.ics_) consists of a standard conversion sequence followed by a user-defined conversion followed by another standard conversion sequence.

5 There are some contexts where certain conversions are suppressed. For example, the lvalue-to-rvalue conversion is not done on the operand of the unary & operator. Specific exceptions are given in the descriptions of those operators and contexts. ]

4.1 Lvalue-to-rvalue conversion [conv.lval]

1 An lvalue (_basic.lval_) of a non-function, non-array type T can be converted to an rvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the lvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior. If T is a non-class type, the type of the rvalue is the cv-unqualified version of T. Otherwise, the type of the rvalue is T.47) If the lvalue refers to a bit-field of type T, the resulting rvalue is not a bit-field.

2 The value contained in the object indicated by the lvalue is the rvalue result. When an lvalue-to-rvalue conversion occurs within the operand of sizeof (_expr.sizeof_) the value contained in the referenced object is not accessed, since that operator does not evaluate its operand.

3 [Note: See also _basic.lval_.
]

4.2 Array-to-pointer conversion [conv.array]

1 An lvalue or rvalue of type "array of N T" or "array of unknown bound of T" can be converted to an rvalue of type "pointer to T." The result is a pointer to the first element of the array.
_________________________
47) In C++ class rvalues can have cv-qualified types (because they are objects). This differs from ISO C, in which non-lvalues never have cv-qualified types.

4.3 Function-to-pointer conversion [conv.func]

1 An lvalue of function type T can be converted to an rvalue of type "pointer to T." The result is a pointer to the function.48)

2 [Note: See _over.over_ for additional rules for the case where the function is overloaded. ]

4.4 Qualification conversions [conv.qual]

1 An rvalue of type "pointer to cv1 T" can be converted to an rvalue of type "pointer to cv2 T" if "cv2 T" is more cv-qualified than "cv1 T."

2 An rvalue of type "pointer to member of X of type cv1 T" can be converted to an rvalue of type "pointer to member of X of type cv2 T" if "cv2 T" is more cv-qualified than "cv1 T."

3 A conversion can add cv-qualifiers at levels other than the first in multi-level pointers, subject to the following rules:49)

Two pointer types T1 and T2 are similar if there exists a type T and integer N>0 such that:
    T1 is cv1,0 pointer to cv1,1 pointer to ... cv1,n pointer to T
and
    T2 is cv2,0 pointer to cv2,1 pointer to ... cv2,n pointer to T
where each cvi,j is const, volatile, const volatile, or nothing. An expression of type T1 can be converted to type T2 if and only if the following conditions are satisfied:

--the pointer types are similar.

--for every j>0, if const is in cv1,j then const is in cv2,j, and similarly for volatile.

--if the cv1,j and cv2,j are different, then const is in every cv2,k for 0