Document Number: WG14 N633/X3J11 96-097

C9X Revision Proposal
=====================

Title: Add "inline" capability to C, DRAFT 2
Author: Leith (Casey) Leedom
Author Affiliation: Silicon Graphics, Inc.
Postal Address: Mail Stop: 8U-500
                2011 N. Shoreline Blvd.
                Mountain View, CA 94043-1389
E-mail Address: leedom@sgi.com
Telephone Number: +1 415 9331504
Fax Number: +1 415 9628404
Sponsor:
Date: 1996-11-26
Proposal Category:
   __ Editorial change/non-normative contribution
   __ Correction
   X_ New feature
   __ Addition to obsolescent feature list
   __ Addition to Future Directions
   __ Other (please specify)
Area of Standard Affected:
   __ Environment
   X_ Language
   __ Preprocessor
   __ Library
   __ Macro/typedef/tag name
   __ Function
   __ Header
   __ Other (please specify)
Prior Art: C++, GCC, various vendors including SGI, IBM, SUN, and DEC
Target Audience: Programmers interested in improving application performance by specifying that certain functions should be compiled ``inline.''
Related Documents (if any): DRAFT ISO/ANSI C++ Language Standard, http://www.cygnus.com/misc/wp/nov96; GCC, File: gcc.info, Node: Inline
Proposal Attached: X_ Yes __ No, but what's your interest?
Abstract: C++ has supported a function inlining facility since its inception. Several C compilers now offer a similar facility; the GNU C Compiler, GCC, being a notable, widely available example.

Abstract:
=========

The C++ language has included a function inlining capability since its inception. The current ISO/ANSI C++ Standard DRAFT dated November 15, 1996 covers the syntax and semantics of the "inline" keyword in sections 3.2 (One definition rule) and 7.1.2 (Function specifiers). These sections are attached in Appendix B of this proposal.

The GNU C Compiler, GCC, is an ISO 9899-1990 compliant compiler which supports several extensions to the ANSI C standard, including a function inlining capability very similar to the function inlining of C++.
The syntax and semantics of the "inline" keyword are covered in the Inline info node in the gcc.info file. This info node is attached in Appendix C of this proposal.

Both C++ and GCC specify that the use of the keyword "inline" in a function definition serves as a ``hint'' to the compiler that calls to such a function be replaced by the function body rather than made through the usual call mechanism. There are, however, several differences between the C++ and GCC "inline" definitions. The primary difference is the extensive facility in GCC to control when and where inline functions are allowed to be instantiated as separate, callable functions. See Appendix A for a complete description of the differences between the two definitions.

Proposal:
=========

This proposal advocates the inclusion into the ISO/ANSI C9X standard of a function inlining facility identical to that implemented by C++. C++'s model was chosen over GCC's because of the desire to maintain, wherever possible and reasonable, a high degree of compatibility between the C and C++ standards. Because GCC is a widely available compiler and some application code may already have adopted its inline model, this choice may lead to problems. However,

 1. it's highly likely that most such uses are guarded by preprocessor directives active only under GCC,
 2. the differences are minor enough not to introduce problems in a majority of cases, and
 3. the desire to maintain compatibility with the C++ standard is important enough to accept the remaining problems.

In this proposal:

 o References to Draft 5 of the C9X C Language Standard are of the form C9X-D5/section.number.
 o References to the 1996 November 15 ISO/ANSI C++ Standard DRAFT are of the form 1996-nov-C++/section.number.
 o References to the Inline info node from the file gcc.info for GCC release 2.6.3 are of the form GCC-2.6.3, or simply GCC.
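As a rough illustration of the ``hint'' semantics that both definitions share, the sketch below declares a small function inline and calls it from an ordinary function. The names clamp_int, scale_percent and self_check are hypothetical and do not come from this proposal or from either reference. "static" is used here only so that the single file builds on its own; the proposal itself leaves linkage unaffected by "inline".

```c
#include <assert.h>

/* Hypothetical helper: "inline" asks the compiler to prefer substituting
 * the body at each call site; the compiler is free to ignore the request. */
static inline int clamp_int(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}

/* Ordinary caller. Whether or not the call below is substituted inline,
 * the observable result is the same. */
int scale_percent(int raw)
{
    return clamp_int(raw, 0, 100);
}

/* Small self-check that exercises both clamping directions. */
void self_check(void)
{
    assert(scale_percent(150) == 100);
    assert(scale_percent(-5) == 0);
    assert(scale_percent(42) == 42);
}
```

Calling self_check() from a test driver exercises the function through the normal call syntax; nothing in the calling code changes when the hint is honored or ignored.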
Proposal details:
=================

A new keyword, "inline", is added to the set of declaration-specifiers allowable for function declarations and definitions (C9X-D5/6.7.1). The semantics of this new keyword are identical to those in 1996-nov-C++.

Note that the C++ Standard Committee has recently changed the traditional C++ semantics of the "inline" keyword. In previous versions of the C++ language and standard, "inline" has implied "internal linkage" as if the "static" storage class keyword had been specified. In 1996-nov-C++ this implied linkage semantic has been dropped and "inline" no longer has any effect on the linkage of a function (1996-nov-C++/7.1.2, footnote 2 in paragraph 4).

The relevant portions of 1996-nov-C++ are included here:

3.2 One definition rule

...

1 No translation unit shall contain more than one definition of any variable, function, [or] enumeration type, ...

3 Every program shall contain exactly one definition of every non-inline function or object that is used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, ... An inline function shall be defined in every translation unit in which it is used.

5 There can be more than one definition of a ... inline function with external linkage ... in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then

--each definition of D shall consist of the same sequence of tokens; ...

7.1.2 Function specifiers

...

1 Function-specifiers can be used only in function declarations.

function-specifier:
    inline
    ...

2 A function declaration ... with an inline specifier declares an inline function.
The inline specifier indicates to the implementation that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. An implementation is not required to perform this inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules for inline functions defined by this subclause shall still be respected.

3 ... The inline specifier shall not appear on a block scope function declaration. ... The inline keyword has no effect on the linkage of a function.

4 An inline function shall be defined in every translation unit in which it is used and shall have exactly the same definition in every case (see one definition rule, 3.2). [Note: a call to the inline function may be encountered before its definition appears in the translation unit.] If a function with external linkage is declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required. [Note: a static local variable in an extern inline function always refers to the same object.]

Note that the use above of the phrase ``the same sequence of tokens'' is somewhat ambiguous since 1996-nov-C++ doesn't seem to have an explicit definition of what the word ``token'' means in the context of the body of the standard. E.g. in C9X-D5 we have:

6.1 Lexical elements

...

[4] A token is the minimal lexical element of the language in translation phases 7 and 8. The categories of tokens are: keywords, identifiers, constants, string literals, operators, and punctuators. A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6. The categories of preprocessing token are: header names, identifiers, preprocessing numbers, character constants, string literals, operators, punctuators, and single non-white-space characters that do not lexically match the other preprocessing token categories. ...
However, the structure of the two documents is very similar, including nearly identical numbering and wording for the description of translation phases. Thus we will take the word ``token'' above to mean ``translation phase 7 token'', which is the meaning of the word in C9X-D5.

Unresolved issues:
==================

1. The above definition will lead to conflict with the ``one definition rule'' in C9X-D5/6.7, paragraph 6:

An external definition is an external declaration that is also a definition of a function or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof operator), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one. [83. Thus, if an identifier declared with external linkage is not used in an expression, there need be no external definition for it.]

2. 1996-nov-C++ is imprecise on whether multiple out-of-line instantiations of an ``extern inline'' function result in a single instantiation (i.e. if you take the address of such a function in multiple translation units, do you get the same address?). This issue has been raised with the 1996-nov-C++ committee.

3. 1996-nov-C++ would appear to result in multiple out-of-line copies of "static inline" functions (one per translation unit). This follows from the wording which states only that "inline" indicates that inline substitution is preferred. As a result it appears that a "static" out-of-line copy must be generated. Since these copies would have internal linkage, there's no way a linker could collapse such copies.

Appendix A: Differences between C++ and GCC "inline":
=====================================================

While the GCC-2.6.3 inlining facility was almost certainly modeled on the C++ inline capability, there are several differences.
The single largest difference between the GCC-2.6.3 and 1996-nov-C++ definitions has to do with GCC's extensive ``instantiation control'' capabilities for inline functions. With GCC an implementor can explicitly control whether an inline function should have unsubstitutable references result in [translation-unit] local copies of the function with internal linkage, local copies with external linkage, or non-local copies with external linkage.

For example, this could be used to cause external users of a library function, f(), to use the normal call mechanism while internal uses of f() could all be inlined. This would allow the library implementors to carefully control space, time, and binary compatibility issues of using inline functions (if user code were allowed to inline f(), then compiled user code could become binary incompatible with a new version of the library which used different internal data structures).

Much of this control is probably provided to get around the fact that the loader won't collapse multiple definitions of a function to a single instance as is done for uninitialized variables with external linkage.

GCC-2.6.3 also allows an inline function to be used before being defined. Uses prior to the inline definition will call an out-of-line instantiation of the function using the normal function call mechanism, while uses following the definition will inline it. It appears from 1996-nov-C++ that it is legal to call an inline function before it is defined in C++, but it isn't 100% clear (see issue #3 above).

Finally, GCC-2.6.3 explicitly states that if all calls to a ``static inline'' are integrated into the caller and the function's address is never taken, then no out-of-line code for the function will be emitted. 1996-nov-C++ merely states that "inline" is an indication that inline substitution should be preferred; thus it seems that an internal linkage out-of-line copy is mandated by the standard ...
(see issue #3 above)

Appendix B: relevant material from ISO/ANSI C++ Standard DRAFT September 24, 1996 [http://www.cygnus.com/misc/wp/sep96]
====================================================================

3.2 One definition rule [basic.def.odr]

1 No translation unit shall contain more than one definition of any variable, function, class type, enumeration type or template.

2 An expression is potentially evaluated unless either it is the operand of the sizeof operator (_expr.sizeof_), or it is the operand of the typeid operator and does not designate an lvalue of polymorphic class type (_expr.typeid_). An object or non-overloaded function is used if its name appears in a potentially-evaluated expression. A virtual member function is used if it is not pure. An overloaded function is used if it is selected by overload resolution when referred to from a potentially-evaluated expression. [Note: this covers calls to named functions (_expr.call_), operator overloading (_over_), user-defined conversions (_class.conv.fct_), allocation function for placement new (_expr.new_), as well as non-default initialization (_dcl.init_). A copy constructor is used even if the call is actually elided by the implementation. ] An allocation or deallocation function for a class is used by a new expression appearing in a potentially-evaluated expression as specified in _expr.new_ and _class.free_. A deallocation function for a class is used by a delete expression appearing in a potentially-evaluated expression as specified in _expr.delete_ and _class.free_. A copy-assignment function for a class is used by an implicitly-defined copy-assignment function for another class as specified in _class.copy_. A default constructor for a class is used by default initialization as specified in _dcl.init_. A constructor for a class is used as specified in _dcl.init_. A destructor for a class is used as specified in _class.dtor_.
3 Every program shall contain exactly one definition of every non-inline function or object that is used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see _class.ctor_, _class.dtor_ and _class.copy_). An inline function shall be defined in every translation unit in which it is used.

4 Exactly one definition of a class is required in a translation unit if the class is used in a way that requires the class type to be complete. [Example: the following complete translation unit is well-formed, even though it never defines X:

    struct X;      // declare X as a struct type
    struct X* x1;  // use X in pointer formation
    X* x2;         // use X in pointer formation

--end example] [Note: the rules for declarations and expressions describe in which contexts complete class types are required. A class type T must be complete if:

--an object of type T is defined (_basic.def_, _expr.new_), or

--an lvalue-to-rvalue conversion is applied to an lvalue referring to an object of type T (_conv.lval_), or

--an expression is converted (either implicitly or explicitly) to type T (_conv_, _expr.type.conv_, _expr.dynamic.cast_, _expr.static.cast_, _expr.cast_), or

--an expression is converted to the type pointer to T or reference to T using an implicit conversion (_conv_), a dynamic_cast (_expr.dynamic.cast_) or a static_cast (_expr.static.cast_), or

--a class member access operator is applied to an expression of type T (_expr.ref_), or

--the typeid operator (_expr.typeid_) or the sizeof operator (_expr.sizeof_) is applied to an operand of type T, or

--a function with a return type or argument type of type T is defined (_basic.def_) or called (_expr.call_), or

--an lvalue of type T is assigned to (_expr.ass_).
]

5 There can be more than one definition of a class type (_class_), enumeration type (_dcl.enum_), inline function with external linkage (_dcl.fct.spec_), class template (_temp_), non-static function template (_temp.fct_), static data member of a class template (_temp.static_), member function template (_temp.mem.func_), or template specialization for which some template parameters are not specified (_temp.spec_, _temp.class.spec_) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then

--each definition of D shall consist of the same sequence of tokens; and

--in each definition of D, corresponding names, looked up according to _basic.lookup_, shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution (_over.match_) and after matching of partial template specialization (_temp.over_), except that a name can refer to a const object with internal or no linkage if the object has the same integral or enumeration type in all definitions of D, and the object is initialized with a constant expression (_expr.const_), and the value (but not the address) of the object is used, and the object has the same value in all definitions of D; and

--in each definition of D, the overloaded operators referred to, the implicit calls to conversion operators, constructors, operator new functions and operator delete functions, shall refer to the same function, or to a function defined within the definition of D; and

--in each definition of D, a default argument used by an (implicit or explicit) function call is treated as if its token sequence were present in the definition of D; that is, the default argument is subject to the three requirements described above (and, if the default argument has sub-expressions with default arguments, this
requirement applies recursively).2)

--if D is a class with an implicitly-declared constructor (_class.ctor_), it is as if the constructor was implicitly defined in every translation unit where it is used, and the implicit definition in every translation unit shall call the same constructor for a base class or a class member of D.

_________________________
2) _dcl.fct.default_ describes how default argument names are looked up.

[Example:

    // translation unit 1:
    struct X {
        X(int);
        X(int, int);
    };
    X::X(int = 0) { }
    class D: public X { };
    D d2;                   // X(int) called by D()

    // translation unit 2:
    struct X {
        X(int);
        X(int, int);
    };
    X::X(int = 0, int = 0) { }
    class D: public X { };  // X(int, int) called by D();
                            // D()'s implicit definition
                            // violates the ODR

--end example] If D is a template, and is defined in more than one translation unit, then the last four requirements from the list above shall apply to names from the template's enclosing scope used in the template definition (_temp.nondep_), and also to dependent names at the point of instantiation (_temp.dep_). If the definitions of D satisfy all these requirements, then the program shall behave as if there were a single definition of D. If the definitions of D do not satisfy these requirements, then the behavior is undefined.

7.1.2 Function specifiers [dcl.fct.spec]

1 Function-specifiers can be used only in function declarations.

function-specifier:
    inline
    virtual
    explicit

2 A function declaration (_dcl.fct_, _class.mfct_, _class.friend_) with an inline specifier declares an inline function. The inline specifier indicates to the implementation that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. An implementation is not required to perform this inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules for inline functions defined by this subclause shall still be respected.
3 A function defined within a class definition is an inline function. The inline specifier shall not appear on a block scope function declaration.2)

4 An inline function shall be defined in every translation unit in which it is used and shall have exactly the same definition in every case (_basic.def.odr_). [Note: a call to the inline function may be encountered before its definition appears in the translation unit. ] If a function with external linkage is declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required. [Note: a static local variable in an extern inline function always refers to the same object. ]

_________________________
2) The inline keyword has no effect on the linkage of a function.

5 The virtual specifier shall only be used in declarations of nonstatic class member functions that appear within a member-specification of a class declaration; see _class.virtual_.

6 The explicit specifier shall be used only in declarations of constructors within a class declaration; see _class.conv.ctor_.

Appendix C: relevant material from GNU GCC 2.6.3:
=================================================

File: gcc.info, Node: Inline, Next: Extended Asm, Prev: Alignment, Up: C Extensions

An Inline Function is As Fast As a Macro
========================================

By declaring a function `inline', you can direct GNU CC to integrate that function's code into the code for its callers. This makes execution faster by eliminating the function-call overhead; in addition, if any of the actual argument values are constant, their known values may permit simplifications at compile time so that not all of the inline function's code needs to be included. The effect on code size is less predictable; object code may be larger or smaller with function inlining, depending on the particular case. Inlining of functions is an optimization and it really "works" only in optimizing compilation.
If you don't use `-O', no function is really inline.

To declare a function inline, use the `inline' keyword in its declaration, like this:

    inline int
    inc (int *a)
    {
      (*a)++;
    }

(If you are writing a header file to be included in ANSI C programs, write `__inline__' instead of `inline'. *Note Alternate Keywords::.)

You can also make all "simple enough" functions inline with the option `-finline-functions'. Note that certain usages in a function definition can make it unsuitable for inline substitution.

Note that in C and Objective C, unlike C++, the `inline' keyword does not affect the linkage of the function.

GNU CC automatically inlines member functions defined within the class body of C++ programs even if they are not explicitly declared `inline'. (You can override this with `-fno-default-inline'; *note Options Controlling C++ Dialect: C++ Dialect Options..)

When a function is both inline and `static', if all calls to the function are integrated into the caller, and the function's address is never used, then the function's own assembler code is never referenced. In this case, GNU CC does not actually output assembler code for the function, unless you specify the option `-fkeep-inline-functions'. Some calls cannot be integrated for various reasons (in particular, calls that precede the function's definition cannot be integrated, and neither can recursive calls within the definition). If there is a nonintegrated call, then the function is compiled to assembler code as usual. The function must also be compiled as usual if the program refers to its address, because that can't be inlined.

When an inline function is not `static', then the compiler must assume that there may be calls from other source files; since a global symbol can be defined only once in any program, the function must not be defined in the other source files, so the calls therein cannot be integrated. Therefore, a non-`static' inline function is always compiled on its own in the usual fashion.
If you specify both `inline' and `extern' in the function definition, then the definition is used only for inlining. In no case is the function compiled on its own, not even if you refer to its address explicitly. Such an address becomes an external reference, as if you had only declared the function, and had not defined it.

This combination of `inline' and `extern' has almost the effect of a macro. The way to use it is to put a function definition in a header file with these keywords, and put another copy of the definition (lacking `inline' and `extern') in a library file. The definition in the header file will cause most calls to the function to be inlined. If any uses of the function remain, they will refer to the single copy in the library.

GNU C does not inline any functions when not optimizing. It is not clear whether it is better to inline or not, in this case, but we found that a correct implementation when not optimizing was difficult. So we did the easy thing, and turned it off.
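
Editorial illustration (not part of the GNU info node): the header/library arrangement described above might be sketched as follows. This is a single-file approximation under stated assumptions: it assumes a GCC-compatible compiler, and it uses the `gnu_inline' function attribute so that a compiler defaulting to a newer C dialect still applies the GNU rules this node describes. In real use the first definition would sit in a header and the second in a separate library source file. The names fast_abs, distance, fast_abs_address and self_check are hypothetical.

```c
#include <assert.h>

/* Would live in a header file (hypothetical).
 * Inline-only definition: under the GNU rules no standalone code is
 * emitted for it, and taking its address yields an external reference. */
extern inline __attribute__((gnu_inline)) int fast_abs(int x)
{
    return x < 0 ? -x : x;
}

/* Would live in a library source file (hypothetical).
 * The single out-of-line copy, lacking `inline' and `extern'. */
int fast_abs(int x)
{
    return x < 0 ? -x : x;
}

/* User code: a call that the compiler may integrate when optimizing. */
int distance(int a, int b)
{
    return fast_abs(a - b);
}

typedef int (*int_fn)(int);

/* User code: taking the address always refers to the library copy. */
int_fn fast_abs_address(void)
{
    return fast_abs;
}

/* Small self-check covering the direct call and the call via pointer. */
void self_check(void)
{
    assert(distance(3, 10) == 7);
    assert(fast_abs_address()(-4) == 4);
}
```

Keeping the two bodies textually identical is the programmer's responsibility in this scheme; nothing in the sketch checks that the header copy and the library copy agree.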