ISO/IEC JTC1/SC22/WG14 N633


                    Document Number:  WG14 N633/X3J11 96-097


                        C9X Revision Proposal
                        =====================

Title: Add "inline" capability to C, DRAFT 2
Author: Leith (Casey) Leedom
Author Affiliation: Silicon Graphics, Inc.
Postal Address: Mail Stop: 8U-500
                2011 N. Shoreline Blvd.
                Mountain View, CA 94043-1389
E-mail Address: leedom@sgi.com
Telephone Number: +1 415 9331504
Fax Number: +1 415 9628404
Sponsor: 
Date: 1996-11-26
Proposal Category:
   __ Editorial change/non-normative contribution
   __ Correction
   X_ New feature
   __ Addition to obsolescent feature list
   __ Addition to Future Directions
   __ Other (please specify)  
Area of Standard Affected:
   __ Environment
   X_ Language
   __ Preprocessor
   __ Library
      __ Macro/typedef/tag name
      __ Function
      __ Header
   __ Other (please specify)  
Prior Art: C++, GCC, various vendors including SGI, IBM, SUN, and DEC
Target Audience: Programmers interested in improving application performance
        by specifying that certain functions should be compiled ``inline.''
Related Documents (if any): DRAFT ISO/ANSI C++ Language Standard,
        http://www.cygnus.com/misc/wp/nov96; GCC, File: gcc.info, Node:
        Inline
Proposal Attached: X_ Yes __ No, but what's your interest?
Abstract: C++ has supported a function inlining facility since its
        inception.  Several C compilers now offer a similar facility; the
        GNU C Compiler, GCC, being a notable, widely available example.


Abstract:
=========

The C++ language has included a function inlining capability since
its inception.  The current ISO/ANSI C++ Standard DRAFT dated November 15,
1996 covers the syntax and semantics of the "inline" keyword in sections 3.2
(One definition rule) and 7.1.2 (Function specifiers).  These sections are
attached in Appendix B of this proposal.

The GNU C Compiler, GCC, is an ISO 9899-1990 compliant compiler which
supports several extensions to the ANSI C standard, including a function
inlining capability very similar to the function inlining of C++.  The
syntax and semantics of the "inline" keyword are covered in the Inline info
node in the gcc.info file.  This info node is attached in Appendix C of this
proposal.

Both C++ and GCC specify that the use of the keyword "inline" in a function
definition serves as a ``hint'' to the compiler that calls to such a
function be substituted with the function body rather than the usual call
mechanism.  There are, however, several differences between the C++ and GCC
"inline" definitions.  The primary difference is the extensive facility in
GCC to control when and where inline functions are allowed to be
instantiated as separate, callable functions.  See Appendix A for a complete
description of the differences between the two definitions.
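
As a minimal sketch of the ``hint'' described above, an inline function and
a call site might look like the following.  The names clamp_int and
clamp_sum are hypothetical, and "static" is added here only as an
assumption so that the fragment compiles on its own; neither is taken from
C++ or GCC.

```c
/* Hypothetical helper.  "inline" asks the compiler to substitute the
   body at each call site instead of using the normal call mechanism;
   the compiler is free to ignore the hint. */
static inline int clamp_int(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}

/* A call site.  Whether or not the call is substituted, the observable
   result is the same as for an ordinary function call. */
int clamp_sum(int a, int b)
{
    return clamp_int(a + b, 0, 100);
}
```

Under these assumptions clamp_sum(60, 70) yields 100 whether or not the
compiler chooses to substitute the body of clamp_int.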


Proposal:
=========

This proposal advocates the inclusion into the ISO/ANSI C9X standard of a
function inlining facility identical to that implemented by C++.  C++'s
model was chosen over GCC's because of the desire to maintain wherever
possible and reasonable a high degree of compatibility between the C and C++
standards.  Because GCC is a widely available compiler and some application
code may already have adopted its inline model, this choice may lead to
problems.  However, (1) it is highly likely that most such uses are guarded
by preprocessor directives active only under GCC, (2) the differences are
minor enough not to introduce problems in the majority of cases, and (3) the
desire to maintain compatibility with the C++ standard is important enough
to accept the remaining problems.

In this proposal:

  o References to Draft 5 of the C9X C Language Standard are of the form
    C9X-D5/section.number.

  o References to the 1996 November 15 ISO/ANSI C++ Standard DRAFT are of
    the form 1996-nov-C++/section.number.

  o References to the Inline info node from the file gcc.info for GCC
    release 2.6.3 are of the form GCC-2.6.3, or simply GCC.


Proposal details:
=================

A new keyword, "inline", is added to the set of declaration-specifiers
allowable for function declarations and definitions (C9X-D5/6.7.1).  The
semantics of this new keyword are identical to those in 1996-nov-C++.  Note
that the C++ Standard Committee has recently changed the traditional C++
semantics of the "inline" keyword.  In previous versions of the C++ language
and standard, "inline" has implied "internal linkage" as if the "static"
storage class keyword had been specified.  In 1996-nov-C++ this implied
linkage semantic has been dropped and "inline" no longer has any effect on
the linkage of a function (1996-nov-C++/7.1.2, footnote 2 in paragraph 4).

The relevant portions of 1996-nov-C++ are included here:

      3.2  One definition rule ...
    
    1 No translation unit shall contain more than one definition of any
      variable, function, [or] enumeration type, ...
    
    3 Every program shall contain exactly one definition of every non-inline
      function or object that is used in that program; no diagnostic
      required.  The definition can appear explicitly in the program, it can
      be found in the standard or a user-defined library, ...  An inline
      function shall be defined in every translation unit in which it is
      used.
    
    5 There can be more than one definition of a ... inline function with
      external linkage ... in a program provided that each definition
      appears in a different translation unit, and provided the definitions
      satisfy the following requirements.  Given such an entity named D
      defined in more than one translation unit, then
    
      --each definition of D shall consist of the same sequence  of  tokens;
        ...

      7.1.2  Function specifiers ...
    
    1 Function-specifiers can be used only in function declarations.
              function-specifier:
                      inline
                      ...
    
    2 A function declaration ... with an inline specifier declares an inline
      function.  The inline specifier indicates to the implementation that
      inline substitution of the function body at the point of call is to be
      preferred to the usual function call mechanism.  An implementation is
      not required to perform this inline substitution at the point of call;
      however, even if this inline substitution is omitted, the other rules
      for inline functions defined by this subclause shall still be
      respected.
    
    3 ... The inline specifier shall not appear on a block scope function
      declaration. ... The inline keyword has no effect on the linkage of a
      function.
    
    4 An inline function shall be defined in every translation unit in which
      it is used and shall have exactly the same definition in every case
      (see one definition rule, 3.2).  [Note: a call to the inline function
      may be encountered before its definition appears in the translation
      unit.]  If a function with external linkage is declared inline in one
      translation unit, it shall be declared inline in all translation
      units in which it appears; no diagnostic is required.  [Note: a static
      local variable in an extern inline function always refers to the same
      object.]

Note that the use above of the phrase ``the same sequence of tokens'' is
somewhat ambiguous since 1996-nov-C++ doesn't seem to have an explicit
definition of what the word ``token'' means in the context of the body of
the standard.  E.g. in C9X-D5 we have:

    6.1 Lexical elements
    ...
    [4] A token is the minimal lexical element of the language in
    translation phases 7 and 8.  The categories of tokens are: keywords,
    identifiers, constants, string literals, operators, and punctuators.  A
    preprocessing token is the minimal lexical element of the language in
    translation phases 3 through 6.  The categories of preprocessing token
    are: header names, identifiers, preprocessing numbers, character
    constants, string literals, operators, punctuators, and single
    non-white-space characters that do not lexically match the other
    preprocessing token categories. ...

However, the structure of the two documents is very similar including nearly
identical numbering and wording for the description of translation phases.
Thus we will take the word ``token'' above to mean ``translation phase 7
token'' which is the meaning of the word in C9X-D5.
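
In practice, the requirement that an inline function be defined in every
translation unit that uses it, with the same sequence of tokens each time,
is usually met by placing the definition in a shared header.  The following
is only a sketch of that arrangement collapsed into one file: the header
name celsius.h, the guard macro, and both function names are hypothetical,
and "static" is an assumption made so that the fragment is self-contained.

```c
/* ---- contents of a hypothetical header, "celsius.h" ---- */
#ifndef CELSIUS_H
#define CELSIUS_H

/* Every translation unit that includes the header sees exactly this
   token sequence, so all definitions are identical by construction. */
static inline int celsius_to_fahrenheit(int celsius)
{
    return celsius * 9 / 5 + 32;
}

#endif /* CELSIUS_H */
/* ---- end of hypothetical header ---- */

/* A translation unit that would normally #include "celsius.h". */
int boiling_point_fahrenheit(void)
{
    return celsius_to_fahrenheit(100);
}
```

Because the definition lives in one place, an edit to the function body
reaches every translation unit at the next compilation, which avoids
definitions drifting apart.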

Unresolved issues:
==================

 1. The above definition will lead to conflict with the ``one definition
    rule'' in C9X-D5/6.7, paragraph 6:

        An external definition is an external declaration that is also a
        definition of a function or an object.  If an identifier declared
        with external linkage is used in an expression (other than as part
        of the operand of a sizeof operator), somewhere in the entire
        program there shall be exactly one external definition for the
        identifier; otherwise, there shall be no more than one. [83. Thus,
        if an identifier declared with external linkage is not used in an
        expression, there need be no external definition for it.]

 2. 1996-nov-C++ is imprecise on whether multiple out-of-line instantiations
    of an ``extern inline'' function result in a single instantiation (i.e.
    if you take the address of such a function in multiple translation
    units, do you get the same address).  This issue has been raised with
    the 1996-nov-C++ committee.

 3. 1996-nov-C++ would appear to result in multiple out-of-line copies
    of "static inline" functions (one per translation unit).  This is a
    result of the language which states that "inline" indicates that inline
    substitution is preferred.  As a result it appears that a "static"
    out-of-line copy must be generated.  Since these copies would have
    internal linkage, there's no way a linker could collapse such copies.
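
Issues 2 and 3 both concern out-of-line copies of inline functions.  As a
rough sketch confined to a single translation unit, taking the address of
an inline function is one use that cannot be satisfied by substitution.
The names twice, int_fn, address_of_twice, and call_both_ways are
hypothetical and appear in neither 1996-nov-C++ nor GCC.

```c
/* Hypothetical inline function.  "static" gives it internal linkage,
   which is the case discussed in issue 3. */
static inline int twice(int x)
{
    return 2 * x;
}

typedef int (*int_fn)(int);

/* Taking the address cannot be satisfied by inline substitution, so an
   out-of-line copy is needed in this translation unit. */
int_fn address_of_twice(void)
{
    return twice;
}

/* A direct call (which may be substituted) and an indirect call through
   the pointer must produce the same value.  Returns 1 if they agree. */
int call_both_ways(int x)
{
    int_fn fp = address_of_twice();
    return twice(x) == fp(x);
}
```

Within one translation unit the address is necessarily unique.  Issue 2
asks whether the same holds across translation units for a function with
external linkage, which a single-file sketch like this cannot show.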

Appendix A: Differences between C++ and GCC "inline":
=====================================================

While the GCC-2.6.3 inlining facility was almost certainly modeled on the
C++ inline capability, there are several differences.

The single largest difference between the GCC-2.6.3 and 1996-nov-C++
definitions has to do with GCC's extensive ``instantiation control''
capabilities for inline functions.  With GCC an implementor can explicitly
control whether references to an inline function that cannot be substituted
result in [translation-unit] local copies of the function with internal
linkage, local copies with external linkage, or non-local copies with
external linkage.

For example, this could be used to cause external users of a library
function, f(), to use the normal call mechanism while internal uses of f()
could all be inlined.  This would allow the library implementors to
carefully control space, time, and binary compatibility issues of using
inline functions (if user code were allowed to inline f(), then compiled
user code could become binary incompatible with a new version of the library
which used different internal data structures).

Much of this control is probably provided to get around the fact that the
loader won't collapse multiple definitions of a function to a single
instance as is done for uninitialized variables with external linkage.

GCC-2.6.3 also allows an inline function to be used before being defined.
Uses prior to the inline definition will call an out-of-line instantiation
of the function using the normal function call mechanism while uses
following the definition will inline it.  It appears from 1996-nov-C++ that
it is legal to call an inline function before it is defined in C++ but it
isn't 100% clear (see the note in 1996-nov-C++/7.1.2, paragraph 4).
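
The use-before-definition case can be sketched as follows.  The function
names are hypothetical and "static" is an assumption made so the fragment
stands alone; the comments restate the GCC-2.6.3 behaviour described above
and do not claim what any particular compiler release actually does.

```c
/* Declaration only; the inline definition appears later in the file. */
static inline int square(int x);

/* This call precedes the definition.  As described for GCC-2.6.3, such
   a call would go to an out-of-line instantiation of square(). */
int area_before_definition(int side)
{
    return square(side);
}

/* The inline definition itself. */
static inline int square(int x)
{
    return x * x;
}

/* This call follows the definition and is a candidate for inline
   substitution. */
int area_after_definition(int side)
{
    return square(side);
}
```

Either way, both callers must compute the same value; only the call
mechanism used to reach square() may differ.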

Finally, GCC-2.6.3 explicitly states that if all calls to a ``static
inline'' are integrated into the caller and the function's address is never
taken, then no out-of-line code for the function will be emitted.
1996-nov-C++ merely states that "inline" is an indication that inline
substitution should be preferred; thus it seems that an internal linkage
out-of-line copy is mandated by the standard (see issue #3 above).

Appendix B: relevant material from ISO/ANSI C++ Standard DRAFT
            September 24, 1996 [http://www.cygnus.com/misc/wp/sep96]
====================================================================

  3.2  One definition rule                               [basic.def.odr]

1 No  translation  unit  shall  contain  more than one definition of any
  variable, function, class type, enumeration type or template.

2 An expression is potentially evaluated unless either it is the operand
  of  the  sizeof  operator (_expr.sizeof_), or it is the operand of the
  typeid operator and does not designate an lvalue of polymorphic  class
  type (_expr.typeid_).  An object or non-overloaded function is used if
  its name appears in a  potentially-evaluated  expression.   A  virtual
  member  function is used if it is not pure.  An overloaded function is
  used if it is selected by overload resolution when referred to from  a
  potentially-evaluated  expression.   [Note: this covers calls to named
  functions (_expr.call_), operator overloading  (_over_),  user-defined
  conversions  (_class.conv.fct_), allocation function for placement new
  (_expr.new_), as well as non-default initialization  (_dcl.init_).   A
  copy  constructor  is  used even if the call is actually elided by the
  implementation.  ] An allocation or deallocation function for a  class
  is  used  by  a  new  expression  appearing in a potentially-evaluated
  expression as specified in _expr.new_ and _class.free_.   A  dealloca-
  tion  function for a class is used by a delete expression appearing in
  a potentially-evaluated expression as specified in  _expr.delete_  and
  _class.free_.   A  copy-assignment  function for a class is used by an
  implicitly-defined copy-assignment function for another class as spec-
  ified  in  _class.copy_.  A default constructor for a class is used by
  default initialization as specified in _dcl.init_.  A constructor  for
  a  class is used as specified in _dcl.init_.  A destructor for a class
  is used as specified in _class.dtor_.

3 Every program shall contain exactly one definition of every non-inline
  function  or  object  that  is  used  in  that  program; no diagnostic
  required.  The definition can appear explicitly in the program, it can
  be found in the standard or a user-defined library, or (when appropri-
  ate) it is implicitly  defined  (see  _class.ctor_,  _class.dtor_  and
  _class.copy_).   An inline function shall be defined in every transla-
  tion unit in which it is used.

4 Exactly one definition of a class is required in a translation unit if
  the  class  is  used  in a way that requires the class type to be com-
  plete.  [Example: the following complete  translation  unit  is  well-
  formed, even though it never defines X:
          struct X;      // declare X as a struct type
          struct X* x1;  // use X in pointer formation
          X* x2;         // use X in pointer formation
    --end  example]  [Note:  the  rules for declarations and expressions
  describe in which contexts complete class types are required.  A class
  type T must be complete if:

  --an object of type T is defined (_basic.def_, _expr.new_), or

  --an  lvalue-to-rvalue conversion is applied to an lvalue referring to
    an object of type T (_conv.lval_), or

  --an expression is converted (either implicitly or explicitly) to type
    T        (_conv_,       _expr.type.conv_,       _expr.dynamic.cast_,
    _expr.static.cast_, _expr.cast_), or

  --an expression is converted to the type pointer to T or reference  to
    T   using   an   implicit   conversion   (_conv_),   a  dynamic_cast
    (_expr.dynamic.cast_) or a static_cast (_expr.static.cast_), or

  --a class member access operator is applied to an expression of type T
    (_expr.ref_), or

  --the   typeid   operator   (_expr.typeid_)  or  the  sizeof  operator
    (_expr.sizeof_) is applied to an operand of type T, or

  --a function with a return type or argument type of type T is  defined
    (_basic.def_) or called (_expr.call_), or

  --an lvalue of type T is assigned to (_expr.ass_).  ]

5 There  can be more than one definition of a class type (_class_), enu-
  meration type (_dcl.enum_),  inline  function  with  external  linkage
  (_dcl.fct.spec_),  class  template  (_temp_), non-static function tem-
  plate  (_temp.fct_),  static  data  member   of   a   class   template
  (_temp.static_),  member  function template (_temp.mem.func_), or tem-
  plate specialization for which some template parameters are not speci-
  fied  (_temp.spec_, _temp.class.spec_) in a program provided that each
  definition appears in a different translation unit, and  provided  the
  definitions  satisfy the following requirements.  Given such an entity
  named D defined in more than one translation unit, then

  --each definition of D shall consist of the same sequence  of  tokens;
    and

  --in each definition of D, corresponding names, looked up according to
    _basic.lookup_, shall refer to an entity defined within the  defini-
    tion of D, or shall refer to the same entity, after overload resolu-
    tion (_over.match_) and after matching of partial template  special-
    ization  (_temp.over_),  except  that  a  name  can refer to a const
    object with internal or no linkage if the object has the same  inte-
    gral  or enumeration type in all definitions of D, and the object is
    initialized with a constant expression (_expr.const_), and the value
    (but  not the address) of the object is used, and the object has the
    same value in all definitions of D; and

  --in each definition of D, the overloaded operators referred  to,  the
    implicit  calls  to conversion operators, constructors, operator new
    functions and operator delete functions, shall  refer  to  the  same
    function, or to a function defined within the definition of D; and

  --in  each definition of D, a default argument used by an (implicit or
    explicit) function call is treated as if  its  token  sequence  were
    present  in  the  definition  of D; that is, the default argument is
    subject to the three  requirements  described  above  (and,  if  the
    default  argument  has  sub-expressions with default arguments, this
    requirement applies recursively).2)

  --if   D   is   a   class   with  an  implicitly-declared  constructor
    (_class.ctor_), it is as if the constructor was  implicitly  defined
    in every translation unit where it is used, and the implicit defini-
    tion in every translation unit shall call the same constructor for a
    base class or a class member of D.  [Example:

  _________________________
  2) _dcl.fct.default_  describes how default argument names are  looked
  up.

              // translation unit 1:
              struct X {
                      X(int);
                      X(int, int);
              };
              X::X(int = 0) { }
              class D: public X { };
              D d2; // X(int) called by D()

              // translation unit 2:
              struct X {
                      X(int);
                      X(int, int);
              };
              X::X(int = 0, int = 0) { }
              class D: public X { };     // X(int, int) called by D();
                                         // D()'s implicit definition
                                         // violates the ODR
      --end example] If D is a template, and is defined in more than one
    translation unit, then the last  four  requirements  from  the  list
    above  shall apply to names from the template's enclosing scope used
    in the template definition (_temp.nondep_), and  also  to  dependent
    names  at  the  point of instantiation (_temp.dep_).  If the defini-
    tions of D satisfy all these requirements, then  the  program  shall
    behave  as  if  there were a single definition of D.  If the defini-
    tions of D do not satisfy these requirements, then the  behavior  is
    undefined.

  7.1.2  Function specifiers                              [dcl.fct.spec]

1 Function-specifiers can be used only in function declarations.
          function-specifier:
                  inline
                  virtual
                  explicit

2 A function declaration (_dcl.fct_, _class.mfct_, _class.friend_)  with
  an inline specifier declares an inline function.  The inline specifier
  indicates to the implementation that inline substitution of the  func-
  tion  body  at the point of call is to be preferred to the usual func-
  tion call mechanism.  An implementation is  not  required  to  perform
  this  inline  substitution at the point of call; however, even if this
  inline substitution is omitted, the other rules for  inline  functions
  defined by this subclause shall still be respected.

3 A  function  defined  within a class definition is an inline function.
  The inline specifier shall  not  appear  on  a  block  scope  function
  declaration.2)

4 An inline function shall be defined in every translation unit in which
  it is used and shall have exactly the same definition  in  every  case
  (_basic.def.odr_).   [Note:  a  call  to  the  inline  function may be
  encountered before its definition appears in the translation unit.   ]
  If a function with external linkage is declared inline in one transla-
  tion unit, it shall be declared inline in  all  translation  units  in
  which  it  appears;  no diagnostic is required.  [Note: a static local
  variable in an extern  inline  function  always  refers  to  the  same
  object.  ]
  _________________________
  2) The inline keyword has no effect on the linkage of a function.

5 The  virtual specifier shall only be used in declarations of nonstatic
  class member functions that appear within a member-specification of  a
  class declaration; see _class.virtual_.

6 The explicit specifier shall be used only in declarations of construc-
  tors within a class declaration; see _class.conv.ctor_.

Appendix C: relevant material from GNU GCC 2.6.3:
=================================================

File: gcc.info,  Node: Inline,  Next: Extended Asm,  Prev: Alignment,  Up: C Extensions

An Inline Function is As Fast As a Macro
========================================

   By declaring a function `inline', you can direct GNU CC to integrate
that function's code into the code for its callers.  This makes
execution faster by eliminating the function-call overhead; in
addition, if any of the actual argument values are constant, their known
values may permit simplifications at compile time so that not all of the
inline function's code needs to be included.  The effect on code size is
less predictable; object code may be larger or smaller with function
inlining, depending on the particular case.  Inlining of functions is an
optimization and it really "works" only in optimizing compilation.  If
you don't use `-O', no function is really inline.

   To declare a function inline, use the `inline' keyword in its
declaration, like this:

     inline int
     inc (int *a)
     {
       return (*a)++;
     }

   (If you are writing a header file to be included in ANSI C programs,
write `__inline__' instead of `inline'.  *Note Alternate Keywords::.)

   You can also make all "simple enough" functions inline with the
option `-finline-functions'.  Note that certain usages in a function
definition can make it unsuitable for inline substitution.

   Note that in C and Objective C, unlike C++, the `inline' keyword
does not affect the linkage of the function.

   GNU CC automatically inlines member functions defined within the
class body of C++ programs even if they are not explicitly declared
`inline'.  (You can override this with `-fno-default-inline'; *note
Options Controlling C++ Dialect: C++ Dialect Options..)

   When a function is both inline and `static', if all calls to the
function are integrated into the caller, and the function's address is
never used, then the function's own assembler code is never referenced.
In this case, GNU CC does not actually output assembler code for the
function, unless you specify the option `-fkeep-inline-functions'.
Some calls cannot be integrated for various reasons (in particular,
calls that precede the function's definition cannot be integrated, and
neither can recursive calls within the definition).  If there is a
nonintegrated call, then the function is compiled to assembler code as
usual.  The function must also be compiled as usual if the program
refers to its address, because that can't be inlined.

   When an inline function is not `static', then the compiler must
assume that there may be calls from other source files; since a global
symbol can be defined only once in any program, the function must not
be defined in the other source files, so the calls therein cannot be
integrated.  Therefore, a non-`static' inline function is always
compiled on its own in the usual fashion.

   If you specify both `inline' and `extern' in the function
definition, then the definition is used only for inlining.  In no case
is the function compiled on its own, not even if you refer to its
address explicitly.  Such an address becomes an external reference, as
if you had only declared the function, and had not defined it.

   This combination of `inline' and `extern' has almost the effect of a
macro.  The way to use it is to put a function definition in a header
file with these keywords, and put another copy of the definition
(lacking `inline' and `extern') in a library file.  The definition in
the header file will cause most calls to the function to be inlined.
If any uses of the function remain, they will refer to the single copy
in the library.

   GNU C does not inline any functions when not optimizing.  It is not
clear whether it is better to inline or not, in this case, but we found
that a correct implementation when not optimizing was difficult.  So we
did the easy thing, and turned it off.