INLINING CONSIDERATIONS

                   WG14/N616 (X3J11/96-080)

                   Tom MacDonald
                   Cray Research
                   An SGI Company
                   655F Lone Oak Drive
                   Eagan  MN  55121
                   USA

                   tam@cray.com
                   3 October 1996


Introduction

I started writing a proposal that would provide exact edits to the C9X
DRAFT 6 for adding inlining to C.  However, it quickly became apparent
that the committee needs to list all the issues and then decide if this is
a feature we want to provide.  I have a list of issues below that is
intended to help us decide if we want to add inlining to C.


Motivation:

The motivation for adding inlining to C is performance.  There are several
implementations that have successfully incorporated some form of inlining
into translators (e.g., gcc, Cray, SGI, and C++).  The current success of
inlining is a good reason for the C committee to examine the potential of
this language feature.

Although an implementation can perform automatic inlining, direction from
the authors of a program is often needed to avoid problems such as
excessive translation time, excessive translation memory usage, and
excessive size of the resulting executable binary.  The author often knows
when inlining is a good idea and when it is not needed.  The translator
all too often has to resort to heuristics.


Debatable Issues:

What does the "inline" keyword mean?

The "inline" keyword never changes the behavior of a program, but might
allow it to execute faster.  That is, an implementation is free to ignore
"inline" if it chooses to do so (except that it must successfully
translate programs that contain correct usage of the keyword).  Thus,
"inline" is a hint to the translator (similar to "register").  Any
implementation that exploits this hint must preserve the behavior of a
strictly conforming program, such that it behaves the same as the same
program without the hint.  Note:  in C++ "inline" is not just a hint.
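
A minimal sketch of the "hint" reading, assuming a translator that accepts
"inline" as a specifier on a static function (as gcc and C++ already do).
The names HINT, NO_INLINE_HINT, square, and sum_of_squares are hypothetical
and used only for illustration.  Defining NO_INLINE_HINT removes the
keyword, and the program must compute the same results either way.

```c
/* HINT expands to the keyword by default and to nothing when the
   hypothetical macro NO_INLINE_HINT is defined. */
#ifdef NO_INLINE_HINT
#define HINT
#else
#define HINT inline
#endif

/* Small function called in a loop: a typical candidate for inlining. */
static HINT int square(int x)
{
    return x * x;
}

/* Sum of the squares 1..n.  The result must not depend on whether the
   translator chose to inline square(). */
static int sum_of_squares(int n)
{
    int i;
    int total = 0;

    for (i = 1; i <= n; i++)
        total += square(i);
    return total;
}
```

Only execution time may differ between the two builds; any observable
difference in results would mean the hint had changed program behavior.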


What syntax should be used?

There are two possibilities, an inline keyword and an inline pragma.  The
inline keyword mimics the C++ standard but takes another name away from
the user name space.  A pragma approach is most likely dependent upon
adoption of the new form of pragma (i.e., a unary preprocessor operator).


What are the issues with an inline keyword?

First, is "inline" a storage-class-specifier, a type-specifier, or a new
creature called an inline-specifier?  The C++ spec. uses a production
called a `function-specifier' and "inline" is one of the alternatives
(along with others like "virtual").

The meaning of "inline" in the following examples has to be defined:

   inline extern void func1(void);  // not allowed by some C++ compilers
   inline void func2(void);
   inline static void func3(void);

What happens if "inline" appears on something other than a function?

   inline int x;                          // Error in C++
   inline int f(void), (*pf)(void)=f, i;  // Error in C++
   typedef inline int T;                  // Error with some C++ compilers
   T f(void) { return 0; }

What about inline composition?

   inline static int compose(void);
   static int compose(void) { return 0; }  // OK in C++, compose is inline

   static int comp(void) { return 0; }
   inline static int comp(void);           // OK in C++, comp is inline

   static int kkk(void) { return 0; }

   int main(void) {
      inline int comp(void);              // Error with some C++ compilers

      int k = kkk();
   }

   inline static int kkk(void);           // Error with some C++ compilers

Exactly when can you add the "inline" specifier?


What are the issues with an inline pragma?

No standard pragmas exist yet; however, the committee might be willing to
entertain the notion of a standard pragma if the new form of pragma is
adopted.

A pragma eliminates some of the issues above.  Since inline is a hint that
doesn't affect program correctness, the implementation has some liberty
with a pragma such as the following:

   pragma("inline my_func");

When present at file scope, it specifies functions to inline.
However, an implementation can easily extend the meaning when present at
block scope to mean inline any occurrence of `my_func' in the next
statement.  All too often programmers want to limit where a function is
inlined.  With a pragma, the C Standard is not burdened with all the
issues about syntax and semantics that a keyword raises.

A pragma approach does mean that C and C++ have different features for
essentially the same functionality.  The problem here is that it is
hard to keep up with the changes being made to the C++ Draft.

What would it even mean to standardize a pragma?
Perhaps something like:
 
1.  A specification for the optional and non-optional pp-tokens.
2.  An intended meaning for a pragma of the specified form.
3.  A recommendation that no diagnostic (except possibly an
    "informational" one) be issued for a pragma of the specified form.
4.  A recommendation that a pragma of the specified form should not
    be interpreted to have another, unrelated meaning.
 
Consider two possibilities (one from Cray and the other from SGI):
 
    #pragma [no]inline [_CRI] (name[,name...])
 
    #pragma [no]inline [here|routine|global] [(name[,name...])]
 
These are pretty close, without the optional elements, but it would be
nice to eliminate gratuitous differences in those.  A standard
specification based on these might go something like what follows (item
6.c below is there for the sake of argument, because it's there for C++).

====================================================================
5.  Pragma inline syntax
 
    #pragma [no]inline [vendor_id] [(name[,name...])] [inline_args]
 
where vendor_id shall be from the implementor's name space, and
inline_args is not further specified (but does show where to put
implementation-specific extensions like [here|routine|global] or
[from "path"]).
 
6.  Pragma inline intended semantics
 
a.  The functions designated by the names (or all functions if
    none are named) should be considered candidates for inlining.
b.  The invocations that are inlined are determined by the
    placement of the directive:  Next statement in a block,
    rest of file outside a block.  (Other criteria may be
    specified in inline_args, but would not be portable.)
c.  The pragma shall follow the definitions of the named functions,
    if they are visible.  (Paths to other files may be
    specified in inline_args, but would not be portable.)
====================================================================
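
The placement rules in 6.b and 6.c can be sketched as follows.  This is
an illustration under stated assumptions, not a definitive form:
HAVE_PRAGMA_INLINE is a hypothetical macro that a program would define
only for a translator known to accept the directive, and add1, dbl, and
combine are hypothetical names.  With the macro undefined the directives
vanish and the program behaves identically.

```c
/* Definitions come first, so that each directive follows the
   definition of the function it names (item 6.c). */
static int add1(int x) { return x + 1; }
static int dbl(int x)  { return x * 2; }

#ifdef HAVE_PRAGMA_INLINE
#pragma inline (add1)   /* outside a block: applies to the rest of the file */
#endif

static int combine(int x)
{
    int y;

#ifdef HAVE_PRAGMA_INLINE
#pragma inline (dbl)    /* inside a block: applies to the next statement only */
#endif
    y = dbl(x);         /* candidate for inlining */
    y = dbl(y);         /* ordinary call: the block-level hint no longer applies */
    return add1(y);     /* candidate through the file-level directive */
}
```

The guard keeps the directive away from translators that would diagnose
an unrecognized pragma, which is the situation recommendation 3 above is
meant to avoid.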
7.  Other pragma inline possibilities

a.  Note that the new form of pragma opens up new possibilities for
    placement.  For example, it could go after the parenthesized
    parameter list, like a C++ cv-qualifier:
 
        static int f(int i) pragma("inline") { ... }
 
    The meaning is the same as:
 
        static int f(int i) { ... }
        #pragma inline (f)
 
    The advantage is that it looks like part of the function signature,
    and the name of the function doesn't have to be repeated.  This also
    addresses issues above like:

        inline int f(void), (*pf)(void)=f, i;

    which can be written as:
 
        int f(void) pragma("inline"), (*pf)(void)=f, i;

b.  A way to supply linkage control, like GNU's interpretation of
    extern inline, could be:
 
        static int f(int i) pragma("inline else extern") { ... }
 
    or
        
        static int f(int i) { ... }
        pragma("inline (f) else extern");
 
    This means that if the function can be inlined, then the given
    definition should be used, else ignore the definition and treat
    f as externally defined.  The meaning is like:
 
        #if F_CAN_BE_INLINED
            static int f(int i) pragma("inline") { ... }
        #else
            extern int f(int i);
        #endif
 
    except that the compiler defines F_CAN_BE_INLINED automatically,
    perhaps even on an invocation-by-invocation basis.
 
    The problem of keeping the local and external definitions in sync
    is still there.  The definitions need not actually be identical,
    but you would probably want the same behavior.  That way the local
    definition could be a specialization of the general library
    function.
 
    Even in this case, it appears the pragma can be legitimately regarded
    as a pure hint, because if it is macro-defined away, the program
    behaves (aside from execution time) in the same way as if the local
    definition were used to inline all invocations.
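
The equivalence described in 7.b can be sketched in a single file, under
some loudly stated assumptions.  F_CAN_BE_INLINED is set by hand here
rather than by the compiler, and f_general (a hypothetical name) stands
in for the externally defined f, since one translation unit cannot hold
both a static and an external f.  The local definition differs in form
from the general one but must return the same values for the hint to
remain a pure hint.

```c
/* Stand-in for the external library definition of f; the name
   f_general is hypothetical. */
int f_general(int i)
{
    return i * 2;
}

#ifdef F_CAN_BE_INLINED
/* Local definition, visible to the translator for inlining.  It is
   written differently from f_general but has the same behavior. */
static int f(int i)
{
    return i + i;
}
#else
/* Fallback: ignore the local definition and use the general one. */
static int f(int i)
{
    return f_general(i);
}
#endif

/* A caller whose result is the same whichever definition of f is used. */
int quadruple(int i)
{
    return f(f(i));
}
```

If the two definitions ever drift apart, the result of quadruple would
depend on the inlining decision, which is the synchronization problem
noted above.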