Inline functions accessing identifiers declared with constexpr

Jens Gustedt, INRIA and ICube, France

2024-05-05

document history

document number   date     comment
n3253             202405   this paper, original proposal

license

CC BY, see https://creativecommons.org/licenses/by/4.0

1 Problem description

1.1 Constraints for inline definitions

C23 has introduced several new features for which the text for inline functions has not yet been properly updated. This concerns:

In particular, the latter have already led to diverging practice:

constexpr unsigned unfug = 1;

inline unsigned get(void) {
    return unfug;
}

clang accepts this code, gcc rejects it. The problem here is that the constexpr declaration gives unfug internal linkage, and 6.7.5 p3 states

An inline definition of a function with external linkage shall not contain, … anywhere in the tokens making up the function definition, a reference to an identifier with internal linkage.

Note that this uses the vague term “reference to an identifier” instead of simply “an identifier”, which would be more appropriate when discussing token sequences. This strange terminology could perhaps be read as “taking a reference to an identifier”, in which case the interpretation of clang would be correct and the diagnostic of gcc would be overprotective.

In that existing text, it is important to note that the constraint concerns uses of the identifier (with internal linkage) and not uses of the underlying object (with static storage duration) or function. A use as in the following

constexpr unsigned unfug = 1;

extern unsigned const*const my_copy;

inline unsigned getit(void) {
    return *my_copy;
}

is valid (and should remain so) even if the instantiation then has

unsigned const*const my_copy = &unfug;

Here the access of the pointer value goes through a pointer object with external linkage and all copies of the inline function will see the same pointer value and will use the same instance of the constexpr object.
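The pieces above can be put together in a single-file sketch. It is an illustration only and makes two assumptions that are not part of the proposal: the header part and the instantiating part are merged into one file, and static const stands in for constexpr so that the sketch also compiles with pre-C23 compilers. Both forms give unfug internal linkage, which is the property that matters here.

```c
/* Single-file sketch. In a real program the first part would live in a
   header, and the last two declarations in exactly one .c file. */

/* Stand-in for `constexpr unsigned unfug = 1;` (assumption made only so
   that the sketch compiles before C23): the identifier has internal
   linkage in both cases. */
static const unsigned unfug = 1;

/* Pointer object with external linkage, visible to all translation units. */
extern unsigned const*const my_copy;

/* Inline definition: the only file-scope identifier it names is my_copy,
   which has external linkage, so 6.7.5 p3 is not violated. */
inline unsigned getit(void) {
    return *my_copy;
}

/* The translation unit that provides the instantiations. */
unsigned const*const my_copy = &unfug;
extern inline unsigned getit(void);   /* emits the external definition */
```

A call to getit() reads the value through my_copy, so every copy of the inline function ends up at the one unfug object of the instantiating translation unit.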

On the other hand, we think that the following use of constexpr objects should be prohibited:

constexpr unsigned unfug = 1;

inline unsigned const* gotit(void) {
    return &unfug;
}

Here, each translation unit would have a separate instance of the unfug object, and thus gotit returns a different value for each translation unit and the semantics differ.

1.2 Semantics of inline definitions

The new constexpr feature also points to another problem that is currently not addressed by the C standard at all:

Inline definitions with the same name in different translation units could have different semantics.

This could, for example, happen simply because the code uses different program text (e.g. include files) for the different TUs. But even if we imposed that inline definitions with the same name are always composed of the same token sequence, identifiers that are used in such a sequence could refer to different features. Such a different “interpretation” of the inline definition could, for example, arise because some feature macros or enumeration constants are defined differently for the compilation of two separate TUs (already possible in C17) or because constexpr objects have different values (new in C23).

Still, the C standard in several places talks about the function and not about the functions, so it could be argued that differences in semantics between different inline definitions have undefined behavior by omission.

In any case, this is an inherently dangerous property, and it might be time to state explicitly that divergence in the semantics of inline definitions in different TUs is not intended. As far as we can see, the only possibility here is to make the behavior of a program that has diverging inline definitions undefined. We propose to do that constructively, by imposing that all inline definitions already agree on a token level.
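A small sketch may illustrate how identical program text can still diverge. The header name scale.h, the macro SCALE_FACTOR and the function scale are hypothetical names chosen for this illustration and do not appear in the proposal.

```c
/* scale.h, a hypothetical header included by several translation units. */
#ifndef SCALE_FACTOR
#define SCALE_FACTOR 1u   /* default when the build does not set a value */
#endif

/* The program text is the same in every TU, but SCALE_FACTOR may expand
   to a different constant if one TU is compiled with another setting. */
inline unsigned scale(unsigned x) {
    return x * SCALE_FACTOR;
}

/* In exactly one TU: emit the external definition. */
extern inline unsigned scale(unsigned x);
```

If one TU is compiled with the default and another with, say, SCALE_FACTOR defined as 2u on the command line, the two inline definitions of scale compute different results although they were written identically. The same effect can be obtained with an enumeration constant or, in C23, with a constexpr object that has different values in the two TUs.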

1.3 Semantics of inline and external definitions

When it comes to semantic differences between an inline definition and the (unique) external definition, the situation is even less clear. When gcc first introduced its inline feature, the possibility that the inline definition and the external definition could be distinct was even advertised as a feature.

Currently, the C standard only has (at the end of 6.7.5 p7)

It is unspecified whether a call to the function uses the inline definition or the external definition.

So if the inline definition and the external definition have different semantics, the program has indeterminate behavior.

We don’t think that this is a good choice: these identifiers nevertheless have external linkage and should be considered to have the same semantics across the whole program.

So we think that we should also mark such divergence for inline functions with external linkage as undesirable. If we made that undefined behavior, implementations that want to continue to provide this possibility to their customers could still do so as an extension.

2 Questions

  1. Shall we modify the constraints for inline definitions as proposed in n3253?
  2. Shall we modify the semantics for diverging inline definitions in different translation units by imposing undefined behavior as proposed in n3253?
  3. Shall we modify the semantics of diverging inline and external definitions in different translation units by imposing undefined behavior as proposed in n3253?
  4. Shall we modify the semantics of diverging inline and external definitions in different translation units by making the behavior implementation-defined as proposed in n3253?

3 Proposed wording

Removals are in stroke-out red, additions are in underlined green.

Change 6.7.5 p3 and add two footnotes as follows

An inline definition of a function with external linkage shall not contain, anywhere in the tokens making up the function definition,

1) Expressions that evaluate the address of the underlying feature include the address, array subscripting and function call operators.
2) These constraints are intended to ensure that inline definitions that appear in different translation units but are made up from the same tokens have the same semantics.

If the answer to question 2. is affirmative, add a new paragraph after 6.7.5 p7

All inline definitions with the same external name in different translation units that constitute the program shall be made up of the same token sequence, only discarding possible changes in white space. Any identifier that is used in that token sequence shall be in the same name space (label, tag, member, attribute or ordinary) and refer to compatible features. In particular, all used named constants that are declared in file-scope with the same name shall have compatible type and shall have the same value in all translation units.

Additionally, if the answer to question 3. is affirmative, add to that new paragraph:

All inline definitions and an external definition with the same external name in different translation units that constitute the program shall be made up of the same token sequence, only discarding possible changes in white space. Any identifier that is used in that token sequence shall be in the same name space (label, tag, member, attribute or ordinary) and refer to compatible features. In particular, all used named constants that are declared in file-scope with the same name shall have compatible type and shall have the same value in all translation units.

or, alternatively if the answer to question 4. is affirmative, add a sentence at the end of that new paragraph

It is implementation-defined if a similar property for inline definitions and the external definition of the same function holds.