Document Number: N2299
Supersedes: N2208
Submitter: Martin Sebor
Submission Date: September 29, 2018
Subject: Library Functions And Compound Literals, v2

Summary

A discussion of N2145 exposed a general problem with the interaction between function-like macros defined for standard library functions and passing compound literals as arguments to such functions. Specifically, in §7.1.4 Use of library functions C states that:

Any function declared in a header may be additionally implemented as a function-like macro defined in the header, ….

Historically, implementations have made use of this technique to provide more efficient versions of "hot" library functions that avoid the overhead of a function call. The technique predates the introduction of the inline keyword and so is arguably less relevant today, but it is nevertheless still in relatively widespread practice, albeit for a small subset of APIs. Common examples where this approach is still used are the character classification macros defined in the <ctype.h> header and some I/O functions declared in <stdio.h>.

Not all standard library functions are commonly used with compound literals. It makes little sense to call a function like abs that takes an arithmetic argument with a compound literal. The functions that are ideal candidates for compound literals are those that take their arguments by reference without modifying them. Making such functions usable with compound literals is the focus of this paper.

For instance, although this practice in modern implementations is rare, AIX defines memcpy as a macro to force inlining:

	/*
	*   The following macro definitions cause the XLC compiler to inline
	*   these functions whenever possible.
	*/

	#ifndef __cplusplus
	#ifdef __STR__
	…
	#       define memcpy(__s1,__s2,__n) __memcpy(__s1,__s2,__n)
	…
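As a result, the following call to memcpy is rejected as invalid by the XLC compiler, because the preprocessor treats the comma inside the braces of the compound literal as a macro argument separator and so sees four arguments where the macro expects three: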
	struct Pair { int a, b; };
	extern void *p;
	memcpy (p, &(struct Pair){1, 2}, sizeof (struct Pair));
Another example from the standard library involves the asctime function that takes a const struct tm* argument. When the function is implemented as a macro, code like the following is rejected:
	char *str = asctime (&(struct tm){
	  .tm_year = 2018, .tm_mon = 3, .tm_mday = 26, …
	});
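
The failure mode and the two workarounds available to programs today can be reproduced without any particular vendor header. The sketch below is illustrative only: copy is a hypothetical function shadowed by a hypothetical same-named macro, standing in for a library function that a header also defines as a function-like macro.

```c
#include <assert.h>
#include <string.h>

struct Pair { int a, b; };

/* Hypothetical library function; the name `copy` is illustrative only. */
static void *copy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, n);
}

/* Hypothetical header-style macro that shadows the function above. */
#define copy(dst, src, n) copy(dst, src, n)

/* Rejected: the comma inside the braces splits the second argument, so
   the three-parameter macro is passed four arguments:
       copy(&out, &(struct Pair){1, 2}, sizeof out);                    */

/* Workaround 1: parentheses around the argument protect the comma. */
static struct Pair copy_parenthesized_argument(void)
{
    struct Pair out;
    copy(&out, (&(struct Pair){1, 2}), sizeof out);
    return out;
}

/* Workaround 2: parentheses around the name suppress the macro, so the
   underlying function is called directly. */
static struct Pair copy_parenthesized_name(void)
{
    struct Pair out;
    (copy)(&out, &(struct Pair){3, 4}, sizeof out);
    return out;
}
```

Both workarounds require the caller to know, or to assume defensively, that the function may be a macro, which is the portability burden this paper seeks to remove.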

Beyond these original use cases, using macros to define standard library APIs has gained increased relevance with the advent of generic programming in C, specifically with the introduction of the generic selection feature (_Generic) in C11. A number of library functions are specified to take arguments of multiple distinct types and are made to act as "overloads" of the same name. Besides the atomic functions declared in <stdatomic.h> and discussed in N2145, other examples of such "overloaded" APIs include the type-generic math functions defined in <tgmath.h>. Although other techniques for achieving this effect are possible, all existing implementations make use of macros to provide these overloads. Consequently, neither of the calls in the examples below is valid:

	atomic_store_explicit (p, (struct Pair){ 1, 0 }, memory_order_relaxed);   // too many arguments to macro
or
	int e = exp ((struct Pair){ 123, 456 }.a);   // too many arguments to macro
The main difference between the two is that, unlike the math overloads, which are explicitly specified to be implemented using macros, the atomic APIs are specified as functions.

Further Discussion

The resolution originally proposed in N2208 solved the macro problem by effectively preventing conforming implementations from defining the most severely affected library functions as macros in library headers. Implementations would have had the option either to define library functions inline for efficiency, to make use of some non-standard extension to "make it work," or simply to define them out-of-line. All three approaches are well represented in existing practice. In particular, an efficient solution to the "inlining problem" has been available in popular implementations for decades in the form of compiler intrinsics. This is even reflected in the text of the original C89 standard, which states the following in Footnote 87 (the same text is Footnote 157 in C11).

Footnote 87) Because external identifiers and some macro names beginning with an underscore are reserved, implementations may provide special semantics for such names. For example, the identifier _BUILTIN_abs could be used to indicate generation of in-line code for the abs function. Thus, the appropriate header could specify
	#define abs(x) _BUILTIN_abs(x)
for a compiler whose code generator will accept it. In this manner, a user desiring to guarantee that a given library function such as abs will be a genuine function may write
	#undef abs
whether the implementation's header provides a macro implementation of abs or a builtin implementation. The prototype for the function, which precedes and is hidden by any macro definition, is thereby revealed also.
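
The two suppression techniques that §7.1.4 and the footnote describe can be sketched in portable C as follows. The wrapper function names are hypothetical and exist only to make the example self-contained; whether abs is actually a macro in a given <stdlib.h> is implementation-specific, and the code behaves the same either way.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* A parenthesized name is never a function-like macro invocation, so this
   always calls the genuine function. */
static int abs_parenthesized_name(int x)
{
    assert(x != INT_MIN);   /* the result would not be representable */
    return (abs)(x);
}

/* Remove any macro definition the header may have supplied; this is
   harmless if abs was not defined as a macro. */
#undef abs

static int abs_after_undef(int x)
{
    assert(x != INT_MIN);
    return abs(x);
}
```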

It is worth noting that popular implementations commonly recognize names of standard library functions as special and expand calls to them inline even in the absence of macros like the one above, and even without including the corresponding headers. Defining library functions as macros to achieve this effect is not only unnecessary, it is also inferior to the _BUILTIN_abs solution, and certainly to the practice of recognizing standard library function names such as abs as special, because it doesn't help when a program declares the function without including the header.

The resolution proposed in N2208 would also have been in line with C++, which has never allowed implementations to define any library functions as macros. The authors are not aware of any complaints about inefficiencies in C++ implementations as a result. Furthermore, with the availability of C compound literals in popular C++ compilers as an extension, a solution that works equally well in both languages would be desirable.

However, during the April 2018 meeting at Brno the proposed resolution was rejected by WG14 on the basis that inlining library functions is essential for the efficiency of C programs and that the C inline keyword is not suitable for this purpose. The committee didn't want to force compilers to implement extensions to make inlining possible. Instead, viewing compound literals as not sufficiently important to solve the problem, the committee considered it preferable to continue to prevent strictly conforming programs from using compound literals as arguments to library functions. The importance of C/C++ interoperability did not factor into this decision.

We view this outcome as unsatisfactory. We find the committee's disregard for established practice especially unfortunate for the evolution of the language (as mentioned above, mechanisms to solve all problems are readily available and have been used in practice for decades). We also find the decision out of line with the committee's own charter to consider existing (and future) C programs more important than accommodating the limitations of historical implementations.

Proposed Resolution

Unless the committee is willing to reconsider its position on the original proposal submitted in N2208, in an attempt to improve the portability of C programs and the usability of the language, we offer an alternate proposal that does not rely on extensions to make the inline keyword work for this purpose. Instead, we suggest requiring implementations that do define library functions as macros to simply "make them work" with compound literals. Specifically, we propose to modify §7.1.4 Use of library functions, paragraph 1 as follows.

Any function declared in a header may be additionally implemented as a function-like macro defined in the header, so if a library function is declared explicitly when its header is included, one of the techniques shown below can be used to ensure the declaration is not affected by such a macro. A library function implemented as a function-like macro may be invoked with an argument that is a compound literal, or with an address expression whose operand is a compound literal as argument, even if the compound literal contains commas, without enclosing the argument in parentheses. Any macro definition of a function can be suppressed locally by enclosing the name of the function in parentheses, …

Furthermore, add the following example to the end of §7.1.4.

–8–   EXAMPLE   Irrespective of whether it is implemented as a function-like macro defined in the <string.h> header, the function memcpy may be invoked as follows after including the header and with the macro defined.
	extern void *p;
	memcpy (p, (int[]){ 1, 2, 3, 4 }, sizeof (int[4]));
–9–   EXAMPLE   The example above notwithstanding, when the function isalpha is implemented as a function-like macro defined in the <ctype.h> header, the following is not a valid invocation of the macro because the argument to the function contains commas but is not a compound literal.
	struct A { int a, b; };
	isalpha ((struct A){ 'a', 'b' }.a);
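
The proposed wording does not prescribe how an implementation should make its macros accept compound literals. One possible approach, sketched here as an assumption rather than as anything this proposal requires, is to name only the leading parameter and collect the remaining arguments variadically, so that commas inside the braces are forwarded unchanged. The names my_memcpy and my_memcpy_impl are hypothetical.

```c
#include <assert.h>
#include <string.h>

struct Pair { int a, b; };

/* Hypothetical out-of-line entry point standing in for an
   implementation's internal or intrinsic function. */
static void *my_memcpy_impl(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, n);
}

/* Only the first parameter is named. Everything after it, including any
   commas inside a compound literal's braces, arrives in __VA_ARGS__ and
   is pasted back with its commas intact. */
#define my_memcpy(dst, ...) my_memcpy_impl(dst, __VA_ARGS__)

static struct Pair copy_pair_literal(void)
{
    struct Pair out;
    my_memcpy(&out, &(struct Pair){1, 2}, sizeof (struct Pair));
    return out;
}

static int sum_array_literal(void)
{
    int out[4];
    my_memcpy(out, (int[]){ 1, 2, 3, 4 }, sizeof (int[4]));
    return out[0] + out[1] + out[2] + out[3];
}
```

This sketch only handles compound literals that appear after the first argument; a fully variadic macro, or compiler support for the function name itself, would be needed to cover the first argument as well. It also gives no special treatment to the case in EXAMPLE 9, which the proposed wording leaves invalid.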

For a proposal to correct the problem for the atomic functions defined in <stdatomic.h> refer to N2145.