1. Changelog
1.1. Revision 0 - January 7th, 2025
-
Initial release. ✨
2. Introduction and Motivation
There are several strange hiccups and problems with as it concerns constant sized arrays, variable length arrays, , and other (potentially new) feature sets. Aaron Ballman’s [N3260] provided a way to do direct type matching, which eliminated some of these concerns when using a type as the controlling indicator for generic selection. But, this was not enough to stop the few situations where implementation quirks were exposed and the inadequacies of type compatibility for a compile-time, zero run-time cost feature that was meant to be.
Additionally, the integration of Martin Uecker’s [N3348] has let an existing Undefined Behavior leak into by allow Variable-Length Arrays (VLAs) of non-constant size match with arrays of both constant and non-constant sizes (including ones whose size does not match the input, perhaps at multiple dimensions). The problem was acknowledged during the integration of [N3348] by WG14, but it was voted in with the promise that a paper would provide a better array matching capability later on.
There was some degree of agreement to pursue a similar-but-differently-specified approach to matching that was contained in [N3441]. Unfortunately, a revision of this paper was not ready for WG14 by the previous meeting, and therefore it was skipped over and WG14 simply settled on Martin Uecker’s [N3348]. At the meeting, it was voiced that if it was possible, another paper doing better matching should be brought forward, and one of us (JeanHeyd) volunteered to produce a paper that was in line with Martin Uecker’s design in [N3348].
The core approach for multidimensional matching and stricter matching from our [N3441] is still possible even after the integration of Martin Uecker’s tweaks, repurposing the use of and in multidimensional matching to be more in line with what Martin Uecker designed and what WG14 has accepted in the current Working Draft of the C Standard. This paper sets out performing that behavior and extending the concept even to , to aid with improved matching.
2.1. Unusual Array Behavior
Consider the following:
int main () { int arr [ 10 ] = {}; int result = _Generic ( typeof ( arr ), int [ 10 ] : 0 , int [ 11 ] : 1 , int [ 20 ] : 2 , default : 3 ); return result ; }
This works just fine: constant sized arrays are considered compatible only with arrays of the same type and constant size. The above programs compiles, runs, returns dependably, and exits. Consider the same program, however, with a variable length array for the controlling type of the generic selection expression:
int main () { int n = 20 ; int vla [ n ] = {}; int result = _Generic ( typeof ( vla ), int [ 10 ] : 0 , int [ 11 ] : 1 , int [ 20 ] : 2 , default : 3 ); return result ; }
This program is fine even with the thanks to the latest changes in the C Working Draft ([N3348]). The problem is after that, as every non-default branch of this generic selection is considered a match. Every variable length array is compatible with every other kind of array insofar as the compatibility rules are concerned. This provokes a constraint violation after the whole , in that only one branch of a generic selection expression may match the controlling type (or none, in which case there must be a default branch).
[N3348] exacerbated this problem by not only leaving the compatibility rules around this matter unresolved, but introducing variable length array types as a plausible branch for generic selection by stripping the constraint out:
int main () { int n = 20 ; int vla [ n ] = {}; int result = _Generic ( typeof ( vla ), int [ 20 ] : 0 , // VLA... matches? int [ 11 ] : 1 , int [ 10 ] : 2 , int [ n ] : 3 , // VLA matches?? Double-compatibility issue?? default : 4 ); return result ; }
Unfortunately, this too results in the same problem: all of these branches are considered compatible with one another under the changes and direction that [N3348] added to the C standard around August 2025. Even if one went back to matching on constant sized arrays for the controlling type, this code would still not compile because the VLA branch is considered compatible with all of the other type branches in the list: the compiler would reject the code still as no two generic selection branches may contain a compatible type, either:
int main () { int n = 20 ; int arr [ 20 ] = {}; int result = _Generic ( typeof ( arr ), int [ 10 ] : 0 , int [ 11 ] : 1 , int [ 20 ] : 4 , int [ n ] : 3 , // compiler error: non-distinct branch from // every other generic selection branch default : 4 ); return result ; }
This continues to deteriorate when using to match "any-sized integer array"; both of them are compatible with one another and they both match on arrays that are either variable length arrays or constant sized arrays. Nominally, this might not be a problem, except there is further issue: the compatibility rules themselves have Bad Behavior on them even if you strip out all of the compatible match branches and only have one:
int main () { int n = 10 ; int arr [ n ] = {}; // variable length array int result = _Generic ( typeof ( arr ), int [ 11 ] : 0 , // this matches for some reason??? default : 1 ); return result ; }
This program returns 0, which makes no sense from the perspective of , which is a translation time-capable matching algorithm. The sizes do not match, but because we defined this in terms of compatibility (and all constant sized and variable length arrays are compatible with one another) we have introduced undefined behavior here. Worse, this gives the impression that the array has the size when it clearly does not. This is easily spotted in these simple toy programs, but is far less usable when applied to larger programs with much more complex control flow and no ahead-of-time knowable constant values.
The acceptance of [N3348] made this situation slightly worse by allowing variable length arrays to be put inside of as well, leading to a situation where variable length arrays can easily match array types that are not of the same length.
int main () { int n = 10 ; int arr [ 20 ] = {}; int result = _Generic ( typeof ( arr ), int [ n ] : 0 , // fixed-size arrays now match against any variable length array default : 1 ); return result ; }
Given these issues, we propose a general overhaul and a new phrase-of-power that we are going to term strict compatibility, that would be applied within generic selection context. The goal is to remove the recently introduced undefined behavior in generic selection and bring better type matching.
3. Design
This paper is, exclusively, focusing on improvements to . In particular, we:
-
make it so
can match only other fixedT [ CONSTANT ] -
make it so
can match specifically arrays of fixed size or other incomplete arrays inT [] ;_Generic -
make it so
can match all types of array sizes inT [ * ] (as is the case after [N3348]);_Generic -
make it so using both
andT [] means that fixed-size arrays are matched byT [ * ] and VLAs are matched byT [] , if they are both present (preferentially matchingT [ * ] overT [] ).T [ * ]
We do this by defining "strict compatibility" first and foremost, which more precisely allows Known Constant Length Arrays, Incomplete Arrays, and Variable Length Arrays to match an array dimension with a size specification of . allows Known Constant Length Arrays and Incomplete Arrays, but not Variable Length Arrays. precisely matches , where the two constants compute the same integer constant expression.
Then, we define a second algorithm in the Generic Selection section of the standard that ranks certain kinds of array matches higher than others based on the use of , /, and . The algorithm effectively states that, provided all of the strictly compatible associations are all derived from pointer or array types, we count how many constant, unknown size, and "any" (variable length or ) matches there are. The one with the highest "constant" dimensions wins, and if not that then the one with the highest "unknown size" dimensions win, and then finally the one with the highest "any" dimension matches wins. If any are equal and we run out of discriminators, then too many are compatible and it results in an error. The reason it is important to count is because you want something that looks like this:
int main () { return _Generic ( int [ 2 ][ 2 ], int [][ 2 ] : 0 , int [ 2 ][] : 1 ); }
to be an error, while this:
int main () { return _Generic ( int [ 2 ][ 2 ], int [ 2 ][ 2 ] : 0 , int [ 2 ][] : 1 ); }
should return . It also makes it so binds "last" since it binds to any given dimension, while either binds to itself or a constant:
int main () { return _Generic ( int [ 2 ][ 2 ], int [ 2 ][ * ] : 1 int [ 2 ][] : 0 ) + _Generic ( int [ 2 ][ 2 ], int [ * ][ * ] : 1 int [ * ][] : 0 ); // equivalent to return 0 + 0; }
Obviously, and are considered different for the purposes of the algorithm as well (they have to compute the same known constant value in order to match), which finally gives up proper separation of array types in .
3.1. Undefined Behavior / Perceived Value Mismatch
This paper eliminates the undefined behavior of matching constant-sized arrays or incomplete array to variable-length arrays and vice-versa. The case of undefined behavior / wrongful perception that it does not fix that was introduced by [N3348] is where a VLA is matched to another VLA of the wrong size:
int main () { int n = foo (); int m = n + 1 ; _Generic ( int [ n ][ m ], int [ n ][ m ] : 1 , // dimensions are swappe (incorrect), but // generic selection still picks this one ); }
This is because the variable length array types are all substituted with , such that the compiler reads it as:
int main () { int n = foo (); int m = n + 1 ; _Generic ( int [ n ][ m ], int [ * ][ * ] : 1 , // all variable length dimensions are // replaced with [*], so technically fine even if // visually misleading ); }
This can be problematic for reading purposes and for other uses. But, this 'cat" is already "out of the bag", so to speak, since the changes in [N3348]. There’s no real fix for this, unfortunately, other than just going back to banning the use of non- variable length arrays in the type names of generic associations. But we removed that constraint, so this paper simply leaves this one untouched case of seeming undefined behavior in place, even though by-the-specification the replacement makes it somewhat "better'.
3.2. Array Usage Examples: Constant Sized Input
Here is an example of expected behavior from matching on a constant sized array with the whole gamut of different types deployed:
#include <assert.h>int main () { int arr [ 10 ] = {}; int result_constants = _Generic ( typeof ( arr ), int [ 10 ] : 0 , int [ 11 ] : 1 , int [ 20 ] : 2 , default : 3 ); assert ( result_constants == 0 ); int result_constant_and_incomplete = _Generic ( typeof ( arr ), int [ 10 ] : 0 , int [] : 1 , default : 2 ); assert ( result_constant_and_incomplete == 0 ); int result_incomplete = _Generic ( typeof ( arr ), int [] : 0 , default : 1 ); assert ( result_incomplete == 0 ); int result_incomplete_and_vla = _Generic ( typeof ( arr ), int [] : 0 , int [ * ] : 1 , default : 2 ); assert ( result_incomplete_and_vla == 0 ); int result_incomplete_constant_and_vla = _Generic ( typeof ( arr ), int [ 10 ] : 0 , int [] : 1 , int [ * ] : 2 , default : 3 ); assert ( result_incomplete_constant_and_vla == 0 ); return 0 ; }
3.3. Array Usage Examples: Variable Length Input
Here is a similar example, but with the input array being a VLA:
#include <assert.h>int main () { int n = 10 ; int vla [ n ] = {}; int vla_result_constants = _Generic ( typeof ( vla ), int [ 10 ] : 1 , int [ 11 ] : 2 , int [ 20 ] : 3 , default : 0 ); assert ( vla_result_constants == 0 ); int vla_result_constant_and_incomplete = _Generic ( typeof ( vla ), int [ 10 ] : 1 , int [] : 0 , default : 2 ); assert ( vla_result_constant_and_incomplete == 0 ); int vla_result_incomplete = _Generic ( typeof ( vla ), int [] : 1 , default : 0 ); assert ( vla_result_incomplete == 0 ); int vla_result_incomplete_and_vla = _Generic ( typeof ( vla ), int [] : 1 , int [ * ] : 0 , default : 2 ); assert ( vla_result_incomplete_and_vla == 0 ); int vla_result_incomplete_constant_and_vla = _Generic ( typeof ( vla ), int [ 10 ] : 1 , int [] : 2 , int [ * ] : 0 , default : 3 ); assert ( vla_result_incomplete_constant_and_vla == 0 ); return 0 ; }
4. Prior Art
There is no prior art for this. We intend to gauge Committee reaction since this was supported by commentary during the integration of Martin Uecker’s [N3348].
5. Wording
The wording is relative to the latest Working Draft.
5.1. Intent
The intent of this wording is to provide:
-
add a new definition for "strict compatible"/"strict compatibility" types;
-
and, modify Generic Selection to use this new "strict compatibility" for types, with a lexicographic ordering algorithm for array matching.
5.2. Modify §6.2.7 "Compatible type and composite type"
6.2.7 Compatible type and composite typeTwo types are compatible types if they are the same. Additional rules for determining whether two types are compatible are described in 6.7.3 for type specifiers, in 6.7.4 for type qualifiers, and in 6.7.7 for certain declarators. Two pointer types are compatible if both are identically qualified and both are pointers to compatible types. Two array types are compatible if and only if both of the following hold:
If the two array types are used in a context which requires them to be compatible, the behavior is undefined if the lengths of both are specified and the corresponding array length expressions evaluate to unequal values. Moreover, two complete structure, union, or enumerated types declared with the same tag are compatible if members satisfy the following requirements:
- they have compatible element types;
- and, if both array length expressions are present, and are integer constant expressions, then they same constant value.
there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types;
if one member of the pair is declared with an alignment specifier, the other is declared with an equivalent alignment specifier;
and, if one member of the pair is declared with a name, the other is declared with the same name.
For two structures, corresponding members shall be declared in the same order. For two unions declared in the same translation unit, corresponding members shall be declared in the same order. For two structures or unions, corresponding bit-fields shall have the same widths. For two enumerations, corresponding members shall have the same values; if one has a fixed underlying type, then the other shall have a compatible fixed underlying type. For determining type compatibility, anonymous structures and unions are considered a regular member of the containing structure or union type, and the type of an anonymous structure or union is considered compatible with the type of another anonymous structure or union, respectively, if their members fulfill the preceding requirements.
Furthermore, two structure, union, or enumerated types declared in separate translation units are compatible in the following cases:
both are declared without tags and they fulfill the preceding requirements;
both have the same tag and are completed somewhere in their respective translation units and they fulfill the preceding requirements;
both have the same tag and at least one of the two types is not completed in its translation unit.
Otherwise, the structure, union, or enumerated types are incompatible.
Two types are strictly compatible if, in certain cases, they meet more stringent requirements than just being compatible. Two pointer types are strictly compatible if the referenced types are strictly compatible and both are identically qualified. For two array types:
if they both have a known constant length, then the two types are strictly compatible if the constant lengths are the same and the element types are strictly compatible;
if they both have an unknown size, then the two types are strictly compatible if the element types are strictly compatible;
if they both have a length expression of
, then the two types are strictly compatible if the element types are strictly compatible;* if one has a known constant length and the other has unknown size, then the two types are strictly compatible if the element types are strictly compatible;
if one has a non-constant length and the other has unknown size, then the two types are strictly compatible if the element types are strictly compatible;
or, if one has a non-constant length and the other has an array length expression of
, then the two types are strictly compatible if the element types are strictly compatible.* Otherwise, the two array types are not strictly compatible. Other types are strictly compatible if they are compatible.
All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
...
5.3. Modify §6.5.2.1 "Generic selection"
6.5.2.1 Generic selectionSyntax...ConstraintsA generic selection shall have no more than one
generic association. No two generic associations in the same generic selection shall specify strictly compatible types. If the generic controlling operand is an assignment expression, the controlling type of the generic selection expression is the type of the assignment expression as if it had undergone an lvalue conversion, array to pointer conversion, or function to pointer conversion. Otherwise, the controlling type of the generic selection expression is the type designated by the type name.default The controlling type shall be compatible with at most one of the types named in the generic association list. If a generic selection has noOne generic association shall be the selected association as described further in this subclause, unless there is ageneric association, its controlling type shall be compatible with exactly one of the types named in its generic association list.default generic association in the generic association list, in which case there may be one or zero selected associations.default SemanticsLet determine-association(
,T control ) be the algorithm which takes two strictly compatible typesT assoc andT control as parameters. It computes an ordered triplet of integer values as the result, with each value of the triplet called (T assoc ,N constant ,N unknown ). Each value of the ordered tripletN any ,N constant , andN unknown that make up the result of the algorithm starts at 0, and then is modified as described in the following.N any
If
is a pointer type, then the current result is added to the result of determine-association utilized onT assoc andT control ’s referenced type and that is yielded as the result.T assoc Otherwise, if
is an array type, then the values of the triplet are added to the result of determine-association utilized on with the element types ofT assoc andT control . Then, the values are increased by the following before the algorithm then yields that as the result:T assoc
if
has a known constant length andT control has a known constant length, increaseT assoc by 1;N constant if
has unknown size andT control has either a known constant length or unknown size, increaseT assoc by 1;N unknown or, if
orT control has a size ofT assoc or a non-constant, variable length, increase* by one.N any Otherwise, if
is not an array type or a pointer type, the algorithm yields the current value of the triplet (T assoc ,N constant ,N unknown ).N any NOTE The following table shows sample invocations and utilizations of determine-association.
T control T assoc determine-association( ,T control )T assoc int int (0, 0, 0) int [] int [] (0, 1, 0) int [ 20 ] int [ 20 ] (1, 0, 0) int [ 20 ] int [] (0, 1, 0) int [ 20 ] int [ * ] (0, 0, 1) int [ * ] int [ 20 ] (0, 0, 1) int [] int [ 20 ] (0, 1, 0) int [ * ][] int [ * ][ * ] (0, 0, 2) int [ * ][] int [][ * ] (0, 0, 2) int [ * ][] int [ * ][] (0, 1, 1) int ( * )[ 20 ] int ( * )[] (0, 1, 0) int ( * )[ 20 ][ 20 ] int ( * )[ * ][ 20 ] (1, 0, 1) int [ 20 ][ 20 ][ 20 ] int [ 20 ][ 20 ][ 20 ] (3, 0, 0) int [ 20 ][ 20 ][ 20 ] int [ 20 ][ * ][ * ] (1, 0, 2) int [ 20 ][ 20 ][ 20 ] int [ 20 ][][ 20 ] (2, 1, 0) int [ 20 ][ 20 ][ 20 ] int [][][ 20 ] (1, 2, 0) Let the controlling type be "
" and the generic association’s type name be "T control ". Let "compatible associations" be the generic associations whereT assoc is strictly compatible withT control .T assoc
If there are zero compatible associations and there is no
generic association, a constraint is violated.default Otherwise, if there are zero compatible associations and there is a
generic association, that is the selected association.default Otherwise, if there is only one compatible association, that is the selected association.
Otherwise, there are two or more compatible associations. Each compatible association whose type name is
is used to produce a set of ordered integer triplets by invoking the algorithm determine-association(T compat ,T control ), whereT compat andT control are the arguments. The selected association is the compatible association with the ordered triplet that is lexicographically greatest, nonzero and unique.✨FN) If any of the resulting triplets is equivalent to (T compat ,0 ,0 ), then there are multiple compatible associations where one or more are neither pointer type derivations or array type derivations and a constraint has been violated. If there is not a unique, lexicographically greatest ordered triplet, then there are multiple compatible associations and a constraint is violated.0 ✨FN) (1, 0, 0) is greater than (0, 1, 0) and (2, 0, 0) is greater than (1, 500, 42).The generic controlling operand is not evaluated. If a generic selection has a selected association
generic association with a type name that is compatible with the controlling type, then the result expression of the generic selection is the expression in that generic association. Otherwise, the result expression of the generic selection is the expression in thegeneric association. None of the expressions from any other generic association of the generic selection is evaluated.default The type and value of a generic selection are identical to those of its result expression. It is an lvalue, a function designator, or a void expression if its result expression is, respectively, an lvalue, a function designator, or a void expression.
EXAMPLE A cbrt type-generic macro can be implemented as follows:#define cbrt(X) _Generic((X), \ long double: cbrtl, \ default: cbrt, \ float: cbrtf \ )(X) EXAMPLE The following two generic selection expressions select different associations because the assign- ment expression operand undergoes lvalue conversion while the type name operand is unchanged:int func ( const int i ) { return _Generic ( i , int : 0 , // int is selected const int : 1 , default : 2 ) + _Generic ( typeof ( i ), int : 0 , const int : 1 , // const int is selected default : 2 ); } Thus, this function returns 1.
EXAMPLE The following generic selection expressions are valid and all evaluate to 1.void foo ( int n , int m ) { _Generic ( int [ 3 ][ 2 ], int [ 3 ][ * ] : 1 , int [ 2 ][ * ] : 0 ); _Generic ( int [ 3 ][ 2 ], int [ * ][ 2 ] : 1 , int [ * ][ 3 ] : 0 ); _Generic ( int [ 3 ][ n ], int [ 3 ][ * ] : 1 , int [ 2 ][ * ] : 0 ); _Generic ( int [ n ][ m ], int [ * ][ * ] : 1 , char [ * ][ * ] : 0 ); _Generic ( int ( * )[ 2 ], int ( * )[ * ] : 1 ); } EXAMPLE Generic selection can match multidimensional arrays by using incomplete array types or variable length array types.int main () { int result = _Generic ( int [ 20 ][ 10 ], int [][ * ] : 0 , default : 1 ); return result ; // return 0 } Constant array types are associated with other constant array types before they are associated with incomplete array types.
int main () { static_assert ( _Generic ( int [ 20 ][ 10 ], int [][] : 1 , default : 0 )) static_assert ( _Generic ( int [ 20 ][ 10 ], int [][] : 0 , int [ 20 ][] : 1 , default : 0 )); static_assert ( _Generic ( int [ 20 ][ 10 ], int [][] : 0 , int [ 20 ][] : 0 , int [ 20 ][ 10 ] : 1 , default : 0 )); } Multiple types of equivalent compatible association with the controlling operand is a constraint violation.
int main () { static_assert ( _Generic ( int [ 20 ][ 10 ], int [][] : 0 , int [][ 10 ] : 1 , int [][ * ] : 0 , default : 0 )); static_assert ( _Generic ( int [ 20 ][ 10 ], int [ * ][] : 0 , int [][] : 1 , )); static_assert ( _Generic ( int [][], int [ * ][] : 0 , int [][] : 1 , )); static_assert ( _Generic ( int [][], int [][] : 1 , int [ 56 ][] : 0 , )); _Generic ( int , int : 1 , int : 1 , // constraint violation ); _Generic ( int [ 20 ][ 10 ], int [ 20 ][] : 1 , int [ 20 ][] : 1 , // constraint violation ); _Generic ( int [ 20 ][ 10 ], int [][] : 1 , int [][] : 1 , // constraint violation ); _Generic ( int [ 20 ][ 10 ], int [ * ][] : 1 , int [ * ][] : 1 , // constraint violation ); _Generic ( int [ 20 ][ 10 ], int [ 20 ][] : 1 , int [][ 10 ] : 1 , // constraint violation ); } EXAMPLE A variable length array is associated with variable length array types before incomplete array types.
extern int n ; int main () { int vla [ n ] = {}; static_assert ( _Generic ( typeof ( vla ), int [ 10 ] : 0 , int [ 11 ] : 0 , int [ 20 ] : 0 , default : 1 )); static_assert ( _Generic ( typeof ( vla ), int [ 10 ] : 0 , int [] : 1 , default : 0 )); static_assert ( _Generic ( typeof ( vla ), int [] : 0 , default : 1 )); static_assert ( _Generic ( typeof ( vla ), int [] : 0 , int [ * ] : 1 , default : 0 )); static_assert ( _Generic ( typeof ( vla ), int [ 10 ] : 0 , int [] : 0 , int [ * ] : 1 , default : 0 )); } int n = 10 ;
5.4. Delete compatibility requirements from §6.7.7.2 "Pointer declarators" to move them to Compatible/composite types section
2 For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
5.5. Modify §6.7.7.3 "Array declarators", moving compatible requirements to Compatible/composite types section and allowing multiple incomplete array type specifications in generic selection
6.5.2.1 Generic selectionConstraintsIn addition to optional type qualifiers and the keyword
, thestatic and[ can delimit an expression or] . If they delimit an expression, called the array length expression, the expression shall have an integer type. If the expression is a constant expression, it shall have a value greater than zero. The element type shall not be* an incomplete or function type.a function type. The element type shall not be an incomplete type, except when used as part of the potentially nested sequence of declarators or abstract declarators for the controlling type of a generic selection or as part of the type name for a generic association. The optional type qualifiers and the keywordshall appear only in a declaration of a function parameter with an array type, and then only in the outermost array type derivation.static ...Semantics......Two array types are compatible if and only if both of the following hold:
They have compatible element types.
If both array length expressions are present, and are integer constant expressions, then they same constant value.
If the two array types are used in a context which requires them to be compatible, the behavior is undefined if the lengths of both are specified and the corresponding array length expressions evaluate to unequal values.