N2904
Enhancements to Enumerations

Published Proposal,

Previous Revisions:
N2575 (r2), N2533 (r1), N2008 (r0)
Authors:
Clive Pygott (LDRA Ltd.)
Paper Source:
GitHub
Issue Tracking:
GitHub
Project:
ISO/IEC JTC1/SC22/WG14 9899: Programming Language — C
Proposal Category:
Feature Request
Target:
General Developers, ABI Lovers, Embedded Systems Developers

Abstract

Enumerations should have the ability to specify the underlying type to aid in portability and usability across platforms, across ABIs, and across languages (for serialization and similar purposes).

1. Changelog

1.1. Revision 3 - January 1st, 2022

1.2. Revision 2 - October 4th, 2020

1.3. Revision 1 - June 28th, 2020

1.4. Revision 0 - February 17th, 2016

2. Introduction and Motivation

C normally tries to pick int for its enumerations, but it’s entirely unspecified what the type for the enum will end up being. Its constants (and the initializers for those constants) are always treated as ints, which is not very helpful for individuals who want to use things like enumerations in their bitfields with specific kinds of properties. This means it’s impossible to portably define an enumeration, which drastically decreases its usefulness and makes it harder to rely on enumeration values (and consequently, their type) in standard C code. This has led to a number of communities and tools attempting to do enumerations differently in several languages, or in the case of C++ simply enhancing enumerations with specific features to make them both portable and dependable.
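To make the problem concrete, here is a minimal sketch in plain C11 that does not use the proposed feature. The opcode enumeration and the helper names are hypothetical, chosen only for illustration. It shows that the constants are ints, that the size of the enumerated type is left to the implementation, and that portable code ends up narrowing by hand:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire-protocol opcodes, used only for illustration. */
enum opcode { OP_NOP, OP_LOAD, OP_STORE };

/* Enumeration constants of a plain enumeration have type int. */
int constant_is_int(void) {
	return _Generic(OP_LOAD, int: 1, default: 0);
}

/* The enumerated type itself is implementation-defined, so this size
   may differ between compilers and options (e.g. GCC's -fshort-enums). */
size_t opcode_size(void) {
	return sizeof(enum opcode);
}

/* Portable code therefore picks the stored width by hand. */
uint8_t encode_opcode(enum opcode op) {
	return (uint8_t)op;
}

enum opcode decode_opcode(uint8_t byte) {
	assert(byte <= OP_STORE); /* reject values that name no opcode */
	return (enum opcode)byte;
}
```

Nothing in the declaration of enum opcode records that one byte is the intended width; that knowledge lives only in the helper functions.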

This proposal provides an underlying enumeration type, specified after a colon following the identifier for the enumeration name, to give the enumeration a dependable type. It makes the types for each of the enumeration constants the same as the specified underlying type, while leaving the current enumerations as unspecified as they were in their old iterations. It does not attempt to solve problems outside the scope of making constants with a specified underlying type dependable and making forward declarations of enumerations work across implementations.
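As a rough sketch of the forward-declaration use case, the fragment below declares an enumeration with a fixed underlying type before its enumerators are known and takes its size right away. The names log_level and is_error are hypothetical. The HAS_FIXED_ENUM guard is our own assumption about compiler support (C23 mode, GCC 13 or later, or Clang 8 or later as an extension); older compilers take a fallback path that uses an ordinary enumeration.

```c
#include <assert.h>

/* Assumption: fixed underlying types are available in C23 mode, in GCC 13+,
   and as an extension in Clang 8+. Other compilers use the fallback. */
#if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L) || \
    (defined(__clang__) && __clang_major__ >= 8) || \
    (defined(__GNUC__) && __GNUC__ >= 13)
#define HAS_FIXED_ENUM 1
#else
#define HAS_FIXED_ENUM 0
#endif

#if HAS_FIXED_ENUM
/* Forward declaration: the underlying type is known, so the type is
   complete even though no enumerators have been listed yet. */
enum log_level : unsigned char;

int is_error(enum log_level level);
static_assert(sizeof(enum log_level) == sizeof(unsigned char),
              "size is known before the definition");

/* The definition repeats the same underlying type. */
enum log_level : unsigned char { LOG_DEBUG, LOG_INFO, LOG_ERROR };
#else
/* Fallback: an ordinary enumeration, complete only after its closing brace. */
enum log_level { LOG_DEBUG, LOG_INFO, LOG_ERROR };
#endif

int is_error(enum log_level level) {
	return level == LOG_ERROR;
}
```

A header could carry only the forward declaration and the prototype, leaving the enumerator list to another file.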

3. Prior Art

C++ has this as a feature for its enumerations. Certain C compilers have this as an extension in their C compilation modes specifically, including Clang.

4. Design

The design of this feature follows C++'s syntax for both compatibility reasons and because the design is genuinely simple and useful:

enum a : unsigned long long {
	a0 = 0xFFFFFFFFFFFFFFFFULL
	// ^ not a constraint violation with a 64-bit unsigned long long
};

Furthermore, the type of a0 is specified to be unsigned long long, such that this program:

enum a : unsigned long long {
	a0 = 0xFFFFFFFFFFFFFFFFULL
};

int main () {
	return _Generic(a0, unsigned long long: 0, default: 1);
}

exits with a return value of 0. Note that because this change is entirely opt-in, no previous code is impacted, and code that was originally a syntax violation becomes well-formed with the same semantics as its C++ counterpart. The interesting component of this proposal - that is currently marked optional - addresses a separate issue found in the current enumeration specification.
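One place where this design could pay off is a flag set wider than int. The following is a sketch under stated assumptions, not part of the proposed wording: the perm names and has_perm are hypothetical, and the HAS_FIXED_ENUM guard encodes our assumption about compiler support (C23 mode, GCC 13 or later, or Clang 8 or later as an extension). Compilers outside that set fall back to macros over uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: fixed underlying types are available in C23 mode, in GCC 13+,
   and as an extension in Clang 8+. Other compilers use the fallback. */
#if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L) || \
    (defined(__clang__) && __clang_major__ >= 8) || \
    (defined(__GNUC__) && __GNUC__ >= 13)
#define HAS_FIXED_ENUM 1
#else
#define HAS_FIXED_ENUM 0
#endif

#if HAS_FIXED_ENUM
/* Hypothetical permission flags; the top bit does not fit in an int. */
enum perm : uint64_t {
	PERM_READ  = UINT64_C(1) << 0,
	PERM_WRITE = UINT64_C(1) << 1,
	PERM_ADMIN = UINT64_C(1) << 63
};
/* Each constant is as wide as the underlying type, so nothing is truncated. */
static_assert(sizeof(PERM_READ) == sizeof(uint64_t), "constants are 64 bits wide");
typedef enum perm perm_t;
#else
/* Fallback for older compilers: plain macros over uint64_t. */
typedef uint64_t perm_t;
#define PERM_READ  (UINT64_C(1) << 0)
#define PERM_WRITE (UINT64_C(1) << 1)
#define PERM_ADMIN (UINT64_C(1) << 63)
#endif

/* Returns 1 if every bit of `wanted` is set in `granted`, 0 otherwise. */
int has_perm(uint64_t granted, perm_t wanted) {
	return (granted & (uint64_t)wanted) == (uint64_t)wanted;
}
```

The fallback branch is roughly what code has to do today, which loses the grouping and naming that the enumeration provides.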

4.1. Bit-Precise Integer Types and _Bool?

Integers such as _BitInt(31) are, currently, allowed as an extension for an underlying enumeration type in Clang. However, discussing this with the Clang implementers, there was sentiment that this just "happened to work" and was not a fully planned part of the _BitInt/_ExtInt integration plan. They proposed that they would implement a diagnostic for it for future versions of Clang. In the standard, we do not want to step on the toes of anyone who may want to develop extensions in this place, especially when it comes to whether or not bit-precise enumeration types undergo integer promotion or follow the same rules for enumeration constants and similar. Therefore, we exclude them as usable types at this time.

We do not exclude _Bool from the possible set of types. It is allowed in C++ (as just bool) and other C extensions, and it allows for an API to provide mnemonic or otherwise fitting names for binary choices without needing to resort to a bit-field of a particular type. This provides a tangible benefit to code. Values outside of true or false can be diagnosed with an error or warning when creating a _Bool enumeration, but that is a quality-of-implementation decision.
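A small sketch of the "mnemonic names for binary choices" idea follows. The overwrite_mode enumeration and should_write are hypothetical names, and the HAS_FIXED_ENUM guard reflects our assumption about compiler support (C23 mode, GCC 13 or later, or Clang 8 or later as an extension), with an ordinary enumeration as the fallback.

```c
#include <assert.h>

/* Assumption: fixed underlying types are available in C23 mode, in GCC 13+,
   and as an extension in Clang 8+. Other compilers use the fallback. */
#if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L) || \
    (defined(__clang__) && __clang_major__ >= 8) || \
    (defined(__GNUC__) && __GNUC__ >= 13)
#define HAS_FIXED_ENUM 1
#else
#define HAS_FIXED_ENUM 0
#endif

#if HAS_FIXED_ENUM
/* Two named choices stored in an object the size of a _Bool. */
enum overwrite_mode : _Bool { KEEP_EXISTING = 0, REPLACE_EXISTING = 1 };
static_assert(sizeof(enum overwrite_mode) == sizeof(_Bool), "same size as _Bool");
#else
/* Fallback: same names, implementation-defined size. */
enum overwrite_mode { KEEP_EXISTING = 0, REPLACE_EXISTING = 1 };
#endif

/* Returns 1 if the file should be written, 0 otherwise. */
int should_write(int file_exists, enum overwrite_mode mode) {
	return !file_exists || mode == REPLACE_EXISTING;
}
```

A call such as should_write(1, REPLACE_EXISTING) states its intent at the call site, where a bare true or false would not.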

5. Proposed Wording

The following wording is relative to N2731.

5.1. Intent

The intent of the wording is to provide the ability to express enumerations with the underlying type present. In particular:

5.2. Proposed Specification

5.2.1. Modify Section §6.2.7 Compatible type and composite type, paragraph 1

… Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: if one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: … For two enumerations, corresponding members shall have the same values and their underlying types shall be compatible types.

5.2.2. Modify Section §6.4.4.3 Enumeration constants

6.4.4.3   Enumeration constants

Syntax

enumeration-constant:
	identifier

Semantics

An identifier declared as an enumeration constant for an enumeration without fixed underlying type has type int. An identifier declared as an enumeration constant for an enumeration with fixed underlying type has that underlying type during the specification of the enumeration type (i.e., from the start of the opening brace in the enum-specifier).
An identifier declared as an enumeration constant for an enumeration without fixed underlying type has an implementation-defined type wide enough for its value until the type is complete (i.e., until the closing brace in the enum-specifier). After the type is complete, the enumeration constant for an enumeration without fixed underlying type has an implementation-defined type large enough to hold all of its members.
An enumeration constant may be used in an expression wherever a value of standard or extended integer type may be used. It has the underlying type of the enumeration.

Forward references: enumeration specifiers (6.7.2.2).

5.2.3. Modify Section §6.7.2.2 Enumeration specifiers

6.7.2.2 Enumeration specifiers

Syntax

enum-specifier:
	enum attribute-specifier-sequence_opt identifier_opt enum-type-specifier_opt { enumerator-list }
	enum attribute-specifier-sequence_opt identifier_opt enum-type-specifier_opt { enumerator-list , }
	enum identifier
enumerator-list:
	enumerator
	enumerator-list , enumerator
enumerator:
	enumeration-constant attribute-specifier-sequence_opt
	enumeration-constant attribute-specifier-sequence_opt = constant-expression
enum-type-specifier:
	: specifier-qualifier-list

All enumerations have an underlying type. The underlying type can be explicitly specified using an enum-type-specifier and such an underlying type is its fixed underlying type.

Constraints

The type specifiers in the enum type specifier’s specifier-qualifier-list shall specify an integer type that is neither an enumerated type nor a bit-precise integer type. No alignment specifiers shall appear in the specifier qualifier list. The underlying type of the enumeration is the unqualified, non-atomic version of the type specified by the type specifiers in the specifier qualifier list.
The expression that defines the value of an enumeration constant of an enumeration with a fixed underlying type shall have a value representable as that fixed underlying type.
For an enumeration without a fixed underlying type, the expression that defines the value of an enumeration constant shall be an integer constant expression that has a value representable as an int.
If an enum type specifier is present, then the longest possible sequence of tokens that can be interpreted as a specifier qualifier list is interpreted as part of the enum type specifier.
If any declaration-specifiers (6.7) contains an enum specifier with an enum type specifier to provide a fixed underlying type, it shall be followed by an opening brace {, enumerator list, and closing brace } when there is an immediately following declarator (6.7.6) or abstract-declarator (6.7.7).

Semantics

The optional attribute specifier sequence in the enum specifier appertains to the enumeration; the attributes in that attribute specifier sequence are thereafter considered attributes of the enumeration whenever it is named. The optional attribute specifier sequence in the enumerator appertains to that enumerator.
The identifiers in an enumerator list of an enumeration without fixed underlying type are declared as constants that have type int. The identifiers in an enumerator list of an enumeration with fixed underlying type are declared as constants whose types are the same as the fixed underlying type. They may appear wherever such are permitted.133) An enumerator with = defines its enumeration constant as the value of the constant expression. If the first enumerator has no =, the value of its enumeration constant is 0. Each subsequent enumerator with no = defines its enumeration constant as the value of the constant expression obtained by adding 1 to the value of the previous enumeration constant. (The use of enumerators with = may produce enumeration constants with values that duplicate other values in the same enumeration.) The enumerators of an enumeration are also known as its members.
For all enumerations without fixed underlying type, each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type (excluding the bit-precise integer types). The choice of type is implementation-defined,134) but shall be capable of representing the values of all the members of the enumeration.
For all enumerations with a fixed underlying type, the enumerated type is compatible with the underlying type of the enumeration. If the underlying type is _Bool, the enumerated type has the same semantics as _Bool.
An enumerated type declaration without a fixed underlying type is an incomplete type until immediately after the } that terminates the list of enumerator declarations, and complete thereafter. An enumerated type declaration of an enumeration with fixed underlying type declares a complete type immediately after its enum type specifier (i.e. after the opening { of its enumerator list).
EXAMPLE   The following fragment: …

EXAMPLE The following fragment:
#include <limits.h>

enum E1: short;
enum E2: short;
enum E3;
enum E4 : unsigned long long;

enum E1 : short { m11, m12 };
enum E2 : long  { m21, m22 }; /* Constraint violation */

enum E3 : int;
enum E3 : int {
	m31,
	m32,
	m33 = sizeof(enum E3)
};

enum E4 : unsigned long long {
	m41 = ULLONG_MAX,
	m42 /* Constraint violation: unrepresentable value */
};

enum E1 x = m11;
enum E1 : long int x; /* Constraint violation: enum-type-specifier with declarator */

demonstrates many of the properties of multiple declarations of enumerations with underlying types. E3 in particular is an enumeration declaration that chooses int as its underlying type, which matches the second declaration and the third declaration with definition. Despite E3 being declared without an underlying type first, it is declared with an underlying type second that is the same as its first, so sizeof(enum E3) is not a constraint violation.

EXAMPLE The following fragment:
enum e { A };
enum e : int;
enum e;

is a valid triplet of declarations if the implementation-defined underlying type chosen for the first declaration matches the underlying type specified in the second declaration. Otherwise, it is a constraint violation.

EXAMPLE The following fragment:
enum no_underlying {
	a0
};

int main () {
	int a = _Generic(a0,
		int: 2,
		unsigned char: 1,
		default: 0
	);
	int b = _Generic((enum no_underlying)a0,
		int: 2,
		unsigned char: 1,
		default: 0
	);
	return 0;
}

demonstrates the implementation-defined nature of the underlying type of enumerations using generic selection (6.5.1.1). The value of a after its initialization is 2. The value of b after its initialization is implementation-defined: the enumeration must be compatible with a type large enough to fit the values of its enumeration constants. Since the only value is 0 for a0, b may hold any of 2, 1, or 0.

Now, consider a similar fragment, but using a fixed underlying type:

enum underlying : unsigned char {
	b0
};

int main () {
	int a = _Generic(b0,
		int: 2,
		unsigned char: 1,
		default: 0
	);
	int b = _Generic((enum underlying)b0,
		int: 2,
		unsigned char: 1,
		default: 0
	);
	return 0;
}

Here, we are guaranteed that a and b are both initialized to 1. This makes enumerations with a fixed underlying type more portable.

EXAMPLE Enumerations with a fixed underlying type must have their braces and the enumerator list specified as part of their declaration:
void f1 (enum a : long b); /* Constraint violation */
void f2 (enum c : long { x } d);

typedef enum t u;
typedef enum v : short W; /* Constraint violation */
typedef enum q : short { s } R;

enum forward;
extern enum forward fwd_val0; /* Constraint violation: incomplete type */
extern int fwd_val0; /* Constraint violation: incompatible type */

enum forward1 : int;
extern enum forward1 fwd_val1;
extern int fwd_val1;

int main () {
	enum e : short f = 0; /* Constraint violation */
	enum g : short { y } h = y;
	return 0;
}

Forward references: tags (6.7.2.3), declarations (6.7), declarators (6.7.6), type names (6.7.7).

5.2.4. Modify Section §6.7.2.3 Tags

6.7.2.3 Tags

Constraints

A type specifier of the form
enum identifier
without an enumerator list shall only appear in a context where incomplete types are allowed or after the type it specifies is complete. For an enumeration without fixed underlying type, it is considered complete only after the closing } of the enumerator list. For an enumeration with fixed underlying type, it is complete after its enum type specifier.

A type specifier of the form
struct-or-union attribute-specifier-sequence_opt identifier_opt { member-declaration-list }

or

enum attribute-specifier-sequence_opt identifier_opt enum-type-specifier_opt { enumerator-list }

or

enum attribute-specifier-sequence_opt identifier_opt enum-type-specifier_opt { enumerator-list , }

declares a structure, union, or enumerated type. …

A declaration of the form
struct-or-union attribute-specifier-sequence_opt identifier ;
or
enum attribute-specifier-sequence_opt identifier enum-type-specifier_opt ;

specifies a structure, union, or enumerated type and declares the identifier as a tag of that type.142) If the enumerated type does not contain the enum type specifier, the enumerated type is incomplete. Otherwise, the enumerated type is complete. The optional attribute specifier sequence appertains to the structure or union type being declared; the attributes in that attribute specifier sequence are thereafter considered attributes of the structure or union type whenever it is named.

If a type specifier of the form
struct-or-union attribute-specifier-sequence_opt identifier
or
enum attribute-specifier-sequence_opt identifier

occurs other than as part of one of the above forms, and no other declaration of the identifier as a tag is visible, then it declares an incomplete structure, union, or enumerated type, and declares the identifier as the tag of that type.142)

142) A similar construction with enum does not exist.

If a type specifier of the form
struct-or-union attribute-specifier-sequence_opt identifier

or

enum attribute-specifier-sequence_opt identifier

occurs other than as part of one of the above forms, and a declaration of the identifier as a tag is visible, then it specifies the same type as that other declaration, and does not redeclare the tag.

5.2.5. Add implementation-defined enumeration behavior to Annex J

6. Acknowledgements

Thanks to:

We hope this paper serves you all well.