1. Changelog
1.1. Revision 2 - November 10th, 2025
-
Adjust wording for example which used old semantics around "matching the first/second form".
-
Fix minor specification issues around "shall be ..." versus just saying "is..." (statement of fact versus weirdly-worded constraint).
-
"shall be implementation-defined" -> "is/are implementation-defined".
-
Incorporate and synchronize changes between C and C++
1.2. Revision 1 - February 16th, 2025
-
Revise current strategy to fit results of GCC requests and C++ standardization.
-
Directives are expanded after the name, unconditionally, all the time.
-
offset parameter has additional examples and clarification of calculation.
-
Fix typos:
-
"an" -> "a"
-
"2" -> "two"
1.3. Revision 0 - December 23rd, 2024
-
Initial Release! 🎉
2. Introduction and Motivation
During the standardization discussion of #embed in WG21 in the last year for [P1967] and [P3540], several adjustments were requested to the behavior of #embed for niche cases. This paper synchronizes the behavior between what WG21 has voted into the C++26 Release and what is contained in the current C23/C Working Draft.
The requested synchronizations are as follows:
-
No potential double-expansion of preprocessor parameters allowed in any way.
-
Preprocessor expansion of parameters always happens, not just for limit, and it is performed for everything after the resource name instead of only inside the parameters (a previous design in the C++ proposal, revision 13).
-
Make it clear we’re producing a sequence of (preprocessor) tokens, and not necessarily (post-processor, Phase 7) tokens.
-
Adding the extremely popular and already-implemented gnu::offset and clang::offset parameters.
The wording below attempts to accomplish all of these things.
3. Wording
This wording is relative to C’s latest working draft.
📝 Editor’s Note: The ✨ characters are intentional. They represent stand-ins to be replaced by the editor.
3.1. Modify 6.10.2 Conditional inclusion
Syntax
1 ... ...
Description
2 ... ...
...
4 A defined macro expression evaluates to 1 if the identifier is currently defined as a macro name (that is, if it is predefined or if it has been the subject of a #define preprocessing directive without an intervening #undef directive with the same subject identifier), 0 if it is not
...
6 The header or source file identified by the parenthesized preprocessing token sequence in each contained has_include expression is searched for as if that preprocessing token were the pp-tokens in a #include directive, except that no further macro expansion is performed. Such a directive shall satisfy the syntactic requirements of a #include directive consist solely of a header name or shall have a preprocessing token sequence that can be combined into a single header name preprocessing token. The has_include expression evaluates to 1 if the search for the source file succeeds, and to 0 if the search fails.
7 The resource (6.10.4) identified by the header-name preprocessing token sequence in each contained has_embed expression is searched for as if those preprocessing token were the pp-tokens in a #embed directive, except that no additional macro expansion is performed. Such a directive shall satisfy the syntactic requirements of a #embed directive any constraints from the #embed directive including the preprocessor parameters (such as defined not being allowed within a limit or offset parameter). …
3.2. Modify §6.10.4.1 to change the expansion behavior of macros
Description
1 A resource is a source of data accessible from the translation environment. An embed parameter is a single preprocessor parameter in the embed parameter sequence. It has a implementation resource width, which is the implementation-defined size in bits of the located resource. It also has a resource width, which is either:
- the number of bits as computed from the optionally-provided limit embed parameter (6.10.4.2), if present; or,
- the implementation resource width.
2✨ A bracket resource search for a sequence of characters searches a sequence of places for a resource identified uniquely by that sequence of characters. How the places are determined for the resource identified is implementation-defined.
3✨ A quote resource search for a sequence of characters attempts to identify a resource that is named by the sequence of characters. The named resource is searched for in an implementation-defined manner. If the implementation does not support a quote resource search for that sequence of characters, or if the search fails, the result of the quote search is the result of a bracket resource search for the same sequence of characters.
Constraints
3 An embed parameter sequence is a whitespace-delimited list of preprocessor parameters which can modify the result of the replacement for the #embed preprocessing directive.
Let embed element width be either:
- an integer constant expression greater than zero determined by an implementation-defined embed parameter; or,
- CHAR_BIT (5.3.5.3.2).
Let implementation resource count be (resource width) / (embed element width). Let resource count initially be (implementation resource count). Let resource offset initially be zero. The result of (resource width) % (embed element width) shall be zero.
Constraints
4 An #embed directive shall have its quoted or bracket resource search succeed, and it shall identify a resource that can be processed by the implementation as a binary data sequence given the provided embed parameters.
5 Embed parameters not specified in this document shall be are implementation-defined. Implementation-defined embed parameters may change the subsequently-defined semantics of the directive; otherwise, #embed directives which do not contain implementation-defined embed parameters shall behave as described in this document.
5 A resource is considered empty when its resource width is zero.
6 Let embed element width be either:
- an integer constant expression greater than zero determined by an implementation-defined embed parameter; or,
- CHAR_BIT (5.3.5.3.2).
The result of (resource width) % (embed element width) shall be zero.
📝 IMPORTANT Editor’s Note: Replace all instances of "implementation resource width" with simply "resource width".
Semantics
6✨ A resource is considered empty in one of the following cases:
- its resource count is zero;
- or, its resource offset is greater than the implementation resource count.
7 The expansion replacement of a #embed directive is a preprocessor token sequence in the form of a comma-separated list of integer constant expressions, unless otherwise modified by embed parameters. formed from the list of integer constant expressions described later in this subclause. The group of tokens for each integer constant expression in the list is separated in the token sequence from the group of tokens for the previous integer constant expression in the list by a comma. The sequence list neither begins nor ends in a comma. If the list of integer constant expressions is empty, the token sequence is empty. The directive is replaced by its expansion and, with the presence of certain embed parameters, additional or replacement token sequences. If the resource is empty, the directive is not replaced by the comma-separated list of integer constant expressions representing the resource. Otherwise, the resource offset indicates the first (resource offset) values (which would have been placed in the comma-separated list had the resource offset been equivalent to zero) that are discarded, ignored, and are not part of the list. There are exactly $max(0, min((resource\ count), (implementation\ resource\ count) - (resource\ offset)))$ integer constant expressions in the comma-separated list, where $max$ and $min$ select the maximum and minimum value between two provided values, respectively. The value of each integer constant expression is determined in an implementation-defined manner, and is in the range from $0$ to $2^{embed\ element\ width} - 1$, inclusive. FOOTNOTE(For example, an embed element width of 8 will yield a range of values from 0 to 255, inclusive.) If:
- the list of integer constant expressions is used to initialize an array of a type compatible with unsigned char, or compatible with char if char cannot hold negative values; and,
- the embed element width is equal to CHAR_BIT (5.3.5.3.2),
then the contents of the initialized elements of the array are as-if the resource’s binary data represented by the resource offset and the resource count, as a file, is fread (7.23.8.1) into the array at translation time.
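The element-count expression in paragraph 7 can be checked with a small sketch. The function name `embed_list_length` and its signed parameter types are assumptions made for illustration only; they appear nowhere in the proposed wording, and the sketch models only the arithmetic, not an actual preprocessor.

```c
#include <assert.h>

/* Hypothetical helper (not part of the proposed wording): computes
 * max(0, min(resource count,
 *            implementation resource count - resource offset)). */
long long embed_list_length(long long resource_count,
                            long long implementation_resource_count,
                            long long resource_offset)
{
    /* All three quantities are non-negative in the wording. */
    assert(resource_count >= 0);
    assert(implementation_resource_count >= 0);
    assert(resource_offset >= 0);

    /* Elements left in the resource once the offset has been skipped;
     * this goes negative when the offset runs past the end. */
    long long remaining = implementation_resource_count - resource_offset;

    /* min(resource count, remaining) */
    long long smaller = resource_count < remaining ? resource_count : remaining;

    /* max(0, smaller) */
    return smaller > 0 ? smaller : 0;
}

/* Worked values for the sketch above. */
void embed_list_length_examples(void)
{
    assert(embed_list_length(4, 4, 0) == 4); /* no offset, no limit     */
    assert(embed_list_length(2, 4, 2) == 2); /* skip two, take two      */
    assert(embed_list_length(4, 4, 1) == 3); /* offset shortens the list */
    assert(embed_list_length(1, 1, 1) == 0); /* offset exhausts resource */
}
```

The outer clamp to zero is what keeps an offset beyond the end of the resource from producing a negative length.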
8 A preprocessing directive of the form
# embed < h-char-sequence > embed-parameter-sequence opt new-line
searches a sequence of implementation-defined places for a resource identified uniquely by the specified sequence between the < and >. The search for the named resource is done in an implementation-defined manner.
9 A preprocessing directive of the form
# embed " q-char-sequence " embed-parameter-sequence opt new-line
searches a sequence of implementation-defined places for a resource identified uniquely by the specified sequence between the " delimiters. The search for the named resource is done in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
# embed < h-char-sequence > embed-parameter-sequence opt new-line
with the identical contained q-char-sequence (including > characters, if any) from the original directive.
8 A preprocessing directive of the form
# embed header-name embed-parameter-sequence opt new-line
causes the replacement of that directive by preprocessing tokens derived from data in the source by the header name, as specified below.
9 If the header name is of the form
< h-char-sequence >
the resource is identified by a bracket resource search for the sequence of characters of the h-char-sequence. Otherwise, if the header name is of the form
" q-char-sequence "
the resource is identified by a quoted resource search for the sequence of characters of the q-char-sequence.
10 Either form of the #embed directive shall process any preprocessing tokens after the name, if present, as in normal text. The preprocessing tokens, if present, shall then have the form of an embed parameter sequence. The directive is then replaced as described previously in this subclause. specified previously behaves as specified later in this subclause. The values of the integer constant expressions in the expanded sequence are determined by an implementation-defined mapping of the resource’s data. Each integer constant expression’s value is in the range from 0 to ($2^{embed\ element\ width}$) − 1, inclusive.207) If:
- the list of integer constant expressions is used to initialize an array of a type compatible with unsigned char, or compatible with char if char cannot hold negative values; and,
- the embed element width is equal to CHAR_BIT (5.3.5.3.2),
then the contents of the initialized elements of the array are as-if the resource’s binary data is fread (7.23.8.1) into the array at translation time.
11 10 A preprocessing directive of the form
# embed pp-tokens new-line
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after embed in the directive are processed just as in normal text. (Each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens.) The directive resulting after all replacements shall match one of the two previous forms. If the directive matches one of the two previous forms after the directive is processed as in normal text, any further processing as in normal text described for the two previous forms is not performed. The method by which a sequence of preprocessing tokens between a < and a > preprocessing token pair or a pair of " characters is combined into a single resource name preprocessing token is implementation-defined.
12✨ NOTE If the directive is processed as in normal text because it doesn’t match the first two forms but matches the third, processing as in normal text happens once and only once for the entire directive, including its parameters.
13✨ EXAMPLE If the directive matches one of the first two forms, then processing as in normal text only applies to everything but the resource name. If the directive matches the third form, then processing as in normal text applies to the entire directive:
#define offset(ARG) limit(ARG)
#define prefix(ARG) suffix(ARG)
#define THE_ADDITION "teehee"
#define THE_RESOURCE ":3c"
#embed ":3c" offset(2) prefix(THE_ADDITION)
#embed THE_RESOURCE offset(2) prefix(THE_ADDITION)
is equivalent to:
#embed ":3c" limit(2) suffix("teehee")
#embed ":3c" limit(2) suffix("teehee")
3.3. Modify §6.10.4.1 Semantics, ❡12 (now ❡13) to add a new embed parameter
An embed parameter with a preprocessor parameter token that is one of the following is a standard embed parameter:
limit prefix suffix if_empty offset
3.4. Modify §6.10.4.2 "limit parameter"'s macro expansion rules in Semantics, ❡3 and ❡4
...
3 The standard embed parameter with a preprocessor parameter token limit denotes a balanced preprocessing token sequence that will be used to compute the resource width. whose integer constant expression becomes the new value for the resource’s resource count defined in 6.10.4.1. The integer constant expression is evaluated using the rules specified for conditional inclusion (6.10.2), but without doing any further processing as in normal text. Independently of any macro replacement done previously (e.g. when matching the form of #embed), the constant expression is evaluated after the balanced preprocessing token sequence is processed as in normal text, using the rules specified for conditional inclusion (6.10.2), with the exception that any defined macro expressions are not permitted.
4 The resource width is:
- 0, if the integer constant expression evaluates to 0; or,
- the implementation resource width if it is less than the embed element width multiplied by the integer constant expression; or,
- the embed element width multiplied by the integer constant expression, if it is less than or equal to the implementation resource width.
4✨ The resource count is set to:
- 0, if the integer constant expression evaluates to 0;
- or, the implementation resource count if the integer constant expression is greater than the implementation resource count;
- or, the integer constant expression, if it is less than or equal to the implementation resource count.
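The three cases for the resource count amount to clamping the limit value to the implementation resource count. A minimal sketch follows, assuming both quantities are already known as unsigned integers; the name `apply_limit` is hypothetical and is not taken from the wording or from any implementation.

```c
#include <assert.h>

/* Hypothetical helper: the resource count that results from limit(N),
 * following the three cases listed for the resource count. */
unsigned long long apply_limit(unsigned long long limit_value,
                               unsigned long long implementation_resource_count)
{
    /* Case 1: the integer constant expression evaluates to 0. */
    if (limit_value == 0) {
        return 0;
    }
    /* Case 2: the limit exceeds what the resource can provide. */
    if (limit_value > implementation_resource_count) {
        return implementation_resource_count;
    }
    /* Case 3: the limit fits, so it becomes the resource count. */
    return limit_value;
}

/* Worked values for a resource with five elements. */
void apply_limit_examples(void)
{
    assert(apply_limit(0, 5) == 0);
    assert(apply_limit(3, 5) == 3);
    assert(apply_limit(5, 5) == 5);
    assert(apply_limit(8, 5) == 5);
}
```

Note that the limit is now expressed in elements rather than bits, which is why no multiplication by the embed element width appears in the sketch.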
3.5. Add a new section §6.10.4.3 "offset parameter"
6.10.4.3 offset parameter
Constraints
The offset standard embed parameter may appear zero times or one time in the embed parameter sequence. Its preprocessor argument clause shall be present and have the form:
( constant-expression )
and shall be an integer constant expression with a non-negative value.
The token defined shall not appear within the preprocessor balanced token sequence.
Semantics
The offset standard embed parameter denotes a balanced preprocessing token sequence whose integer constant expression becomes the value of the resource’s resource offset as defined in 6.10.4.1. The integer constant expression is evaluated using the rules specified for conditional inclusion (6.10.2), but without doing any further processing as in normal text.
EXAMPLE Using the same hypothetical resource that has an implementation resource count of at least 4 and is identified by <sdk/jump.wav>, the two arrays should have identical contents in certain positions after offset is applied.
constexpr const unsigned char sound_signature[] = {
#embed <sdk/jump.wav> limit(2+2)
};
constexpr const unsigned char truncated_sound_signature[] = {
#embed <sdk/jump.wav> offset(2) limit(2)
};
static_assert(sizeof(sound_signature) == 4);
static_assert(sizeof(truncated_sound_signature) == 2);
static_assert(sound_signature[2] == truncated_sound_signature[0]);
static_assert(sound_signature[3] == truncated_sound_signature[1]);
EXAMPLE Given a resource <single_byte> that has an implementation-resource-count of 1, the following directives:
#embed <single_byte> offset(1) if_empty(44203)
#embed <single_byte> limit(0) offset(1) if_empty(44203)
are replaced with:
44203
44203
EXAMPLE Given a resource <single_byte> that has an implementation-resource-count of 1, __has_embed will be considered empty despite limit(1), as offset(1) has exhausted the implementation-resource-count:
int f () {
#if __has_embed(<single_byte> limit(1) offset(1) prefix(some tokens)) \
    == __STDC_EMBED_EMPTY__
    // if <single_byte> exists, this
    // conditional inclusion branch is taken and the function
    // returns 0.
    return 0;
#else
    // otherwise, the resource does not exist
#error "The resource does not exist"
#endif
}
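The interaction of offset and limit in the examples above can be modelled at run time over an in-memory buffer, without relying on compiler support for #embed. This is only a sketch: `model_embed`, its parameters, and the placeholder bytes standing in for the contents of <sdk/jump.wav> and <single_byte> are all assumptions made for illustration. It treats an offset that reaches the implementation resource count as producing an empty result, which is how the <single_byte> examples behave.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of "#embed <res> offset(O) limit(L)" with an embed
 * element width of CHAR_BIT. Copies the selected elements into out and
 * returns how many were selected; 0 models an empty resource. */
size_t model_embed(const unsigned char *resource,
                   size_t implementation_resource_count,
                   size_t offset, size_t limit,
                   unsigned char *out, size_t out_capacity)
{
    /* The offset has consumed the whole resource: nothing is produced. */
    if (offset >= implementation_resource_count) {
        return 0;
    }
    size_t remaining = implementation_resource_count - offset;
    size_t count = limit < remaining ? limit : remaining;
    /* Never write past the caller's buffer. */
    if (count > out_capacity) {
        count = out_capacity;
    }
    memcpy(out, resource + offset, count);
    return count;
}

void model_embed_examples(void)
{
    /* Placeholder bytes; the real file contents are not specified. */
    const unsigned char jump_wav[] = { 'R', 'I', 'F', 'F', 0x24, 0x08 };
    unsigned char sound_signature[4];
    unsigned char truncated_sound_signature[2];

    size_t n1 = model_embed(jump_wav, sizeof jump_wav, 0, 2 + 2,
                            sound_signature, sizeof sound_signature);
    size_t n2 = model_embed(jump_wav, sizeof jump_wav, 2, 2,
                            truncated_sound_signature,
                            sizeof truncated_sound_signature);
    assert(n1 == 4);
    assert(n2 == 2);
    assert(sound_signature[2] == truncated_sound_signature[0]);
    assert(sound_signature[3] == truncated_sound_signature[1]);

    /* A one-element resource with offset(1) yields nothing, so an
     * if_empty parameter would supply the replacement instead. */
    const unsigned char single_byte[] = { 0x2A };
    unsigned char scratch[1];
    assert(model_embed(single_byte, 1, 1, 1, scratch, sizeof scratch) == 0);
    assert(model_embed(single_byte, 1, 1, 0, scratch, sizeof scratch) == 0);
}
```

The run-time checks mirror the static_assert lines of the first example; a conforming implementation would establish the same relationships at translation time.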