P1040R5: std::embed

1. Revision History

1.1. Revision 5 - January 13th, 2020

Split #embed into a new paper.
Add memory and time benchmarks from various implementation strategies in the new Current Practice section.
Address concerns for a generic API and similar in the new Results Analysis section.
Retarget to EWG and SG 7.

1.2. Revision 4 - November 26th, 2018

Wording is now relative to [n4778].
Minor typo and tweak fixes.

1.3. Revision 3 - November 26th, 2018

Change to using consteval.
Discuss potential issues with accessing resources after full semantic analysis is performed. Prepare to poll Evolution Working Group. Reference new paper, [p1130], about resource management.

1.4. Revision 2 - October 10th, 2018

Destroy embed_options and alignment options: if the function is materialized only at compile-time through constexpr or the upcoming "immediate functions" (constexpr!), there is no reason to make this part of the function. Instead, the user can choose their own alignment when they pin this down into a std::array or some form of C array / C++ storage.

1.5. Revision 1 - June 10th, 2018

Create future directions section, follow up on Library Evolution Working Group comments.
Change std::embed_options::null_terminated to std::embed_options::null_terminate.
Add more code demonstrating the old way and motivating examples.
Incorporate LEWG feedback, particularly alignment requirements illuminated by Odin Holmes and Niall Douglas. Add a feature macro on top of having __has_include( <embed> ).

1.6. Revision 0 - May 11th, 2018

Initial release.

2. Motivation

I’m very keen on std::embed. I’ve been hand-embedding data in executables for NEARLY FORTY YEARS now. — Guy "Hatcat" Davidson, June 15, 2018


Currently

With Proposal

se-shell@virt-deb> python strfy.py \
	fxaa.spirv \
	stringified_fxaa.spirv.h

#include <span>

constexpr inline 
const auto& fxaa_spriv_data =
#include "stringified_fxaa.spirv.h"
;

// prevent embedded nulls from
// ruining everything with 
// char_traits<char>::length
// or strlen
template <typename T, std::size_t N>
constexpr std::size_t 
string_array_size(const T (&)[N]) {
    return N - 1;
}

int main (int, char*[]) {
	constexpr std::span<const std::byte> 
	fxaa_binary{ 
		fxaa_spriv_data, 
		string_array_size(fxaa_spriv_data)
	};

	// assert this is a SPIRV 
	// file, at compile-time	
	static_assert(fxaa_binary[0] == 0x03
		&& fxaa_binary[1] == 0x02
		&& fxaa_binary[2] == 0x23
		&& fxaa_binary[3] == 0x07,
		"given wrong SPIRV data, "
		"check rebuild or check "
		"the binaries!");

	auto context = make_vulkan_context();

	// data kept around and made
	// available for binary
	// to use at runtime
	auto fxaa_shader = make_shader( 
		context, fxaa_binary );

	for (;;) {
		// ...
		// and we’re off!
		// ...
	}

	return 0;
}

#include <embed>
















int main (int, char*[]) {
	constexpr std::span<const std::byte> 
	fxaa_binary = 
		std::embed( "fxaa.spirv" );
	


	// assert this is a SPIRV 
	// file, at compile-time	
	static_assert(fxaa_binary[0] == 0x03
		&& fxaa_binary[1] == 0x02
		&& fxaa_binary[2] == 0x23
		&& fxaa_binary[3] == 0x07,
		"given wrong SPIRV data, "
		"check rebuild or check "
		"the binaries!");

	auto context = make_vulkan_context();

	// data kept around and made
	// available for binary
	// to use at runtime
	auto fxaa_shader = make_shader( 
		context, fxaa_binary );

	for (;;) {
		// ...
		// and we’re off!
		// ...
	}

	return 0;
}

se-shell@virt-deb> python gen_cxx_random_data.py \
	-o include/gen/random_data.h

 
#include <cstdint>
#include <utility>


constexpr std::uint64_t val_64_const 
	= 0xcbf29ce484222325u;
constexpr std::uint64_t prime_64_const 
	= 0x100000001b3u;

inline constexpr std::uint64_t
hash_64_fnv1a_const(const char* const ptr, 
	std::size_t ptr_size, 
	const std::uint64_t value = val_64_const
) noexcept {
	return (ptr_size == 1) 
	  ? value 
	  : hash_64_fnv1a_const(
		&ptr[1],
		ptr_size - 1, 
		(value ^ static_cast<std::uint64_t>(*ptr)) 
		* prime_64_const
		);
}

#include <gen/random_data.h>

int main () {


	constexpr std::uint64_t actual
		= hash_64_fnv1a_const(&random_data[0],
			std::size(random_data));

	return static_cast<int>(actual);
}

#include <embed>
#include <cstdint>



constexpr std::uint64_t val_64_const
	= 0xcbf29ce484222325u;
constexpr std::uint64_t prime_64_const
	= 0x100000001b3u;

inline constexpr std::uint64_t
hash_64_fnv1a_const(const char* const ptr, 
	std::size_t ptr_size, 
	const std::uint64_t value = val_64_const
) noexcept {
	return (ptr_size == 1) 
	  ? value 
	  : hash_64_fnv1a_const(
		&ptr[1],
		ptr_size - 1, 
		(value ^ static_cast<std::uint64_t>(*ptr)) 
		* prime_64_const
		);
}




int main () {
	constexpr std::span<const char> art_data 
		= std::embed("/dev/urandom", 32);
	constexpr std::uint64_t actual 
		= hash_64_fnv1a_const(art_data.data(), 
			art_data.size());

	return static_cast<int>(actual);
}

(Works here.)

A very large number of C and C++ programmers -- at some point -- attempt to #include large chunks of non-C++ data into their code. Of course, #include expects the format of the data to be source code, and thus the program fails with spectacular lexer errors. Consequently, many different tools and practices were adapted to handle this, as far back as 1995 with the xxd tool. Many industries need such functionality, including (but hardly limited to):

Financial Development
- representing coefficients and numeric constants for performance-critical algorithms;
Game Development
- assets that do not change at runtime, such as icons, fixed textures and other data;
- Shader and scripting code;
Embedded Development
- storing large chunks of binary, such as firmware, in a well-compressed format;
- placing data in memory on chips and systems that do not have an operating system or file system;
Application Development
- compressed binary blobs representing data
- non-C++ script code that is not changed at runtime;
Server Development
- configuration parameters which are known at build-time and are baked in to set limits and give compile-time information to tweak performance under certain loads;
- SSL/TLS Certificates hard-coded into your executable (requiring a rebuild and potential authorization before deploying new certificates), and;
Static Analyzers
- Static analyzers suffer -- much like their binary code generating friends -- from having to parse extremely large array literals;
- Reduces memory pressure and enables better information tracking and potential sanitization (file source is not lost in build system).

In the pursuit of this goal, these tools have proven to have inadequacies and contribute poorly to the C++ development cycle as it continues to scale up for larger and better low-end devices and high-performance machines, bogging developers down with menial build tasks and trying to cover up disappointing differences between platforms. The generated source also absolutely destroys state-of-the-art compilers due to the extremely high memory overhead of producing an Abstract Syntax Tree for a braced initializer list of several tens of thousands of integral constants with numeric values at 255 or less.

The request for some form of #include_string or similar dates back quite a long time, with one of the oldest Stack Overflow questions asked-and-answered about it dating back nearly 10 years. Predating even that is a plethora of mailing list posts and forum posts asking how to get script code and other things that are not likely to change into the binary.

This paper proposes <embed> to make this process much more efficient, portable, and streamlined.

3. Scope and Impact

template <typename T = byte> consteval span<const T> embed( string_view resource_identifier ) is an extension to the language proposed entirely as a library construct. The goal is to have it implemented with compiler intrinsics, builtins, or other suitable mechanisms. It does not affect the language. The proposed header to expose this functionality is <embed>, making the feature entirely opt-in: a program can check whether either the proposed feature test macro or the header exists.

4. Design Decisions

<embed> avoids using the preprocessor or defining new string literal syntax like its predecessors, preferring the use of a free function in the std namespace. This gives std::embed a greater degree of power and advantage. <embed>'s design is derived heavily from community feedback plus the rejection of the prior art up to this point, as well as the community needs demonstrated by existing practice and its pitfalls.

4.1. Current Practice

Here, we examine current practice, its benefits, and its pitfalls. There are a few cross-platform (and not-so-cross-platform) paths for getting data into an executable. We also scrutinize the performance, with numbers for both memory overhead and speed overhead available at the repository that houses the current implementation. For ease of access, the numbers as of January 2020 with the latest versions of the indicated compilers and tools are replicated below.

All three major implementations were explored, plus an early implementation of this functionality in GCC. A competing implementation in a separate C++-like meta language called Circle was also looked at, at the behest of Study Group 7.

4.1.1. Speed Results

Below are timing results for a file of random bytes using a specific strategy. The file is of the size specified at the top of the column. Files are kept the same between strategies and tests.

Intel Core i7-6700HQ @ 2.60 GHz
24.0 GB RAM 2952 MHz
Debian Sid or Windows 10
Method: Gather timings from time *nix program or Measure-Command { ... } PowerShell, compute mean

Strategy	4 bytes	40 bytes	400 bytes	4 kilobytes
`#embed` GCC	0.201 s	0.208 s	0.207 s	0.218 s
`phd::embed` GCC	0.709 s	0.724 s	0.711 s	0.715 s
`xxd`-generated GCC	0.225 s	0.215 s	0.237 s	0.247 s
`xxd`-generated Clang	0.272 s	0.275 s	0.272 s	0.272 s
`xxd`-generated MSVC	0.204 s	0.229 s	0.209 s	0.232 s
Circle `@array`	0.353 s	0.359 s	0.361 s	0.361 s
Circle `@embed`	0.199 s	0.208 s	0.204 s	0.368 s
`objcopy` (linker)	0.501 s	0.482 s	0.519 s	0.527 s

Strategy	40 kilobytes	400 kilobytes	4 megabytes	40 megabytes
`#embed` GCC	0.236 s	0.231 s	0.300 s	1.069 s
`phd::embed` GCC	0.705 s	0.713 s	0.772 s	1.135 s
`xxd`-generated GCC	0.406 s	2.135 s	23.567 s	225.290 s
`xxd`-generated Clang	0.366 s	1.063 s	8.309 s	83.250 s
`xxd`-generated MSVC	0.552 s	3.806 s	52.397 s	Out of Memory
Circle `@array`	0.353 s	0.363 s	0.421 s	0.585 s
Circle `@embed`	0.238 s	0.199 s	0.219 s	0.368 s
`objcopy` (linker)	0.500 s	0.497 s	0.555 s	2.183 s

Strategy	400 megabytes	1 gigabyte
`#embed` GCC	9.803 s	26.383 s
`phd::embed` GCC	4.170 s	11.887 s
`xxd`-generated GCC	Out of Memory	Out of Memory
`xxd`-generated Clang	Out of Memory	Out of Memory
`xxd`-generated MSVC	Out of Memory	Out of Memory
Circle `@array`	2.655 s	6.023 s
Circle `@embed`	1.886 s	4.762 s
`objcopy` (linker)	22.654 s	58.204 s

4.1.2. Memory Size Results

Below is the peak memory usage (heap usage) for a file of random bytes using a specific strategy. The file is of the size specified at the top of the column. Files are kept the same between strategies and tests.

Intel Core i7-6700HQ @ 2.60 GHz
24.0 GB RAM 2952 MHz
Debian Sid or Windows 10
Method: /usr/bin/time -v or Execute command hundreds of times, stare at Task Manager

Strategy	4 bytes	40 bytes	400 bytes	4 kilobytes
`#embed` GCC	17.26 MB	17.26 MB	17.26 MB	17.27 MB
`phd::embed` GCC	38.82 MB	38.77 MB	38.80 MB	38.80 MB
`xxd`-generated GCC	17.26 MB	17.26 MB	17.26 MB	17.27 MB
`xxd`-generated Clang	35.12 MB	35.22 MB	35.31 MB	35.88 MB
`xxd`-generated MSVC	< 30.00 MB	< 30.00 MB	< 33.00 MB	< 38.00 MB
Circle `@array`	53.56 MB	53.60 MB	53.53 MB	53.88 MB
Circle `@embed`	33.35 MB	33.34 MB	33.34 MB	33.35 MB
`objcopy` (linker)	17.32 MB	17.31 MB	17.31 MB	17.31 MB

Strategy	40 kilobytes	400 kilobytes	4 megabytes	40 megabytes
`#embed` GCC	17.26 MB	17.96 MB	53.42 MB	341.72 MB
`phd::embed` GCC	38.80 MB	40.10 MB	59.06 MB	208.52 MB
`xxd`-generated GCC	24.85 MB	134.34 MB	1,347.00 MB	12,622.00 MB
`xxd`-generated Clang	41.83 MB	103.76 MB	718.00 MB	7,116.00 MB
`xxd`-generated MSVC	~48.60 MB	~477.30 MB	~5,280.00 MB	Out of Memory
Circle `@array`	53.69 MB	54.73 MB	65.88 MB	176.44 MB
Circle `@embed`	33.34 MB	33.34 MB	39.41 MB	113.12 MB
`objcopy` (linker)	17.31 MB	17.31 MB	17.31 MB	57.13 MB

Strategy	400 megabytes	1 gigabyte
`#embed` GCC	3,995.34 MB	9,795.31 MB
`phd::embed` GCC	1,494.66 MB	5,279.37 MB
`xxd`-generated GCC	Out of Memory	Out of Memory
`xxd`-generated Clang	Out of Memory	Out of Memory
`xxd`-generated MSVC	Out of Memory	Out of Memory
Circle `@array`	1,282.34 MB	3,199.28 MB
Circle `@embed`	850.40 MB	2,128.36 MB
`objcopy` (linker)	425.77 MB	1,064.74 MB

4.1.3. Results Analysis

The above clearly demonstrates the superiority of std::embed over latest optimized trunk builds of various compilers. It is also notable that originally the Circle language did not have an @embed keyword, but it was added in December 2019. When the compiler author was spoken to about Study Group 7’s aspirations for a more generic way of representing data from a file, the ultimate response was this:

I’ll add a new @embed keyword that takes a type and a file path and loads the file and embeds it into an array prvalue of that type. This will cut out the interpreter and it’ll run at max speed. Feed back like this is good. This is super low-hanging fruit.

– Sean Baxter, December 12th, 2019

It was Circle’s conclusion that a generic API was unsuitable and suffered from the same performance pitfalls that plague current-generation compilers. It was SG7’s insistence that a more generic API would be suitable, modeled on Circle’s principles. Given that thorough exploration of the design space in Circle led to the same conclusion this proposal is making, and given the wide variety of languages providing a similar interface (D, Nim, Rust, etc.), it is clear that a more generic API is not desirable for functionality as fundamental and simple as this. This does not preclude a more generic solution being created, but it does prioritize the "Bird in the Hand" approach that the Direction Group and Bjarne Stroustrup have advocated for many times.

Furthermore, inspecting compiler bug reports around this subject area reveals that this is not the first time GCC has suffered monumental memory blowup over unoptimized representation of data. In fact, this is a 16+ year old problem that GCC has been struggling with for a long time now (C++ version here). That the above numbers are nearing the best that can be afforded by some of the most passionate volunteers and experts curating an extremely large codebase should be testament to how hard the language is in this area for compiler developers, and how painful it is for regular developers using their tools.

Clang, while having a better data representation and more optimized structures at its disposal, is similarly constrained. Even with significant implementation work, its developers are deeply constrained in what they can do:

It might be possible to introduce some sort of optimized representation specifically for initializer lists. But it would be a big departure from existing AST handling. And it wouldn’t really open up new use cases, given that string literal handling is already reasonably efficient.

– Eli Friedman, December 29th 2019

Is this really the best use of compiler developer energy?

To provide a backdrop against which a big departure from current AST handling can be compared, an implementation of the built-in necessary for this proposal is -- for an experienced developer -- at most a few days’ work in either GCC or Clang. Other compiler engineers have reported similar ease of implementation and integration. Should this really be delegated to Quality of Implementation that will need to be solved N times over by every implementation in its own particularly special way? Chipping away at what is essentially a fundamental inefficiency required by C++'s inescapable tokenization model from the preprocessor plus the sheer cost of an ever-growing language that makes simple constructs like a brace initializer list of integer constants expensive is, in this paper’s demonstrated opinion, incredibly unwise.

4.1.4. Manual Work

Many developers also hand-wrap their files in (raw) string literals, or similar, to massage their data -- binary or not -- into a conforming representation that can be parsed as source code:

Have a file data.json with some data, for example:

{ "Hello": "World!" }

Mangle that file with raw string literals, and save it as raw_include_data.h:

R"json({ "Hello": "World!" })json"

Include it into a variable, optionally made constexpr, and use it in the program:

#include <iostream>
#include <string_view>

int main() {
  constexpr std::string_view json_view =
#include "raw_include_data.h"
    ;
		
  // { "Hello": "World!" }
  std::cout << json_view << std::endl;
  return 0;
}

This happens often in the case of people who have not yet taken the "add a build step" mantra to heart. The biggest problem is that the above C++-ready source file is no longer valid in its original representation, meaning the file as-is cannot be passed to any validation tools, schema checkers, or otherwise. This hurts the portability and interop story of C++ with other tools and languages.

Furthermore, if the string literal is too big, vendors such as VC++ will hard error the build (example from Nonius, benchmarking framework).

4.1.5. Processing Tools

Other developers use pre-processors for data that can’t be easily hacked into a C++ source-code appropriate state (e.g., binary). The most popular one is xxd -i my_data.bin, which outputs an array in a file which developers then include. This is problematic because it turns binary data into C++ source. In many cases, this results in a larger file due to having to restructure the data to fit grammar requirements. It also results in needing an extra build step, which throws any programmer immediately at the mercy of build tools and project management. An example and further analysis can be found in the §6.1.1 Pre-Processing Tools Alternative and the §6.1.2 python Alternative sections.
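To put a rough number on the "larger file" claim, the sketch below estimates how many source characters an xxd -i style array body needs for a given input size. The helper name xxd_body_chars is hypothetical, and the layout (one "0xNN" token per byte, comma-and-space separators, twelve values per line with a two-space indent) is an assumption taken from the listing in §6.1.1; the surrounding declaration and length variable are ignored.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the proposal): estimates the number of
// characters in the array body of an `xxd -i` style dump of `n` input bytes.
// Assumed layout: "0xNN" per byte, ", " between values on the same line,
// 12 values per line, two-space indent, one newline per line.
constexpr std::size_t xxd_body_chars(std::size_t n) {
    if (n == 0) {
        return 0;
    }
    const std::size_t lines = (n + 11) / 12;  // 12 values per line, rounded up
    return 4 * n        // "0xNN" for every byte
         + (n - 1)      // a comma after every value except the last
         + (n - lines)  // a space between values that share a line
         + 3 * lines;   // two-space indent plus a newline for every line
}

// One full line of twelve values occupies 73 characters for 12 bytes of data.
static_assert(xxd_body_chars(12) == 73, "unexpected size for one full line");
```

Under these assumptions every input byte costs a little over six characters of source, before the compiler has tokenized or parsed anything.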

4.1.6. `ld`, resource files, and other vendor-specific link-time tools

Resource files and other "link time" or post-processing measures have one benefit over the previous method: they are fast to perform in terms of compilation time. An example can be seen in the §6.1.3 ld Alternative section.

4.1.7. The `incbin` tool

There is a tool called [incbin] which is a 3rd party attempt at pulling files in at "assembly time". Its approach is incredibly similar to ld, with the caveat that files must be shipped with their binary. It unfortunately falls prey to the same problems of cross-platform woes when dealing with VC++, requiring additional pre-processing to work out in full.

4.2. Prior Art

There has been a lot of discussion over the years in many arenas, from Stack Overflow to mailing lists to meetings with the Committee itself. The latest advancement brought to WG21’s attention was p0373r0 - File String Literals. It proposed the syntax F"my_file.txt" and bF"my_file.txt", with a few other amenities, to load files at compilation time. The following is an analysis of the previous proposal.

4.2.1. Literal-Based, constexpr

A user could reasonably assign (or want to assign) the resulting array to a constexpr variable, as it is expected to be handled like most other string literals. This allowed some degree of compile-time reflection. It is entirely helpful that such file contents be assigned to constexpr: e.g., string literals of JSON being loaded at compile time to be parsed by Ben Deane and Jason Turner in their CppCon 2017 talk, constexpr All The Things.

4.2.2. Literal-Based, Null Terminated (?)

It is unclear whether the resulting array of characters or bytes was to be null terminated. The usage and expression imply that it will be, due to its string-like appearance. However, is adding an additional null terminator fitting for desired usage? From the existing tools and practice (e.g., xxd -i or linking a data-dumped object file), the answer is no: but the syntax bF"hello.txt" makes the answer seem like a "yes". This is confusing: either the user should be given an explicit choice or the feature should be entirely opt-in.
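The ambiguity matters in practice because string-style length queries stop at the first zero byte. The following sketch uses a short literal as a stand-in for file contents (the names payload, c_string_length and payload_length are illustrative only) and shows how the two notions of "size" diverge:

```cpp
#include <cstddef>
#include <string_view>

// Stand-in for file contents that happen to contain a zero byte.
// The literal appends one terminating '\0' after the five payload bytes.
constexpr char payload[] = "ab\0cd";

// Length as strlen / char_traits<char>::length sees it:
// counting stops at the first '\0'.
constexpr std::size_t c_string_length = std::string_view(payload).size();

// Length derived from the array type: everything except the terminator.
constexpr std::size_t payload_length = sizeof(payload) - 1;

static_assert(c_string_length == 2, "string-style length stops at the embedded null");
static_assert(payload_length == 5, "array-derived length keeps all payload bytes");
```

A view that carries an explicit size, rather than relying on a terminator, sidesteps the question entirely.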

4.2.3. Encoding

Because the proposal used a string literal, several questions came up as to the actual encoding of the returned information. The author gave both bF"my_file.txt" and F"my_file.txt" to distinguish binary from string-based array results. Not only did this conflate issues with expectations in the previous section, it also became a heavily contested discussion on both the mailing list group discussion of the original proposal and in the paper itself. This is likely one of the biggest pitfalls of separating "binary" data from "string" data: imbuing an object with string-like properties at translation time raises all the same hairy questions around source/execution character set and the contents of a literal.

4.3. Design Goals

Because of the aforementioned reasons, it seems more prudent to take a "compiler intrinsic"/"magic function" approach. The function overload takes the form:

template <typename T = byte>
consteval span<const T> embed( 
  string_view resource_identifier
);

template <typename T = byte>
consteval span<const T> embed( 
  string_view resource_identifier, size_t limit
);

resource_identifier is a string_view processed in an implementation-defined manner to find and pull resources into C++ at constexpr time. limit is the maximum number of bytes the function call can produce (but it may produce fewer). The most obvious source will be the file system, with the intention of having this evaluated as a core constant expression. We do not attempt to restrict the string_view to a specific subset: whatever the implementation accepts (typically expected to be a relative or absolute file path, but possibly some other identification scheme), the implementation should use.

4.3.1. Implementation Defined

Calls such as std::embed( "my_file.txt" );, std::embed( "data.dll" );, and std::embed<vertex>( "vertices.bin" ); are meant to be evaluated in a constexpr context (with "core constant expressions" only), where the behavior is implementation-defined. The function has unspecified behavior when evaluated in a non-constexpr context (with the expectation that the implementation will provide a failing diagnostic in these cases). This is similar to how include paths work, albeit #include interacts with the programmer through the preprocessor.

There is precedent for specifying library features that are implemented only through compile-time compiler intrinsics (type_traits, source_location, and similar utilities). Core -- for other proposals such as p0466r1 - Layout-compatibility and Pointer-interconvertibility Traits -- indicated their preference in using a constexpr magic function implemented by intrinsic in the standard library over some form of template <auto X> thing { /* implementation specified */ value; }; construct. However, it is important to note that [p0466r1] proposes type traits, whereas this has entirely different functionality, and so its reception and opinions may be different.

4.3.2. Binary Only

Creating two separate forms or options for loading data that is meant to be a "string" always fuels controversy and debate about what the resulting contents should be. The problem is sidestepped entirely by demanding that the resource loaded by std::embed represents the bytes exactly as they come from the resource. This prevents encoding confusion, conversion issues, and other pitfalls related to trying to match the user’s idea of "string" data or non-binary formats. Data is received exactly as it is from the resource as defined by the implementation, whether it is a supposed text file or otherwise. std::embed( "my_text_file.txt" ) and std::embed( "my_binary_file.bin" ) behave exactly the same concerning their treatment of the resource.

4.3.3. Constexpr Compatibility

The entire implementation must be usable in a constexpr context. It is not just for the purposes of processing the data at compile time, but because it matches existing implementations that store strings and huge array literals into a variable via #include. These variables can be constexpr: to not have a constexpr implementation is to leave many of the programmers who utilize this behavior without a proper standardized tool.
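As a sketch of the kind of compile-time processing this enables, the check from the motivating example can be written as an ordinary constexpr function. Here shader_bytes is a hand-written stand-in for data that std::embed would provide, and starts_with_spirv_magic is a hypothetical helper name; only the four magic bytes are meaningful, the rest is filler.

```cpp
#include <cstddef>

// Stand-in for embedded data: the little-endian SPIR-V magic number
// (bytes 03 02 23 07) followed by four filler bytes.
constexpr unsigned char shader_bytes[] = {
    0x03, 0x02, 0x23, 0x07, 0x00, 0x00, 0x00, 0x00
};

// Hypothetical helper: true if the buffer begins with the SPIR-V magic
// number as laid out in a little-endian file.
constexpr bool starts_with_spirv_magic(const unsigned char* data, std::size_t size) {
    return size >= 4
        && data[0] == 0x03
        && data[1] == 0x02
        && data[2] == 0x23
        && data[3] == 0x07;
}

// The build fails here if the wrong data was supplied.
static_assert(starts_with_spirv_magic(shader_bytes, sizeof(shader_bytes)),
              "given wrong SPIRV data, check rebuild or check the binaries!");
```

Because the data and the check are both constant expressions, a bad asset is reported by the compiler rather than discovered at runtime.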

4.3.4. Optional Limit

Consider some file-based resources that are otherwise un-sizeable and un-seek/tellable in various implementations such as /dev/urandom. Telling the compiler to go fetch data from this resource infinitely can result in compiler lockups or worse: therefore, the user can specify another parameter -- a numeric limit to stop things.
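The intended effect of the limit can be sketched with a mock. mock_embed below is purely hypothetical: it knows a single made-up resource name and returns a std::string_view instead of a span, so that it compiles on current toolchains. It also returns an empty view for unknown names, whereas the proposal makes that case ill-formed.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for the two-argument overload. It only illustrates
// the "at most `limit` bytes" behaviour; it does not read any file.
constexpr std::string_view mock_embed(std::string_view resource_identifier,
                                      std::size_t limit) {
    constexpr std::string_view known_name = "hello.txt";  // made-up resource name
    constexpr std::string_view known_data = "Hello";      // made-up contents
    if (resource_identifier != known_name) {
        return {};  // the real proposal would reject the program instead
    }
    // substr clamps the count, so the result never exceeds `limit` and may
    // be shorter when the resource itself is smaller.
    return known_data.substr(0, limit);
}

static_assert(mock_embed("hello.txt", 3).size() == 3, "truncated to the limit");
static_assert(mock_embed("hello.txt", 100).size() == 5, "fewer bytes than the limit");
```

For an unbounded source such as /dev/urandom, the first case is the one that matters: the implementation can stop reading once the limit is reached.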

4.3.5. Statically Polymorphic

While returning std::byte is valuable, it is impossible to reinterpret_cast or bit_cast certain things at compile time. This makes it impossible in a constexpr context to retrieve the actual data from a resource without tremendous boilerplate and work that every developer will have to do.
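To give a sense of the boilerplate involved, the sketch below shows what recovering even a single 32-bit value from raw bytes looks like when reinterpret_cast is unavailable during constant evaluation. The names load_le32 and raw_words are illustrative, and the byte order is assumed to be little-endian.

```cpp
#include <cstdint>

// Hypothetical helper: assembles a 32-bit value from four little-endian
// bytes using only operations permitted in constant expressions.
constexpr std::uint32_t load_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Stand-in for embedded bytes holding two 32-bit words.
constexpr unsigned char raw_words[] = {
    0x03, 0x02, 0x23, 0x07,  // 0x07230203
    0x2a, 0x00, 0x00, 0x00   // 42
};

static_assert(load_le32(raw_words) == 0x07230203u, "first word");
static_assert(load_le32(raw_words + 4) == 42u, "second word");
```

Every element type would need a decoder along these lines, which is the work the T template parameter is meant to take off the user's hands.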

5. Changes to the Standard

Wording changes are relative to [n4842].

5.1. Intent

The intent of the wording is to provide a function that:

handles the provided resource identifying string_view in an implementation-defined manner;
and, returns the specified constexpr span representing either the bytes of the resource or the bytes viewed as the type T.

The wording also explicitly disallows the usage of the function outside of a core constant expression by marking it consteval, meaning it is ill-formed if it is attempted to be used at not-constexpr time (std::embed calls should not show up as a function in the final executable or in generated code). The program may pin the data returned by std::embed through the span into the executable if it is used outside a core constant expression.

5.2. Proposed Feature Test Macro

The proposed feature test macros are __cpp_lib_embed for the library and __cpp_pp_depend for the preprocessor functionality.
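A program could combine the header check and the library macro roughly as follows. This is a sketch only: has_std_embed and embedding_strategy are illustrative names, and on toolchains that do not ship the proposed header the fallback branch is selected.

```cpp
#include <string_view>

// Pull in the proposed header only if the implementation ships it.
#if defined(__has_include)
#  if __has_include(<embed>)
#    include <embed>
#  endif
#endif

// __cpp_lib_embed is the library feature test macro proposed in this paper.
#ifdef __cpp_lib_embed
constexpr bool has_std_embed = true;
#else
constexpr bool has_std_embed = false;
#endif

// Illustrative helper: names the strategy a build would fall back to.
constexpr std::string_view embedding_strategy() {
    return has_std_embed ? std::string_view("std::embed")
                         : std::string_view("generated header");
}
```

Code guarded this way keeps building with the existing generated-header workflow and switches over once the library feature is available.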

5.3. Proposed Wording

Append to §14.8.1 Predefined macro names [cpp.predefined]'s Table 16 with one additional entry:

Macro name	Value
__cpp_pp_depend	202006L

Add a new section §15.4 Dependency [cpp.depend]:

15.4 Dependency [cpp.depend]
¹ A #depend directive establishes inputs or family of inputs upon which a translation unit depends.

² A preprocessing directive of the form

# depend < h-char-sequence > new-line

or

# depend " q-char-sequence " new-line

provides a dependency name. If an implementation does not find meaning in the quote-delimited q-char-sequence, it may reprocess this directive and treat it as a #depend < h-char-sequence > new-line directive using the same q-char-sequence, including any < or >.

³ The q-char-sequence or h-char-sequence may have one of 3 meanings, depending on the use of * or ** within the sequence.

– If the sequence contains a ** it denotes a recursive-dependency-family.
– Otherwise, if it contains a * it denotes a dependency-family.
– Otherwise, it denotes a single-dependency.

A #depend directive

⁴ [ Example—

#depend "art.png"
// this translation unit depends on 'art.png'

#depend "assets/**"
// this translation unit depends on all resources
// the implementation can find that start with
// "assets/", recursively.

#depend <config/*.json>
// this translation unit depends on all resources
// the implementation can find that
// end in ".json" and start with "config/".

#depend <data/*/*.bin>
// this translation unit depends on all resources
// the implementation can find that
// end in ".bin", start with "data/"
// and contain a single "/" in-between.
— end Example ].

⁵ Each of the dependency-family, recursive-dependency-family, and single-dependency shall have an implementation-defined meaning which establishes search information for implementation-defined resources (e.g., for 19.20 [const.res]).
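As a non-normative illustration, one reading of the example above is that * matches any run of characters within a single path segment while ** also crosses "/" boundaries. The wording leaves the actual meaning implementation-defined, so the matcher below (depend_match is a hypothetical name) is only a sketch of that one interpretation.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical matcher for one possible interpretation of #depend patterns:
//   "**" matches any run of characters, including '/';
//   "*"  matches any run of characters that does not contain '/'.
constexpr bool depend_match(std::string_view pattern, std::string_view name) {
    if (pattern.empty()) {
        return name.empty();
    }
    if (pattern.substr(0, 2) == "**") {
        // Try every possible split point, '/' included.
        for (std::size_t i = 0; i <= name.size(); ++i) {
            if (depend_match(pattern.substr(2), name.substr(i))) {
                return true;
            }
        }
        return false;
    }
    if (pattern.front() == '*') {
        // Try every split point up to, but not past, the next '/'.
        for (std::size_t i = 0; i <= name.size(); ++i) {
            if (depend_match(pattern.substr(1), name.substr(i))) {
                return true;
            }
            if (i < name.size() && name[i] == '/') {
                break;
            }
        }
        return false;
    }
    // Ordinary character: must match exactly.
    return !name.empty()
        && pattern.front() == name.front()
        && depend_match(pattern.substr(1), name.substr(1));
}

static_assert(depend_match("assets/**", "assets/textures/wall.png"), "recursive family");
static_assert(!depend_match("config/*.json", "config/sub/limits.json"), "* stays in one segment");
```

The resource names in the assertions are made up for the example; a real implementation would apply its own rules to its own search locations.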

Append to §16.3.1 General [support.limits.general]'s Table 35 one additional entry:

Macro name	Value
__cpp_lib_embed	202006L

Append to §19.1 General [utilities.general]'s Table 38 one additional entry:

	Subclause	Header(s)
19.20	Constant Resources	<embed>

Add a new section §19.20 Constant Resources [const.res]:

19.20 Constant Resources [const.res]
19.20.1 In general [const.res.general]

Constant resources allow the implementation to retrieve data from a variety of sources -- including implementation-defined places -- and allows their processing during constant evaluation.

19.20.2 Header <embed> synopsis [embed.syn]

namespace std {
  template <typename T = byte>
  consteval span<const T> embed(
    string_view resource_identifier
  ) noexcept;

  template <typename T = byte>
  consteval span<const T> embed(
    string_view resource_identifier, size_t limit
  ) noexcept;
}

19.20.3 Function template embed [const.embed]

namespace std {
  template <typename T = byte>
  consteval span<const T> embed( string_view resource_identifier ) noexcept;

  template <typename T = byte>
  consteval span<const T> embed( string_view resource_identifier, size_t limit ) noexcept;
}

¹ Mandates: the implementation-defined bit size of the resource is a multiple of sizeof(T) * CHAR_BIT and std::is_trivial_v<T> is true. [ Note— This ensures that no non-trivial destructors need to be run for the implementation-provided static storage duration objects. — end Note ].
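
The two conditions in the Mandates element can be written out as a compile-time predicate. The sketch below is illustrative only: satisfies_embed_mandates, rgba, and guard are hypothetical names, and the resource's bit size is passed in as a plain number because an implementation would determine it in an implementation-defined way.

```cpp
#include <climits>
#include <cstddef>
#include <type_traits>

// Hypothetical helper mirroring the Mandates element; not part of the proposal.
template <typename T>
constexpr bool satisfies_embed_mandates(std::size_t resource_bits) {
    return std::is_trivial_v<T> &&
           resource_bits % (sizeof(T) * CHAR_BIT) == 0;
}

// A trivial aggregate of four bytes.
struct rgba { unsigned char r, g, b, a; };
static_assert(sizeof(rgba) == 4);

// A user-provided destructor makes this type non-trivial.
struct guard { ~guard() {} unsigned char value; };

// A 13-byte resource works as bytes, but does not divide into rgba objects.
static_assert(satisfies_embed_mandates<unsigned char>(13 * CHAR_BIT));
static_assert(!satisfies_embed_mandates<rgba>(13 * CHAR_BIT));
static_assert(satisfies_embed_mandates<rgba>(16 * CHAR_BIT));
static_assert(!satisfies_embed_mandates<guard>(16 * CHAR_BIT));
```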

² Returns: A read-only view to a unique resource identified by the resource_identifier over a contiguous sequence of T objects with static storage duration. The mapping from the contents of the resource to the contiguous sequence of T objects is implementation-defined.

³ Ensures: r.size() <= limit, where r denotes the result of the function call for the second overload.
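
One way to read this postcondition is that the second overload exposes at most limit elements of T, truncating a longer resource. That reading is an assumption of the sketch below, and embedded_element_count is a hypothetical helper, not part of the proposed interface.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: the number of T elements the second overload would
// expose, assuming the implementation truncates the resource to `limit`.
constexpr std::size_t embedded_element_count(std::size_t resource_bytes,
                                             std::size_t element_size,
                                             std::size_t limit) {
    return std::min(resource_bytes / element_size, limit);
}

// A 13-byte resource viewed as bytes with a limit of 4 yields 4 elements.
static_assert(embedded_element_count(13, 1, 4) == 4);
// A limit larger than the resource leaves the size unchanged.
static_assert(embedded_element_count(13, 1, 100) == 13);
// A 16-byte resource viewed as 4-byte elements with a limit of 2.
static_assert(embedded_element_count(16, 4, 2) == 2);
```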

⁴ Remarks: The value of resource_identifier is used to search a sequence of implementation-defined places for a resource identified uniquely by resource_identifier. If the implementation cannot find the resource specified after exhausting the sequence of implementation-defined search locations, the program is ill-formed. The mapping of the resource to the sequence of T is implementation-defined. [ Note— Implementations should provide a mechanism similar but not identical to #include (15.3 [cpp.include]) for finding the specified resource and in coordination with #depend (15.4 [cpp.depend]). — end Note ]

6. Appendix

6.1. Alternative

Other techniques used include pre-processing data, link-time based tooling, and assembly-time runtime loading. They are detailed below, for a complete picture of today’s sad landscape of options.

6.1.1. Pre-Processing Tools Alternative

Run the tool over the data (xxd -i xxd_data.bin > xxd_data.h) to obtain the generated file (xxd_data.h):

unsigned char xxd_data_bin[] = {
  0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x2c, 0x20, 0x57, 0x6f, 0x72, 0x6c, 0x64,
  0x0a
};
unsigned int xxd_data_bin_len = 13;

Compile main.cpp:

#include <cstddef>
#include <iostream>
#include <string_view>

// prefix as constexpr,
// even if it generates some warnings in g++/clang++
constexpr
#include "xxd_data.h"
;

template <typename T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N]) {
    return N;
}

int main() {
    static_assert(xxd_data_bin[0] == 'H');
    static_assert(array_size(xxd_data_bin) == 13);

    std::string_view data_view(
        reinterpret_cast<const char*>(xxd_data_bin),
        array_size(xxd_data_bin));
    std::cout << data_view << std::endl; // Hello, World!
    return 0;
}

Others still use python or other small scripting languages as part of their build process, outputting data in the exact C++ format that they require.

There are problems with the xxd -i or similar tool-based approach. Lexing and parsing data-as-source-code adds an enormous overhead to actually reading and making that data available.

Binary data as C(++) arrays carries the overhead of having to comma-delimit every single byte present. It also requires that the compiler verify every entry in that array is a valid literal or entry according to the C++ language.

This scales poorly with larger files, and build times suffer for any non-trivial binary file, especially when it scales into megabytes in size (e.g., firmware and similar).
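
To give a sense of scale, the expansion can be estimated with simple arithmetic. The model below is an assumption, not a measurement: it supposes that each byte is written as a four-character hex literal such as 0x48, followed by a comma and one whitespace character, as in the xxd -i output shown earlier. The function names are hypothetical.

```cpp
#include <cstddef>

// Assumed model: "0xNN" (4 characters) + ',' + one space or newline.
constexpr std::size_t source_chars_per_byte = 4 + 1 + 1;

// Approximate size of the generated initializer text.
constexpr std::size_t generated_source_chars(std::size_t data_bytes) {
    return data_bytes * source_chars_per_byte;
}

// One literal token per byte, plus a comma token between adjacent literals.
constexpr std::size_t initializer_tokens(std::size_t data_bytes) {
    return data_bytes == 0 ? 0 : 2 * data_bytes - 1;
}

// The 13-byte "Hello, World\n" sample.
static_assert(generated_source_chars(13) == 78);
static_assert(initializer_tokens(13) == 25);

// A 1 MiB binary.
static_assert(generated_source_chars(1024 * 1024) == 6291456);
static_assert(initializer_tokens(1024 * 1024) == 2097151);
```

Under this model a 1 MiB binary turns into roughly 6 MiB of source text and about two million tokens, each of which the compiler has to lex and validate before the data is usable.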

6.1.2. `python` Alternative

Other companies are forced to create their own ad-hoc tools to embed data and files into their C++ code. MongoDB uses a custom python script, just to get their data into C++:

import os
import sys

def jsToHeader(target, source):
    outFile = target
    h = [
        '#include "mongo/base/string_data.h"',
        '#include "mongo/scripting/engine.h"',
        'namespace mongo {',
        'namespace JSFiles{',
    ]
    def lineToChars(s):
        return ','.join(str(ord(c)) for c in (s.rstrip() + '\n')) + ','
    for s in source:
        filename = str(s)
        objname = os.path.split(filename)[1].split('.')[0]
        stringname = '_jscode_raw_' + objname

        h.append('constexpr char ' + stringname + "[] = {")

        with open(filename, 'r') as f:
            for line in f:
                h.append(lineToChars(line))

        h.append("0};")
        # symbols aren’t exported w/o this
        h.append('extern const JSFile %s;' % objname)
        h.append('const JSFile %s = { "%s", StringData(%s, sizeof(%s) - 1) };' %
                 (objname, filename.replace('\\', '/'), stringname, stringname))

    h.append("} // namespace JSFiles")
    h.append("} // namespace mongo")
    h.append("")

    text = '\n'.join(h)

    with open(outFile, 'wb') as out:
        try:
            out.write(text)
        finally:
            out.close()


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print "Must specify [target] [source] "
        sys.exit(1)
    jsToHeader(sys.argv[1], sys.argv[2:])

MongoDB were brave enough to share their code with me and make public the things they have to do: other companies have shared many similar concerns, but do not have the same bravery. We thank MongoDB for sharing.

6.1.3. `ld` Alternative

A full, compilable example (except on Visual C++):

Have a file ld_data.bin with the contents Hello, World!.
Run ld -r -b binary -o ld_data.o ld_data.bin.
Compile the following main.cpp with c++ -std=c++17 ld_data.o main.cpp:

#include <iostream>
#include <string_view>

#ifdef __APPLE__
#include <mach-o/getsect.h>

#define DECLARE_LD(NAME) extern const unsigned char _section$__DATA__##NAME[];
#define LD_NAME(NAME) _section$__DATA__##NAME
#define LD_SIZE(NAME) (getsectbyname("__DATA", "__" #NAME)->size)

#elif (defined __MINGW32__) /* mingw */

#define DECLARE_LD(NAME)                                 \
  extern const unsigned char binary_##NAME##_start[]; \
  extern const unsigned char binary_##NAME##_end[];
#define LD_NAME(NAME) binary_##NAME##_start
#define LD_SIZE(NAME) ((binary_##NAME##_end) - (binary_##NAME##_start))

#else /* gnu/linux ld */

#define DECLARE_LD(NAME)                                  \
  extern const unsigned char _binary_##NAME##_start[]; \
  extern const unsigned char _binary_##NAME##_end[];
#define LD_NAME(NAME) _binary_##NAME##_start
#define LD_SIZE(NAME) ((_binary_##NAME##_end) - (_binary_##NAME##_start))
#endif

DECLARE_LD(ld_data_bin);

int main() {
  // impossible: the contents behind the symbol are only known at link time
  //static_assert(LD_NAME(ld_data_bin)[0] == 'H');
  std::string_view data_view(
    reinterpret_cast<const char*>(LD_NAME(ld_data_bin)),
    LD_SIZE(ld_data_bin)
  );
  std::cout << data_view << std::endl; // Hello, World!
  return 0;
}

This scales a little bit better in terms of raw compilation time but is shockingly OS, vendor and platform specific in ways that novice developers would not be able to handle fully. The macros are required to erase differences, lest subtle differences in name destroy one’s ability to use these macros effectively. We omitted the code for handling VC++ resource files because it is far more verbose than what is present here.

N.B.: Because these declarations are extern, the values in the array cannot be accessed at compilation/translation-time.

7. Acknowledgements

A big thank you to Andrew Tomazos for replying to the author’s e-mails about the prior art. Thank you to Arthur O’Dwyer for providing the author with incredible insight into the Committee’s previous process for how they interpreted the Prior Art.

A special thank you to Agustín Bergé for encouraging the author to talk to the creator of the Prior Art and getting started on this. Thank you to Tom Honermann for direction and insight on how to write a paper and apply for a proposal.

Thank you to Arvid Gerstmann for helping the author understand and use the link-time tools.

Thank you to Tony Van Eerd for valuable advice in improving the main text of this paper.

Thank you to Lilly (Cpplang Slack, @lillypad) for the valuable bikeshed and hole-poking in original designs, alongside Ben Craig who very thoroughly explained his woes when trying to embed large firmware images into a C++ program for deployment into production. Thank you to Elias Kounen and Gabriel Ravier for wording review.

For all this hard work, it is the author’s hope to carry this into C++. It would be the author’s distinct honor to make development cycles easier and better with the programming language we work in and love. ♥

P1040R5: std::embed

Published Proposal, 2020-01-13

Abstract