| Document #: | P3705R2 |
| Date: | 2025-06-18 |
| Project: | Programming Language C++ |
| Audience: | SG-9 Ranges, LEWG |
| Reply-to: | Eddie Nolan <eddiejnolan@gmail.com> |
Consider using std::views::take
to get the first five characters from a string. If we pass a string
literal with five or more characters, it looks okay:
static_assert(std::string{std::from_range, "Brubeck" | std::views::take(5)} == "Brube"); // passes
However, if we pass it a string literal with fewer than five characters, we get in trouble:
// static_assert(std::string{std::from_range, "Dave" | std::views::take(5)} == "Dave"); // fails
using namespace std::string_view_literals;
static_assert(std::string{std::from_range, "Dave" | std::views::take(5)} == "Dave\0"sv); // passes
The null terminator is included in the output here because undecayed string literals are arrays of characters, including the null terminator, and the ranges library treats them like any other array:
#include <algorithm>
#include <array>
#include <type_traits>
static_assert(std::is_same_v<std::remove_reference_t<decltype("foo")>, const char[4]>);
static_assert(std::ranges::equal("foo", std::array{'f', 'o', 'o', '\0'}));
A common workaround is to wrap the string literal in std::string_view.
But this has a performance cost: despite the fact that we only care
about the first five characters, we still need to make a pass through
the entire string to find the null terminator:
constexpr std::string take_five(char const* long_string) {
  std::string_view const long_string_view = long_string; // read all of long_string!
  return std::string{std::from_range, long_string_view | std::views::take(5)};
}
This paper introduces null_sentinel to solve this problem. It ends the range when it encounters a null:
constexpr std::string take_five(char const* long_string) {
  std::ranges::subrange const long_string_range(long_string, std::null_sentinel); // lazy!
  return std::string{std::from_range, long_string_range | std::views::take(5)};
}
It also introduces a null_term CPO for the common case of constructing a subrange like the one in the above example:
constexpr std::string take_five(char const* long_string) {
  return std::string{std::from_range, std::views::null_term(long_string) | std::views::take(5)};
}
The sentinel type matches any iterator position it at which *it is equal to a value-initialized object of type iter_value_t<I>. This works for null-terminated strings, but can also serve as the sentinel for any range terminated by a value-initialized value.
For example, null_term can be
used to iterate argv and
environ. The following program
demonstrates this:
#include <print>
#include <ranges>
extern char** environ;
int main(int argc, char** argv) {
  std::println("argv: {}", std::views::null_term(argv));
  std::println("environ: {}", std::views::null_term(environ));
}
Output:
$ env --ignore-environment FOO=bar BAZ=quux ./test corge
argv: ["./test", "corge"]
environ: ["FOO=bar", "BAZ=quux"]
c_str
range-v3 provides similar functionality to null_term using the name c_str. Unlike null_term, c_str only works with character types: char, wchar_t, and so forth.
The name c_str would not make
sense here because, for example, std::views::c_str(argv)
is nonsensical and misleading as to what the range is doing (terminating
the sequence of strings rather than terminating the strings
themselves).
zstring_sentinel and zstring
An example from Barry Revzin’s [P2210R2] provides similar functionality under the names zstring_sentinel and zstring. I dislike these names for the same reasons as c_str: argv is not a “string.” Additionally, they invite confusion with [P3655R1?] zstring_view.
null_sentinel and null_term
The advantages of these names are that they are terse, they do not include the word “string,” and they refer to the value-initialized iter_value_t value as “null,” which reflects the language we use in the standard (“null terminated byte string,” “null terminated multibyte string,” “null terminated character sequence,” “null pointer”).
zero_termination_sentinel
This is used by think-cell’s implementation of this utility. I like it slightly less because it’s more verbose, but I also consider it acceptable.
This is not an exhaustive list.
default-initializable-and-equality-comparable-iter-value
namespace std {
  template<class I>
    concept default-initializable-and-equality-comparable-iter-value =
      default_initializable<iter_value_t<I>> &&
      equality_comparable_with<iter_reference_t<I>, iter_value_t<I>>; // exposition only
}
namespace std {
  struct null_sentinel_t {
    template<input_iterator I>
      requires default-initializable-and-equality-comparable-iter-value<I>
    friend constexpr bool operator==(I const& it, null_sentinel_t);
  };
  inline constexpr null_sentinel_t null_sentinel;
}
template<input_iterator I>
  requires default-initializable-and-equality-comparable-iter-value<I>
friend constexpr bool operator==(I const& it, null_sentinel_t);
Effects: Equivalent to return *it == iter_value_t<I>();.
null_term CPO
namespace std::views {
  inline constexpr unspecified null_term;
}
The name null_term denotes a customization point object ([customization.point.object]). Given a subexpression E, the expression null_term(E) is expression-equivalent to ranges::subrange(E, null_sentinel).
Header <version> synopsis [version.syn]
#define __cpp_lib_ranges_null_sentinel 2026XXL
This proposal was originally written by Zach Laine as part of [P2728R0], then updated and split out by Eddie Nolan.
null_sentinel_t to a non-Unicode-specific facility.
null_sentinel_t back to being Unicode-specific.
null_sentinel_t to std, remove its base member function, and make it useful for more than just pointers, based on SG-9 guidance.
null_sentinel_t.
null_sentinel_t causing it not to satisfy sentinel_for by changing its operator== to return bool.
null_sentinel_t where it did not support non-copyable input iterators by having operator== take input iterators by reference.
null_sentinel and null_term into P3705
unchecked_take_before design alternative
unchecked_take_before and pass-by-value design alternatives
null_term to namespace std::views
iter_value_t<I>() over iter_value_t<I>{} to avoid initializer_list interaction
POLL: We would like something like the proposed null_sentinel regardless of the addition of a hypothetical unchecked_take_before view that may come in the future.
Outcome: No objection to unanimous consent
POLL: Always pass the iterator by reference to operator==,
move null_term to namespace
std::views,
add a discussion about alternative names, and forward P3705R1 with those
changes to LEWG for inclusion in C++29.
| SF | F | N | A | SA |
|---|---|---|---|---|
| 9 | 7 | 0 | 0 | 0 |
# Of Authors: 1
Author’s Position: SF
Attendance: 18 (2 abstentions)
Outcome: Unanimous consent
POLL: Move null_sentinel_t to std:: namespace
| SF | F | N | A | SA |
|---|---|---|---|---|
| 1 | 3 | 1 | 0 | 0 |
# Of Authors: 1
Author’s Position: F
Attendance: 9 (4 abstentions)
Outcome: Consensus in favor
POLL: Remove null_sentinel_t::base member function from the proposal
| SF | F | N | A | SA |
|---|---|---|---|---|
| 0 | 4 | 1 | 0 | 0 |
# Of Authors: 1
Author’s Position: F
Attendance: 8 (3 abstentions)
Outcome: Consensus in favor
POLL: Separate std::null_sentinel_t
from P2728 into a separate paper for SG9 and LEWG; SG16 does not need to
see it again.
| SF | F | N | A | SA |
|---|---|---|---|---|
| 1 | 1 | 4 | 2 | 1 |
Attendance: 12 (3 abstentions)
Outcome: No consensus; author’s discretion for how to continue.