Document number | P3211R1 |
Date | 2024-07-04 |
Audience | LEWG, SG9 (Ranges) |
Reply-to | Hewill Kang <hewillk@gmail.com> |
views::flat_map
We propose views::flat_map
, a range adaptor that applies a function returning a range for each element,
then flattens the result.
This pattern, commonly known as flat mapping, is widespread in functional programming and data processing.
Providing it as a dedicated view improves readability and expressiveness, and also opens opportunities for
optimization in lazy evaluation contexts.
Noted that this is ranked as Tier 1 in P2760.
Initial revision.
Rename views::transform_join
to views::flat_map
.
Introduce new flat_map_view
class.
Discuss related optimization.
Mapping each element to a subrange and flattening the results into a single range is common in programming tasks like data processing, string handling, and range composition.
This appears frequently enough in practice to justify direct support in the standard ranges library.
Providing views::flat_map
encourages clearer, more maintainable code and lays the groundwork for
potential optimizations specific to this use case. While the same behavior can be achieved through
composition such as views::transform
and then views::join
, a dedicated view enables the
library to make stronger semantic guarantees and allows more efficient
handling of transform function results, particularly for expensive or lvalue-producing mappers.
This pattern arises naturally in code like:
auto all_courses = students | std::views::flat_map([](const Student& s) { return std::views::all(s.courses); });
views::flat_map
instead of views::transform_join
?
The name transform_join
describes how to combine transform
and join
, but it
does not show the real concept. Users may see it as two steps, not as a single map and flatten operation;
The name flat_map
is common in many languages and clearly shows both mapping and flattening. This makes
it easier to find, less confusing, and matches what users expect.
The following table shows how this operation is named in various languages, all of which converge on flat-map:
Language | Function Name | Example | Differences in Semantics |
---|---|---|---|
Haskell | concatMap (>>=) |
concatMap f [1, 2, 3] |
Purely functional; >>= is monadic bind, generalizing flat mapping beyond lists |
C# (LINQ) | SelectMany |
source.SelectMany(x => ...) |
Applies to all IEnumerable; default for flattening nested queries in LINQ syntax |
Python | itertools.chain.from_iterable(map(...)) |
list(chain.from_iterable(map(f, data))) |
No built-in flatMap ; semantics are manual; relies on strict evaluation and eager lists |
Java | flatMap |
stream.flatMap(f) |
Requires the mapper function to return a Stream; lazy evaluation is enforced |
JavaScript | flatMap |
array.flatMap(x => [x, x + 1]) |
One-level flattening only; not lazy; limited to arrays |
Kotlin | flatMap |
list.flatMap { listOf(it, it * 2) } |
Similar to Java, but cleaner syntax; strict (not lazy) evaluation for standard collections |
Rust | flat_map |
iter.flat_map(|x| some_iter(x)) |
Applies to iterators; lazy by default; consumes the original iterator |
Swift | flatMap |
array.flatMap { [$0, $0 * 2] } |
Semantics changed in Swift 4 - used to also remove optionals; now strictly flatten + map |
Given this widespread usage, flat_map
is the most appropriate and intuitive name for the proposed view.
It reflects established terminology, avoids over-specifying implementation details, and aligns well with programmer
expectations.
It is worth noting that the range/v3
uses the name views::for_each
for this operation.
Because
such naming deviates further from common terminology and could cause additional confusion, it is not considered a
suitable option.
Additionally, C++23 introduced a container named std::flat_map
, which is a associative
container that stores key-value pairs.
This is conceptually quite different from the proposed views::flat_map
, which is a range adaptor that
composes a map-then-flatten operation. Clarifying this distinction helps avoid confusion between the two.
Finally, while flat_map
is preferred for its clarity and alignment with established usage, other naming
alternatives are acceptable as long as they convey the correct semantics.
views::transform(f) | views::join
?
Although flat_map
can be expressed as a composition of transform
and
join
, a dedicated flat_map_view
provides clearer semantics and more efficient
behavior in practice.
The first minor issue with composition is that it interferes with base()
. When writing
transform(f) | join
, the result holds a transform_view
internally, so calling
base()
does not give access to the original range, which can be surprising and inconvenient in generic
contexts.
Second, a standalone view can manage both the outer iterator and the inner range. It can cache the transform result and avoid repeated calls, which is important when the function returns an lvalue range.
In the composed form,
join_view
has no knowledge
of how the
inner range was produced - it treats each element as if it were obtained by dereferencing the outer iterator. This
works fine when the base is something like vector of vectors, where dereferencing is
cheap
and deterministic. But with transform | join
, each inner range is the result of invoking a user-defined
transformation, which could be expensive to compute even if the result is a stable lvalue. For example:
auto outer = views::iota(1, 5); auto inner = views::iota(1, 10); auto flatten = views::flat_map(outer, [&](int i) -> auto& { return inner; }); println("{}", flatten); // calls transform function 40 times
While cache_latest
can mitigate redundant evaluation, it forces the result to model only an
input_range
, regardless of the actual capabilities of the underlying ranges. It also adds an extra
layer of adaptor composition and complexity. In contrast, a dedicated flat_map_view
can handle this
caching internally while preserving the strongest valid iterator category.
A dedicated view also avoids unnecessary template instantiations. Since a flat-mapped range can never be more than
bidirectional, there is no benefit in preserving random-access capabilities through transform_view
.
However, when using adaptor composition, those capabilities may still be instantiated - even though
join_view
will downgrade them - leading to code bloat and slower compilation.
In short, using composition is flexible but less efficient and can be awkward. flat_map_view
avoids
repeated evaluations, keeps the best iterator type, and makes the view stack simpler.
Although paper
P2760 suggests there is little benefit to providing a
separat view as state: "Importantly, there really isn't much benefit to providing a bespoke
transform_join
as
opposed to simply implementing it in terms of these two existing adaptors.", practical usage has revealed that
the composed form introduces subtle inefficiencies, especially in the presence of expensive transformation
functions, and not easy to eliminate without internal knowledge of how adaptors interact.
When the inner range is a reference, we can avoid repeated invocations of the mapping function by storing a pointer
to the result instead. This allows us to simply dereference the pointer whenever the inner range is needed, rather
than invoking the transformation again — a technique somewhat analogous to cache_latest_view
. However, it's important to
note that this pointer must be stored inside the iterator, not the flat_map_view
itself, in order to ensure correct
behavior in multi-pass scenarios where multiple iterators may coexist and advance independently.
For prvalue inner ranges, the situation is somewhat more tricky, especially for immovable ranges.
For join_view
, it underlying range's iterator already returns the inner
range.
This property makes it directly suitable for non-propagating-cache::emplace-deref
, which
caches the inner range by
dereferencing the outer iterator.
In contrast, flat_map_view
obtains the inner range by applying a mapping function to the element
referenced by the
outer iterator, rather than by directly dereferencing the outer iterator to yield the inner range. As a result, it
cannot straightforwardly use emplace-deref
to cache the inner range. Instead, it
must store the
entire inner range object returned by the mapping function within it to properly extend its lifetime.
One possible approach to mitigate this is to define a proxy iterator that models invoking the mapping function upon
dereference. This proxy can then be used with emplace-deref
to cache the mapped inner range
indirectly, albeit with
additional complexity.
At first glance, flat_map_view
might not have a stashing issue as discussed in
P2770, since its inner range is always produced by applying a mapping function, rather than being a subrange
stored within the outer range.
However, in particular, the mapping function can certainly return a range whose lifetime is tied to the
outer
iterator -
for example, applying views::split('x')
to strings obtained from an
istream_iterator<string>
— then
flat_map_view
need cache the current outer iterator to ensures the inner range remains
valid
during iteration. Therefore, flat_map_view
follows the design pattern of that paper.
The author implemented views::flat_map
based on libstdc++, see here.
The implementation supports input_range
,
forward_range
, and
bidirectional_range
, and demonstrates correct caching behavior for both lvalue and prvalue immovable
mapped
ranges.
This wording is relative to Latest Working Draft .
Add a new feature-test macro to 17.3.2 [version.syn]:
#define __cpp_lib_ranges_flat_map 2025XXL // freestanding, also in <ranges>
Modify 25.2 [ranges.syn], Header
<ranges>
synopsis, as indicated:
// mostly freestanding #include <compare> // see [compare.syn] #include <initializer_list> // see [initializer.list.syn] #include <iterator> // see [iterator.synopsis] namespace std::ranges { […] namespace views { inline constexpr unspecified join_with = unspecified; } // [range.flat.map], flat map view template<input_range V, move_constructible F> requires see below class flat_map_view; namespace views { inline constexpr unspecified flat_map = unspecified; } […] }
Add 25.7.? Flat map view [range.flat.map] after 25.7.15 [range.join.with] as indicated:
-1- A flat-map view maps each element to a range and flattens the results into a view.
-2- The name views::flat_map
denotes a range adaptor object ([range.adaptor.object]). Given subexpressions
E
and F
,
the expression views::flat_map(E, F)
is expression-equivalent to
flat_map_view(E, F)
.
-3- [Example 1:
vector ints{0, 1, 2}; for (auto elem : ints | views::flat_map([](int i) { return views::repeat(i, 3); })) cout << elem << ' '; // prints 0 0 0 1 1 1 2 2 2— end example]
[25.7.?.2] Class template flat_map_view
[range.flat.map]
namespace std::ranges { template<class R, class F> concept flat-mappable = input_range<R> && // exposition only regular_invocable<F&, range_reference_t<R>> && input_range<invoke_result_t<F&, range_reference_t<R>>>; template<input_range V, move_constructible F> requires view<V> && is_object_v<F> && flat-mappable<V, F> class flat_map_view : public view_interface<flat_map_view<V, F>> { private: using MappedRng = invoke_result_t<F&, range_reference_t<V>>; // exposition only // [range.flat.map.iterator], class template flat_map_view::iterator template<bool Const> struct iterator; // exposition only // [range.flat.map.sentinel], class template flat_map_view::sentinel template<bool Const> struct sentinel; // exposition only V base_ = V(); // exposition only movable-box<F> fun_; // exposition only non-propagating-cache<iterator_t<V>> current_; // exposition only, present only // when !forward_range<V> non-propagating-cache<MappedRng> mapped_; // exposition only, present only // if is_reference_v<MappedRng> is false public: flat_map_view() requires default_initializable<V> && default_initializable<F> = default; constexpr explicit flat_map_view(V base, F fun); constexpr V base() const & requires copy_constructible<V> { return base_; } constexpr V base() && { return std::move(base_); } constexpr auto begin() { if constexpr (forward_range<V>) { constexpr bool use_const = simple-view<V> && is_reference_v<MappedRng>; return iterator<use_const>{*this, ranges::begin(base_)}; } else { current_ = ranges::begin(base_); return iterator<false>{*this}; } } constexpr auto begin() const requires forward_range<const V> && flat-mappable<const V, const F> && is_reference_v<invoke_result_t<const F&, range_reference_t<const V>>> { return iterator<true>{*this, ranges::begin(base_)}; } constexpr auto end() { if constexpr (forward_range<V> && is_reference_v<MappedRng> && forward_range<MappedRng> && common_range<V> && common_range<MappedRng>) return iterator<simple-view<V>>{*this, ranges::end(base_)}; else return sentinel<simple-view<V>>{*this}; } constexpr auto end() const requires forward_range<const V> && flat-mappable<const V, const F> && is_reference_v<invoke_result_t<const F&, range_reference_t<const V>>> { using MappedConstRng = invoke_result_t<const F&, range_reference_t<const V>>; if constexpr (forward_range<MappedConstRng> && common_range<const V> && common_range<MappedConstRng>) return iterator<true>{*this, ranges::end(base_)}; else return sentinel<true>{*this}; } }; template<class R, class F> flat_map_view(R&&, F) -> flat_map_view<views::all_t<R>, F>; }
constexpr explicit flat_map_view(V base, F fun);
-1- Effects: Initializes
base_
withstd::move(base)
, andfun_
withstd::move(fun)
.
[25.7.?.3] Class template flat_map_view::iterator
[range.flat.map.iterator]
namespace std::ranges { template<input_range V, move_constructible F> requires view<V> && is_object_v<F> && flat-mappable<V, F> template<bool Const> struct flat_map_view<V, F>::iterator { private: using Parent = maybe-const<Const, flat_map_view>; // exposition only using Base = maybe-const<Const, V>; // exposition only using MappedBase = // exposition only invoke_result_t<maybe-const<Const, F>&, range_reference_t<Base>>; using BaseIter = iterator_t<Base>; // exposition only using MappedIter = iterator_t<MappedBase>; // exposition only static constexpr bool mapped-is-glvalue = // exposition only is_reference_v<MappedBase>; BaseIter current_ = BaseIter(); // exposition only, present only // if Base models forward_range optional<MappedIter> mapped_it_; // exposition only Parent* parent_ = nullptr; // exposition only add_pointer_t<MappedBase> mapped_ = nullptr; // exposition only, present only // if mapped-is-glvalue is true constexpr BaseIter& current(); // exposition only constexpr const BaseIter& current() const; // exposition only constexpr auto& update-mapped(); // exposition only constexpr auto& get-mapped(); // exposition only constexpr void satisfy(); // exposition only constexpr iterator(Parent& parent, BaseIter current) requires forward_range<Base>; // exposition only constexpr explicit iterator(Parent& parent) requires (!forward_range<Base>); // exposition only public: using iterator_concept = see below; using iterator_category = see below; // not always present using value_type = range_value_t<MappedBase>; using difference_type = see below; iterator() = default; template<class MappedRng = invoke_result_t<F&, range_reference_t<V>>> constexpr iterator(iterator<!Const> i) requires Const && is_reference_v<MappedRng> && convertible_to<iterator_t<V>, BaseIter> && convertible_to<iterator_t<MappedRng>, MappedIter> && convertible_to<add_pointer_t<MappedRng>, add_pointer_t<MappedBase>>; constexpr decltype(auto) operator*() const { return **mapped_it_; } constexpr MappedIter operator->() const requires has-arrow<MappedIter> && copyable<MappedIter>; constexpr iterator& operator++(); constexpr void operator++(int); constexpr iterator operator++(int) requires mapped-is-glvalue && forward_range<Base> && forward_range<MappedBase>; constexpr iterator& operator--() requires mapped-is-glvalue && bidirectional_range<Base> && bidirectional-common<MappedBase>; constexpr iterator operator--(int) requires mapped-is-glvalue && bidirectional_range<Base> && bidirectional-common<MappedBase>; friend constexpr bool operator==(const iterator& x, const iterator& y) requires mapped-is-glvalue && forward_range<Base> && equality_comparable<iterator_t<MappedBase>>; friend constexpr decltype(auto) iter_move(const iterator& i) noexcept(noexcept(ranges::iter_move(*i.mapped_it_))) { return ranges::iter_move(*i.mapped_it_); } friend constexpr void iter_swap(const iterator& x, const iterator& y) noexcept(noexcept(ranges::iter_swap(*x.mapped_it_, *y.mapped_it_))) requires indirectly_swappable<MappedIter>; }; }
-1-
iterator::iterator_concept
is defined as follows:
(1.1) — If
mapped-is-glvalue
is true,Base
modelsbidirectional_range
, andMappedBase
modelsbidirectional-common
, theniterator_concept
denotesbidirectional_iterator_tag
.(1.2) — Otherwise, if
mapped-is-glvalue
istrue
andBase
andMappedBase
each modelforward_range
, theniterator_concept
denotesforward_iterator_tag
.(1.3) — Otherwise,
iterator_concept
denotesinput_iterator_tag
.
-2- The member typedef-name
iterator_category
is defined if and only ifmapped-is-glvalue
istrue
,Base
modelsforward_range
, andMappedBase
modelsforward_range
. In that case,iterator::iterator_category
is defined as follows:
(2.1) — Let BASEC denote
iterator_traits<iterator_t<Base>>::iterator_category
, and let MAPPEDC denoteiterator_traits<iterator_t<MappedBase>>::iterator_category
.(2.2) — If BASEC and MAPPEDC each model
derived_from<bidirectional_iterator_tag>
andMappedBase
modelscommon_range
,iterator_category
denotesbidirectional_iterator_tag
.(2.3) — Otherwise, if BASEC and MAPPEDC each model
derived_from<forward_iterator_tag>
,iterator_category
denotesforward_iterator_tag
.(2.4) — Otherwise,
iterator_category
denotesinput_iterator_tag
.
-3-
iterator::difference_type
denotes the type:common_type_t<range_difference_t<Base>, range_difference_t<MappedBase>>
-4-
flat_map_view
iterators use thesatisfy
function to skip over empty mapped ranges.
constexpr BaseIter& current(); constexpr const BaseIter& current() const;
-5- Returns:
current_
ifBase
modelsforward_range
; otherwise,*parent_->current_
.
constexpr auto& update-mapped();
-6- Effects:
(6.1) — If
mapped-is-glvalue
istrue
, equivalent to:mapped_ = addressof(as-lvalue(std::invoke(*parent_->fun_, *current()))); return *mapped_;(6.1) — Otherwise, equivalent to:
return parent_->mapped_.emplace-def(map-iterator(*parent_->fun_, current()));where
map-iterator
is the exposition-only class:class map-iterator { maybe-const<Const, F>& fun_; const BaseIter& current_; constexpr map-iterator(maybe-const<Const, F>& fun, const BaseIter& current) : fun_(fun), current_(current) {} public: constexpr auto operator*() const noexcept(noexcept(std::invoke(fun_, *current_))) { return std::invoke(fun_, *current_); } };
constexpr auto& get-mapped();
-7- Effects: Equivalent to:
if constexpr (mapped-is-glvalue) return *mapped_; else return *parent_->mapped_;
constexpr void satisfy();
-8- Effects: Equivalent to:
for (; current() != ranges::end(parent_->base_); ++current()) { auto& mapped = update-mapped(); mapped_it_ = ranges::begin(mapped); if (*mapped_it_ != ranges::end(mapped)) return; } if constexpr (mapped-is-glvalue) mapped_it_.reset();
constexpr iterator(Parent& parent, BaseIter current) requires forward_range<Base>;
-9- Effects: Initializes
current_
withstd::move(current)
andparent_
withaddressof(parent)
; then callssatisfy()
.
constexpr explicit iterator(Parent& parent) requires (!forward_range<Base>);
-10- Effects: Initializes
parent_
withaddressof(parent)
; then callssatisfy()
.
template<class MappedRng = invoke_result_t<F&, range_reference_t<V>>> constexpr iterator(iterator<!Const> i) requires Const && is_reference_v<MappedRng> && convertible_to<iterator_t<V>, BaseIter> && convertible_to<iterator_t<MappedRng>, MappedIter> && convertible_to<add_pointer_t<MappedRng>, add_pointer_t<MappedBase>>;
-11- Effects: Initializes
current_
withstd::move(i.current_)
,mapped_it_
withstd::move(i.mapped_it_)
,parent_
withi.parent_
, andmapped_
withi.mapped_
.-12- [Note 1:
Const
can only betrue
whenBase
modelsforward_range
. — end note]
constexpr MappedIter operator->() const requires has-arrow<MappedIter> && copyable<MappedIter>;
-13- Effects: Equivalent to:
return *mapped_it_;
constexpr iterator& operator++();
-14- Effects: Equivalent to:
if (++*mapped_it_ == ranges::end(get-mapped())) { ++current(); satisfy(); } return *this;
constexpr void operator++(int);
-15- Effects: Equivalent to:
++*this
.
constexpr iterator operator++(int) requires mapped-is-glvalue && forward_range<Base> && forward_range<MappedBase>;
-16- Effects: Equivalent to:
auto tmp = *this; ++*this; return tmp;
constexpr iterator& operator--() requires mapped-is-glvalue && bidirectional_range<Base> && bidirectional-common<MappedBase>;
-17- Effects: Equivalent to:
if (current_ == ranges::end(parent_->base_)) { --current_; mapped_it_ = ranges::end(update-mapped()); } while (mapped_it_ == ranges::begin(get-mapped())) { --current_; mapped_it_ = ranges::end(update-mapped()); } --*mapped_it_; return *this;
constexpr iterator operator--(int) requires mapped-is-glvalue && bidirectional_range<Base> && bidirectional-common<MappedBase>;
-18- Effects: Equivalent to:
auto tmp = *this; --*this; return tmp;
friend constexpr bool operator==(const iterator& x, const iterator& y) requires mapped-is-glvalue && forward_range<Base> && equality_comparable<iterator_t<MappedBase>>;
-19- Effects: Equivalent to:
return x.current_ == y.current_ && x.mapped_it_ == y.mapped_it_;
friend constexpr void iter_swap(const iterator& x, const iterator& y) noexcept(noexcept(ranges::iter_swap(*x.mapped_it_, *y.mapped_it_))) requires indirectly_swappable<MappedIter>;
-20- Effects: Equivalent to:
ranges::iter_swap(*x.mapped_it_, *y.mapped_it_);
[25.7.?.3] Class template flat_map_view::sentinel
[range.flat.map.sentinel]
namespace std::ranges { template<input_range V, move_constructible F> requires view<V> && is_object_v<F> && flat-mappable<V, F> template<bool Const> struct flat_map_view<V, F>::sentinel { private: using Parent = maybe-const<Const, flat_map_view>; // exposition only using Base = maybe-const<Const, V>; // exposition only sentinel_t<Base> end_ = sentinel_t<Base>(); // exposition only public: sentinel() = default; constexpr explicit sentinel(Parent& parent); constexpr sentinel(sentinel<!Const> s) requires Const && convertible_to<sentinel_t<V>, sentinel_t<Base>>; template<bool OtherConst> requires sentinel_for<sentinel_t<Base>, iterator_t<maybe-const<OtherConst, V>>> friend constexpr bool operator==(const iterator<OtherConst>& x, const sentinel& y); }; }
constexpr explicit sentinel(Parent& parent);
-1- Effects: Initializes
end_
withranges::end(parent.base_)
.
constexpr sentinel(sentinel<!Const> s) requires Const && convertible_to<sentinel_t<V>, sentinel_t<Base>>;
-2- Effects: Initializes
end_
withstd::move(s.end_)
.
template<bool OtherConst> requires sentinel_for<sentinel_t<Base>, iterator_t<maybe-const<OtherConst, V>>> friend constexpr bool operator==(const iterator<OtherConst>& x, const sentinel& y);
-3- Effects: Equivalent to:
return x.current() == y.end_;