ISO/IEC JTC1 SC22 WG21 N3350=12-0040

Date:

Jeffrey Yasskin <jyasskin@google.com>

Source at http://code.google.com/p/cxx1y-range/

A minimal std::range<Iter>

Overview

Ranges are ubiquitous in programs that use STL algorithms, but they're always written as adjacent iterator arguments, which makes them hard to use. Several groups have proposed a new set of range concepts that would allow ranges to be passed around as single objects. Two notable examples are the Boost.Range library and Andrei Alexandrescu's range concepts in D.

This paper, and the library at http://code.google.com/p/cxx1y-range/, attempt to take these ideas and produce a minimal library that's useful today and is suitable for standardization.

Changes from existing libraries

This library makes certain changes from the existing range libraries:

Boost's range library focuses on defining a set of Range concepts that allow Containers to be Ranges. Because Containers are heavy-weight, this forces Boost to use references for all of their uses of Ranges, including Ranges that are captured by Range Adapters. This worked fine in C++98, where users couldn't name the types of the Range Adapters in order to declare local variables, but with C++11's auto keyword, users can now save adapted ranges. Since Boost's adapted ranges capture references to their arguments, which can be temporaries, this is no longer a safe design.

This paper makes the opposite choice: it defines a concrete type and no concepts, and the concrete type is a lightweight view into a container. This implies that algorithms can't treat Ranges and Containers interchangeably, but there are ways around this, which I explore below. Of the Boost types, this paper's std::range is most like boost::iterator_range.

Andrei Alexandrescu's range concepts provide a complete replacement for iterators. However, they require that the whole machinery be adopted at once, which isn't appropriate for a widely-used language like C++. This paper's std::range allows users to program against the Range concepts while still accepting data from iterator-oriented containers.

This paper's range<> omits Alexandrescu's ForwardRange.save() operation, on the theory that it's just a type assertion for most ForwardRanges (since they provide a full-featured copy operation), so most algorithms would forget to call save(). This would introduce bugs when those algorithms were used with the rare ForwardRange that could take advantage of it to save work on plain copies. It's better to omit the optimization than to lay traps.

Like the D ranges, but unlike the STL's iterators, this paper's range uses the same names for O(1) and O(N) operations. Some reviewers found the O(N) names (advance and shorten) confusing, and with static_assert(), users can now check the category explicitly if they're concerned. The one operation we didn't make work for all categories was operator[] on the theory that indexing is much more likely to indicate a programmer mistake than other operations.

This proposal allows negative indices for indexing operations, following the lead of several scripting languages. Neither of the earlier range implementations does this for arbitrary ranges, but D does allow users to offset from the end of a random-access range.

How to write algorithms that use ranges

As mentioned above, you can't simply write an algorithm that takes either Containers or std::range<>s and treats them interchangeably:

template<typename T>
void my_algorithm(T argument) {  // Copies any containers.
  while (!argument.empty()) {
    process(argument.front());
    // Fails on vectors; destructive on deques; fine on ranges:
    argument.pop_front();
  }
}

The limits of template argument deduction make it inconvenient to take a std::range explicitly:

template<typename Iter>
void my_algorithm(std::range<Iter> argument) {
  while (!argument.empty()) {
    process(argument.front());
    argument.pop_front();
  }
}

void user() {
  ContainerT my_container;
  // Fails! std::range<ContainerT::iterator> has an implicit
  // conversion from ContainerT, but because the type of the
  // explicit argument doesn't match "std::range<*>", the
  // compiler can't deduce the template argument.
  my_algorithm(my_container);

  // Works, but verbose:
  my_algorithm(std::make_range(my_container));
}

The best option seems to be for algorithms that want to take either containers or ranges, and use ranges internally, to take their parameter by T& or const T& and use make_range() to build a range out of it explicitly:

template<typename T>
void partial_sum(T& container) {
  using std::make_range;
  auto range = make_range(container);
  typename T::value_type sum = 0;
  // One would actually use a range-based for loop
  // in this case, but you get the idea:
  for (; !range.empty(); range.pop_front()) {
    sum += range.front();
    range.front() = sum;
  }
}

void user1() {
  std::vector<int> v(5, 2);
  partial_sum(v);
  assert(v[3] == 8);
}

By storing the result of make_range() in an auto variable, we allow authors of future range types to overload it as the identity operation on types that are already ranges.

The one case where taking std::range explicitly seems to make sense is range<T*>. If an algorithm needs contiguous iterators, this can allow its authors to avoid making it a template:

void my_write(int fd, std::range<const char*> buffer) {
  write(fd, buffer.begin(), buffer.size());
}

void user2() {
  std::string s("Hello world");
  my_write(1, s);
}

std::range

A std::range<Iterator> represents a half-open iterator range built from two iterators, 'begin', and 'end'. If end is not reachable from begin, the behavior is undefined.

The mutability of elements of the range is controlled by the Iterator argument. Instantiate range<Foo::iterator> or range<T*>, or call make_range(non_const_container), and you get a mutable range. Instantiate range<Foo::const_iterator> or range<const T*>, or call make_range(const_container), and you get a constant range.

Todo

Inherit from std::pair<Iterator, Iterator>?

Todo

This interface contains some functions that could be provided as free algorithms rather than member functions, and all of the pop_*() functions could be replaced by slice() at the cost of some extra iterator copies. This makes them more awkward to use, but makes it easier for users to write their own types that follow the same interface. On the other hand, a range_facade could be provided to help users write new ranges, and it could provide the members. Such functions are marked with a note in their documentation. (Of course, all of these member functions could be provided as free functions using the iterator access methods, but one goal here is to allow people to program without touching iterators at all.)

Class definition

namespace std {
  template<typename Iterator>
  class range {
    // types
    typedef iterator_traits< Iterator >::iterator_category iterator_category;
    typedef iterator_traits< Iterator >::value_type value_type;
    typedef iterator_traits< Iterator >::difference_type difference_type;
    typedef iterator_traits< Iterator >::reference reference;
    typedef iterator_traits< Iterator >::pointer pointer;

    // constructors
    range();
    constexpr range(Iterator begin, Iterator end);
    template<typename R>
    constexpr range(R && r);
    template<typename R>
    constexpr range(R && r);

    // iterator access
    constexpr Iterator begin() const;
    constexpr Iterator end() const;

    // element access
    constexpr reference front() const;
    constexpr reference back() const;
    constexpr reference operator[](difference_type index) const;

    // size
    constexpr bool empty() const;
    constexpr difference_type size() const;

    // traversal from the beginning of the range
    void pop_front();
    void pop_front(difference_type n);
    void pop_front_upto(difference_type n);

    // traversal from the end of the range
    void pop_back();
    void pop_back(difference_type n);
    void pop_back_upto(difference_type n);

    // creating derived ranges
    pair< range, range > split(difference_type index) const;
    range slice(difference_type start, difference_type stop) const;
    range slice(difference_type start) const;
  };

  // deducing constructor wrappers
  template<typename Iterator>
  constexpr range< Iterator > make_range(Iterator begin, Iterator end);
  template<typename Range>
  constexpr auto make_range(Range && r) -> range< decltype(begin(r))>;
  template<typename Range>
  constexpr auto make_ptr_range(Range && r) -> range< decltype(&*begin(r))>;
}

Member semantics

deducing constructor wrappers

These functions do the same thing as the constructor with the same signature. They just allow users to avoid writing the iterator type.

template<typename Iterator> constexpr range< Iterator > make_range(Iterator begin, Iterator end);
Todo

I'd like to define a make_range taking a single iterator argument representing the beginning of a range that ends with a default-constructed Iterator. This would help with using iterators like istream_iterator. However, using just make_range() could be confusing and lead to people writing incorrect ranges of more common iterators. Is there a better name?

template<typename Range> constexpr auto make_range(Range && r) -> range< decltype(begin(r))>;
Participates in overload resolution if:

begin(r) and end(r) return the same type.

template<typename Range> constexpr auto make_ptr_range(Range && r) -> range< decltype(&*begin(r))>;
Participates in overload resolution if:
  • begin(r) and end(r) return the same type,
  • that type satisfies the invariant that &*(i + N) == (&*i) + N, and
  • &*begin(r) has a pointer type.

types

typedef iterator_traits< Iterator >::iterator_category iterator_category;

The iterator category of Iterator.

Todo

Consider defining range categories. If they don't add anything over the corresponding iterator categories, then they're probably not worth defining.

typedef iterator_traits< Iterator >::value_type value_type;

The type of elements of the range. Not cv-qualified.

typedef iterator_traits< Iterator >::difference_type difference_type;

The type of the size of the range and offsets within the range.

typedef iterator_traits< Iterator >::reference reference;

The return type of element access methods: front(), back(), etc.

constructors

range();

Creates a range of default-constructed (not value-initialized) iterators. For most Iterator types, this will be an invalid range.

constexpr range(Iterator begin, Iterator end);

Requires:

end is reachable from begin.

Postconditions:

this->begin() == begin && this->end() == end

template<typename R> constexpr range(R && r);

Participates in overload resolution if:
  • Iterator is not a pointer type,
  • begin(r) and end(r) return the same type, and
  • that type is convertible to Iterator.
Todo

std::begin and std::end are overloaded between T& and const T&, which means that if a container has only a non-const begin or end method, then it's ill-formed to pass an rvalue to the free function. To avoid that problem, we don't use std::forward<> here, so begin() and end() are always called with an lvalue. Another option would be to insist that rvalue arguments to range() must have const begin() and end() methods.

template<typename R> constexpr range(R && r);

This constructor creates a range<T*> from any range with contiguous iterators. Because dereferencing a past-the-end iterator can be undefined behavior, empty ranges get initialized with nullptr rather than &*begin().

Participates in overload resolution if:
  • Iterator is a pointer type T*,
  • begin(r) and end(r) return the same type,
  • elements i of that type satisfy the invariant &*(i + N) == (&*i) + N, and
  • The result of &*begin() is convertible to T* using only qualification conversions [conv.qual] (since pointer conversions stop the pointer from pointing to an array element).
Todo

The &*(i + N) == (&*i) + N invariant is currently impossible to check for user-defined types. We need a contiguous_iterator_tag to let users assert it.

element access

constexpr reference front() const;

Complexity:

O(1)

Requires:

!empty()

Returns:

a reference to the element at the front of the range.

constexpr reference back() const;

Ill-formed unless:

iterator_category is convertible to bidirectional_iterator_tag.

Complexity:

O(2) (Involves copying and decrementing an iterator, so not quite as cheap as front())

Requires:

!empty()

Returns:

a reference to the element at the front of the range.

constexpr reference operator[](difference_type index) const;

This method is drawn from scripting language indexing. It indexes forward from the beginning of the range if the argument is positive, or backwards from the end of the array if the argument is negative.

Ill-formed unless:

iterator_category is convertible to random_access_iterator_tag.

Complexity:

O(1)

Requires:

abs(index) < size() || index == -size()

Returns:

if index >= 0, a reference to the index'th element in the range. Otherwise, a reference to the size()+index'th element.

size

constexpr bool empty() const;

Complexity:

O(1)

Returns:

true if the range contains no elements.

constexpr difference_type size() const;

Ill-formed unless:

iterator_category is convertible to forward_iterator_tag.

Complexity:

O(1) if iterator_category is convertible to random_access_iterator_tag. O(size()) otherwise.

Returns:

the number of times pop_front() can be called before empty() becomes true.

traversal from the beginning of the range

void pop_front();

Advances the beginning of the range by one element.

Requires:

!empty()

void pop_front(difference_type n);

Advances the beginning of the range by n elements.

Complexity:

O(1) if iterator_category is convertible to random_access_iterator_tag, O(n) otherwise.

Requires:

n >= 0, and there must be at least n elements in the range.

void pop_front_upto(difference_type n);

Advances the beginning of the range by at most n elements, stopping if the range becomes empty. A negative argument causes no change.

Complexity:

O(1) if iterator_category is convertible to random_access_iterator_tag, O(min(n, #-elements-in-range)) otherwise.

Could be provided as a free function with little-to-no loss in efficiency.

traversal from the end of the range

void pop_back();

Moves the end of the range earlier by one element.

Ill-formed unless:

iterator_category is convertible to bidirectional_iterator_tag.

Complexity:

O(1)

Requires:

!empty()

void pop_back(difference_type n);

Moves the end of the range earlier by n elements.

Ill-formed unless:

iterator_category is convertible to bidirectional_iterator_tag.

Complexity:

O(1) if iterator_category is convertible to random_access_iterator_tag, O(n) otherwise.

Requires:

n >= 0, and there must be at least n elements in the range.

void pop_back_upto(difference_type n);

Moves the end of the range earlier by min(n, size()) elements. A negative argument causes no change.

Ill-formed unless:

iterator_category is convertible to bidirectional_iterator_tag.

Complexity:

O(1) if iterator_category is convertible to random_access_iterator_tag, O(min(n, #-elements-in-range)) otherwise.

Could be provided as a free function with little-to-no loss in efficiency.

creating derived ranges

pair< range, range > split(difference_type index) const;

Divides the range into two pieces at index, where a positive index represents an offset from the beginning of the range and a negative index represents an offset from the end. range[index] is the first element in the second piece. If index >= size(), the second piece will be empty. If index < -size(), the first piece will be empty.

Ill-formed unless:

iterator_category is convertible to forward_iterator_tag.

Complexity:
  • If iterator_category is convertible to random_access_iterator_tag: O(1)
  • Otherwise, if iterator_category is convertible to bidirectional_iterator_tag, abs(index) iterator increments or decrements
  • Otherwise, if index >= 0, index iterator increments
  • Otherwise, size() + (size() + index) iterator increments.
Returns:

a pair of adjacent ranges.

Postconditions:
  • result.first.size() == min(index, this->size())
  • result.first.end() == result.second.begin()
  • result.first.size() + result.second.size() == this->size()
Todo

split() could take an arbitrary number of indices and return an N+1-element tuple<>. This is tricky to implement with negative indices in the optimal number of increments or decrements for a bidirectional iterator, but it should be possible. Do we want it?

range slice(difference_type start, difference_type stop) const;

Returns:

A sub-range from start to stop (not including stop, as usual). start and stop are interpreted as for operator[], with negative values offsetting from the end of the range. Omitting the stop argument makes the sub-range continue to the end of the original range. Positive arguments saturate to the end of the range, and negative arguments saturate to the beginning. If stop is before start, returns an empty range beginning and ending at start.

Ill-formed unless:

iterator_category is convertible to forward_iterator_tag.

Complexity:
  • If iterator_category is convertible to random_access_iterator_tag: O(1)
  • Otherwise, if iterator_category is convertible to bidirectional_iterator_tag, at most min(abs(start), size()) + min(abs(stop), size()) iterator increments or decrements
  • Otherwise, if start >= 0 && stop >= 0, max(start, stop) iterator increments
  • Otherwise, size() + max(start', stop') iterator increments, where start' and stop' are the offsets of the elements start and stop refer to.

slice(start) should be implemented with a different overload, rather than defaulting stop to numeric_limits<difference_type>::max(), because using a default would force non-random-access ranges to use an O(size()) algorithm to compute the end rather than the O(1) they're capable of.

Miscellaneous other changes

Certain other pieces of the library should change to make life easier for range<>.

Future work

Many functions will be able to take either Ranges or Containers as arguments. We should work out what concept their argument follows. "Iterable"?

Versions of all of the algorithms should be added that accept ranges. While it would be possible to make them take std::range<> itself, a set of Range concepts will likely work better in the long run, leaving std::range<> as an interface between the iterator and range worlds.

Boost's adapters are a great idea and should be added to the standard.

Infinite ranges could be defined, whose empty() method returns std::false_type.

We should figure out how to fit OutputIterators into the range framework.

A collection of any_range types would be useful.

Open questions

Member make_range
I'd like to define a make_range taking a single iterator argument representing the beginning of a range that ends with a default-constructed Iterator. This would help with using iterators like istream_iterator. However, using just make_range() could be confusing and lead to people writing incorrect ranges of more common iterators. Is there a better name?
Class std::range< Iterator >

Inherit from std::pair<Iterator, Iterator>?

This interface contains some functions that could be provided as free algorithms rather than member functions, and all of the pop_*() functions could be replaced by slice() at the cost of some extra iterator copies. This makes them more awkward to use, but makes it easier for users to write their own types that follow the same interface. On the other hand, a range_facade could be provided to help users write new ranges, and it could provide the members. Such functions are marked with a note in their documentation. (Of course, all of these member functions could be provided as free functions using the iterator access methods, but one goal here is to allow people to program without touching iterators at all.)

Member std::range< Iterator >::iterator_category
Consider defining range categories. If they don't add anything over the corresponding iterator categories, then they're probably not worth defining.
Member std::range< Iterator >::range (R &&r, typename enable_if< is_pointer< Iterator >::value &&detail::is_contiguous_range< R >::value &&detail::conversion_preserves_array_indexing< decltype(&*detail::adl_begin(r)), Iterator >::value >::type *=0)
The &*(i + N) == (&*i) + N invariant is currently impossible to check for user-defined types. We need a contiguous_iterator_tag to let users assert it.
Member std::range< Iterator >::range (R &&r, typename enable_if<!is_pointer< Iterator >::value &&detail::is_range< R >::value &&is_convertible< typename detail::begin_result< R >::type, Iterator >::value >::type *=0)
std::begin and std::end are overloaded between T& and const T&, which means that if a container has only a non-const begin or end method, then it's ill-formed to pass an rvalue to the free function. To avoid that problem, we don't use std::forward<> here, so begin() and end() are always called with an lvalue. Another option would be to insist that rvalue arguments to range() must have const begin() and end() methods.
Member std::range< Iterator >::split (difference_type index) const
split() could take an arbitrary number of indices and return an N+1-element tuple<>. This is tricky to implement with negative indices in the optimal number of increments or decrements for a bidirectional iterator, but it should be possible. Do we want it?