Document number: N4161 Date: 2014-10-09 Project: Programming Language C++, Library Working Group Reply-to: Stephan T. Lavavej Uniform Container Erasure (Revision 1) I. Introduction This is a proposal to add erase_if(container, pred) and erase(container, value), making it easier to eliminate unwanted elements correctly and efficiently. This updates N4009 (see [1]) which proposed erase_if(), by adding erase() and answering additional questions. Please see the original proposal for the rationale behind this feature, which is not repeated here. II. Standardese 1. In 1.3 [general.namespaces] Table 1 "C++ library headers", add: 2. In 1.5 [general.feature.test] Table 2 "Significant features in this technical specification", add: Doc. No.: N4161 Title: Uniform Container Erasure Primary Section: [container.erasure] Macro Name Suffix: erase_if Value: 201410 Header: 3. Add a new section "Uniform container erasure" [container.erasure] containing the following new sections. 4. Add a new section "Header synopsis" [container.erasure.syn] containing: For brevity, this section specifies the contents of 9 headers, each of which behaves as described by [general.namespaces]. namespace std { namespace experimental { inline namespace fundamentals_v2 { // [container.erasure.erase_if], function template erase_if // [container.erasure.erase], function template erase // template void erase_if(basic_string& c, Predicate pred); template void erase(basic_string& c, const U& value); // template void erase_if(deque& c, Predicate pred); template void erase(deque& c, const U& value); // template void erase_if(vector& c, Predicate pred); template void erase(vector& c, const U& value); // template void erase_if(forward_list& c, Predicate pred); template void erase(forward_list& c, const U& value); // template void erase_if(list& c, Predicate pred); template void erase(list& c, const U& value); // template void erase_if(map& c, Predicate pred); template void erase_if(multimap& c, Predicate pred); template void erase(map& c, const U& value); template void erase(multimap& c, const U& value); // template void erase_if(set& c, Predicate pred); template void erase_if(multiset& c, Predicate pred); template void erase(set& c, const U& value); template void erase(multiset& c, const U& value); // template void erase_if(unordered_map& c, Predicate pred); template void erase_if(unordered_multimap& c, Predicate pred); template void erase(unordered_map& c, const U& value); template void erase(unordered_multimap& c, const U& value); // template void erase_if(unordered_set& c, Predicate pred); template void erase_if(unordered_multiset& c, Predicate pred); template void erase(unordered_set& c, const U& value); template void erase(unordered_multiset& c, const U& value); } // inline namespace fundamentals_v2 } // namespace experimental } // namespace std 5. Add a new section "Function template erase_if" [container.erasure.erase_if] containing: template void erase_if(basic_string& c, Predicate pred); template void erase_if(deque& c, Predicate pred); template void erase_if(vector& c, Predicate pred); Effects: Equivalent to: c.erase(remove_if(c.begin(), c.end(), pred), c.end()); template void erase_if(forward_list& c, Predicate pred); template void erase_if(list& c, Predicate pred); Effects: Equivalent to: c.remove_if(pred); template void erase_if(map& c, Predicate pred); template void erase_if(multimap& c, Predicate pred); template void erase_if(set& c, Predicate pred); template void erase_if(multiset& c, Predicate pred); template void erase_if(unordered_map& c, Predicate pred); template void erase_if(unordered_multimap& c, Predicate pred); template void erase_if(unordered_set& c, Predicate pred); template void erase_if(unordered_multiset& c, Predicate pred); Effects: Equivalent to: for (auto i = c.begin(), last = c.end(); i != last; ) { if (pred(*i)) { i = c.erase(i); } else { ++i; } } 6. Add a new section "Function template erase" [container.erasure.erase] containing: template void erase(basic_string& c, const U& value); template void erase(deque& c, const U& value); template void erase(vector& c, const U& value); Effects: Equivalent to: c.erase(remove(c.begin(), c.end(), value), c.end()); template void erase(forward_list& c, const U& value); template void erase(list& c, const U& value); template void erase(map& c, const U& value); template void erase(multimap& c, const U& value); template void erase(set& c, const U& value); template void erase(multiset& c, const U& value); template void erase(unordered_map& c, const U& value); template void erase(unordered_multimap& c, const U& value); template void erase(unordered_set& c, const U& value); template void erase(unordered_multiset& c, const U& value); Effects: Equivalent to: erase_if(c, [&](auto& elem) { return elem == value; }); III. Questions And Answers Q12. What's new in this revision? A12. Three things: * The machinery has moved from std to std::experimental::fundamentals_v2. * The Standardese has been dramatically condensed. * erase(container, value) has been added, as requested by the LEWG. Q13. What about the naming? A13. I was originally worried about creating confusion between non-member erase() and member erase(), so I was planning to use Tony Van Eerd's suggestion of erase_each() and erase_each_if(). But then Hong Hong pointed out something that made me reconsider my original worries: find(). Specifically: * Non-member find() is linear-time, uses operator==(), and accepts heterogeneous types. * set::find() is log-time, uses operator<(), and is homogeneous (except when C++14 heterogeneous associative lookup has been explicitly requested with less<>). * unordered_set::find() is hash-time, uses hash and operator==(), and is homogeneous. Despite the identical naming, this varying behavior isn't widely considered to be confusing. I now believe that having both non-member erase() and member erase() won't cause major confusion, especially because I believe that erase_if() will be used more frequently than non-member erase(). Any minor confusion will be outweighed by the simplicity of the naming. Q14. Can you explain how erase(container, value) has been implemented? A14. These one-liners contain some subtleties. There are three families for erase_if() (vector-like, list-like, and map-like), but only two implementations for erase(). Here's why: * For the vector-like family, std::remove() works (as it accepts heterogeneous types), and it's shorter than calling erase_if() with a lambda. Note that for vector, std::remove() still works, while the lambda depicted here wouldn't. For simplicity, I've chosen to specify the whole vector-like family as using std::remove(), as opposed to specifying basic_string and deque with the lambda, or making the lambda more complicated (with auto&&). * For the list-like family, member remove() can't be used because it's homogeneous - the signature is list::remove(const T&). Therefore I'm calling erase_if() with a lambda. * For the map-like family, calling erase_if() with a lambda is simpler than spamming out another erase_nodes() implementation. (Note that if we spammed out such an implementation, we could use it for the map-like family and list, but not forward_list which has erase_after().) The lambda [&](auto& elem) { return elem == value; } has been written carefully: * [&] avoids copying. * (auto& elem) avoids adding constness. (While tempting, std::remove() doesn't do that, and it's observable to user-defined operator==() implementations.) This lambda isn't used for vector, so all of the other containers will give modifiable lvalues to the lambda. This can be thought of as hardcoded perfect forwarding. * "elem == value" matches 25.3.8 [alg.remove]/2's ordering of "*i == value". Q15. You've provided an implementation, but is there any similar existing practice? A15. Boost.Range (see [2]) provides remove_erase(container, value) and remove_erase_if(container, pred), which differ in: * Naming: remove_erase[_if]() refers to the erase-remove idiom, while erase[_if]() doesn't. * Behavior: remove_erase[_if]() always uses the erase-remove idiom, which is suboptimal for list (see [1]) and won't compile for forward_list (which has erase_after()) and the map-like family (which has immutable keys). erase[_if]() handles the three families optimally. * Return type: remove_erase[_if]() returns Container&, while erase[_if]() returns void. (See the next question.) Boost.PtrContainer (see [3]) provides erase_if(pred) and erase_if(first, last, pred) as members returning void. (See Q17 about sub-ranges and Q18 about members.) Q16. Is it desirable for erase[_if]() to return void? A16. Yes. Note that list::remove[_if]() also returns void. Returning an iterator wouldn't be useful. c.erase(iter) and c.erase(first, last) return an iterator in order to report where the following elements end up. But after erase(c, value) and erase_if(c, pred), there's no positioning information to report. Any unwanted elements are gone, leaving only wanted elements, and the container always knows its own begin() and end(). Returning the number of erased elements wouldn't be especially useful. While associative c.erase(key) does this, all containers now have constant-time size() (except forward_list, the world's least useful container), so the number of erased elements can easily be determined by comparing size() before and after. Returning a reference to the container, like Boost's remove_erase[_if]() does, would permit function call chaining. However, the Standard currently lacks any other container-based or range-based algorithms. Even when such algorithms are present, it's far from clear that chaining *mutating* operations would be desirable. (Applying a series of non-mutating views to a range, like a reversal followed by a filter, would be less troublesome.) And we certainly don't want to encourage any repeated calls to erase[_if](). For example, erase_if(animals, is_bear_or_dog) is far more efficient than erase_if(animals, is_bear) followed by erase_if(animals, is_dog). Finally, returning void keeps our options open. If we later decide that returning the number of erased elements, or a reference to the container, would in fact be very useful, we can always change a void return type to non-void. Shipping a non-void return type in Library Fundamentals v2 would make it much harder to change in the future. Q17. Should erase[_if]() optionally take iterators, in order to allow unwanted elements to be erased from a sub-range of a container? A17. No. That is an extremely rare scenario. The Standard doesn't have to (and shouldn't!) provide helpers for every conceivable task, only the most important and common ones. Q18. What about members versus non-members? A18. The LEWG was split on this issue, slightly favoring non-members. Q19. Do you want to build a snowman? It doesn't have to be a snowman. A19. Go away, Anna. IV. Acknowledgements Thanks to Eric Niebler and Thorsten Ottosen for explaining Boost's behavior, to Hong Hong and Tony Van Eerd for their naming suggestions, to Oleg Smolsky for his questions, and to Billy O'Neal and Giovanni Dicanio for reviewing this proposal. V. References All of the Standardese citations in this proposal are to Working Paper N3936: http://www.open-std.org/jtc1/sc22/wg21/prot/14882fdis/n3936.pdf All of the Library Fundamentals v2 citations in this proposal are to Working Draft N4084: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4084.html [1] N4009 "Uniform Container Erasure" by Stephan T. Lavavej: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4009.htm [2] Boost.Range 1.56.0 documentation: http://www.boost.org/doc/libs/1_56_0/libs/range/doc/html/range/reference/algorithms/new.html [3] Boost.PtrContainer 1.56.0 documentation: http://www.boost.org/doc/libs/1_56_0/libs/ptr_container/doc/ptr_sequence_adapter.html#algorithms VI. Implementation Compiled with Visual C++ 14 (current development build, after CTP3). C:\Temp>type meow.cpp #include #include #include #include #include #include #include #include #include #include namespace std { namespace experimental { inline namespace fundamentals_v2 { // production code would be _Ugly and would avoid ADL template void erase_if(basic_string& c, Predicate pred) { c.erase(remove_if(c.begin(), c.end(), pred), c.end()); } template void erase_if(deque& c, Predicate pred) { c.erase(remove_if(c.begin(), c.end(), pred), c.end()); } template void erase_if(forward_list& c, Predicate pred) { c.remove_if(pred); } template void erase_if(list& c, Predicate pred) { c.remove_if(pred); } template void erase_if(vector& c, Predicate pred) { c.erase(remove_if(c.begin(), c.end(), pred), c.end()); } template void erase_nodes_if(Container& c, Predicate pred) { // exposition only for (auto i = c.begin(), last = c.end(); i != last; ) { if (pred(*i)) { i = c.erase(i); } else { ++i; } } } template void erase_if(map& c, Predicate pred) { erase_nodes_if(c, pred); } template void erase_if(multimap& c, Predicate pred) { erase_nodes_if(c, pred); } template void erase_if(set& c, Predicate pred) { erase_nodes_if(c, pred); } template void erase_if(multiset& c, Predicate pred) { erase_nodes_if(c, pred); } template void erase_if(unordered_map& c, Predicate pred) { erase_nodes_if(c, pred); } template void erase_if(unordered_multimap& c, Predicate pred) { erase_nodes_if(c, pred); } template void erase_if(unordered_set& c, Predicate pred) { erase_nodes_if(c, pred); } template void erase_if(unordered_multiset& c, Predicate pred) { erase_nodes_if(c, pred); } template void erase(basic_string& c, const U& value) { c.erase(remove(c.begin(), c.end(), value), c.end()); } template void erase(deque& c, const U& value) { c.erase(remove(c.begin(), c.end(), value), c.end()); } template void erase(forward_list& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } template void erase(list& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } template void erase(vector& c, const U& value) { c.erase(remove(c.begin(), c.end(), value), c.end()); } template void erase(map& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } template void erase(multimap& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } template void erase(set& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } template void erase(multiset& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } template void erase(unordered_map& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } template void erase(unordered_multimap& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } template void erase(unordered_set& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } template void erase(unordered_multiset& c, const U& value) { erase_if(c, [&](auto& elem) { return elem == value; }); } } // inline namespace fundamentals_v2 } // namespace experimental } // namespace std #include #include using namespace std; using namespace std::experimental; template void print_elem(const T& t) { cout << t; } template void print_elem(const pair& p) { cout << p.first << "/" << p.second; } template void print(const string& s, const C& c) { cout << s << ": ["; for (auto first = c.begin(), last = c.end(), i = first; i != last; ++i) { if (i != first) { cout << " "; } print_elem(*i); } cout << "]" << endl; } int main() { auto is_vowel = [](const char c) { return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u'; }; auto is_odd = [](const int i) { return i % 2 != 0; }; auto is_odd_pair = [](const pair& p) { return p.first % 2 != 0; }; string str("cute fluffy kittens"); cout << "string before: " << str << endl; erase_if(str, is_vowel); cout << "string after: " << str << endl; deque d = { 10, 11, 12, 14, 15, 17, 18, 19 }; print("deque before", d); erase_if(d, is_odd); print("deque after", d); forward_list fl = { 10, 11, 12, 14, 15, 17, 18, 19 }; print("forward_list before", fl); erase_if(fl, is_odd); print("forward_list after", fl); list l = { 10, 11, 12, 14, 15, 17, 18, 19 }; print("list before", l); erase_if(l, is_odd); print("list after", l); vector v = { 10, 11, 12, 14, 15, 17, 18, 19 }; print("vector before", v); erase_if(v, is_odd); print("vector after", v); map m = { { 10, "A" }, { 11, "B" }, { 12, "C" }, { 14, "D" }, { 15, "E" }, { 17, "F" }, { 18, "G" }, { 19, "H" } }; print("map before", m); erase_if(m, is_odd_pair); print("map after", m); multimap mm = { { 20, "S" }, { 21, "T" }, { 22, "U" }, { 22, "V" }, { 23, "W" }, { 23, "X" }, { 24, "Y" }, { 25, "Z" } }; print("multimap before", mm); erase_if(mm, is_odd_pair); print("multimap after", mm); set s = { 10, 11, 12, 14, 15, 17, 18, 19 }; print("set before", s); erase_if(s, is_odd); print("set after", s); multiset ms = { 20, 21, 22, 22, 23, 23, 24, 25 }; print("multiset before", ms); erase_if(ms, is_odd); print("multiset after", ms); unordered_map um = { { 10, "A" }, { 11, "B" }, { 12, "C" }, { 14, "D" }, { 15, "E" }, { 17, "F" }, { 18, "G" }, { 19, "H" } }; print("unordered_map before", um); erase_if(um, is_odd_pair); print("unordered_map after", um); unordered_multimap umm = { { 20, "S" }, { 21, "T" }, { 22, "U" }, { 22, "V" }, { 23, "W" }, { 23, "X" }, { 24, "Y" }, { 25, "Z" } }; print("unordered_multimap before", umm); erase_if(umm, is_odd_pair); print("unordered_multimap after", umm); unordered_set us = { 10, 11, 12, 14, 15, 17, 18, 19 }; print("unordered_set before", us); erase_if(us, is_odd); print("unordered_set after", us); unordered_multiset ums = { 20, 21, 22, 22, 23, 23, 24, 25 }; print("unordered_multiset before", ums); erase_if(ums, is_odd); print("unordered_multiset after", ums); v = { 17, 29, 1729 }; print("vector before erasing everything", v); erase_if(v, is_odd); print("vector after erasing everything", v); v = { 256, 512, 1024 }; print("vector before erasing nothing", v); erase_if(v, is_odd); print("vector after erasing nothing", v); v.clear(); print("empty vector before", v); erase_if(v, is_odd); print("empty vector after", v); l = { 17, 29, 1729 }; print("list before erasing everything", l); erase_if(l, is_odd); print("list after erasing everything", l); l = { 256, 512, 1024 }; print("list before erasing nothing", l); erase_if(l, is_odd); print("list after erasing nothing", l); l.clear(); print("empty list before", l); erase_if(l, is_odd); print("empty list after", l); s = { 17, 29, 1729 }; print("set before erasing everything", s); erase_if(s, is_odd); print("set after erasing everything", s); s = { 256, 512, 1024 }; print("set before erasing nothing", s); erase_if(s, is_odd); print("set after erasing nothing", s); s.clear(); print("empty set before", s); erase_if(s, is_odd); print("empty set after", s); cout << boolalpha; auto is_true = [](const bool b) { return b; }; vector vb = { false, true, false, false, true, true, false, true }; print("vector before", vb); erase_if(vb, is_true); print("vector after", vb); v = { 0, 11, 0, 0, 22, 33, 0, 0, 44, 0 }; print("vector before erasing 0", v); erase(v, 0); print("vector after erasing 0", v); l = { 0, 11, 0, 0, 22, 33, 0, 0, 44, 0 }; print("list before erasing 0", l); erase(l, 0); print("list after erasing 0", l); s = { 0, 11, 22, 33, 44 }; print("set before erasing 0", s); erase(s, 0); print("set after erasing 0", s); vb = { false, true, false, false, true, true, false, true }; print("vector before erasing false", vb); erase(vb, false); print("vector after erasing false", vb); } C:\Temp>cl /EHsc /nologo /W4 /MTd meow.cpp meow.cpp C:\Temp>meow string before: cute fluffy kittens string after: ct flffy kttns deque before: [10 11 12 14 15 17 18 19] deque after: [10 12 14 18] forward_list before: [10 11 12 14 15 17 18 19] forward_list after: [10 12 14 18] list before: [10 11 12 14 15 17 18 19] list after: [10 12 14 18] vector before: [10 11 12 14 15 17 18 19] vector after: [10 12 14 18] map before: [10/A 11/B 12/C 14/D 15/E 17/F 18/G 19/H] map after: [10/A 12/C 14/D 18/G] multimap before: [20/S 21/T 22/U 22/V 23/W 23/X 24/Y 25/Z] multimap after: [20/S 22/U 22/V 24/Y] set before: [10 11 12 14 15 17 18 19] set after: [10 12 14 18] multiset before: [20 21 22 22 23 23 24 25] multiset after: [20 22 22 24] unordered_map before: [18/G 10/A 19/H 11/B 12/C 14/D 15/E 17/F] unordered_map after: [18/G 10/A 12/C 14/D] unordered_multimap before: [20/S 21/T 22/U 22/V 23/W 23/X 24/Y 25/Z] unordered_multimap after: [20/S 22/U 22/V 24/Y] unordered_set before: [18 10 19 11 12 14 15 17] unordered_set after: [18 10 12 14] unordered_multiset before: [20 21 22 22 23 23 24 25] unordered_multiset after: [20 22 22 24] vector before erasing everything: [17 29 1729] vector after erasing everything: [] vector before erasing nothing: [256 512 1024] vector after erasing nothing: [256 512 1024] empty vector before: [] empty vector after: [] list before erasing everything: [17 29 1729] list after erasing everything: [] list before erasing nothing: [256 512 1024] list after erasing nothing: [256 512 1024] empty list before: [] empty list after: [] set before erasing everything: [17 29 1729] set after erasing everything: [] set before erasing nothing: [256 512 1024] set after erasing nothing: [256 512 1024] empty set before: [] empty set after: [] vector before: [false true false false true true false true] vector after: [false false false false] vector before erasing 0: [0 11 0 0 22 33 0 0 44 0] vector after erasing 0: [11 22 33 44] list before erasing 0: [0 11 0 0 22 33 0 0 44 0] list after erasing 0: [11 22 33 44] set before erasing 0: [0 11 22 33 44] set after erasing 0: [11 22 33 44] vector before erasing false: [false true false false true true false true] vector after erasing false: [true true true true] (end)