This page is a snapshot from the LWG issues list, see the Library Active Issues List for more information and the meaning of NAD status.

3468. Transparent lookups in unordered containers are inconsistent

Section: 24.2.8 [unord.req] Status: NAD Submitter: Marshall Clow Opened: 2020-07-23 Last modified: 2020-08-21

Priority: Not Prioritized

View other active issues in [unord.req].

View all other issues in [unord.req].

View all issues with NAD status.

Discussion:

In C++14, we added "transparent lookups" into the ordered associative containers. This was sold as an efficiency concern, as removing the need to create temporary objects just to compare against.

However, people found clever ways to use this. One of them, in fact, was in the original paper, and was the subject of a question on Stack Overflow.

This all works because the elements in the ordered associative containers are, well, ordered.

For C++20, we added this facility to the unordered containers.

Consider the following code:

#include <unordered_set>
#include <string>
#include <iostream>

struct DumbHash // put everything in the same bucket
{
  using is_transparent = void;

  template<typename T>
  size_t operator()(const T&) const { return 0; }
};

struct CompareEQ 
{
  using is_transparent = void;

  bool operator()(const std::string& lhs, const std::string& rhs) const
  { return lhs == rhs; }

  bool operator()(const std::string& lhs, char rhs) const
  { return !lhs.empty() && (lhs[0] == rhs); }

  bool operator()(char lhs, const std::string& rhs) const
  { return !rhs.empty() && (lhs == rhs[0]); }
};

int main () 
{
  const char* one[] = {"a", "b",  "c", "d",  "e", "bb"};
  const char* two[] = {"b", "e",  "d", "bb", "c", "a"};
  const char* thr[] = {"b", "bb", "a", "c",  "d", "e"};

  typedef std::unordered_set<std::string, DumbHash, CompareEQ> MS;
  MS m1{std::begin(one), std::end(one)};
  MS m2{std::begin(two), std::end(two)};
  MS m3{std::begin(thr), std::end(thr)};

  for (const auto& s: m1) 
    std::cout << s << ' '; 
  std::cout << std::endl;
  for (const auto& s: m2) 
    std::cout << s << ' '; 
  std::cout << std::endl;
  for (const auto& s: m3) 
    std::cout << s << ' '; 
  std::cout << std::endl;

  std::cout << "m1:" << m1.count('b') << ' ';
  std::cout << "m2:" << m2.count('b') << ' ';
  std::cout << "m3:" << m3.count('b');
}

When I run this program on my Mac, I get the following output:

bb e d c b a
a c bb d e b
e d c a bb b
m1:1 m2:1 m3:2

(This is using unreleased code, but I have confirmed this with VS2019's unordered_multiset.)

This is clearly bad; three containers, containing the same elements, doing the same lookups, giving different results. This also applies to the transparent versions of find, equal_range, and contains.

The problem is that the elements in the unordered containers are only partially ordered; i.e, all the elements that are equal (according to the non-transparent version of the comparison predicate) are adjacent in the container, but are unordered relative to the other elements in the container.

My recommendation is to declare this all UB.

Suggested resolution:

Add a precondition to all of the transparent lookup functions for the unordered containers forbidding stuff like this. This should probably go in 24.2.8 [unord.req], maybe at the end of Table [tab:container.hash.req].

Precondition: The value being searched matches at most one unique key in the container.

[2020-08-21 Issue processing telecon: NAD based on reflector discussion. Status changed: New → NAD.]

Proposed resolution: