Smart Pointer Comparison Operators

Introduction

As pointed out in Technical Report Issue 2.3, the comparison operators of shared_ptr may, in some cases, violate the usual

(p == q) == (!(p < q) && !(q < p))

relationship. Specifically, the value-based p == q and the ownership-based p < q may both hold, when p and q both store NULL or have a custom deleter that doesn't delete. The reverse is also true; p == q, p < q and q < p may all yield false, when p and q share ownership but point to different subobjects.

This behavior can be considered surprising and counter-intuitive. The rest of this paper attempts to explain why it was chosen.

weak_ptr::operator<

Since a weak_ptr can expire at any time, it is not possible to order weak pointers based on their value. Accessing the value of a deleted pointer invokes undefined behavior, and reinterpret_cast tricks are no good, either, as the same pointer value can be reused by the next new expression. Using p.lock().get() for ordering is similarly flawed, as this implies that the value of p < q may change when p or q expires. If p and q are members of a std::set, this will break its invariant.

The only practical alternative is to order weak pointers by the address of their control block, as the current specification effectively demands.

A related question is whether weak_ptr needs an ordering at all. The answer is that sets and maps with weak_ptr keys are a very convenient way to non-intrusively associate arbitrary data with an object managed by a shared_ptr. One example is an application that has access to some arbitrary collection or graph of shared_ptr instances and needs to visit every object exactly once. A set< weak_ptr<void> > can be used to record whether an object has been visited:

set< weak_ptr<void> > s;

for( each shared_ptr instance p )
{
   if( s.count(p) )
   {
      // already visited
   }
   else
   {
      s.insert(p);
      visit(p);
   }
}

A slight variation is to assign identifiers to objects:

map< weak_ptr<void>, int > m;

for( each shared_ptr instance p )
{
   map< weak_ptr<void>, int >::iterator i = m.find(p);

   if( i != m.end() )
   {
      emit_reference(i->second);
   }
   else
   {
      m[p] = m.size() + 1;
      emit_object(*p);
   }
}

Another example is an application that needs to determine whether a collection of shared_ptr instances is "self contained", or whether there are external shared_ptr variables that reference some of the objects. The solution is to use map< weak_ptr<void>, long > to manually recreate the use counts of the objects, and then compare the results with use_count(). Objects with external references can now be recognized by their higher use_count():

map< weak_ptr<void>, long > m;

for( each shared_ptr instance p )
{
   ++m[p];
}

for( map< weak_ptr<void>, long >::iterator i = m.begin(); i != m.end(); ++i )
{
   if( i->first.use_count() > i->second )
   {
      // (i->first.use_count() - i->second) external references
   }
}

This can be used to detect cyclic structures, or as a test for use_count().

shared_ptr::operator<

For shared_ptr, there are two possible candidates for a natural ordering, one ownership-based and consistent with weak_ptr::operator<, the other value-based and equivalent to p.get() < q.get().

The value-based alternative considers all shared_ptr instances that store NULL equivalent. It also handles shared pointers to statically allocated objects (with a null deleter) better. For example, given a statically allocated object x of class X, shared_ptr<X>(&x, null_deleter()) will be equivalent to another shared_ptr<X>(&x, null_deleter()).

The ownership-based ordering, on the other hand, is consistent with weak_ptr, and considers two shared_ptr copies equivalent, even when their pointer values, as returned by get(), differ. This is typically the case when two shared_ptr<void> instances point to different subobjects of the same object, or when an object inherits the same interface class I multiply, but not virtually. A std::set< shared_ptr<void> > or a std::set< shared_ptr<I> > will now be able to correctly identify and reject duplicates.

The second option was chosen for the proposal for the following reasons:

Consistency with weak_ptr.
Sets of shared pointers typically contain objects and NULL is uncommon.
A trivial workaround exists for the "shared_ptr to static object" problem, namely, instead of constructing a shared_ptr<X>(&x, null_deleter()) on demand, keep it a static instance, alongside x itself. Now all shared pointers to x can be copies of this "master shared_ptr" and will compare equivalent with respect to operator<.
A programmer can use a predicate that returns p.get() < q.get(), if this is the desired behavior. This predicate is easy to write even for inexperienced programmers.

shared_ptr::operator==

Given the above operator< definitions, it seems reasonable to not provide an operator== at all, to avoid confusion. This is already the case with weak_ptr. However, leaving out operator== entirely for shared_ptr presents a problem.

Because of the conversion to "unspecified-bool-type" (pointer to member in a typical implementation), the expression p == q would compile and have the semantics of (p.get() != 0) == (q.get() != 0), which is undesirable. To suppress this behavior, we must provide some kind of equality comparison.

One option could be to "poison" the operator (as done with std::tr1::function<S>):

  template<class T, class U> void operator==(shared_ptr<T>, shared_ptr<U>); // never defined

(An even better alternative is to define shared_ptr::operator== as a private member.)

However, going to such great lengths to disable equality comparisons, when an intuitive definition of p == q exists (p.get() == q.get()) seems a bit excessive. It is arguably better to supply two comparison operators, p == q and p < q, that are both useful in isolation, and just educate users about the nonobvious relationship between them.

The alternative p == q definition, !(p < q) && !(q < p), remains unexplored at this point.

--end