ISO: WG21/N0501
ANSI: 94-0114
Author: John Max Skaller
Date: May 1994
Reply to: maxtal@suphys.physics.su.oz.au

The ADDRESS-OF OPERATOR
-----------------------

Executive Summary
-----------------

This paper contains two proposals for changing the C++ language --
an extension and a restriction. The restriction depends on the
extension.

1) Extension -- allow the address of an rvalue to be taken.

2) Restriction (optional) -- disallow overloading unary operator&.

The problem
-----------

1) It is necessary to be able to address rvalues -- in C++ class
rvalues are objects, and as such are useless unless member functions
can be called on them -- which requires their address be obtained to
yield the 'this' pointer.

It seems inconsistent to allow the address to be taken indirectly and
implicitly but not explicitly: programmers do not favour being denied
access and will code dangerous and non-standard workarounds.

    T const *cadr(T const& t) { return &t; }
    T *adr(T const& t) { return const_cast<T*>(&t); }

    T g();
    cadr(g()); // obtain const pointer to rvalue
    adr(g());  // obtain non-const pointer to rvalue

This code is simple, and can be templated. Although the const version
cadr() does not permit modification without a visible cast, dangling
pointers can still occur. The non-const version not only permits
modification, but will silently provide a non-const address of a
const object.

2) One of the idioms for working around this problem is the use of a
user defined operator&.

    class Base { public: Base * operator&(){ return this; } };
    class Derived : public Base {};

    Base b();    // function returning a Base rvalue
    &b();        // address of rvalue
    Derived d(); // function returning a Derived rvalue
    &d();        // result not specified by ARM

However, it is not clear how a user defined operator&, whether global
or member, interacts with the built-in operator&. Failure to provide
these functions in every derived class has indeterminate results.
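
The interaction described above can be made concrete with a small
sketch. It assumes the overload resolution rules of later C++ standards,
which this paper predates, and it uses std::addressof (a C++11 library
facility, not part of this proposal) only to obtain the built-in address
for comparison. The function name derived_address_is_base_pointer is
hypothetical.

```cpp
#include <cassert>
#include <memory>       // std::addressof (C++11), used only for comparison
#include <type_traits>  // std::is_same

// User defined address-of, as in the paper's Base example.
struct Base {
    int tag = 0;
    Base* operator&() { return this; }
};

// Derived deliberately declares no operator& of its own.
struct Derived : Base {
    int extra = 0;
};

// Hypothetical helper: reports whether &d on a Derived object yields a
// pointer to its Base subobject rather than a Derived*.
inline bool derived_address_is_base_pointer() {
    Derived d;

    // Assumption: under later standards, &d selects the inherited
    // Base::operator&, so the static type of the result is Base*.
    auto p = &d;
    static_assert(std::is_same<decltype(p), Base*>::value,
                  "inherited operator& yields Base*");

    // The built-in address, bypassing any overload, is a Derived*.
    auto q = std::addressof(d);
    static_assert(std::is_same<decltype(q), Derived*>::value,
                  "built-in address yields Derived*");

    // Both refer to the same object, but the overloaded form has
    // silently lost the derived type.
    return static_cast<Base*>(q) == p;
}
```

Calling derived_address_is_base_pointer() from a test driver is enough
to exercise the sketch; the static assertions carry most of the point,
namely that the type of &d depends on which classes in the hierarchy
happen to declare operator&.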

Const correct results actually require 4 functions, which is best
handled by a macro:

    #define addressing(T) \
      T* operator&(){ return this; } \
      T const * operator&() const { return this; } \
      T volatile * operator&() volatile { return this; } \
      T const volatile * operator&() const volatile { return this; }

    class Base { public: addressing(Base) };

If you want the address of the complete object, you can do that too
using a virtual function with a covariant return:

    class Base {
    public:
      virtual Base * operator&() { return this; }
    };
    class Derived : public Base {
    public:
      virtual Derived * operator&() { return this; }
    };

However these techniques are extremely dangerous and do not lead to
simple, readable code. Many of the standard laws of C are broken, and
without overloadable operator.() cannot be restored.

3) It seems inconsistent and needlessly complex to ban explicit rvalue
addressing when the lifetime of temporary objects is quite specific
and well determined.

For the storage class static, deletion is dangerous, but not
prevented. For the storage class auto, deletion and dangling pointers
are dangerous, but are not prevented. It is the philosophy of C++ to
allow useful operations which are nonetheless potentially dangerous,
and this precedent is clearly established for memory allocation
issues. Why should the temporary storage class be treated differently?

    int x;
    int *px = &x;
    int g();

    void f() {
      delete px;   // WOOPS
      int x;
      px = &x;
      delete px;   // WOOPS
      px = &g();   // address of an rvalue
      delete &g(); // WOOPS
    }

4) The inability to explicitly address rvalues conflicts directly with
a sensible idiom for parameter passing: use a reference in a
constructor or function if the argument does not need to persist
beyond the life of the call; use a pointer if the argument must
persist throughout the life of the object.

Using this idiom, pointers are indicated for handle classes, but it
makes sense to make temporary handles.
In that case, explicit (and thus visually obvious) address taking is
required, and it is easy to check that the argument is being passed
to a temporary. For example:

    template<class T> class Handle {
      T* delegate;
    public:
      Handle(T* t) : delegate(t) {}
      Handle(T& t) : delegate(&t) {}
      void doit() { delegate->doit(); }
    };

    // pointer protocol preferred
    Handle<complex> h(&complex(1,2));
      // unsafe -- handle outlasts the complex number
      // explicit & visible

    void f(Handle<complex> const&);
    f(Handle<complex>(&complex(1,2)));
      // safe -- complex number outlasts handle

    // reference protocol currently required
    Handle<complex> h(complex(1,2));
      // unsafe -- handle outlasts the complex number
      // error

    typedef const complex constcomplex;
    Handle<constcomplex> h(complex(1,2));
      // unsafe -- handle outlasts the complex number
      // no diagnostic

    complex z;             // pollute namespace
    Handle<complex> h(&z); // safe
    return h;              // woops!

One hopes no programmer is tempted to write:

    Handle(T const& t) : delegate( const_cast<T*>(&t) ) {}

to get around this problem!

Advantages of the changes
-------------------------

1) Simpler and more comprehensible: consistent treatment of storage
classes.

2) Simpler and more comprehensible: breaches of the One Definition
Rule caused by taking the address of an lvalue of incomplete type
which is later determined to have a user defined address-of operator
can no longer happen.

3) Explicit addressing of rvalues is detectable and warnings may be
issued by quality implementations.

4) No code is broken by allowing rvalues to be addressed.

5) Disallowing overloading of operator& may break some code, but some
of it is probably not well defined anyhow, and some exists only to
circumvent the restriction on addressing rvalues.

6) The difficult task of deciding how user defined and built-in
address-of operators interact is obviated if overloading is not
permitted.

7) Non-const references may be bound to temporaries using the
following idiom:

    T g(T&);
    T t;
    g(*&g(*&T()));

The use of *& is explicit and resembles a cast.
Binding non-const references to temporaries directly remains
ill-formed, so that the silent hijacking of an rvalue is not possible.

8) The temptation to pervert library designs to use references in
order to allow temporary arguments is reduced.

9) Re-establish some basic laws and symmetry between . and &.

Disadvantages of the changes
----------------------------

1) Disallowing a user defined address-of operator may break some code.

2) Allowing rvalues to be addressed directly permits a new source of
error. However, most existing compilers issue errors which can easily
be changed to warnings.
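
The new source of error named in disadvantage 2 already exists today
through the cadr() workaround from "The problem", item 1. The sketch
below is a templated form of that workaround, written under the usual
rule that a temporary lives until the end of the full expression
containing it. Point, make_point, sum_via_pointer, sum_of_temporary and
demo are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Templated form of the paper's cadr(): returns a const pointer to any
// object, including a temporary bound to the reference parameter.
template <class T>
T const* cadr(T const& t) { return &t; }

// Hypothetical value type and factory, introduced only for this sketch.
struct Point {
    int x;
    int y;
    int sum() const { return x + y; }
};

Point make_point() {
    Point p = {1, 2};
    return p;
}

// A callee that needs the pointer only for the duration of the call.
int sum_via_pointer(Point const* p) { return p->sum(); }

// Safe use: the temporary returned by make_point() lives until the end
// of the full expression, so the pointer stays valid during the call.
int sum_of_temporary() {
    return sum_via_pointer(cadr(make_point()));
}

// Unsafe use, deliberately left as a comment so nothing dangles here:
//     Point const* p = cadr(make_point());
//     p->sum();   // p dangles: the temporary died at the ';' above

// Small driver for the safe case.
void demo() {
    assert(sum_of_temporary() == 1 + 2);
}
```

With the proposed extension the safe call could be written directly as
sum_via_pointer(&make_point()), which makes the address-taking visible
at the call site and therefore open to a compiler warning, unlike the
helper function above.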