Doc. No.: WG21/N2670
J16/08-0180
Date: 2008-06-13
Revision of: WG21/N2586
J16/08-0096
Reply to: Hans-J. Boehm Mike Spertus Clark Nelson
Phone: +1-650-857-3406
Email: Hans.Boehm@hp.com mike_spertus@symantec.com clark.nelson@intel.com

N2670: Minimal Support for Garbage Collection and Reachability-Based Leak Detection (revised)

This is a proposal to implement the "Kona garbage collection compromise", i.e. motion SP1 in N2452 and N2453. It borrows a few small pieces from the preceding garbage collection proposals, e.g. N2310.

Its purpose is to support both garbage collected implementations and reachability-based leak detectors. This is done by giving undefined behavior to programs that "hide a pointer" by, for example, xor-ing it with another value, and then later turn it back into an ordinary pointer and dereference it. Such programs may currently produce incorrect results with conservative garbage collectors, since an object referenced only by such a "hidden pointer" may be prematurely collected. For the same reason, reachability-based leak detectors may erroneously report that such programs leak memory.

Note that for programs using the quick_exit() facility ( N2440, voted into the working paper at the Kona meeting), reachability-based leak detectors are arguably the only viable form of leak detection.

For a more general discussion, and the reasons to support transparent garbage collection, please see N2585 and N2310.

Core Language Wording

Add a new section as 3.7.3.3:

3.7.3.3 Safely-derived pointers [basic.stc.dynamic.safety]

A traceable pointer object is:

A pointer value is a safely-derived pointer to a dynamic object only if it has pointer-to-object type and it is:

An integer value is an integer representation of a safely-derived pointer only if its type is at least as large as std::intptr_t and it is:

If a pointer value that is not a safely-derived pointer value is dereferenced or deallocated, and the referenced complete object is of dynamic storage duration and has not previously been declared reachable ([util.declare_reachable]), the behavior is undefined. [ Note: This is true even if the unsafely-derived pointer value might compare equal to some safely-derived pointer value. — end note ]

Change paragraph 3.9.2p3:
[...] A valid value of an object pointer type represents either the address of a byte in memory (1.7) or a null pointer (4.10). If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained. [ Note: for For instance, the address one past the end of an array (5.7) would be considered to point to an unrelated object of the array's element type that might be located at that address. There are further restrictions on pointers to objects with dynamic storage duration; see [basic.stc.dynamic.safety]. —end note ] [...]

Observations and issues:

  1. As passed by the LWG straw poll, the proposal should require that there be an implementation-dependent way to declare all dynamic storage reachable (thereby reverting to C++03 behavior) in source code. We will provide wording (which should be very simple) once attributes are voted into (or out) of C++0X.
  2. There are benefits, including some of those from the previous bullet, to having an API that indicates whether garbage collection is enabled. This was discussed in LWG, which gave feedback that the API should be proposed in a separate paper. See NXXXX for discussion and wording.
  3. This is really an inductive definition on the chain of operations used to compute a pointer value.
  4. Although this is very similar in spirit to N2310, The N2310 rules based on reachability had a fundamental problem, which is corrected here. Consider
    T *p = new ...;
    intptr_t x = reinterpret_cast<intptr_t>(p) ^ 0x555;
    a:
    T *q = reinterpret_cast<T *>(x ^ 0x555);
    T y = *q;
    
    The newly allocated object N referenced by p is always reachable by the N2310 definition. But at the label a, p is dead, and is quite likely to no longer be visible to the garbage collector, since the register containing p may well have been reused, possibly to hold x. This means that if a garbage collection occurs at point a, N may not appear to be reachable, and thus may be collected anyway.
  5. We could probably get closer to the N2310 definitions if we allowed pointers that were not safely derived to be dereferenced if a safely derived pointer to the same object is stored in a non-stack location. That seems worth considering. It would technically eliminate the need for declare_reachable(), since it could be implemented by the user. It would outlaw some kinds of dead global dead variable and dead field elimination in garbage-collected implementations. These are probably rarely practical for C++ in any case. This alternative was discussed in Bellevue, where it was agreed by straw poll in Evolution to take the approach of this proposal.
  6. Similar issues apply to declare_reachable calls. The only safe way to ensure that a pointer is always visible to the collector is to require that the argument to declare_reachable was safely derived. Thus a pointer to the object is guaranteed to be visible before the call, and the collector treats the object as being reachable after the call.

Library Wording

Add somewhere near the introduction:
Objects constructed by the standard library that may hold a user-supplied pointer value, or an integer of type intptr_t, shall store them in a traceable pointer location (see 3.7.3.3). [Note: Other libraries are strongly encouraged to do the same, since not doing so may result in accidental use of pointers that are not safely derived. Libraries that store pointers outside the user's address space should make it appear that they are stored and retrieved from a traceable pointer location. --end note]

Add, possibly between 20.6.7 and 20.6.8:


void declare_reachable( void* p )
Effects:
If p is not null, the complete object referenced by p is subsequently declared reachable (see 3.7.3.3).
Throws:
May throw std::bad_alloc if the system cannot allocate additional memory that may be required to track objects declared reachable.
Requires:
The argument p shall be a safely derived pointer.


template < typename T >
T* undeclare_reachable( T* p ) 
Returns:
A safely derived copy of p. The result will compare equal to p.
Effects:
Once the number of calls to undeclare_reachable(p) equals the number of calls to declare_reachable(p) for all non-null p referencing the argument is no longer declared reachable (see [above section]). When this happens, pointers to the object referenced by p may not be subsequently dereferenced. [Note: Since the returned pointer is safely derived, it may be used to access the referenced object, even if previously no safely derived pointer existed. -- end note]
Throws:
no exceptions
Requires:
If p is not null, declare_reachable(p) was previously called, and shall be live from the time of the call until the last undeclare_reachable(p) call on the object.

[Note: It is expected that calls to declare_reachable(p) will consume a small amount of memory until the matching call to undeclare_reachable(p) is encountered. In addition, the referenced object cannot be deallocated during this period, and garbage collecting implementations will not be able to collect the object while it is declared reachable. Long running programs should arrange that calls are matched. -- end note.]


void declare_no_pointers( char* p, size_t n )
Effects:
The n bytes starting at p no longer contain traceable pointer locations, independent of their type. Hence pointers located there may not be dereferenced if the object they point to was created by global operator new and not previously declared reachable. [Note: This may be used to inform a garbage collector or leak detector that this region of memory need not be traced.]
Throws:
Throws no exceptions. [Note: Under some conditions implementations may need to allocate memory. However the request can be ignored if memory allocation fails. -- end note]
Requires:
No bytes in the specified range may have been previously registered with declare_no_pointers(). If the specified range is in an allocated object, then it must be entirely within a single allocated object. The object must be live until the corresponding undeclare_no_pointers() call. [Note: In a garbage-collecting implementation, the fact that a region in an object is registered with declare_no_pointers() should not prevent the object from being collected. --end note]


void undeclare_no_pointers( char* p, size_t n )
Effects:
Unregisters a range registered with declare_no_pointers() for destruction. It must be called before the lifetime of the object ends.
Throws:
no exceptions
Requires:
The same range must previously have been passed to declare_no_pointers().

namespace std {
  enum class pointer_safety {
    relaxed, preferred, strict
  };
}

pointer_safety get_pointer_safety()
Returns:
Returns an enumeration value indicating the implementation's treatment of pointers that are not safely derived (See 3.7.3.3). Returns pointer_safety::relaxed if pointers that are not safely derived will be treated the same as pointers that are safely derived for the duration of the program. (See 3.7.3.3) Returns pointer_safety::preferred if pointers that are not safely derived will be treated the same as pointers that are safely derived for the duration of the program but allows the implementation to hint that it could be desirable to avoid dereferencing pointers that are not safely derived as described in 3.7.3.3. [Example: pointer_safety::preferred might be returned to detect if a leak detector is running to avoid spurious leak reports.--end example] Returns pointer_safety::strict if pointers that are not safely derived might be treated differently than pointers that are safely derived.
Throws:
no exceptions

Add to 20.6.8, between paragraphs 4 and 5:

Storage allocated directly with malloc(), calloc(), or realloc() is implicitly declared reachable (see 3.7.3.3) on allocation, ceases to be declared reachable on deallocation, and may not cease to be declared reachable as the result of an undeclare_reachable() call. [Note: This allows existing C libraries to remain unaffected by restrictions on pointers that are not safely derived, at the expense of providing far fewer garbage collection and leak detection options for malloc()-allocated objects. It also allows malloc() to be implemented with a separate allocation arena, bypassing the normal declare_reachable() implementation. The above functions should never intentionally be used as a replacement for declare_reachable(), and newly written code is strongly encouraged to treat memory allocated with these functions as though it were allocated with operator new. --end note]

Observations and issues:

  1. Currently undeclare_reachable is a template, while declare_reachable operates on a void *. This is intentional, since only the former returns a pointer. This was discussed in Bellevue in the LWG where (at least as understood by the authors) this was regarded as a reasonable approach.
  2. It is unclear whether null pointers, and pointers to memory not allocated with one of the system memory allocators should be allowed as arguments to declare_reachable. Disallowing the former seems benign, and simplifies matters a bit in this formulation. Disallowing the latter causes problems for clients who don't know where the memory came from.
  3. By a similar argument,there are reasons to believe that declare_reachable and undeclare_reachable should nest properly. A client that is passed one and wants to put it on an xor-ed list may not know whether the caller has already done so. In this formulation, it's safe to do so again.
  4. It is almost certainly safe to dereference a pointer if the object will be correctly declared reachable later even if the pointer is not safely derived. This is probably too confusing and useless to guarantee.
  5. There is some argument that declare_reachable / undeclare_reachable should be replaced by an RAII mechanism. But the simple version would fail to cover many use cases, e.g. an object declared reachable while a pointer to it resides in an xor-list container. This is designed to be a minimalist proposal, so we leave the RAII version to the programmer for now.
  6. The current no_pointers is inconsistent with the usual STL conventions. But we probably want to preserve the possibility of invoking this on a single integer field that is not part of an array. And there is probably not a standard-conforming way to compute the "one past the end" pointer for that case.
  7. Ideally declare_no_pointers should be invocable even on parts of stack allocated variables. That does complicate the implementation. Even initially, a garbage-collected implementation will have to recognize this case, even if just to ignore it. We concluded that, since a garbage collector will need to know about stack locations anyway, this is not an undue burden, though it may involve some cost.
  8. It is not clear whether declare_no_pointers needs an inverse operation, or whether that should be implicit at the end of the object lifetime. Hans prefers the latter, but is not sure how to implement it if we allow stack-allocated objects to be used. We would need a dynamic check on function return, which seems highly undesirable.
  9. One of the authors is not enthusiastic about the special treatment of malloc() allocated memory. It clearly handicaps garbage collectors or leak detectors, in that such objects cannot be any more reliably collected than they can now, and leaks involving such objects cannot be reliably identified. On the other hand, we expect that it will make straightforward implementations of garbage collection and leak detection of objects allocated using operator new (which is the normative C++ model implied by the standard) far more reliable with existing libraries. Furthermore, this is probably the only thing we can do in good conscience without an endorsement from the C committee for changing the semantics of malloc-allocated memory.
  10. Sean Parent suggests adding operator new overloads that effectively perform a declare_no_pointers() call on the entire resulting objects. Aside from convenience, this has the added benefit that it may be significantly cheaper to implement than the two separate calls, since the object can be placed appropriately. However, it is clearly less general than the separate calls, since it cannot be applied to statically allocated storage, stack allocated storage, or a piece of a heap-allocated object. The authors strongly approve of this addition. However, it superficially makes the interface much wider than what had been discussed earlier, since it involves many new function overloads. Hence we look for guidance from LWG before adding these.

Observations on Implementations

An implementation that does not support garbage collection and implements all library-calls described here as no-ops is conforming. Hence a minimal implementation is trivial. A full implementation supporting either garbage collection or more accurate leak detection requires the following additions. We are in the process of implementing the latter two, and hope to have the implementation available shortly:
  1. It must ensure that for programs obeying the restriction on safely-derived pointers are compiled such that any safely derived pointer remains visible to the garbage collector so long as it might still be dereferenced or used in a deallocation. Existing compilers already have this property for nearly all programs. Existing conservative collectors rely on it. A more careful discussion, including some performance results from a simple implementation that guarantees this property, can be found in Boehm, "Simple Garbage-Collector-Safety", PLDI 1996.
  2. declare_reachable() and undeclare_reachable() can be implemented by simply maintaining a multiset of declared reachable pointers, and making sure that this multiset is visible to the garbage collector.
  3. Efficient implementations of declare_no_pointers() are more challenging, but feasible. Our existing conservative collectors instead implement a facility to provide similar information when an object is allocated or, through a separate API, for roots. The interface proposed here requires the collector to distinguish between various cases, including ranges within stack-allocated objects, and does not allow the collector to segregate objects based on this kind of information. Of course the API proposed here does have the major advantage that it allows pointer layout information to be specified in the constructor instead of at allocation time, which is much more consistent with the usual C++ model.

    Our current implementation strategy is to have declare_no_pointers() simply record its arguments in a data structure that allows efficient lookup of these ranges by address. When an address range is scanned, this data structure can be consulted. For large address ranges the added cost should be minimal, since it is amortized by other scanning overhead. We expect that when scanning small ranges during garbage collection in production code, it may be too expensive to always consult this data structure. In the initial implementation, we will ignore declare_no_pointers calls made on small objects. A better long term strategy would be, for example, to consider the information only on every nth garbage collection. This might cause objects accidentally "referenced" by pointers in such regions to be temporarily retained in spite of the declare_no_pointers calls. But they would not be retained for an unbounded period of time.