Clarifying Memory Allocation

ISO/IEC JTC1 SC22 WG21 N3664 - 2013-04-19

Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
Chandler Carruth, chandlerc@google.com
Richard Smith, richardsmith@google.com

Introduction
Problem
Solution
    Memory
    Data Races
Wording
    5.3.4 New [expr.new]
    5.3.5 Delete [expr.delete]
    18.6.1.4 Data races [new.delete.dataraces]
Revision History
References

Introduction

The allocation and deallocation of memory has become a significant expense in modern systems. The optimization of that process is important to good performance. However, it is important to distinguish between micro-optimization of the calls and macro-optimization of the allocation strategy. In particular, good system performance may well require adapting the allocation stragegy to the dynamic behavior of the application, or even to hints provided by the application.

Problem

As strict reading of the current C and C++ standards may lead one to conclude that the allocation strategy shall not consider any information not derivable from the sequence of new and delete expressions. In essence, the standards may exclude macro-optimization of allocation.

On the other hand, a strict reading of the standards may lead one to conclude that the implementation must make an allocation function call for each and every new expression. This reading may exclude micro-optimization of allocation.

Solution

We propose to replace existing mechanistic wording with wording more precisely focused on essential requirements. The intent is to enable behavior that some existing compilers and memory allocators already have. For example, see TCMalloc [TCM].

Memory

An essential requirement on implementations is that they deliver usable memory, not that they have a particular sequence of allocation calls. We propose to relax the allocation calls with respect to new expressions.

  1. Within certain constraints, the number of allocation calls is not part of the observable behavior of the program. This enables implementations to reduce the number of allocation calls by avoiding them or fusing them.

  2. When avoiding or fusing allocations, the amount of space requested does not exceed that implied by the new expressions, with the exception of additional padding to meet alignment constraints. This means that the amount of space allocated does not increase peak allocation.

Because C++ class-specific memory allocators are often tuned to specific class sizes, we do not apply this relaxation to those allocators.

Data Races

An essential requirement on implementations is that they be data-race free, yet the standards do not say so directly. We propose to replace the current wording with direct wording, thus explicitly enabling an implementation to consider information beyond the strict sequence of allocation and deallocation calls.

Wording

The wording in this section is relative to WG21 N3485.

5.3.4 New [expr.new]

Edit paragraph 8 as follows.

A new-expression obtains may obtain storage for the object by calling an allocation function (3.7.4.1). If the new-expression terminates by throwing an exception, it may release storage by calling a deallocation function (3.7.4.2). If the allocated type is a non-array type, the allocation function's name is operator new and the deallocation function's name is operator delete. If the allocated type is an array type, the allocation function's name is operator new[] and the deallocation function's name is operator delete[]. [Note: an implementation shall provide default definitions for the global allocation functions (3.7.4, 18.6.1.1, 18.6.1.2). A C++ program can provide alternative definitions of these functions (17.6.4.6) and/or class-specific versions (12.5). —end note]

Paragraph 9 is unchanged. It specifies allocation function lookup.

Add a new paragraph between the existing paragraphs 9 and 10.

An implementation is allowed to omit a call to a replaceable global allocation function (18.6.1.1, 18.6.1.2). When it does so, the storage is instead provided by the implementation or provided by extending the allocation of another new-expression. The implementation may extend the allocation of a new-expression e1 to provide storage for a new-expression e2 if the lifetime of the object allocated by e1 strictly contains the lifetime of the object allocated by e2, e1 and e2 would invoke the same replaceable global allocation function, and, for a throwing allocation function, exceptions in e1 and e2 would be first caught in the same handler.

Edit paragraph 10 as follows.

When a new-expression calls an allocation function and that allocation has not been extended, the A new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array. For arrays of char and unsigned char, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the strictest fundamental alignment requirement (3.11) of any object type whose size is no greater than the size of the array being created. [Note: Because allocation functions are assumed to return pointers to storage that is appropriately aligned for objects of any type with fundamental alignment, this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed. —end note]

Add a new paragraph after paragraph 10 as follows.

When a new-expression calls an allocation function and that allocation has been extended, the size parameter to the allocation call shall be no greater than the sum of the sizes for the omitted calls as specified above, plus the size for the extended call had it not been extended, plus any padding necessary to align the allocated objects within the allocated memory.

5.3.5 Delete [expr.delete]

Edit paragraph 7 as follows.

If the value of the operand of the delete-expression is not a null pointer value, then:

Otherwise, it is unspecified whether the deallocation function will be called.

[Note: The deallocation function is called regardless of whether the destructor for the object or some element of the array throws an exception. —end note]

18.6.1.4 Data races [new.delete.dataraces]

Edit paragraph 1 as follows.

For purposes of determining the existence of data races, the library versions of operator new, user replacement versions of global operator new, and the C standard library functions calloc and malloc shall behave as though they accessed and modified only the storage referenced by the return value. The , the library versions of operator delete, user replacement versions of operator delete, and the C standard library function free shall behave as though they accessed and modified only the storage referenced by their first argument. The , and the C standard library function realloc shall behave as though it accessed and modified only the storage referenced by its first argument and by its return value. shall not introduce a data race (17.6.5.9 [res.on.data.races]). Calls to these functions that allocate or deallocate a particular unit of storage shall occur in a single total order, and each such deallocation call shall happen before (1.10 [intro.multithread]) the next allocation (if any) in this order.

Revision History

This paper revises N3537 - 2013-03-12 as follows.

N3537 revised N3433 - 2012-09-23 as follows.

References

[TCM]
TCMalloc : Thread-Caching Malloc, http://goog-perftools.sourceforge.net/doc/tcmalloc.html.