Doc. no.   N2669=08-0179
Date:        2008-06-13
Project:     Programming Language C++
Reply to:   Beman Dawes <bdawes at acm.org>
                Peter Dimov <pdimov at pdimov.com>
                Herb Sutter <hsutter at microsoft.com>
                Hans Boehm <Hans.Boehm at hp.com>
                Lawrence Crowl <crowl at google.com>
                Paul E. McKenney <paulmck at linux.vnet.ibm.com>
                Jeffrey Yasskin <jyasskin at google.com>

Thread-Safety in the Standard Library (Rev 2)

Executive summary
Introduction
Rationale
Existing practice
Proposed wording
Revision history
References

Executive summary

Unless otherwise specified, standard library classes may safely be instantiated from multiple threads and standard library functions are reentrant, but non-const use of objects of standard library types is not safe if shared between threads. Use of non-constness to determine thread-safety requirements ensures consistent thread-safety specifications without having to add additional wording to each and every standard library type.

Introduction

With the introduction of multi-threading into the C++ standard, the contract between standard library users and implementers needs to explicitly state the conditions under which standard library components are or are not thread-safe.

Rationale

The objective is to offer users of the standard library as much thread-safety as is possible without impacting performance or creating an illusion of thread-safety where none exists.

Basic thread-safety guarantee

The basic thread-safety guarantee would be that standard library functions are required to be reentrant, and non-mutating uses of objects of standard library types are required to not introduce data races. This has little or no impact on performance. It does actually deliver the promised safety. Thus this basic thread-safety guarantee is required of implementations.
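As an illustration of the basic guarantee, the sketch below reads one shared vector from two threads through const member functions only. It assumes C++11 `std::thread`; the function name `sum_from_two_threads` is hypothetical and not part of any proposed wording.

```cpp
#include <cassert>
#include <numeric>
#include <thread>
#include <vector>

// Sums a shared, read-only vector from two threads at the same time.
// Both threads reach `shared` only through const member functions, so the
// basic guarantee says no locking is needed.
long sum_from_two_threads(const std::vector<int>& shared) {
    long first = 0;
    long second = 0;
    // Each thread writes only its own result variable.
    std::thread a([&] { first = std::accumulate(shared.begin(), shared.end(), 0L); });
    std::thread b([&] { second = std::accumulate(shared.begin(), shared.end(), 0L); });
    a.join();
    b.join();
    assert(first == second);  // both threads saw the same unmodified data
    return first + second;
}
```

If either thread called a non-const member such as push_back, the program would need its own synchronization.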

Strong thread-safety guarantee

The strong thread-safety guarantee would be that mutating uses of objects of standard library types are required to not introduce data races. This would have a severe negative impact on performance. Furthermore, real safety often requires locking across several member function calls, so providing per-function-call locking would create an illusion of safety that does not in fact exist. For these reasons, a blanket strong thread-safety guarantee for mutating shared objects is not provided, and constraints are put on programs accordingly.
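The point about locking across several member function calls can be sketched as follows. The `WorkList` type and both function names are hypothetical; the sketch assumes C++11 `std::mutex` and `std::thread`. Even if empty(), back(), and pop_back() each locked internally, another thread could still empty the vector between the check and the pop, so the caller holds one lock across all three calls.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared work list; the names are illustrative only.
struct WorkList {
    std::mutex guard;
    std::vector<int> items;
};

// Takes the last item if there is one. A single lock covers the check,
// the read, and the removal, so the three calls act as one step.
bool try_take(WorkList& list, int& out) {
    std::lock_guard<std::mutex> lock(list.guard);
    if (list.items.empty()) return false;
    out = list.items.back();
    list.items.pop_back();
    return true;
}

// Drains the list from two threads and returns how many items were taken.
int drain_with_two_threads(WorkList& list) {
    int taken_a = 0;
    int taken_b = 0;
    auto worker = [&list](int& taken) {
        int value = 0;
        while (try_take(list, value)) ++taken;
    };
    std::thread a(worker, std::ref(taken_a));
    std::thread b(worker, std::ref(taken_b));
    a.join();
    b.join();
    assert(list.items.empty());
    return taken_a + taken_b;
}
```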

Meaning of thread-safety

The proposed wording talks in terms of data races and expression evaluation conflicts. Data races and expression evaluation conflicts are defined in the core language portion of the standard, so do not need to be further described in the library clauses.

Standard library components not given strong guarantee

Consideration was given to specifying the rand function and the global locale objects on a per-thread basis. That is not required because it does not represent existing practice. Mac OS X, for example, does not support per-thread global locale objects.

Existing practice

As far as is known, the proposed wording reflects existing practice in current implementations of the standard library.

Proposed wording

Add a new paragraph after 17.4 Library-wide requirements [requirements] paragraph 1:

Requirements specified in terms of interactions between threads do not apply to programs having only a single thread of execution.

Change 17.4.4 Conforming implementations [conforming] as indicated:

This subclause describes the constraints upon, and latitude of, implementations of the C++ Standard library. The following subclauses describe an implementation’s use of headers (17.4.4.1), macros (17.4.4.2), global functions (17.4.4.3), member functions (17.4.4.4), ~~reentrancy~~ data race avoidance (17.4.4.5), access specifiers (17.4.4.6), class derivation (17.4.4.7), and exceptions (17.4.4.8).

Change 17.4.4.5 Reentrancy [reentrancy] from:

17.4.4.5 Reentrancy [reentrancy]

Which of the functions in the C++ Standard Library are not reentrant subroutines is implementation-defined.

To:

17.4.4.5 Data race avoidance [res.on.data.races]

This subclause specifies requirements implementations shall meet to prevent data races ([intro.multithread]). Each requirement applies to all standard library functions unless otherwise specified. Implementations are permitted but not required to prevent data races in cases other than those specified below.

C++ Standard Library functions shall be reentrant subroutines.

A C++ Standard library function shall not directly or indirectly access objects ([intro.multithread]) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's arguments, including this.

A C++ Standard library function shall not directly or indirectly modify objects ([intro.multithread]) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's non-const arguments, including this.

[Note: This means, for example, that implementations can't use a static object for internal purposes without synchronization because it could cause a data race even in programs that do not explicitly share objects between threads. --end note]

Implementations are permitted to share their own internal objects between threads if the objects are not visible to users and are protected against data races.
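As an illustration only (not part of the proposed wording), the sketch below shows the kind of hidden static state the note has in mind. `next_internal_id` is a hypothetical library-internal helper, not a real standard library function, and the use of `std::atomic` is one assumed way to protect it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical library-internal helper. The static counter is hidden state
// shared by every caller, so it is atomic. A plain `static unsigned long`
// would race even when callers share no objects of their own.
unsigned long next_internal_id() {
    static std::atomic<unsigned long> counter{0};
    return counter.fetch_add(1) + 1;
}

// Calls the helper from two threads and returns how many ids were issued
// by those threads.
unsigned long ids_issued_by_two_threads(int per_thread) {
    assert(per_thread >= 0);
    const unsigned long before = next_internal_id();
    auto worker = [per_thread] {
        for (int i = 0; i < per_thread; ++i) next_internal_id();
    };
    std::thread a(worker);
    std::thread b(worker);
    a.join();
    b.join();
    const unsigned long after = next_internal_id();
    return after - before - 1;  // exclude the `after` call itself
}
```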

Unless otherwise specified, C++ Standard Library functions shall perform all operations with effects visible ([intro.multithread]) to users solely within the current thread.

[Note: This allows implementations to parallelize operations if there are no visible side effects. --end note]

Somewhere in [constraints] add a constraint on programs:

It is undefined behavior if calls to standard library functions from different threads:

share access to an object directly or indirectly via their arguments, including this, and

at least one of the arguments accessing a shared object is non-const, and

one call does not happen before the other ([intro.multithread]).

[Note: This prohibition against concurrent non-const access means that modifying an object of a standard library type shared between threads without using a locking mechanism may result in a data race. --end note]
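As an illustration only (not part of the proposed wording), the sketch below shows two ways a program can satisfy this constraint, assuming C++11 `std::mutex` and `std::thread`. The function name is hypothetical. While both threads run, a mutex orders the non-const calls; after join(), the calls in the joined threads happen before the final call, so no lock is needed there.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Fills one shared vector from two threads, then appends once more from the
// calling thread. Returns the final size.
std::size_t fill_from_two_threads(int per_thread) {
    assert(per_thread >= 0);
    std::vector<int> shared;
    std::mutex guard;  // orders the concurrent non-const calls

    auto producer = [&](int base) {
        for (int i = 0; i < per_thread; ++i) {
            std::lock_guard<std::mutex> lock(guard);
            shared.push_back(base + i);
        }
    };
    std::thread a(producer, 0);
    std::thread b(producer, per_thread);
    a.join();
    b.join();

    // join() makes every call in both threads happen before this one,
    // so this non-const access needs no lock.
    shared.push_back(-1);
    return shared.size();
}
```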

To 18.5.1 Storage allocation and deallocation [new.delete], add:

The library versions of operator new and delete, user replacement versions of global replacement operator new and delete, and the Standard C library functions calloc, malloc, realloc, and free shall not introduce data races ([intro.multithread]) as a result of concurrent calls from different threads. Calls to these functions that allocate or deallocate a particular unit of storage shall occur in a single total order, and each such deallocation call shall happen before the next allocation (if any) in this order.
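As an illustration only (not part of the proposed wording), the sketch below allocates and frees storage from two threads with no user-supplied locking, which is what this requirement permits. The function names are hypothetical.

```cpp
#include <cassert>
#include <thread>

// Allocates and frees `rounds` ints on the calling thread and returns the
// sum of the values stored in them.
long churn(int rounds) {
    long total = 0;
    for (int i = 0; i < rounds; ++i) {
        int* p = new int(i);
        total += *p;
        delete p;
    }
    return total;
}

// Runs churn() on two threads at once; no lock surrounds new or delete.
long churn_on_two_threads(int rounds) {
    long total_a = 0;
    long total_b = 0;
    std::thread a([&] { total_a = churn(rounds); });
    std::thread b([&] { total_b = churn(rounds); });
    a.join();
    b.join();
    assert(total_a == total_b);
    return total_a + total_b;
}
```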

To 19.3 Error numbers [errno] paragraph 1, add:

A separate errno value shall be provided for each thread.
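As an illustration only (not part of the proposed wording), the sketch below has two threads store different values in errno, wait until both stores are done, and then read errno back. The function name is hypothetical. The wait is a plain spin on an atomic counter so that no library call that might itself set errno runs between the store and the read.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <functional>
#include <thread>

// Returns true if each thread reads back the errno value it stored, even
// though the other thread stored a different value in the meantime.
bool errno_is_per_thread() {
    std::atomic<int> stored{0};
    int seen_a = 0;
    int seen_b = 0;
    auto worker = [&stored](int value, int& seen) {
        errno = value;
        stored.fetch_add(1);
        while (stored.load() < 2) { }  // spin until both threads have stored
        seen = errno;
    };
    std::thread a(worker, EDOM, std::ref(seen_a));
    std::thread b(worker, ERANGE, std::ref(seen_b));
    a.join();
    b.join();
    assert(stored.load() == 2);
    return seen_a == EDOM && seen_b == ERANGE;
}
```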

To 20.6.1 The default allocator [default.allocator], add:

Except for the destructor, member functions of the default allocator shall not introduce data races ([intro.multithread]) as a result of concurrent calls to default allocator object's member functions from different threads. Calls to these functions that allocate or deallocate a particular unit of storage shall occur in a single total order, and each such deallocation call shall happen before the next allocation (if any) in this order.

To 20.7 Date and Time [date.time], add:

Functions asctime, ctime, gmtime, and localtime are not required to avoid data races ([res.on.data.races]).
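As an illustration only (not part of the proposed wording), a program that must call localtime from several threads can serialize the calls itself and copy the result out before releasing the lock. `local_time_copy` is a hypothetical wrapper. It helps only if every thread in the program goes through it and no other code calls asctime, ctime, gmtime, or localtime concurrently, since those functions may share internal storage. The sketch also assumes the time value can be converted.

```cpp
#include <cassert>
#include <ctime>
#include <mutex>

// Hypothetical wrapper: serializes calls to std::localtime and returns a
// private copy of the result.
std::tm local_time_copy(std::time_t when) {
    static std::mutex guard;
    std::lock_guard<std::mutex> lock(guard);
    const std::tm* shared_result = std::localtime(&when);
    // Assumption for this sketch: `when` is convertible to local time.
    assert(shared_result != nullptr);
    return *shared_result;  // copy out while the lock is still held
}
```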

To 21.4 Null-terminated sequence utilities [c.strings], add:

Functions strerror and strtok are not required to avoid data races ([res.on.data.races]).

To 22.1.1 Class locale [locale], add a new paragraph at the end:

Whether there is one global locale object for the entire program or one global locale object per thread is implementation defined. Implementations are encouraged but not required to provide one global locale object per thread. If there is a single global locale object for the entire program, implementations are not required to avoid data races on it ([res.on.data.races]).

At a location in 23 to be determined by the project editor, add a new paragraph:

For purposes of avoiding data races ([res.on.data.races]), implementations shall consider the following functions to be const: begin, end, rbegin, rend, front, back, data, find, lower_bound, upper_bound, equal_range, and, except in associative containers, operator[].

Notwithstanding ([res.on.data.races]), implementations are required to avoid data races when the contents of the contained object in different elements in the same sequence are modified concurrently.

[Example: For a vector<int> x with a size greater than one, x[1] = 5 and *x.begin() = 10 can be executed concurrently without a data race, but x[0] = 5 and *x.begin() = 10 executed concurrently may result in a data race. --end example]
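As an illustration only (not part of the proposed wording), the safe half of the example can be written as a small program fragment, assuming C++11 `std::thread`. The function name is hypothetical. The two threads modify different elements, so no lock is used.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Modifies two different elements of one vector from two threads.
std::vector<int> write_distinct_elements() {
    std::vector<int> x(2, 0);
    assert(x.size() > 1);
    std::thread a([&x] { x[1] = 5; });          // touches element 1 only
    std::thread b([&x] { *x.begin() = 10; });   // touches element 0 only
    a.join();
    b.join();
    return x;
}
```

Replacing `x[1] = 5` with `x[0] = 5` would make both threads write element 0, which is the racy case the example describes.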

Change 26.7 C Library [c.math] paragraph 6 as indicated:

The rand function has the semantics specified in the C standard, except that the implementation may specify that particular library functions may call rand. It is implementation defined whether or not the rand function may introduce data races ([res.on.data.races]).

[Note: The random number generation ([rand]) facilities in this standard are often preferable to rand. --end note]
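As an illustration only (not part of the proposed wording), the sketch below gives each thread its own engine from the random number generation facilities, so no generator state is shared between threads. It assumes the C++11 `<random>` header; the function names and seeds are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <thread>
#include <vector>

// Rolls a six-sided die `count` times using an engine owned by the caller.
std::vector<int> roll_dice(unsigned seed, int count) {
    assert(count >= 0);
    std::mt19937 engine(seed);                       // private to this call
    std::uniform_int_distribution<int> die(1, 6);
    std::vector<int> rolls;
    for (int i = 0; i < count; ++i) rolls.push_back(die(engine));
    return rolls;
}

// Rolls on two threads at once and returns the total number of rolls made.
std::size_t roll_on_two_threads(int count) {
    std::vector<int> first;
    std::vector<int> second;
    std::thread a([&] { first = roll_dice(1u, count); });
    std::thread b([&] { second = roll_dice(2u, count); });
    a.join();
    b.join();
    return first.size() + second.size();
}
```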

Revision history

N2669 - Revision 2:

Strike "or other undesirable behavior". Data races are the only case we can think of, so mention them only.
Include user supplied global operator new and delete, and the C library memory allocation functions in the prohibition against allocation data races.
Specify when library implementations may or may not multithread.
Acknowledge N2519.
Add additional authors.
Numerous other changes.

N2410 - Revision 1:

Provided Executive summary.
Simplified wording.
Added wording to cover both direct (i.e. shallow) and indirect (i.e. deep) access.
Removed mention of deadlocks as unnecessary.
Added expression evaluation conflict wording as more precise, and well defined by the Multi-threaded executions and data races wording from N2334.
Added strong exception safety for the default allocator.
Struck *_r functions from proposal.

N2298 - Initial version.

References

N2429, Concurrency memory model (final revision), Clark Nelson and Hans-J. Boehm

N2480, A Less Formal Explanation of the Proposed C++ Concurrency Memory Model, Hans-J. Boehm

N2519, Library thread-safety from a user's point of view, with wording, Jeffrey Yasskin

N1947, The Memory Model and the C++ Library, Non-Memory Actions etc., Nick Maclaren