|Revision of:||WG21/N3040 = J16/10-0030|
|Reply to:||Hans-J. Boehm|
LWG 1151 (US 63) points out that we had made only an incomplete attempt at dealing with threading issues throughout the library. This is an attempt to improve matters by addressing as many of the remaining issues as we are currently aware of. Based on experience with other languages, it seems likely that more issues will arise in this area. But it appears to be getting harder to find the holes. This relies on discussions with many people, including Lawrence Crowl, Howard Hinnant, Peter Dimov, Nick Maclaren, and Paul McKenney. The last section, including the solution, was mostly copied from LWG issue 1329, which was authored by Jeffrey Yaskin.
27.4p4 [iostreams.objects] currently specifies:
Concurrent access to a synchronized (220.127.116.11) standard iostream object’s formatted and unformatted input (18.104.22.168) and output (22.214.171.124) functions or a standard C stream by multiple threads shall not result in a data race (1.10). [ Note: users must still synchronize concurrent use of these objects and streams by multiple threads if they wish to avoid interleaved characters. -end note ]
This raises the question of whether I can use iostreams to synchronize threads. If I set X to 1 in thread 1, write to a stream f, and then read the written value, or one that depends on it in thread 2, is thread 2 guaranteed to see the write to X (or a later one)?
The above paragraph appears to apply only to the standard iostreams, making the answer to this question a bit unclear, since access to other streams already produces a data race, avoiding the question. Based on discussions in the concurrency group, there still appear to be fairly esoteric way to use IO for thread communication, either by rebinding the standard streams to files, or through the C library. Hence we probably do need to address the issue, in spite of statements to N3040 to the contrary.
We should be consistent with the "sequential consistency for data-race-free programs" rule. Anything that does not introduce a race should ensure the appropriate happens-before relationship unless weaker memory ordering is explicitly specified. Thus, in the above example case, thread 2 should be guaranteed to see the write to X.
Insert a second paragraph into [iostreams.threadsafety], after 27.2.3p1:
A number of standard container member functions are treated as const for race detection, in spite of the fact that they are not const functions. The at() function was inadvertently omitted from this list.
During LWG discussions, Alisdair pointed out that "unordered associative containers" are not "associative containers", and should be listed separately.
Edit 23.2.2p1 [container.requirements.dataraces] as follows:
For purposes of avoiding data races (126.96.36.199), implementations shall consider the following functions to be const: begin, end, rbegin, rend, front, back, data, find, lower_bound, upper_bound, equal_range and, except in associative containers, operator.
The current spec is not very clear what it means for an "atomic read-modify-write" operation to be "atomic". Clarification:
Before 29.3p11[atomics.order] add a paragraph:
The current draft is unclear when iterator operations may conflict with accesses to other iterators or to the underlying container.
Add the following paragraph after 188.8.131.52p5 [res.on.data.races]:
There was a question about whether the race behavior of algorithms such as equal() is sufficiently described. Equal() appears to be adequately described, since it uses InputIterators, implying that, according to 184.108.40.206p5 [res.on.data.races] it is not allowed to update anything through the iterators. Unfortunately, many of the algorithms use ForwardIterators only for input, which means they are underspecified. An implementation currently could read and write back the objects referenced by the iterators, which is clearly unacceptable.
Add after 25.1p3 [algorithms.general]:
Howard Hinnant points out that this is not as clear as it should be without more context. I'm also not sure that it is sufficient to deal with something like nth_element(), which has a specification that already seems less clear than I would like about what is being modified. But it was decided that this is a clear improvement over what we have.
This arose during discussion of LWG 1218. LWG 1221 is also related.
I think we all agree that constructors and destructors of all library objects, including synchronization objects like mutexes, are not protected against data races. It is the programmer's responsibility to ensure that construction happens before any other use, and any other use happens before destruction.
The one exception here seem to be the somewhat more liberal condition variable rules, which allow a wait to complete after a condition variable has been destroyed.
There seems to be some agreement that the standard mostly says all of this already. In particular, the object lifetime rules in 3.8 seem to essentially state this, assuming 3.8 is interpreted as referring to happens before (1.10) ordering. Unfortunately, that's not as clear as it should be.
I believe the CWG needs to make a change of adding something like the following at the beginning of 3.8 [basic.life]. This will become a CWG issue. We are NOT requesting that this be added as part of this proposal. However we will assume an interpretation along these lines, and believe that the general intent here is uncontroversial. Possible addition to 3.8:
All statements about the ordering of evaluations in this section, using words like "before", "after", and "during", refer to the happens before order defined in 1.10[intro.multithread]. [Note: We ignore situations in which evaluations are unordered by happens before, since these require a data race (1.10)[intro.multithread], which already results in undefined behavior --end note]
We do propose to make the following related clarification to the library description:
Add the following as a second paragraph to 220.127.116.11 [res.on.objects]:
I believe condition variables are the only intended exception to this. However, the current precondition, as modified by LWG 1221, does not make this clear. It states "There shall be no thread blocked on *this", which is entirely redundant with the default lifetime rules, which would require that any wait calls happen before destruction.
Add after 30.5.1p4 [thread.condition.condvar], whether or not updated by LWG1221, and after 30.5.2p3:
The common implementation of vector<bool> is as an unsynchronized bitfield. The addition of 23.2.2 [container.requirements.dataraces]/2 would require either a change in representation or a change in access synchronization, both of which are undesireable with respect to compatibility and performance.
The bitset has a conceptually similar issue. Unfortunately, it appears easier to resolve a bit differently.
Modify 23.2.2 [container.requirements.dataraces]:
Edit paragraph 2 as follows:
2 Notwithstanding (18.104.22.168), implementations are required to avoid data races when the contents of the contained object in different elements in the same sequence are modified concurrently.
Edit paragraph 3 as follows:
3 [Note: For a
vector<int> xwith a size greater than one,
x = 5and
*x.begin() = 10can be executed concurrently without a data race, but
x = 5and
*x.begin() = 10executed concurrently may result in a data race. —end note]
Add at the end of the returns clause for bitset operator (20.5.2p57 [bitset.members]):