Doc. No.: WG14/N1521
Date: 2010-10-10
Reply to: Hans-J. Boehm
Phone: +1-650-857-3406
Email: Hans.Boehm@hp.com

N1521: Threads API Improvements and Issues

This is an attempt to distill some relatively small and hopefully relatively uncontroversial improvements from a more heated discussion of the C threads API. The earlier discussion involved several Austin Group members, notably David Butenhof, as well as Jim Thomas. Both David Butenhof and Jim Thomas also provided useful comments on an earlier version of this document.

We primarily address three largely independent issues: the separate initialization option to support mtx_trylock, the lifetime of thrd_t values, and miscellaneous clarifications of the interaction of the API with the memory model.

We conclude with a short list of open threads issues that still require attention. One of these, the interaction of stdio with threads, is unfortunately nontrivial.

Remove mtx_try

Currently one of the mutex types that may be or'ed together is mtx_try (see mtx_init, 7.25.4.2). The mtx_trylock() function may only be applied to mutexes initialized with a combination of flags that includes mtx_try or mtx_timed. This diverges from both pthreads and Windows practice, which allow the corresponding trylock function to be applied to any mutex: pthread_mutex_trylock() may be applied to any pthread mutex, no matter what the type, and any Windows CRITICAL_SECTION object may be used with TryEnterCriticalSection. We are not aware of any platform on which this restriction simplifies mutex implementation.

The only reason we know of that might justify such a distinction at initialization time is that, if mtx_trylock() guaranteed success when the mutex is not held, a mutex initialized without mtx_try would often support a faster mtx_lock() implementation. (The reason behind this is subtle; see Boehm, Adve, Foundations of the C++ Concurrency Memory Model, PLDI08.) However, the current specification does not provide such a guarantee, implementations for platforms on which the distinction would matter generally provide no such guarantee for their trylock implementations, and we do not see a convincing argument for such a guarantee. (See WG21/N3152 for more details.)

Aside from the above (non)issue, we see no reason that a mutex implementation should be slower because it supports mtx_try. Thus we see no benefit associated with an explicit mtx_try mutex type. However, there are several disadvantages:
  1. It adds a needless opportunity for program bugs. We expect that most implementations would ignore mtx_try, so these bugs are likely to go undetected for long periods. Explicit error checking for mtx_try in the implementation would add overhead on a critical path, and may make it harder to implement the C API as a very thin layer. Even then, it only adds a possible run-time error that could easily have been avoided with better API design.
  2. It complicates the specification by significantly increasing the number of possible mutex types.
  3. It requires programmers to remember a spurious deviation from other threads APIs, notably Posix, Windows, and java.util.concurrent.locks.
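
As a rough illustration of the usage this change would permit, the following sketch applies mtx_trylock() to a mutex initialized with mtx_plain only. It assumes an implementation in which mtx_try has been removed as proposed; the names counter_lock, counter, counter_init and try_increment are hypothetical. Since mtx_trylock() is not guaranteed to succeed even when the mutex is not held, the caller treats failure as "not now" rather than as proof that another thread holds the lock.

```c
#include <threads.h>

/* Hypothetical example state: a counter protected by a plain mutex. */
static mtx_t counter_lock;
static int counter;

/* Initialize the mutex without any trylock-specific flag.
   Returns 1 on success, 0 on failure. */
int counter_init(void)
{
    counter = 0;
    return mtx_init(&counter_lock, mtx_plain) == thrd_success;
}

/* Try to increment without blocking.
   Returns 1 if the increment happened, 0 if the lock was not acquired. */
int try_increment(void)
{
    if (mtx_trylock(&counter_lock) != thrd_success)
        return 0;               /* busy or error: the caller may retry later */
    ++counter;
    mtx_unlock(&counter_lock);
    return 1;
}
```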

Proposed wording:

Remove the mtx_try entry from 7.25.1, the <threads.h> description.

Remove all lines in 7.25.4.2p2 that mention mtx_try.

Remove the following sentence from 7.25.4.5p2:

The mutex pointed to by mtx shall be of type mtx_try, mtx_try | mtx_recursive, mtx_timed, or mtx_timed | mtx_recursive.

Remove mtx_try from B.24, the identifiers defined in threads.h.

Clarify thrd_t lifetime

It is unclear how long the thrd_t values produced by thrd_create remain valid. There are two plausible answers:
  1. Forever. thrd_t is large enough that it cannot practically wrap. 64 bits are probably enough; 128 are definitely enough. Arguably that's what the current wording requires, but I believe the requirement is way too subtle for implementors to get this right.
  2. thrd_t values can be reused once a thread exits and has been joined or detached. This is the Posix approach.

I believe there is an increasing consensus that the Posix approach is needlessly error prone. For example, when signaling a detached thread, there is a danger that the thread already exited, and its id has been reused by another thread.

However, since I view the C threads API as a thin veneer on the underlying thread library, and I would be opposed to a C threads API that did anything else, I believe compatibility with interfaces like Posix trumps other considerations in this case. In addition, the C threads API is narrow enough that I expect the problem cases to arise much more rarely. Thus I propose to make it clear that we are pursuing option 2.

If the committee disagrees, I believe an explicit clarification would still be called for.

Proposed wording:

In 7.25.5.1, change

... it sets the thread thr to a value that uniquely identifies the newly created thread.

to

... it sets the thread thr to a value that uniquely identifies the newly created thread. The same thr value may be reused and become associated with a different thread after both (1) the newly created thread exits and (2) the thr value has been set by a call to thrd_join or thrd_detach.
(The wording with respect to thrd_join or thrd_detach was copied from elsewhere. I'm not sure this is the best way to state this.)
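
To make the practical consequence of option 2 concrete, here is a minimal sketch of the discipline it would impose on callers. The names worker and run_and_join are hypothetical. The only point is that the thrd_t value is treated as dead once thrd_join has returned, since under option 2 the implementation may hand the same value to a later thread.

```c
#include <threads.h>

/* Hypothetical thread function: returns its input plus one. */
static int worker(void *arg)
{
    const int *p = arg;
    return *p + 1;
}

/* Runs worker in a new thread and returns its result, or -1 on failure. */
int run_and_join(int input)
{
    thrd_t thr;
    int result = 0;

    if (thrd_create(&thr, worker, &input) != thrd_success)
        return -1;
    if (thrd_join(thr, &result) != thrd_success)
        return -1;

    /* Under option 2 the value in thr may now be reused for another thread.
       It must not be passed to thrd_join or thrd_detach again, and any
       copies stored elsewhere should be considered stale. */
    return result;
}
```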

Clarify memory model constraints

In a few cases, additional "synchronizes with" relationships are required to ensure that updates by one thread become visible to another.

Proposed wording:

In 7.25.4.4p2 and 7.25.4.5p2 (mtx_timedlock/mtx_trylock), replace

Prior calls to mtx_unlock on the same mutex shall synchronize with this operation.

with

If the operation succeeds, prior calls to mtx_unlock on the same mutex shall synchronize with this operation.
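
The distinction matters to callers roughly as follows. In this sketch, which assumes that mtx_trylock() may be applied to a mtx_plain mutex as proposed earlier, the reader touches the protected data only when mtx_trylock() succeeds; a failed attempt provides no synchronization and therefore licenses no conclusion about payload. All names (data_lock, payload, writer, try_read_payload, demo_try_read) are hypothetical.

```c
#include <stddef.h>
#include <threads.h>

static mtx_t data_lock;
static int payload;             /* plain data, protected by data_lock */

static int writer(void *arg)
{
    (void)arg;
    mtx_lock(&data_lock);
    payload = 42;
    mtx_unlock(&data_lock);     /* synchronizes with a later successful lock */
    return 0;
}

/* Returns 1 and stores the payload in *out if the lock was acquired.
   Returns 0 without touching payload otherwise. */
int try_read_payload(int *out)
{
    if (mtx_trylock(&data_lock) != thrd_success)
        return 0;               /* no synchronization happened */
    *out = payload;
    mtx_unlock(&data_lock);
    return 1;
}

/* Hypothetical driver: yields 0 or 42 depending on which thread
   acquires the lock first, or -1 if setup fails. */
int demo_try_read(void)
{
    thrd_t t;
    int value = -1;

    payload = 0;
    if (mtx_init(&data_lock, mtx_plain) != thrd_success)
        return -1;
    if (thrd_create(&t, writer, NULL) != thrd_success) {
        mtx_destroy(&data_lock);
        return -1;
    }
    while (!try_read_payload(&value))
        thrd_yield();           /* a failed attempt tells us nothing; retry */
    thrd_join(t, NULL);
    mtx_destroy(&data_lock);
    return value;
}
```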

In 7.25.5.1p2, add to the description of thrd_create either:

The completion of the thrd_create call synchronizes with the beginning of the execution of func(arg).

or:

The thrd_create call performs an action a that synchronizes with the beginning of the execution of func(arg). The action a need not be sequenced after the setting of thr.

The former would imply that the stored thread value is visible immediately to the child thread, so the thread cannot be started until the value is stored. The latter formulation avoids this requirement. It appears that Posix does not impose this requirement, though that may have been an accident. The first formulation may require minor implementation changes in the underlying thread library, but is slightly friendlier to programmers.

In 7.25.5.6p2, add to the description of thrd_join:

The termination of the thread synchronizes with the thrd_join call.
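
The kind of code these two additions are meant to make well defined can be sketched as follows. The parent writes an ordinary object before thrd_create, the child reads and updates it, and the parent reads the result after thrd_join, with no mutex or atomic operation involved. The struct job and the function names are hypothetical. Note that the child does not read the parent's thrd_t variable, which would be safe only under the first thrd_create formulation.

```c
#include <stddef.h>
#include <threads.h>

/* Hypothetical work item shared between parent and child. */
struct job {
    int input;
    int output;
};

static int square(void *arg)
{
    struct job *j = arg;
    /* The parent's write to j->input is visible here because the
       thrd_create call synchronizes with the start of this function. */
    j->output = j->input * j->input;
    return 0;
}

/* Returns n squared as computed by a child thread, or -1 on failure. */
int run_square(int n)
{
    struct job j = { n, 0 };
    thrd_t t;

    if (thrd_create(&t, square, &j) != thrd_success)
        return -1;
    if (thrd_join(t, NULL) != thrd_success)
        return -1;
    /* The child's write to j.output is visible here because its
       termination synchronizes with the thrd_join call. */
    return j.output;
}
```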

A few other threads issues

Here are a few more threads-related issues that probably require attention, but for which we are not yet proposing wording:

  1. The standard needs to be much clearer on when and whether stdio functions acquire locks and can be used to communicate between threads. "Synchronizes with" relationships probably need to be specified. If Posix conventions are followed, and locks are always implicitly acquired, I believe that various _unlocked functions (e.g. putc_unlocked) as well as flockfile must be added to override the default behavior, which is probably less often desirable than not. The default behavior may be an order of magnitude slower than reasonable.
  2. Cross-thread longjmps may need to be explicitly prohibited.
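
For readers unfamiliar with the Posix convention mentioned in the first item, the following sketch shows the usual pattern: acquire the stream lock once with flockfile, write with putc_unlocked, and release it with funlockfile. These functions are Posix, not ISO C, so this illustrates existing practice rather than proposing an interface; the helper name put_line_locked_once is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Writes s followed by a newline to stream, taking the stream lock once
   for the whole line instead of once per character.
   Returns 1 on success, 0 if a write failed. */
int put_line_locked_once(FILE *stream, const char *s)
{
    int ok = 1;

    flockfile(stream);
    for (; *s != '\0'; ++s) {
        if (putc_unlocked((unsigned char)*s, stream) == EOF) {
            ok = 0;
            break;
        }
    }
    if (ok && putc_unlocked('\n', stream) == EOF)
        ok = 0;
    funlockfile(stream);
    return ok;
}
```

Other threads writing to the same stream through the ordinary locked functions cannot interleave output within the line, and the per-character locking cost is avoided.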