C issue 0470: mtx_trylock should be allowed to fail spuriously

Issue 0470: mtx_trylock should be allowed to fail spuriously

This issue has been automatically converted from the original issue lists and some formatting may not have been preserved.

Authors: WG14, Torvald Riegel, Hans Boehm
Date: 2014-10-10
Reference document: N1882
Submitted against: C11 / C17
Status: Fixed
Fixed in: C17
Converted from: n2396.htm

Summary

C11 does not appear to allow mtx_trylock to fail spuriously (i.e., return thrd_busy even thought the lock was not acquired, yet eventually acquire the lock if it is not acquired by any thread), but C++11 does (see 30.4.1.1/16):

An implementation may fail to obtain the lock even if it is not held by any other thread. [ Note: This spurious failure is normally uncommon, but allows interesting implementations based on a simple compare and exchange (Clause 29). -- end note ] An implementation should ensure that try_lock() does not consistently return false in the absence of contending mutex acquisitions.

It might be better to point out explicitly that programmers should treat mtx_trylock as if spurious failure were allowed, since the memory model is intentionally too weak to support correct reasoning that is based on a return value of thrd_busy. There has been debate on this issue, and we would prefer the standard to be clearer. Consider the following example:

Thread 1:
  v1 = 1;
  mtx_lock(l1);

Thread 2:
  r1 = mtx_trylock(l1);
  while (r1 == thrd_success /* was unlocked */) {
    unlock(l1);
    r1 = mtx_trylock(l1);
  }
  r2 = v1;
  out(r2);

This program is not data-race-free according to C11, independently of whether mtx_trylock is allowed to fail spuriously or not; the happens-before-based definition of a data race and the current specification of synchronizes-with relations between mutex operations makes it clear that the program above has a data race on v1.

However, if spurious failures are not allowed, an intuitive understanding of the memory model in the sense that everything will appear to be sequentially consistent if only locks are used to synchronize does not hold anymore. The intuitive understanding would make the program above correct; in particular the store to v1 by the first thread would be expected to "happen before" the load from v1 by the second thread.

Therefore, to make an intuitive understanding of the C11 memory model and locks match the actual specification, it would be helpful to point out that programmers should assume mtx_trylock to fail spuriously. Otherwise, without spurious failure, we have cases like the example above in which two operations race according to the specification in spite of the fact that they intuitively can't execute at the same time.

Allowing spurious failures does not affect the typical uses of mtx_trylock, for example to acquire several locks without risk of deadlock. It does rule out uses like the example above, however, in which locks are attempted to be used as a replacement for atomics.

(Note that we are not arguing for specifying that mtx_lock should synchronize with a mtx_trylock that returns thrd_busy. This would make the implementation of lock acquisition less efficient on architectures such as ARM or PowerPC. In particular, an atomic_compare_exchange or similar that transitions the lock's state from not acquired to acquired would have to use memory_order_acq_rel instead of memory_order_acquire.)

Suggested Technical Corrigendum

It seems that the normative specification already states the preferred semantics, although the return value specification for thrd_busy may make readers believe that this return code allows one to infer a certain ordering (see the example above).

We propose to add a clarifying note at an appropriate place (e.g., in 7.26.4.5p3):

Programmers should treat mtx_trylock as if spurious failures were allowed; the memory model is intentionally too weak to support reasoning based on a return value of thrd_busy.

Comment from WG14 on 2017-11-03:

Oct 2014 meeting

Committee Discussion

A spurious failure can occur on PPC/ARM style architectures if, after the load-word-and-reserve instruction is issued the operating system schedules the task out, and upon resumption the corresponding store-word fails because the reservation is lost, even if the lock is unlocked. This failure can be seen by mtx_trylock if it is implemented with atomic_compare_exchange_weak where this failure can occur.

A new paper was solicited to extract the corresponding words from C++ so as to keep the two standards as close as possible in this area.

Apr 2015 meeting

Committee Discussion

The paper N1922 was presented and adopted with an editorial improvement.

Proposed Technical Corrigendum

In 7.26.4.5 replace paragraph 3

The mtx_trylock function returns thrd_success on success, or thrd_busy if the resource requested is already in use, or thrd_error if the request could not be honored.

with

The mtx_trylock function returns thrd_success on success, or thrd_busy if the resource requested is already in use, or thrd_error if the request could not be honored. mtx_trylock may spuriously fail to lock an unused resource, in which case it shall return thrd_busy.