Doc. No.: WG21/N3787
Revision of: WG21/N3633
Date: 2013-10-14
Reply to: Hans-J. Boehm
Phone: +1-650-857-3406

N3787: What can signal handlers do? (CWG 1441)

This is an attempt to summarize the current state of discussions around CWG issue 1441. Much of this discussion has occurred within SG1, and it has moved significantly past the original issue, so it seemed appropriate to turn it into a separate paper. This attempts to reflect the contributions of many people, especially Lawrence Crowl, Jens Maurer, Clark Nelson, and Detlef Vollmann.

This version is a minor revision of N3633. It reflects some changes due to SG1 concerns in Chicago, along with the easiest-to-address comments from the CWG discussion in Chicago. It does not reflect all of the latter. More discussion and almost certainly another paper revision are required.


CWG Issue 1441 points out that in the process of relaxing the restrictions on asynchronous signal handlers to allow use of atomics, we inadvertently made it impossible to use even local variables of non-volatile, non-atomic type. As a result of an initial discussion within CWG, Jens Maurer generated a proposed resolution, which addresses that specific issue.

Pre-Bristol discussion in SG1, both in Portland and during the February 2013 SG1 teleconference, raised a number of additional issues. Both Jens' solution and all prior versions of the standard still give undefined behavior to code involving signal handlers that we believe should clearly be legal. Our goal was to correct such oversights, and allow some realistic signal handlers to be portable, while preserving a significant amount of implementation freedom with respect to what is allowable in a signal handler. In particular, we do not want to reinvent Posix's notion of async-signal-safe functions here.

This issue was revisited as part of the concurrency group (SG1) meetings in Bristol. After some initial debate, that discussion concluded with several straw polls reaffirming that we want lock-free atomics to be usable in signal handlers, that signal handlers should be able to read data written before the handler installation, and that accesses to ordinary variables in signal handlers should be OK, so long as there are happens-before relationships separating such code from mainline code accesses. Some of us remember this as the original intent behind the C++11 changes, but recollections vary.

Although SG1 reaffirmed the intent of this paper, the changes below were not brought forward as a change to the C++14 working paper, since there was a feeling that we should take more time to consider alternate approaches to the wording, and that input from WG14 would be desirable.

My current hope is that something along the lines of this proposal, but probably not precisely these words, will find its way into the draft.

Proposed resolution and discussion

We give several proposed changes and summarize the reasoning behind the change as well as some of the past discussion:

Replace 1.9p6, the paragraph imposing the restrictions on signal handlers

Replace 1.9p6 [intro.execution]:

When the processing of the abstract machine is interrupted by receipt of a signal, the values of objects which are neither
  1. of type volatile std::sig_atomic_t nor
  2. lock-free atomic objects (29.4)
are unspecified during the execution of the signal handler, and the value of any object not in either of these two categories that is modified by the handler becomes undefined.

with:

If a signal handler is executed as a result of a call to the raise function, then the execution of the handler is sequenced after the invocation of the raise function and before its return. When a signal is received for another reason, the execution of the signal handler is unsequenced with respect to the rest of the program.

The original restriction would now be expressed elsewhere (see below) in terms of data races. This means that signal handlers can now access variables also accessed in mainline code, so long as the required happens-before orders are established.

We concluded during the February discussion that the old "interrupted by a signal" phrase referred to an asynchronous signal, and was basically OK. But after reading the C standard I'm not sure, and it makes sense to me to be more explicit. This is my latest attempt to do so.

Weaken the restriction on unsequenced operations

Change one sentence in 1.9p15 [intro.execution]

If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, and they are not potentially concurrent (1.10 [intro.thread]), the behavior is undefined. [Note: The next section imposes similar, but more complex, restrictions on potentially concurrent computations. -- end note]


This is a delicate area. Asynchronous signal handlers are unsequenced. If we said nothing else, even atomic operations in mainline code and the signal handler might introduce undefined behavior. We don't want that.

This is not an issue for regular unsequenced expressions. Consider the question of whether the following is legal:

  atomic<int *> p(nullptr);
  int i;
  (i = 17, p = &i, 1) + (p? *p : 0);

After some false starts, we concluded in the February phone call that the answer is yes, for reasons having more to do with function calls in expressions than atomics. The store to p and the initial test of p are indeterminately sequenced. If the latter occurs first, the potentially unsequenced access to *p doesn't occur. In the other case, the store to i is sequenced before the store to p, which is sequenced before the test on p, which is sequenced before the questionable load from *p. This again relies heavily on the fact that atomic operations are function calls in C++. The situation in C is unfortunately different.

In spite of earlier contradictory conclusions, there are, however, strong reasons to treat unsequenced expressions differently from data races in signal handlers. These have to do with weaker memory orders. Consider the following example:

Mainline code: x = 1; y.store(1, memory_order_mo1);

Signal handler: if (y.load(memory_order_mo2)) tmp = x;

This should or should not result in undefined behavior, depending on mo1 and mo2. I don't think this is expressible without relying on happens-before.

Fortunately, I think this doesn't apply within expressions:

(x = 1, y.store(1, memory_order_relaxed), 0) + (y.load(memory_order_relaxed)? x : 1)

(all variables initially zero as usual) must return 1. A compiler that violates this by reordering the initial two stores and performing the y.load() in the middle is broken. (At least so we claim with only mild uncertainty.)

Thus the restriction on unsequenced operations should apply only to code that may not run concurrently. For code that may run concurrently (threads and signal handlers) we need the happens-before-based notion of data races that reflects memory_order specifications.

Expand the discussion of data races to cover signal handler invocations we want to prohibit

Change the normative part of 1.10p21 [intro.multithread] as follows:

Two actions are potentially concurrent if they are performed by different threads, or if at least one is performed by a signal handler, they are unsequenced, and they are not both performed by the same signal handler invocation. The execution of a program contains a data race if it contains two potentially concurrent conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior. As a special exception to the preceding rule, two accesses to the same object of type volatile sig_atomic_t cannot result in a data race if both occur in the same thread t (with one occurring in a signal handler). The evaluations of such volatile sig_atomic_t objects take values as though the execution of the signal handler had been sequenced at one particular point, consistent with the actual happens-before ordering, during the execution of t.


By the above reasoning, we need to give signal handlers the same data-race-based treatment as threads. The memory_order specifications must be respected in determining whether the behavior is undefined.

There was some discussion during the February phone call as to whether we should view signal handlers as being performed by a specific thread at all, and I think we were moving towards removing that notion. A signal handler probably cannot portably tell which thread it's running on. But after thinking about this more, I don't know how to reconcile this change with atomic_signal_fence, so I am once again inclined to leave things more like they are.

These changes should now have the effect of allowing full atomics to be used in communicating with a signal handler. I can now allocate an object, assign it to an atomic pointer variable, and have a signal handler access the non-atomic objects through that variable, just as another thread could. Since signal handlers obey strictly more scheduling constraints than threads, I think this is entirely expected, and what we had in mind all along.

Ensure that signal handler invocation happens after signal handler installation

Insert in 18.10 [support.runtime] after p7:

The function signal defined in <csignal> shall ensure that a call to signal synchronizes with any resulting invocation of the newly installed signal handler.


This is necessary to allow signal handlers to access data that is read-only after installation of the handler. I expect this happens all the time already.

Note that 29.8p6 already talks about synchronizes-with relationships between a thread and a signal handler in the same thread, so I don't think this is a very fundamental change in perspective.

Clarify which C-like functions can be used in a signal handler

Change 18.10 [support.runtime] p9 as follows:

The common subset of the C and C++ languages consists of all declarations, definitions, and expressions that may appear in a well-formed C++ program and also in a conforming C program. A plain lock-free atomic operation is an invocation of a function f from Clause 29, such that f is not a member function, and either f is the function is_lock_free, or for any atomic argument a passed to f, is_lock_free(a) yields true. A POF ("plain old function") is a function that uses only features from this common subset, and that does not directly or indirectly use any function that is not a POF, except that it may use plain lock-free atomic operations. All signal handlers shall have C linkage. A POF that could be used as a signal handler in a conforming C program does not produce undefined behavior when used as a signal handler in a C++ program. The behavior of a function other than a POF used as a signal handler in a C++ program is implementation-defined.


Since we currently refer to C99 as the base document and C99 does not support thread_local, this somewhat accidentally prohibits use of thread_local in signal handlers. Discussion in Bristol suggests this may be a good thing, since thread_local might be implemented with, e.g., a locked hash table, which would result in deadlocks if access from signal handlers were allowed.

Some of the earlier phone call discussion seems to have overlooked the existing clause 29 exemption which, for example, makes calls to is_lock_free legal.

That exemption was too broad, since it allowed non-lock-free calls. All calls that acquire locks need to be prohibited in signal handlers, since they typically deadlock if the mainline thread already holds the lock.

I don't understand the meaning of a normative sentence that says "X does not have undefined behavior". We otherwise define its meaning, so why would it possibly have undefined behavior without this sentence? Hence I'm proposing to rephrase.

(This paragraph removes the need for a 1.10p5 change I previously proposed.)