N3246: restrict atomic_flag creation

Submitter: Philipp Klaus Krause
Submission Date: 2024-04-24

Summary:

Restrict atomic_flag creation.

This changes the optionality of atomics from an all-or-nothing choice to a three-way choice: all, all except for an atomic_flag corner case, or nothing.

Background:

There are three kinds of atomic types in C23: volatile sig_atomic_t, atomic_flag, and the others. volatile sig_atomic_t can be implemented using just atomic loads and stores; atomic_flag is discussed below; the others can be implemented using locks.

atomic_flag is special, since it is required to be lock-free, and implementing it needs some kind of atomic exchange instruction. This makes it challenging to implement atomic_flag: many architectures do not have an obvious atomic exchange instruction with an indirect addressing mode.

For example, SDCC, when targeting the STM8 or Z80, implements atomic_flag as a byte, with the set state encoded as 0 and the reset state encoded as 1. atomic_flag_test_and_set is implemented using an 8-bit logical right shift (the previous state of the flag can then be found in the carry bit).
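The shift-based encoding can be modelled in portable C. The following is only an illustrative sketch, not SDCC's implementation: the names model_flag, MODEL_FLAG_INIT, model_test_and_set and model_clear are hypothetical, and the shift is written as ordinary, non-atomic C, whereas on the targets above it would be a single read-modify-write instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model type: one byte, 1 = reset (clear), 0 = set. */
typedef struct { uint8_t byte; } model_flag;

#define MODEL_FLAG_INIT { 1 } /* starts in the reset state */

/* Models one 8-bit logical right shift of the flag byte.
   This is plain C and NOT atomic; it only shows the encoding. */
static bool model_test_and_set(model_flag *f)
{
    assert(f != NULL);
    uint8_t carry = f->byte & 1u;      /* the bit shifted out into carry */
    f->byte = (uint8_t)(f->byte >> 1); /* 1 -> 0 and 0 -> 0: flag is now set */
    return carry == 0;                 /* carry clear: flag was already set */
}

/* Returns the flag to the reset state by storing 1. */
static void model_clear(model_flag *f)
{
    assert(f != NULL);
    f->byte = 1;
}
```

Both possible byte values shift to 0, so the flag always ends up set, and the carry bit alone distinguishes the two previous states.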

But there are architectures where implementing atomic_flag is even more challenging: MCS-51 and Padauk illustrate typical issues on small embedded devices.

MCS-51 is an ancient, but still extremely common 8-bit architecture originating from Intel's 8051. It has multiple named address spaces (in the Embedded C sense). In SDCC, these are called __data (128 B), __idata (256 B, superset of __data), __pdata (256 B), __xdata (64 KB, superset of __pdata), __code (64 KB). The generic address space is a superset of all these. Typically, __idata and __xdata are RAM, while __code is flash. An atomic exchange instruction exists only for __idata. Thus SDCC has to place any atomic_flag in __idata. However, this means that it is not possible to just use any part of memory as atomic_flag. I.e. when N3238 gets into C2Y, it will no longer be possible to implement atomic_flag for MCS-51. There already is a minor problem for allocated storage: currently, the SDCC user has to make a choice of malloc allocating from __idata (small, but allows for use as atomic_flag) or __xdata (much larger, but cannot be used as atomic_flag), but most programmers don't use malloc on such small devices anyway. Since there are no MCS-51 devices with multiple hardware threads, atomic_flag is only used for communicating with signal (interrupt) handlers. For MCS-51, the stack is typically in __idata, so there are no problems for atomic_flag in automatic storage.

Padauk makes low-end 8-bit microcontrollers in very large numbers. Devices with 1, 2, 4 and 8 hardware threads exist. Clearly, it would be desirable to use C atomics for communicating between the hardware threads, and with signal (interrupt) handlers. However, on these devices, there is no atomic exchange instruction with indirect addressing mode. SDCC currently does not support atomic_flag on these devices, but intends to add support. The only viable option would be to use an atomic exchange instruction with direct addressing mode. Since instruction memory is not writeable at run-time (i.e. self-modifying code is not an option), this means that for every atomic_flag object, a helper function needs to be created that does an atomic exchange on that very object. Thus, the addresses of all atomic_flag objects need to be known at link time. Again, when N3238 gets into C2Y this will not be possible. And already with malloc, it would not be possible to use malloced memory for an atomic_flag (though again, use of malloc is rare on such small devices). A further problem exists for atomic_flag in automatic storage: The current standard offers a way in 6.2.4: "The result of attempting to indirectly access an object with automatic storage duration from a thread other than the one with which the object is associated is implementation-defined." Without N3238, atomic_flag in automatic storage can be diagnosed at compile time, but with N3238 any character array could potentially become an atomic_flag. A typical low-end multicore Padauk device has 64 bytes of data memory, and 1024 14-bit words of program memory. Typically, most of the data memory is not used for the stack, and a helper function would be 2 words. So having helper functions for all potential stack locations would be manageable.
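The per-object helper scheme might look roughly as follows when modelled in portable C. This is a sketch under assumptions, not planned SDCC code: flag_a, flag_b, the xchg_* helpers, helper_table, find_helper and model_test_and_set_at are all hypothetical names, the 0 = clear / 1 = set encoding is chosen only for the model, and each helper body is ordinary non-atomic C standing in for one exchange instruction with direct addressing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Two flag objects whose addresses are known at link time.
   Model encoding (an assumption): 0 = clear, 1 = set. */
static uint8_t flag_a;
static uint8_t flag_b;

/* One helper per object. Each stands in for an exchange instruction
   whose operand address is fixed in the instruction itself.
   Plain C, NOT atomic. */
static uint8_t xchg_flag_a(uint8_t v) { uint8_t old = flag_a; flag_a = v; return old; }
static uint8_t xchg_flag_b(uint8_t v) { uint8_t old = flag_b; flag_b = v; return old; }

typedef uint8_t (*xchg_helper)(uint8_t);

/* Table that a toolchain could build once all flag addresses are known. */
static const struct { uint8_t *object; xchg_helper helper; } helper_table[] = {
    { &flag_a, xchg_flag_a },
    { &flag_b, xchg_flag_b },
};

/* Returns the helper for a known flag object, or NULL otherwise. */
static xchg_helper find_helper(const uint8_t *object)
{
    for (size_t i = 0; i < sizeof helper_table / sizeof helper_table[0]; i++)
        if (helper_table[i].object == object)
            return helper_table[i].helper;
    return NULL;
}

/* Test-and-set through a pointer: only works for objects in the table. */
static bool model_test_and_set_at(uint8_t *object)
{
    xchg_helper h = find_helper(object);
    assert(h != NULL); /* an object created at run time has no helper */
    return h(1) != 0;
}
```

An object that only becomes a flag at run time, such as storage obtained from malloc or a reinterpreted character array, has no entry in the table, which is the situation the proposal is concerned with.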

To allow efficient implementation of atomic_flag on small embedded systems, we propose to make support for changing the type of objects to atomic_flag optional.

Do we want to make support for changing the type of an object to atomic_flag optional?

Proposed changes (against N3149): In §6.10.9.3, add: "__STDC_NO_ATOMIC_FLAG_CREATION__ The integer constant 1, intended to indicate that the type of objects cannot be changed to atomic_flag." In §6.5p6, add before the last sentence, if N3238 (or a possible successor paper) does not get adopted into C2Y: "If the type of an object with no type becomes a type that is or contains atomic_flag, and __STDC_NO_ATOMIC_FLAG_CREATION__ is defined, the behaviour is undefined.", otherwise: "If the type of a byte array becomes a type that is or contains atomic_flag, and __STDC_NO_ATOMIC_FLAG_CREATION__ is defined, the behaviour is undefined."
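For illustration, portable code could test the proposed macro roughly as follows. This is a sketch, assuming the macro is adopted with the name given above; acquire_flag and release_flag are hypothetical helper names, and clearing a freshly allocated flag with atomic_flag_clear before first use is a common idiom assumed here, not wording from this proposal.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef __STDC_NO_ATOMIC_FLAG_CREATION__
/* Declared as atomic_flag from the start, so no type change is needed. */
static atomic_flag static_flag = ATOMIC_FLAG_INIT;
#endif

/* Returns a clear flag, or NULL if allocation fails. */
static atomic_flag *acquire_flag(void)
{
#ifdef __STDC_NO_ATOMIC_FLAG_CREATION__
    /* Turning allocated storage into an atomic_flag would be undefined. */
    return &static_flag;
#else
    /* Allocated storage becomes an atomic_flag on first use. */
    atomic_flag *f = malloc(sizeof *f);
    if (f != NULL)
        atomic_flag_clear(f);
    return f;
#endif
}

/* Releases a flag obtained from acquire_flag. */
static void release_flag(atomic_flag *f)
{
#ifdef __STDC_NO_ATOMIC_FLAG_CREATION__
    (void)f; /* static storage: nothing to free */
#else
    free(f);
#endif
}
```

On an implementation that does not define the macro, the allocated path is taken and behaves as it does today.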

In case we decide "no" on the above question: Do we want to make support for changing the type of an object to atomic_flag optional for freestanding implementations only?

Proposed change: Those proposed above, and add at the end of §5.1.2.2 as a new paragraph: "__STDC_NO_ATOMIC_FLAG_CREATION__ shall not be defined, unless __STDC_NO_ATOMICS__ is defined.".