The semantics of the restrict qualifier

Jens Gustedt, INRIA and ICube, France

2024-03-22

target

integration into IS ISO/IEC 9899:202y

document history

document number date comment
n3234 202403 original proposal

license

CC BY, see https://creativecommons.org/licenses/by/4.0

1 Problem description

The restrict qualifier is an important corner stone in C’s memory model. It gives implicit guarantees to avoid pointer aliasing as much as possible and contributes much to the relative efficiency of C code compared to programs that are written in similarly structured programming languages.

Unfortunately, the description of this concept in the current C standard is vague and has not been kept in sync with the newer concepts of data dependencies, sequencing (within a single thread) and synchronization (between several threads) that the C standard has refined over the years.

Recently, several attempts have been made to improve this specification. In our view, none of them is satisfactory because they do not attack the fundamental problems.

The start of the current clause (C23) that specifies the semantics of the restrict qualifier reads as follows:

6.7.4.2 Formal definition of restrict

1 Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T.

2 If D appears inside a block and does not have storage class extern, let B denote the block. If D appears in the list of parameter declarations of a function definition, let B denote the associated block. Otherwise, let B denote the block of main (or the block of whatever function is called at program startup in a freestanding environment).

3 In what follows, a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E.153) Note that “based” is defined only for expressions with pointer types.

By its title it is a promise (to provide a formal definition) but it is in fact very delicate mix up of semantic concepts that make it almost impossible to comprehend from the given text. And as a consequence it is not very surprising that after all these years people are still detecting problems.

The text has many problems. In the following we give an unordered list of problems that we detected. It is possible that there are some more that we overlooked.

It is possible that any of these problems could be solvable with some care or even by imposing a specific reading of the text, but seen as a whole, the text is not in a state a center piece of the C standard should be.

2 The underlying model and its intent

There is a clear intent behind the introduction of the restrict qualifier, namely to give programmers better possibilities to deal with aliasing between different pointer targets that might be accessed within the same program block. In particular, if a guarantee is provided to the compiler that accesses to a specific object only happen through one specific pointer value (traced from a unique pointer object when entering into block), many optimization strategies may be applied without changing the semantics of the program. These optimizations include in particular reordering of statements (because there are gurantees of independence) and dead-store elimination (because certain stores then are known to be overwritten before they are ever read).

This primary intent has to be distinguished from the model that is chosen to implement a feature with these properties in the standard. This is where unfortunately the current text fails to provide a clear and concise specification that would integrate well into the terminology, which has much evolved, and for which the text for restrict has not been properly adapted. Our proposal tries to use existing concepts in other clauses of the C standard to provide a sound model:

It has to be noted that all three relations are relations between runtime-features, namely evaluations. They are not necessarily linked to specific lexical features and therefore might be difficult to deduce by a compiler. Therefore an important property of the specification is that

restrict provides guarantees given by the programmer to the compiler

and not the other way around. In particular, the properties that are guaranteed in their generality are not easily deducible since they would need to argue about all possible evaluations of the program block in question. Therefore the only tool at hand in the C standard to describe a violation is UB

Violating the properties induced by restrict qualification has undefined behavior.

So the primary intent of this feature is to ensure that within a block of the program the same object cannot be accessed (and modified) through different access patterns, patterns for which it would not be possible for the compiler to deduce that they do in fact refer to the same object. In this proposal we use a model of deduction for information flow that had been worked out for the introduction of threads into the C standard, namely the “carries a dependency” relation.

It ideally records any information flow that might go from one evaluation to another. We use this very general model for information dependency to cover all possible paths that might lead to pointer values that refer to the same object, but that originate in different sources relative to the block in question. The different forms of dependencies as listed above can all be used (or abused) to form a pointer to a given object, and the programmer who claims a restict has to guarantee that none of such constructed patterns leads to aliasing.

Another important feature of the new wording that we propose is that it attempts to properly distinguish lexical features from their execution time instantiation:

3 An unresolved defect

The fact of using undefined behavior of the worrisome kind in its definition, is a major difficulty for tools to aid programmers in detecting erroneous usage of restrict. Nevertheless, this difficulty seems to be unavoidable, in view of the generality of the definition and in view of the essential run-time dependency.

There is one principal defect of the feature, though, that has been built into its design: the restric-feature lacks enforcement, even where that would be syntactically possible. This defect is due to the idea of modeling the feature with a qualifier that applies to a pointer type, but not to the pointer target. The analogy to the other two standard qualifiers const and volatile is really far fetched, and not really productive. These other qualifiers change the rules for access of the value under consideration, whereas restrict changes the rules of access of the target of the value under consideration.

In particular, here for restrict we have that qualifying a parameter is not part of a function type, and thus function declarations with different restrict-qualifications are compatible.

A restrict-qualification of a function parameter has no influence on the type of the function.

This has it that in general restrict-qualification remains fragile, needs thorough documentation and relies on voluntary cooperation between function provider and function caller.

In this paper here, we will not address this defect, but just try to provide a sound specification of the feature as it has been intended and understood.

4 Wording

Replace 6.7.4.2, title and p1–p5 by the following.

6.7.4.2 The semantics of the restrict qualifier

1 The purpose of the restrict qualifier is to indicate that within a specific block B of the program a pointer object P provides a unique access to the objects to which it refers, such as to avoid aliasing accesses via different access paths.1 In the following let Bₓ an execution of B; if B is the block associated to a function definition, Bₓ corresponds to the execution of a call to the function; a store of an argument to a parameter is considered to happen before Bₓ has started.2

1The intent is to provide optimization opportunities to the translator, such as reordering or dead-store elimination. In general, the verification of the requested properties lies in the responsibility of the programmer; either by ensuring that they hold for all possible executions of block B, or, by enforcing that B is never reached if the properties are not verified.

2In the following an index such as in Bₓ to the name of a lexical feature B of the program indicates a specific instance of that feature during an execution of the program currently under consideration.

2 An evaluation Eₓ of pointer type of an expression E that occurs during the execution of a program is said to be based on a store operation Sₓ to an object instance Pₓ, if Pₓ has effective pointer type and there is an lvalue conversion referring to Pₓ or any of its representation bytes that reads the value stored by Sₓ and that carries a dependency to Eₓ. An evaluation Eₓ of pointer type that occurs during Bₓ is said to be uniquely Bₓ-based on a store operation Sₓ if

3 In the following

If any of the object (*Vₓ) or one of its parts is modified during the execution of Bₓ

  • the pointer target of the effective type of Pₓ shall not be const-qualified, and
  • any evaluation in Bₓ with pointer value Wₓ that is used to access (*Vₓ) or one of its parts3 shall be uniquely Bₓ-based on Sₓ.
3That is (char*)Vₓ(char*)Wₓ < (char*)Vₓ + sizeof(*Vₓ).

4 NOTE 1: It is possible that the effective type of Pₓ is determined as late as by the store operation Sₓ, that the store operation Sₓ is a store to a representation byte of Pₓ and does not affect its effective type, or that the store operation is in fact the initialization of Pₓ.

5 NOTE 2: The definition of restrict only applies if the store operation Sₓ does not happen after the start of the execution of the block Bₓ, that is if it happens before the start of Bₓ or is executed in a different thread or in a signal handler and if Eₓ synchronizes with Sₓ. No properties are implied for store operations within Bₓ that change the value a restrict-qualified pointer object.

6 NOTE 3: It is not specified if Sₓ is executed in the same thread as Eₓ or is executed in a signal handler, but synchronization aspects as they are imposed by the happens before relation constitute an important consideration to determine if the uniquely Bₓ-based property holds.