*/ } ul /* Whole unordered list */ { } ul li /* Unordered list item */ { } ol /* Whole ordered list */ { } ol li /* Ordered list item */ { } hr {} /* ---- Some span elements --- */ sub /* Subscripts. Pandoc: H~2~O */ { } sup /* Superscripts. Pandoc: The 2^nd^ try. */ { } em /* Emphasis. Markdown: *emphasis* or _emphasis_ */ { } em > em /* Emphasis within emphasis: *This is all *emphasized* except that* */ { font-style: normal; } blockquote > p > em /* Emphasis within emphasis: *This is all *emphasized* except that* */ { font-style: normal; } blockquote > * > p > em /* Emphasis within emphasis: *This is all *emphasized* except that* */ { font-style: normal; } blockquote > p > ins > em /* Emphasis within emphasis: *This is all *emphasized* except that* */ { font-style: normal; } blockquote > * > p > ins > em /* Emphasis within emphasis: *This is all *emphasized* except that* */ { font-style: normal; } /* ---- Links (anchors) ---- */ a /* All links */ { /* Keep links clean. On screen, they are colored; in print, they do nothing anyway. */ text-decoration: none; } @media screen { a:hover { /* On hover, we indicate a bit more that it is a link. */ text-decoration: underline; } } @media print { a { /* In print, a colored link is useless, so un-style it. */ color: black; background: transparent; } a[href^="http://"]:after, a[href^="https://"]:after { /* However, links that go somewhere else, might be useful to the reader, so for http and https links, print the URL after what was the link text in parens */ content: " (" attr(href) ") "; font-size: 90%; } } /* ---- Images ---- */ img { /* Let it be inline left/right where it wants to be, but verticality make it in the middle to look nicer, but opinions differ, and if in a multi-line paragraph, it might not be so great. */ vertical-align: middle; } div.figure /* Pandoc figure-style image */ { /* Center the image and caption */ margin-left: auto; margin-right: auto; text-align: center; font-style: italic; } p.caption /* Pandoc figure-style caption within div.figure */ { /* Inherits div.figure props by default */ } /* ---- Code blocks and spans ---- */ pre, code { background-color: #fdf7ee; /* BEGIN word wrap */ /* Need all the following to word wrap instead of scroll box */ /* This will override the overflow:auto if present */ white-space: pre-wrap; /* css-3 */ white-space: -moz-pre-wrap !important; /* Mozilla, since 1999 */ white-space: -pre-wrap; /* Opera 4-6 */ white-space: -o-pre-wrap; /* Opera 7 */ word-wrap: break-word; /* Internet Explorer 5.5+ */ /* END word wrap */ } pre /* Code blocks */ { /* Distinguish pre blocks from other text by more than the font with a background tint. */ padding: 0.5em; /* Since we have a background color */ border-radius: 5px; /* Softens it */ /* Give it a some definition */ border: 1px solid #aaa; /* Set it off left and right, seems to look a bit nicer when we have a background */ margin-left: 0.5em; margin-right: 0.5em; } pre.yacc, code.yacc { background-color: #f0f0f0; } pre.yacc /* Code blocks */ { /* Distinguish pre blocks from other text by more than the font with a background tint. */ padding: 0.0em; /* Since we have a background color */ border-radius: 5px; /* Softens it */ /* Give it a some definition */ border: 0px solid #aaa; /* Set it off left and right, seems to look a bit nicer when we have a background */ margin-left: 0.0em; margin-right: 0.0em; } @media screen { pre { white-space: pre; /* Dotted looks better on screen and solid seems to print better. */ border: 1px dotted #777; } } code /* All inline code spans */ { } p > code, li > code /* Code spans in paragraphs and tight lists */ { /* Pad a little from adjacent text */ padding-left: 2px; padding-right: 2px; } li > p code /* Code span in a loose list */ { /* We have room for some more background color above and below */ padding: 2px; } span.option { color: blue; text-decoration: underline; } /* ---- Math ---- */ span.math /* Pandoc inline math default and --jsmath inline math */ { /* Tried font-style:italic here, and it messed up MathJax rendering in some browsers. Maybe don't mess with at all. */ } div.math /* Pandoc --jsmath display math */ { } span.LaTeX /* Pandoc --latexmathml math */ { } eq /* Pandoc --gladtex math */ { } /* ---- Tables ---- */ /* A clean textbook-like style with horizontal lines above and below and under the header. Rows highlight on hover to help scanning the table on screen. */ table { border-collapse: collapse; border-spacing: 0; /* IE 6 */ border-bottom: 2pt solid #000; border-top: 2pt solid #000; /* The caption on top will not have a bottom-border */ /* Center */ margin-left: auto; margin-right: auto; } thead /* Entire table header */ { border-bottom: 1pt solid #000; background-color: #eee; /* Does this BG print well? */ } tr.header /* Each header row */ { } tbody /* Entire table body */ { } /* Table body rows */ tr { } tr.odd:hover, tr.even:hover /* Use .odd and .even classes to avoid styling rows in other tables */ { background-color: #eee; } /* Odd and even rows */ tr.odd {} tr.even {} td, th /* Table cells and table header cells */ { vertical-align: top; /* Word */ vertical-align: baseline; /* Others */ padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.2em; padding-bottom: 0.2em; } /* Removes padding on left and right of table for a tight look. Good if thead has no background color*/ /* tr td:last-child, tr th:last-child { padding-right: 0; } tr td:first-child, tr th:first-child { padding-left: 0; } */ th /* Table header cells */ { font-weight: bold; } tfoot /* Table footer (what appears here if caption is on top?) */ { } caption /* This is for a table caption tag, not the p.caption Pandoc uses in a div.figure */ { caption-side: top; border: none; font-size: 0.9em; font-style: italic; text-align: center; margin-bottom: 0.3em; /* Good for when on top */ padding-bottom: 0.2em; } /* ---- Definition lists ---- */ dl /* The whole list */ { border-top: 2pt solid black; padding-top: 0.5em; border-bottom: 2pt solid black; } dt /* Definition term */ { font-weight: bold; } dd+dt /* 2nd or greater term in the list */ { border-top: 1pt solid black; padding-top: 0.5em; } dd /* A definition */ { margin-bottom: 0.5em; } dd+dd /* 2nd or greater definition of a term */ { border-top: 1px solid black; /* To separate multiple definitions */ } /* ---- Footnotes ---- */ a.footnote, a.footnoteRef { /* Pandoc, MultiMarkdown footnote links */ font-size: small; vertical-align: text-top; } a[href^="#fnref"], a.reversefootnote /* Pandoc, MultiMarkdown, ?? footnote back links */ { } @media print { a[href^="#fnref"], a.reversefootnote /* Pandoc, MultiMarkdown */ { /* Don't display these at all in print since the arrow is only something to click on */ display: none; } } div.footnotes /* Pandoc footnotes div at end of the document */ { } div.footnotes li[id^="fn"] /* A footnote item within that div */ { } table tr td,th { border-right: 1px solid; border-left: 1px solid; } /* You can class stuff as "noprint" to not print. Useful since you can't set this media conditional inside an HTML element's style attribute (I think), and you don't want to make another stylesheet that imports this one and adds a class just to do this. */ @media print { .noprint { display:none; } }
2025-11-16
preliminary discussion for possible integration into IS ISO/IEC 9899:202y
| document number | date | comment |
|---|---|---|
| n3739 | 202511 | Original proposal |
CC BY, see https://creativecommons.org/licenses/by/4.0
Traditionally, C’s function interfaces are particularly poor in the way information that can be transferred from the calling context into the caller and even poorer for the way back from the called function back to the caller. Originally, C only accounted for the number of arguments of a call and the return type, the addition of prototypes to the language added more static information about parameter types. This has later be enriched with only some more bits of dynamic information, in particular array sizes.
The goal of this paper is to continue this process of improvement of interface specifications to the dynamic verification of predicates which the current syntax is not capable to assert. Algorithmic invariants or properties that are needed for optimization are not expressible in the interfaces that the language currently offers and, in general, are either missed, annotated manually with external tools, or, most commonly, by handwritten textual proofs.
Nevertheless, many predicates that would be needed here can be
written as C expressions, provided that we add some additional state
that, without interacting with the core of the computation, helps to
propagate properties forward. The goal of this paper is to show how
function interfaces in C can be enriched such that necessary predicates
for calling them and conclusions from a call can be naturally expressed
in otherwise existing C syntax and semantics. The core argument of this
paper is to show that this execution model, with reasonable assumptions
on the acceptable predicates, is equivalent to an implementation in
conventional C where shallow inline wrappers execute cascaded if statements to
check the predicates.
The goal of this paper is not to provide methods for hardening existing interfaces, and, in particular, the C library. In the contrary, adding contracts as proposed here changes the semantics of interfaces (in general from UB to well-defined failures) and they are thus not suited to paint over existing deficiencies in interface design.
Numerous tools exist in the field that ensure the check of
predicates, usually expressed either as assertions or as pre- or
postconditions. In the following we discuss three of them that have had
the most influence on this proposal, here. The main motivation came from
a similar feature that is a candidate to make it into C++26, C++
contracts; others are the traditional assert macro and Annex K of the C
standard.
C++ contracts are a complicated feature, in particular they let
implementations chose between different semantics of a
translated contract (ignore, observe, enforce
and quick-enforce). Additionally, for two of these semantics
(observe and enforce) they allow to configure behavior
in the case of a contract failure by means of a user-supplied contract
handler, ::handle_contract_violation.
We don’t think that this whole spectrum of possibilities is adequate
for a C feature. First, for C it is not appropriate that readily
compiled translation units (such as the C library or other third-party
libraries) change behavior by means of a user-supplied contract handler.
Second, the ignore semantics are counter-productive as they
inhibit the evaluation of the predicate, and thus change semantics even
in case that a contract holds. Thereby, the ignore semantics
partially repeat a design error that was already present for C’s assert feature, see below. Third, C++
contracts allow the predicates to have side-effects, which makes it
difficult to argue about semantics of contracts and which makes it
impossible for both the caller and the callee to check the contract
independently if necessary.
As a consequence our proposal only is completely compatible with C++’s quick-enforce semantics, and this only if the contract has no side effects. In addition to that, we provide undefined-behavior semantics that actually assume that the predicate is evaluated but imply undefined behavior if it doesn’t hold. These may be different from C++’s ignore semantics even in the case that the evaluated predicate doesn’t have side-effects. Namely, as proposed here for C an optimizing compiler will be able to assume that the predicate holds unconditionally in all code that is sequenced after the evaluation.
Other features in our proposal that are different from C++ contracts:
if-statements._ReturnValue and cannot be annotated with
attributes._Pre and _Post for contracts.
C++’s syntax with pre and post is available with macros.stdc_contract_terminate that in general
leads to the printing of a diagnostic and then to program termination. A
user defined configuration macro __STDC_CONTRACT_UNDEFINED_BEHAVIOR__ can be
set locally to have the execution continue with undefined behavior,
instead.stdc_contract_assert,
which is the equivalent of C++’s contract_assert,
stdc_contract_assume that replaces C++’s
[[assume(PREDICATE)]].
Instead of terminating, it leads to undefined behavior if the predicate
doesn’t hold.Our proposed feature is meant to be compatible to C++ contracts at
least for the quick-enforce semantics and in the absence of
side-effects. We impose that the strategy in case of contract failure is
statically determined for each assertion and pre- or postcondition, and
that either the execution (of the program or thread) terminates or that
the execution continues as-if with an invocation of unreachable. In any
case, a C proposal for contracts cannot have exactly the same syntax as
for C++, because C++ uses syntactic concepts
that are not available in C. Nevertheless, we try to provide a syntax
that can be mapped between the languages by using some macro glue.
assert C library macroHistorically, C also has a macro that is intended to ensure a given
predicate, assert from <assert.h>.
Unfortunately this macro has not aged well and lacks capacities that
would be needed in a modern environment of translators with improved
optimization capabilities or even with integrated static analyzers. The
main defect of that macro is that it is switched off as a whole by the
user defined macro NDEBUG and that
thus the resulting executables have different syntax and semantics:
assert invocation may inadvertently be
invalidated by distant changes of the source.NDEBUG is set:
NDEBUG.NDEBUG implicitly leads to
undefined behavior if the predicate does not hold.These semantic defects seem to be shared with C++’s ignore semantics.
For the C library, Annex K has already taken the approach to replace the existing C library interfaces by new variants that make more guarantees and that introduce runtime checks for constraints. Our approach differs from there in multiple ways
As it is the common model in C, we will assume that the functions that make up a program are split in translation units, TU, that each correspond to a C file that is compiled separately from all the others. A TU knows about functions in another TU only by declarations, most properties of these other functions are hidden. The advantages of such an approach are plenty, for example modularization, maintainability and fast compilation times, but there are also severe disadvantages: false assumptions lead to complicated bugs and lack of information misses optimization opportunities.
The proposed model for adding contracts to C interfaces is relatively simple at the surface: we create the possibility to add a list of pre- and postconditions to function declarations, right after the closing parenthesis of the parameter list (resp. after possible attributes). These contracts can then be ensured in different TU by a concept that we coin “syntactically equivalent”. Contracts can thereby effectively be checked from the calling context of a function or within the function itself (or both) without leading to semantic differences. The intent is that in general for preconditions the check is effected in the caller and for postconditions in the callee, such that these checks may seamlessly use knowledge that is already present in these contexts.
We took care that contracts as proposed here
<stdc_contract.h>Remark: It would also be possible to add these macros to
<assert.h>(but without using theNDEBUGmechanism) or<stdlib.h>(because it already hostsunreachable).
1 Contracts enable macros (stdc_contract_terminate, stdc_contract_assume, stdc_contract_assert, pre and post) and annotations for function
declarations (_Pre and _Post) that check
specified predicates and that declare a number of state variables that
can be used to ensure the consistency of an execution. The intent is to
make such contracts visible simultaneously for the call side of a
function and its implementation, and, to restrict the specified
predicates such that evaluating them on either side (caller or callee)
leads to the same result. Thereby contracts open the possibility to
verify programs and to provide optimization opportunities, even if the
program is composed of several translation units.
2 Here, a predicate is a logical condition specified as an
expression of scalar type. When converted to bool a predicate is
said to hold if the result is true and it is said
to be violated if it is false. If a
predicate that is evaluated as an argument to one of stdc_contract_assume, stdc_contract_assert, _Pre or _Post is violated,
an execution (of the program or of an individual thread) is henceforth
erroneous and it is either terminated or the behavior is undefined.
Additionally, if the value of such a predicate is an integer constant
expression, the value is determined at translation time and if it does
not hold a diagnostic is required.
3 For stdc_contract_assert, _Pre and _Post, an optional
program-defined configuration macro __STDC_CONTRACT_UNDEFINED_BEHAVIOR__
describes if a violated predicate terminates execution in a defined way
or if such an execution continues with undefined behavior. The intent
here is that executions that respect all contracts show the same
behavior, regardless if or how this macro is set.
__STDC_CONTRACT_UNDEFINED_BEHAVIOR__ is not
defined, expands to itself or expands to the token 0 at the point of
declaration of an assertion or contract, a violation terminates the
execution of the program or the current thread. How that termination
happens is described for stdc_contract_terminate.unreachable().4 The macros pre and post are defined as if by the following
#ifndef pre
# define pre _Pre
#endif
#ifndef post
# define post _Post
#endif5 Recommended practice Application programs are
encouraged to define __STDC_CONTRACT_UNDEFINED_BEHAVIOR__ only if
this is unavoidable. For example, if the predicates that lead to a
contract violation are not traceable by the translator, or in situations
where terminating an execution in a controlled way is not possible.
stdc_contract_terminate macro1 The stdc_contract_terminate macro
takes one argument, which is a string literal that provides a diagnostic
message.
Rationale: strings are always required to be string literals, here, to ensure that these strings are valid during all phases of termination of the execution. In particular, dynamic allocations of diagnostic strings are to be avoided.
If at the lexical position of an invocation of the stdc_contract_terminate macro the
configuration macro __STDC_CONTRACT_UNDEFINED_BEHAVIOR__ is not
defined, expands to itself or expands to the token 0, the invocation is
as if a call to a function with the following declaration were
placed:
[[noreturn]] void (stdc_contract_terminate)(char const*mess);
typedef void stdc_contract_termination_t(int);Such a call issues a diagnostic message that contains the string to
which mess points on the standard
error stream and then terminates the execution of the program or of the
current thread. This termination happens via a call to a function that
is compatible to stdc_contract_termination_t
with the argument EXIT_FAILURE. It is
implementation-defined if the function that is called is _Exit, quick_exit, exit or thrd_exit, or if it is a different
implementation-defined function with type stdc_contract_termination_t.
The implementation-defined choice is determined before or during
translation phase
7.Foot)
Foot) In addition to a translation-time
configuration by means of the user-defined macro __STDC_CONTRACT_UNDEFINED_BEHAVIOR__, it is
implementation-defined if and how the choice of the function is
configurable at translation time.
2 If the execution reaches an invocation of the stdc_contract_terminate macro where the
configuration __STDC_CONTRACT_UNDEFINED_BEHAVIOR__ macro
expands to a token sequence different from itself or from the token
0, the effect
is as-if the unreachable macro is
called.
3 Recommended practice It is recommended that the
implementation-defined diagnostic message that is printed before
termination also contains the values of __FILE__ and __LINE__ as of the point of invocation of
the macro.
stdc_contract_assert and stdc_contract_assume macros1 Similar to a static_assert
expression, these macros have the syntactic properties of a function
call of type void where the first
parameter has type bool and the second
parameter has type char const*,
only that the second argument is optional and, if present, is a string
literal. The first argument to an invocation of these macros is called
the predicate, the optional string argument, if any, the
message.
2 If the predicate is an integer constant expression, an invocation
is equivalent to a static_assert
expression with the same arguments. Otherwise, if the predicate converts
to true the
behavior of the execution for both macros is is the same; the only
effects on the invocation then are those of the evaluation of the
predicate.FTN)
FTN) Thus, the translator may assume that the predicate holds for program code that is sequenced after such an invocation and optimize the program with that knowledge.
3 Otherwise, the macro stdc_contract_assume behaves as-if defined
as follows.
#define stdc_contract_assume(PREDICATE, ...) \
((bool)(PREDICATE) \
? (void)0 \
: unreachable())Similarly, if the configuration macro __STDC_CONTRACT_UNDEFINED_BEHAVIOR__ expands
to anything but itself or the token 0 at the point of
invocation the behavior of stdc_contract_assert is as-if defined as
follows.
#define stdc_contract_assert(PREDICATE, ...) \
((bool)(PREDICATE) \
? (void)0 \
: unreachable())4 Otherwise, if the configuration macro __STDC_CONTRACT_UNDEFINED_BEHAVIOR__ is not
defined, expands to itself or expands to the token 0 at the point of
invocation the behavior of stdc_contract_assert is as-if defined as
follows.footer)
#define stdc_contract_assert(PREDICATE, ...) \
((bool)(PREDICATE) \
? (void)0 \
: stdc_contract_terminate("" __VA_ARGS__))footer) If the predicate does not hold, it follows that the diagnostic string that is printed contains the message argument, if any.
5 NOTE For both macros the predicate is evaluated unconditionally. This property is important to warrant a consistent execution of the program.
add _Post
and _Pre to
the list of keywords.
Rationale: We want the expressions that we use for contracts to be robust against some reformulations, namely against
- grouping with redundant parenthesis
- expressing constant expressions in different ways.
1 Two expressions are syntactically equivalent if one of the following holds:
Otherwise two expressions are syntactically different.
2 Example With the following declarations
enum { zero = 0, nix = 0, };
constexpr double one = 1.0;
extern float what;
typedef double bubble;the following hold
| A | B | synt. | remarks |
|---|---|---|---|
0 |
'\000' |
equiv | literals |
0 |
0 + 0 |
equiv | constant expressions |
0 |
zero |
equiv | two constants |
nix |
zero |
equiv | two named constants |
0 |
0U |
diff | different type |
1.0 |
one |
equiv | |
uabs(1) |
uabs(1L) |
diff | arguments have different types |
what + what * what |
what + (what * what) |
equiv | |
one + one |
2 * one |
equiv | two constant expressions |
what + what |
2 * what |
diff | not constant |
(double)32 |
(bubble)0x20 |
equiv |
Apply the following changes.
Add contract-list to the syntax in 6.7.7.1
direct-declarator:
identifier attribute-specifier-sequenceopt
( declarator )
array-declarator attribute-specifier-sequenceopt
function-declarator attribute-specifier-sequenceopt contract-listopt
and in 6.7.8
direct-abstract-declarator:
( abstract-declarator )
array-abstract-declarator attribute-specifier-sequenceopt
function-abstract-declarator attribute-specifier-sequenceopt contract-listopt
15 Two function types without contract list are compatible if and only if all of the following hold:
- They specify compatible return types.
- The parameter type lists agree in the number of parameters and in whether the function is variadic or not.
- The corresponding parameters have compatible types.
In the determination of type compatibility and of a composite type, each parameter declared with function or array type is taken as having the adjusted type and each parameter declared with qualified type is taken as having the unqualified version of its declared type.
15′ Two function types that both have a contract list are compatible if they are compatible without a contract, if they have the same number of contracts, and, if after possible naming or renaming of parameters and ghost variables, the ordered pairs of contracts are syntactically equivalent¸ see 6.7.7.4.1. Otherwise, two function types where one is without a contract list are compatible if they are compatible when the contract list of the other is omitted.
Rationale: because of the constraints in 6.7.7.4.1, all syntactically equivalent predicates will evaluate to the same values, regardless which declaration is used.
6.7.7.4.1 Contracts
Syntax
1 contract-list:
precondition-list postcondition-listopt
postcondition-list
precondition-list:
precondition precondition-listopt
postcondition-list:
postcondition postcondition-listopt
precondition:
_Preattribute-specifier-sequenceopt(selection-header)
postcondition:
_Postattribute-specifier-sequenceopt(selection-header)
Description
2 A contract is either a pre- or a postcondition. The expression part, if present, of a selection header in a contract is called the predicate of the contract; if there is no such expression, the selection header is a simple declaration with an initializer and the predicate is that initializer. The optional attribute specifier sequence in a contract appertains to the declared contract.
3 Contracts serve to formulate requirements that are transferred from the calling context into the function context (preconditions) and others that are transferred back from the call into the calling context (postconditions).
4 In the selection header of each contract, zero, one or several variables with automatic storage duration, called ghost variables, are defined and initialized. They are visible to all following contracts in the list.
5 Additionally, if the return type of the function is not void, there is a
predefined ghost variable named _ReturnValue of that return type that holds
the return value of the function during all evaluations in a
postcondition. It is visible and accessible in all postconditions.
6 Within preconditions, parameters evaluate to the value as provided by the call. For postconditions, it is unspecified if the result of the evaluation of a parameter is the value provided by the call or as possibly modified by the function body.FT)
FT) If a postcondition needs to access the unmodified value of a parameter as provided by the corresponding call, it is recommended to store that initial value in a ghost variable of a precondition and then to access that ghost variable in the postcondition.
7 A pair of preconditions (respectively postconditions) are syntactically equivalent if their predicates are syntactically equivalent and if all ghost variables, if any, are declared
Constraints
8 All identifiers used in an evaluation in a contract shall have
visible declarations and, unless they are named constants, shall not
refer to objects or functions with internal linkage. All ghost variables
shall have an initializer and a type that is neither atomic nor volatile qualified.
If the function declaration is also a definition, a ghost variable shall
not be used in any way inside the function body.
Rationale: contracts are primarily thought as annotations of an interface between different translation units. Therefore they should not read the contents of static variables or call static functions. They must not have UB if a ghost variable would be uninitialized in some situations.
9 A predicate of a contract that is an integer constant expression shall not have the value zero.
Rationale: predicates that are known to be always false should abort compilation with semantics similar to
static_assert.
10 If within a translation unit that contains the definition of a
function F and anywhere in the
translation unit there is a declaration of F with contract list, that declaration shall
be visible in the position of the definition of F.
Rationale: The definition must always implement the contracts for the case that it is called through a function pointer.
11 The address of a function where the definition has a visible
declaration with a contract list shall not be passed as the second
argument to a call to thrd_create.
Rationale: Calling such a function would evaluate the contracts in a different thread than the visible context of the call to
thrd_create.
Semantics
12 On entry to the function, after the parameter types and values
have been determined but before the function body is executed, all
preconditions are evaluated. Similarly, after a return statement has
been reached and a possible return expression has been evaluated or if
the closing brace of the body has been reached, all postconditions are
evaluated. All evaluations of contracts are sequenced in lexicographic
order within the same thread as the function call.
13 For each contract, if during the evaluation the predicate converts
to true, the
execution continues normally. Otherwise: if __STDC_CONTRACT_UNDEFINED_BEHAVIOR__ is not
defined, expands to itself or expands to the token 0, the execution (of
the program or thread) terminates as-if by an invocation of stdc_contract_terminate with a string
literal argument that contains at least the current expansion of the
macros __FILE__ and __LINE__; otherwise, the execution continues
as-if with an invocation of unreachable.
14 Evaluations in contracts shall not have side effects other than initializing a ghost variable. They shall not access modifiable objects of static storage duration unless these stem from a definition with external linkage or unless they are accessed through a pointer parameter.Footy)
Footy) This does not exclude string literals but
all variables or compound literals declared with static unless they
are const and
not volatile
qualified.
15 If a function definition that has no visible declaration with a contract list is called with a visible declaration that has a contract list, the behavior is undefined.
Rationale: This forbids the compilation of a function without contracts and then adding such contracts later to a header file. If that would be allowed, the function could eventually be called without the check for any of the predicates in place, and thus the semantics of the contracted interface would be violated.
16 NOTE If a function definition that has a visible declaration with contract list is called with a declaration that has a syntactically different contract list the two declarations are not compatible, thus the behavior is undefined. This does not exclude the case that such a function is called with a declaration that has no contract list. In particular, such a function may always be called by using a function pointer value holding its address.
17 EXAMPLE Provided that the implementation has the
type uintptr_t
and that the ordering of that type is compatible with the conversion
from unsigned char*
the following presents contracts for a function my_memcpy that fulfills the same
requirements as memcpy. Other than for
memcpy, here, passing overlapping byte
arrays as arguments terminates the program execution in a controlled
way, no undefined behavior occurs in that case.
void *my_memcpy(void*restrict dest, const void*restrict src, size_t n)
_Pre(dest) _Pre(src) _Pre(n)
_Pre(const unsigned char *const dest_start = dest,
*const src_start = src;
((uintptr_t)(src_start + n) <= (uintptr_t)dest_start)
|| ((uintptr_t)(dest_start + n) <= (uintptr_t)src_start))
_Post(_ReturnValue == dest_start);The first preconditions make the general guarantees for functions in
the library clause that all parameters have valid values. The
precondition with ghost variables tests that both byte arrays of length
n that are passed as arguments do not
overlap. Last, the postcondition assures that the function returns the
same value as dest.
Information for the range check will in general be present at the
point of a call; thus if a translator is able to prove that requirement
the check for it can be omitted. For the postcondition, the ghost
variable dest_start is used, because
it always holds the original value that was passed as the first
argument. In general, the implementation will be able to make this
guarantee when translating the definition of the function, so then the
check for this postcondition is superfluous. On the other hand, a
translator where only the declaration is visible is able to deduce that
property from the postcondition and optimize accordingly in the context
of the call.
As defined in the suggested text above, it may seem as if contracts impose severe constraints to all executions, as they assume that all contracts are checked on all function calls. Because of the strong impact that such a behavior would then have for example on execution times, as such this would probably not be a viable proposal.
In fact, modern compilers and static analyzers should be able to prove for many contracts that they hold unconditionally; thus they are dead code and can be removed. One of the keys to that is that all evaluations in contracts are forced to be free of side effects and can in general be hoisted out of the immediate calling context. The other key is the requirement that contracts for the same function, even if specified in different translation units, have to be syntactically equivalent, and, given the constraints, will always evaluate the same in the calling and the called context.
In the following a selection-header of a precondition is given in the form
dec_preᵢ predicate_preᵢwhere the part
dec_preᵢ is a declaration (of one
or several ghost variables) in the formtype_preᵢ ghost_pre⁰ᵢ = init_pre⁰ᵢ, ...;predicate_preᵢ is a predicate
that is to be checked.The same syntax as for if-statements is
admissible, that is either dec_preᵢ or
predicate_preᵢ can be empty, but not
both. Both, dec_preᵢ and predicate_preᵢ, are the usual C constructs
as they may appear in if-statements.
If predicate_preᵢ is empty, then
(per definition of the if statement) dec_preᵢ is a simple declaration defining
exactly one ghost variable ghost_pre⁰ᵢ
with initializer init_pre⁰ᵢ. predicate_preᵢ is then assumed to be the
expression init_pre⁰ᵢ.
Analogous definitions hold for postconditions in the form dec_postᵢ predicate_postᵢ.
In the following we discuss three different views of a function with contracts,
If a function with contracts is given as an inline definition, the
semantics are relatively simple and easily explained in existing C
syntax: tests for the preconditions have to be inserted before the
remaining of the function body and tests for post conditions afterwards.
There is only one subtlety, namely the handling of return statements.
In fact, all return statements
are replaced by assignments to the auxiliary variable _ReturnValue and then a proper return statement is
added at the end, after all postconditions have been checked.
Suppose a list of contracts is attached to a function definition with
inline:
// header
inline returnType func(parameterTypeList)
pre(dec_pre₀ predicate_pre₀)
pre(dec_pre₁ predicate_pre₁)
…
pre(dec_preₙ predicate_preₙ)
post(dec_post₀ predicate_post₀)
post(dec_post₁ predicate_post₁)
…
post(dec_postₘ predicate_postₘ)
function-body// translation unit
typeof(func) func; // instantiation of funcThe effect is as-if the function was augmented with cascaded if statements that
test for the conditions.
// header
inline returnType func(parameterTypeList) {
if (dec_pre₀ !predicate_pre₀) {
static_assert(ICE_OR_TRUE(predicate_pre₀), "some string");
stdc_contract_terminate("some string");
} else if (dec_pre₁ !predicate_pre₁) {
static_assert(ICE_OR_TRUE(predicate_pre₁), "some string");
stdc_contract_terminate("some string");
} else if (…) {
...
} else if (dec_preₙ !predicate_preₙ) {
static_assert(ICE_OR_TRUE(predicate_preₙ), "some string");
stdc_contract_terminate("some string");
} else {
returnType _ReturnValue;
replaced-function-body // replace return by assignment
if (dec_post₀ !predicate_post₀) {
static_assert(ICE_OR_TRUE(predicate_post₀), "some string");
stdc_contract_terminate("some string");
} else if (dec_post₁ !predicate_post₁) {
static_assert(ICE_OR_TRUE(predicate_post₁), "some string");
stdc_contract_terminate("some string");
} else if (…) {
...
} else if (dec_postₘ !predicate_postₘ) {
static_assert(ICE_OR_TRUE(predicate_postₘ), "some string");
stdc_contract_terminate("some string");
} else {
return _ReturnValue;
}
}
stdc_contract_terminate("severe bug, execution must never reach here");
}Here, replaced-function-body is the function-body
where all returnexpression
statements are replaced by assignments _ReturnValue =expression
and the macro ICE_OR_TRUE is supposed
to resolve to its argument if that is an ICE and to true, otherwise.
Note that the if-statements with
the inverted condition are such that all definitions of ghost variables
in the form dec_preᵢ or dec_postᵢ are visible to all evaluations in
the subsequent contracts.
The derived inline function definition has the following properties.
static_assert
expressions only trigger for ICE.In the calling context some information may be present which ensures that the preconditions are always satisfied. Then, these can effectively be optimized out at compile time without changing the semantics. Similarly, the knowledge about postconditions can be integrated into the calling context.
If a function is split into an external declaration in a header and a definition in one TU, things get a bit more complicated. Here, we define an auxiliary inline definition that tests for contracts as above and calls a different internal function to access the original definition. The subtlety here is that we only want preconditions to be evaluated in the calling context as assertions, but postconditions should only appear as assumptions.
Suppose a list of contracts is attached to a function declaration
without inline:
// header
returnType func(parameterTypeList)
pre(dec_pre₀ predicate_pre₀)
pre(dec_pre₁ predicate_pre₁)
…
pre(dec_preₙ predicate_preₙ)
post(dec_post₀ predicate_post₀)
post(dec_post₁ predicate_post₁)
…
post(dec_postₘ predicate_postₘ);// translation unit
returnType func(parameterTypeList) function-bodyThe effect is as-if the following function were defined in the header.
// header
inline returnType func(parameterTypeList) {
if (dec_pre₀ !predicate_pre₀) {
static_assert(ICE_OR_TRUE(predicate_pre₀), "some string");
stdc_contract_terminate("some string");
} else if (dec_pre₁ !predicate_pre₁) {
static_assert(ICE_OR_TRUE(predicate_pre₁), "some string");
stdc_contract_terminate("some string");
} else if (…) {
...
} else if (dec_preₙ !predicate_preₙ) {
static_assert(ICE_OR_TRUE(predicate_preₙ), "some string");
stdc_contract_terminate("some string");
} else {
extern typeof(func) func_aux;
auto const _ReturnValue = func_aux(parameters, …);
if (dec_post₀ !predicate_post₀) {
static_assert(ICE_OR_TRUE(predicate_post₀), "some string");
unreachable();
} else if (dec_post₁ !predicate_post₁) {
static_assert(ICE_OR_TRUE(predicate_post₁), "some string");
unreachable();
} else if (…) {
...
} else if (dec_postₘ !predicate_postₘ) {
static_assert(ICE_OR_TRUE(predicate_postₘ), "some string");
unreachable();
} else {
return _ReturnValue;
}
}
stdc_contract_terminate("severe bug, execution must never reach here");
}The function func_aux will be shown
in the next section. The name func_aux
is only chosen for the example, a real implementation chooses a unique
name that is not predictable for the user.
The auxiliary inline function definition has the following properties.
ICE_OR_TRUE ensures that the static_assert
expressions only trigger for ICE.func_aux that implements the user code and
that asserts the postconditions.Note that again preconditions that are known to hold in the calling context can effectively be optimized out at compile time without changing the semantics. Similarly, postconditions can provide valuable information for the calling context. For these optimization opportunities it is essential that contracts do not have side effects.
The translation unit then provides three function symbols: a static function
func_inner that implements the user
code without contracts, the auxiliary function func_aux that is called by func as given above, and the instantiation
of func itself. The names func_inner and func_aux are only for the example; both are
unique names that are not predictable for the user.
// translation unit
static returnType func_inner(parameterTypeList) function-body
returnType func_aux(parameterTypeList) {
if (dec_pre₀ !predicate_pre₀) {
static_assert(ICE_OR_TRUE(predicate_pre₀), "some string");
unreachable();
} else if (dec_pre₁ !predicate_pre₁) {
static_assert(ICE_OR_TRUE(predicate_pre₁), "some string");
unreachable();
} else if (…) {
...
} else if (dec_preₙ !predicate_preₙ) {
static_assert(ICE_OR_TRUE(predicate_preₙ), "some string");
unreachable();
} else {
auto const _ReturnValue = func_inner(parameters, …);
if (dec_post₀ !predicate_post₀) {
static_assert(ICE_OR_TRUE(predicate_post₀), "some string");
stdc_contract_terminate("some string");
} else if (dec_post₁ !predicate_post₁) {
static_assert(ICE_OR_TRUE(predicate_post₁), "some string");
stdc_contract_terminate("some string");
} else if (…) {
...
} else if (dec_postₘ !predicate_postₘ) {
static_assert(ICE_OR_TRUE(predicate_postₘ), "some string");
stdc_contract_terminate("some string");
}
}
stdc_contract_terminate("severe bug, execution must never reach here");
}
typeof(func) func; // instantiation of funcThe only function of the three that is callable by user code is func, either as the inline definition in the
header file or via the external symbol; both types of calls have the
same semantics.
In particular, func_aux, will never
be called directly by user code. It
func_inner under these
assumptions,Note that in the translation unit the whole definition of func_inner is visible, and that the only use
of that function is the call inside func_aux. The address of func_inner is never taken directly and it
does not escape from the TU where it is defined. So for the
implementation of func_inner the
translator may assume that the preconditions hold and it may transfer
the preconditions as assumptions to the user code. Similarly, in func_aux all information from the user code
is available for the tests of the postconditions such that they possibly
can be optimized out.
A non-optimized version of the instantiation of func would evaluate each predicate twice,
once from the inline body of func and once from the body of funx_aux. Indeed, each predicate is first
checked as an assertion (possibly calling stdc_contract_terminate())
and then as an assumption (running into unreachable()),
so because the predicate cannot have side effects and are syntactically
equivalent the second check can easily be optimized away.
It seems that in the advent of C++26, all major C++ compilers have implementations of C++ contracts with some of the semantics models mentioned above.
In C, there is now wide support for some form of “assume” feature,
either as builtins (clang, MSV) or as an attribute (gcc). Together with
C23’s unreachable, in C
this helps to implement the semantics with shallow inline wrappers as
described previously.
If there is no additional help from the compiler, the need to
implement two consistent wrappers (the inline version of
func and func_aux above) makes the approach a bit
tedious and not very practical. Nevertheless, this already provides a
way to test the semantics and to see how modern compilers are able to
detect redundancies in chained contracts.
Our eĿlipsis preprocessor provides include files <ellipsis-contracts.h>,
<ellipsis-interface.h>
and <ellipsis-implementation.h>
that uses a syntax that is relatively close to the proposed syntax for
declarations with contracts, and uses them to implement the following
features:
inline
wrappers.func_aux
above).if statement with
declarations to implement the ghost variable feature.The features stdc_contract_assert,
_Pre and _Post use static
strings for the diagnostics that are generally embedded in the
executable. This can be used to evaluate which of the contracts actually
survived optimization and thereby to make a detailed assessment of the
particular optimization capabilities.
eĿlipsis’ implementation also uses contracts as described itself, which also provides a lot of test cases for the abilities of modern compilers. An evaluation with gcc shows an interesting level of optimization, as long as not too much aliasing analysis needed for the contracts.
Thanks to Martin Uecker and David Tarditi for review and discussions.