N3199
Improved __attribute__((cleanup)) Through defer

Published Proposal,

Previous Revisions:
None
Authors:
Paper Source:
GitHub
Issue Tracking:
GitHub
Project:
ISO/IEC 9899 Programming Languages — C, ISO/IEC JTC1/SC22/WG14
Proposal Category:
Change Request, Feature Request
Target:
C2y/C3a

Abstract

Many compilers’ "cleanup" attribute has long provided a scope-based, compile-time deterministic, well-known mechanism for the C language to clean up resources of all kinds (not just memory). This proposal attempts to standardize something as close to existing practice as possible while providing a select and measured set of behaviors to ensure greater portability and usability in the C ecosystem.

1. Changelog

1.1. Revision 0 - December 10th, 2023

2. Introduction, Motivation, and Prior Art

defer in C would have avoided probably half of the issues ever caught by the Clang Static Analyzer’s malloc/free and retain/release checking, at least ten years ago when I worked on it.

Jordan Rose, Formerly Swift @ Apple, Currently Signal, December 1st, 2023

The need to clean up resources, undo partially-successful function invocations, and perform actions upon early return has been a computing need since we started having computers that were capable of calculation. This need intensified with the introduction of resources that work with the boundaries of a system, from sockets and files to memory allocations and parallelism primitives.

We have also had a large variety of failures related to goto cleanup; or goto fail;-style programming. It becomes incredibly precarious to balance such code correctly, and sometimes individuals even opt out of goto entirely and simply repeat the necessary cleanup on exit of each scope:

So this isn’t as ... "nice" ... as those "goto err1; ... goto err2;" style solutions, this is my own little piece of hell :)

It’s the result of retrofitting freeing memory in error situations to an application which used to not care about that because it was a one-shot thing.

— Martin Dørum, with code from: Housecat

As stated by the code author, this is not more robust code. In fact, this sort of idiom, while explicit, has the user repeating themselves over and over again. Each nested scope, each conditional, is a chance to forget to free an element, or to free an element too many times.

Conversely, there is the opposite idiom where, in an attempt to reduce the sort of code shown above, the author deploys a series of outside-in, inside-out gotos and labels to handle failure. Other times, there is only one failure label, and sentinel values (such as INVALID_HANDLE, the null pointer constant, or 0) are used so that free/DestroyHandle/ReleaseResource/etc. calls are effectively no-ops. But even a simple set of conditionals with only one goto fail; can be error-prone in conjunction with formatting failures and other issues.
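As a minimal sketch of the single-label, sentinel-value idiom just described: the function name build_pair and its fail_second switch are hypothetical and exist only to simulate a failing allocation. Both pointers start out as null pointers so that the shared cleanup label can call free unconditionally.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 0 on success and -1 on failure. `fail_second` is a hypothetical
   switch that exists only to simulate the second allocation failing. */
int build_pair(size_t n, int fail_second) {
  int result = -1;
  char *first = NULL;  /* sentinel: free(NULL) is a no-op */
  char *second = NULL; /* sentinel: free(NULL) is a no-op */

  first = malloc(n);
  if (first == NULL)
    goto cleanup;
  second = fail_second ? NULL : malloc(n);
  if (second == NULL)
    goto cleanup;

  memset(first, 'a', n);
  memset(second, 'b', n);
  result = 0;

cleanup:
  /* Every exit funnels through here, in reverse order of acquisition. */
  free(second);
  free(first);
  return result;
}
```

The idiom works, but only as long as every new resource is given a sentinel initializer and a matching line under the label; nothing in the language checks that pairing.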

2.1. Language-based Solutions

Language-based solutions are far superior to library-based solutions for this problem. They provide a level of guarantees to make code a lot cleaner than normal.

Even individuals with legitimate technical grievances against C++ speak highly of things such as g_autoptr:

It takes about 10 minutes to get used to the “g_autoptr(foo)” stuff, and then its not a big deal.

I’ve basically removed all of my “goto error;” handling.

If you need MSVC support, clearly you shouldn’t be using this yet. My hope is that the clang interop with MSVC will help make that a non-issue before long.

XlC might actually support it with recent versions. They are pretty good about tracking GCC frontend features. I doubt suncc will get to it though.

Christian Hergert, February 5th, 2015

This is not the first time programmers have moved towards this sort of solution. However, doing so often came at the cost of needing to leave the C language, or — as done above — embracing a potentially high degree of non-portability to achieve the goals. As a Standards Committee, it would seem prudent to produce a viable implementation of this that is capable of satisfying all of C’s stakeholders without adding undue burden to the language.

The only drawback of __attribute__((cleanup())) is that it is tied to passing in a single function, and that function only takes a single void* parameter. It becomes difficult to pass additional information to the function without devising a (potentially contested) data transmission mechanism between the point where the function is named in the attribute and the point where that function actually runs.
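One workaround seen with the attribute today is to bundle the extra information into the annotated variable itself. The following is a sketch under that assumption; struct tracked_buffer, release_tracked, and fill_and_sum are hypothetical names, not part of any existing library.

```c
#include <stdlib.h>

/* Hypothetical bundle: the cleanup function receives only a pointer to the
   annotated variable, so everything it needs must live inside that variable. */
struct tracked_buffer {
  void *data;
  int *release_count; /* extra context carried alongside the resource */
};

static void release_tracked(struct tracked_buffer *tb) {
  free(tb->data);
  if (tb->release_count != NULL)
    ++*tb->release_count;
}

/* Fills a temporary array with 0..n-1 and returns the sum, or -1 on failure. */
int fill_and_sum(int n, int *release_count) {
  __attribute__((cleanup(release_tracked))) struct tracked_buffer tb = {
    malloc((size_t)n * sizeof(int)), release_count
  };
  int *values = tb.data;
  if (values == NULL)
    return -1;
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    values[i] = i;
    sum += values[i];
  }
  return sum;
}
```

Calling fill_and_sum(4, &count) should yield 6 and increment count once. The cost is a purpose-built structure and function for what is conceptually two statements at scope exit.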

2.2. Library-based Solutions

The existence of language-based solutions that others have built on top of has not stopped people from creating their own solutions entirely within library-based code. There are both simple and robust examples of this sort of code in the wild, with some of the most readily available being:

The details of these libraries (both the simple version in moonchilled’s rendition and the more robust version in Jens Gustedt’s offering) are testaments to the ability of a dedicated individual to work through the conditions of their language to produce remarkable code to solve the engineering problems they are facing. Unfortunately, each has a wide variety of implementation drawbacks:

Even when the final code compiles down to something fairly efficient, these macros still tend to cost extra stack space or take up additional binary size through (potential) dynamic allocations to hold defers that go beyond a certain limit and need extra space to hold certain kinds of callbacks. This means that most code written using library solutions ends up fairly suboptimal compared to the language-based or preprocessor-based solutions.

3. Design

The design of this feature chooses 4 core tenets upon which it is based:

These four tenets are important to cleave close to existing practice and avoid any potential run-time overhead.

By making it a locally-scoped entity, we can have many more statements and much more data accessible to what is the effective equivalent of the cleanup function. This prevents us from fiddling with making larger structures and calling a function, when we can instead have it entirely in-line and improve code motion and optimization opportunities that would not be available otherwise.

Notably, this is not similar to C++'s "RAII" (Resource Acquisition is Initialization) idiom in the specific area that std::terminate() will run every object’s destructor up the entire call stack. While C++ implementations with extensions may offer defer blocks the same behavior as destructors in their C++-with-extensions mode, C does not need to have any terminate()-like function or unwinding capabilities at all.

We explain these tenets and our design choices in the following subsections.

3.1. Defer Binding: Scope-Based

defer binds to the scope it was defined in. There is one other choice for defer, which is to bind to the scope of the function call. This is the programming language Go’s choice of defer[go-defer]. Choosing function-based binding for C would be an unmitigated disaster of corner cases and add the potential for needing run-time accumulation (e.g., dynamic allocation) of resources for defer in order to handle defers which appear in loops or other constructs whose execution is only determined at run-time but is still scoped to a compile-time entity like the scope of a function.

To give an example of how quickly this behavior unravels itself, consider the following Go code:

for i := 0; i < 100000; i++ {
  mutex.Lock()
  defer mutex.Unlock()
  *counter += 1
}

This code immediately deadlocks. The fix to this in Go is to write the loop like this:

for i := 0; i < 100000; i++ {
  func() {
    mutex.Lock()
    defer mutex.Unlock()
    *counter += 1
  }()
}

This is just an extremely long-winded way of having a scope-based defer. It is not all bad for function-based defer: advantages include queueing up a piece of work to be done only if certain conditions are met. E.g., one can place a defer inside of an if statement and then have it run at the end of the function if and only if that if was entered. However, the cost of such behaviors means attempting to shoehorn in a design from Go, which has the backing of a garbage collector and on-demand allocation. The first Go snippet above, if it had not deadlocked, would have required dynamic allocation in an earlier version of Go. It took significant optimization work to get to a place where this would no longer be the case.

We know for a fact that many C compilers are averse to taking control of hidden dynamically-sized (not necessarily heap) allocations. It can often result in issues in the portability of code to smaller platforms. We also know for a fact that memory is neither free nor cheap in the C programming language; Go can pull this off because it has a run-time that can manage its garbage collector and is generally geared towards high-resource environments (even if it uses those resources efficiently). As a language feature, we cannot prioritize a design which may require an unbounded amount of code to be (potentially) stored in heap space so that it can be run as a callback, with potentially-saved data from each iteration of a loop stored in that construct as well.
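To make the contrast with the Go loop concrete, here is a sketch of the same loop using today's scope-bound __attribute__((cleanup)) as a stand-in for defer. The struct toy_lock type and its functions are hypothetical placeholders for a real mutex; they only count how many times the "lock" is held at once.

```c
/* Hypothetical stand-in for a mutex: tracks current and peak hold counts. */
struct toy_lock {
  int held;
  int max_held;
};

static void toy_acquire(struct toy_lock *lock) {
  lock->held++;
  if (lock->held > lock->max_held)
    lock->max_held = lock->held;
}

/* Cleanup function: receives the address of the annotated pointer variable. */
static void toy_release(struct toy_lock **guard) {
  (*guard)->held--;
}

int count_to(int n, struct toy_lock *lock) {
  int counter = 0;
  for (int i = 0; i < n; ++i) {
    toy_acquire(lock);
    __attribute__((cleanup(toy_release))) struct toy_lock *guard = lock;
    counter += 1;
  } /* the cleanup runs here, at the end of every iteration */
  return counter;
}
```

Because the release is bound to the loop body's scope, the lock is never held more than once at a time, and no per-iteration storage has to accumulate until the function returns.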

3.2. defer Syntax and Grammar: secondary-block

The syntax of a defer block simply uses:

defer-statement:

defer secondary-block

A secondary-block is the same grammar term used for e.g. if statements, so all of the typical syntactic constructs (even the ones that look questionable) are allowed. We expect coding guidelines and build-failing tools (e.g., clang-tidy) to enforce conventions that make these more legible and resistant to the usual failure cases.

We also chose the all-lowercase name defer for this feature. This is, technically, a breaking change. We do not mind swapping every instance of defer in this paper to be _Defer, or _Cleanup, or _Meow; the exact spelling of the introductory keyword is of little consequence to us. We use defer in this paper to draw clear connections to the existing practice in other languages which use a similar keyword, such as the Go programming language. defer/Defer is also, as shown in the existing practice above, a very common spelling for this feature.

3.3. Reference Captures "by Default"

This proposal does not tie the acceptance of the proposal to the presence of an explicit capturing clause, as was the case in other versions[N2895][N2542]. It simply allows any variables that are visible in the scope of the defer statement to be used like any other named entity. No implicit or invisible copy of the variables is performed: the defer statement simply refers to those variables in the same way as the rest of the surrounding code. This is the safest and best way to handle the way that this feature works.

The reason this is critically important is due to the potential to double-free. For example, consider the following function:

void f () {
  void* p = malloc(meow);
  defer free(p);
  /* … */
  if (some_important_condition) {
    take_ownership_and_use(p);
    p = NULL;
  }
  /* … */
}

If captures are done by-value (the pointer’s value is copied and held onto for the duration of the code until the function is exited by some means), then this example is a potential double-free. This is an enormous footgun. Copying by-value also introduces a (hidden, semi-uncontrolled) state that will exist on all implementations until optimizers can potentially get rid of the extra copies stored in the defer statement. We note that this is the position Swift tends to take with its closures: values are referred to by-reference if they can prove such accesses are safe, but otherwise decay to by-value copies. However, Swift generally codifies and relies more on its ability to perform certain optimizations, whereas C implementations are allowed to be far weaker in terms of their optimization and translation/evaluation/execution guarantees.

Copying by-value for defer is bad design in-general; anything that exists in the same scope and cannot escape said scope (such as statement expressions, defer statements, and otherwise) should always refer to existing variables through their name and at the same location/address. There is no risk of failure here because defer statements are not objects or declarations; they do not occupy a fixed amount of space as a reference-able object, they cannot be passed to sizeof, they cannot be copied or put on the heap versus the stack or some other form of storage location. They are simply a form of code (and code organization) like other flow control and code control entities. Treating them like callbacks (e.g., things that can be saved/transported/invoked at an arbitrary point later in time) is antithetical to the feature itself.

For these reasons, defer should refer to existing variables as though they were normal l-values (because they are and they should be). We anticipate that, in a future where Lambdas/Nested Functions/etc. are possible, specific styles of capture can be obtained through their use where it will be explicit and documented neatly by the use of such hypothetical features themselves.
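The by-reference behavior argued for in this subsection can already be observed with __attribute__((cleanup)), which reads the variable at scope exit rather than a saved copy. The sketch below mirrors the function f from earlier in this subsection; free_through, release_adopted, and live_pointer_frees are hypothetical helpers added only so the effect can be checked.

```c
#include <stdlib.h>

static void *adopted = NULL;   /* hypothetical new owner of the allocation */
static int non_null_frees = 0; /* how often cleanup saw a live pointer */

static void take_ownership_and_use(void *p) { adopted = p; }

/* The attribute passes the address of the annotated variable. */
static void free_through(void *userdata) {
  void **pp = userdata;
  if (*pp != NULL)
    ++non_null_frees;
  free(*pp);
}

int f(int some_important_condition) {
  __attribute__((cleanup(free_through))) void *p = malloc(16);
  if (p == NULL)
    return -1;
  if (some_important_condition) {
    take_ownership_and_use(p);
    p = NULL; /* cleanup reads p itself, so it now frees nothing */
  }
  return 0;
}

/* Releases whatever the hypothetical owner adopted; returns 1 if it held something. */
int release_adopted(void) {
  int held = adopted != NULL;
  free(adopted);
  adopted = NULL;
  return held;
}

int live_pointer_frees(void) { return non_null_frees; }
```

After f(1), the cleanup has seen only a null pointer and the allocation still belongs to its new owner; after f(0), the cleanup has freed the live pointer exactly once. A by-value capture would have freed the original pointer in both cases.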

3.4. "Why Does This Not Unwind The Whole Call Stack??"

No C implementation provides a compiler-driven unwinding that we could find, even with __attribute__((cleanup())) p = malloc(2);. There is one notable exception, but it requires code to be in "C++ mode" (or have the equivalent of -fexceptions passed to the compiler to enable it in "C mode"). Right now, calling any of:

did not produce any code that ran the cleanup functions of cleanup-annotated variables, or any other code. defer works similarly: no stack unwinding or call stack back-travel is done when a function is called that never returns to its caller and instead hands control back to the host environment.

Note: This is compatible with C++ semantics for a similar C++ feature: constructors and destructors.

It is noteworthy that not even C++ destructors run on the invocation of any of these functions, either. (You can test that assumption here.) They have to use the C++-specific function std::terminate() and work with the std::terminate_handler in order to get appropriate unwinding behavior. Therefore, there is no precedent — not even from C++ — that C or C++ code should appropriately and carefully unwind the stack. defer, therefore, will not provide this functionality. This makes it cheaper and easier to implement for platforms that do not have __attribute__((cleanup())), while also following existing practice to the letter. Notably, the "cheapness" and "ease" that will come from the implementation means that at no point will there ever need to be a maintained runtime of unwind scopes or exception handling-alike tables. In fact, no storage of any form of propagation information is necessary for this feature. It simply incentivizes the programming practices currently available to C programs: error codes, structured returns (with error codes embedded), and other testable function outputs in conjunction with better-defined cleanup code.

The one place this does not hold up is thrd_exit. Consider the following code:

#include <stdlib.h>
#include <stdio.h>
#include <threads.h>

extern void* ep;
extern void* ep2;
extern int alternate;

void cfree(void *userdata) {
  void **pp = (void**)userdata;
  printf("freeing %p !!\n", *pp);
  free(*pp);
}

[[gnu::noinline]] void use(void* p) {
  if ((++alternate % 2) == 0)
    ep = p;
  else
    ep2 = p;
}

int thread([[maybe_unused]]void* arg) {
  __attribute__((cleanup(cfree))) void* p = malloc(1);
  printf("allocating %p !!\n", p);
  use(p);
  thrd_exit(1);
  return 1;
}

int main () {
  __attribute__((cleanup(cfree))) void* p = malloc(1);
  printf("allocating %p !!\n", p);
  int r = 0;
  thrd_t th0 = {};
  thrd_create(&th0, thread, NULL);
  thrd_join(th0, &r);
  use(p);
  exit(0);
  return 0;
}

void* ep = 0;
void* ep2 = 0;
int alternate = 0;

As of December 1st, 2023 on GCC trunk with the latest libpthreads, this code will print:

allocating 0xa072a0 !!
allocating 0x7f8034000b70 !!
freeing 0x7f8034000b70 !!

with -fexceptions turned on (or built in C++ mode), and

allocating 0x47e2a0 !!
allocating 0x7f7e14000b70 !!

with -fexceptions not provided. (See it running and change the flags here.) This indicates that, specifically for thrd_exit and its underlying implementation on pthread_cancel/pthread_exit, the system will deploy a C++-style exception to do unwinding. This is fine for an implementation, and it is a conforming extension to add unwinding on top of C in this manner (to e.g. be more behavior-compatible with C++ or to protect precious thread-based resources).

However, note that even in this example, the memory from main is always leaked, no matter what. This means that even in C++ mode or C mode with -fexceptions specified, exit, quick_exit, and similar do not provide unwinding capabilities. Implementations should feel free to change or enhance this behavior.

Finally, we note that pretty much everything in MSVC is done by doing stack unwinding with their Structured Exception Handling (SEH) or similar techniques, so for the macros we provide almost every single one will be defined and have the value of 1. This includes even longjmp.

3.4.1. What about thread static storage destructors, atexit, etc.?

These features require explicit opt-in from the user in order to do program-specific and thread-specific cleanup in C. (C++, for threads, just relies on its RAII primitives in conjunction with parallelism language features and parallelism primitives). They can be hooked into while writing defer to register each defer statement’s code into them, and provide a form of artisanal & manual unwinding. Some applications that must retain the integrity of their data tend to use these features as a way to perform rollback or as a last-minute way to sanity check assumptions and data.

This proposal does not change anything about the semantics of these functions in any way.

3.4.2. So how does this proposal handle it??

We leave room for a future paper adding conditionally supported, compile-time checkable unwinding semantics to C. That is, we say that any defer D that is reached may or may not run if a non-local jump or program termination occurs. We state that this is implementation-defined. Right now, we provide no macros or other hard-specified behavior on this. This will allow us to write papers immediately after the defer paper to properly define unwinding/stack unwinding, and their associated behaviors with longjmp/setjmp, exit/_Exit, quick_exit, abort, thrd_exit, and other [[noreturn]]/_Noreturn-style functionality.

To prepare for such a future, this paper was written to eventually cover such behaviors and document them in a way that a program can react to the presence of unwinding reliably. That paper is here: https://thephd.dev/_vendor/future_cxx/papers/C%20-%20Unwinding.html.
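As a sketch of what "may or may not run" means in practice, the following uses __attribute__((cleanup)) as a stand-in for defer and leaves one scope by a normal return and another by longjmp. All names are hypothetical. Whether the second cleanup runs is exactly the implementation-defined part; on implementations where longjmp performs no unwinding, we would expect it to be skipped.

```c
#include <setjmp.h>

static jmp_buf escape;
static int cleanups_run = 0;

static void count_cleanup(int *marker) {
  (void)marker;
  ++cleanups_run;
}

static void leave_normally(void) {
  __attribute__((cleanup(count_cleanup))) int marker = 0;
  (void)marker;
} /* cleanup runs on the ordinary exit from the scope */

static void leave_abruptly(void) {
  __attribute__((cleanup(count_cleanup))) int marker = 0;
  (void)marker;
  longjmp(escape, 1); /* non-local jump past the pending cleanup */
}

/* Number of cleanups observed after an ordinary return. */
int cleanups_after_return(void) {
  cleanups_run = 0;
  leave_normally();
  return cleanups_run;
}

/* Number of cleanups observed after a longjmp out of the scope. */
int cleanups_after_longjmp(void) {
  cleanups_run = 0;
  if (setjmp(escape) == 0)
    leave_abruptly();
  return cleanups_run;
}
```

A program that needs the cleanup to happen regardless cannot rely on the second path today, which is why this paper leaves the behavior implementation-defined rather than promising either outcome.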

3.5. Compile-time Construct

Due to the nature of the design, all defer blocks can be transformed during translation and require no execution-time coordination or marking. This is imperative to ensure that the feature produces no overhead compared to __attribute__((cleanup())) functionality, to __try + __finally with no Structured Exception Handling catching on MSVC, or to manually writing a series of goto cleanup0; goto cleanup1; jumps in a (potentially deeply-nested) set of scopes. For example, consider this function from real-world code (the linked code from Martin, listed previously in this proposal) that performs a series of checks and repeats the necessary frees before each early return.

h_err* h_build_plugins(const char* rootdir, h_build_outfiles outfiles, const h_conf* conf)
{
  char* pluginsdir = h_util_path_join(rootdir, H_FILE_PLUGINS);
  if (pluginsdir == NULL)
    return h_err_create(H_ERR_ALLOC, NULL);
  char* outpluginsdirphp = h_util_path_join(
    rootdir,
    H_FILE_OUTPUT "/" H_FILE_OUT_META "/" H_FILE_OUT_PHP
  );
  if (outpluginsdirphp == NULL)
  {
    free(pluginsdir);
    return h_err_create(H_ERR_ALLOC, NULL);
  }
  char* outpluginsdirmisc = h_util_path_join(
    rootdir,
    H_FILE_OUTPUT "/" H_FILE_OUT_META "/" H_FILE_OUT_MISC
  );
  if (outpluginsdirmisc == NULL)
  {
    free(pluginsdir);
    free(outpluginsdirphp);
    return h_err_create(H_ERR_ALLOC, NULL);
  }

  //Check status of rootdir/plugins, returning if it doesn’t exist
  {
    int err = h_util_file_err(pluginsdir);
    if (err == ENOENT)
    {
      free(outpluginsdirphp);
      free(outpluginsdirmisc);
      free(pluginsdir);
      return NULL;
    }
    if (err && err != EEXIST)
    {
      free(outpluginsdirphp);
      free(outpluginsdirmisc);
      free(pluginsdir);
      return h_err_from_errno(err, pluginsdir);
    }
  }

  //Create dirs if they don’t exist
  if (mkdir(outpluginsdirphp, 0777) == -1 && errno != EEXIST) {
    free(outpluginsdirphp);
    free(outpluginsdirmisc);
    free(pluginsdir);
    return h_err_from_errno(errno, outpluginsdirphp);
  }
  if (mkdir(outpluginsdirmisc, 0777) == -1 && errno != EEXIST) {
    free(outpluginsdirphp);
    free(outpluginsdirmisc);
    free(pluginsdir);
    return h_err_from_errno(errno, outpluginsdirmisc);
  }

  //Loop through plugins, building them
  struct dirent** namelist;
  int n = scandir(pluginsdir, &namelist, NULL, alphasort);
  int i;
  for (i = 0; i < n; ++i)
  {
    struct dirent* ent = namelist[i];
    if (ent->d_name[0] == '.')
    {
      free(ent);
      continue;
    }

    char* dirpath = h_util_path_join(pluginsdir, ent->d_name);
    if (dirpath == NULL)
    {
      free(outpluginsdirphp);
      free(outpluginsdirmisc);
      free(pluginsdir);
      return h_err_create(H_ERR_ALLOC, NULL);
    }
    char* outdirphp = h_util_path_join(outpluginsdirphp, ent->d_name);
    if (outdirphp == NULL)
    {
      free(dirpath);
      free(outpluginsdirphp);
      free(outpluginsdirmisc);
      free(pluginsdir);
      return h_err_create(H_ERR_ALLOC, NULL);
    }
    char* outdirmisc = h_util_path_join(outpluginsdirmisc, ent->d_name);
    if (outdirmisc == NULL)
    {
      free(dirpath);
      free(outdirphp);
      free(outpluginsdirphp);
      free(outpluginsdirmisc);
      free(pluginsdir);
      return h_err_create(H_ERR_ALLOC, NULL);
    }

    h_err* err;
    err = build_plugin(dirpath, outdirphp, outdirmisc, outfiles, conf);
    if (err)
    {
      free(dirpath);
      free(outdirphp);
      free(outdirmisc);
      free(outpluginsdirphp);
      free(outpluginsdirmisc);
      free(pluginsdir);
      return err;
    }

    free(dirpath);
    free(outdirphp);
    free(outdirmisc);
    free(ent);
  }

  free(pluginsdir);
  free(outpluginsdirphp);
  free(outpluginsdirmisc);
  free(namelist);

  return NULL;
}

This is a fairly small function, clocking in at some 130 lines long. There are, as far as most reviewers can tell, no errors in the creation or deletion of the various kinds of resources (particularly, repeated memory allocations). The exact same code can be restructured as follows.

h_err* h_build_plugins(const char* rootdir, h_build_outfiles outfiles, const h_conf* conf)
{
  char* pluginsdir = h_util_path_join(rootdir, H_FILE_PLUGINS);
  if (pluginsdir == NULL)
    return h_err_create(H_ERR_ALLOC, NULL);
  defer free(pluginsdir);
  char* outpluginsdirphp = h_util_path_join(
    rootdir,
    H_FILE_OUTPUT "/" H_FILE_OUT_META "/" H_FILE_OUT_PHP
  );
  if (outpluginsdirphp == NULL)
  {
    return h_err_create(H_ERR_ALLOC, NULL);
  }
  defer free(outpluginsdirphp);
  char* outpluginsdirmisc = h_util_path_join(
    rootdir,
    H_FILE_OUTPUT "/" H_FILE_OUT_META "/" H_FILE_OUT_MISC
  );
  if (outpluginsdirmisc == NULL)
  {
    return h_err_create(H_ERR_ALLOC, NULL);
  }
  defer free(outpluginsdirmisc);
  //Check status of rootdir/plugins, returning if it doesn’t exist
  {
    int err = h_util_file_err(pluginsdir);
    if (err == ENOENT)
    {
      return NULL;
    }
    if (err && err != EEXIST)
    {
      return h_err_from_errno(err, pluginsdir);
    }
  }

  //Create dirs if they don’t exist
  if (mkdir(outpluginsdirphp, 0777) == -1 && errno != EEXIST) {
    return h_err_from_errno(errno, outpluginsdirphp);
  }
  if (mkdir(outpluginsdirmisc, 0777) == -1 && errno != EEXIST) {
    return h_err_from_errno(errno, outpluginsdirmisc);
  }

  //Loop through plugins, building them
  struct dirent** namelist;
  int n = scandir(pluginsdir, &namelist, NULL, alphasort);
  if (n == -1) {
    return h_err_from_errno(errno, pluginsdir);
  }
  defer {
    for (int i = 0; i < n; ++i)
    {
      free(namelist[i]);
    }
    free(namelist);
  }
  for (int i = 0; i < n; ++i)
  {
    struct dirent* ent = namelist[i];
    if (ent->d_name[0] == '.')
    {
      continue;
    }
    char* dirpath = h_util_path_join(pluginsdir, ent->d_name);
    if (dirpath == NULL)
    {
      return h_err_create(H_ERR_ALLOC, NULL);
    }
    defer free(dirpath);
    char* outdirphp = h_util_path_join(outpluginsdirphp, ent->d_name);
    if (outdirphp == NULL)
    {
      return h_err_create(H_ERR_ALLOC, NULL);
    }
    defer free(outdirphp);
    char* outdirmisc = h_util_path_join(outpluginsdirmisc, ent->d_name);
    if (outdirmisc == NULL)
    {
      return h_err_create(H_ERR_ALLOC, NULL);
    }
    defer free(outdirmisc);

    h_err* err;
    err = build_plugin(dirpath, outdirphp, outdirmisc, outfiles, conf);
    if (err)
    {
      return err;
    }
  }
	
  return NULL;
}

All of the special resource cleanup along certain branches is now completely gone. We have shrunk this function by around 14 lines of code. There are no resource leaks in this code. There was a potential resource leak in the first version of the code: if n was six in the code’s scandir loop and the third iteration failed, the function would return but fail to release the rest of the scandir results. We added an additional early exit and check to this, and such a check does not require another list of frees (which makes the differential here even more significant, since we have eliminated even potential future code with defer). When defer is used, additional early returns and checks added to the function later carry no risk of forgetting to free or clean up specific temporary resources. Getting to:

is a fairly good yield for a standard C feature. This sort of high-impact, high-quality refactoring enables better cleanup (as realized by GTK in its earliest iterations of g_autoptr) and resistance to changes-over-time.

3.6. Visibility & Clarity of Code

This feature also achieves something that C users have frequently requested for functionality of this caliber. While C++ destructors "hide" the code behind the destruction of an object, defer leaves that effect clear in the code by requiring that it is placed in the requisite scope. One can simply trace backward from a return or a goto paired with its label and see what actions will be taken by defer.

This is also a bit harder to achieve with __attribute__((cleanup()))-style code. Furthermore, because of the way the attribute works, one cannot use the normal and typical free/delete functions that have the usual behavior and are easily understandable. __attribute__((cleanup())) passes a pointer to the declared variable. So, writing the following:

int main () {
  __attribute__((cleanup(free))) void* p = malloc(1);
  return 0;
}

is incorrect, because it will pass (void*)&p — a void** that is then cast to the void* — to the free function. This means this code will compile, link, and run, but attempt to free the address of the stack variable that represents the pointer and not the value of the pointer itself. (Thankfully, GCC will warn about this.) A new function has to be written, that takes a void*, then casts it to void**, and then dereferences the void** to pass it to the free function.
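A minimal sketch of such an adapter function follows. The names free_indirect and copy_and_measure are hypothetical; the latter exists only to give the adapter something to clean up.

```c
#include <stdlib.h>
#include <string.h>

/* Adapter: receives the address of the annotated pointer variable. */
static void free_indirect(void *userdata) {
  void **pp = userdata; /* really a void** in disguise */
  free(*pp);            /* free the allocation, not the stack slot */
  *pp = NULL;
}

/* Hypothetical helper: copies `text` into scratch storage and measures the copy. */
size_t copy_and_measure(const char *text) {
  size_t size = strlen(text) + 1;
  __attribute__((cleanup(free_indirect))) void *scratch = malloc(size);
  if (scratch == NULL)
    return 0;
  memcpy(scratch, text, size);
  return strlen(scratch);
}
```

Every distinct release function (fclose, a handle destructor, and so on) needs its own adapter of this shape, which is boilerplate a statement-based defer would not require.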

It is not an ideal interface.

3.7. Safety & defer: Preventing Leaks from Human Vulnerability

One of the most important tenets of this feature is resistance to human fallibility. There are many ways in which a human being who deals with resource handling and required calls may fail to make the explicitly paired calls for creating a resource/entity and releasing a resource/entity. Many vulnerabilities happen because code restructured into if ladders (shown above to have still leaked some resources from the scandir code, though this code itself is not directly vulnerable and just a leak) or goto fail;-style code leaves out necessary actions, e.g. [goto-fail]. One such CVE that highlights human fallibility in the face of necessary control flow is the Linux Kernel’s somewhat recent CVE-2021-3744[cve-2021-3744].

[cve-2021-3744] is not a vulnerability where they forget to free the data in totality: it was that the data for tag was not freed along very specific paths in a crop of sensitive code.

Adding a defer at the allocation of these resources (particularly, line 877 after the tag was successfully initialized with the DM work area) would make it impossible to forget to release the resources associated with that variable (or setting it to NULL along specific paths where the data was transferred off or taken ownership of).

Note: This is not an indictment of the quality of Linux Kernel source code, but — as this sort of vulnerability has been repeated time and time again over the last two decades — a cautionary tale of how human beings are allowed to be fallible.

To quote Daniel Stenberg, maintainer of curl:

It burns in my soul

Reading the code now it is impossible not to see the bug. Yes, it truly aches having to accept the fact that I did this mistake without noticing and that the flaw then remained undiscovered in code for 1315 days. I apologize. I am but a human.

How I Made A Heap Overflow In Curl; October 11th, 2023

(Emphasis mine.) We are all human. One of the takeaways that people usually have from this is that we need to put "more eyeballs" on the code or "do better teaching", but this is, again, not the first time this sort of vulnerability has happened. For example, the same kind of bug also came from the GnuTLS implementation back in 2014[gnu-tls-bug-analysis]. The moral of the story is not to beat human beings up for their fallibility or mistakes, but to turn around to the language designers and actually ask them why this sort of problem can go on for nearly 40 years of C programming without anybody actually bringing a solution to the problem.

It is time to start acknowledging that our lack of built-in tools in C is not doing the job quite right. The fact that titans in our industry over 30, 40 years can have even the tiniest slip-up or slightest indentation mistake used against them and their code, points to a fundamental issue in the way the language interacts with and works with its user base. Requiring perfect, 24/7 vigilance from people who support trillion-USD market cap industries and billion-USD quarterly budget businesses across the entire globe — and in space itself — while only earning a fraction of that while sometimes unable to support themselves is quite frankly bonkers. Our users have faithfully served the C language for decades.

They deserve tools and features that can cover their fallibility and make it difficult for them to forget to handle certain cases of bugs, whether it’s heap overflow, use-after-free/double-free, integer overflow or otherwise. It is time we recognize that, in some ways, we made it error-prone and wrong, and that we can do small, simple things to make it better.

3.8. defer Ordering?

The code in a defer block runs after every other statement and expression in the block, save for other defers (which execute in the reverse of the lexical order in which they appear). Nested defer statements execute at the end of the block for the defer they are within. It is recognized that nested defers may, for some people, be considered a sincere code smell. Therefore, there is optional wording below (6.6) to allow for this if people rally behind this idea.
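The ordering rule can be modeled today with __attribute__((cleanup)), whose cleanups also run at scope exit in reverse order of declaration. The sketch below is an approximation under that assumption, using nested scopes rather than defer itself; order_log, log_event, log_cleanup, and run_scopes are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

static char order_log[8]; /* zero-filled, so it always stays NUL-terminated */
static size_t order_len = 0;

static void log_event(char tag) {
  if (order_len < sizeof order_log - 1)
    order_log[order_len++] = tag;
}

/* Cleanup function: receives the address of the annotated char. */
static void log_cleanup(char *tag) { log_event(*tag); }

const char *run_scopes(void) {
  memset(order_log, 0, sizeof order_log);
  order_len = 0;
  {
    __attribute__((cleanup(log_cleanup))) char a = 'a';
    __attribute__((cleanup(log_cleanup))) char b = 'b';
    {
      __attribute__((cleanup(log_cleanup))) char c = 'c';
      log_event('1');
    } /* c's cleanup runs here, at the end of the inner scope */
    log_event('2');
  } /* b's cleanup runs here, then a's */
  return order_log;
}
```

The recorded sequence is "1c2ba": ordinary statements first within each scope, then that scope's cleanups in reverse order of appearance.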

3.8.1. defer interleaved with return ?

A defer clause executes at the end of the scope, after every other kind of execution for the scope finishes. This includes after the invocation of the expression of a return statement. That is, given this example code:

int woof();
int bark();

int use (int x){
  defer {
    woof();
  }
  return bark();
}

The order of execution is bark and then woof. Another example:

int use (int x){
  int* p = &x;
  *p = 400;
  defer {
    *p = 500;
  }
  return *p;
}

The return value of the use function is 400, not 500. Also notably, because it runs after all other non-defer statements but just before the termination of the (function or block) scope, all of the variables are still alive when referred to.
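The same evaluation order can be checked against existing practice. This is a sketch only, assuming a GCC- or Clang-compatible compiler and using __attribute__((cleanup)) as a stand-in for defer; value, set_500, and use_cleanup are hypothetical names, and the modified object is made global so that the late write is observable after the call.

```c
/* Hypothetical state that the deferred-style cleanup modifies. */
int value;

/* Cleanup handler standing in for `defer { *p = 500; }`. */
static void set_500(int **pp) {
  **pp = 500;
}

int use_cleanup(void) {
  value = 400;
  __attribute__((cleanup(set_500))) int *p = &value;
  return *p; /* the operand is evaluated before the cleanup runs */
}
```

Here use_cleanup() returns 400, and value is 500 once the call has completed.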

Note: This is compatible with C++ semantics for a similar C++ feature: constructors and destructors. In the following code snippet, main returns 4, not 5:

struct destroy_me {
  int& r;
  ~destroy_me() {
    r = 5;
  }
};

int main () {
  int r = 4;
  destroy_me dm{r};
  return r;
}

3.8.2. Flow Control / Jumps out of defer

defer statements cannot allow compile-time jumps out of themselves and into other defers or out into a surrounding scope. Otherwise, that would damage the integrity of a sequence of defers:

int get_work_order();
void rollback(int handle);
int attempt_transaction(int handle0, int handle1);

int main () {
  int very_important_handle = get_work_order();
  defer {
    rollback(very_important_handle);
  }
  int very_important_handle2 = get_work_order();
  defer {
    rollback(very_important_handle2);
    // !!! jumps out of this defer
    goto try_attempt;
  }
  try_attempt:
  int result = attempt_transaction(
    very_important_handle,
    very_important_handle2
  );

  return 0;
}

That goto jumps out of the second defer statement, and back into the main block. This would skip the execution of the first defer (due to the reverse-lexical order in which defer blocks are run), which would result in unintuitive behavior. In order to cut this off, we simply do not allow compile-time jumps (goto, break, continue, or return) out of a defer block. Should more reasonable semantics be nailed down at a later date, we can go back and fill in these intentional blanks (which, generally, is not possible when something is made undefined behavior rather than a constraint violation).

3.8.3. goto and other Flow Control over an existing defer

Much like the previous section, this code is also banned:

#include <stdio.h>

int main () {
  goto b;
  defer { printf(" meow"); }
  b:
  printf("cat says");
}

Ostensibly, one could justify the way this works. Some may state: "The defer is not really executed; it is more-so translated to happen at the end of the scope. Therefore, this should print 'cat says meow'." However, for consistency’s sake, this gets a bit more confusing when it is not just a goto jumping over a defer statement, but a return making it look like it’s entirely unreachable:

#include <stdio.h>

int main () {
  printf("cat says");
  return 0;
  defer { printf(" meow"); }
}

Do we still print "cat says meow" here? The conclusion this paper comes to is "no". While defer provides code motion at compile-time, there is still the mental model of the programmer to consider and the tooling of the compiler to consider. In almost every case that anyone looks at this code, the defer looks like it is part of an unreachable set of code; if it were mandated to run, this would be problematic.

For this case specifically, a return before a defer in the same scope simply means the defer is not run. In summary: defers cannot be jumped over by gotos, similar to how Variable-Length Arrays cannot be jumped over by similar control flow constructs when the compiler can know about it, but one can return early before a defer is ever reached.
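The "return before it is reached" rule matches how the existing attribute behaves today. A minimal sketch, assuming a GCC- or Clang-compatible compiler; cleanups_run, count_cleanup, and maybe_guarded are hypothetical names used only for this illustration.

```c
/* Hypothetical counter of how many cleanups have actually run. */
int cleanups_run;

/* Cleanup handler standing in for the body of a defer statement. */
static void count_cleanup(int *unused) {
  (void)unused;
  cleanups_run++;
}

int maybe_guarded(int bail_early) {
  if (bail_early) {
    return 0; /* the guard below is never reached, so nothing runs */
  }
  __attribute__((cleanup(count_cleanup))) int guard = 0;
  return 1; /* the guard was reached, so its cleanup runs on exit */
}
```

Calling maybe_guarded(1) leaves cleanups_run untouched, while maybe_guarded(0) increments it exactly once.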

3.8.4. goto and other Flow Control into an existing defer

This case has to be banned. Under no circumstances can we allow goto or similar into a defer statement. Consider the following code, which has 2 potential exit branches (however contrived):

#include <stdlib.h>

int main (int argc, char* argv[]) {
  void* p = malloc(1);
  defer {
    my_label:
    free(p);
  }
  goto my_label;
  if (argc < 2) {
    return 1;
  }
  /* … */
  return 0;
}

If the defer is leapt into, which "branch" of defer are we running? The one in argc < 2, or the one at the other end at return 0? Furthermore, defer runs after, not before, things like return expressions are evaluated. What is the return value from main? What does this function end up doing? All of these make absolutely no sense. Any flow control into a defer is 110% banned, and for good reason.

3.8.5. What about longjmp?

Unfortunately, longjmp is non-local, runtime-controlled control flow. It can easily defy all of C’s typical static analysis. Therefore, we cannot form any reasonable Constraint Violations (compile-time errors) for this category of behavior. Still, we enumerate several cases of execution-time behavior where, so long as the non-local jump does not put execution outside of the current defer block, the behavior remains well-defined (modulo other issues with jumping into/over/etc. things that may be within the defer block). If the non-local jump escapes the defer, though, we say the behavior is utterly undefined.

4. C++ Compatibility: Why Not Member Functions + Constructors/Destructors?

As will be asked one hundred thousand times throughout the course of this proposal’s life:

4.1. Why Not Just Put Member Functions And Constructors/Destructors Into C? RAII Is Powerful And Solves This Problem?

There are several purely technical reasons for not pursuing a constructor/destructor-alike solution that is fully compatible with C and C++ member declarations. Briefly, they can be categorized as follows.

The first two are interconnected and also simply part of the bargain. If any of GCC, Clang, or MSVC wanted to adopt a proposal for C that would inject member functions into the language, they would (naturally and correctly) do so with the implementation that has served C++ well over the last few decades: name mangling. But implementation-controlled name mangling is abhorrent to C developers for a wide variety of reasons, not least that it would leave them with even less control over their Application Binary Interfaces than they already have. While C mangling is fairly consistent across most platforms (in that there is either little or none at all), C++ name mangling implementations can outstrip the implementation of many C17 frontends in their entirety (and, in one documented case, did so for a vendor in WG14 that supports both C and C++).

Function overloading — and the requisite name mangling schemes that would come with it — are very much not feasible for C implementations or the C language.

Furthermore, defer actually escapes a small issue that C++ has created for itself with noexcept, exceptions, and destructors. As C++ would generally rightfully claim, this feature is redundant with its RAII concept. They would be correct, except in one place: the C++ standard library. Peter Sommerlad has spent over 10 years working on a proposal for a "Generic Scope Guard" for C++, so long that it has a C++ Paper Number from both the P-based proposal system and also a large series of WG21 N-document numbers[p0052].

Sommerlad’s efforts ultimately failed in C++ because C++ has a rule that objects created and owned by the C++ Standard Library must never have a non-noexcept destructor. One of the most pertinent use cases is having scope guards which, in their destructor, could roll back transactions and then throw an exception. Because a hypothetical std::scope_guard object could not possibly have a non-noexcept destructor, creating such an object, having its destructor invoke the function stored in the std::scope_guard, and having that function throw an exception would not achieve the desired purpose. Instead of propagating that exception, it would immediately call std::terminate() and kill the program on the spot. (Any exception that reaches the boundary of a noexcept-marked function, including a C++ destructor, as would be mandated for std::scope_guard, immediately results in a call to std::terminate().)

Thus, C++'s own rules about destructors, which they refused to break in this one case, make it impossible to create an RAII object that fulfills one of the primary uses of std::scope_guard.

defer, as a neutral language construct that is not tied to a member object, does not have this problem as there is no implication of success or failure; it is just an alternate form of code motion that is not tied to the lifetime of an object, but instead to lexical scope directly. Given the history of Peter Sommerlad’s [p0052], we are unsure if they will pursue a language-based solution to escape the rules of their own Standard Library (which are, for many reasons, completely justified). Therefore, this looks like one of those features that will, despite being very fundamental in either C or C++ code, be taken care of by user-defined libraries, shims, and polyfills (e.g., hand-rolled or Boost) for the foreseeable future.

4.2. Signaling Failure in Destructors

Destructors also present another serious problem in that the only way to communicate things out of them is to either:

Aside from the general issues of how palatable exceptions may or may not be for C, destructors with unwinding and exceptions prove to be truly untenable in many cases. For example, for file-type resources or thread-type resources in C++, the standard mandates that any errors or exceptions generated are completely consumed and swallowed and never communicated outside of their destructor. This presents a greater issue for the degree of resource safety, and users have often had to flush std::(i|o)fstreams and close them manually to check for errors properly, rather than find out that log files would suddenly truncate their output in the middle of serious outages where that information was necessary:

virtual ~basic_filebuf();

Effects: Calls close().

If an exception occurs during the destruction of the object, including the call to close(), the exception is caught but not rethrown (see [res.on.exception.handling]).

— C++ Standard, 31.10.3.2 [filebuf.cons], December 10th, 2023

This happens in numerous other places in the C++ standard library. It is sufficient to state that this is very undesirable for C; swallowing exceptions -- or any other failure -- in a destructor-like design is not a good design for C. It is analogous to having error codes set on errno that get ignored, and this has been a significant source of problems for C development (especially when the actual values of errno being set end up implementation-defined, as they did for the malloc(0)/realloc(p, 0) messes that resulted in the functionality of the latter becoming undefined behavior).

Note: Having cleanup behavior completely devoid of context and potential failures, trapped in its own function scope, removes the ability to react to important context related to the success or failure of certain complex hardware and operating system resources.

defer, by its design, is always placed local to the scope and has access to local information. At the cost of needing to constantly write the defer itself, it retains the ability to be shuffled or moved and to take into account error codes, failed operations, and more that typical context-devoid type-based RAII resources cannot (without being explicitly written to with function objects or similar variables taken into its constructor).
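To make the point about local context concrete, the following sketch writes out by hand what a block-local defer would let the user express: the result of the release step is reported to the caller rather than swallowed. Since defer is not yet available, the release is written manually at the end of the function. Everything here (struct device, device_close, send_all, and the close_err out-parameter) is hypothetical and not part of any real API.

```c
/* Hypothetical resource whose release step can fail. */
struct device {
  int fail_on_close;
};

/* Hypothetical release function: returns 0 on success, -1 on failure. */
static int device_close(struct device *d) {
  return d->fail_on_close ? -1 : 0;
}

/* Returns the number of bytes "sent"; *close_err receives the result of
   the release step so that the caller can react to it. */
int send_all(struct device *d, int nbytes, int *close_err) {
  int sent = nbytes; /* stand-in for the real work */
  /* With the proposed feature, this statement would sit in a defer block
     near the top of the function, next to the acquisition of d. */
  *close_err = device_close(d);
  return sent;
}
```

A destructor-style cleanup that only receives the object has no natural place to put close_err; code written directly in the scope does.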

4.3. The Ideal World

Speaking briefly from a language design standpoint: in an ideal world, both of these solutions would be present side by side to offer the user a maximally flexible choice of error handling, automated cleanup, general-purpose undo power, and freedom to choose context (or not have any contextual information at all). Unfortunately, due to the technical challenges of name mangling from member functions (destructors and constructors) it may be some time before C sees a solution where an object contains its own clean up code and that clean up code follows an object through the system (e.g., by using types).

Despite the advances that defer makes for C, this will present a greater problem over time as C users get used to functional, block-scoped cleanup but do not have the ability to attach defers to individual struct fields or to the lifetime of a given object. __attribute__((cleanup())) attempted to do this, but in a way that was brittle. It did not transfer to other object declarations and could not be properly relocated without runtime coordination and orchestration by the user. It also could not be applied to structure fields, providing one kind of block-scope safety but making it less ideal in other cases and requiring extra boilerplate functions to get the job done. RAII achieves this by leveraging the type system. defer achieves this but requires that every place a resource is used, it must have a defer block written. It is much more manual than RAII.
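The "extra boilerplate functions" mentioned above look roughly like the following. This is a sketch assuming a GCC- or Clang-compatible compiler; free_char_ptr and copied_length are hypothetical names. Because the cleanup handler receives a pointer to the annotated variable, free cannot be named directly and needs an adapter.

```c
#include <stdlib.h>
#include <string.h>

/* Boilerplate adapter: the handler gets a char**, not the char* to free. */
static void free_char_ptr(char **pp) {
  free(*pp);
}

/* Hypothetical helper: returns the length of a temporary copy of s,
   or -1 if the allocation fails. */
int copied_length(const char *s) {
  size_t n = strlen(s);
  __attribute__((cleanup(free_char_ptr))) char *copy = malloc(n + 1);
  if (copy == NULL) {
    return -1; /* the handler still runs; free(NULL) is a no-op */
  }
  memcpy(copy, s, n + 1);
  return (int)strlen(copy); /* copy is freed after this value is computed */
}
```

With the proposed feature, the adapter would disappear and the release would be written in place, next to the allocation it belongs to.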

4.4. The Polyfill/C++ Fix

In either case, for the above stated reasons, we will not be pursuing member functions (constructors/destructors) for C. We also do not anticipate C++ being thrilled about that, and they may see us asking for defer as trying to inject yet another alternative design into C++ for poor reasons. We would like to not antagonize C++ any further while respecting the design of C, and therefore offer the below partial solution to the problem.

Note: We do invite C++ users, for the sake of interoperation, to create their own using a structure/class with Class Template Argument Deduction (CTAD) and lambdas, as that can cover the space fairly nicely with syntax effectively identical to this proposal (it needs an extra ; and requires braces) and no Standard Library rules to worry about. Feel free to use ours:

#include <type_traits>
#include <utility>

template <typename _Fx>
struct __defer_t {
  _Fx __fx;

  __defer_t(_Fx&& __arg_fx)
  noexcept(::std::is_nothrow_move_constructible_v<_Fx>)
  : __fx(::std::move(__arg_fx)) {}

  ~__defer_t()
  noexcept(::std::is_nothrow_invocable_v<_Fx>) {
    __fx();
  }
};

template <typename _Fx>
__defer_t(_Fx __fx) -> __defer_t<::std::decay_t<_Fx>>;

#define __DEFER_TOK_CONCAT(X, Y) X ## Y
#define __DEFER_TOK_PASTE(X, Y) __DEFER_TOK_CONCAT(X, Y)
#define defer __defer_t \
  __DEFER_TOK_PASTE(__scoped_defer_obj, __COUNTER__) =  \
  [&]()

#include <stdio.h>

int main () {
  defer {
    defer {
      printf(" :3");
    };
    printf(" meow");
  };
  printf("cat says");
  return 0;
}

5. Implementation Experience

This proposal is modeled after existing practice, but is not directly provided in a C compiler in C mode. It can be approximated (as shown above) using C++, but C++ does not have a language feature for this form that isn’t just a stripped-down form of RAII.

This leaves only __attribute__((cleanup())) as has been implemented in many compilers, some of which are GCC, Clang, XLC, and Tiny C Compiler. Most of what has shaped this proposal has been driven by this feature. However, we specifically do not use an attribute in this version of the proposal because [[cleanup(my_free)]] some_variable; is:

In contrast defer has much clearer mechanisms to achieve the same goal, and may be applied to a wider variety of instances and cases than the attribute can.

In the future, we expect that — should someone solve the tension between name mangling, member functions, constructors/destructors, and more — we could consider moving into a destructor-based solution that is tied to the type system and objects. We view that as better to solve the problem for individual member fields and larger objects, while reducing the amount of times that the cleanup code may need to be written. However, C users have expressed a deep tie to having code that will run be visible in the scope. Macros violate this rule (e.g. g_autoptr backed by __attribute__((cleanup()))), destructors violate this rule, but defer as a feature does not.

6. Wording

Wording is relative to the latest draft revision of the C Standard.

Note: This wording uses the lowercase defer keyword directly in its wording. We recognize that this may not be suitable, and it is not an integral part of this proposal to type in s/defer/_Defer/g for us. It is just simpler to read in this form. If it is necessary to change it, we’ve got more than enough time to add _Defer and do the same #include <stddefer.h> to provide the convenience macro for the lowercase spelling.

6.1. Modify §5.1.2.2.3 Program termination to ensure defers in main are run

5.1.2.2.3 Program termination

If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument after all active defer statements of the function body of main have been executed; …

6.2. Modify 6.4.1 Keywords to include defer

6.4.1 Keywords
Syntax

keyword: one of

default

defer

do

6.3. Modify §6.8 Statements’s unlabeled-statement grammar production to include a new defer-statement

6.8 Statements
Syntax

statement:

labeled-statement

unlabeled-statement

unlabeled-statement:

expression-statement

attribute-specifier-sequenceopt primary-block

attribute-specifier-sequenceopt jump-statement

defer-statement

primary-block:

compound-statement

selection-statement

iteration-statement

secondary-block:

statement

6.4. Add a new §6.8.7 section describing the new defer-statement

6.8.7 Defer statements
Syntax

defer-statement:

defer secondary-block

Description

Let D be a defer statement, S be the secondary block of D referred to as its deferred content, and E be the enclosing block of D.

Constraints

Jumps by means of goto into E shall not jump over a defer statement in E.

Jumps by means of goto shall not jump into any defer statement.

Jumps by means of return, goto, break, or continue shall not exit S.

Semantics

When execution reaches a defer statement D, its S is not immediately executed during sequential execution of the program. Instead, S is executed upon:

  • the termination of the block E (such as from reaching its end);

  • or, any exit from E through means of flow control such as return, goto, break, switch, or continue.

The execution is done just before leaving the enclosing block E.

Multiple defer statements execute in the reverse lexical order they appeared in E. Within a single defer statement D, if D contains one or more defer statements of its own, then these defer statements are also executed in reverse lexical order at the end of S, recursively, according to the rules of this clause.

If E has any defer statements D that have been reached and their S have not yet executed, but the program is terminated or leaves E through any means such as:

  • a function with the deprecated _Noreturn function specifier, or a function annotated with the noreturn/_Noreturn attribute, is called;

  • or, any signal SIGABRT, SIGINT, or SIGTERM occurs;

then any such S are not run, unless otherwise specified by the implementationFN0✨). Any other D that have not been reached are not run.

FN0✨)The execution of deferred statements upon non-local jumps or program termination is a technique sometimes known as "unwinding" or "stack unwinding", and some implementations perform it. See also ISO/IEC 14882 Programming languages — C++, section [except.ctor].

If a non-local jump (such as longjmp) is used within E but before the execution of D:

  • if control leaves E, D's statements will not be executed;

  • otherwise, if control returns to a point in E and causes D to be reached more than once, there is no effect.FN1✨)

FN1✨)This is because the "execution" of a defer statement only lets the program know that S will be run. There is no observable side effect to repeat from reaching D, as the manifestation of any of the effects of S will happen if and only if E is exited or terminated as previously specified.

If a non-local jump (such as longjmp) is executed from S and control leaves S, the behavior is undefined.

If a non-local jump (such as longjmp) is executed outside of any D and:

  • it jumps into any S;

  • or, it jumps over any D;

the behavior is undefined.

EXAMPLE 1: Defer statements cannot be jumped over or jumped out of.

#include <stdio.h>

int f () {
  goto b; // constraint violation
  defer { printf(" meow"); }
  b:
  printf("cat says");
  return 1;
}

int g () {
  return printf("cat says");
  defer { printf(" meow"); } // okay: no constraint violation, not executed
  // print "cat says" to standard output
}

int h () {
  goto b;
  {
    // okay: no constraint violation
    defer { printf(" meow"); }
  }
  b:
  printf("cat says");
  return 1; // prints "cat says" to standard output
}

int i () {
  {
    defer { printf("cat says"); }
    // okay: no constraint violation
    goto b;
  }
  b:
  printf(" meow");
  return 1; // prints "cat says meow" to standard output
}

int j () {
  defer {
    goto b; // constraint violation
    printf(" meow");
  }
  b:
  printf("cat says");
  return 1;
}

int k () {
  defer {
    return 5; // constraint violation
    printf(" meow");
  }
  printf("cat says");
  return 1;
}

int l () {
  defer {
    b:
    printf(" meow");
  }
  goto b; // constraint violation
  printf("cat says");
  return 1;
}

int m () {
  goto b; // okay: no constraint violation
  {
    b:
    defer { printf("cat says"); }
  }
  printf(" meow");
  return 1; // prints "cat says meow" to standard output
}

int n () {
  goto b; // constraint violation
  {
    defer { printf(" meow"); }
    b:
  }
  printf("cat says");
  return 1;
}

EXAMPLE 2: All the expressions and statements of an enclosing block are evaluated before executing defer statements. After all defer statements are executed, then the block is left.

int main () {
  int r = 4;
  int* p = &r;
  defer { *p = 5; }
  return *p; // return 4;
}

EXAMPLE 3: It is implementation-defined if defer statements will execute if the exiting / non-returning functions detailed previously are called.

#include <stdio.h>
#include <stdlib.h>

int main () {
  void* p = malloc(1);
  if (p == NULL) {
    return 0;
  }
  defer free(p);
  exit(1); // "p" may be leaked
}

EXAMPLE 4: Defer statements, when execution reaches them, are tied to their enclosing block.

#include <stdio.h>
#include <stdlib.h>

int main () {
  {
    defer {
      printf(" meow");
    }
    if (true)
      defer printf("cat");
    printf(" says");
  }
  // "cat says meow" is printed to standard output
  exit(0);
}

EXAMPLE 5: Defer statements execute in reverse lexical order, and nested defer statements execute in reverse lexical order but at the end of the defer statement they were invoked within. The following program:

int main () {
  int r = 0;
  {
    defer {
      defer r *= 4;
      r *= 2;
      defer {
        r += 3;
      }
    }
    defer r += 1;
  }
  return r; // return 20;
}

is equivalent to:

int main () {
  int r = 0;
  r += 1;
  r *= 2;
  r += 3;
  r *= 4;
  return r; // return 20;
}

EXAMPLE 6: Defer statements can be executed within a switch, but a switch cannot be used to jump over a defer statement.

#include <stdlib.h>

int main () {
  void* p = malloc(1);
  switch (1) {
  defer free(p); // constraint violation
  default:
    defer free(p);
    break;
  }

  return 2;
}

6.5. OPTIONAL: Add to 6.8.7 Defer statements a new paragraph 3 with an additional constraint to reject nested defer statements.

6.8.7 Defer statements
Syntax

defer-statement:

defer secondary-block

Description

Let D be a defer statement, S be the secondary block of D, and E be the enclosing block of D.

Constraints
A defer statement shall not appear within another defer statement.

Note: 📝 Editor: also edit Example 4 and Example 5 with // constraint violation in the appropriate places, and change the description to make it clear it is a constraint violation.

6.6. Modify Annex J’s list of undefined behaviors with non-local jump undefined behavior (e.g. longjmp)

Note: 📝 For the editor to do within the Annex J undefined behavior list.

References

Informative References

[CVE-2021-3744]
National Institute of Standards and Technology. CVE-2021-3744 Detail. February 12th, 2023. URL: https://nvd.nist.gov/vuln/detail/CVE-2021-3744
[GNU-TLS-BUG-ANALYSIS]
Sean Cassidy. The Story of the GnuTLS Bug. March 14th, 2014. URL: https://www.seancassidy.me/the-story-of-the-gnutls-bug.html
[GO-DEFER]
Andrew Gerrand. Defer, Panic, and Recover. August 4th, 2010. URL: https://go.dev/blog/defer-panic-and-recover
[GOTO-FAIL]
Arie van Deursen. Learning from Apple's #gotofail Security Bug. February 22nd, 2014. URL: https://avandeursen.com/2014/02/22/gotofail-security/
[N2542]
Aaron Ballman; et al. Defer Mechanism for C. August 8th, 2020. URL: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2542.pdf
[N2895]
Jens Gustedt; Robert C. Seacord. N2895: A simple defer feature for C. December 31st, 2021. URL: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2895.htm
[P0052]
Peter Sommerlad; et al. Generic Scope Guard and RAII Wrapper for the Standard Library. February 19th, 2019. URL: https://wg21.link/p0052