Proposal for C2y
WG14 N3196

Title:               `if` declarations
Author, affiliation: Alex Celeste, Perforce
Date:                2023-11-30
Proposal category:   New feature
Target audience:     Compiler implementers, users

Abstract

The C if statement only admits an expression as its operand. In contrast, the for statement admits either an expression or a declaration as its first operand. We propose that C modifies the if statement to allow the operand to be either a declaration or an expression, and to optionally allow a second expression clause when the first clause is a declaration. This is taken from existing practice in C++.


if declarations

Reply-to:     Alex Celeste (aceleste@perforce.com)
Document No:  N3196
Revises:      (n/a)
Date:         2023-11-30

Summary of Changes

N3196

Introduction

C99 introduced the ability to treat clause-1 of a for statement as either an expression or a declaration (as part of the wider change to allow mixing declarations with code). The benefits of this are by now well-accepted: for a variable whose sole job is to act as an iterator for the loop, it makes the most sense for that variable to have its scope bound as tightly as possible to the loop. This is so widely accepted that we will not justify it further.

In C++, if has allowed the condition to be either an expression or a declaration with initializer since C++98. In C++17, this was enhanced to allow a second clause in if (and also switch), so that the declaration can give a name to a temporary which the condition itself then uses in an expression. Altogether this functions much like the first two clauses of for in C, without the looping behaviour: the declaration or first expression "does" nothing in terms of control except exist within scope, and the second clause provides the controlling expression.

We propose that C should adopt the C++17 enhancement and allow declarations directly within the condition of if and switch. This provides tighter scoping of temporaries, and has downstream effects allowing for substantially simpler definitions of "library control structures" via macro that previously had to use for and/or anaphoric constructions.
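For context, the following is a minimal sketch of the kind of for-based workaround that a tightly scoped temporary requires in C today. The macro IF_LET and the functions parse_digit and sum_digits are hypothetical names introduced only for illustration; the comment inside the loop shows the spelling that this proposal would allow instead.

```c
#include <stddef.h>

/* Hypothetical helper: value of a decimal digit character, or -1. */
static int parse_digit(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

/* Hypothetical macro: executes the following statement at most once, with
   `decl` in scope, if `cond` holds. The outer loop owns the run-once flag;
   the inner loop owns the user's declaration. */
#define IF_LET(decl, cond)                                        \
    for (int if_let_once_ = 1; if_let_once_; if_let_once_ = 0)    \
        for (decl; if_let_once_ && (cond); if_let_once_ = 0)

/* Sums the decimal digits in s, ignoring every other character. */
int sum_digits(const char *s)
{
    int total = 0;
    for (size_t i = 0; s[i] != '\0'; ++i) {
        /* With this proposal: if (int d = parse_digit(s[i]); d >= 0) */
        IF_LET(int d = parse_digit(s[i]), d >= 0) {
            total += d;
        }
    }
    return total;
}
```

A sketch like this has visible costs: no else branch can be attached to it, and a break or continue in the body binds to the macro's hidden loops rather than to the enclosing loop.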

Most of the motivation is thoroughly described by p0305.

Control macros

An example of a control macro which would be simpler to implement with this feature is just (and expect), a pseudo-monadic way to safely access the content of a library Optional type. A just macro admits an Optional as its operand and, if it has a value, declares a variable with that value (or perhaps a native pointer to it, depending on the implementation) that is visible in the operand-scope-block.

In C23, the most straightforward way ends up looking like:

Optional(int) ox = ...

just (ox) {
  useValue (it);
} else {
  useNil ();
}

The declaration of it is implicit (anaphoric) for the operand scope, because it is impractical to write a macro that would allow the syntax we really want:

if (int * x = just (ox)) {
  ...
}

(In this case the entire control macro is a workaround, although there are also control macros that would remain useful abstractions built on top of if (;). For instance, we could rename the if keyword anyway to show intent.)
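To make the anaphoric workaround concrete, here is one way such a just macro might be written in C99 or later. This is a sketch under stated assumptions, not the implementation referenced at the end of this document: OptionalInt, its members has_value and value, and the function doubled_or are all hypothetical stand-ins for a real library Optional(int).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a library Optional(int). */
typedef struct {
    bool has_value;
    int  value;
} OptionalInt;

/* Hypothetical anaphoric `just`: binds `it` to a pointer to the payload, or
   to NULL if the optional is empty, then selects on it. Ending the macro
   with `if` lets the user attach an ordinary else branch. The operand must
   be an lvalue because its address is taken. */
#define just(opt)                                                 \
    for (int just_once_ = 1; just_once_; just_once_ = 0)          \
        for (int *it = (opt).has_value ? &(opt).value : NULL;     \
             just_once_; just_once_ = 0)                          \
            if (it)

/* Returns the payload doubled, or `fallback` if ox is empty. */
int doubled_or(OptionalInt ox, int fallback)
{
    int result = 0; /* assigned on both paths; initialized to avoid warnings */
    just (ox) {
        result = *it * 2;
    } else {
        result = fallback;
    }
    return result;
}
```

The name it is fixed by the macro and cannot be chosen at the point of use, which is the limitation that a declaration in the condition would remove.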

Further discussion of use cases like this is also covered by p0305.

Alternatives

Alternatives are described by p0305.

None of the suggestions there really help with library control structures, unless the library author gives up on integration completely and starts defining new controls with a FORM ... ENDFORM paired-keyword syntax, which does not integrate well with C at all, either in readability or, more importantly, in composability.

Prior Art

The feature was standardized in C++17 and is now widely used.

Impact

No existing code is affected by this change.

Having implemented this feature in our C++ compiler, and having had similar experiences in the past when enhancing other control structures as they were upgraded (such as the range-for in C++11), we do not expect adding this feature to have any substantial development impact on a mature C compiler.

Our C++ compiler was able to add new classes representing different control structures and transparently see them "just work" with existing queries.

In our C compiler, which uses a different architecture, the representation of any control structure is homogeneous anyway. Since a control structure was already able to have a declaration and a controlling expression, we found that this feature fell out completely naturally and could be added with minimal effort (essentially only needing to add the implicit controlling expression for the C++98 syntax).

We expect other tools to have very similar experiences and therefore consider this proposal to have minimal development impact. Any "small" tool should have little trouble integrating this change.

Further discussion of impact is also covered by p0305 and is largely the same for C as it was for C++.

Proposed wording

The proposed changes are based on the latest public draft of C23, which is N3096. Bolded text is new text when inlined into an existing sentence.

This wording borrows directly from p0305.

Note: technically the proposed grammar also allows formulations like:

if (int x = 0; int y = x + 1) {
  y;
}

because the second clause is a condition rather than being forced to be an expression. This results in a simpler definition, and overall it probably makes sense to be compatible with C++ here rather than add extra wording just to disallow a niche emergent feature that users do not usually notice.
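As an informal illustration (not part of the proposed wording), the two-declaration form can be read by analogy with the block equivalence given in the proposed EXAMPLE for the if statement. The function name second_clause_declares and its parameter are hypothetical; the sketch uses only constructs that are already valid C.

```c
/* Hypothetical illustration of how
       if (int x = start; int y = x + 1) { return y; }
   might be read, rewritten with existing C constructs. */
int second_clause_declares(int start)
{
    {
        int x = start;   /* init-statement: in scope for the whole selection */
        int y = x + 1;   /* condition declaration: its value is what is tested */
        if (y) {
            return y;
        }
    }
    return 0;            /* reached only when y compared equal to 0 */
}
```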

Toplevel grammar

Add a new rule to the end of the toplevel statement grammar in 6.8 "Statements and blocks", Syntax, paragraph 1:

init-statement:
expression ;
declaration ;
;

Selection

Modify the statement grammar in 6.8.4 "Selection statements", Syntax, paragraph 1:

selection-statement:
if ( init-statementopt condition ) secondary-block
if ( init-statementopt condition ) secondary-block else secondary-block
switch ( init-statementopt condition ) secondary-block

Add a new rule to the end of the statement grammar:

condition:
expression
declaration

Add a new paragraph and header for "Constraints" after paragraph 1:

Constraints

If the condition is a declaration, it shall have an initializer, and shall declare exactly one object.

(note that the C++ change makes this a syntax error, but this is simpler and more consistent with the C document)

Modify 6.8.4.1 "The if statement", paragraph 2:

In both forms, the first substatement is executed if the condition compares unequal to 0. In the else form, the second substatement is executed if the condition compares equal to 0. If the first substatement is reached via a label, the second substatement is not executed.

Add a new paragraph:

If the condition is a declaration, the value of the declared variable after initialization shall act as the controlling expression.

Add an example after paragraph 3:

EXAMPLE The controlling expression of an if statement that uses a declaration as its condition is implicitly the value of the declaration:

if (int x = get ()) {
  ...  
}

is equivalent to

{
  int x = get ();
  if (x) {
    ...  
  }
}

and

if (int x = get (); x) {
  ...
}

Modify 6.8.4.2 "The switch statement", adding a paragraph before paragraph 4:

If the condition is a declaration, the value of the declared variable after initialization shall act as the controlling expression. Otherwise, the condition is an expression and serves as the controlling expression directly.

(The example for switch would essentially just be repetition.)
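For readers who want to see the switch case spelled out anyway, the following sketch (outside the proposed wording) shows the equivalent that can be written in C today, by analogy with the if example. The names sign_of and step_toward_zero are hypothetical.

```c
/* Hypothetical classifier used as the switch operand: -1, 0 or 1. */
static int sign_of(int v)
{
    return (v > 0) - (v < 0);
}

/* Moves v one step toward zero.
   With this proposal the body could begin: switch (int s = sign_of(v)) */
int step_toward_zero(int v)
{
    int out = 0;
    {
        int s = sign_of(v);   /* the declared object acts as the controlling expression */
        switch (s) {
        case 0:
            out = 0;
            break;
        default:
            out = v - s;      /* s remains visible inside the switch body */
            break;
        }
    }
    return out;
}
```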

Iteration

The definition of for does not need to change for this feature to be integrated. This change is therefore optional.

Modify the statement grammar in 6.8.5 "Iteration statements", Syntax, paragraph 1:

iteration-statement:
while ( expression ) secondary-block
do secondary-block while ( expression ) ;
for ( init-statement expression opt ; expression opt ) secondary-block
for ( init-statement expression opt ) secondary-block

This does not require a change to 6.8.5.3 "The for statement", which uses a non-grammar explanation of the clauses already.

Questions for WG14

Does WG14 want to add declaration-in-selections to C using the proposed syntax and wording?

Would WG14 prefer to disallow the case where both clauses can declare a new identifier (divergent from C++)?

Would WG14 like to use unified wording between if, switch and for that redefines for in terms of the init-statement, or leave for as-is?

References

C23 public draft
N740 Declarations in for
p0305r1 Selection statements with initializer
Anaphoric macros
Example of just and expect
C++17