handle NB comments concerning type inference

Alex Gilding (Perforce UK)

Jens Gustedt (INRIA France)

2023-01-08

org: ISO/IEC JTC1/SC22/WG14 document: N3079
target: IS 9899:2023 version: 1
date: 2023-01-08 license: CC BY

SE-001

We only deal with the second part of that comment, which addresses type inference.

n2953 Type inference for object definitions / Under specified Types.

We believe that the proposal, which allows the use of auto to infer the type from initialization, will have a detrimental effect on safety and reliability if one chooses to use it. In C, different types can have very different behaviors. The most obvious case is the different ways signed and unsigned integers handle overflow. Not explicitly stating the type, but rather relying on an implicit assumption, is in our view a serious risk.

Consider the following:

auto limit = MY_LIMIT;
if(limit + add < limit) /* overflow protection */
 return;

The user here assumes that MY_LIMIT is an unsigned int. Since the unsigned int has a well defined overflow behavior, the overflow test is entirely valid. However, should the MY_LIMIT define be a signed int, then overflow is undefined, and a compiler may choose to optimize away the overflow protection. The implementation will issue no warning if the assumption is wrong. If instead the user would have explicitly declared that they expect a limit to be, say an unsigned int, then the code would not break, and if MY_LIMIT would be out of range of the unsigned int, the implementation can issue a clear diagnostic.

For this reason we believe that this functionality will be discouraged by organizations like MISRA, and various style guides, since it actively prevents compilers and other diagnostic tools from verifying that the actions of the program match the programmer's intentions. We believe that WG14 should not introduce new functionality whose use is likely to be discouraged from a safety and reliability perspective.

While we believe this to be functionality to be adopted by a very small minority of C programmers. This however leaves the possibility of accidental use, when a user forgets to specify a type properly and the implementation is no longer able to issue a diagnostic, because it is forced to assume that the omission is intentional.
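To make the concern concrete, here is a minimal sketch in C of the two situations the comment contrasts. The helper names sum_wraps_unsigned and sum_exceeds_int_max are hypothetical and not part of the comment. The signed variant assumes that both arguments are non-negative.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: both operands are unsigned int, so the sum is
   computed modulo UINT_MAX + 1 and the wraparound test is well defined. */
static bool sum_wraps_unsigned(unsigned int limit, unsigned int add)
{
    return limit + add < limit;
}

/* Hypothetical helper for int operands: the sum is never formed, so no
   signed overflow (undefined behavior) can occur.
   Assumption: limit >= 0 and add >= 0. */
static bool sum_exceeds_int_max(int limit, int add)
{
    return add > INT_MAX - limit;
}

static void overflow_demo(void)
{
    assert(sum_wraps_unsigned(UINT_MAX, 1u));
    assert(!sum_wraps_unsigned(10u, 5u));
    assert(sum_exceeds_int_max(INT_MAX, 1));
    assert(!sum_exceeds_int_max(10, 5));
}
```

Written directly on int operands, the test limit + add < limit would rely on undefined behavior, which is the case the comment warns about when the inferred type turns out to be signed.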

This comment takes quite a narrow perspective on the feature. It is not well aligned with the use that the feature already has in the field, and it does not discuss the primary use case for which the feature was designed and integrated into the current proposal for C23, namely type-generic programming.

Security concerns are of course valid concerns, and they were discussed during the adoption phase for the feature. One of the proponents of the feature, Alex Gilding, is closely involved with MISRA and will almost certainly write a proposal there that bans the use of auto in the presented form. But it is not appropriate to assume that a feature that is not suited for one part of our community would not be well suited and appropriate for other parts.

Although we suspect that this is not the whole reason for this strong allergic reaction, on the surface the only technical objection that is clearly raised here is that the feature reuses the keyword auto.

We had proposed to use either auto or __auto_type (the current implementation in gcc) for this and WG14 went clearly for auto. This was mainly to have a better cross-language compatibility with C++. We do not think that the comment gives any new argument to question the consensus that had been found in WG14.
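As an illustration of the type-generic use that motivated the feature, the following sketch defines a swap macro that works for any assignable object type. It uses the gcc spelling __auto_type (also accepted by clang) so that it compiles outside C23 mode; with a C23 compiler the keyword auto could be used in the same position. The names SWAP, swap_tmp and swap_demo are hypothetical.

```c
#include <assert.h>

/* Type-generic swap: the type of the temporary is inferred from (a).
   __auto_type is a gcc/clang extension; in C23 this would be auto. */
#define SWAP(a, b)                   \
    do {                             \
        __auto_type swap_tmp = (a);  \
        (a) = (b);                   \
        (b) = swap_tmp;              \
    } while (0)

static void swap_demo(void)
{
    int x = 1, y = 2;
    double p = 1.5, q = -2.0;

    SWAP(x, y);   /* temporary has type int */
    SWAP(p, q);   /* temporary has type double */

    assert(x == 2 && y == 1);
    assert(p == -2.0 && q == 1.5);
}
```

Without type inference, such a macro needs the type as an extra argument or has to fall back on typeof.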

GB-024

The specification of linkage for file-scope objects fails to cover the case of objects declared as thread_local without static or extern.

Change “no storage-class specifier or only the specifier auto” to “does not contain the storage-class specifier static or constexpr”.

Proposed resolution

6.2.2 Linkages of identifiers

…

5 If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and [removed: no storage-class specifier or only the specifier auto] [added: does not contain the storage-class specifiers static or constexpr], its linkage is external.

(note the editorial change “specifier” → “specifiers”)
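The case addressed by GB-024 can be sketched as follows. The C11 spelling _Thread_local is used so that the example does not depend on C23 keywords; the object and function names are hypothetical.

```c
#include <assert.h>

/* File-scope thread-local object declared without static or extern:
   the case that the current wording fails to cover. With the proposed
   wording its linkage is external. */
_Thread_local int request_count;

/* A later declaration with extern refers to the same object, just as
   it would from another translation unit. */
extern _Thread_local int request_count;

/* With static, the identifier has internal linkage instead. */
static _Thread_local int private_count;

static void bump_counts(void)
{
    request_count += 1;
    private_count += 1;
}

static void linkage_demo(void)
{
    bump_counts();
    bump_counts();
    assert(request_count == 2);
    assert(private_count == 2);
}
```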

Poll

  1. Does WG14 want to adopt the proposed resolution of NB comment GB-024?

US-121

As we’ve been working on implementing this functionality in Clang, we’re finding that the specification differences between C and C++ are a significant source of consternation for us. In C++, auto is a type specifier. In C, auto is not a type, it’s the absence of a type and the use of a storage class specifier.

Please do not resurrect implicit int with different semantics, but define this as a type specifier. Logically, a deduced type is a type and not a class of storage.

The keyword auto already has a meaning in C, and that use has never been deprecated. Removing that functionality would therefore be a direct violation of WG14’s policy. Seen that way, if we stick to auto as the keyword for this feature, it can always be read as giving new semantics to type omission.
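For reference, the existing meaning of the keyword looks like this in practice. This is a minimal sketch with a hypothetical function name; the declaration is valid in every version of C, including C23, because a type specifier is present next to auto.

```c
#include <assert.h>

static int sum_to(int n)
{
    /* auto is a storage-class specifier here, next to an explicit type.
       The declaration means the same with or without the keyword. */
    auto int total = 0;

    for (int i = 1; i <= n; ++i)
        total += i;
    return total;
}

static void storage_class_demo(void)
{
    assert(sum_to(4) == 10);
}
```

Only when the type specifier is omitted does the declaration become a candidate for type inference.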

Historically, this feature was implemented by gcc with __auto_type to mark exactly the difference between a storage class and a type specification. It was actually clang’s choice to map this feature to the auto feature from C++, which thereby eliminated the conceptual difference between the two.

The alternative would be to consistently use __auto_type and to constrain (not only restrict) its use to the grammar as implemented by gcc. WG14 was not in favor of that, and we do not see new data here that would warrant reconsidering that decision.

As a consequence, we don’t know what an appropriate action would be to accommodate these concerns.

US-122 and 123

US-122

Introduces incompatible semantics with C++ regarding the following example (undefined behavior in C, accepted in C++):

int i;
auto good = &i;
auto *bad = &i; // Cannot specify the pointer

This style is often a coding standard requirement for code bases in C++ due to the improved code readability: e.g., https://llvm.org/docs/CodingStandards.html#beware-unnecessary-copies-with-auto.

We prefer that it be required to support (optional) pointer and array declarators as part of a deduced type.

First, the introductory phrase of this NB comment is not correct: the semantics are not incompatible here. Rather, one standard (C++) defines semantics where the other (C) does not.

Nothing prohibits implementations from extending the C semantics towards the C++ semantics. In fact, it was an explicit choice to make this undefined in C (and not a constraint violation) so that existing implementations (such as clang) would not have to change.
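The portable subset can be sketched as follows, again with the gcc spelling __auto_type so that the code compiles outside C23 mode. The function name is hypothetical. The pointer-declarator form from the comment appears only in a code comment, because gcc does not accept it with __auto_type, and with auto its behavior is undefined in C23 as adopted.

```c
#include <assert.h>

static int read_through_inferred_pointer(void)
{
    int i = 42;

    /* Plain identifier as declarator: the type is inferred as int *. */
    __auto_type good = &i;

    /* auto *bad = &i;
       Accepted in C++. Undefined in C23 as adopted, so an
       implementation may accept it as an extension. */

    assert(_Generic(good, int *: 1, default: 0));
    *good += 1;
    return i;
}
```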

The reason why this semantic restriction exists in C23 is internal. In C++, type inference was well established before the introduction of this feature (namely by overloading and templates), and the definition there actually uses rules that were designed for templates. We were not able to come up with text that would have covered the C++ semantics well enough, and decided to go first with the restricted version as implemented in gcc as __auto_type, which was doable in the context of terminology that pre-existed in C.

N3079 presents an approach that mostly follows C++ and for which we are convinced that it should be integrated into the standard at some point. Below we provide a less complex alternative that keeps the status quo and improves the text where it seemed possible.

US-123

Adds a constraint on programs under a “Description” heading which makes it a bit less clear as to how to interpret the “shall” clauses used. For example, is this code UB or is it simply not possible to write:

auto a = { 1, 2 };

If it’s UB, an implementation could elect to deduce a as int[3] or int * and I don’t think we want to allow an extension into that space (FWIW, in C++ that would deduce to a std::initializer_list).

Clarify the intent by moving the specification either to a Constraints or Semantics section, or remove use of the word “shall”

“Shall” outside of Constraints sections always indicates UB, see Clause 4 p2. But indeed, the use of “Description” as a heading is out of line with the context in which this text is located. It should have been “Semantics”. We apologize for this mistake.

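It also seems that the syntax using braced initializers is not compatible with the corresponding construct in C++, since there a braced initializer in an auto declaration of this form always indicates an inferred list type (std::initializer_list, as noted in the comment) rather than the type of the enclosed expression.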

Proposed accommodation

Note that a resolution that mostly implements the C++ approach is presented in N3079.

As an accommodation for these concerns, one possibility is to move from UB to implementation-defined behavior and to recommend that implementations stick to the reasonable semantics already provided by the corresponding feature in C++.

6.7.9 Type inference

Constraints

1 A declaration for which the type is inferred shall contain the storage-class specifier auto.

[removed: Description] [added: Semantics]

2 For such a declaration that is the definition of an object the init-declarator shall have [removed: one of the forms] [added: the form]

direct-declarator = assignment-expression
[removed: direct-declarator = { assignment-expression }]
[removed: direct-declarator = { assignment-expression , }]

The [removed: declared] [added: inferred] type of the declared object is the type of the assignment expression after lvalue, array to pointer or function to pointer conversion, additionally qualified by qualifiers and amended by attributes as they appear in the declaration specifiers, if any 176). [removed: If the] [added: Implementations need not accept a] direct declarator that is not of the form

identifier attribute-specifier-sequenceopt

optionally enclosed in balanced pairs of parentheses [removed: , the behavior is undefined] [added: ; if a direct declarator of a different form is accepted, the behavior is implementation-defined FNT)].

FNT) It is recommended that implementations that accept different forms of direct declarators follow the syntax and semantics of the corresponding feature in ISO 14882.
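The conversions named in paragraph 2 can be observed with a short sketch. It uses the gcc spelling __auto_type (also accepted by clang) so that it compiles outside C23 mode, on the assumption that its inference rules match the proposed wording for these cases. The names twice and inference_demo are hypothetical.

```c
#include <assert.h>

static int twice(int x) { return 2 * x; }

static void inference_demo(void)
{
    int arr[3] = { 1, 2, 3 };

    __auto_type p = arr;           /* array to pointer: int * */
    __auto_type f = twice;         /* function to pointer: int (*)(int) */
    const __auto_type c = arr[1];  /* lvalue conversion yields int; const
                                      comes from the declaration specifiers */

    assert(_Generic(p, int *: 1, default: 0));
    assert(_Generic(f, int (*)(int): 1, default: 0));
    /* &c is inspected because _Generic drops qualifiers from the
       controlling expression itself. */
    assert(_Generic(&c, const int *: 1, default: 0));
    assert(p[2] == 3 && f(c) == 4);
}
```

All three declarators are plain identifiers, so the sketch stays within the form that every implementation has to accept.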

Add to the bibliography (and not to the normative references!)

Programming languages — C++, ISO/IEC IS 14882

Poll

  1. Does WG14 want to accommodate the concerns expressed in NB comments US-122 and 123 as presented?