Issue 1063: Conversions to signed integer types

Authors: Jay Ghiron
Date: 2026-05-20
Submitted against: C23
Status: Open

When a value with integer type is converted to another integer type other than bool, if the value can be represented by the new type, it is unchanged.

...

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

(C23 6.3.2.3 "Signed and unsigned integers" paragraphs 1 and 3.)

According to this, the following has an implementation-defined result or causes an implementation-defined signal to be raised (assume char is signed and eight bits):

int main(){
return(char)128;
}

However, this has some effects that seem unintended:

#include<stdio.h>
int main(){
int i=getchar();
if(i!=EOF){
char c=i;/* might not be representable */
}
}

So a strictly conforming version would instead need to do something like the following (with <limits.h> included):

char c=i>CHAR_MAX?i+CHAR_MIN+CHAR_MIN:i;
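As a minimal sketch, the expression above can be packaged as a helper so that no out-of-range conversion is ever performed. The name int_to_char is hypothetical and not part of any standard interface; the sketch assumes its argument is a non-EOF return value of getchar, that is, a value in the range 0 to UCHAR_MAX.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not a standard function): converts a value in
   [0, UCHAR_MAX], such as a non-EOF result of getchar(), to char
   without converting any value that char cannot represent. */
static char int_to_char(int i)
{
    /* Assumption of this sketch: the caller has already excluded EOF. */
    assert(i >= 0 && i <= UCHAR_MAX);

    if (i > CHAR_MAX) {
        /* Only reachable when char is signed.  Adding CHAR_MIN twice
           subtracts UCHAR_MAX + 1 in two steps, so every intermediate
           value stays within the range of int. */
        return (char)(i + CHAR_MIN + CHAR_MIN);
    }

    /* i is representable in char, so the conversion leaves it unchanged. */
    return (char)i;
}
```

Converting the result back to unsigned char recovers the original value, since conversion to an unsigned type is fully defined.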

I assume the overwhelming majority of programs do not take this kind of care and implicitly rely on conversions to signed integer types working as expected even when the value is not representable. More importantly, it appears that some parts of the standard make the same assumption.

Question 1

If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

(C23 6.4.5.5 "Character constants" paragraph 11.)

With char being eight bits and signed, does '\x80' have an implementation-defined value or result in an implementation-defined signal being raised? That is, does "whose value is that of the single character or escape sequence" mean to do a conversion of the value to char?

Question 2

With char being sixteen bits and unsigned while int is also sixteen bits, does '\x8000' have an implementation-defined value or result in an implementation-defined signal being raised?

Question 3

If the answer to either of the previous two questions is that integer character constants with a single character or single escape sequence can have implementation-defined values or result in implementation-defined signals being raised, does this also apply to string literals? For example, consider "\xC3\x80" (U+00C0 encoded in UTF-8) with char being eight bits and signed.

Question 4

Can any of the macros in <stdckdint.h> cause the result to have an implementation-defined value or raise an implementation-defined signal? For example:

#include<limits.h>
#include<stdckdint.h>
int main(){
int r;
ckd_add(&r,INT_MAX,1);
return r;
}

Each operation is performed as if both operands were represented in a signed integer type with infinite range, and the result was then converted from this integer type to type1.

(C23 7.20.2 "Checked Integer Operation Type-generic Macros" paragraph 2.)

If these type-generic macros return false, the value assigned to *result correctly represents the mathematical result of the operation. Otherwise, these type-generic macros return true. In this case, the value assigned to *result is the mathematical result of the operation wrapped around to the width of *result.

(C23 7.20.2 "Checked Integer Operation Type-generic Macros" paragraph 5.)

So the infinite precision integer with the value INT_MAX+1 is converted to int, in which it is by definition not representable. The Returns paragraph does not seem to clearly circumvent this. Specifically, "wrapped around" is defined as:

wraparound

the process by which a value is reduced modulo 2^N, where N is the width of the resulting type

(C23 3.28 paragraph 1.)

INT_MAX+1 mod 2^INT_WIDTH is by definition INT_MAX+1, because INT_MAX+1 equals 2^(INT_WIDTH-1) and is therefore already less than 2^INT_WIDTH, so the wrapped around value might not always be representable. Note that C++ more carefully says "the unique value of the destination type that is congruent to the source integer modulo 2^N" [conv.integral] when describing conversion to integer types. As an aside, the Returns paragraph does not seem to require the value false to be returned if the mathematical result of the operation is representable in the result.
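The reading suggested by the C++ wording can be sketched without relying on any out-of-range conversion to a signed type. The names wrap_to_int and wrapping_add below are hypothetical, and the sketch assumes that unsigned int has exactly one more value bit than int (checked by the static assertion). It is only an illustration of "the unique value of the destination type congruent modulo 2^N", not a claim about how ckd_add is or must be implemented.

```c
#include <assert.h>
#include <limits.h>

/* Assumption of this sketch: UINT_MAX == 2 * INT_MAX + 1. */
static_assert(UINT_MAX / 2 == INT_MAX,
              "sketch assumes unsigned int has one more value bit than int");

/* Hypothetical helper: returns the unique int congruent to u modulo
   2^N, where N is the width of int, using only value-preserving
   conversions to int. */
static int wrap_to_int(unsigned int u)
{
    if (u <= (unsigned int)INT_MAX) {
        return (int)u;              /* representable, value unchanged */
    }
    /* u - (INT_MAX + 1) lies in [0, INT_MAX], so it converts to int
       unchanged; adding INT_MIN then yields a value in [INT_MIN, -1]. */
    return (int)(u - (unsigned int)INT_MAX - 1u) + INT_MIN;
}

/* Hypothetical helper: adds with wraparound.  Conversion to unsigned
   int and unsigned addition are both defined modulo 2^N. */
static int wrapping_add(int a, int b)
{
    return wrap_to_int((unsigned int)a + (unsigned int)b);
}
```

Under these assumptions wrapping_add(INT_MAX, 1) yields INT_MIN, which is the value most readers would expect ckd_add to store in that case.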

Question 5

If UCHAR_MAX is greater than INT_MAX, can conversions from unsigned char to int in functions such as fgetc cause implementation-defined results or raise implementation-defined signals?

Question 6

Can conversions from wchar_t to wint_t or vice versa in functions such as fgetwc or printf (%lc) cause implementation-defined results or raise implementation-defined signals?

Question 7

What would it mean for an implementation-defined signal to be raised in a constant expression? For example:

static_assert((int)-1U*0==0);