Authors: Jay Ghiron
Date: 2026-05-20
Submitted against: C23
Status: Open
When a value with integer type is converted to another integer type other than
bool, if the value can be represented by the new type, it is unchanged....
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
(C23 6.3.2.3 "Signed and unsigned integers" paragraphs 1 and 3.)
According to this, the following has an implementation-defined result
or causes an implementation-defined signal to be raised (assume char
is signed and eight bits):
int main(){
return(char)128;
}
However, this has some effects that seem unintended:
#include<stdio.h>
int main(){
int i=getchar();
if(i!=EOF){
char c=i;/* might not be representable */
}
}
So a strictly conforming version would actually have to do something like this (with <limits.h> included):
char c=i>CHAR_MAX?i+CHAR_MIN+CHAR_MIN:i;
I assume the overwhelming majority of programs do not carefully do things like this, and instead implicitly rely on conversions to signed integer types working as expected even when the value is not representable. More importantly, it appears that some parts of the standard make the same assumption.
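As a rough sketch of what avoiding the out-of-range conversion looks like in practice, the expression above can be wrapped in a helper. The name byte_to_char is hypothetical, and the sketch assumes its argument is in the range 0 to UCHAR_MAX, as returned by a successful getchar call.

```c
#include <limits.h>

/* Hypothetical helper: map a value in [0, UCHAR_MAX] to char without
   ever converting a value that char cannot represent. */
static char byte_to_char(int i)
{
    /* If char is signed, values above CHAR_MAX are shifted down by
       2 * -CHAR_MIN, which lands them in [CHAR_MIN, -1]. If char is
       unsigned, CHAR_MIN is 0 and i never exceeds CHAR_MAX, so the
       value is passed through unchanged. */
    return (char)(i > CHAR_MAX ? i + CHAR_MIN + CHAR_MIN : i);
}
```

With this helper the loop body would read char c=byte_to_char(i); rather than char c=i;.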
If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.
(C23 6.4.5.5 "Character constants" paragraph 11.)
With char being eight bits and signed, does '\x80' have an
implementation-defined value or result in an implementation-defined
signal being raised? That is, does "whose value is that of the single
character or escape sequence" mean to do a conversion of the value to
char?
With char being sixteen bits and unsigned while int is also
sixteen bits, does '\x8000' have an implementation-defined value or
result in an implementation-defined signal being raised?
If the answer to either of the previous two questions is that integer
character constants with a single character or single escape sequence
can have implementation-defined values or result in
implementation-defined signals being raised, does this also apply to
string literals? For example "\xC3\x80" with char being eight
bits and signed (U+00C0 encoded in UTF-8).
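Whatever the answer turns out to be, a program can obtain these bytes in a char array without depending on it. The following is a minimal sketch under that assumption: it copies the object representation from an unsigned char array, so no conversion of 0xC3 or 0x80 to char takes place. The name copy_u00c0 is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: store the UTF-8 encoding of U+00C0, followed by
   a null terminator, into dst (which must have room for 3 bytes).
   The bytes are copied as object representations, so no integer
   conversion to char is involved. */
static void copy_u00c0(char *dst)
{
    static const unsigned char bytes[] = { 0xC3, 0x80, 0x00 };
    memcpy(dst, bytes, sizeof bytes);
}
```

Reading the bytes back through unsigned char yields 0xC3 and 0x80 again, since conversion to an unsigned type is always well defined.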
Can any of the macros in <stdckdint.h> cause the result to have an
implementation-defined value or raise an implementation-defined
signal? For example:
#include<limits.h>
#include<stdckdint.h>
int main(){
int r;
ckd_add(&r,INT_MAX,1);
return r;
}
Each operation is performed as if both operands were represented in a signed integer type with infinite range, and the result was then converted from this integer type to type1.
(C23 7.20.2 "Checked Integer Operation Type-generic Macros" paragraph 2.)
If these type-generic macros return false, the value assigned to *result correctly represents the mathematical result of the operation. Otherwise, these type-generic macros return true. In this case, the value assigned to *result is the mathematical result of the operation wrapped around to the width of *result.
(C23 7.20.2 "Checked Integer Operation Type-generic Macros" paragraph 5.)
So the infinite-range integer with the value INT_MAX+1 is
converted to int, in which it is by definition not representable. The
Returns paragraph does not seem to clearly circumvent this.
Specifically, "wrapped around" is defined as:
wraparound
the process by which a value is reduced modulo 2^N, where N is the width of the resulting type
(C23 3.28 paragraph 1.)
(INT_MAX+1) mod 2^INT_WIDTH is by definition INT_MAX+1,
so the wrapped around value might not always be representable. Note
that C++ more carefully says "the unique value of the destination type
that is congruent to the source integer modulo 2^N"
[conv.integral] when describing conversion to integer types. As an
aside, the Returns paragraph does not seem to require the value
false to be returned if the mathematical result of the operation is
representable in the result.
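For comparison, the C++ wording can be sketched in C without relying on an out-of-range conversion. This is only an illustration of the distinction, not a claim about what the standard intends: the name wrap_to_int is hypothetical, and the sketch assumes UINT_MAX equals 2*INT_MAX+1 (or that every unsigned value fits in int).

```c
#include <limits.h>

/* Hypothetical helper: return the unique int that is congruent to u
   modulo 2^INT_WIDTH, using only in-range arithmetic. */
static int wrap_to_int(unsigned u)
{
    if (u <= INT_MAX)
        return (int)u;  /* representable, so the value is unchanged */

    /* Here u is in [INT_MAX+1, UINT_MAX], so UINT_MAX - u is in
       [0, INT_MAX] and converts to int unchanged. Negating it and
       subtracting 1 gives u - 2^INT_WIDTH, which is in [INT_MIN, -1]. */
    return -(int)(UINT_MAX - u) - 1;
}
```

Under these assumptions, wrap_to_int((unsigned)INT_MAX+1) yields INT_MIN, whereas reducing INT_MAX+1 modulo 2^INT_WIDTH as 3.28 literally describes leaves INT_MAX+1.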
If UCHAR_MAX is greater than INT_MAX, can conversions from
unsigned char to int in functions such as fgetc cause
implementation-defined results or raise implementation-defined
signals?
Can conversions from wchar_t to wint_t or vice versa in functions
such as fgetwc or printf (%lc) cause implementation-defined
results or raise implementation-defined signals?
What would it mean for an implementation-defined signal to be raised in a constant expression? For example:
static_assert((int)-1U*0==0);