Document Number: N2484
Submitter: Aaron Peter Bachmann
Submission Date: 2020-02-19
Make pointer type casting useful without negatively impacting performance

Summary

Pointer type punning is useful, but it is mostly undefined behavior according to the standards ISO/IEC 9899:1999 through ISO/IEC 9899:2018.
Before that, there were implementation-defined aspects. As programmers, we therefore have to resort to additional, non-portable extensions that compilers normally provide, to another programming language (often assembler), or to work-arounds to achieve our intended goals. Strict aliasing rules are also useful: they allow for more efficient code without extra work for the programmer. We can allow type punning in many cases without negatively impacting performance.
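
As an illustration of the work-arounds mentioned above, the following minimal sketch reinterprets the bits of a float through memcpy() instead of a pointer cast. The helper names float_to_bits, bits_to_float and negate_via_memcpy are hypothetical, and the sketch assumes a 32-bit unsigned and a float whose sign bit is the most significant bit (as in IEEE 754 binary32).

```c
#include <string.h>

/* Assumption for this sketch: float and unsigned have the same size. */
_Static_assert(sizeof(float) == sizeof(unsigned), "precondition violated");

/* Hypothetical helper: copy the object representation of a float
   into an unsigned without any pointer cast. */
static unsigned float_to_bits(float f) {
    unsigned u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: the inverse copy, from unsigned back to float. */
static float bits_to_float(unsigned u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Flip the sign bit; assumes it is bit 31 of the representation. */
static float negate_via_memcpy(float f) {
    return bits_to_float(float_to_bits(f) ^ (1u << 31));
}
```

Whether the memcpy() calls are compiled away is up to the implementation, which is the cost this kind of work-around carries.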

Prior work

N2279, "Proposal to make aliasing consistent", proposed to allow pointer casts again. The proposal is very clear and simple. Its adoption would have made C a much more user-friendly language. It seems to have been rejected by the committee in order to keep the benefit of strict pointer aliasing, i.e. more efficient code, which is irrefutably a valid argument.

Why pointer casting is useful

Why strict aliasing rules are useful

Prior art

Problem

If pointers are derived from other pointers via type casting, accesses through them often violate the strict aliasing rules. Such programs are therefore incorrect.
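
To make the problem concrete, here is a minimal sketch of the assumption a compiler may make under the strict aliasing rules, and of a cast that breaks it. The function names store_and_reload and well_defined_call are hypothetical.

```c
/* int and float are not compatible types, so a compiler may assume that
   *i and *f designate different objects and return 1 without reloading *i. */
static int store_and_reload(int *i, float *f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}

/* Well-defined use: the two pointers refer to distinct objects. */
static int well_defined_call(void) {
    int   n = 0;
    float x = 0.0f;
    return store_and_reload(&n, &x);
}

/* Undefined under the current rules (shown only as a comment, not executed):
 *
 *     int n = 0;
 *     store_and_reload(&n, (float *)&n);   // *f accesses an int as a float
 */
```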

Proposed solution

Make object-pointer casting (casting a pointer to a pointer to a different type) valid and well defined in a local scope, i.e. function scope and block scopes within a function, provided that the value of the pointer derived from the original pointer via the cast to a fundamentally different pointer type does not escape the scope. Accesses via this pointer shall be valid as well, provided we honor other restrictions (const, alignment, ...).
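
A minimal sketch of the kind of code the rule above is meant to permit: the cast pointer is created, used and discarded inside one function. The macro ASSUME_PROPOSED_SEMANTICS and the function name negate_in_place are hypothetical. Without the macro the sketch falls back to memcpy(), since the cast path is undefined under the current standard. It assumes a 32-bit unsigned and a float whose sign bit is the most significant bit.

```c
#include <string.h>

/* Assumption for this sketch: float and unsigned have the same size. */
_Static_assert(sizeof(float) == sizeof(unsigned), "precondition violated");

/* Negate *f by flipping its sign bit. */
static void negate_in_place(float *f) {
#ifdef ASSUME_PROPOSED_SEMANTICS
    /* The derived pointer u lives only in this function and does not
       escape: the case the proposal intends to make well defined. */
    unsigned *u = (unsigned *)f;
    *u ^= 1u << 31;
#else
    /* Portable fallback under the current rules. */
    unsigned bits;
    memcpy(&bits, f, sizeof bits);
    bits ^= 1u << 31;
    memcpy(f, &bits, sizeof bits);
#endif
}

/* By contrast, a derived pointer that leaves its scope, as in the
   commented-out function below, would not be covered by the rule
   as worded above:

       unsigned *leak(float *f) { return (unsigned *)f; }
*/
```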

Discussion of the proposed solution and examples

Except as noted in the proposal, strict aliasing-rules shall remain intact.
    #include <limits.h>   
    void f1(float **f, unsigned **u){
        _Static_assert(sizeof(float)==sizeof(unsigned),"precondition violated");
        _Static_assert(4==sizeof(unsigned),"precondition violated");
        _Static_assert(8==CHAR_BIT,"precondition violated");       

        unsigned u32;
    #ifdef DO_SOMETHING_INVALID

        *f=*u; // invalid

        *u=*f; // invalid
        *f=*(float**)u; // invalid

        *u=*(unsigned**)f; // invalid  
    #else
        (void)u;
    #endif
        u32=**(unsigned**)f;
        u32^=1u<<31;
        // under the restrictions given above, a compiler that sees only the
        // prototype of the function, not its implementation, must already assume
        // **f may be changed (as float); in this case **f=-**f
        **(unsigned**)f=u32; // store the flipped bits back; valid according to the proposal

    }

#include <string.h>
#include <limits.h>
#include <stdint.h>
#define ALIGN (sizeof(size_t))
#define ONES ((size_t)-1/UCHAR_MAX)          // 0x01010101 for 32 bit integer
#define HIGHS (ONES * (UCHAR_MAX/2+1))       // 0x80808080 for 32 bit integer
#define HASZERO(x) ((x)-ONES & ~(x) & HIGHS) // only 0 has OV & high bit 0

size_t strlen(const char *s){
    const char *a = s;
    const size_t *w;       
    for (; (uintptr_t)s % ALIGN; s++) if (!*s) return s-a;
    // harmless but undefined, because we eventually read more than object-size!
#ifdef STRLEN_USE_MEMCOPY
    // pray compiler will remove memcpy()
    size_t v; // word-sized copy of the bytes at s
    for (;memcpy(&v,s,sizeof v), !HASZERO(v); s+=sizeof v);
#else
    for (w = (const void *)s; !HASZERO(*w); w++); // code matching this proposal
    s = (const void *)w;
#endif
    for (; *s; s++);
    return s-a;
}
static float *Fp=(float*)&Something;
static unsigned *Uu=(unsigned*)&Something; // invalid

Proposed wording changes

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:89)