From: Nick Maclaren <nmm1@cus.cam.ac.uk>
To: sc22wg14@dkuug.dk
Subject: What is an Object in C Terms?
Date: Wed, 26 Sep 2001 16:52:10 +0100

This is a slight redraft of a document that was circulated to the UK C panel. I am not sure what it is proposing, except that effective types need an improved specification, but it attempts to describe a very serious problem.

THE OBJECT PROBLEM
------------------

This is asking the question "What is an object when it is at home?" We need to start with an explanation of why a precise answer to the question is so important.

This issue is very closely related to the sequence point issue, obviously, and can be regarded as of equal importance as far as the C language's semantics are concerned. However, it is even more serious in its effects on other standards. C and POSIX have now agreed that their approaches to signal handling are incompatible, though not in those words. This needs to be resolved, but will not even be tackled for POSIX 2001, as it is far too difficult a problem.

The actual incompatibility is that C defines all parallelism and 'real' or external signal handling to be undefined behaviour, as it is outside C's model, and that POSIX leaves all language issues to C, as it is mainly a library interface. Such things may therefore be well-defined in POSIX terms, yet have undefined effects on the program that uses the POSIX facilities! The incompatibility is thus of the form that POSIX relies on C to define behaviour that C explicitly leaves undefined!

Obviously, the solution is for POSIX to define the language aspects that it needs, but the problem is that doing so is very tricky. It was pretty complex in C90, but C99 has complicated this area very considerably, and the problems are going to cause major difficulty over the next decade. The incompatibility between C90 and POSIX already causes intermittent, unrepeatable program failure on most of the systems that I have access to but, precisely because of those properties, it is almost impossible to prove this to the vendors. Few people recognise the symptoms, and a negligible proportion have the skills to investigate further.

When I refer to the SMP aspects, I am talking about the fact that two independent threads must not access overlapping objects if one of them is updated. Whether the SMP is cache-coherent is irrelevant, as exactly the same problem occurs with data in registers as with data in memory; when using parallelism for efficiency, storing and reloading all registers at every synchronisation point is an unacceptable overhead, and so is not done. Also, note that this applies to all state (such as floating-point exception flags) as well as data. There is an exactly corresponding problem with asynchronous signal handlers.
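To make that concrete, here is a minimal sketch using POSIX threads (the code is mine, purely for illustration). POSIX regards the program as perfectly reasonable, but the C abstract machine knows nothing of the second thread, so a compiler is entitled to keep 'flag' in a register and never reload it, turning the wait loop into an infinite one:

    #include <pthread.h>
    #include <stdio.h>

    /* Shared state, deliberately without any synchronisation.  Nothing
       in C99 requires the value to be refetched from memory inside the
       loop in main, because C has no concept of the second thread.     */
    static int flag = 0;

    static void *setter (void *arg) {
        flag = 1;                  /* the update from the other thread */
        return arg;
    }

    int main (void) {
        pthread_t t;
        pthread_create(&t,NULL,setter,NULL);
        while (flag == 0)
            ;                      /* may be compiled as for (;;) ;    */
        pthread_join(t,NULL);
        printf("done\n");
        return 0;
    }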
Commercial Implementors and Important Users
-------------------------------------------

The vast majority of commercial implementors and important users are not members of SC22/WG14 or even of any national body affiliated with it, and work solely from the standard. As they should, because ISO rules (and all normal commercial practice) specify that the standard is the official document, and that records of its development should be used for at most clarification.

I can witness that this area has caused major headaches ever since C90 was developed, and that many implementors are very seriously unhappy about C99. They simply do not know what most of it implies and, as is generally agreed, the subtleties in this area are critical for both optimisation and the development of robust applications. Most of the ones that I have contacts with had severe problems deciding exactly what optimisations were allowed in C90, and often went through several iterations before an industry consensus emerged.

Their approach is typically to use restrict and static in array parameters (but not effective types) just far enough to optimise the BLAS and LAPACK, and perhaps a little further, but to await clarification before being more radical (a sketch of that style of code follows below). The fact that a very few vendors are gung ho about effective types, without it being absolutely clear what the constraints mean, is going to be a major portability problem in certain complex codes. I already see far too many that have to disable optimisation on some systems because it is unclear whether they or the implementation has broken the standard.
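For illustration, the kind of code that such vendors do trust is a BLAS-style loop like the following sketch (my own, not any vendor's source). Here restrict asserts that the two arrays do not overlap, and static asserts that at least n elements are accessible, which between them license vectorisation without any appeal to effective types:

    /* y := a*x + y, in the style of the BLAS routine DAXPY. */
    void daxpy (int n, double a,
                const double x[static restrict n],
                double y[static restrict n]) {
        int i;
        for (i = 0; i < n; ++i)
            y[i] += a*x[i];
    }

    int main (void) {
        double x[4] = {1.0,2.0,3.0,4.0}, y[4] = {0.0};
        daxpy(4,2.0,x,y);
        return 0;
    }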
The C99 Standard
----------------

Let us look at some relevant definitions:

    [A] access
        to read or modify the value of an object

    [B] alignment
        requirement that objects of a particular type be located on
        storage boundaries with addresses that are particular multiples
        of a byte address

    [C] object
        region of data storage in the execution environment, the
        contents of which can represent values
        NOTE: When referenced, an object may be interpreted as having a
        particular type; see 6.3.2.1.

    [D] parameter, formal parameter, formal argument (deprecated)
        object declared as part of a function declaration or definition
        that acquires a value on entry to the function, or an identifier
        from the comma-separated list bounded by the parentheses
        immediately following the macro name in a function-like macro
        definition

    [E] value
        precise meaning of the contents of an object when interpreted as
        having a specific type

Other sections that are particularly relevant include 5.1.2.3 passim (side effects, sequence points and conformance requirements), 6.2.5 [#1] (object type and meaning of a value), 6.2.5 [#20] (derived types), 6.2.6.1 [#2,#4] (values are stored in bytes), 6.2.6.1 [#5,#6,#7] (trap representations and padding bytes), 6.2.6.1 [#8] (validity of alternate representations), 6.2.6.2 (integer types), 6.2.7 [#2] (multiple declarations and compatible types), 6.3.2.1 [#1] (definition of lvalue), 6.3.2.1 [#2] (lvalue to value conversion), 6.3.2.1 [#3] (array to pointer conversion), 6.3.2.3 [#1,#7] (pointer conversion), 6.5 [#1] (sequence points and accesses), 6.5 [#6] (effective type), 6.5 [#7] (permitted access types), 6.5 passim (expressions that require or deliver lvalues), 6.5.3.2 [#3] (unary '&' operator), 6.5.8 [#5,#6] (pointer comparison), 6.7.3.1 (formal definition of restrict), 6.7.5.3 [#7] (array parameter adjustment and the implication of static therein), 6.9.1 [#9] (parameter identifiers are lvalues) and 7.21.1 [#1] (<string.h> access conventions). I may well have missed some important ones.

What is abundantly clear is that the basic model used by C is this. An object type is specified to be a complete type that is neither void nor a function type. An expression that can access an object has a one-to-one mapping to an lvalue, and the two correspond to a unique and well-defined object and object type. A pointer value with a pointer to object type defines an lvalue that is an array object whose base type is the corresponding object type (the array may be of length 1, of course), except for the odd case of the element one beyond the end of an array. A pointer value with a pointer to an incomplete type, or to the element beyond the end of an array, defines an address but not an lvalue, though its base type correspondence is otherwise the same as that for a pointer to an object. A pointer value with a pointer to void type specifies only an address. So far, so good.

Unfortunately, this breaks down in several places in C90, and in many more in C99. In most cases, the failure of the basic model is not serious in its effects on the C language as such, because the constructions that expose the problems are either perverse or matter only with the most extreme optimising compilers. So it is mainly the High Performance Computing people who notice, and few of them. However, it is very serious as far as the effects on related standards are concerned; this applies particularly to parallel models, such as POSIX threads or OpenMP, but also includes anything that needs reliable signal handling.

To a first approximation, the major problems separate into two parts: what the base type of an object is and, if an object is an array type, what size the array is. These can be considered separately. The first question is less clear-cut than the second, and has several aspects, which all interact with one another. These include whether an object or a subobject is the relevant one, which arises in several guises, and the new concept of effective type.

There is a further subtlety, which will not be considered here in any detail, which is when a datum with one base type can be accessed through an lvalue of a different type. This was extremely unclear in C90 and has been significantly changed in C99. In practice, this is mostly about the signedness of integer types, because all other base types tend to be either wholly incompatible or wholly compatible. The issue applies just as much to them; it is merely that the problem does not show up in the current C standard. However, it is worth noting that it will probably be exposed for fixed-point data types as well, and in a worse way than for signedness, if they are introduced.

It is also critical to realise that the examples given later are a small proportion of the problems that I have seen actually cause trouble, and a very small proportion of those where I know that the C99 and POSIX or OpenMP standards' models conflict. So there is virtually no point in trying to resolve any of the following individual examples; what is needed is a clarification of these aspects of the C standard, along the lines of the sequence point approaches (only probably rather more radical).

What is the Base Type?
----------------------

In this context, type qualifiers are largely irrelevant, and the most critical properties of the base type are its size and alignment. While the C standard does not say so, it is clear that aliasing, sequence point and SMP problems are primarily affected by those two properties alone.
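As a sketch of why those are the operative properties (my example, not wording from the standard): the special permission in 6.5 to access an object through the signed or unsigned type corresponding to its own type is workable precisely because 6.2.5 requires such a pair of types to share both size and alignment:

    int main (void) {
        int n = -1;
        unsigned int *p = (unsigned int *)&n;
        /* Permitted: access through the unsigned type corresponding to
           the datum's signed type.  It is safe only because int and
           unsigned int are required to have the same size and the same
           alignment requirements.                                      */
        unsigned int u = *p;
        (void)u;
        return 0;
    }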
In C90, things were relatively clear, and it had three categories of type, all of which have been preserved in C99:

    Datum type.  This is the actual type of the datum, ignoring any
    syntactic issues, and is generally not decidable statically.

    Access type.  This is the type of the lvalue used for access, and
    can be decided statically.

    Allocation type.  This is the type of the original definition or
    the result of malloc etc., and is generally not decidable
    statically.

The requirements in both C90 and C99 are that the datum and access types must be compatible, and that the size and alignment of the allocation type must be adequate for both. There is also the exception for character access types, which may be used on any datum type. There was a thread on the reflector which claimed that the allocation type could affect later accesses which used another type of lvalue, but it did not seem to have much support, and the consensus was that its only lasting effect was through its alignment. All of this specification could do with clarification.

In C90, the base type of an object was the base type of the lvalue (i.e. the access type), except for certain library functions (e.g. memcpy) which accessed objects as arrays of bytes. While there were a good many cases where the array length was unclear (see below), I cannot think of many where the alignment of the base type was. What confusion there was concerned mainly which of a set of compatibly aligned base types to choose when using functions like memcpy.

There were also some confusions about whether the type of an argument (as distinct from a parameter) was relevant, subtleties about signedness, and whether copying only the data of a structure (i.e. not its padding) was safe. Some of these have been clarified in C99, but others have not. They will be ignored here, as they affect the semantics of C programs but have relatively little effect on other standards; i.e. they affect what can be done with an object, rather than exactly where it exists in memory.

However, C99 has changed this area quite drastically, by introducing the concept of effective type. This has not clarified the situation, and some of its problems are described below. But we now have to add:

    Effective type.  This seems to be an extension of datum type, but
    intended for type-dependent optimisation.  I shall not describe it
    in the next section, as I do not understand it, for reasons given
    later.  I think that its introduction was a very serious mistake.

Objects and Subobjects
----------------------

There are three entwined aspects to this. One is the question of when an object access refers to a subobject (as distinct from the whole object), such as an array access referring to an array element, or a structure or union access referring to a member. Another is when a valid object can be created by pure pointer manipulation, bypassing the type mechanism entirely. And the last is when the relevant object is an array of uninterpreted characters (i.e. unsigned char); while this is most common in the library, it can occur in the language proper.

One of the reasons that C90 and C99 work at all is that they define almost no type-generic facilities other than ones that map objects to arrays of characters. <tgmath.h> is irrelevant here, because it does not raise the problems being considered. However, the same is not true of other standards that use C as a base, because they may (for example) have facilities that operate on arrays of generic data pointers. But this does not mean that there is no ambiguity in C, so much as that the ambiguity is not exposed in the current language.
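As a sketch of that one type-generic mapping (my example): any object at all may be inspected as an array of uninterpreted characters, and this is essentially the whole of the portable type-genericity that C itself offers:

    #include <stdio.h>

    /* Print the representation of an arbitrary object, using the one
       access type (unsigned char) that may alias every datum type.   */
    static void dump (const void *object, size_t size) {
        const unsigned char *bytes = object;
        size_t i;
        for (i = 0; i < size; ++i)
            printf("%02x ",(unsigned)bytes[i]);
        printf("\n");
    }

    int main (void) {
        double d = 1.0;
        struct {int a; double b;} s = {1,2.0};
        dump(&d,sizeof d);
        dump(&s,sizeof s);          /* includes any padding bytes */
        return 0;
    }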
It appears that the standard says neither that a member of a struct is itself an object, nor that an element of an array is, but clearly they are. However, there is more to subsectioning than that. Consider:

    #include <stddef.h>

    typedef struct {int a; double b; int c;} COMPOSITE;
    typedef struct {int a; double b;} SUBSET;

    int main (void) {
        double array[5] = {0.0};
        COMPOSITE composite = {0,0.0,0};
        double (*p)[2];
        SUBSET *q;
        int *r;
        array[0] += (array[1] = 0.0);
        composite.a += (composite.c = 0);
        p = (double (*)[2])(array+2);
        q = (SUBSET *)&composite;
        r = (int *)(((char *)&composite)+offsetof(COMPOSITE,c));
        return 0;
    }

It is clear that the first two assignments would be illegal if the relevant objects were the whole array and structure, and constructions like that are so common that the only plausible interpretation of the C standard is that the relevant objects are the elements and members. This should be stated explicitly.

However, I assert that it is also the case that *p, *q and *r are well-defined objects of type double[2], SUBSET and int. The validity of the first follows from the equivalence of pointers and arrays and the fact that a pointer to an array is a pointer to its first element; the second follows from the common initial sequence rules for unions; and the third from the very definition and purpose of the offsetof macro. The constructions given here may not be common, but there are a vast number of programs that use equivalent ones (often implicitly), and forbidding them would break a huge number of programs. In all cases, if they are regarded as undefined, I am almost certain that I can produce a program that shows a clear inconsistency in the standard. Perhaps worse is the fact that there are so many unrelated ways of bypassing the formal type structure; I do not know offhand how many more there are. So subsetting is not a simple phenomenon.

There is a similar problem with the cases where objects are handled as arrays of characters. Consider:

    #include <string.h>

    typedef struct {int a; double b; int c;} COMPOSITE;

    int main (void) {
        COMPOSITE composite;
        char string[] = "ABB";
        memcpy(&composite,&((char *)&composite)[sizeof(int)],sizeof(int));
        strtok(string,&string[2]);
        return 0;
    }

The first function call is a fairly common construction, though rarely in that form, and generally regarded as acceptable; again, forbidding it would break a huge number of programs and could be used to show an inconsistency in the standard. The second one is obscene, but I can see no reason that it is not conforming; the consequences of forbidding such things generically are beyond my imagination.

My view is that these questions are fairly easy to resolve within the C language, except for the matter of effective type discussed below. However, it is not so clear when other standards are involved, or for C extensions that allow operations on whole arrays. For example, it is very unclear whether an array should then be treated as a single object, as an array of discrete elements or as an array of characters. I doubt that this problem is soluble without a major rethink of the C object model. Hence my view is the following:

    The generic C model is that the object used for access through an
    lvalue is the smallest possible subobject compatible with the type
    of the lvalue and the form of access, and that arrays (including
    arrays of uninterpreted characters) are invariably treated as
    discrete elements even when operated on in bulk.
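A short sketch of what that model asserts (my illustration, not standard wording): each access below touches only the smallest subobject selected by its lvalue, and even the bulk copy is, on this reading, a sequence of element accesses rather than one access to a whole array object:

    #include <string.h>

    typedef struct {int a; double b; int c;} COMPOSITE;

    int main (void) {
        COMPOSITE s = {0,0.0,0};
        double array[4] = {0.0};
        double copy[4];
        s.a = (s.c = 1);              /* accesses members s.a and s.c only */
        array[0] = (array[1] = 1.0);  /* accesses two distinct elements    */
        memcpy(copy,array,sizeof array);  /* accesses of unsigned char
                                             elements, not one access to a
                                             double[4] object              */
        return 0;
    }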
I can think of no exceptions to this model, but the point is not clearly stated and other interpretations may be possible. Some wording improvement is essential. Related to this is the interpretation of the base type of an object: if the lvalue used for access has an object type, that type and the datum type are the only ones considered; if it has none (as with memcpy), the lvalue type is treated as if it were an array of uninterpreted characters and the datum type is irrelevant. The main exception is the concept of effective type, if it is retained.

The third aspect is that a valid object can be created by any legal pointer operations, casts etc., subject to each conversion having an appropriate alignment for its type and value, and to the result ending up with an appropriate datum type and size. I would hate to have to draft the specification of this, but it badly needs doing. The current wording is probably adequate for the alignment, but is very unclear on when the result is a valid object.

The C standard should make a commitment (in Future Directions) that it will not define operators or other facilities that access arrays as a whole without resolving this problem. Without such a commitment, many implementors will be reluctant to trust the existing standard to remain stable, as a trivial change in this area could have immense consequences for implementations. It should also point out to the developers of extensions or other standards based on C that this area is a minefield. In particular, it should say that it is essential that the extension or standard specify whether arrays will be treated as single objects, as their component elements, or as arrays of bytes in any particular context, as the C language allows any of those options.

The restrict Qualifier
----------------------

This issue was raised on the reflector with the previous specification of restrict, but got dropped because it was not clearly applicable to the new one. It concerns the question of whether the objects affected by the restrict qualifier are considered to have as their type the base type of the restrict qualified pointer, or that of the lvalue used for the actual access.

    typedef struct {int a; double b; int c;} COMPOSITE;

    COMPOSITE composite;

    void copy (COMPOSITE * restrict A, COMPOSITE * restrict B) {
        A->a = B->c;
    }

    void scatter (double * restrict A, double * restrict B) {
        int i;
        for (i = 0; i < 100; i += 2) A[i] = B[i] = 0.0;
    }

    int main (void) {
        double array[200];
        copy(&composite,&composite);
        scatter(array,&array[1]);
        return 0;
    }

Now, if the objects being referred to in 6.7.3.1 are the access (lvalue) types, there is nothing wrong with either of these; but, if they are considered relative to the base type of the restrict qualified pointers, both function calls are undefined. As I understand the wording, the former is the correct interpretation, though it is not the best specification for SMP optimisation.

My belief is that this does not need attention if the previous queries are resolved and clarified in the directions described. If, however, they are resolved in any other way, then it needs reconsidering in the light of that resolution.

Effective Type
--------------

C99 introduced the concept of effective type (6.5 paragraph 6), but it has had the effect of making a confusing situation totally baffling. This is because it has introduced a new category of types, it has invented new terminology without defining it, its precise intent is most unclear, and it has not specified its effect on the library.
The first aspect was mentioned above. For the second, 6.5 paragraph 6 uses the term "declared type" of an object, a term used in only five other places, and then in contexts that make its meaning clear. In all of those cases, the context is discussing an aspect of a particular declaration or lvalue, and the term is a reference back to that declaration. But that is not the case in this section, and it is clear that the term is being used with a different meaning. It clearly and repeatedly distinguishes the type of the lvalue used for access from the declared type, so it cannot be that; but what does it mean?

The third question is related, in the sense that knowing what that term means in this context might enable one to deduce the intent of this section, and knowing the intent of this section might enable one to deduce the meaning of that term. Consider:

    #include <stdlib.h>
    #include <string.h>

    typedef struct {double a;} A;
    typedef struct {A b;} B;

    void fred (A *a) {
        double *p = (double *)a;
        void *q = malloc(sizeof(B));
        memcpy(q,(char *)p,sizeof(B));
    }

    int main (void) {
        B b[1000];
        fred((A *)b);
        return 0;
    }

After the call to memcpy, is the effective type of the object pointed to by q an array of B, A, double or char? In other words, which of those is the "declared type" and why? It is probable that it is not an array of char, because of the lvalue versus declared type distinction. But there is no obvious reason to choose any one of B, A or double as the preferred interpretation.

This is not the only problem with this section. Another very nasty one relates to partial copying, which is extremely common in many C programs. Consider:

    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct {double a; int b;} C;

    int main (void) {
        C c = {0.0,0};
        void *p = malloc(sizeof(int));
        memcpy(p,(void *)(((char *)&c)+offsetof(C,b)),sizeof(int));
        return 0;
    }

The wording of 6.5 paragraph 6 would seem to imply that the effective type of the object pointed to by p becomes C, which means that the object is necessarily invalid, as there is insufficient space for a C datum. But surely it cannot be the intent that the effective type of the object pointed to by p is incompatible with its datum type, which is most definitely int in this case? If it is, this is the most serious incompatibility between C99 and C90 yet reported, and will break a huge number of reasonable programs.

For the fourth aspect, consider:

    #include <stdlib.h>

    int compare (const void *A, const void *B) {
        double x = *(const double *)A, y = *(const double *)B;
        return (x > y)-(x < y);
    }

    int main (void) {
        double array[100] = {0.0};
        void *p;
        p = bsearch(&array[0],array,100,sizeof(double),compare);
        return 0;
    }

What is the effective type of the object pointed to by p? It is pretty clearly double, an array of double or no effective type at all. But which, and why?

There are probably other problems, but there is little point in giving yet more examples. What is clear is that I (and some other people I have spoken to) cannot work out what this section is attempting to specify, and it desperately needs clarification. When it is clarified, it may well need further work, but the first step is to clarify its intent. If this turns out to be infeasible, then the concept of effective type should be scrapped until and unless someone can come up with an adequate description.

What Size is an Array?
----------------------

In this section, the question being asked is how large the array object that corresponds to a pointer value is. Please ignore any assumptions that could be derived from the fact that two constructions are in the same compilation unit, as they could easily be separated.
For the purposes of these examples, let us assume no padding and that all structures have the same alignment.

    typedef struct {double z[5];} small;
    typedef struct {double z[7];} large;

    double *pp;
    small *qq;

    void ONE (double *a) {a[8] = 1;}
    void TWO (double a[5]) {ONE((double *)a);}
    void THREE (double a[5]) {a[8] = 1;}
    void FOUR (double a[5]) {if (a == pp) ONE((double *)a);}
    void FIVE (double a[5]) {if (a == pp) ONE((double *)pp);}

    void SIX (large *a) {a->z[6] = 1;}
    void SEVEN (small *a) {SIX((large *)a);}
    void EIGHT (large *a) {SEVEN((small *)a);}
    void NINE (large *a) {if ((char *)a == (char *)qq) SEVEN((small *)a);}
    void TEN (large *a) {if ((char *)a == (char *)qq) SEVEN((small *)qq);}

    void ELEVEN (small *a, int n) {
        int i;
        large *b;
        for (i = 0; i < n; ++i) {b = (large *)a; a = (small *)b;}
        a->z[3] = 1;
    }

    int main (void) {
        double p[10];
        small q[2];
        ONE(p); TWO(p); THREE(p);
        pp = p;
        FOUR(p); FIVE(p);
        SIX((large *)q); SEVEN(q); EIGHT((large *)q);
        qq = q;
        NINE((large *)q); TEN((large *)q);
        ELEVEN(q,1); ELEVEN(q,2);
        return 0;
    }

Now, which of the first five calls are conforming? There are obviously plausible interpretations of the standard that allow all of them, all of them except THREE, or only ONE and FIVE. But consider TWO: exactly WHICH construction breaks any wording in the standard? There is clearly nothing in either ONE or TWO on their own that is not conforming, and the standard has no explicit concept of a value having a 'history' that affects its semantics. And is the intent REALLY to distinguish FOUR and FIVE?

The second set of five calls is mainly to show that the effect does not depend on array parameter adjustment. Let us assume that the introduction of effective type does not exclude SIX; if it does, then C99 has broken a huge number of important existing codes! The key point is that '(large *)q' points to an array object of length 1; hence, if the history matters, so does '(small *)(large *)q'; which in turn means that '(large *)(small *)(large *)q' is a zero-length array, and so the access is illegal.

The calls to ELEVEN are to show that this is not solely a static problem but, in some cases and with some interpretations, depends on the execution of the program.

The last time this issue was raised, several people said that the standard was clear, but did not agree on exactly which examples were conforming. Other people felt that they knew what was intended, but doubted that the standard made it entirely clear. And yet others were certain that the standard was unclear or even inconsistent. My view is that there are three possible approaches to consistency:

1) The original K&R unchecked approach. All pointer values are simply addresses and, if they are object pointers, that object is the largest array of their base type that will fit within the originally allocated object; i.e. the object as originally defined, returned from malloc etc. or returned from some specified library call (like getenv). I believe that this is the approach being taken by the gcc checked pointer project (on grounds of practicality), even though they know that it does not match the requirements of the C standard.

If this approach is taken, then the ONLY subsequent constraints on the size of the array object pointed to are those imposed by the restrict qualifier and the static qualifier in array parameters. I favour this interpretation, as it is most consistent with traditional expectations and existing codes, though it is the worst approach for SMP optimisation.
If this were done, there is a case for rephrasing the static qualifier in array parameters to mean precisely that number of elements (i.e. to adopt traditional Fortran semantics in that case, and that case alone).

2) The approach that only visible information is relevant. The first thing to do is to specify what 'visible' means in this context, and that is not an easy thing to do. One obvious rule is to say that only the type actually used in an array subscription is relevant (i.e. the lvalue type). As pointed out before, this still means that the declared size of array parameters (without the static qualifier) is ignored, because their type has already been adjusted to a pointer type. If the declared size of array parameters should be significant, then the wording of array parameter adjustment will need considerable rethinking, because it will introduce yet another category of types associated with an lvalue. This is not pretty, and still allows:

    void THREE (double a[5]) {double *b = &a[5]; b[3] = 1;}

To lock that out needs a more complex definition of 'visible', and I doubt that it will be easy to find a consistent one, short of the next approach.

3) The approach that pointer values have a history. This is essentially approach (2) applied dynamically as well as statically, and taken to the limit. Every time that a pointer is cast to a new type, implicitly or explicitly, the object that it refers to is reduced in size to a whole number of elements. And every time that it is passed as a complete array parameter, it is reduced to the size of the declared array (before adjustment). Thus the size associated with a pointer value can decrease but never increase.

However, this still leaves the question of whether passing a pointer through a library function like memcpy clears its history, in the same way that memchr can be used to clear the const qualifier from a pointer. And, of course, it says nothing about the properties of pointers created from integers. This is clearly the best approach for SMP optimisation, but it would break a number of important existing codes in diabolically obscure ways, though it is unclear how many of those are already not conforming. Codes which I am pretty certain would fall foul of this include most of the BSD string handling and much of the X11 system.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email: nmm1@cam.ac.uk
Tel.: +44 1223 334761    Fax: +44 1223 334679