Defect Report #042

Submission Date: 10 Dec 92
Submittor: WG14
Source: X3J11/92-001 (Tom MacDonald)
Question 1
The description of memcpy in subclause 7.11.2.1 says:
void *memcpy(void *s1, const void *s2, size_t n);
The memcpy function copies n characters from the object pointed to by s2 to the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.
The definition of the term object in subclause 3.14 is:
object - A region of data storage in the execution environment, the contents of which can represent values. Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined. When referenced, an object may be interpreted as having a particular type...
Are the objects in the description of memcpy the largest objects into which the arguments can be construed as pointing?
In particular, is the behavior of the call of memcpy in Example 1 defined:
void f1(void) {
    extern char a[2][N];
    memcpy(a[1], a[0], N);
}

because the arguments point into the disjoint array objects, a[1] and a[0]? Or is the behavior undefined because the arguments both point into the same array object, a?
Response
From subclause 3.14, an object is ``a region of data storage ... Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined ...'' From subclause 7.11.1, ``the header <string.h> declares one type and several functions, and defines one macro useful for manipulating arrays of character type and other objects treated as arrays of character type.'' ``Various methods are used for determining the lengths of the arrays...'' From subclause 7.11.2.1, description of memcpy, ``if copying takes place between objects that overlap, the behavior is undefined.'' Therefore, the ``objects'' referred to by subclause 7.11.2.1 are exactly the regions of data storage pointed to by the pointers and dynamically determined to be of N bytes in length (i.e. treated as an array of N elements of character type).
  1. So, no, the objects are not ``the largest objects into which the arguments can be construed as pointing.''
  2. In Example 1, the call to memcpy has defined behavior.
  3. The behavior is defined because the pointers point into different (non-overlapping) objects.
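The reading given in this response can be sketched in C as a precondition check: the two "objects" are the N-byte regions starting at each pointer, and the call is defined when those regions share no byte. The sketch below is illustrative only. The helper name regions_overlap, the wrapper f1_checked, and the value chosen for N are assumptions that do not appear in the report, and comparing addresses through uintptr_t assumes a flat address space, which the C Standard does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N 8  /* assumed value; the report leaves N unspecified */

/* Hypothetical helper: returns nonzero if the n-byte regions starting at
   s1 and s2 share at least one byte. Converting to uintptr_t assumes a
   flat address space (an assumption, not a guarantee of the Standard). */
static int regions_overlap(const void *s1, const void *s2, size_t n)
{
    uintptr_t p = (uintptr_t)s1;
    uintptr_t q = (uintptr_t)s2;
    return p < q ? (q - p < n) : (p - q < n);
}

static char a[2][N];

/* Example 1 with the precondition made explicit. Returns 1 if the copy
   was performed and verified, 0 if the regions overlapped. */
static int f1_checked(void)
{
    memset(a[0], 'x', N);
    if (regions_overlap(a[1], a[0], N))
        return 0;               /* memcpy would be undefined here */
    memcpy(a[1], a[0], N);      /* regions a[1] and a[0] are disjoint */
    assert(memcmp(a[1], a[0], N) == 0);
    return 1;
}
```

Under these assumptions, f1_checked() performs the copy, because a[1] begins exactly N bytes after a[0], while a pair such as a[0] and a[0] + 1 with the same length would be reported as overlapping.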
Question 2
For the purposes of the description of memcpy, can a contiguous sequence of elements within an array be regarded as an object in its own right? If so, are the objects in the description of memcpy the smallest contiguous sequences of bytes that can be construed as the objects into which the arguments point?
In Example 2:
void f2(void) {
    extern char b[2*N];
    memcpy(b+N, b, N);
}

can each of the first and last half of array b be regarded as an object in its own right, so that the behavior of the call of memcpy is defined? (Although they are not declared as separate objects, each half does seem to satisfy the definition of object quoted above.) Or is the behavior undefined, since both arguments point into the same array object b?
In Example 3:
void f3(void) {
    void *p = malloc(2*N);  /* Allocate an object. */
    {
        char (*q)[N] = p;   /* The object pointed to by p may
                               be interpreted as having type
                               (char [2][N]) when referenced
                               through q. */
        /* ... */
        memcpy(q[1], q[0], N);
        /* ... */
    }
    {
        char *r = p;        /* The object pointed to by p may
                               be interpreted as having type
                               (char [2*N]) when referenced
                               through r. */
        /* ... */
        memcpy(r+N, r, N);
        /* ... */
    }
}

the types of the objects are inferred from the pointers, and the underlying storage is dynamically allocated. Is the behavior of each call of memcpy defined?
Since the relationship between the values of the arguments presented to memcpy is the same in all the above calls, it seems reasonable to expect that either all these calls of memcpy give defined behavior, or none do. But which is it?
Response
  1. Yes, for memcpy, a contiguous sequence of elements within an array can be regarded as an object in its own right.
  2. The objects are not the smallest contiguous sequence of bytes that can be construed; they are exactly the regions of data storage starting at the pointer and of N bytes in length.
  3. Yes, the non-overlapping halves of array b can be regarded as objects in their own right.
  4. The behavior (in Example 2) is defined.
  5. The definition of object is independent of the method of storage allocation. The array length is determined by ``various methods.'' So, yes, the behavior of each call of memcpy is well-defined.
  6. All of the calls of memcpy (in Example 3) give defined behavior.
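As a rough illustration of this response, the following sketch runs both copies from Example 3 on allocated storage and checks the results, then contrasts them with a copy whose N-byte regions do overlap. The function names f3_checked and shift_right_one and the value of N are assumptions made for the sketch. The use of memmove in the contrasting case reflects its specification for overlapping objects and is not something this report discusses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define N 8  /* assumed value; the report leaves N unspecified */

/* Runs both copies from Example 3 on allocated storage.
   Returns 1 if both copies produced the expected bytes,
   0 on a mismatch, and -1 if allocation failed. */
static int f3_checked(void)
{
    void *p = malloc(2 * N);
    int ok = 1;

    if (p == NULL)
        return -1;
    {
        char (*q)[N] = p;           /* view the storage as char [2][N] */
        memset(q[0], 'A', N);
        memcpy(q[1], q[0], N);      /* bytes [N, 2N) and [0, N): disjoint */
        ok = ok && memcmp(q[1], q[0], N) == 0;
    }
    {
        char *r = p;                /* view the storage as char [2*N] */
        memset(r, 'B', N);
        memcpy(r + N, r, N);        /* the same two regions as above */
        ok = ok && r[N] == 'B' && r[2 * N - 1] == 'B';
    }
    free(p);
    return ok;
}

/* Contrast: the N-byte regions at r and r + 1 share N - 1 bytes, so
   memcpy would be undefined; memmove is specified to handle overlap.
   The caller must supply at least N + 1 bytes at r. */
static void shift_right_one(char *r)
{
    memmove(r + 1, r, N);
}
```

Both calls in f3_checked pass the same pair of regions to memcpy, which is why the response treats them alike regardless of the type through which the storage is viewed.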
Question 3
Similar questions arise for the other library string handling functions that have undefined behavior when copying between overlapping objects. These include strcpy, strncpy, strcat, strncat, strxfrm, mbstowcs, wcstombs, strftime, vsprintf, sscanf, and sprintf. For these functions, however, the number of bytes referenced through each pointer depends, at least in part, upon the values stored in the bytes.
Consider a library function for which the number of bytes accessed or modified is affected by the values of the bytes. Is the object associated with each of its pointer arguments the smallest contiguous sequence of bytes actually accessed or modified through that pointer?
In Example 4:
void f4(void) {
    extern char b[2*N];
    strcpy(b+N, b);
}

is the behavior defined if N >> strlen(b) (that is, if N is much greater than strlen(b))?
In Example 5:
void f5(void) {
    extern char c[2*N];
    strcat(c+N, c);
}

is the behavior defined if both N >> strlen(c) and N >> strlen(c) + strlen(c+N)?
Response
Length is determined by ``various methods.'' For strings in which all elements are accessed, length is inferred by null-byte termination. For mbstowcs, wcstombs, strftime, vsprintf, sscanf, sprintf and all other similar functions, it was the intent of the C Standard that the rules in subclause 7.11.1 be applicable by extension (i.e., the objects and lengths are similarly dynamically determined). The behavior (in Examples 4 and 5) is defined.
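For the string functions, the dynamically determined lengths can be made concrete with a small sketch. It assumes that each half of the array holds a null-terminated string, and it uses hypothetical helper names (strcpy_halves_ok, strcat_halves_ok, f4_checked, f5_checked) and an arbitrary value of N, none of which come from the report. The conditions encode only that the bytes read through the source pointer end before the destination begins and that the bytes written stay inside the second half.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 16  /* assumed value; the report leaves N unspecified */

/* Example 4: strcpy(b+N, b) reads strlen(b) + 1 bytes starting at b and
   writes the same number starting at b + N. The regions are disjoint,
   and the write stays inside a char [2*N] array, when that count is at
   most N. */
static int strcpy_halves_ok(const char *b)
{
    return strlen(b) + 1 <= N;
}

/* Example 5: strcat(c+N, c) reads strlen(c) + 1 bytes starting at c.
   The destination string grows to strlen(c+N) + strlen(c) + 1 bytes
   starting at c + N. Assumes c + N also points to a terminated string. */
static int strcat_halves_ok(const char *c)
{
    size_t src = strlen(c) + 1;
    size_t dst = strlen(c + N) + strlen(c) + 1;
    return src <= N && dst <= N;
}

/* Checked variants: return 1 if the call was made, 0 if it was refused. */
static int f4_checked(char *b)
{
    if (!strcpy_halves_ok(b))
        return 0;
    strcpy(b + N, b);
    assert(strcmp(b + N, b) == 0);
    return 1;
}

static int f5_checked(char *c)
{
    if (!strcat_halves_ok(c))
        return 0;
    strcat(c + N, c);
    return 1;
}
```

When N is much greater than the string lengths involved, as Examples 4 and 5 stipulate, both checks succeed and the calls proceed.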