SC22/WG14 N791

Solving the struct hack problem

Clive D.W. Feather
clive@demon.net
1997-10-22


Abstract
========

Several DRs have attempted to address the issue of the "struct hack". This
paper proposes an approach to making the technique available while avoiding
most of the problems of current practice.


Discussion
==========

The "struct hack" is a technique for using a dynamically sized structure.
A structure type is declared like this:

    struct hack
    {
        size_t n_elements;
        int data [1];
    };

space is then malloced:

    size_t n;
    /* ... */
    struct hack *p;
    p = malloc (sizeof (struct hack) + sizeof (int) * (n - 1));
    p->n_elements = n;

and the entire space is used:

    for (i = 0; i < p->n_elements; i++)
        p->data [i] = 0;

The problem is that accesses to p->data [i] for i > 0 are undefined
behavior, because a pointer (p->data + i) to beyond the end of the array is
being used. To quote the DR response (slightly modified):

    Subclause 6.3.2.1 describes limitations on pointer arithmetic, in
    connection with array subscripting (see also subclause 6.3.6).
    Basically, it permits an implementation to tailor how it represents
    pointers to the size of the objects they point at. Thus, the
    expression p->data[5] may fail to designate the expected [object],
    even though the malloc call ensures that the [object] is present.
    The idiom, while common, is not strictly conforming.

This paper adopts a technique, apparently already supported by at least one
implementation, of allowing the structure to be declared as:

    struct hack
    {
        size_t n_elements;
        int data [];
    };

and then explicitly permitting the access to any element of the array that
is within the bounds of the malloced space.


Proposal
========

[References are to draft 11 pre 3.]

In subclause 6.5.2.1 (Structure and union specifiers), paragraph 2, change:

    A structure or union shall not contain a member with incomplete or
    function type.
to:

    A structure or union shall not contain a member with incomplete or
    function type, except that the last element of a structure may have
    incomplete array type.

add a new paragraph at the end of the semantics:

    As a special case, the last element of a structure may be an
    incomplete array type. This is called a /flexible array member/, and
    the size of the structure shall be equal to the offset of the last
    element of an otherwise identical structure that replaces the flexible
    array member with an array of one element. When an lvalue whose type
    is a structure with a flexible array member is used to access an
    object, it behaves as if that member were replaced by the longest
    array that would not make the structure larger than the object being
    accessed. If this array would have no elements, then it behaves as if
    there was one element, but the behavior is undefined if any attempt is
    made to access that element.

and add an example:

    Example:

    After the declarations:

        struct s { int n; double d []; };
        struct ss { int n; double d [1]; };

    the three expressions:

        sizeof (struct s)
        offsetof (struct s, d)
        offsetof (struct ss, d)

    have the same value. The structure /struct s/ has a flexible array
    member /d/.

    If /sizeof (double)/ is 8, then after the following code is executed:

        struct s *s1;
        struct s *s2;
        s1 = malloc (sizeof (struct s) + 64);
        s2 = malloc (sizeof (struct s) + 46);

    and assuming that the calls to /malloc/ succeed, /s1/ and /s2/ behave
    as if they had been declared as:

        struct { int n; double d [8]; } *s1;
        struct { int n; double d [5]; } *s2;

    Following the further successful assignments:

        s1 = malloc (sizeof (struct s) + 10);
        s2 = malloc (sizeof (struct s) + 6);

    they then behave as if they had been declared as:

        struct { int n; double d [1]; } *s1, *s2;

    and:

        double *dp;
        dp = &(s1->d[0]);    // Permitted
        *dp = 42;            // Permitted
        dp = &(s2->d[0]);    // Permitted
        *dp = 42;            // Undefined behavior
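
As an illustration only, the following sketch shows how the struct hack
from the Discussion might be written against the proposed flexible array
member. It assumes a compiler that accepts the `int data []` declaration
(the form was later standardized in C99). The helper names hack_new,
hack_set and hack_sum are hypothetical and are not part of this proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* The structure from the Discussion, declared with a flexible array member. */
struct hack {
    size_t n_elements;
    int data[];
};

/* Hypothetical helper: allocate a struct hack with room for n ints, all
   set to zero. Returns NULL on size overflow or allocation failure. */
struct hack *hack_new(size_t n)
{
    struct hack *p;

    if (n > (SIZE_MAX - sizeof (struct hack)) / sizeof (int))
        return NULL;

    /* No "n - 1" adjustment: the flexible member adds no elements
       of its own to sizeof (struct hack). */
    p = malloc(sizeof (struct hack) + sizeof (int) * n);
    if (p == NULL)
        return NULL;

    /* Store the count only after the allocation has succeeded. */
    p->n_elements = n;
    for (size_t i = 0; i < n; i++)
        p->data[i] = 0;
    return p;
}

/* Hypothetical helper: store v at index i, which must lie within the
   elements that were allocated. */
void hack_set(struct hack *p, size_t i, int v)
{
    assert(i < p->n_elements);
    p->data[i] = v;
}

/* Hypothetical helper: sum every allocated element. */
long hack_sum(const struct hack *p)
{
    long total = 0;

    for (size_t i = 0; i < p->n_elements; i++)
        total += p->data[i];
    return total;
}
```

A caller would obtain an object with hack_new (n), use only indices below
p->n_elements, and release it with free (p). Accesses at or beyond
n_elements remain outside the malloced space and are not made valid by
this proposal.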