JTC1/SC22/WG14
N791
Solving the struct hack problem
Clive D.W. Feather
[email protected]
1997-10-22
Abstract
========
Several DRs have attempted to address the issue of the "struct hack". This
paper proposes an approach to making the technique available while avoiding
most of the problems of current practice.
Discussion
==========
The "struct hack" is a technique for using a dynamically sized structure.
A structure type is declared like this:
struct hack
{
    size_t n_elements;
    int data [1];
};
space is then malloced:
size_t n;
/* ... */
struct hack *p;
p = malloc (sizeof (struct hack) + sizeof (int) * (n - 1));
p->n_elements = n;
and the entire space is used:
for (i = 0; i < p->n_elements; i++)
    p->data [i] = 0;
The problem is that accesses to p->data [i] for i > 0 are undefined behavior,
because each such access goes through a pointer (p->data + i) that lies beyond
the last element of the declared one-element array. To quote the DR response
(slightly modified):
Subclause 6.3.2.1 describes limitations on pointer arithmetic, in
connection with array subscripting (see also subclause 6.3.6).
Basically, it permits an implementation to tailor how it represents
pointers to the size of the objects they point at. Thus, the
expression p->data[5] may fail to designate the expected [object],
even though the malloc call ensures that the [object] is present.
The idiom, while common, is not strictly conforming.
This paper proposes a technique, apparently already supported by at least
one implementation, of allowing the structure to be declared as:
struct hack
{
    size_t n_elements;
    int data [];
};
and then explicitly permitting the access to any element of the array that
is within the bounds of the malloced space.
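As a rough illustration of how the proposed declaration might be used, the
following sketch allocates a structure with room for n elements and zeroes
them. The helper name hack_new is hypothetical and not part of this proposal,
and the sketch assumes an implementation that accepts the incomplete array
member. Note that no "n - 1" adjustment is needed, because the array no longer
contributes an element to sizeof (struct hack).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct hack
{
    size_t n_elements;
    int data [];            /* proposed form: incomplete array as last member */
};

/* Hypothetical helper (an assumption for illustration, not proposal text):
   allocate a struct hack with room for n elements and zero them.
   Returns NULL if the size would overflow or malloc fails. */
struct hack *hack_new (size_t n)
{
    struct hack *p;
    size_t i;

    /* Reject requests whose size computation would overflow size_t. */
    if (n > (SIZE_MAX - sizeof (struct hack)) / sizeof (int))
        return NULL;

    p = malloc (sizeof (struct hack) + sizeof (int) * n);
    if (p == NULL)
        return NULL;

    p->n_elements = n;
    for (i = 0; i < p->n_elements; i++)
        p->data [i] = 0;    /* inside the malloced space */
    return p;
}
```

A caller would release the result with free () as usual, since the whole
structure comes from a single malloc call.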
Proposal
========
[References are to draft 11 pre 3.]
In subclause 6.5.2.1 (Structure and union specifiers), paragraph 2, change:
A structure or union shall not contain a member with incomplete or
function type.
to:
A structure or union shall not contain a member with incomplete or
function type, except that the last element of a structure may have
incomplete array type.
Add a new paragraph at the end of the semantics:
As a special case, the last element of a structure may be an incomplete
array type. This is called a /flexible array member/, and the size of
the structure shall be equal to the offset of the last element of an
otherwise identical structure that replaces the flexible array member
with an array of one element. When an lvalue whose type is a structure
with a flexible array member is used to access an object, it behaves as
if that member were replaced by the longest array that would not make
the structure larger than the object being accessed. If this array
would have no elements, then it behaves as if there was one element,
but the behavior is undefined if any attempt is made to access that
element.
and add an example:
Example:
After the declarations:
struct s { int n; double d []; };
struct ss { int n; double d [1]; };
the three expressions:
sizeof (struct s)
offsetof (struct s, d)
offsetof (struct ss, d)
have the same value. The structure /struct s/ has a flexible array
member /d/.
If /sizeof (double)/ is 8, then after the following code is executed:
struct s *s1;
struct s *s2;
s1 = malloc (sizeof (struct s) + 64);
s2 = malloc (sizeof (struct s) + 46);
and assuming that the calls to /malloc/ succeed, /s1/ and /s2/ behave
as if they had been declared as:
struct { int n; double d [8]; } *s1;
struct { int n; double d [5]; } *s2;
Following the further successful assignments:
s1 = malloc (sizeof (struct s) + 10);
s2 = malloc (sizeof (struct s) + 6);
they then behave as if they had been declared as:
struct { int n; double d [1]; } *s1, *s2;
and:
double *dp;
dp = &(s1->d[0]); // Permitted
*dp = 42; // Permitted
dp = &(s2->d[0]); // Permitted
*dp = 42; // Undefined behavior
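The arithmetic behind the example can be sketched as follows. The names
fam_capacity and s_capacity are illustrative assumptions, not text from the
proposal. They compute the length of "the longest array that would not make
the structure larger than the object being accessed".

```c
#include <stddef.h>

struct s  { int n; double d []; };
struct ss { int n; double d [1]; };

/* Hypothetical helper: number of flexible-array elements that fit in an
   object of object_size bytes, given the offset of the array member and
   the size of one element.  A result of 0 corresponds to the case where
   the member behaves as one element that must not be accessed. */
size_t fam_capacity (size_t object_size, size_t array_offset, size_t elem_size)
{
    if (elem_size == 0 || object_size < array_offset)
        return 0;
    return (object_size - array_offset) / elem_size;
}

/* Capacity of d for an allocation of sizeof (struct s) + extra bytes. */
size_t s_capacity (size_t extra)
{
    return fam_capacity (sizeof (struct s) + extra,
                         offsetof (struct s, d), sizeof (double));
}
```

On an implementation where sizeof (double) is 8 and sizeof (struct s) equals
offsetof (struct s, d), as this proposal requires, s_capacity (64),
s_capacity (46), s_capacity (10) and s_capacity (6) would give 8, 5, 1 and 0
respectively, matching the declarations shown in the example.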