Doc. no.:	WG14/N1520
Date:	2010-10-08
Reply to:	Clark Nelson
Phone:	+1-503-712-8433
Email:	[email protected]

Fixing the rules for type-based aliasing

Introduction

The issues to be addressed and rationale for the proposed solution are discussed in the context of the C standard. However, the issues are equally relevant to C++. In particular, WG21's core issue 636 poses the same question as WG14's DR 236.

My original intention was that this document would contain equivalent wording for C++ as well, but I ran short of time. But I still intend to propose analogous changes to WG21.

The problem described in N1409

Richard Hansen pointed out a problem in the type-based aliasing rules, as follows:

My question concerns the phrasing of bullet 5 of 6.5p7 (aliasing as it applies to unions/aggregates). Unless my understanding of effective type is incorrect, it seems like the union/aggregate condition should apply to the effective type, not the lvalue type.

Here are some more details:

Take the following code snippet as an example:
union {int a; double b;} u;
u.a = 5;
From my understanding of the definition of effective type (6.5p6), the effective type of the object at location &u is union {int a; double b;}. The type of the lvalue expression that is accessing the object at &u (in the second line) is int.

From my understanding of the definition of compatible type (6.2.7), int is not compatible with union {int a; double b;}, so bullets 1 and 2 of 6.5p7 do not apply. int is not the signed or unsigned type of the union type, so bullets 3 and 4 do not apply. int is not a character type, so bullet 6 does not apply.

That leaves bullet 5. However, int is not an aggregate or union type, so that bullet also does not apply. That means that the above code violates the aliasing rule, which it obviously should not.

I believe that bullet 5 should be rephrased to indicate that if the effective type (not the lvalue type) is an aggregate or union type that contains a member with type compatible with the lvalue type, then the object may be accessed.

Effectively, what he points out is that the rules are asymmetrical with respect to struct/union membership. I have been aware of this situation, and considered it a (non-urgent) problem, for quite some time. A series of examples will better illustrate the problem. (These examples were originally presented at the Santa Cruz meeting.)

In my experience with questions about whether aliasing is valid based on type constraints, the question is invariably phrased in terms of loop invariance. Such examples bring the problem into extremely sharp focus.

Example 1:

void f1(int *pi, double *pd, double d)
{
	for (int i = 0; i < *pi; i++) {
		*pd++ = d;
	}
}

In f1, *pi can be assumed to be loop-invariant, because the type-based aliasing rules don't allow *pd to be an alias for *pi. Therefore, it is valid to transform the loop such that *pi is loaded only once, at the top of the loop.

Example 2:

struct S { int a, b; };
void f2(int *pi, struct S *ps, struct S s)
{
	for (int i = 0; i < *pi; i++) {
		*ps++ = s;
	}
}

In f2, *pi can not be assumed to be loop-invariant. *pi (having type int) can be accessed (presumably including modification) by an lvalue that has aggregate type including a member with type int. struct S is of course such a type, and the lvalue *ps has that type. Therefore, f2 could be called as follows:

struct S a[10] = { [9].b = 1000 };
f2(&a[9].b, a, (struct S){ 0, 0 });

Despite the fact that the initial value of *pi in the loop would be way too large for the array, this usage would nevertheless be safe; *pi would have to be reloaded for each iteration, and on the tenth iteration, the value of *pi would become zero, and the loop would terminate before the writing past the end of the array.

Example 3:

struct S { int a, b; };
void f3(int *pi, struct S *ps1, struct S const *ps2)
{
	for (*pi = 0; *pi < 10; ++*pi) {
		*ps1++ = *ps2;
	}
}

The question here is whether the object *ps2 may be accessed (and especially modified) by assigning to the lvalue *pi — and if so, whether the standard actually says so. It could be argued that this is not covered by the fifth bullet of 6.5p7, since *pi does not have aggregate type at all.

Perhaps the intention is that the question should be turned around: is it allowed to access the value of the object *pi by the lvalue *ps2. Obviously, this case would be covered by the fifth bullet.

All I can say about this interpretation is that it never occurred to me as a possibility until the Santa Cruz meeting, even though I've thought about these rules in considerable depth over the course of many years. Even if this case might be considered to be covered by the existing wording, I'd suggest that it might be worth looking for a less opaque formulation.

The originally proposed solution

In N1409, Richard Hansen also proposed to rewrite 6.5p7, as follows. (I have added markup to clarify and highlight his proposed changes.)

An object with effective type Q shall have its stored value accessed only by an lvalue expression ~~that has~~ of type T when one of the following ~~types~~ applies:

~~a type~~ T is compatible with ~~the effective type of the object~~ Q,

T is a qualified version of a type compatible with ~~the effective type of the object~~ Q,

~~a type that~~ T is the signed or unsigned type corresponding to ~~the effective type of the object~~ Q,

~~a type that~~ T is the signed or unsigned type corresponding to a qualified version of ~~the effective type of the object~~ Q,

T is a character type, or

Q is an aggregate or union type that includes ~~one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),~~ a member of type R that satisfies one of these conditions (substituting R for Q, recursively if necessary). or

~~a character type.~~

This formulation would make a significant, and presumably unintended (and probably undesired) technical change: it would make any struct/union type that (recursively) includes a member with character type a valid type for aliasing any object. (The previous formulation excluded that by mentioning character types last, and by using the word "aforementioned" in the penultimate bullet.)

The commonality with DR 236

While N1409 concerns itself with aggregate and union types in general, DR 236 is concerned specifically with the overlapping nature of union members.

The history of this DR is long and tortuous. It was originally submitted in 2000-10, along with a proposal to apply the concept of effective type to unions. The minutes of the 2001-04 (Copenhagen) meeting say, in part, "The committee does not think that the suggested wording is acceptable." No specifics are given concerning the reason the proposed wording was considered unacceptable. In particular, there is no hint whether the approach of applying effective type to unions was considered acceptable.

The DR was then the subject of apparently lively debate at each subsequent meeting through 2003-10 (Kona). From that meeting, the minutes record substantive consensus, along with an action item for drafting of words describing that consensus.

In 2005-04 (Lillehammer), the committee gave up waiting for the words to materialize, instead deciding simply to state the committee's intention in the DR response, without worrying about whether that intention was accurately described by the standard.

Now that the release of a new revision of the standard is approaching, it is time again to seek a normative description of the committee's intent.

The committee's decision concerning DR 236 was that the form of an lvalue is significant in determining what operations on a union are valid. Accesses to a union member through a pointer to non-union type are not allowed to change the identity of the last-stored member of the union; changing the "effective type" of the union is allowed only where the appropriate union member's name actually appears in the expression effecting the change. For example:

union X { double a; int b; };
union X x, *px = &x;
int *pi        = &x.b;
double *pd     = &x.a;

If x currently contains a double, assigning to *pi is not a valid way to change that fact, whereas assigning to x.b or px->b would be; so would assigning to x or *px.

I believe that this implies that, when the presence of a union object is explicit in the form of an lvalue expression, an implementation is required to assume that more aliasing possibilities are valid than when no explicit union is present. *pi and *pd may be assumed not to be aliases for a single object, but any reference involving a union object (such as x, x.a, *px, or px->b) must be assumed to be a possible alias for an unknown object whose type appears as a member of the union, including *pi and *pd.

Discussion

From considering N1409 along with DR 236, it is now clear to me that there are at least three different questions concerning containment or aggregation that it is sometimes interesting to ask:

Does an lvalue expression imply/require the existence of a containing object?
Does an aggregate or union type contain a different type as a member?
Does an object appear as a subobject of another object?

For the sake of a resolution to DR 236, the standard must somehow take into account the first question. The existing aliasing rules are formulated solely in terms of the second question.

Concerning the third question, I think it's fairly obvious that accessing an aggregate object implies accessing all of its member objects (Rule A). For example:

struct { int a; double b; } x, y;
x = y;		// #1

Here the value computation of y implies the value computation of both y.a and y.b, and the assignment to x implies assignment to both x.a and x.b.

Similarly, in general, access to a subobject also constitutes access to the containing object (Rule B). Consider this as an immediate continuation of the previous example:

x = y;		// #2
y.a = 17;	// #3
x = y;		// #4

Assignment #2, immediately following assignment #1, could be optimized away as redundant, because the value assigned by #2 is the same as that assigned by #1. On the other hand, assignment #3, in addition to changing the value of y.a, also changes the value of y, so assignment #4 is not redundant with the previous assignments of y to x.

(It might be feared that defining "access" in this way would cause unnecessary data races, by expanding the set of conflicting accesses. But I don't believe that to be a problem, because the definition of "conflict", and therefore of "data race", is in terms of "memory location" (i.e. scalar object or sequence of bit-fields), not in terms of "object" (unqualified).)

Let's look back at the fifth bullet of 6.5p7 (presented here in context):

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:⁸⁷⁾

...

an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

...

I now believe that this bullet is an incomplete and unnecessary attempt to take into account the implications of subobject inclusion. It comes very close to being an adequate statement of Rule A. However, it doesn't cover Rule B at all. (This is effectively a restatement of the problem pointed out by Richard Hansen.)

Basically, if it is kept in mind that an lvalue expression (as mentioned in the introductory clause of 6.5p7) with aggregate or union type implies access of all of its member subobjects (recursively), and that accessing a subobject of an aggregate or union constitutes an access to the containing aggregate or union, it seems to me that the fifth bullet is not necessary, and could just be deleted.

My own inclination is to conclude that Rules A and B are implied by the existing definitions of structure and union type (see 6.2.5p20), and therefore do not need to be normatively stated. (Although I can see that additional explanation in the rationale might be useful.)

Concerning wording of a resolution to DR 236, I reiterate the original proposal to extend the concept of effective type to apply to unions. (I'm hoping that the problem was in the original wording, not in the approach.) C++ does not have the concept of "effective type", but instead has "dynamic type", which serves basically the same purpose (among others).

Proposed normative wording for C

Change 6.5p6:

The ~~effective~~ established type of an object for an access to its stored value is the declared type of the object, if any.⁸⁶⁾ If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the ~~effective~~ established type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the ~~effective~~ established type of the modified object for that access and for subsequent accesses that do not modify the value is the ~~effective~~ established type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the ~~effective~~ established type of the object is simply the type of the lvalue used for the access.

Insert a new paragraph before 6.5p7:

The effective type of an object whose established type is not a union type is its established type. The effective type of an object whose established type is a union type is the type of the union member last assigned a value. The effective type of a union shall be changed only by assignment to an lvalue involving the union's type, or by copying an object using memcpy, memmove, or as an array of character type.

The intention is that a member-selection expression that selects a member of a union does involve the union's type, but indirection of a pointer to a type that happens to be a member of the union does not involve the union's type. I don't know whether it is necessary or desirable to state this intention normatively, in a note, or in the rationale.

Change 6.5p7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:⁸⁷⁾

a type compatible with the effective type of the object,

a qualified version of a type compatible with the effective type of the object,

a type that is the signed or unsigned type corresponding to the effective type of the object,

a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object, or

~~an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or~~

a character type.