Document number: WG14 N866 (J11/99-001) Title: Proposals for restricted pointer issues Author: Bill Homer Date: 10-Jan-99 URL: http://reality.sgi.com/homer_craypark/c9x/n866.txt http://reality.sgi.com/homer_craypark/c9x/n866.html 1 ---------- Summary ------------------------------------------------- 2 3 This document addresses four issues with the restrict qualifier, 4 three from PC-UK0118, and one a side effect of the recent change 5 to the specification for realloc. 6 7 It supersedes the changes proposed in: 8 SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-4 9 and it supplies the details of the change proposed in: 10 SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-5 11 12 The primary goal of these changes is to limit undefined behavior 13 to those cases in which it promotes optimization. This allows a 14 programmer to use the restrict qualifier in more contexts, but 15 does not impose additional burdens on a translator. 16 17 What follows is a brief description of the issues, then the 18 proposed changes to the specification that resolve these issues, 19 and finally some examples that illustrate the effects of the 20 changes. 21 22 ---------- Description of the issues ------------------------------- 23 24 1. The FCD specification of restrict forbids aliasing of 25 unmodified objects. Doing so does not promote optimization, 26 and has other disadvantages, which are discussed in examples 27 A-E below. It is also contrary to the prior art in Fortran. 28 29 2. If a restricted pointer points to an object with const-qualified 30 type, the FCD specification allows casting away the const 31 qualifier, and modifying the object. Disallowing such 32 modifications promotes optimization as illustrated in example F 33 below. 34 35 3. The FCD specification does not address the effect of accessing 36 objects through pointers of various types, all based on one 37 restricted pointer, as in Example G below. In particular, 38 these objects are supposed to determine an array, but the 39 element type is not specified. 40 41 4. The specification of realloc now states that the old object is 42 freed, and a new object is allocated. The old and new objects 43 cannot, in general, be viewed as being members of an array of 44 such objects. With the FCD specification, this appears to 45 prohibit the use of the restrict qualifier for a pointer that 46 points to an object that is realloc'd. There are also related 47 issues for dynamically allocated linked lists. See examples 48 H-K below. 49 50 ---------- Proposed edits to FCD 6.7.3.1 --------------------------- 51 52 === Append to paragraph #1 53 the text marked with > at left: 54 55 [#1] Let D be a declaration of an ordinary identifier that provides a 56 means of designating an object P as a restrict-qualified pointer 57 > to objects of type T. 58 59 60 === Change paragraph #4 from 61 (text being changed marked with < at left): 62 63 [#4] < During each execution of B, let A be the array object that 64 < is determined dynamically by all references through pointer 65 < expressions based on P. Then all references to values of A 66 < shall be through pointer expressions based on P. Furthermore, 67 if P is assigned the value of a pointer expression E that is 68 based on another restricted pointer object P2, associated with 69 block B2, then either the execution of B2 shall begin before 70 the execution of B, or the execution of B2 shall end prior to 71 the assignment. If these requirements are not met, then the 72 behavior is undefined. 73 74 === to 75 (new text indicated by > at left): 76 77 [#4] > During each execution of B, let L be any lvalue that has &L based 78 > on P. If L is used to access the value of the object X that it 79 > designates, and X is also modified (by any means), then the 80 > following requirements apply. T shall not be const-qualified. 81 > Every other lvalue used to access the value of X shall also have 82 > its address based on P. Every access that modifies X shall be 83 > considered also to modify P, for the purposes of this section. 84 If P is assigned the value of a pointer expression E that is 85 based on another restricted pointer object P2, associated with 86 block B2, then either the execution of B2 shall begin before 87 the execution of B, or the execution of B2 shall end prior to 88 the assignment. If these requirements are not met, then the 89 behavior is undefined. 90 91 === Change example 1 from 92 (text being changed marked with < at left): 93 94 [#7] EXAMPLE 1 The file scope declarations 95 96 int * restrict a; 97 int * restrict b; 98 extern int c[]; 99 100 < assert that if an object is accessed using the value of one 101 < of a, b, or c, then it is never accessed using the value of 102 < either of the other two. 103 104 === to 105 (modified text indicated by > at left): 106 107 [#7] EXAMPLE 1 The file scope declarations 108 109 int * restrict a; 110 int * restrict b; 111 extern int c[]; 112 113 > assert that if an object is accessed using one of a, b, or c, 114 > and that object is modified anywhere in the program, 115 > then it is never accessed using either of the other two. 116 117 === Change example 3 from 118 (text being changed marked with < at left): 119 120 [#10] EXAMPLE 3 The function parameter declarations 121 122 < void h(int n, int * const restrict p, 123 < int * const q, int * const r) 124 { 125 int i; 126 for (i = 0; i < n; i++) 127 p[i] = q[i] + r[i]; 128 } 129 130 < show how const can be used in conjunction with restrict. 131 < The const qualifiers imply, without the need to examine the 132 < body of h, that q and r cannot become based on p. The fact 133 < that p is restrict-qualified therefore implies that an 134 < object accessed through p is never accessed through either 135 < of q or r. This is the precise assertion required to 136 < optimize the loop. Note that a call of the form h(100, a, 137 < b, b) has defined behavior, which would not be true if all 138 < three of p, q, and r were restrict-qualified. 139 140 === to 141 (modified text indicated by > at left): 142 143 [#10] EXAMPLE 3 The function parameter declarations 144 145 > void h(int n, int * restrict p, 146 > int * restrict q, int * restrict r) 147 { 148 int i; 149 for (i = 0; i < n; i++) 150 p[i] = q[i] + r[i]; 151 } 152 153 > illustrate how an unmodified object can be aliased through two 154 > restricted pointers. In particular, if a and b are disjoint 155 > arrays, a call of the form h(100, a, b, b) has defined behavior, 156 > because array b is not modified within function h. 157 158 ---------- End of proposed edits to FCD 6.7.3.1 -------------------- 159 160 The following examples are included here to illustrate the benefits 161 of the proposed changes, but are not themselves proposed as 162 additions to the standard. 163 164 --------- Examples for Issue 1 ------------------------------------- 165 166 Example A: restrict in standard library 167 168 extern in printf(const char * restrict format, ...); 169 170 printf("%s", "s"); 171 172 Under C89, a translator could "pool" unmodifiable string 173 literals so that "s" refers to a subobject of "%s". Such 174 pooling is not allowed by the restrict qualifier on the 175 first parameter of printf with the FCD specification of 176 restrict, but is allowed with the proposed specification 177 (because there are no requirements for unmodified objects). 178 179 Example B: restricted pointer structure members 180 181 typedef struct { int n; double * restrict v; } vector; 182 183 void addB(vector a, vector b, vector c) 184 { 185 int i; 186 for(i=0; in; i++) 209 p->v[i] += q->v[i-1]; 210 } 211 212 Calls of the form addC(&x, &x) should be undefined to promote 213 optimization of the loop, and they are undefined under both the 214 FCD and the proposed specifications. The analysis for the 215 proposed specification is as follows. Let X be p->v[i], and 216 let B be the block of addC. Then for a call addC(&x, &x) with 217 x.n>1, &(p->v[i]) is based on the restricted pointer p->v, and 218 so the modification of p->v[i] is considered also to modify 219 p->v. Recursively, &(p->v) is based on the restricted pointer 220 p, and the same object is also accessed through the lvalue q->v 221 (because p and q have the same value), but &(q->v) is not based 222 on p, so the behavior is undefined. 223 224 Example D: array arguments with unmodified overlap 225 226 void addD(int n, int * restrict x, int * restrict y) 227 { 228 int i; 229 for(i=0; ip += 1; 253 return *b->q; 254 } 255 256 In this example, even if *a and *b happen to refer to the same 257 object of type T, the members a->p and b->q will be distinct 258 restricted pointers. A translator can infer that *a->p and 259 *b->q are distinct objects, and so there is no potential 260 aliasing to inhibit optimization of the body of f(). 261 262 Note that this means that the restrict qualifiers in the 263 declarations of parameters a and b are not needed to promote 264 optimization of f, and so the question in this example is 265 whether they "do harm" by giving calls of the form f(x, x) 266 undefined behavior. The answer is clearly yes for the FCD 267 specification. 268 269 The answer is also yes for the specification proposed in 270 SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-4. In 271 particular, *a has property M because it contains a subobject, 272 (*a).p, or a->p, that is a restricted pointer with a target, 273 *a->p, that is modified. This requires that every lvalue used 274 to access *a must have its address based on a. But in a call 275 of the form f(x, x), *b will designate the same object as *a, 276 and b is not based on a. 277 278 In contrast, the answer is no for the proposed specification. 279 In particular, during an execution of f, the only object that 280 is actually modified is *a->p. Now &(*a->p) is a->p, a 281 restricted pointer, so a->p is also considered to be modified. 282 Recursively, &(a->p) is based on a, another restricted pointer. 283 Because no other lvalues are used to access either *a->p or 284 a->p, all the requirements of the proposed specification are 285 met (and they do not involve b). 286 287 For this example, the proposed specification can be said to 288 be more precise that the one based on property M because its 289 "bottom up" recursion better limits the extent of objects 290 considered to be modified as a side effect of modifying the 291 target of a restricted pointer. 292 293 --------- Examples for Issue 2 ------------------------------------- 294 295 Example F: asserting unmodifiability 296 297 extern void g(); 298 299 T f(const T * restrict p) 300 { 301 g(p); 302 return *p; 303 } 304 305 Under the proposed specification, but not under the FCD 306 specification, the use of const and restrict in this example 307 implies that function g only accesses the value of *p, and does 308 not modify it. On some architectures, this would allow the 309 load from *p to be initiated prior to the function call, so 310 that its value would be available sooner both within that call 311 and for the return value. The potential benefit could be large 312 if T is a large aggregate type. 313 314 --------- Example for Issue 3 -------------------------------------- 315 316 Example G: casting restricted pointer-to-double to pointer-to-char 317 318 The following example from PC-UK0118 illustrated uncertainty 319 about how to determine the type and extent of the array 320 accessed through a restricted pointer, under the FCD 321 specification. 322 323 void fred (double restrict *a, double restrict *b) { 324 /* Assume sizeof(double) > 1 */ 325 ((char *)b)[sizeof(double)+2] = 326 ((char *)a)[sizeof(double)+1]; 327 a[0] = b[2]; 328 } 329 330 double array[3]; 331 fred(array,array); 332 333 Under the proposed specification, the requirements implied by 334 the restrict qualifiers apply separately to each modified 335 object, and it follows that this example has defined behavior. 336 In particular, the 1-byte object ((char *)b)[sizeof(double)+2] 337 is accessed only through b, and the disjoint object a[0] is 338 accessed only through a. 339 340 Here it is assumed that the modification of the 1-byte object 341 ((char *)b)[sizeof(double)+2] is not considered to be a 342 modification of the containing object b[1]. By the way, the 343 same issue arises if the expressions appear as operands of 344 a comma-operator: 345 346 ((char *)b)[sizeof(double)+1]++, 347 ((char *)b)[sizeof(double)+2]++ 348 349 Thus the extent of the object modified by a particular 350 operation is an issue that is important for the correct 351 interpretation of the restrict qualifier, but is separate from 352 its specification. 353 354 --------- Examples for Issue 4 ------------------------------------- 355 356 Example H: realloc of a restricted pointer 357 358 void f(T * restrict q) { 359 T * restrict p = (T *)malloc(sizeof(T)); 360 p[0] = q[0]; 361 ... 362 p = (T *)realloc(p, 2*sizeof(T)); 363 p[1] = p[0] + q[1]; 364 ... 365 } 366 367 With recent changes to the specification of realloc, the two 368 instances of the lvalue p[0] refer to two different objects 369 (with the value copied from the old object to the new object by 370 realloc). Because there is no guarantee that there is an array 371 containing both these objects, this use of restrict does not 372 conform to the FCD specification. It does conform to the 373 proposed specification, which does not require the existence of 374 such an array. 375 376 Example I: disjoint linked lists 377 378 typedef struct item_t { 379 double d; 380 struct item_t * next; 381 } item; 382 383 void f(item * restrict list1, item * restrict list2) 384 { 385 while(list1 != 0 && list2 != 0) { 386 list1->d += list2->d; 387 list1 = list1->next; 388 list2 = list2->next; 389 } 390 } 391 392 This use of restrict does not conform to the FCD specification 393 unless each list is contained within an array. It does conform 394 to the proposed specification, if the items that are modified 395 through list1 are disjoint from the items that are accessed 396 through list2. This allows the translator greater flexibility 397 in scheduling the loads and stores of the d members of the 398 items. 399 400 Example J: disjoint linked lists, with a recursion 401 402 void g(item * list1, item * restrict list2) 403 { 404 if(list1 != 0) { 405 while(list1->next != 0 && list2 != 0) { 406 list1->next->d += list1->d + list2->d; 407 list1 = list1->next; 408 list2 = list2->next; 409 } 410 } 411 } 412 413 In this example, the modified objects, list1->next->d, 414 are accessed though the 'next' pointers in the first list. 415 The values of these next pointers are not based on either 416 list1 or list2. This implies that 417 1) list1 could not be restrict-qualified, because the modified 418 objects are also accessed through list1->d, and 419 &(list1->d) is based on list1. 420 2) list2->d is disjoint from the modified objects, because 421 &(list2->d) is based on the restrict-qualified pointer 422 list2. 423 424 Example K: even/odd index values 425 426 void f(int n, char * restrict p, char * restrict q) 427 { 428 int i; 429 for(i=0; i