Doc. No.:	WG14/N1528
Date:	2010-10-27
Reply to:	Hans-J. Boehm
Phone:	+1-650-857-3406
Email:	Hans.Boehm@hp.com

N1528: Why undefined behavior for infinite loops?

This is a response to N1509: Optimizing away infinite loops. My goal is to explain the rationale behind the current wording.

The question of whether or not side-effect-free infinite loops have defined semantics impacts primarily compiler back-ends. As we show below, guaranteeing stronger properties for infinite loops, as suggested in N1509, inhibits certain optimizations. Since compiler back-ends are typically shared between C and C++ compilers, it appears most important that WG14 and WG21 agree on whatever solution is adopted. The alternatives would be special treatment of the two languages by the back-end, or disabling optimizations when processing C code. Neither appears at all desirable.

On a purely technical basis, there are solid arguments both for the current wording, and for the N1509 position. We hope that this document, together with N1509 present those properly.

The deciding factor in the WG21 discussions was the realization that the current wording is in fact much more of a clarification of the existing standard than a change from it. If the N1509 wording were adopted, many existing compilers would be rendered nonconforming. And a reasonable argument can be made that those compilers currently conform to existing C and C++ standards.

The current wording, in both C and C++ drafts, explicitly allows the behavior exhibited by many current implementations, while giving the programmer explicit tools for disabling that behavior when it is not desired. It was felt that this was a clear improvement over the status quo, which leaves it ambiguous whether current behavior is allowed, and leaves it unclear, especially with the addition of thread support, exactly what is required to obtain desired behavior.

The technical issue: loop optimization

As N1509 correctly points out, the current draft essentially gives undefined behavior to infinite loops in 6.8.5p6. A major issue for doing so is that it allows code to move across a potentially non-terminating loop. For example, assume we have the following loops, where count and count2 are global variables (or have had their address taken), and p is a local variable, whose address has not been taken:

for (p = q; p != 0; p = p -> next) {
    ++count;
}
for (p = q; p != 0; p = p -> next) {
    ++count2;
}

Could these two loops be merged and replaced by the following loop?

for (p = q; p != 0; p = p -> next) {
        ++count;
        ++count2;
}

Without the special dispensation in 6.8.5p6 for infinite loops, this would be disallowed: If the first loop doesn't terminate because q points to a circular list, the original never writes to count2. Thus it could be run in parallel with another thread that accesses or updates count2. This is no longer safe with the transformed version which does access count2 in spite of the infinite loop. Thus the transformation potentially introduces a data race.

In cases like this, it is very unlikely that a compiler would be able to prove loop termination; it would have to understand that q points to an acyclic list, which I believe is beyond the ability of most mainstream compilers, and often impossible without whole program information.

The restrictions imposed by non-terminating loops are a restriction on the optimization of terminating loops for which the compiler cannot prove termination, as well as on the optimization of actually non-terminating loops. The former are much more common than the latter, and often more interesting to optimize.

There are clearly also for-loops with an integer loop variable in which it would be difficult for a compiler to prove termination, and it would thus be difficult for the compiler to restructure loops without 6.8.5p6. Even something like

for (i = 1; i != 15; i += 2)

for (i = 1; i <= 10; i += j)

seems nontrivial to handle. (In the former case, some basic number theory is required to prove termination, in the latter case, we need to know something about the possible values of j to do so. Wrap-around for unsigned integers may complicate some of this reasoning further.)

This issue seems to apply to almost all loop restructuring transformations, including compiler parallelization and cache-optimization transformations, both of which are likely to gain in importance, and are already often important for numerical code. This appears likely to turn into a substantial cost for the benefit of being able to write infinite loops in the most natural way possible, especially since most of us rarely write intentionally infinite loops.

The status quo

I initially still had serious misgivings about allowing this kind of transformation. But in WG21 discussions, I was told that some compilers already remove empty loops without proving termination; they already effectively assume something like 6.8.5p6. Much later, John Regehr confirmed this in his blog. These observations already make it unsafe for portable programs to rely on the behavior of arbitrary infinite loops.

John argues that these compilers are violating the current specification, but that was unclear in the WG21 discussions, and it is still unclear to me now. Specifically 5.1.2.3p6 in the current C draft states, and I believe this is old text, which is similar to text in C++ standards and drafts:

The least requirements on a conforming implementation are:

Accesses to volatile objects are strictly according to the rules of the abstract machine.
At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.
This is the observable behavior of the program.

It is at best unclear whether the second bullet imposes any constraints on a non-terminating program, and whether the other two bullets are violated by removing an empty non-terminating loop.

In my mind, there is no controversy about the abstract semantics of a loop as quoted in N1509. The question is whether the current conformance requirements enforce this behavior for actual implementations.

Thus, in my view, the statement in 6.8.5p6 is really a clarification of a piece of the standard, rather than a change. It is a clear improvement, in that it both clarifies the status quo, and gives programmers an easy way to intentionally write an infinite loop, if that is what they intended. In light of this, it seemed difficult to me to argue that optimizations like the above should be disallowed.

Recommendation

The current statement in 6.8.5p6 is not perfect, in that it doesn't handle infinite tail call sequences. WG21 is considering improved wording that makes the treatment consistent. Hopefully WG14 will track any resulting changes.

Aside from such relatively minor wording issues, I believe that the current statement in 6.8.5p6 should remain as it is. As evidenced by the behavior of current compilers in this regard (see John Regehr's blog, cited above) it is unlikely that simply removing the paragraph would make life simpler for programmers; they would continue to be faced with the current inconsistent behavior and unclear specification.

A possible alternative would be to clarify 5.1.2.3 to make it unambiguously require that non-terminating loops must behave according to the abstract semantics. This would require significant changes to a number of existing compilers. Given the history in this area, I'm not sure such a change is technically justifiable. If WG14 decided to pursue such a change, this should immediately become a liaison issue with WG21, since divergence in this area would be very undesirable.