We are currently formalizing our escape analysis algorithm for trace-trees with the help of Christian Stork. As part of this work Chris observed that we are overly pessimistic when deciding which loads we can eliminate through load propagation. So far we first ran an escape analysis to find out which allocations are contained (are not leaked), and then performed load propagation only for loads from these local allocations.

This is sub-optimal for two reasons. First of all, the escape analysis pass has to detect two scenarios that permit values to escape our loop scope. On the one hand we have to track stores to non-captured memory (memory escape), and on the other hand we have to flag values escaping into future loop iterations (loop escape). Both scenarios cause the allocation to be flagged as escaped. However, only the memory escape situation actually inhibits load propagation. While loop-escaping allocations cannot be optimized through allocation hoisting, there is no reason not to perform load propagation on them.

Secondly, and independently of the above observation, there is also no need to stop load propagation altogether just because a reference escapes to memory somewhere in the loop. Due to the tree shape of our intermediate representation we can keep performing load propagation until we see a reference escaping to memory, at which point we have to inhibit any further load propagation for this reference and for any reference stored in the associated object (and, recursively, all references stored there). This essentially kills load propagation on these references for the rest of this trace, but any trace “higher up” in the tree is not affected.

It's important to kill not only the current reference, but any other allocation that can escape through it as well. To do this, every time we see a load we explicitly check whether the base reference is another load and follow this chain of loads all the way to the source. If any of these hops is an escaped object, we cannot perform load propagation. Previously we were able to cache this lookup. The slightly worse performance is well worth it in this case though.

In addition to being more precise, the new load propagation pass is also no longer dependent on escape analysis, so we can move it to a much earlier point in the compilation pipeline. By the time escape analysis runs, almost all loads have already been eliminated, which also improves the quality of the escape analysis results.
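
To make the idea concrete, here is a minimal sketch in Python of load propagation along a single trace. It is an illustration only, not the actual implementation: the `Ins` class, the op names (`alloc`, `param`, `const`, `load`, `store`) and the `propagate_loads` function are all hypothetical, and the sketch assumes that anything not allocated on the trace (a `param`) is non-captured memory. Loop escape is deliberately not modeled, since it does not inhibit load propagation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so Ins can be a dict key
class Ins:
    op: str                        # "alloc", "param", "const", "load" or "store" (hypothetical ops)
    base: Optional["Ins"] = None   # object that is read or written
    field: Optional[str] = None
    value: Optional["Ins"] = None  # value being stored (stores only)


def propagate_loads(trace):
    """Return {load: value} for every load on the trace that can be eliminated."""
    stores = {}      # (object, field) -> last value stored on this trace
    escaped = set()  # allocations that have been stored into non-captured memory
    forwarded = {}   # load -> value that replaces it

    def resolve(ins):
        # A load that was already propagated stands for the value it was replaced with.
        return forwarded.get(ins, ins)

    def mark_escaped(obj):
        # Kill the object and, recursively, every allocation stored in its fields.
        if obj.op != "alloc" or obj in escaped:
            return
        escaped.add(obj)
        for (base, _field), value in stores.items():
            if base is obj:
                mark_escaped(value)

    def chain_is_local(ins):
        # Follow the chain of base references through loads all the way to the source.
        # Every hop must resolve to an allocation that has not escaped to memory.
        while ins is not None:
            obj = resolve(ins)
            if obj.op != "alloc" or obj in escaped:
                return False
            ins = ins.base if ins.op == "load" else None
        return True

    for ins in trace:
        if ins.op == "store":
            if chain_is_local(ins.base):
                stores[(resolve(ins.base), ins.field)] = resolve(ins.value)
            else:
                # Store into non-captured memory: the value escapes from here on.
                mark_escaped(resolve(ins.value))
        elif ins.op == "load":
            key = (resolve(ins.base), ins.field)
            if chain_is_local(ins.base) and key in stores:
                forwarded[ins] = stores[key]
    return forwarded


# Example trace: loads before the escaping store are propagated, loads after it are not.
a, b, g, c = Ins("alloc"), Ins("alloc"), Ins("param"), Ins("const")
s1 = Ins("store", base=a, field="x", value=c)
s2 = Ins("store", base=a, field="inner", value=b)
s3 = Ins("store", base=b, field="y", value=c)
l1 = Ins("load", base=a, field="inner")          # can be replaced by b
l2 = Ins("load", base=l1, field="y")             # chain l1 -> a, can be replaced by c
s4 = Ins("store", base=g, field="out", value=a)  # a escapes to memory, and b with it
l3 = Ins("load", base=a, field="x")              # a has escaped: not propagated
l4 = Ins("load", base=l1, field="y")             # b escaped through a: not propagated
fwd = propagate_loads([a, b, g, c, s1, s2, s3, l1, l2, s4, l3, l4])
```

In this sketch the escaped set only grows as we walk down the trace, which mirrors the point above: everything before the escaping store still benefits from load propagation. A tree of traces would keep a separate copy of this state per branch, so that an escape in one trace leaves the traces higher up in the tree unaffected.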