We have added a load elimination pass to the compiler. It runs behind a barrier after the escape analysis pass, but the load elimination itself can be pipelined since it only considers one instruction at a time. We eliminate loads that read from captured objects. To do this, we track each base address and offset together with the reference that was written to that location. When we see a matching load, it is substituted with a direct link to the instruction that produced the reference written to that location. Since we are operating on a simple sequence of instructions, this takes linear time.
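As a rough illustration of the bookkeeping described above, here is a minimal Python sketch. It is not the actual pass: the `Instr` record, the `eliminate_loads` function, and the idea of naming instructions by their index in the trace are all assumptions made for the example. The sketch also assumes that a store with a non-constant offset invalidates everything known about that object, which the text above does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    """Hypothetical trace instruction; operands are indices of earlier instructions."""
    op: str                        # "store", "load", or any other opcode
    base: Optional[int] = None     # instruction that produced the base object
    offset: Optional[int] = None   # constant offset, or None if not constant
    value: Optional[int] = None    # for stores: instruction that produced the stored value


def eliminate_loads(trace, captured):
    """Single forward pass over the trace.

    `captured` is the set of instruction indices whose results are captured
    objects (as reported by escape analysis). Returns a dict mapping the index
    of each eliminated load to the instruction it is replaced with.
    """
    known = {}      # base -> {offset -> instruction that produced the stored value}
    replaced = {}   # eliminated load -> replacement instruction
    for i, ins in enumerate(trace):
        if ins.op not in ("store", "load"):
            continue
        # The base may itself be a load that was already eliminated.
        base = replaced.get(ins.base, ins.base)
        if base not in captured:
            continue
        if ins.op == "store":
            if ins.offset is None:
                # Assumption: a variable-offset store may overwrite any slot,
                # so forget everything we know about this object.
                known.pop(base, None)
            else:
                value = replaced.get(ins.value, ins.value)
                known.setdefault(base, {})[ins.offset] = value
        elif ins.offset is not None:
            target = known.get(base, {}).get(ins.offset)
            if target is not None:
                replaced[i] = target   # link the load directly to the producer
    return replaced
```

Each instruction is visited once and only dictionary lookups are performed per instruction, which is where the linear running time comes from. Loads without a matching store are simply left in place in this sketch.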
We only eliminate loads with a constant offset. This does include arrays accessed with a constant index. Also, we currently do not eliminate loads that have no corresponding store. These read uninitialized fields and thus always return 0/0.0/null/false, which of course can be optimized into a constant. This will be added soon since it’s one of the prerequisites for hoisting the allocation itself out of the loop (without having to manually clear the object during every iteration to ensure that reading uninitialized fields doesn’t return results from a previous loop iteration).
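To make the planned extension concrete, here is a small Python sketch of how loads from never-written fields could be folded into constants. This is speculative and not part of the current pass. The tuple-based trace format, the `fold_uninitialized_loads` function, the field type names, and the `fresh` set (indices of allocations performed inside the trace whose memory is known to start out zeroed) are all assumptions made for the example.

```python
# Default value per field type, matching 0/0.0/null/false.
DEFAULTS = {"int": 0, "double": 0.0, "ref": None, "bool": False}


def fold_uninitialized_loads(trace, fresh):
    """Map each load of a never-written field to its default constant.

    `trace` is a list of (op, base, offset, field_type) tuples, where `base`
    is the index of the allocating instruction and `offset` is None when it
    is not a compile-time constant. `fresh` holds the indices of allocations
    that are assumed to be zero-initialized.
    """
    written = set()   # (base, offset) pairs touched by a constant-offset store
    dirty = set()     # bases touched by a store with a non-constant offset
    folded = {}       # load index -> constant value
    for i, (op, base, offset, ftype) in enumerate(trace):
        if base not in fresh:
            continue
        if op == "store":
            if offset is None:
                dirty.add(base)          # any slot may have been overwritten
            else:
                written.add((base, offset))
        elif op == "load" and offset is not None:
            if base not in dirty and (base, offset) not in written:
                folded[i] = DEFAULTS[ftype]
    return folded
```

Once every such load is a constant, nothing in the loop body depends on the object having been cleared, which is the property that hoisting the allocation out of the loop relies on.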
What we don’t eliminate are stores. These are trickier, since we might have to reconstitute them in a side exit if a reference escapes through that exit (which is otherwise not an issue for our scheme). However, since load elimination removes all dependencies on the stores, the stores are unlikely to cause a significant performance loss.