Untyped Context Nodes

Michael wrote on his blog about untyped context nodes.

Context nodes represent the local variable/stack state along the incoming edge into the trace tree, which again is really a loop. Since loops almost always have an empty stack at entry, its really just the local variables we have to be concerned with. Context nodes point to these local variable slots, and whenever an instruction in the tree refers to them it really means that the instruction wants to operate on the value that was in that JVM local variable at loop entry, or whatever value was written their during a previous loop iteration in case the value is updated by at least one path in the tree.

Our interpreter ensures that we precisely know the type of each value on the JVM stack. This was not the case for my previous JamVM implementation, mostly because I was too lazy to do proper stack type analysis for the entry point location in the bytecode. Since our new interpreter dynamically tracks the type of all values on the stack and in local variables, we actually get this information for free.

One might now think that due to this we always know the type of all context nodes, since they essentially point to a stack/local variable cell (read above, really just local variable cell). However, this is not always the case. Uninitialized local varibales are an exception, because … well they are uninitialized, and thus untyped.

Dealing with this situation is straight forward, but not really trivial.

First of all, an untyped context node is only a problem if its actually read somewhere in the tree. That can happen in two ways. Either, its used by an instruction as operand, or it appears in a side exit map (for the value to be written back in case that side exit happens). The first case doesn’t make sense at the first glimpse. How can a value be an operand of some instruciton if its uninitialized? The JVM doesn’t allow this to happen. Well true, BUT there is actually a possibility for such values to be initialized along one path of the loop, just not along all.

The root cause becomes more obvious when we consider the second case: a seemingly untyped context node appears in a side exit map. These maps indicate which values from the tree have to be written back into JVM local variables (and stack, in some odd cases). Why would we want to write back an untyped context node? We really should have to write back values only that were changed in the trace, right? Well, yes and no. We have to write back anything that was changed in this iteration (so along this trace), or in any previous iteration. Since we don’t keep track which paths we took in previous iterations, we thus have to write back all values in the context that get ever changed along all side exits from the tree.

With this in mind we can now easily imagine a scenario where an side exit map points to a untyped context node, which is valid only because some other trace in the tree actually writes to that context node. This writing to the context node is equivalent to what Static Single Assignment (SSA) form calls phi nodes, and we use the same terminology.

So how do we deal with this problem? Well, as we said above for an untyped context node to be a problem, it must be used. And for its used to be legal, in turn, it must be initialized along at least one path in the tree. So all we need to do is traverse all phi nodes first, looking at both operands of the phi instruction. If one is an untyped context and the other is properly typed (everything but a context is always properly typed), we have to propage the type to the untyped context node.

Now the next question is what happens if we have multiple phi nodes trying to update the same untyped context node? Can they have different types? No! The reason is very simple. JVM stack and local variable slots are dynamically typed, but there is a fixed type map, so at every point in the program every cell has exactly one type. Thus, in the trace tree entry point every local variable/stack location (and thus context node associated with it) has exactly one type. Therefore, all phis explicitly will indicate exactly that type for the untyped context node.

The final question is what to do with untyped context nodes that are updated with the value of another untyped context node. One might think that this can happen in cases where a local variable outside the loop is updated with the value of another local variable outside the loop.


int a;
int b;
for (...) {
a = b;
}

However, for such a case to be valid JVM code the right side (b) of course has to be initialized. So in fact this scenario can never happen, because in reality b would have to be initialized with a value, forcing it to have a proper type.

As I said: straight forward, but not trivial.

This entry was posted in Trace Compilation. Bookmark the permalink.