JIT Compiler IR

opencode-agent[bot] edited this page May 10, 2026 · 1 revision

Just-In-Time compiler intermediate representation.

Overview

The JIT Compiler IR (Intermediate Representation) is a central component of JNode's tiered compilation pipeline. It represents Java bytecode in a form that is amenable to optimization and can be translated efficiently to native x86 machine code. The IR is used primarily by the L2 (Level 2) optimizing compiler to perform optimizations that would be difficult or impossible to implement in the L1 compiler.

The IR transforms stack-based Java bytecode into a control flow graph of basic blocks, which enables optimizations such as constant folding and dead code elimination, followed by register allocation. Unlike the L1 compiler, which emits instructions directly as it processes bytecode, the L2 compiler builds a complete intermediate representation before performing any optimizations.

Key Components

| Class / File | Role |
| --- | --- |
| core/src/core/org/jnode/vm/compiler/ir/IRControlFlowGraph.java | Main IR representation; contains all basic blocks and provides dominance analysis |
| core/src/core/org/jnode/vm/compiler/ir/IRBasicBlock.java | Represents a basic block in the control flow graph |
| core/src/core/org/jnode/vm/compiler/ir/IRGenerator.java | Translates bytecode into quads and populates the IR |
| core/src/core/org/jnode/vm/compiler/ir/CodeGenerator.java | Abstract base class for translating IR to native code |
| core/src/core/org/jnode/vm/compiler/ir/LinearScanAllocator.java | Linear scan register allocator |
| core/src/core/org/jnode/vm/compiler/ir/RegisterPool.java | Manages register allocation during code generation |
| core/src/core/org/jnode/vm/compiler/ir/Variable.java | Represents a virtual register in the IR |
| core/src/core/org/jnode/vm/compiler/ir/Operand.java | Base class for IR operands (constants, variables) |
| core/src/core/org/jnode/vm/compiler/ir/quad/Quad.java | Base class for all IR instructions |
| core/src/core/org/jnode/vm/compiler/ir/SSAStack.java | SSA form construction helper |

How It Works

IR Generation Phase

The IR is built from Java bytecode by the IRGenerator class, which extends BytecodeVisitor. As each bytecode instruction is visited, the corresponding IR quads are created and added to basic blocks:

  1. Bytecode Parsing: The BytecodeParser analyzes the method bytecode and identifies basic block boundaries using the IRBasicBlockFinder class.

  2. Control Flow Graph Construction: The IRControlFlowGraph constructor creates IRBasicBlock instances for each identified block and connects them with predecessor/successor relationships.

  3. Quad Generation: The IRGenerator walks through each bytecode instruction and creates corresponding IR quads (like BinaryQuad, AssignQuad, BranchQuad). Each quad represents a single operation in three-address code form.
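The quad generation step can be illustrated with a minimal sketch. It assumes that each operand stack slot is modeled as a variable indexed by stack depth (`s0`, `s1`, ...) and each local as `l0`, `l1`, ...; the class name, the textual opcode format, and the string-based quads are all hypothetical and are not taken from JNode's IRGenerator or its quad classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the class, opcode spellings, and variable names (sN, lN)
// are illustrative and are not taken from JNode's IRGenerator.
final class StackToQuadsSketch {

    /** Translates a tiny straight-line stack program into three-address strings. */
    static List<String> translate(String[] bytecode) {
        List<String> quads = new ArrayList<String>();
        int depth = 0; // current operand stack depth; slot d becomes variable "s" + d
        for (String insn : bytecode) {
            String[] parts = insn.split(" ");
            String op = parts[0];
            if (op.equals("iload")) {            // push local: sDepth = lN
                quads.add("s" + depth + " = l" + parts[1]);
                depth++;
            } else if (op.equals("iconst")) {    // push constant: sDepth = N
                quads.add("s" + depth + " = " + parts[1]);
                depth++;
            } else if (op.equals("iadd") || op.equals("imul")) {
                String symbol = op.equals("iadd") ? "+" : "*";
                depth--;                         // two pops and one push: net depth - 1
                int dst = depth - 1;
                quads.add("s" + dst + " = s" + dst + " " + symbol + " s" + depth);
            } else if (op.equals("istore")) {    // pop into local: lN = sTop
                depth--;
                quads.add("l" + parts[1] + " = s" + depth);
            } else {
                throw new IllegalArgumentException("unsupported opcode: " + op);
            }
        }
        return quads;
    }
}
```

For the input `iload 1`, `iconst 2`, `imul`, `istore 3`, this sketch yields the quads `s0 = l1`, `s1 = 2`, `s0 = s0 * s1`, `l3 = s0`. The implicit stack traffic of the bytecode becomes explicit assignments, which is what later passes operate on.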

SSA Construction and Optimization

Once the initial IR is built, the L2 compiler performs several optimization passes:

  1. Dominance Analysis: The control flow graph computes immediate dominators for each block using the algorithm in IRControlFlowGraph.doComputeDominance().

  2. SSA Form Construction: The constructSSA() method converts the IR to Static Single Assignment form, in which each variable is assigned exactly once and phi functions merge values at control flow joins. This makes data flow analysis more precise.

  3. Optimization Passes:

    • optimize() performs constant folding and algebraic simplifications
    • removeUnusedVars() eliminates dead code
    • removeDefUseChains() cleans up unnecessary assignments

  4. De-SSA Conversion: After optimization, deconstrucSSA() converts the IR back out of SSA form, replacing the phi functions with ordinary assignments.
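To show why SSA form simplifies the optimization passes listed above, here is a minimal sketch of constant folding followed by removal of unused definitions on straight-line SSA code. The `Quad` class, the `fold` and `removeUnused` methods, and the string operands are hypothetical stand-ins, not JNode's classes or the bodies of optimize() and removeUnusedVars(). The sketch also assumes quads have no side effects and supports only `+` and `*`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; these are not JNode's Quad or Variable classes.
final class SsaOptSketch {

    /** dst = lhs op rhs. Operands are integer literals or SSA names; op == null means a plain copy of lhs. */
    static final class Quad {
        final String dst, lhs, op, rhs;
        Quad(String dst, String lhs, String op, String rhs) {
            this.dst = dst; this.lhs = lhs; this.op = op; this.rhs = rhs;
        }
        @Override public String toString() {
            return op == null ? dst + " = " + lhs : dst + " = " + lhs + " " + op + " " + rhs;
        }
    }

    static boolean isConst(String s) { return s.matches("-?\\d+"); }

    private static String resolve(Map<String, String> known, String operand) {
        String c = known.get(operand);
        return c != null ? c : operand;
    }

    /** One forward pass is enough here because every SSA name has exactly one definition. */
    static List<Quad> fold(List<Quad> in) {
        Map<String, String> known = new HashMap<String, String>(); // SSA name -> constant
        List<Quad> out = new ArrayList<Quad>();
        for (Quad q : in) {
            String l = resolve(known, q.lhs);
            String r = q.rhs == null ? null : resolve(known, q.rhs);
            Quad n = new Quad(q.dst, l, q.op, r);
            if (q.op != null && isConst(l) && isConst(r)
                    && (q.op.equals("+") || q.op.equals("*"))) {
                int a = Integer.parseInt(l);
                int b = Integer.parseInt(r);
                int v = q.op.equals("+") ? a + b : a * b;
                n = new Quad(q.dst, Integer.toString(v), null, null);
            }
            if (n.op == null && isConst(n.lhs)) {
                known.put(n.dst, n.lhs); // remember the constant for later uses
            }
            out.add(n);
        }
        return out;
    }

    /** Backward pass: keep a quad only if its result is live out or read by a kept quad. */
    static List<Quad> removeUnused(List<Quad> in, Set<String> liveOut) {
        Set<String> used = new HashSet<String>(liveOut);
        List<Quad> kept = new ArrayList<Quad>();
        for (int i = in.size() - 1; i >= 0; i--) {
            Quad q = in.get(i);
            if (!used.contains(q.dst)) {
                continue; // result is never read: drop the definition
            }
            used.add(q.lhs);
            if (q.rhs != null) {
                used.add(q.rhs);
            }
            kept.add(0, q);
        }
        return kept;
    }
}
```

Given `a1 = 2`, `b1 = 3`, `c1 = a1 * b1`, `d1 = c1 + x0`, `e1 = a1 + 1` with only `d1` live at exit, folding produces `c1 = 6` and `d1 = 6 + x0`, and the backward pass then leaves just `d1 = 6 + x0`. Without the single-assignment guarantee, the name-to-constant map would need to be invalidated on every redefinition.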

Register Allocation

The LinearScanAllocator performs register allocation with a linear scan algorithm, using the live ranges computed by computeLiveVariables(). It assigns physical x86 registers to virtual registers and spills to the stack when no register is available.
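As an illustration of the general technique, the following sketch follows the classic linear scan formulation: intervals are processed in order of start point, expired intervals release their registers, and when no register is free the interval that ends last is spilled. It is a sketch under those assumptions and may differ from what LinearScanAllocator actually does; the `Interval` class, the `allocate` method, and the `"spill"` marker are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of textbook linear scan; not JNode's LinearScanAllocator.
final class LinearScanSketch {

    static final class Interval {
        final String name;
        final int start, end;   // live range, inclusive, in instruction numbers
        String location;        // register name, or "spill" once allocated
        Interval(String name, int start, int end) {
            this.name = name; this.start = start; this.end = end;
        }
    }

    static void allocate(List<Interval> intervals, String[] registers) {
        List<Interval> byStart = new ArrayList<Interval>(intervals);
        byStart.sort(Comparator.comparingInt((Interval i) -> i.start));
        List<String> free = new ArrayList<String>(Arrays.asList(registers));
        List<Interval> active = new ArrayList<Interval>(); // sorted by increasing end

        for (Interval cur : byStart) {
            // Expire intervals that ended before the current one starts.
            Iterator<Interval> it = active.iterator();
            while (it.hasNext()) {
                Interval a = it.next();
                if (a.end >= cur.start) {
                    break; // active is sorted by end, so the rest are still live
                }
                free.add(a.location);
                it.remove();
            }
            if (!free.isEmpty()) {
                cur.location = free.remove(0);
                insertByEnd(active, cur);
            } else if (active.isEmpty() || active.get(active.size() - 1).end <= cur.end) {
                cur.location = "spill"; // current interval ends last: spill it
            } else {
                // Steal the register of the active interval that ends last.
                Interval last = active.remove(active.size() - 1);
                cur.location = last.location;
                last.location = "spill";
                insertByEnd(active, cur);
            }
        }
    }

    private static void insertByEnd(List<Interval> active, Interval iv) {
        int pos = 0;
        while (pos < active.size() && active.get(pos).end <= iv.end) {
            pos++;
        }
        active.add(pos, iv);
    }
}
```

With two registers and intervals `a[0,10]`, `b[1,3]`, `c[2,8]`, `d[4,6]`, the long-lived `a` is spilled when `c` arrives, while `b` and `d` share a register because their ranges do not overlap.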

Code Generation

The final phase translates the optimized IR to native x86 instructions. The CodeGenerator abstract class provides the interface, with X86CodeGenerator providing the x86-specific implementation. Each quad type has a corresponding generateCodeFor() method that emits the appropriate machine instructions.
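The per-quad dispatch described here can be sketched with overloaded methods and a callback on each quad. All class names below are hypothetical stand-ins, and the generator emits assembly text rather than machine code; only the method name generateCodeFor() comes from the description above, and whether JNode dispatches through a callback on the quad is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for quad types; not JNode's classes.
abstract class SketchQuad {
    /** Each quad calls back into the generator overload for its own type. */
    abstract void generateCode(SketchCodeGenerator cg);
}

final class SketchConstAssign extends SketchQuad {
    final String dstReg;
    final int value;
    SketchConstAssign(String dstReg, int value) { this.dstReg = dstReg; this.value = value; }
    @Override void generateCode(SketchCodeGenerator cg) { cg.generateCodeFor(this); }
}

final class SketchAdd extends SketchQuad {
    final String dstReg, srcReg;
    SketchAdd(String dstReg, String srcReg) { this.dstReg = dstReg; this.srcReg = srcReg; }
    @Override void generateCode(SketchCodeGenerator cg) { cg.generateCodeFor(this); }
}

// Architecture-neutral interface: one overload per quad type.
abstract class SketchCodeGenerator {
    abstract void generateCodeFor(SketchConstAssign q);
    abstract void generateCodeFor(SketchAdd q);
}

// Architecture-specific implementation; emits x86 assembly text for readability.
final class SketchX86TextGenerator extends SketchCodeGenerator {
    final List<String> out = new ArrayList<String>();
    @Override void generateCodeFor(SketchConstAssign q) {
        out.add("mov " + q.dstReg + ", " + q.value);
    }
    @Override void generateCodeFor(SketchAdd q) {
        out.add("add " + q.dstReg + ", " + q.srcReg);
    }
}
```

Calling `generateCode` on a constant assignment to `eax`, one to `ebx`, and an add of the two appends `mov eax, 5`, `mov ebx, 7`, `add eax, ebx` to `out`. Keeping the abstract generator free of x86 details is what would allow another backend to reuse the same IR.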

Gotchas & Non-Obvious Behavior

  • IR is tied to L2 compiler: The IR infrastructure is specifically designed for the L2 compiler; L1 uses a completely different, faster compilation approach without building a graph representation.

  • Type inference limitations: The IR generator must infer operand types from bytecode usage patterns since bytecode doesn't carry explicit type information for locals/stack slots.

  • SSA memory overhead: Converting to SSA form increases memory usage significantly, which is one reason the L1 compiler, which is optimized for compilation speed, does not use it.

  • Register allocation affects correctness: The linear scan allocator must correctly handle live ranges that span basic block boundaries; if a range is cut short, two live values can end up in the same register and the generated code computes wrong results.

  • Exception handling integration: The IR must properly handle exception edges in the control flow graph. Exception handlers become additional basic blocks with specific dominance properties.

  • Stack variable aliasing: The Java operand stack can hold several live values at once; the IR must track each of them properly through the StackVariable class.
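The exception-handling point above can be made concrete with a small dominance computation. The sketch below uses the simple iterative set-based algorithm, not necessarily the one in IRControlFlowGraph.doComputeDominance(), and treats each exception edge as an ordinary CFG edge, which is a simplification. The class name and the array-based graph encoding are hypothetical, and all blocks are assumed reachable from block 0.

```java
import java.util.BitSet;

// Hypothetical sketch of iterative dominator computation; not JNode's implementation.
final class DominatorSketch {

    /**
     * preds[b] lists the predecessors of block b; block 0 is the entry.
     * Returns dom, where dom[b] is the set of blocks that dominate b.
     */
    static BitSet[] dominators(int[][] preds) {
        int n = preds.length;
        BitSet[] dom = new BitSet[n];
        for (int b = 0; b < n; b++) {
            dom[b] = new BitSet(n);
            if (b == 0) {
                dom[b].set(0);        // the entry is dominated only by itself
            } else {
                dom[b].set(0, n);     // start from "every block" and shrink
            }
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int b = 1; b < n; b++) {
                BitSet next = new BitSet(n);
                next.set(0, n);
                for (int p : preds[b]) {
                    next.and(dom[p]); // intersect over all predecessors
                }
                next.set(b);          // every block dominates itself
                if (!next.equals(dom[b])) {
                    dom[b] = next;
                    changed = true;
                }
            }
        }
        return dom;
    }
}
```

Consider a graph where block 0 flows to 1, 1 to 2, and 2 to 4, with exception edges from blocks 1 and 2 into handler block 3, which then flows to 4 (so `preds` is `{{}, {0}, {1}, {1, 2}, {2, 3}}`). The handler's dominator set comes out as `{0, 1, 3}`: block 2 does not dominate the handler, so a value defined only in block 2 cannot be assumed available there.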
