MarkSweep GC

opencode-agent[bot] edited this page May 10, 2026 · 1 revision

JNode's Mark-Sweep garbage collector implementation integrated via MMTk.

Overview

JNode's MarkSweep GC is the default garbage collection strategy for the JNode VM. It uses a classic mark-sweep algorithm implemented through MMTk (Memory Management Toolkit), providing a simple but reliable tracing collector for the JNode heap. The implementation spans two layers: the MMTk base class (org.mmtk.plan.MarkSweep) and JNode-specific bindings under core/src/mmtk-vm/org/jnode/vm/memmgr/mmtk/ms/.

The MarkSweep collector operates in two phases: a stop-the-world mark phase that traces all reachable objects starting from GC roots, followed by a sweep phase that deallocates unreachable objects. This approach is straightforward but effective, making it suitable as the default GC for the JNode OS-level Java VM.

Key Components

| Class / File | Role |
| --- | --- |
| core/src/mmtk-vm/org/jnode/vm/memmgr/mmtk/ms/Plan.java | JNode's MarkSweep plan; extends the MarkSweep base class and implements Uninterruptible for safe GC during critical sections |
| core/src/mmtk-vm/org/jnode/vm/memmgr/mmtk/ms/HeapManager.java | Heap manager for MarkSweep; wraps BaseMmtkHeapManager and delegates allocation/collection to the plan |
| core/src/mmtk-vm/org/jnode/vm/memmgr/mmtk/ms/MarkSweepStatistics.java | GC statistics collection for MarkSweep; extends BaseMmtkGCStatistics |
| core/src/mmtk-vm/org/jnode/vm/memmgr/mmtk/ms/PlanConstants.java | Constants for the MarkSweep plan; extends MarkSweepConstants |
| core/src/mmtk-vm/org/jnode/vm/memmgr/mmtk/BaseMmtkHeapManager.java | Base heap manager bridging JNode's VM to MMTk's memory interface |
| core/src/core/org/jnode/vm/memmgr/def/GCMarkVisitor.java | Tracing visitor for the mark phase; traverses object graphs and updates GC color states |
| core/src/core/org/jnode/vm/memmgr/def/GCSweepVisitor.java | Sweep visitor for the sweep phase; frees unreachable objects and handles finalization |
| core/src/core/org/jnode/vm/memmgr/def/GCStack.java | Fixed-size mark stack that drives the iterative traversal during GC tracing |
| core/src/core/org/jnode/vm/memmgr/def/GCManager.java | Manages the GC cycle; coordinates marking and sweeping |

How It Works

Object Color States

The GC uses the classic tri-color marking scheme (white, grey, black), extended with a fourth state for finalization. All four states are defined in ObjectFlags:

  • GC_WHITE (0): Unreachable objects, candidates for sweep/finalization
  • GC_GREY (1): Objects discovered but not yet traced
  • GC_BLACK (2): Objects fully traced, all references processed
  • GC_YELLOW (3): Objects awaiting finalizer invocation on a separate thread
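As a minimal sketch of how these states relate, the following Java models the colors as plain integer constants with the values listed above. The class ColorSketch and the method afterMarkStep are hypothetical names used only for illustration; they are not part of ObjectFlags, and the choice to leave GC_YELLOW untouched by marking is an assumption of the sketch.

```java
// Hypothetical stand-in for the color constants described above;
// only the names and numeric values of the four states come from the text.
final class ColorSketch {
    static final int GC_WHITE = 0;  // not reached: candidate for sweep/finalization
    static final int GC_GREY = 1;   // discovered, references not yet traced
    static final int GC_BLACK = 2;  // fully traced
    static final int GC_YELLOW = 3; // waiting for its finalizer to run

    /** Returns the color an object takes when the marker advances it one step. */
    static int afterMarkStep(int color) {
        switch (color) {
            case GC_WHITE:
                return GC_GREY;   // first discovery
            case GC_GREY:
                return GC_BLACK;  // all references processed
            default:
                return color;     // assumption: BLACK and YELLOW are left as they are
        }
    }

    public static void main(String[] args) {
        int color = GC_WHITE;
        color = afterMarkStep(color); // discovered
        color = afterMarkStep(color); // traced
        System.out.println("final color = " + color);
    }
}
```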

Mark Phase

The mark phase begins when a GC collection is triggered. GCMarkVisitor performs an iterative, stack-driven traversal of the object graph (a LIFO mark stack yields depth-first rather than breadth-first order):

  1. Root set initialization: Marking starts from the GC roots (registers, stacks, static fields); every white or grey object referenced from a root is scheduled for tracing
  2. Stack-based traversal: Objects are pushed onto GCStack and processed iteratively
  3. Reference field scanning: For each object, markObject() traverses reference fields using VmNormalClass.getReferenceOffsets(); for arrays, markArray() scans element references
  4. TIB and monitor processing: After field scanning, the object's TIB (type information block) and any inflated monitor are marked as well, since they are reachable through the object
  5. Color transition: Each object transitions WHITE→GREY→BLACK as it's discovered and traced
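The steps above can be sketched on a toy object graph. This is a simplified illustration, not JNode's GCMarkVisitor: the Node class, the use of java.util.ArrayDeque as the mark stack, and the mark method are all assumptions made for the example, and TIB and monitor handling are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of stack-based marking; all names here are hypothetical.
final class MarkSketch {
    static final int WHITE = 0, GREY = 1, BLACK = 2;

    /** Stand-in for a heap object: a color plus outgoing references. */
    static final class Node {
        int color = WHITE;
        final List<Node> refs = new ArrayList<Node>();
    }

    /** Marks everything reachable from the roots; returns how many objects were traced. */
    static int mark(List<Node> roots) {
        Deque<Node> stack = new ArrayDeque<Node>();
        for (Node root : roots) {
            if (root.color == WHITE) {   // WHITE -> GREY on discovery
                root.color = GREY;
                stack.push(root);
            }
        }
        int traced = 0;
        while (!stack.isEmpty()) {
            Node current = stack.pop();
            for (Node child : current.refs) {
                if (child.color == WHITE) {  // skip objects already grey or black
                    child.color = GREY;
                    stack.push(child);
                }
            }
            current.color = BLACK;       // GREY -> BLACK once all references are scanned
            traced++;
        }
        return traced;
    }

    public static void main(String[] args) {
        Node a = new Node(), b = new Node(), c = new Node(), garbage = new Node();
        a.refs.add(b);
        b.refs.add(c);
        c.refs.add(a); // a cycle is handled because black objects are never re-pushed
        List<Node> roots = new ArrayList<Node>();
        roots.add(a);
        System.out.println("traced = " + mark(roots) + ", garbage color = " + garbage.color);
    }
}
```

Coloring an object grey before it is pushed is what keeps each object on the stack at most once, even when the graph contains cycles.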

Sweep Phase

The sweep phase (GCSweepVisitor) frees unreachable objects:

  1. Color check: Objects in WHITE state are candidates for collection
  2. Finalizer check: White objects whose finalizer has not yet run are moved to GC_YELLOW, which defers collection until the finalizer thread has processed them; objects that have already been finalized are freed immediately
  3. Non-finalizable objects: Objects without finalizers are freed immediately
  4. Color reset: Objects not swept (GREY, BLACK) are reset to WHITE for the next GC cycle
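The sweep rules above can be sketched as a single pass over a toy heap. This is an illustration under stated assumptions, not the GCSweepVisitor code: the Obj class, its hasFinalizer and finalized fields, and modeling "free" as removal from a list are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model of the sweep decision; all names here are hypothetical.
final class SweepSketch {
    static final int WHITE = 0, GREY = 1, BLACK = 2, YELLOW = 3;

    /** Stand-in for a heap object with the state the sweep needs. */
    static final class Obj {
        int color;
        final boolean hasFinalizer;
        boolean finalized; // set once the finalizer has run

        Obj(int color, boolean hasFinalizer) {
            this.color = color;
            this.hasFinalizer = hasFinalizer;
        }
    }

    /** Sweeps the heap in place and returns the number of objects freed. */
    static int sweep(List<Obj> heap) {
        int freed = 0;
        for (Iterator<Obj> it = heap.iterator(); it.hasNext();) {
            Obj o = it.next();
            if (o.color == WHITE) {
                if (o.hasFinalizer && !o.finalized) {
                    o.color = YELLOW; // keep alive until the finalizer has run
                } else {
                    it.remove();      // "free" the object
                    freed++;
                }
            } else if (o.color == GREY || o.color == BLACK) {
                o.color = WHITE;      // reset survivors for the next cycle
            }
            // Objects already YELLOW are left alone in this sketch.
        }
        return freed;
    }

    public static void main(String[] args) {
        List<Obj> heap = new ArrayList<Obj>();
        heap.add(new Obj(WHITE, false)); // unreachable, no finalizer
        heap.add(new Obj(WHITE, true));  // unreachable, finalizer pending
        heap.add(new Obj(BLACK, false)); // reachable
        System.out.println("freed = " + sweep(heap) + ", remaining = " + heap.size());
    }
}
```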

Heap Manager Integration

HeapManager forwards both the allocation call and the post-allocation hook to the processor-local MMTk plan:

@Inline
protected final Address alloc(int bytes, int align, int offset, int allocator) {
    return Plan.getInstance().alloc(bytes, align, offset, allocator);
}

@Inline
protected final void postAlloc(ObjectReference object, ObjectReference typeRef, 
        int bytes, int allocator) {
    Plan.getInstance().postAlloc(object, typeRef, bytes, allocator);
}

Plan Lifecycle

The plan is created per-processor via BaseMmtkHeapManager.createProcessorHeapData(). The excerpt below is simplified; handling of the checked reflection exceptions is omitted:

public Object createProcessorHeapData(VmProcessor cpu) {
    Constructor cons = Class.forName("org.mmtk.vm.Plan").getConstructor(HeapHelper.class);
    return cons.newInstance(helper);
}

At runtime, Plan.getInstance() retrieves the processor-local plan via VmProcessor.getHeapData().
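To make the lifecycle concrete, here is a self-contained sketch of the same pattern: one plan object is built reflectively per processor and later looked up by processor. Everything in it is a stand-in: DemoHelper, DemoPlan, PlanLifecycleSketch, and the map keyed by a processor id are hypothetical and only approximate the roles of HeapHelper, the plan class, and VmProcessor.getHeapData().

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the helper object passed to the plan constructor.
final class DemoHelper {
}

// Hypothetical stand-in for a plan class with a single-argument constructor.
class DemoPlan {
    final DemoHelper helper;

    public DemoPlan(DemoHelper helper) {
        this.helper = helper;
    }
}

// Sketch of per-processor plan creation and lookup; not JNode code.
final class PlanLifecycleSketch {
    private final DemoHelper helper = new DemoHelper();
    // Assumption: a map keyed by processor id models the per-processor heap data slot.
    private final Map<Integer, Object> heapDataByCpu = new HashMap<Integer, Object>();

    /** Builds a plan reflectively and stores it as the heap data of the given processor. */
    Object createProcessorHeapData(int cpuId) {
        try {
            Class<?> planClass = Class.forName(DemoPlan.class.getName());
            Constructor<?> cons = planClass.getConstructor(DemoHelper.class);
            Object plan = cons.newInstance(helper);
            heapDataByCpu.put(cpuId, plan);
            return plan;
        } catch (Exception ex) {
            // Reflection failures are fatal here: without a plan there is no heap.
            throw new IllegalStateException("Cannot create plan", ex);
        }
    }

    /** Returns the plan that belongs to the given processor. */
    DemoPlan getInstance(int cpuId) {
        return (DemoPlan) heapDataByCpu.get(cpuId);
    }

    public static void main(String[] args) {
        PlanLifecycleSketch manager = new PlanLifecycleSketch();
        Object first = manager.createProcessorHeapData(0);
        Object second = manager.createProcessorHeapData(1);
        System.out.println("distinct plans: " + (first != second));
    }
}
```

Looking the class up by name, rather than referencing it directly, is what lets the heap manager stay independent of which plan was selected at build time.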

Gotchas & Non-Obvious Behavior

  • MMTk integration: JNode's Plan extends MMTk's MarkSweep base class (org.mmtk.plan.MarkSweep), not a custom implementation. This means GC behavior is governed by MMTk's implementation, with JNode providing only the VM interface bindings.

  • Uninterruptible constraint: Plan implements Uninterruptible, ensuring GC operations cannot be preempted. This is critical for safe access to heap data structures during collection.

  • Two-layer GC visitors: JNode maintains its own GCMarkVisitor/GCSweepVisitor in the default heap package (org.jnode.vm.memmgr.def), separate from MMTk's internal tracing. These visitors serve the legacy (non-MMTk) heap path; the MMTk path shares only the visitor pattern and the color semantics with them.

  • Mark stack overflow: GCStack has a fixed size. If the mark stack overflows during tracing, visit() returns false and marking resumes on the next iteration. The setRootSet(true) mechanism allows restarting from roots if needed.

  • Yellow object handling: Objects in GC_YELLOW state during sweep are those with finalizers that haven't been invoked yet. They remain in memory until FinalizerThread runs, at which point they transition to WHITE and get freed on the next sweep.

  • Boot image memory: The mark phase does not trace the boot image (loaded at fixed virtual addresses before heap initialization). This memory is reserved via LazyMmapper.boot() during BaseMmtkHeapManager.initialize().

  • GCSpy disabled: All MarkSweep statistics classes have WITH_GCSPY = false, meaning GC visualization support is compiled out in production builds.

  • Reference counting absent: Unlike GenRC, the MarkSweep plan does not use reference counting. Objects are only collected during sweep when they remain WHITE after the mark phase.
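The mark stack overflow item above describes a common pattern: when a fixed-size stack is full, the object stays grey and is picked up by a later pass. The sketch below shows one way this can work. It is not GCStack or GCMarkVisitor code; the recovery strategy (rescanning the whole heap for grey objects until a pass completes without overflow) and all names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of marking with a bounded mark stack; all names are hypothetical.
final class OverflowSketch {
    static final int WHITE = 0, GREY = 1, BLACK = 2;

    static final class Node {
        int color = WHITE;
        final List<Node> refs = new ArrayList<Node>();
    }

    /** Fixed-capacity stack that records, rather than throws on, overflow. */
    static final class FixedStack {
        private final Node[] slots;
        private int top;
        boolean overflowed;

        FixedStack(int capacity) {
            slots = new Node[capacity];
        }

        boolean push(Node n) {
            if (top == slots.length) {
                overflowed = true; // caller leaves the object grey
                return false;
            }
            slots[top++] = n;
            return true;
        }

        Node pop() {
            return slots[--top];
        }

        boolean isEmpty() {
            return top == 0;
        }
    }

    /** Marks all objects reachable from roots; returns the number of heap passes needed. */
    static int mark(List<Node> heap, List<Node> roots, int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        FixedStack stack = new FixedStack(capacity);
        for (Node root : roots) {
            if (root.color == WHITE) {
                root.color = GREY; // picked up by the heap scan below
            }
        }
        int passes = 0;
        do {
            passes++;
            stack.overflowed = false;
            for (Node n : heap) {
                if (n.color != GREY) {
                    continue;
                }
                stack.push(n); // stack is empty here, so this always succeeds
                while (!stack.isEmpty()) {
                    Node current = stack.pop();
                    for (Node child : current.refs) {
                        if (child.color == WHITE) {
                            child.color = GREY;
                            stack.push(child); // may fail; the child then waits, still grey
                        }
                    }
                    current.color = BLACK;
                }
            }
        } while (stack.overflowed); // grey objects may remain, so scan again
        return passes;
    }

    public static void main(String[] args) {
        List<Node> heap = new ArrayList<Node>();
        Node root = new Node();
        for (int i = 0; i < 4; i++) {
            Node child = new Node();
            root.refs.add(child);
            heap.add(child);
        }
        heap.add(root);
        List<Node> roots = new ArrayList<Node>();
        roots.add(root);
        System.out.println("passes = " + mark(heap, roots, 2));
    }
}
```

Because an object is colored grey before the push is attempted, a failed push loses no information: the grey color itself records that the object still needs tracing.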

Related Pages

  • MMTk-Bindings — MMTk VM interface and plan selection mechanism
  • Memory-Management — Higher-level memory management overview
  • Boot-Sequence — Boot image memory reservation before GC initialization
  • Object-Layout — Object header structure that GC marking depends on
  • GenRC — The generational reference counting GC alternative
  • NoGC — The no-op GC plan for debugging
