perf(pine-java): pool OperatorOutput — requires nested-map → flat list refactor first

## Context

#119 (PR #120) brought a profile-driven 7-9% ns/op + 19% B/op reduction to **pine-go** by pooling `*OperatorOutput` via `sync.Pool` in the scheduler, eliminating per-Execute alloc of `OperatorOutput` + the dominant `itemWrites []ItemWrite` grow cost.

**The win does not transfer to pine-java.** Java and Go runtimes are independent — sharing only the JSON contract and Apple DSL. #119 touched `pine-go/internal/runtime/scheduler.go`; nothing in that change reaches the JVM side.

## Current Java state

`pine-java/src/main/java/page/liam/pine/OperatorOutput.java` uses:

```java
private final Map<String, Object> commonWrites = new LinkedHashMap<>();
private final Map<Integer, Map<String, Object>> itemWrites = new HashMap<>();  // nested
private final List<Map<String, Object>> addedItems = new ArrayList<>();
private final Set<Integer> removedItems = new HashSet<>();
```

The blocker is **structural, not lifetime-related**:

| Storage | Per-Execute behaviour | Pooling viability |
|---|---|---|
| `itemWrites: Map<Integer, Map<String,Object>>` (nested) | Each `setItem(i, field, v)` does `computeIfAbsent(i, k -> new LinkedHashMap<>()).put(field, v)`: a new inner `LinkedHashMap` plus an outer hash entry on first touch per row, then one map put per cell | Pooling the outer map keeps O(N) inner `LinkedHashMap`s live; clearing them on reset is itself O(N×M), which would likely cost more than the current allocation |

pine-go had the same nested form historically (commit `d238098` "replace nested map item writes with flat `[]ItemWrite` slice", v0.7 era). The refactor unlocked everything downstream — including #119's pool. **Java needs the same structural refactor first**.
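
To make the structural cost concrete, here is a minimal sketch of the nested write path from the table above. `NestedItemWrites`, its counter, and `resetKeepingInnerMaps` are hypothetical names used only for illustration; this is not the real `OperatorOutput`, just the same `computeIfAbsent` shape with instrumentation added.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the nested itemWrites storage (illustration only).
final class NestedItemWrites {
    private final Map<Integer, Map<String, Object>> itemWrites = new HashMap<>();
    private int innerMapsAllocated = 0; // instrumentation for this sketch only

    void setItem(int index, String field, Object value) {
        // First touch of a row allocates a fresh inner map; later cells reuse it.
        itemWrites.computeIfAbsent(index, k -> {
            innerMapsAllocated++;
            return new LinkedHashMap<>();
        }).put(field, value);
    }

    // A pooling-style reset that keeps inner maps for reuse has to visit every row.
    int resetKeepingInnerMaps() {
        int visited = 0;
        for (Map<String, Object> row : itemWrites.values()) {
            row.clear();
            visited++;
        }
        return visited;
    }

    int innerMapsAllocated() { return innerMapsAllocated; }

    int liveInnerMaps() { return itemWrites.size(); }
}
```

Writing to N distinct rows allocates N inner maps, and a reset that tries to reuse them walks all N again and leaves them all reachable from the pooled object.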

## Suggested phasing

**Phase 1 (refactor — separate PR, no pooling yet):**
- Introduce an `ItemWrite` record: `record ItemWrite(int index, String field, Object value)`
- Change `itemWrites` to `List<ItemWrite>` (or `ArrayList<ItemWrite>` for capacity reuse later)
- Update `Engine.applyOutput` / `ColumnFrame.applyOutput` / `DataFrame.applyOutput` / `ParallelExecutor.mergeOutputs` to iterate the flat list
- Update Java fuzz / unit tests as needed; cross-validate's byte-equal /execute parity (`scripts/cross-validate/02-engine-byte-exact.sh`) gates correctness — no behaviour change should leak
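
A rough sketch of what the Phase 1 shape could look like, assuming Java 16+ for records. `FlatOperatorOutput` and `applyTo` are hypothetical names: the real change would edit `OperatorOutput` in place, and `applyTo` only stands in for the `applyOutput` / `mergeOutputs` consumers, whose actual signatures are not shown in this issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One cell write: row index, field name, value.
record ItemWrite(int index, String field, Object value) {}

// Hypothetical flat-list variant; only the item-write path is shown.
final class FlatOperatorOutput {
    private final List<ItemWrite> itemWrites = new ArrayList<>();

    void setItem(int index, String field, Object value) {
        // One small record per cell; no per-row map is created.
        itemWrites.add(new ItemWrite(index, field, value));
    }

    List<ItemWrite> itemWrites() { return itemWrites; }

    // Stand-in for an applyOutput-style consumer: replay writes in insertion order,
    // so a later write to the same (row, field) overwrites an earlier one.
    static void applyTo(List<Map<String, Object>> rows, FlatOperatorOutput out) {
        for (ItemWrite w : out.itemWrites()) {
            rows.get(w.index()).put(w.field(), w.value());
        }
    }
}
```

Replaying in insertion order keeps last-write-wins semantics for a given row and field, which is what the nested `put` gives today. Whether iteration order anywhere else affects serialized output is exactly what the byte-exact cross-validate script should catch.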

**Phase 2 (pool — once Phase 1 lands):**
- Add a `reset()` method analogous to Go's: null out slot references and truncate to size 0 (`ArrayList.clear()` does both and retains capacity)
- Use a `ThreadLocal<OperatorOutput>` or a `ConcurrentLinkedDeque` pool scoped to the `Engine` instance
- Expected gain is roughly 5-10% throughput, extrapolated from the Go numbers and the JVM GC overhead profile; this is unverified until Java benchmarks exist
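
A minimal sketch of the Phase 2 idea, assuming the flat list from Phase 1 is already in place. `PooledOutput`, `Write`, and `OutputPool` are hypothetical names, and the `ConcurrentLinkedDeque` variant is shown only because it is the simpler of the two options listed above to demonstrate.

```java
import java.util.ArrayList;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical pooled output; only the item-write path is modelled.
final class PooledOutput {
    record Write(int index, String field, Object value) {}

    private final ArrayList<Write> itemWrites = new ArrayList<>();

    void setItem(int index, String field, Object value) {
        itemWrites.add(new Write(index, field, value));
    }

    int size() { return itemWrites.size(); }

    // ArrayList.clear() nulls the element slots and sets size to 0,
    // but keeps the backing array, so the next Execute reuses its capacity.
    void reset() { itemWrites.clear(); }
}

// Hypothetical pool, intended to be owned by a single Engine instance.
final class OutputPool {
    private final ConcurrentLinkedDeque<PooledOutput> free = new ConcurrentLinkedDeque<>();

    PooledOutput acquire() {
        PooledOutput out = free.pollFirst(); // null when the pool is empty
        return out != null ? out : new PooledOutput();
    }

    void release(PooledOutput out) {
        out.reset();          // never hand back stale writes
        free.offerFirst(out); // LIFO reuse keeps recently used objects hot
    }
}
```

Two caveats apply to this sketch: the caller must not touch an output after `release`, and the pool is unbounded, so a burst of concurrent Executes leaves that many objects retained. A `ThreadLocal` avoids the shared deque but ties object lifetime to the thread instead of the `Engine`.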

## Why JVM-specific concerns matter

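The JVM is **not Go**:
- Short-lived objects are typically allocated in a TLAB (Thread-Local Allocation Buffer) in the young generation and die before reaching the old generation
- Escape analysis can sometimes remove an allocation entirely (HotSpot does this via scalar replacement rather than literal stack allocation)
- BUT: the inner `LinkedHashMap`s are stored in the outer map, so they escape and cannot be optimized away, and their combined footprint may be heavy enough to matter on hot paths; profiling is required to confirm the win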

**Recommended approach**: profile pine-java first via JMH (which #119 already noted is missing — see `pine-java/benchmarks/`'s placeholder note). If `OperatorOutput.setItem` map allocs dominate hot-path GC pressure, do Phase 1+2. If JVM is amortizing it away cheaply, defer indefinitely.
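
Before a JMH harness exists, a very rough first signal can come from per-thread allocation counters. The sketch below is not JMH and does not replace it: it swaps in HotSpot's `com.sun.management.ThreadMXBean` (Java 14+ for `getCurrentThreadAllocatedBytes`) to compare bytes allocated by a nested-map write loop against a flat-list one. `AllocProbe`, `Cell`, and both loops are hypothetical stand-ins, not pine-java code.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Rough allocation probe (illustration only; not a substitute for JMH).
final class AllocProbe {
    record Cell(int index, String field, Object value) {}

    // Bytes allocated by the current thread while the task runs (HotSpot-specific bean).
    static long allocatedBytes(Runnable task) {
        com.sun.management.ThreadMXBean bean =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long before = bean.getCurrentThreadAllocatedBytes();
        task.run();
        return bean.getCurrentThreadAllocatedBytes() - before;
    }

    // Nested form: one inner LinkedHashMap per touched row. Returns the row count.
    static int nestedWrites(int rows, String[] fields) {
        Map<Integer, Map<String, Object>> itemWrites = new HashMap<>();
        for (int i = 0; i < rows; i++) {
            for (String f : fields) {
                itemWrites.computeIfAbsent(i, k -> new LinkedHashMap<>()).put(f, f);
            }
        }
        return itemWrites.size();
    }

    // Flat form: one small record per cell write. Returns the write count.
    static int flatWrites(int rows, String[] fields) {
        List<Cell> itemWrites = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            for (String f : fields) {
                itemWrites.add(new Cell(i, f, f));
            }
        }
        return itemWrites.size();
    }

    public static void main(String[] args) {
        String[] fields = {"a", "b", "c"};
        // Warm up once so class loading and lambda bootstrap are not counted.
        nestedWrites(10, fields);
        flatWrites(10, fields);
        long nested = allocatedBytes(() -> nestedWrites(10_000, fields));
        long flat = allocatedBytes(() -> flatWrites(10_000, fields));
        System.out.println("nested bytes: " + nested + ", flat bytes: " + flat);
    }
}
```

This measures allocation volume only. It says nothing about GC pause cost or throughput, and JIT effects can distort it, so it is at most a sanity check on whether a proper JMH run is worth setting up.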

## Risk

- **Behaviour parity**: byte-equal `/execute` parity is gated by `scripts/cross-validate/02-engine-byte-exact.sh`. Both phases must preserve it.
- **Concurrency**: Java's `OperatorOutput` is currently mutable-by-single-thread per Execute; the same contract must hold post-refactor.
- **Phase 1 LOC**: ~80-120 lines (Java reformat of the nested-map walk is the bulk).

## Related

- #119 — the Go-side optimization that triggered this issue
- PR #120 — Go implementation that this Java counterpart would mirror at a structural level
- pine-go commit `d238098` — the prior-art refactor (Go side, v0.7 era)

perf(pine-java): pool OperatorOutput — requires nested-map → flat list refactor first #121
