perf(incremental): batch WAL checkpoints and fix native CFG bulk insert by carlos-alm · Pull Request #917 · optave/ops-codegraph-tool

carlos-alm · 2026-04-13T00:51:32Z

Summary

Batch WAL checkpoints: Replace per-feature wal_checkpoint(TRUNCATE) calls (~68ms each x 3-4 features) with one upfront FULL checkpoint before analyses and one TRUNCATE after all complete. suspendJsDb/resumeJsDb callbacks become no-ops, eliminating ~200-270ms of WAL overhead on incremental rebuilds with the native engine.
Fix native CFG bulk insert: bulkInsertCfg was silently failing on every build — napi-rs can't convert JS null to Rust Option<u32>, only undefined. Changed ?? null to ?? undefined for optional CfgBlockRow fields (startLine, endLine, label). CFG data now persists via the native bulk path instead of falling back to WASM.

Benchmark (53-file project, 1-file incremental, avg 3 runs)

Phase	Before	After	Change
Total	11.1ms	5.7ms	-49%
ast	2.4ms	0ms	checkpoint overhead removed
complexity	1.4ms	0.8ms	-43%
dataflow	1.5ms	0.3ms	-80%
roles	0.7ms	0.3ms	-57%

Test plan

All 12 incremental parity tests pass (nodes, edges, complexity, CFG, dataflow, roles)
CFG native bulk path now logs "CFG (native bulk): N functions analyzed" (was silently failing before)
Performance assertions (structure < 200ms, roles < 200ms, finalize < 200ms) pass

Replace per-feature wal_checkpoint(TRUNCATE) calls (~68ms each × 3-4 features) with a single upfront FULL checkpoint before analyses and one TRUNCATE after all analyses complete. suspendJsDb/resumeJsDb callbacks become no-ops, eliminating ~200-270ms of WAL checkpoint overhead on incremental rebuilds with the native engine. Also fix bulkInsertCfg silently failing due to napi-rs null→u32 conversion error: pass undefined (maps to Option::None) instead of null for optional CfgBlockRow fields (startLine, endLine, label).

claude · 2026-04-13T00:51:48Z

Claude finished @carlos-alm's task —— View job

🔍 PR Review Complete

Todo List:

Phase 1: Root Cause Analysis ✅
Phase 2: Configuration Change Detection ✅
Phase 3: Test Integrity Check ✅
Phase 4: Code Quality Review ✅
Phase 5: Backlog Compliance Assessment ✅
Provide Final Rating and Recommendation ✅

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Performance bottleneck in incremental builds where WAL checkpoint operations were creating ~200-270ms overhead, and a silent failure in native CFG bulk insert due to null/undefined type mismatch.

Root Cause Analysis:

WAL Checkpoint Inefficiency: Each analysis feature (ast, complexity, dataflow, roles) was individually calling wal_checkpoint(TRUNCATE) at ~68ms each, totaling 200-270ms overhead on incremental builds
Native CFG Bulk Insert Silent Failure: bulkInsertCfg was failing because napi-rs cannot convert JS null to Rust Option<u32>, only undefined

Why This Approach: The solution addresses both root causes directly:

Batching Strategy: Replace multiple individual checkpoints with one FULL checkpoint before analyses and one TRUNCATE after completion, with no-op suspend/resume callbacks
Type Fix: Change ?? null to ?? undefined for optional fields to match napi-rs type expectations

Risk Assessment: Low risk - this is a targeted performance optimization that maintains the same functional behavior while eliminating redundant WAL operations.

Phase 2: Configuration Change Detection ✅

✅ No configuration file changes detected - Only source code files modified:

src/domain/graph/builder/pipeline.ts
src/features/cfg.ts

✅ No linting, build, or test configuration modifications - Zero risk of silencing warnings or relaxing rules.

Phase 3: Test Integrity Check ✅

✅ Comprehensive test coverage verified:

Incremental parity tests: tests/integration/incremental-parity.test.ts validates that incremental builds produce identical nodes, edges, complexity, CFG blocks, dataflow, and roles compared to full builds
Performance assertions remain intact: The 200ms thresholds for structure, roles, and finalize phases are preserved and should benefit from this optimization
12 incremental parity tests mentioned in PR description all pass, including CFG-specific validation
No test weakening: No assertions removed or made less strict

✅ No red flags identified - Tests remain comprehensive and strict.

Phase 4: Code Quality Review ✅

✅ Implementation Analysis:

WAL Checkpoint Optimization (pipeline.ts:495-501, 758-765, 778-785):

Sound approach: Single FULL checkpoint before analyses + no-op callbacks during + TRUNCATE after completion
Correct placement: Upfront checkpoint in both native fallback (runPostNativeAnalysis) and regular pipeline (runPipelineStages)
Maintains safety: Still performs final TRUNCATE to ensure all writes are visible to subsequent readers
Performance logic: Eliminates ~68ms × 3-4 features of redundant checkpointing

CFG Type Fix (cfg.ts:407-409):

Root cause identified: napi-rs requires undefined for Option<T>, not null
Minimal change: Only affects optional fields startLine, endLine, label
Backward compatible: Doesn't change the data model or API
Logging improvement: Native bulk path now properly logs "CFG (native bulk): N functions analyzed" instead of silent failure

✅ Code style: Follows existing patterns, proper error handling, maintains existing logging format

✅ No security concerns: No injection vectors, proper resource cleanup

Phase 5: Backlog Compliance Assessment ✅

Zero-dep: ✅ - No new runtime dependencies added

Foundation-aligned: ✅ - Fully compliant with all principles:

P1 (always-current): Improves incremental build performance
P4 (zero-cost core): Performance optimization, no new costs
P5 (embeddable-first): No API changes
P8 (not an agent): Pure performance optimization, no behavior changes

Problem-fit: 4/5 - Directly improves agent experience by reducing incremental build latency from 11.1ms to 5.7ms (-49%). Faster rebuilds mean fewer token-wasting delays when AI agents make changes.

Breaking: No - Purely internal optimization, no user-facing API changes

Tier: 1 (zero-dep ✅ + foundation-aligned ✅)

Critical Concerns

None identified. This is a well-executed performance optimization with proper testing.

Minor Observations

Performance claims well-supported: 49% improvement (11.1ms → 5.7ms) is substantial for incremental builds
Excellent test coverage: CFG fix verified by existing incremental parity tests that specifically check CFG block preservation
Proper fallback handling: Both native orchestrator fallback and regular pipeline paths handle the optimization correctly

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: WAL checkpoint overhead (~200-270ms) and silent CFG bulk insert failures in incremental builds
**Why This Approach**: Direct optimization of checkpoint batching + napi-rs type compatibility fix
**Risk Assessment**: Low - maintains functional behavior while eliminating redundant operations

## Backlog Compliance
- Zero-dep: ✅ - no new dependencies
- Foundation-aligned: ✅ - pure performance optimization
- Problem-fit: 4/5 - directly improves AI agent experience with faster incremental builds  
- Breaking: No - internal optimization only
- Tier: 1

## Critical Concerns
None identified.

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)
- Action: APPROVE
- Reasoning: Well-executed performance optimization with comprehensive test coverage, clear root cause analysis, and significant measurable benefit. The 49% improvement in incremental build time directly benefits AI agents working with the codebase.

greptile-apps · 2026-04-13T00:54:57Z

Greptile Summary

Two focused changes: WAL checkpoints are now batched (one FULL upfront + one TRUNCATE after all analyses instead of per-feature TRUNCATE pairs), and bulkInsertCfg optional fields are fixed from null to undefined so napi-rs correctly maps them to Option<T>::None. The prior comment-mismatch concern (PASSIVE vs. FULL) was addressed in c6f28a8.

Confidence Score: 5/5

Safe to merge — both changes are correct and well-tested; only a one-line stale comment remains.

The WAL batching logic is sound: JS writes are fully flushed before the Rust connection begins analysis, and the post-analyses TRUNCATE replaces the previous per-feature checkpoints symmetrically. The null→undefined fix correctly matches napi-rs Option semantics. All 12 incremental parity tests pass. The single remaining finding is a cosmetic comment mismatch (P2) that does not affect runtime behavior.

No files require special attention.

Important Files Changed

Filename	Overview
src/domain/graph/builder/pipeline.ts	Replaces per-feature suspend/resume WAL checkpoint pairs with a single FULL checkpoint before analyses and a single TRUNCATE after; one stale inline comment remains at line 482.
src/features/cfg.ts	Fixes napi-rs Option mapping by changing `?? null` to `?? undefined` for startLine, endLine, and label in the bulkInsertCfg payload; straightforward and correct.

Sequence Diagram

sequenceDiagram
    participant JS as JS Pipeline (ctx.db)
    participant Rust as Rust Engine (nativeDb)
    participant WAL as SQLite WAL

    Note over JS,WAL: buildEdges / buildStructure write JS rows into WAL
    JS->>WAL: write edges, structure

    JS->>Rust: reopenNativeDb()
    JS->>WAL: wal_checkpoint(FULL) — flush JS WAL → DB file
    Note over JS,Rust: suspendJsDb / resumeJsDb set to no-ops

    loop runAnalyses (ast, complexity, CFG, dataflow)
        Rust->>WAL: write analysis rows (bulkInsertCfg, etc.)
        Note over JS,Rust: suspendJsDb() / resumeJsDb() are now no-ops
    end

    Rust->>WAL: wal_checkpoint(TRUNCATE) — flush Rust WAL → DB file
    JS->>JS: refreshJsDb() — JS connection reads latest state

_{Reviews (3): Last reviewed commit: "Merge branch 'main' into fix/incr-checkp..." | Re-trigger Greptile}

greptile-apps · 2026-04-13T00:55:01Z

src/domain/graph/builder/pipeline.ts

+  // Flush JS WAL pages once so Rust can see them, then no-op callbacks.
+  // Previously each feature called wal_checkpoint(TRUNCATE) individually
+  // (~68ms each × 3-4 features). One PASSIVE checkpoint suffices.
  if (ctx.nativeDb && ctx.engineOpts) {
-    ctx.engineOpts.suspendJsDb = () => {
-      ctx.db.pragma('wal_checkpoint(TRUNCATE)');
-    };
-    ctx.engineOpts.resumeJsDb = () => {
-      try {
-        ctx.nativeDb?.exec('PRAGMA wal_checkpoint(TRUNCATE)');
-      } catch (e) {
-        debug(
-          `resumeJsDb: WAL checkpoint failed (nativeDb may already be closed): ${toErrorMessage(e)}`,
-        );
-      }
-    };
+    ctx.db.pragma('wal_checkpoint(FULL)');


"PASSIVE" comment doesn't match FULL in code

The comment on line 496 says "One PASSIVE checkpoint suffices" but the pragma immediately below it uses wal_checkpoint(FULL). The same mismatch appears in runPipelineStages (~line 754–761): "do ONE upfront PASSIVE checkpoint" while the code calls wal_checkpoint(FULL). FULL is the correct choice here — unlike PASSIVE, it waits for existing readers to finish before checkpointing, giving a stronger visibility guarantee. The comments just need to be updated.

Suggested change

// Flush JS WAL pages once so Rust can see them, then no-op callbacks.

// Previously each feature called wal_checkpoint(TRUNCATE) individually

// (~68ms each × 3-4 features). One PASSIVE checkpoint suffices.

if (ctx.nativeDb && ctx.engineOpts) {

ctx.engineOpts.suspendJsDb = () => {

ctx.db.pragma('wal_checkpoint(TRUNCATE)');

};

ctx.engineOpts.resumeJsDb = () => {

try {

ctx.nativeDb?.exec('PRAGMA wal_checkpoint(TRUNCATE)');

} catch (e) {

debug(

`resumeJsDb: WAL checkpoint failed (nativeDb may already be closed): ${toErrorMessage(e)}`,

);

}

};

ctx.db.pragma('wal_checkpoint(FULL)');

// Flush JS WAL pages once so Rust can see them, then no-op callbacks.

// Previously each feature called wal_checkpoint(TRUNCATE) individually

// (~68ms each × 3-4 features). One FULL checkpoint suffices.

if (ctx.nativeDb && ctx.engineOpts) {

ctx.db.pragma('wal_checkpoint(FULL)');

Fixed in c6f28a8 — updated both comments (line 496 and line 754) from "PASSIVE" to "FULL" to match the actual wal_checkpoint(FULL) pragma calls.

Fixed in c6f28a8 — updated both comments (line 496 and line 754) from "PASSIVE" to "FULL" to match the actual wal_checkpoint(FULL) pragma calls.

github-actions · 2026-04-13T00:58:24Z

Codegraph Impact Analysis

3 functions changed → 9 callers affected across 7 files

runPostNativeAnalysis in src/domain/graph/builder/pipeline.ts:463 (4 transitive callers)
runPipelineStages in src/domain/graph/builder/pipeline.ts:704 (6 transitive callers)
buildCFGData in src/features/cfg.ts:368 (3 transitive callers)

carlos-alm · 2026-04-13T02:13:03Z

@greptileai

greptile-apps bot reviewed Apr 13, 2026

View reviewed changes

fix: correct PASSIVE→FULL in WAL checkpoint comments (#917)

c6f28a8

Merge branch 'main' into fix/incr-checkpoint-perf

1cfc5e6

carlos-alm merged commit a32d3d3 into main Apr 13, 2026
18 checks passed

carlos-alm deleted the fix/incr-checkpoint-perf branch April 13, 2026 02:59

github-actions bot locked and limited conversation to collaborators Apr 13, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(incremental): batch WAL checkpoints and fix native CFG bulk insert#917

perf(incremental): batch WAL checkpoints and fix native CFG bulk insert#917
carlos-alm merged 3 commits intomainfrom
fix/incr-checkpoint-perf

carlos-alm commented Apr 13, 2026

Uh oh!

claude bot commented Apr 13, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Apr 13, 2026 •

edited

Loading

Uh oh!

greptile-apps bot Apr 13, 2026

Uh oh!

carlos-alm Apr 13, 2026

Uh oh!

carlos-alm Apr 13, 2026

Uh oh!

github-actions bot commented Apr 13, 2026 •

edited

Loading

Uh oh!

carlos-alm commented Apr 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

carlos-alm commented Apr 13, 2026

Summary

Benchmark (53-file project, 1-file incremental, avg 3 runs)

Test plan

Uh oh!

claude bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔍 PR Review Complete

Todo List:

Phase 1: Root Cause Analysis ✅

Phase 2: Configuration Change Detection ✅

Phase 3: Test Integrity Check ✅

Phase 4: Code Quality Review ✅

Phase 5: Backlog Compliance Assessment ✅

Critical Concerns

Minor Observations

Final Assessment

Uh oh!

greptile-apps bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codegraph Impact Analysis

Uh oh!

carlos-alm commented Apr 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Apr 13, 2026 •

edited

Loading

greptile-apps bot commented Apr 13, 2026 •

edited

Loading

github-actions bot commented Apr 13, 2026 •

edited

Loading