feat: intra-procedural CFG + reaching-definitions (def-use) overlay — 11 languages (TS/JS, Python, Go, Java, C++, Rust, Ruby, C, C#, PHP)#146

Draft
clay-good wants to merge 23 commits into main from feat/intraprocedural-cfg-dataflow-overlay

@clay-good clay-good commented Jun 12, 2026


Summary

Implements add-intraprocedural-cfg-dataflow-overlay end-to-end: a deterministic, per-function control-flow graph plus reaching-definitions (def-use) overlay computed purely from the AST (no LLM, no type-checker, no build) while the parse tree is live, stored as a compact per-function blob in a new DB-only table — and an opt-in value-level precision mode on analyze_impact/trace_execution_path built on it.

This follows the Joern overlay model: the call graph stays the base; CFG and reaching-defs are two additive, independently-storable overlays. The default behavior of every existing tool is byte-for-byte unchanged — value-level precision is strictly opt-in.

What's implemented

Phase 1 — CFG + storage (commit 1)

  • src/core/analyzer/cfg.ts — the first statement-level AST visitor in the analyzer (modeled on the Elixir walk precedent). Builds basic blocks + control-flow edges (branch/loop/early-exit/back edges). Returns pure data — no AST node references survive, so it is WASM-tree-lifetime safe.
  • call-graph.ts — the in-scope extractors (TS/JS, Python, Go) build the overlay inside the live-tree window; the extractor return contract gains an optional cfg; CallGraphResult carries a transient cfgs map that is not added to SerializedCallGraph/the resident graph. Other languages fail soft (no overlay, no error).
  • edge-store.ts — new cfg_overlay table (one JSON blob per function id), SCHEMA_VERSION 6→7 (drop-and-rebuild, zero migration). Lazy getCfg + per-file deleteCfgForFile/insertCfgs.
  • artifact-generator.ts — persists the overlay to SQLite and strips it from llm-context.json so it never becomes resident.
  • mcp-watcher.ts — recomputes only the changed file's overlay rows inside the existing per-file swap transaction (intra-procedural ⇒ caller files untouched).

Phase 2 — reaching definitions (commit 1)

  • A classical intra-procedural reaching-definitions fixpoint over the CFG produces precision-labeled def-use edges: exact for sound local-scalar def-use, may for conservatively over-approximated field/subscript/closure dependences.
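
As an illustration of the fixpoint described above, here is a minimal sketch of classical reaching definitions over basic blocks. The `Def` and `Block` shapes and the function name are assumptions made for this example, not the types in cfg.ts, and the sketch leaves out the exact/may labeling and the scope handling the PR adds.

```typescript
// Hypothetical shapes for illustration only; the real types in cfg.ts may differ.
interface Def { id: number; name: string; line: number }
interface Block { id: number; defs: Def[]; succs: number[] }

// Classical forward may-analysis:
//   IN[b]  = union of OUT[p] over all predecessors p
//   OUT[b] = GEN[b] union (IN[b] minus KILL[b])
// Returns IN[b] for every block: the definition ids that may reach its start.
function reachingDefinitions(blocks: Block[]): Map<number, Set<number>> {
  // All definition ids per variable name, used to build the KILL sets.
  const defsByName = new Map<string, Set<number>>();
  for (const b of blocks) {
    for (const d of b.defs) {
      const ids = defsByName.get(d.name) ?? new Set<number>();
      ids.add(d.id);
      defsByName.set(d.name, ids);
    }
  }

  const preds = new Map<number, number[]>();
  for (const b of blocks) preds.set(b.id, []);
  for (const b of blocks) for (const s of b.succs) preds.get(s)!.push(b.id);

  const gen = new Map<number, Set<number>>();
  const kill = new Map<number, Set<number>>();
  for (const b of blocks) {
    const lastDef = new Map<string, number>(); // only the last def of a name leaves the block
    const killed = new Set<number>();
    for (const d of b.defs) {
      lastDef.set(d.name, d.id);
      for (const other of defsByName.get(d.name)!) killed.add(other);
    }
    gen.set(b.id, new Set<number>(lastDef.values()));
    kill.set(b.id, killed);
  }

  const inSets = new Map<number, Set<number>>();
  const outSets = new Map<number, Set<number>>();
  for (const b of blocks) {
    inSets.set(b.id, new Set<number>());
    outSets.set(b.id, new Set<number>(gen.get(b.id)!));
  }

  // The sets only grow, so comparing sizes is enough to detect a change.
  let changed = true;
  while (changed) {
    changed = false;
    for (const b of blocks) {
      const inSet = new Set<number>();
      for (const p of preds.get(b.id)!) for (const d of outSets.get(p)!) inSet.add(d);
      const out = new Set<number>(gen.get(b.id)!);
      for (const d of inSet) if (!kill.get(b.id)!.has(d)) out.add(d);
      if (out.size !== outSets.get(b.id)!.size) changed = true;
      inSets.set(b.id, inSet);
      outSets.set(b.id, out);
    }
  }
  return inSets;
}
```

On a diamond-shaped CFG where the entry block and one branch both define `x`, both definitions reach the join block, which is the "both-branches-reach" behavior the unit tests describe.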

Step 5 — value-level opt-in (commit 2)

  • valueReachableLines() — a pure forward data-flow slice over the def-use edges (the value's impact set).
  • analyze_impact gains valueLevel/valueParam: narrows the downstream blast radius to the direct callees whose argument lines are data-dependent on the targeted value (cross-call hop labeled may), expanding forward from them. trace_execution_path gains the same flags (restricts each entry's first hop). Both fall back to function granularity (no error) when the function has no overlay. Plumbed through dispatch + tool schemas; nav-preset payload ceiling bumped 11800→12300 (spec-28 precedent).
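
A forward slice over def-use edges can be sketched as a small worklist. The `DefUseEdge` shape and the `forwardSlice` name below are hypothetical; the only detail taken from this PR is the idea of a `useInDefLine` tag that records which definition a read feeds. Keying by line alone is a simplification that over-approximates when one line holds several definitions.

```typescript
// Hypothetical edge shape; the stored overlay format may differ.
interface DefUseEdge {
  name: string;
  defLine: number;        // line where the value is defined
  useLine: number;        // line where the value is read
  useInDefLine?: number;  // start line of the definition this read feeds, if any
  precision: "exact" | "may";
}

// Collects every line that reads a value derived from the seed definitions.
function forwardSlice(edges: DefUseEdge[], seedDefLines: number[]): Set<number> {
  const reached = new Set<number>();
  const taintedDefs = new Set<number>(seedDefLines);
  const work = [...seedDefLines];
  while (work.length > 0) {
    const defLine = work.pop()!;
    for (const e of edges) {
      if (e.defLine !== defLine) continue;
      reached.add(e.useLine);
      // If this read feeds another definition, that definition is tainted too.
      const fed = e.useInDefLine;
      if (fed !== undefined && !taintedDefs.has(fed)) {
        taintedDefs.add(fed);
        work.push(fed);
      }
    }
  }
  return reached;
}
```

Chaining on the fed definition's start line, rather than on the line of the read, is what lets a parameter read inside a multi-line initializer still taint the variable being initialized.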

Hardening (dogfooding on real code)

After the initial implementation, the overlay was dogfooded on the real OpenLore corpus, which surfaced five soundness/precision bugs in the CFG/def-use builder — all now fixed and regression-tested:

  1. try/catch treated as straight-line — a catch-body def spuriously killed the try-body def. Now modeled as alternative paths from a common predecessor (both reach the join; finally runs from the merge).
  2. switch treated as straight-line — cases linearly killed each other. Now alternative branches with language-correct fall-through (switchFallsThrough: TS/JS yes, Go/Python no).
  3. closure captures labeled exact — the spec requires may. Nested functions are no longer descended into as the outer CFG; their free-variable reads become may captures and their own params/locals no longer leak.
  4. Python if/elif/else collapsed to a single alternative (childForFieldName returns only the first) — later elif/else branches were dropped. Now the full elif chain is built.
  5. Destructuring (const { a, b } = obj, [x, y], { a: x }, { a = default }) produced no defs. Now each binding leaf is a definition.
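
To make item 2 concrete, the following sketch builds switch edges as alternative branches from one dispatch block. The function, its parameters, and the block ids are assumptions for illustration; only the idea of a per-language `switchFallsThrough` flag comes from the list above.

```typescript
type Edge = [string, string]; // [from block id, to block id]
interface CaseArm { id: string; endsWithBreak: boolean }

// Every case is an alternative successor of the dispatch block `head`.
// A case flows into the next one only when the language falls through
// and the case body does not end with a break.
function switchEdges(
  head: string,
  arms: CaseArm[],
  join: string,
  fallsThrough: boolean,
  hasDefault: boolean,
): Edge[] {
  const edges: Edge[] = [];
  arms.forEach((arm, i) => {
    edges.push([head, arm.id]);
    const next = i + 1 < arms.length ? arms[i + 1] : undefined;
    if (fallsThrough && !arm.endsWithBreak && next !== undefined) {
      edges.push([arm.id, next.id]);
    } else {
      edges.push([arm.id, join]);
    }
  });
  // Without a default case, control can skip every arm.
  if (!hasDefault) edges.push([head, join]);
  return edges;
}
```

Because each arm hangs off the dispatch block instead of the previous arm, a definition in one case no longer kills a definition in a sibling case unless a real fall-through edge connects them.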

Verified deterministic across repeated analyses of the real corpus (identical overlay hash), with branch/merge structure and may edges present on actual code.

Exhaustive all-language verification (every language OpenLore supports)

Before ship, every one of OpenLore's 19 detectable languages was verified to either produce a correct overlay or fail soft cleanly without breaking anything — proven by three parallel agents on ~30 real repos (~50,000 overlays) plus a CI-protected contract guard.

  • 7 fail-soft code languages (Kotlin, Swift, Scala, Dart, Lua, Elixir, Bash) on real repos (okhttp, Alamofire, playframework, flutter/samples, lua-repl, phoenix, nvm): exit 0, no crash, full call graph (up to 7,094 nodes), 0 overlay rows for the target language, DB integrity OK, search intact. The Lua/Bash → extractByQueries → buildCfgFor path fails soft cleanly. 0 invariant violations across 5,449 incidental overlays from co-located supported-language files.
  • 5 IaC types (Terraform, Kubernetes, Helm, CloudFormation, Ansible) + polyglot repos (apache/spark, kubernetes): analyze cleanly, 0 overlay rows from IaC (the no-overlay invariant holds even mixed into Go/Bash/Python trees), deterministic, 0 invariant violations, no resident-artifact leak.
  • CI-protected language-contract guard: a test locks cfgSupportsLanguage to exactly the 11 overlay languages and asserts buildFunctionCfg returns undefined (never throws) for every other language — the fail-soft contract can't silently regress.
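
The fail-soft contract in the last bullet can be pictured with a sketch like the one below. The language identifier strings and the `buildOverlayFailSoft` wrapper are assumptions for this example; the PR names `cfgSupportsLanguage` and `buildFunctionCfg` but does not show their signatures here.

```typescript
// Assumed language identifiers; the analyzer's real keys may be spelled differently.
const OVERLAY_LANGUAGES: ReadonlySet<string> = new Set<string>([
  "typescript", "javascript", "python", "go", "java", "cpp",
  "rust", "ruby", "c", "csharp", "php",
]);

function cfgSupportsLanguage(language: string): boolean {
  return OVERLAY_LANGUAGES.has(language);
}

// Hypothetical wrapper: an unsupported language and a builder failure both
// yield undefined (no overlay) instead of an exception.
function buildOverlayFailSoft<T>(language: string, build: () => T): T | undefined {
  if (!cfgSupportsLanguage(language)) return undefined;
  try {
    return build();
  } catch {
    return undefined;
  }
}
```

A guard test in this style would pin the set to exactly 11 entries and assert `undefined` for every other language, so adding a language by accident fails CI.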

Watcher bug found and fixed (real "breaking existing tooling")

Being methodical surfaced a genuine pre-existing bug: .c/.cs/.php/.kt files are in the watcher's SOURCE_EXTENSIONS but were missing from its CALL_GRAPH_LANGS, so editing one made buildGraphSubset return empty and the per-file swap wiped that file's nodes/edges (and, with the overlay, its cfg_overlay rows) in watch mode until the next full analyze. Added C/C#/PHP/Kotlin to the watcher's graph set so edits re-graph and re-overlay instead of wiping (grammars are optional deps → fail soft to empty if absent, identical to full analyze). Verified + regression-tested.

Broad re-validation: 11 overlay languages on 11 fresh real repos

A third agent re-validated all 11 overlay languages on new repos (got, lodash, click, gin, commons-lang, fmt, serde, jekyll, curl, automapper, laravel-framework): ~46,700 overlays / ~190,000 def-use edges, 0 structural violations, determinism bit-identical (clean-rebuild hashes equal), and no unsound-exact found in source-level soundness spot-checks. The dangerous cases are correctly downgraded to may on real code — C++ &r passed to an intrinsic, JS closure rebinds, Ruby statement modifiers, C pointer arithmetic. laravel (28,276 overlays) is a clean large-scale stress test.

It surfaced one C/C++ imprecision, now fixed: a write through a pointer (*p = x) was recorded as an exact def of the binding p (the pointer_expression fell through to the inner identifier), mislabeling provenance and killing p's real def. A wrong exact provenance label is precisely the failure mode the overlay must never emit, so *p = x is now handled like a member/subscript write — base pointer is a use, the l-value *p a conservative may def, never an exact reassign of p. Regression-tested.

Language coverage (verified)

| Overlay-supported (sound, adversarially validated) | Fail-soft (safe: full call graph, 0 overlay rows, no crash) |
| --- | --- |
| TypeScript, JavaScript, Python, Go, Java, C++, Rust, Ruby, C, C#, PHP | Kotlin, Scala, Swift, Lua, Bash, Elixir, Dart |

A regression+fail-soft safety agent confirmed the overlay breaks nothing for any of OpenLore's 18 languages — analyze exits 0, the no-overlay languages keep their full call-graph nodes/edges with 0 overlay rows, no resident leak, 0 invariant violations. The 11 overlay languages are the ones whose grammars expose the fields the (field-centric) builder needs and that adversarial agents validated as sound on real repos. Kotlin/Scala/Swift have field-less/positional grammars (e.g. Kotlin if exposes no condition/consequence fields) — adapting the builder to those is bug-prone, so they (and the deferred Lua/Bash/Elixir/Dart) stay fail-soft rather than risk an unsound overlay. Covered safely — never wrong data.

Adversarial validation of C/C#/PHP on real repos (sds, cecil, FastRoute): scale, invariants, and determinism clean across 2,704 overlays (0 violations, byte-identical hashes). C was fully sound on every probe. Four unsound-exact issues in PHP/C# escape detection were found on real code and fixed: PHP anonymous closures leaked their body (stale grammar node name anonymous_function_creation_expression → anonymous_function), PHP $r = &$x reference aliasing, PHP use (&$x) by-ref capture, and C# ref/out arguments (pervasive in cecil's TryGet(out …) pattern). After the fix, cecil's may-rate rose 378→987 (the previously-unsound exacts are now correctly conservative) and the FastRoute closure leak is gone.

Language extension round (7 languages)

Extended the overlay beyond the v1 TS/JS·Python·Go set to Java, C++, Rust, and Ruby — the other native-grammar languages OpenLore covers. The builder is spec-driven (a per-language CfgLangSpec of node-type names), so each language is a verified spec plus the infrastructure it needed:

  • callNameField — Java uses a name field for the callee (Ruby method), so the receiver/args are still read while the method name is skipped.
  • Compound-operator detection by text — Java/C/C++ use one node type for = and +=; a compound assignment now correctly reads its target first.
  • caseParts() — per-language switch/match/case decomposition: Rust match_arm and Ruby when/else carry the body in a field, Java groups statements under switch_label markers, C/C++ use case_statement. All modeled as parallel alternatives so cases never kill each other (verified sound on every language).
  • Expression-statement unwrap for Rust's expression-oriented control flow (if/while/match are wrapped in expression_statement), plus else/then/do/body_statement block types and declarator/pattern/value field fallbacks for the new grammars.

Node types were determined empirically against each grammar (never guessed). All the soundness machinery carries over: closure mutation → may, member/subscript → may, escape analysis, scope resolution, do/while post-test, fail-soft caps. Verified end-to-end (overlays stored for all four), deterministic, full suite 3598 passing with per-language tests. Unsupported languages still fail soft (no overlay).

Adversarial validation on real repos. Two agents stress-tested the four new languages on gson, nlohmann/json, ripgrep, and sinatra (the same rigor applied to TS/Py/Go). Scale, structural invariants, and determinism were clean across 5,601 overlays (0 violations, byte-identical hashes). Rust came back fully sound (97.3% exact, correct on shadowing/re-let/if-as-expression/closures). Four unsound-exact issues found and fixed — Java 14+ arrow switch (case N -> {}), C++ reference aliasing (int& r = x) and &x address-of escape, and Ruby statement modifiers (x = 2 if c, live in sinatra) — plus two coverage gaps (C++ parameters were nested under function_declarator; Java enhanced-for loop var). All verified with per-language tests; may correctly rose where aliasing was newly detected.

Production-readiness round (guarantees proven, not asserted; scale-tested)

A final pass verified the guarantees we'd claimed but not measured, and stress-tested at scale via three more agents:

  • Default output is byte-identical — diffed pre-feature main vs the feature branch on a real repo: analyze_impact/select_tests/get_subgraph/trace_execution_path/find_dead_code outputs are identical to the byte (35,621 bytes, zero diff). The overlay's presence changes nothing for an agent that doesn't opt in.
  • Zero new resident memory: llm-context.json is byte-identical (171,899 bytes) with the feature; the resident callGraph carries no cfgs/defUse. The overlay is DB-only and lazily loaded.
  • MCP protocol end-to-end — verified over real JSON-RPC stdio: tools/list exposes valueLevel/valueParam with correct schemas, value-level narrows over the wire, default calls omit the field, invalid input degrades gracefully, and the server exits clean.
  • Scale & determinism — analyzed the TypeScript compiler (src/compiler, 11.2k fns), Django (12.1k), Kubernetes (pkg/, 37.9k Go fns), and a mixed TS+Py+Go tree. 56,441 overlays, 0 invariant violations, perfect determinism (identical content hashes across runs), no crash/hang. Overhead is linear ~0.7–0.9 ms/function (~17–34% of analyze wall-clock); the fail-soft caps never fired on real code. ~88% exact / 12% may at scale.

Two issues found this pass, both fixed:

  • do/while unsound-exact — it was modeled as a pre-test loop, leaking a pre-loop def into the condition/post-loop as exact. Now post-test wired (body runs first), so the body's defs correctly kill the pre-loop def. (Plain while/for unchanged.)
  • Decorated Python functions got no overlay: buildCfgFor's body-dig didn't accept Python's inner function_definition under a decorated_definition wrapper, so @property/@cached_property/Django-view functions were silently skipped. Fixed; verified end-to-end.
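
The do/while fix comes down to which edges the loop gets. The sketch below contrasts the two wirings with hypothetical block ids and helper names, and uses a small reachability check to show why a pre-loop definition can skip the body in a pre-test loop but not in a post-test loop.

```typescript
type Edge = [string, string]; // [from block id, to block id]

// Pre-test loop: the condition is checked before the body ever runs.
function whileEdges(pre: string, cond: string, body: string, exit: string): Edge[] {
  return [[pre, cond], [cond, body], [body, cond], [cond, exit]];
}

// Post-test loop: the body runs once before the condition is checked.
function doWhileEdges(pre: string, body: string, cond: string, exit: string): Edge[] {
  return [[pre, body], [body, cond], [cond, body], [cond, exit]];
}

// True if `to` can be reached from `from` without passing through `avoid`.
function reachableAvoiding(edges: Edge[], from: string, to: string, avoid: string): boolean {
  const seen = new Set<string>([from]);
  const work = [from];
  while (work.length > 0) {
    const cur = work.pop()!;
    if (cur === to) return true;
    for (const [a, b] of edges) {
      if (a === cur && b !== avoid && !seen.has(b)) {
        seen.add(b);
        work.push(b);
      }
    }
  }
  return false;
}
```

With the pre-test wiring there is a path from the pre-loop block to the exit that avoids the body, so a pre-loop definition legitimately reaches post-loop uses. With the post-test wiring no such path exists, so a body definition of the same variable always kills it.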

Wider audit round (3 agents: value-level, robustness, deep idioms)

A third wave of parallel agents stress-tested the consumer seam, robustness, and deep language idioms. Four issues found, all fixed:

  • Determinism (critical): the callee-skip compared tree-sitter nodes by object identity (c === fn), but native tree-sitter returns fresh wrapper objects per access path — so the overlay flickered ~50% of builds. Fixed with a position-based sameNode() comparator. Now 25–30 identical builds and identical full-analyze hashes; a strong many-build determinism test guards the class (the old 2-build test missed it).
  • False narrowing (value-level, agent-misleading): valueReachableLines chained taint by exact line co-location, so a parameter feeding a multi-line initializer (const ctx = {\n x: p \n}) dropped every downstream that object reached — telling an agent a change was safe when it wasn't (reproduced on real zod safeParseAsync). Fixed by tagging RHS uses with the fed def's start line (useInDefLine) and chaining on it; verified handleResult now surfaces on the real zod function.
  • Unsound exact: Python global / same-level nonlocal were exact despite hidden mutation; now may.
  • Robustness: the reaching-defs fixpoint was quadratic on deeply-nested loops (47s at 700-deep); now capped (128 sweeps / 4000 blocks) and fails soft to no overlay in <2s. Plus Python with ... as binding defs.
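
The determinism fix replaces object identity with a position comparison. A minimal sketch, assuming only that nodes expose `startIndex` and `endIndex` as tree-sitter nodes do; the `NodeSpan` interface is introduced here for illustration.

```typescript
// Minimal structural view of a syntax node. Only the two byte offsets are used.
interface NodeSpan { startIndex: number; endIndex: number }

// Compares nodes by source position instead of object identity, because
// two wrapper objects for the same underlying node are not `===`.
function sameNode(a: NodeSpan | null | undefined, b: NodeSpan | null | undefined): boolean {
  if (a == null || b == null) return false;
  return a.startIndex === b.startIndex && a.endIndex === b.endIndex;
}
```

One caveat of a span-only comparison is that a wrapper node and its single child can cover the same byte range, so callers that need to tell those apart would have to compare something more, such as the node type.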

Also widened test coverage to the integration seams: a watcher edit recomputes the overlay byte-equal to a fresh build, and the cross-call may label is asserted. Final sweep across zod/cobra/requests: 1,596 overlays, 0 violations, 0 key leaks, ~91% exact / 9% may, deterministic.

Deep correctness round (adversarial audit + real-repo dogfooding)

Two parallel agents — a real-repo stress test and an independent soundness audit — hunted for the one thing that must never happen: an exact label that's actually wrong. They found a cluster of bugs, all now fixed and regression-tested:

  • Lexical scope resolution (the big one): name-based reaching-defs conflated shadowed variables, so an inner-block let/const/:= (TS/Go) or a Python comprehension loop var reached an outer same-named use as exact and dropped the correct outer edge. A resolveScopes pre-pass now keys every identifier by name#scopeId; reaching-defs groups by the scoped key while emitting the bare name. Inner shadows no longer kill or link to outer variables.
  • x++/x-- are now recorded as a use+def (were invisible) and detected as closure mutations.
  • Go closure escape now descends expression_list LHS (x = 5), inc/dec, and address-of taken inside a closure — all correctly downgrade the outer var to may.
  • Python try/except/else — the else runs on the no-exception path after the try body; previously ignored (lost its def, kept the overwritten try def).
  • Labeled loops: labeled_statement is unwrapped so the loop keeps its back-edge and loop-carried dependence.
  • Go loop headers — Go wraps the header in a for_clause/range_clause child, so loop counters and range vars were invisible (and a range var could alias an outer same-name as exact). A loopHeaderField helper now descends into the clause; counters/range vars are defined, loop-carried, and scoped.

A follow-up verification audit confirmed all fixes hold and TS/Python scope resolution is clean; it surfaced the Go loop-header gap above, now fixed. Validated on zod, requests, and cobra (1,596 overlays): 0 structural-invariant violations, ~90% exact / 10% may, 0 scope keys leaked into the output across 9,634 edges, deterministic across repeated runs. The earlier express/flask/mux/ky/type-fest sweep adds another 1,000+ overlays clean.

Correctness & real-world validation

The exact precision label is the load-bearing safety signal — a wrong exact is the one thing that taxes an agent with incorrect context, so it gets adversarial scrutiny:

  • Escape analysis — a local scalar reassigned inside a nested closure, or a Go local whose address is taken (&x), can change out of band. Both were silently labeled exact; they are now downgraded to may. The escape set is over-approximated on purpose (a false inclusion only weakens precision, never yields an unsound exact); read-only closure captures are not downgraded, so precision is preserved where sound. exact now means soundly exact for the supported languages.
  • Structural safety net: isStructurallyValid runs on every built CFG (one entry/exit, edge endpoints reference real blocks, positive def/use lines, valid labels). A violation makes the builder emit no overlay rather than wrong context, so future grammar drift or builder bugs degrade safely.
  • Real-world corpus — validated on four diverse third-party repos (express/JS, flask/Python, gorilla-mux/Go, ky/TS): all analyzed with no crashes, and a structural-invariant sweep over all 457 produced overlays found 0 violations.
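
A structural guard of the kind described in the second bullet might look like the sketch below. The `Overlay` shape and the `overlayOrNothing` helper are assumptions; the checks mirror the invariants listed above (one entry and one exit, edge endpoints that exist, positive lines, valid labels) and nothing more.

```typescript
// Hypothetical overlay shape; field names are assumptions, not the stored blob format.
interface Overlay {
  blocks: { id: number; kind: "entry" | "exit" | "normal" }[];
  edges: { from: number; to: number }[];
  defUse: { defLine: number; useLine: number; precision: string }[];
}

function isStructurallyValid(o: Overlay): boolean {
  const ids = new Set<number>(o.blocks.map((b) => b.id));
  const count = (kind: string): number => o.blocks.filter((b) => b.kind === kind).length;
  if (count("entry") !== 1 || count("exit") !== 1) return false;
  if (!o.edges.every((e) => ids.has(e.from) && ids.has(e.to))) return false;
  return o.defUse.every(
    (d) =>
      Number.isInteger(d.defLine) && d.defLine > 0 &&
      Number.isInteger(d.useLine) && d.useLine > 0 &&
      (d.precision === "exact" || d.precision === "may"),
  );
}

// Fail soft: an invalid overlay is dropped instead of being stored.
function overlayOrNothing(o: Overlay): Overlay | undefined {
  return isStructurallyValid(o) ? o : undefined;
}
```

The design choice is that the guard never repairs anything. It only decides between storing the overlay and storing nothing, which keeps a builder bug from turning into wrong context.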

Design posture: the overlay is opt-in, DB-only, fail-soft, and label-gated — an agent that doesn't request value-level precision sees byte-identical output and zero new data; one that does gets an honest exact/may signal, never an over-claimed dependence.

Verification

  • Unit: cfg.test.ts — every spec scenario (branch/join, loop back edge, early-return path termination, determinism, fail-soft; local-scalar def-use, reassignment-kills, both-branches-reach, field write labeled may, exact/may distinguishable) across TypeScript, Python, and Go.
  • Storage: cfg-overlay-storage.test.ts — overlay loadable from store, not in the resident serialized graph, schema bump rebuilds without migration, per-file delete is isolated.
  • Consumer: graph.test.ts value-level block — default unchanged, narrows to the data-dependent callee, falls back when no overlay.
  • Full CI-mirror suite (vitest run src examples): 3555 passed, 2 skipped.
  • End-to-end through the compiled CLI on a real TS/Python/Go repo: schema v7; overlay persisted for every in-scope-language function (542/650 non-external nodes — every node lacking one is a non-TS/Py/Go fixture = correct fail-soft); overlay ~1.1 KB median/function, absent from llm-context.json (resident memory unchanged); value-level analyze_impact on param a narrows downstream to exactly the data-dependent callee.

Spec / decisions

Spec delta merged into openspec/specs/analyzer/spec.md + mcp-handlers/spec.md via approved decisions c8f2b9bf (overlay-in-live-tree-extractors) and b6f04199 (value-level opt-in). ADR adr-0007.

Out of scope (as specified)

Inter-procedural/whole-program data-flow, sound aliasing/points-to, control-dependence/slicing as first-class outputs, statement-level nodes in the resident graph, and languages beyond TS/JS·Python·Go in v1.

🤖 Generated with Claude Code

clay-good and others added 2 commits June 12, 2026 10:35
…Phase 1+2)

Add a deterministic, per-function control-flow graph and reaching-definitions
(def-use) overlay computed from the AST while the parse tree is live, stored as a
compact per-function blob in a new DB-only table. No LLM; AST shape + a classical
fixpoint. Default behavior of every existing tool is unchanged.

- cfg.ts: CFG builder (basic blocks + branch/loop/early-exit edges) and an
  intra-procedural reaching-definitions fixpoint producing precision-labeled
  (exact|may) def-use edges. WASM-safe: returns pure data.
- call-graph.ts: extend in-scope extractors (TS/JS, Python, Go) to build the
  overlay in the live-tree window; return contract gains optional cfg;
  CallGraphResult carries a transient cfgs map (NOT in SerializedCallGraph).
- edge-store.ts: new cfg_overlay table, schema 6->7 (drop-and-rebuild, zero
  migration); lazy getCfg + per-file delete/insert for incremental recompute.
- artifact-generator.ts: persist overlay to SQLite; strip from llm-context.json.
- mcp-watcher.ts: recompute only the changed file's overlay rows in the swap.

Tests: cfg.test.ts, cfg-overlay-storage.test.ts.

Decision: c8f2b9bf (synced). Gate bypassed for 8 unrelated pre-existing
backlog decisions from other in-flight proposals.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…h (Step 5)

Add opt-in value/parameter-granularity precision to the two impact/tracing tools,
backed by the reaching-definitions overlay. Strictly opt-in — with the flag absent
the result is byte-for-byte the function-granularity answer.

- cfg.ts: valueReachableLines() — a pure forward data-flow slice over the def-use
  edges (the value's impact set), seeded from a parameter/local or all params.
- graph.ts: handleAnalyzeImpact gains valueLevel/valueParam — narrows the
  downstream blast radius to the direct callees whose argument lines are
  data-dependent on the targeted value (cross-call hop labeled may), expanding
  forward from them. handleTraceExecutionPath gains the same flags — restricts
  each entry's first hop to data-dependent callees. Both fall back to function
  granularity (no error) when the function has no overlay.
- tool-dispatch.ts + mcp.ts: plumb valueLevel/valueParam through dispatch and the
  tool schemas. Nav preset payload ceiling bumped 11800->12300 (spec-28 precedent).

Tests: graph.test.ts value-level block (default unchanged, narrows to the
data-dependent callee, falls back when no overlay). E2E verified through the
compiled CLI on a real TS/Python/Go repo: overlay persisted for all 7 functions,
schema v7, llm-context.json carries no overlay, value-level narrows correctly.

Decision: b6f04199 (synced).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@clay-good clay-good changed the title from "feat(analyzer): intra-procedural CFG + reaching-definitions (def-use) overlay" to "feat: intra-procedural CFG + reaching-definitions (def-use) overlay, with value-level opt-in" Jun 12, 2026
clay-good and others added 14 commits June 12, 2026 10:51
Dogfooding the overlay on real TypeScript surfaced three soundness/precision
bugs, all now fixed and regression-tested:

- try/catch and switch were treated as straight-line statements, so a catch-body
  or later-case definition spuriously KILLED the try-body / earlier-case
  definition — omitting a real reaching-def (a soundness violation). They now
  build branch+merge structure: catch clauses are alternative paths from the same
  predecessor (both reach the join; finally runs from the merge); switch cases are
  alternative branches with language-correct fall-through (switchFallsThrough:
  TS/JS yes, Go/Python no), so cases don't linearly kill each other.
- closure captures of outer variables were labeled `exact`; the spec requires
  `may`. Nested functions are no longer descended into as the outer CFG — their
  free-variable reads become `may` closure-capture uses, and their own
  params/locals no longer leak as outer uses.

Verified on the real corpus: deterministic (identical overlay hash across two
analyses), 233 branch / 241 merge blocks now present. Full suite: 3559 passed.

Decision: 2c0d04e3 (synced).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two more soundness bugs found by dogfooding, fixed and regression-tested:

- Python if/elif/else collapsed to a single alternative: childForFieldName
  ('alternative') returns only the first elif_clause, so every later elif and the
  trailing else were dropped from the CFG — their definitions never reached
  downstream uses. processIf now collects all elif_clause children and the
  else_clause and builds the full branch chain. TS else-if (nested if in an
  else_clause) is unaffected — the new path triggers only when elif_clause exists.
- Destructuring bindings produced no defs: `const { a, b } = obj` exposes each
  name as a childless shorthand_property_identifier_pattern that the generic
  pattern recurse dropped; `{ a: x }` pairs and `{ a = default }` defaults were
  also mishandled. recordTarget now binds shorthand/array/pair/default pattern
  leaves as definitions (default values recorded as uses).

Verified deterministic on the real corpus. Full suite: 3562 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The `exact` precision label is what an agent trusts; a wrong `exact` is
actively misleading context. Adversarial probing found two unsound-exact holes:

- a local scalar reassigned inside a nested closure that is then invoked — the
  value at a later read may have changed out of band, yet was labeled `exact`;
- a Go local whose address is taken (`&x`) and mutated through the pointer.

A new collectEscapedVars pre-pass marks names assigned inside any nested closure
(excluding the closure's own params/locals) and names whose address is taken;
computeReachingDefs forces `may` for those. The set is over-approximated on
purpose — a false inclusion only weakens precision, never yields an unsound
`exact`. Read-only closure captures are NOT downgraded (verified), so precision
is preserved where it is sound.

`exact` now means soundly exact for the supported languages. Deterministic on the
real corpus; full suite 3565 passed.

Decision: 8192f32f (synced).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make a corrupt overlay degrade to NO overlay rather than wrong context: a new
isStructurallyValid guard runs on every built CFG (exactly one entry/exit, all
edge endpoints reference real blocks, def/use lines positive, precision labels
valid, params well-formed). If any invariant fails, buildFunctionCfg emits no
overlay — the safe fallback an agent can trust.

Validated against four diverse real-world repos (express/JS, flask/Python,
gorilla-mux/Go, ky/TS): all analyzed with no crashes, and a structural-invariant
sweep over all 457 produced overlays found 0 violations. The guard does not
reject any valid overlay (coverage unchanged).

Full suite: 3567 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two completeness bugs (omitting a real dependence = wrong info to an agent):

- Logical assignment `x ||= e` / `&&=` / `??=` assigns only CONDITIONALLY, so the
  prior value can survive — but it was treated as an unconditional def that killed
  the earlier one, dropping the old def from reaching a later use. Added a `weak`
  (non-killing) def kind: weak defs accumulate alongside prior defs in GEN and do
  not contribute to KILL, so both reach. Plain `=` and augmented `+=` (which always
  assign) still kill, verified.
- Python walrus `n := value` (named_expression) recorded no definition at all, so
  data flow through it was invisible. Now records the embedded def + value uses.
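
The difference between a killing and a non-killing definition can be shown with a per-statement transfer function. This is a sketch under assumed names (`Def`, `DefKind`, `transfer`); it tracks reaching definition ids per variable and is not the GEN/KILL bookkeeping the commit itself implements.

```typescript
type DefKind = "strong" | "weak";
interface Def { id: number; name: string; kind: DefKind }

// Applies a sequence of definitions to the reaching set of each variable.
// A strong def (`=`, `+=`) replaces the set; a weak def (`||=`, `&&=`, `??=`)
// is added next to the definitions that were already reaching.
function transfer(incoming: Map<string, Set<number>>, defs: Def[]): Map<string, Set<number>> {
  const out = new Map<string, Set<number>>();
  for (const [name, ids] of incoming) out.set(name, new Set<number>(ids));
  for (const d of defs) {
    const prior = out.get(d.name) ?? new Set<number>();
    const next = d.kind === "weak" ? new Set<number>(prior) : new Set<number>();
    next.add(d.id);
    out.set(d.name, next);
  }
  return out;
}
```

For `x = 1; x ||= 2; use(x)` both definitions reach the use, because the logical assignment only runs when the old value is falsy.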

Loop-carried dependencies (back-edge propagation) re-verified correct. Deterministic
on the real corpus; full suite 3570 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two parallel adversarial audits (real-repo stress + soundness review) found that
name-based reaching-defs conflated shadowed variables, producing the worst kind of
bug — unsound `exact` edges fed to an agent. Fixes:

- Lexical scope resolution: a resolveScopes pre-pass keys every identifier by
  `name#scopeId` (nearest declaring scope; else root). Reaching-defs groups KILL/
  reach by the scoped key, so an inner-block `let`/`const`/`:=` (TS/Go) or a Python
  comprehension loop var no longer kills or links to the outer same-named variable.
  Both the wrong `exact` edge and the dropped correct edge are fixed.
- `x++`/`x--` recorded as a use+def (was invisible) and detected as a closure
  mutation.
- Go closure-escape: descend `expression_list` LHS (`x = 5`), inc/dec, and
  address-of taken inside a closure — all now downgrade the outer var to `may`.
- Python `try/except/else`: the else runs on the no-exception path after the try
  body; previously ignored (lost its def, kept the overwritten try def).
- Labeled loops: unwrap `labeled_statement` so the loop keeps its back-edge and
  loop-carried dependence.

Verified deterministic on zod/requests/cobra: 1596 overlays, 0 invariant
violations, ~90% exact / 10% may. Full suite 3570 passed; 9 new scope/idiom tests.

Decision: 84cc2af4 (synced).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A follow-up verification audit confirmed all prior fixes hold and TS/Python are
clean, but found one regression class: Go loop-header variables were invisible.
Go wraps the header in a for_clause (C-style) or range_clause (range) child, so
`childForFieldName('initializer'/'left'/'condition'/...)` on the for_statement
returned undefined — loop counters and range vars were never recorded as defs nor
added to the loop scope.

Worst case was UNSOUND-EXACT: `i := 100; for i, v := range xs { _ = i }` made the
loop body's `i` resolve to the outer `i = 100` as exact, because the range `i` was
neither scoped nor defined.

Fix: a loopHeaderField() helper descends into the for_clause/range_clause before
reading the header fields; used by recordLoopHeader, processLoop (condition), and
scopeDeclaredNames. Now Go counters/range vars are defined, loop-carried, and
scoped — the range `i` binds to the loop, never the outer.

Verified: cobra (Go) 270 overlays, 0 violations, no key leaks; deterministic.
Full suite 3581 passed; 2 new Go loop tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…overage

Widen test coverage to the integration seams:
- Watcher consistency: a real file edit through McpWatcher.handleChange recomputes
  the CFG/def-use overlay and the persisted result is byte-equal to a fresh full
  build of the new content (intra-procedural ⇒ incremental == full), with the
  stale overlay gone. Closes the OverlayStorageAndIncrementality "single-file edit
  recomputes" scenario end-to-end.
- Cross-call labeling: assert the value-level result labels the cross-procedure
  hop `may` (spec: DataFlowProvenanceLabeling "Cross-call dependence is labeled
  may"), the last spec scenario without a direct test.

Full suite green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ail-soft

A wider three-agent audit (value-level consumer, fuzz/robustness, deep idioms)
surfaced four issues — all fixed:

- DETERMINISM (critical): callee-skip/case-value exclusion compared tree-sitter
  nodes by object identity (`c === fn`), but native tree-sitter returns fresh
  wrapper objects per access path — so the overlay flickered ~50% of builds. Added
  a sameNode() position comparator (startIndex+endIndex); node identity is never
  safe. Now 25-30 identical builds + identical full-analyze hashes.
- FALSE NARROWING (value-level, agent-misleading): valueReachableLines chained
  taint by exact line co-location, so a parameter feeding a MULTI-LINE initializer
  (`const ctx = {\n x: p \n}`) dropped every downstream the object reached — telling
  an agent a change was safe when it wasn't (reproduced on real zod). Fixed by
  tagging RHS uses with the fed def's start line (useInDefLine) and chaining on it.
- UNSOUND-EXACT: Python `global`/same-level `nonlocal` were `exact` despite hidden
  mutation; now downgraded to `may` via the escaped set.
- ROBUSTNESS: the reaching-defs fixpoint was quadratic on deeply nested loops (47s
  at 700-deep); capped at 128 sweeps + 4000 blocks, failing soft to no overlay
  (<2s). Plus Python `with ... as` binding defs.
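
The determinism fix can be sketched as follows. This assumes only that nodes expose `startIndex` and `endIndex`, as the commit states; the `Positioned` type and the sample objects are hypothetical, and the real `sameNode()` may check more than this.

```typescript
// Only the two position fields matter for this comparison.
interface Positioned {
  startIndex: number;
  endIndex: number;
}

// Compare nodes by source span instead of object identity.
function sameNode(a: Positioned | null, b: Positioned | null): boolean {
  return a !== null && b !== null
    && a.startIndex === b.startIndex
    && a.endIndex === b.endIndex;
}

// Two distinct wrapper objects for the same span, as a binding may hand out
// when the same node is reached through different access paths.
const viaField: Positioned = { startIndex: 10, endIndex: 13 };
const viaChildren: Positioned = { startIndex: 10, endIndex: 13 };

console.log(viaField === viaChildren);        // false: identity is unreliable
console.log(sameNode(viaField, viaChildren)); // true: same span
```

One caveat of this sketch: two nested nodes covering the identical span compare equal, so a caller that needs to tell them apart would also have to compare `type`.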

Full suite 3588 passed; new tests for each (incl. a strong many-build determinism
guard that the old 2-build test missed). Deterministic across full analyzes.

Decision: 045db3ae (synced).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final adversarial sweep found one unsound-exact: do/while loops were wired as
pre-test (while) loops, so a pre-loop definition leaked into the condition and
post-loop uses as `exact` even though the body always overwrites it first
(`let x; do { x = compute(); } while (retry()); return x;` claimed the return
could be the uninitialized x). Added processDoWhile with post-test wiring
(body → cond → {back to body | exit}), so the body's defs kill the pre-loop def.
Plain while/for are unchanged (their body may run zero times, so keeping the
pre-loop def reaching is sound — verified).
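
To illustrate why the wiring matters, here is a toy reaching-definitions pass over a single variable `x`, run on both loop shapes. The `Block` type, block ids, and `reachingAt` are hypothetical and far simpler than the real overlay; the sketch only shows that post-test wiring lets the body's def kill the pre-loop def.

```typescript
// One basic block: at most one strong def of the single tracked variable `x`.
type Block = { id: string; def?: string; succ: string[] };

// Definitions of `x` that reach the entry of `target` (forward fixpoint).
function reachingAt(blocks: Block[], target: string): string[] {
  const preds = new Map<string, string[]>();
  const out = new Map<string, Set<string>>();
  for (const b of blocks) {
    preds.set(b.id, []);
    out.set(b.id, new Set<string>());
  }
  for (const b of blocks) for (const s of b.succ) preds.get(s)!.push(b.id);

  // IN[b] is the union of OUT[p] over all predecessors p.
  const inOf = (id: string): Set<string> => {
    const acc = new Set<string>();
    for (const p of preds.get(id)!) for (const d of out.get(p)!) acc.add(d);
    return acc;
  };

  let changed = true;
  while (changed) {
    changed = false;
    for (const b of blocks) {
      // A strong def kills everything flowing in; otherwise IN passes through.
      const next = b.def ? new Set<string>([b.def]) : inOf(b.id);
      const cur = out.get(b.id)!;
      if (next.size !== cur.size || [...next].some((d) => !cur.has(d))) {
        out.set(b.id, next);
        changed = true;
      }
    }
  }
  return [...inOf(target)].sort();
}

// `let x; do { x = compute(); } while (retry()); return x;` wired two ways.
const preTest: Block[] = [
  { id: "entry", def: "decl", succ: ["cond"] },
  { id: "cond", succ: ["body", "exit"] },
  { id: "body", def: "body", succ: ["cond"] },
  { id: "exit", succ: [] },
];
const postTest: Block[] = [
  { id: "entry", def: "decl", succ: ["body"] },
  { id: "body", def: "body", succ: ["cond"] },
  { id: "cond", succ: ["body", "exit"] },
  { id: "exit", succ: [] },
];

console.log(reachingAt(preTest, "exit"));  // [ 'body', 'decl' ]
console.log(reachingAt(postTest, "exit")); // [ 'body' ]
```

The pre-test result is the right answer for a plain `while`, whose body may run zero times; it is only wrong when applied to `do`/`while`.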

Also: Go named return values (`func f() (x int) {...}`) live in a second
parameter_list under the `result` field; extractParamNames now reads it so they
are tracked as the in-scope locals they are.

The sweep otherwise found no false-narrowing and no non-determinism across
do/while, try/finally, Go select/defer/type-switch, Python match, and 30-40x
repeated builds. Remaining gaps (Python match-case bindings, `del`) are safe
missing-edges, never unsound.

Full suite 3591 passed; deterministic. 3 new do/while tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The large-repo scale audit found decorated Python functions silently got NO
overlay (1,249/1,249 in django), a big coverage hole for Python (@property,
@cached_property, Django views/commands). Root cause: the Python fn query binds
@fn.node to the `decorated_definition` wrapper, which has no `body` field, so
buildCfgFor fell into its body-dig — but the dig only accepted JS/TS node types
(arrow_function/function_expression/function), never Python's inner
`function_definition`, so buildFunctionCfg got the wrong node and fail-softed.

Fix: add `function_definition` to the dig-accepted types so the wrapper descends
to the body-bearing inner node. Verified end-to-end: decorated @property /
@cached_property functions now build correct overlays (params + def-use). TS and
Go were unaffected (their queries bind @fn.node to the body-bearing node).
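
A rough sketch of the body-dig, assuming it inspects the wrapper's direct named children (the actual traversal in `buildCfgFor` is not shown in this commit). `PyNode`, `mkPy`, and `bodyBearing` are hypothetical names; the node types are the ones listed above.

```typescript
// Minimal stand-in for the node surface (an assumption, not the project's type).
interface PyNode {
  type: string;
  namedChildren: PyNode[];
  childForFieldName(name: string): PyNode | null;
}

// Hypothetical dig: return the node that actually carries a `body` field.
function bodyBearing(n: PyNode, digTypes: Set<string>): PyNode | null {
  if (n.childForFieldName("body")) return n;
  for (const c of n.namedChildren) {
    if (digTypes.has(c.type) && c.childForFieldName("body")) return c;
  }
  return null; // caller fails soft: no overlay for this function
}

// Tiny factory so the sketch runs without a parser.
function mkPy(type: string, fields: Record<string, PyNode> = {}, rest: PyNode[] = []): PyNode {
  return {
    type,
    namedChildren: [...Object.values(fields), ...rest],
    childForFieldName: (name) => fields[name] ?? null,
  };
}

// `@property` + `def f(self): ...` parses as decorated_definition > function_definition.
const inner = mkPy("function_definition", { body: mkPy("block") });
const decorated = mkPy("decorated_definition", {}, [mkPy("decorator"), inner]);

const before = new Set(["arrow_function", "function_expression", "function"]);
const after = new Set([...before, "function_definition"]);

console.log(bodyBearing(decorated, before));      // null: the old behavior
console.log(bodyBearing(decorated, after)?.type); // "function_definition"
```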

This was a safe miss (no overlay, never wrong data), not unsound — but a material
coverage gain on Python-heavy repos. Full suite 3592 passed; deterministic.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extends the intra-procedural CFG/def-use overlay beyond TS/JS·Python·Go to four
more native-grammar languages OpenLore already covers. The builder is spec-driven,
so this adds JAVA_SPEC/CPP_SPEC/RUST_SPEC/RUBY_SPEC (node types verified
empirically against each grammar) plus the infrastructure each needed:

- callNameField: Java uses a `name` field for the callee, Ruby `method` — so the
  receiver/args are still read while the method name is skipped.
- compound-operator detection by text (Java/C/C++ use one node type for `=` and
  `+=`), so a compound assignment correctly reads its target first.
- caseParts(): per-language switch/match/case decomposition — Rust `match_arm`
  and Ruby `when`/`else` put the body in a field, Java groups statements under
  switch_label markers, C/C++ use `case_statement`. All modeled as parallel
  alternatives so cases never kill each other (verified sound).
- expression_statement unwrap for Rust's expression-oriented control flow
  (`if`/`while`/`match` are wrapped); `else`/`then`/`do`/`body_statement` block
  types and `declarator`/`pattern`/`value` field fallbacks for the new grammars.

buildCfgFor wired into extractRust/Ruby/Java/Cpp; the build already accumulates
result.cfg. Closure mutation → may, member/subscript → may, and loop back-edges
all carry over. Verified end-to-end (overlays stored for all four) and
deterministic; full suite 3598 passed with per-language tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@clay-good clay-good changed the title feat: intra-procedural CFG + reaching-definitions (def-use) overlay, with value-level opt-in feat: intra-procedural CFG + reaching-definitions (def-use) overlay — TS/JS, Python, Go, Java, C++, Rust, Ruby Jun 12, 2026
Two real-repo adversarial agents validated the Java/C++/Rust/Ruby extension
(gson, nlohmann/json, ripgrep, sinatra). Rust came back fully sound; four
unsound-exact issues + two coverage gaps were fixed:

- Java 14+ arrow switch (`case N -> {}` = switch_rule) was unrecognized, dropping
  the whole switch and leaking the pre-switch def as `exact`. Now modeled
  (caseParts/isCaseNode), and arrow/match/when cases are non-fall-through.
- C++ reference binding `int& r = x; r = 5;` aliases x — the alias write was
  invisible (x stayed `exact`). collectEscapedVars now marks a
  reference_declarator's referent as escaped → `may`.
- C++ address-of `&x` parses as `pointer_expression` (not unary_expression), so
  escape analysis missed it. Now handled → `may`.
- Ruby statement modifiers (`x = 2 if c`, `… while c`) were treated as
  unconditional strong defs, dropping the prior reaching def and emitting a wrong
  `exact` (live in sinatra). processModifier models them as a one-armed
  branch/loop so both defs reach.
- C++ parameters live under function_declarator — extractParamNames now descends
  (was [] for ALL C++ functions). Java enhanced-for loop var (in the `name` field)
  is now bound.

Verified on nlohmann/json: 2172 overlays, 0 invariant violations, params now
populated (795 fns), may correctly higher. Full suite 3603; deterministic.

Decision: f5810afa (synced).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@clay-good clay-good marked this pull request as draft June 12, 2026 19:01
…uages

A regression+fail-soft safety agent first proved the overlay breaks NOTHING for
any of OpenLore's 18 languages: analyze exits 0, no-overlay languages keep full
call-graph nodes/edges with 0 overlay rows, 0 invariant violations, no resident
leak — the fail-soft design holds across the board.

Then extended overlay support to three more native-grammar languages that map
cleanly to the spec-driven builder:
- C — a syntactic subset of C++; reuses CPP_SPEC (zero added risk).
- C# — Java-like; switch_section decomposed like Java's label groups.
- PHP — C-family; case_statement reuses the default branch; $-params via
  variable_name (extractParamNames generalized to spec.identTypes).

Infrastructure: buildCfgFor now accepts the CfgNode interface and is wired into
extractByQueries (the spec-08 path) INSIDE withTree, so it is WASM-tree-lifetime
safe; CfgNode.children is optional (the soft-loaded node interface exposes only
namedChildren). Divergent-control-flow languages (Kotlin `when`, Scala `match`,
Swift switch-patterns) and the deferred set (Lua/Bash/Elixir/Dart) stay fail-soft
— no spec means buildFunctionCfg returns undefined — rather than risk an unsound
overlay. Verified: those still graph fully with 0 overlay rows and no crash.

Overlay now covers 11 languages. Sound switch on all (cases never kill each
other). Full suite 3606; deterministic; C/C#/PHP overlays verified e2e.

Decision: 07c32832 (synced).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@clay-good clay-good changed the title feat: intra-procedural CFG + reaching-definitions (def-use) overlay — TS/JS, Python, Go, Java, C++, Rust, Ruby feat: intra-procedural CFG + reaching-definitions (def-use) overlay — 11 languages (TS/JS, Python, Go, Java, C++, Rust, Ruby, C, C#, PHP) Jun 12, 2026
clay-good and others added 5 commits June 12, 2026 14:20
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The C/C#/PHP real-repo adversarial agent found C fully clean but four
unsound-exact issues in escape detection for indirection the spec didn't model —
all reproduced on real corpora (FastRoute, cecil):

- PHP anonymous closures leaked their body into the enclosing CFG: PHP_SPEC
  nestedFnTypes listed the stale grammar node `anonymous_function_creation_
  expression`; the grammar emits `anonymous_function`. Closures were processed
  inline, so a closure-local def reached an outer use as `exact` (real:
  FastRoute cachedDispatcher's $routeCollector). Fixed the node name — closures
  are now a separate scope, which also makes collectClosureMutations catch
  by-ref (`use (&$x)`) mutations.
- PHP reference assignment `$r = &$x` aliased $x invisibly — now the referent
  escapes (may), mirroring the C++ reference_declarator handling.
- C# `ref`/`out` arguments: the callee can/must reassign the variable, but it
  stayed `exact` (real: cecil's out-param TryGet pattern, pervasive). Now an
  argument whose text begins `ref `/`out ` escapes its identifier → may.
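
The `ref`/`out` rule is text-based, so it can be sketched without a parser. The function name and the regular expression below are assumptions for illustration; the commit only states that an argument whose text begins with `ref ` or `out ` escapes its identifier.

```typescript
// Given the source text of each call argument, return the identifiers that
// should be treated as escaped (downgraded to `may`) in this sketch.
function escapedByRefOut(argTexts: string[]): string[] {
  const escaped: string[] = [];
  for (const text of argTexts) {
    // `ref x`, `out x`, `out var x`, `out int x`: take the trailing identifier.
    const m = /^(?:ref|out)\s+(?:.*\s)?([A-Za-z_]\w*)$/.exec(text.trim());
    // `out _` is a discard, not a variable, so it is skipped.
    if (m && m[1] !== "_") escaped.push(m[1]);
  }
  return escaped;
}

// Arguments of a hypothetical `map.TryGet(key, out var value, ref count)` call.
console.log(escapedByRefOut(["key", "out var value", "ref count"]));
// [ 'value', 'count' ]
```

Arguments that are not a plain identifier after the keyword, such as `ref arr[i]`, are ignored by this sketch; how the real escape analysis treats them is not stated here.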

Verified on real repos: cecil may-rate 378→987 (the previously-unsound exacts now
correctly may), FastRoute closure leak gone, 0 invariant violations, deterministic.
Full suite 3609. C was clean on every probe.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lock the overlay language set so it can never silently regress: assert
cfgSupportsLanguage is true for exactly the 11 overlay languages and false for
every other language OpenLore detects (Kotlin/Swift/Scala/Dart/Lua/Elixir/Bash +
IaC + unknown), and that buildFunctionCfg returns undefined (never throws) for
all of them — the fail-soft contract every supported language must honor.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
These four languages' extensions (.c/.cs/.php/.kt) are in the watcher's
SOURCE_EXTENSIONS, so the watcher processes them on edit — but they were missing
from CALL_GRAPH_LANGS, so buildGraphSubset returned empty and the per-file swap
DELETED the file's nodes/edges (and now, with the overlay, its cfg_overlay rows)
until the next full analyze. Editing a .cs/.php/.c/.kt file made its functions
vanish from the call graph in watch mode — a pre-existing graph-coverage
regression that the overlay's deleteCfgForFile extended to overlays.

Add C/C#/PHP/Kotlin to the watcher's CALL_GRAPH_LANGS so edits RE-GRAPH and
(for C/C#/PHP) re-compute the overlay instead of wiping. Their grammars are
optional deps: if absent, buildGraphSubset fails soft to empty — identical to
full-analyze behavior, no wipe of nodes that never existed.

Verified: editing a C#/PHP/C file now preserves its node and refreshes its
overlay (was: wiped to 0). Full suite 3612.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ign of `p`

Re-validation across 11 languages / 46.7k overlays surfaced one C/C++
imprecision: a write through a pointer (`*p = x`) was recorded as a strong
*exact* def of the pointer binding `p`, because `*p` (a `pointer_expression`)
fell through recordTarget's generic recurse to the inner identifier.

That mislabels provenance — a later use of `p` linked to the `*p = x` line
instead of `p`'s real def — and falsely killed the real def. Value-sound
(the pointer value is unchanged), but a wrong `exact` provenance label is
exactly the failure mode the overlay must never emit.

Add an optional, language-scoped `derefTypes` (C/C++ `pointer_expression`)
handled like member/subscript writes: the base pointer is a *use*, the
l-value `*p` is a conservative `may` def — never an exact reassign of `p`.
Guarded against `&x` (shares the node type, never an assignment target).
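
In sketch form, the new handling looks roughly like this. The `LValue` and `TargetEffect` shapes are hypothetical simplifications, and keying the `may` def by the l-value text `*p` is an assumption of the sketch, not a confirmed detail of the overlay format.

```typescript
// Simplified assignment targets: a bare identifier, or C/C++ `*p` / `&x`
// (both parse as pointer_expression, per the commit message).
type LValue =
  | { kind: "identifier"; name: string }
  | { kind: "pointer_expression"; operator: "*" | "&"; operand: LValue };

interface TargetEffect {
  uses: string[];
  defs: Array<{ target: string; label: "exact" | "may" }>;
}

function textOf(lv: LValue): string {
  return lv.kind === "identifier" ? lv.name : lv.operator + textOf(lv.operand);
}

function identifiersIn(lv: LValue): string[] {
  return lv.kind === "identifier" ? [lv.name] : identifiersIn(lv.operand);
}

// Hypothetical recordTarget: what an assignment to `lv` defines and uses.
function recordTarget(lv: LValue): TargetEffect {
  if (lv.kind === "identifier") {
    return { uses: [], defs: [{ target: lv.name, label: "exact" }] };
  }
  if (lv.operator === "*") {
    // Write through a pointer: the pointer is read, the pointee is a weak def.
    return { uses: identifiersIn(lv.operand), defs: [{ target: textOf(lv), label: "may" }] };
  }
  // `&x` shares the node type but is never an assignment target.
  return { uses: [], defs: [] };
}

const writeThroughPointer: LValue = {
  kind: "pointer_expression",
  operator: "*",
  operand: { kind: "identifier", name: "p" },
};

console.log(JSON.stringify(recordTarget(writeThroughPointer)));
// {"uses":["p"],"defs":[{"target":"*p","label":"may"}]}
```

Because `p` itself never appears in `defs`, an earlier `p = ...` stays the reaching def for later uses of `p`.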

Regression test in cfg.test.ts; full suite green (3583 passing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>