feat: intra-procedural CFG + reaching-definitions (def-use) overlay — 11 languages (TS/JS, Python, Go, Java, C++, Rust, Ruby, C, C#, PHP)#146
Draft
clay-good wants to merge 23 commits into
Conversation
…Phase 1+2) Add a deterministic, per-function control-flow graph and reaching-definitions (def-use) overlay computed from the AST while the parse tree is live, stored as a compact per-function blob in a new DB-only table. No LLM; AST shape + a classical fixpoint. Default behavior of every existing tool is unchanged. - cfg.ts: CFG builder (basic blocks + branch/loop/early-exit edges) and an intra-procedural reaching-definitions fixpoint producing precision-labeled (exact|may) def-use edges. WASM-safe: returns pure data. - call-graph.ts: extend in-scope extractors (TS/JS, Python, Go) to build the overlay in the live-tree window; return contract gains optional cfg; CallGraphResult carries a transient cfgs map (NOT in SerializedCallGraph). - edge-store.ts: new cfg_overlay table, schema 6->7 (drop-and-rebuild, zero migration); lazy getCfg + per-file delete/insert for incremental recompute. - artifact-generator.ts: persist overlay to SQLite; strip from llm-context.json. - mcp-watcher.ts: recompute only the changed file's overlay rows in the swap. Tests: cfg.test.ts, cfg-overlay-storage.test.ts. Decision: c8f2b9bf (synced). Gate bypassed for 8 unrelated pre-existing backlog decisions from other in-flight proposals. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…h (Step 5) Add opt-in value/parameter-granularity precision to the two impact/tracing tools, backed by the reaching-definitions overlay. Strictly opt-in — with the flag absent the result is byte-for-byte the function-granularity answer. - cfg.ts: valueReachableLines() — a pure forward data-flow slice over the def-use edges (the value's impact set), seeded from a parameter/local or all params. - graph.ts: handleAnalyzeImpact gains valueLevel/valueParam — narrows the downstream blast radius to the direct callees whose argument lines are data-dependent on the targeted value (cross-call hop labeled may), expanding forward from them. handleTraceExecutionPath gains the same flags — restricts each entry's first hop to data-dependent callees. Both fall back to function granularity (no error) when the function has no overlay. - tool-dispatch.ts + mcp.ts: plumb valueLevel/valueParam through dispatch and the tool schemas. Nav preset payload ceiling bumped 11800->12300 (spec-28 precedent). Tests: graph.test.ts value-level block (default unchanged, narrows to the data-dependent callee, falls back when no overlay). E2E verified through the compiled CLI on a real TS/Python/Go repo: overlay persisted for all 7 functions, schema v7, llm-context.json carries no overlay, value-level narrows correctly. Decision: b6f04199 (synced). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cfg-dataflow-overlay
Dogfooding the overlay on real TypeScript surfaced three soundness/precision bugs, all now fixed and regression-tested: - try/catch and switch were treated as straight-line statements, so a catch-body or later-case definition spuriously KILLED the try-body / earlier-case definition — omitting a real reaching-def (a soundness violation). They now build branch+merge structure: catch clauses are alternative paths from the same predecessor (both reach the join; finally runs from the merge); switch cases are alternative branches with language-correct fall-through (switchFallsThrough: TS/JS yes, Go/Python no), so cases don't linearly kill each other. - closure captures of outer variables were labeled `exact`; the spec requires `may`. Nested functions are no longer descended into as the outer CFG — their free-variable reads become `may` closure-capture uses, and their own params/locals no longer leak as outer uses. Verified on the real corpus: deterministic (identical overlay hash across two analyses), 233 branch / 241 merge blocks now present. Full suite: 3559 passed. Decision: 2c0d04e3 (synced). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two more soundness bugs found by dogfooding, fixed and regression-tested:
- Python if/elif/else collapsed to a single alternative: childForFieldName
('alternative') returns only the first elif_clause, so every later elif and the
trailing else were dropped from the CFG — their definitions never reached
downstream uses. processIf now collects all elif_clause children and the
else_clause and builds the full branch chain. TS else-if (nested if in an
else_clause) is unaffected — the new path triggers only when elif_clause exists.
- Destructuring bindings produced no defs: `const { a, b } = obj` exposes each
name as a childless shorthand_property_identifier_pattern that the generic
pattern recurse dropped; `{ a: x }` pairs and `{ a = default }` defaults were
also mishandled. recordTarget now binds shorthand/array/pair/default pattern
leaves as definitions (default values recorded as uses).
Verified deterministic on the real corpus. Full suite: 3562 passed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The `exact` precision label is what an agent trusts; a wrong `exact` is actively misleading context. Adversarial probing found two unsound-exact holes: - a local scalar reassigned inside a nested closure that is then invoked — the value at a later read may have changed out of band, yet was labeled `exact`; - a Go local whose address is taken (`&x`) and mutated through the pointer. A new collectEscapedVars pre-pass marks names assigned inside any nested closure (excluding the closure's own params/locals) and names whose address is taken; computeReachingDefs forces `may` for those. The set is over-approximated on purpose — a false inclusion only weakens precision, never yields an unsound `exact`. Read-only closure captures are NOT downgraded (verified), so precision is preserved where it is sound. `exact` now means soundly exact for the supported languages. Deterministic on the real corpus; full suite 3565 passed. Decision: 8192f32f (synced). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make a corrupt overlay degrade to NO overlay rather than wrong context: a new isStructurallyValid guard runs on every built CFG (exactly one entry/exit, all edge endpoints reference real blocks, def/use lines positive, precision labels valid, params well-formed). If any invariant fails, buildFunctionCfg emits no overlay — the safe fallback an agent can trust. Validated against four diverse real-world repos (express/JS, flask/Python, gorilla-mux/Go, ky/TS): all analyzed with no crashes, and a structural-invariant sweep over all 457 produced overlays found 0 violations. The guard does not reject any valid overlay (coverage unchanged). Full suite: 3567 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two completeness bugs (omitting a real dependence = wrong info to an agent): - Logical assignment `x ||= e` / `&&=` / `??=` assigns only CONDITIONALLY, so the prior value can survive — but it was treated as an unconditional def that killed the earlier one, dropping the old def from reaching a later use. Added a `weak` (non-killing) def kind: weak defs accumulate alongside prior defs in GEN and do not contribute to KILL, so both reach. Plain `=` and augmented `+=` (which always assign) still kill, verified. - Python walrus `n := value` (named_expression) recorded no definition at all, so data flow through it was invisible. Now records the embedded def + value uses. Loop-carried dependencies (back-edge propagation) re-verified correct. Deterministic on the real corpus; full suite 3570 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two parallel adversarial audits (real-repo stress + soundness review) found that name-based reaching-defs conflated shadowed variables, producing the worst kind of bug — unsound `exact` edges fed to an agent. Fixes: - Lexical scope resolution: a resolveScopes pre-pass keys every identifier by `name#scopeId` (nearest declaring scope; else root). Reaching-defs groups KILL/ reach by the scoped key, so an inner-block `let`/`const`/`:=` (TS/Go) or a Python comprehension loop var no longer kills or links to the outer same-named variable. Both the wrong `exact` edge and the dropped correct edge are fixed. - `x++`/`x--` recorded as a use+def (was invisible) and detected as a closure mutation. - Go closure-escape: descend `expression_list` LHS (`x = 5`), inc/dec, and address-of taken inside a closure — all now downgrade the outer var to `may`. - Python `try/except/else`: the else runs on the no-exception path after the try body; previously ignored (lost its def, kept the overwritten try def). - Labeled loops: unwrap `labeled_statement` so the loop keeps its back-edge and loop-carried dependence. Verified deterministic on zod/requests/cobra: 1596 overlays, 0 invariant violations, ~90% exact / 10% may. Full suite 3570 passed; 9 new scope/idiom tests. Decision: 84cc2af4 (synced). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A follow-up verification audit confirmed all prior fixes hold and TS/Python are
clean, but found one regression class: Go loop-header variables were invisible.
Go wraps the header in a for_clause (C-style) or range_clause (range) child, so
`childForFieldName('initializer'/'left'/'condition'/...)` on the for_statement
returned undefined — loop counters and range vars were never recorded as defs nor
added to the loop scope.
Worst case was UNSOUND-EXACT: `i := 100; for i, v := range xs { _ = i }` made the
loop body's `i` resolve to the outer `i = 100` as exact, because the range `i` was
neither scoped nor defined.
Fix: a loopHeaderField() helper descends into the for_clause/range_clause before
reading the header fields; used by recordLoopHeader, processLoop (condition), and
scopeDeclaredNames. Now Go counters/range vars are defined, loop-carried, and
scoped — the range `i` binds to the loop, never the outer.
Verified: cobra (Go) 270 overlays, 0 violations, no key leaks; deterministic.
Full suite 3581 passed; 2 new Go loop tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…overage Widen test coverage to the integration seams: - Watcher consistency: a real file edit through McpWatcher.handleChange recomputes the CFG/def-use overlay and the persisted result is byte-equal to a fresh full build of the new content (intra-procedural ⇒ incremental == full), with the stale overlay gone. Closes the OverlayStorageAndIncrementality "single-file edit recomputes" scenario end-to-end. - Cross-call labeling: assert the value-level result labels the cross-procedure hop `may` (spec: DataFlowProvenanceLabeling "Cross-call dependence is labeled may"), the last spec scenario without a direct test. Full suite green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ail-soft
A wider three-agent audit (value-level consumer, fuzz/robustness, deep idioms)
surfaced four issues — all fixed:
- DETERMINISM (critical): callee-skip/case-value exclusion compared tree-sitter
nodes by object identity (`c === fn`), but native tree-sitter returns fresh
wrapper objects per access path — so the overlay flickered ~50% of builds. Added
a sameNode() position comparator (startIndex+endIndex); node identity is never
safe. Now 25-30 identical builds + identical full-analyze hashes.
- FALSE NARROWING (value-level, agent-misleading): valueReachableLines chained
taint by exact line co-location, so a parameter feeding a MULTI-LINE initializer
(`const ctx = {\n x: p \n}`) dropped every downstream the object reached — telling
an agent a change was safe when it wasn't (reproduced on real zod). Fixed by
tagging RHS uses with the fed def's start line (useInDefLine) and chaining on it.
- UNSOUND-EXACT: Python `global`/same-level `nonlocal` were `exact` despite hidden
mutation; now downgraded to `may` via the escaped set.
- ROBUSTNESS: the reaching-defs fixpoint was quadratic on deeply nested loops (47s
at 700-deep); capped at 128 sweeps + 4000 blocks, failing soft to no overlay
(<2s). Plus Python `with ... as` binding defs.
Full suite 3588 passed; new tests for each (incl. a strong many-build determinism
guard that the old 2-build test missed). Deterministic across full analyzes.
Decision: 045db3ae (synced).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final adversarial sweep found one unsound-exact: do/while loops were wired as
pre-test (while) loops, so a pre-loop definition leaked into the condition and
post-loop uses as `exact` even though the body always overwrites it first
(`let x; do { x = compute(); } while (retry()); return x;` claimed the return
could be the uninitialized x). Added processDoWhile with post-test wiring
(body → cond → {back to body | exit}), so the body's defs kill the pre-loop def.
Plain while/for are unchanged (their body may run zero times, so keeping the
pre-loop def reaching is sound — verified).
Also: Go named return values (`func f() (x int) {...}`) live in a second
parameter_list under the `result` field; extractParamNames now reads it so they
are tracked as the in-scope locals they are.
The sweep otherwise found no false-narrowing and no non-determinism across
do/while, try/finally, Go select/defer/type-switch, Python match, and 30-40x
repeated builds. Remaining gaps (Python match-case bindings, `del`) are safe
missing-edges, never unsound.
Full suite 3591 passed; deterministic. 3 new do/while tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The large-repo scale audit found decorated Python functions silently got NO overlay (1,249/1,249 in django) — a big coverage hole for Python (@Property, @cached_property, Django views/commands). Root cause: the Python fn query binds @fn.node to the `decorated_definition` wrapper, which has no `body` field, so buildCfgFor fell into its body-dig — but the dig only accepted JS/TS node types (arrow_function/function_expression/function), never Python's inner `function_definition`, so buildFunctionCfg got the wrong node and fail-softed. Fix: add `function_definition` to the dig-accepted types so the wrapper descends to the body-bearing inner node. Verified end-to-end: decorated @Property / @cached_property functions now build correct overlays (params + def-use). TS and Go were unaffected (their queries bind @fn.node to the body-bearing node). This was a safe miss (no overlay, never wrong data), not unsound — but a material coverage gain on Python-heavy repos. Full suite 3592 passed; deterministic. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extends the intra-procedural CFG/def-use overlay beyond TS/JS·Python·Go to four more native-grammar languages OpenLore already covers. The builder is spec-driven, so this adds JAVA_SPEC/CPP_SPEC/RUST_SPEC/RUBY_SPEC (node types verified empirically against each grammar) plus the infrastructure each needed: - callNameField: Java uses a `name` field for the callee, Ruby `method` — so the receiver/args are still read while the method name is skipped. - compound-operator detection by text (Java/C/C++ use one node type for `=` and `+=`), so a compound assignment correctly reads its target first. - caseParts(): per-language switch/match/case decomposition — Rust `match_arm` and Ruby `when`/`else` put the body in a field, Java groups statements under switch_label markers, C/C++ use `case_statement`. All modeled as parallel alternatives so cases never kill each other (verified sound). - expression_statement unwrap for Rust's expression-oriented control flow (`if`/`while`/`match` are wrapped); `else`/`then`/`do`/`body_statement` block types and `declarator`/`pattern`/`value` field fallbacks for the new grammars. buildCfgFor wired into extractRust/Ruby/Java/Cpp; the build already accumulates result.cfg. Closure mutation → may, member/subscript → may, and loop back-edges all carry over. Verified end-to-end (overlays stored for all four) and deterministic; full suite 3598 passed with per-language tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two real-repo adversarial agents validated the Java/C++/Rust/Ruby extension
(gson, nlohmann/json, ripgrep, sinatra). Rust came back fully sound; four
unsound-exact issues + two coverage gaps were fixed:
- Java 14+ arrow switch (`case N -> {}` = switch_rule) was unrecognized, dropping
the whole switch and leaking the pre-switch def as `exact`. Now modeled
(caseParts/isCaseNode), and arrow/match/when cases are non-fall-through.
- C++ reference binding `int& r = x; r = 5;` aliases x — the alias write was
invisible (x stayed `exact`). collectEscapedVars now marks a
reference_declarator's referent as escaped → `may`.
- C++ address-of `&x` parses as `pointer_expression` (not unary_expression), so
escape analysis missed it. Now handled → `may`.
- Ruby statement modifiers (`x = 2 if c`, `… while c`) were treated as
unconditional strong defs, dropping the prior reaching def and emitting a wrong
`exact` (live in sinatra). processModifier models them as a one-armed
branch/loop so both defs reach.
- C++ parameters live under function_declarator — extractParamNames now descends
(was [] for ALL C++ functions). Java enhanced-for loop var (in the `name` field)
is now bound.
Verified on nlohmann/json: 2172 overlays, 0 invariant violations, params now
populated (795 fns), may correctly higher. Full suite 3603; deterministic.
Decision: f5810afa (synced).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uages A regression+fail-soft safety agent first proved the overlay breaks NOTHING for any of OpenLore's 18 languages: analyze exits 0, no-overlay languages keep full call-graph nodes/edges with 0 overlay rows, 0 invariant violations, no resident leak — the fail-soft design holds across the board. Then extended overlay support to three more native-grammar languages that map cleanly to the spec-driven builder: - C — a syntactic subset of C++; reuses CPP_SPEC (zero added risk). - C# — Java-like; switch_section decomposed like Java's label groups. - PHP — C-family; case_statement reuses the default branch; $-params via variable_name (extractParamNames generalized to spec.identTypes). Infrastructure: buildCfgFor now accepts the CfgNode interface and is wired into extractByQueries (the spec-08 path) INSIDE withTree, so it is WASM-tree-lifetime safe; CfgNode.children is optional (the soft-loaded node interface exposes only namedChildren). Divergent-control-flow languages (Kotlin `when`, Scala `match`, Swift switch-patterns) and the deferred set (Lua/Bash/Elixir/Dart) stay fail-soft — no spec means buildFunctionCfg returns undefined — rather than risk an unsound overlay. Verified: those still graph fully with 0 overlay rows and no crash. Overlay now covers 11 languages. Sound switch on all (cases never kill each other). Full suite 3606; deterministic; C/C#/PHP overlays verified e2e. Decision: 07c32832 (synced). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The C/C#/PHP real-repo adversarial agent found C fully clean but four unsound-exact issues in escape detection for indirection the spec didn't model — all reproduced on real corpora (FastRoute, cecil): - PHP anonymous closures leaked their body into the enclosing CFG: PHP_SPEC nestedFnTypes listed the stale grammar node `anonymous_function_creation_ expression`; the grammar emits `anonymous_function`. Closures were processed inline, so a closure-local def reached an outer use as `exact` (real: FastRoute cachedDispatcher's $routeCollector). Fixed the node name — closures are now a separate scope, which also makes collectClosureMutations catch by-ref (`use (&$x)`) mutations. - PHP reference assignment `$r = &$x` aliased $x invisibly — now the referent escapes (may), mirroring the C++ reference_declarator handling. - C# `ref`/`out` arguments: the callee can/must reassign the variable, but it stayed `exact` (real: cecil's out-param TryGet pattern, pervasive). Now an argument whose text begins `ref `/`out ` escapes its identifier → may. Verified on real repos: cecil may-rate 378→987 (the previously-unsound exacts now correctly may), FastRoute closure leak gone, 0 invariant violations, deterministic. Full suite 3609. C was clean on every probe. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lock the overlay language set so it can never silently regress: assert cfgSupportsLanguage is true for exactly the 11 overlay languages and false for every other language OpenLore detects (Kotlin/Swift/Scala/Dart/Lua/Elixir/Bash + IaC + unknown), and that buildFunctionCfg returns undefined (never throws) for all of them — the fail-soft contract every supported language must honor. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
These four languages' extensions (.c/.cs/.php/.kt) are in the watcher's SOURCE_EXTENSIONS, so the watcher processes them on edit — but they were missing from CALL_GRAPH_LANGS, so buildGraphSubset returned empty and the per-file swap DELETED the file's nodes/edges (and now, with the overlay, its cfg_overlay rows) until the next full analyze. Editing a .cs/.php/.c/.kt file made its functions vanish from the call graph in watch mode — a pre-existing graph-coverage regression that the overlay's deleteCfgForFile extended to overlays. Add C/C#/PHP/Kotlin to the watcher's CALL_GRAPH_LANGS so edits RE-GRAPH and (for C/C#/PHP) re-compute the overlay instead of wiping. Their grammars are optional deps: if absent, buildGraphSubset fails soft to empty — identical to full-analyze behavior, no wipe of nodes that never existed. Verified: editing a C#/PHP/C file now preserves its node and refreshes its overlay (was: wiped to 0). Full suite 3612. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ign of `p` Re-validation across 11 languages / 46.7k overlays surfaced one C/C++ imprecision: a write through a pointer (`*p = x`) was recorded as a strong *exact* def of the pointer binding `p`, because `*p` (a `pointer_expression`) fell through recordTarget's generic recurse to the inner identifier. That mislabels provenance — a later use of `p` linked to the `*p = x` line instead of `p`'s real def — and falsely killed the real def. Value-sound (the pointer value is unchanged), but a wrong `exact` provenance label is exactly the failure mode the overlay must never emit. Add an optional, language-scoped `derefTypes` (C/C++ `pointer_expression`) handled like member/subscript writes: the base pointer is a *use*, the l-value `*p` is a conservative `may` def — never an exact reassign of `p`. Guarded against `&x` (shares the node type, never an assignment target). Regression test in cfg.test.ts; full suite green (3583 passing). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements
add-intraprocedural-cfg-dataflow-overlayend-to-end: a deterministic, per-function control-flow graph plus reaching-definitions (def-use) overlay computed purely from the AST (no LLM, no type-checker, no build) while the parse tree is live, stored as a compact per-function blob in a new DB-only table — and an opt-in value-level precision mode onanalyze_impact/trace_execution_pathbuilt on it.This follows the Joern overlay model: the call graph stays the base; CFG and reaching-defs are two additive, independently-storable overlays. The default behavior of every existing tool is byte-for-byte unchanged — value-level precision is strictly opt-in.
What's implemented
Phase 1 — CFG + storage (commit 1)
src/core/analyzer/cfg.ts— the first statement-level AST visitor in the analyzer (modeled on the Elixirwalkprecedent). Builds basic blocks + control-flow edges (branch/loop/early-exit/back edges). Returns pure data — no AST node references survive, so it is WASM-tree-lifetime safe.call-graph.ts— the in-scope extractors (TS/JS, Python, Go) build the overlay inside the live-tree window; the extractor return contract gains an optionalcfg;CallGraphResultcarries a transientcfgsmap that is not added toSerializedCallGraph/the resident graph. Other languages fail soft (no overlay, no error).edge-store.ts— newcfg_overlaytable (one JSON blob per function id),SCHEMA_VERSION6→7 (drop-and-rebuild, zero migration). LazygetCfg+ per-filedeleteCfgForFile/insertCfgs.artifact-generator.ts— persists the overlay to SQLite and strips it fromllm-context.jsonso it never becomes resident.mcp-watcher.ts— recomputes only the changed file's overlay rows inside the existing per-file swap transaction (intra-procedural ⇒ caller files untouched).Phase 2 — reaching definitions (commit 1)
exactfor sound local-scalar def-use,mayfor conservatively over-approximated field/subscript/closure dependences.Step 5 — value-level opt-in (commit 2)
valueReachableLines()— a pure forward data-flow slice over the def-use edges (the value's impact set).analyze_impactgainsvalueLevel/valueParam: narrows the downstream blast radius to the direct callees whose argument lines are data-dependent on the targeted value (cross-call hop labeledmay), expanding forward from them.trace_execution_pathgains the same flags (restricts each entry's first hop). Both fall back to function granularity (no error) when the function has no overlay. Plumbed through dispatch + tool schemas; nav-preset payload ceiling bumped 11800→12300 (spec-28 precedent).Hardening (dogfooding on real code)
After the initial implementation, the overlay was dogfooded on the real OpenLore corpus, which surfaced five soundness/precision bugs in the CFG/def-use builder — all now fixed and regression-tested:
finallyruns from the merge).switchFallsThrough: TS/JS yes, Go/Python no).exact— the spec requiresmay. Nested functions are no longer descended into as the outer CFG; their free-variable reads becomemaycaptures and their own params/locals no longer leak.if/elif/elsecollapsed to a single alternative (childForFieldNamereturns only the first) — laterelif/elsebranches were dropped. Now the full elif chain is built.const { a, b } = obj,[x, y],{ a: x },{ a = default }) produced no defs. Now each binding leaf is a definition.Verified deterministic across repeated analyses of the real corpus (identical overlay hash), with branch/merge structure and
mayedges present on actual code.Exhaustive all-language verification (every language OpenLore supports)
Before ship, every one of OpenLore's 19 detectable languages was verified to either produce a correct overlay or fail soft cleanly without breaking anything — proven by three parallel agents on ~30 real repos (~50,000 overlays) plus a CI-protected contract guard.
extractByQueries→buildCfgForpath fails soft cleanly. 0 invariant violations across 5,449 incidental overlays from co-located supported-language files.cfgSupportsLanguageto exactly the 11 overlay languages and assertsbuildFunctionCfgreturnsundefined(never throws) for every other language — the fail-soft contract can't silently regress.Watcher bug found and fixed (real "breaking existing tooling")
Being methodical surfaced a genuine pre-existing bug:
.c/.cs/.php/.ktfiles are in the watcher'sSOURCE_EXTENSIONSbut were missing from itsCALL_GRAPH_LANGS, so editing one madebuildGraphSubsetreturn empty and the per-file swap wiped that file's nodes/edges (and, with the overlay, itscfg_overlayrows) in watch mode until the next full analyze. Added C/C#/PHP/Kotlin to the watcher's graph set so edits re-graph and re-overlay instead of wiping (grammars are optional deps → fail soft to empty if absent, identical to full analyze). Verified + regression-tested.Broad re-validation: 11 overlay languages on 11 fresh real repos
A third agent re-validated all 11 overlay languages on new repos (got, lodash, click, gin, commons-lang, fmt, serde, jekyll, curl, automapper, laravel-framework): ~46,700 overlays / ~190,000 def-use edges, 0 structural violations, determinism bit-identical (clean-rebuild hashes equal), and no unsound-
exactfound in source-level soundness spot-checks. The dangerous cases are correctly downgraded tomayon real code — C++&rpassed to an intrinsic, JS closure rebinds, Ruby statement modifiers, C pointer arithmetic. laravel (28,276 overlays) is a clean large-scale stress test.It surfaced one C/C++ imprecision, now fixed: a write through a pointer (
*p = x) was recorded as an exact def of the bindingp(thepointer_expressionfell through to the inner identifier), mislabeling provenance and killingp's real def. A wrongexactprovenance label is precisely the failure mode the overlay must never emit, so*p = xis now handled like a member/subscript write — base pointer is a use, the l-value*pa conservativemaydef, never an exact reassign ofp. Regression-tested.Language coverage (verified)
A regression+fail-soft safety agent confirmed the overlay breaks nothing for any of OpenLore's 18 languages — analyze exits 0, the no-overlay languages keep their full call-graph nodes/edges with 0 overlay rows, no resident leak, 0 invariant violations. The 11 overlay languages are the ones whose grammars expose the fields the (field-centric) builder needs and that adversarial agents validated as sound on real repos. Kotlin/Scala/Swift have field-less/positional grammars (e.g. Kotlin
ifexposes no condition/consequence fields) — adapting the builder to those is bug-prone, so they (and the deferred Lua/Bash/Elixir/Dart) stay fail-soft rather than risk an unsound overlay. Covered safely — never wrong data.Adversarial validation of C/C#/PHP on real repos (sds, cecil, FastRoute): scale, invariants, and determinism clean across 2,704 overlays (0 violations, byte-identical hashes). C was fully sound on every probe. Four unsound-
exactissues in PHP/C# escape detection were found on real code and fixed: PHP anonymous closures leaked their body (stale grammar node nameanonymous_function_creation_expression→anonymous_function), PHP$r = &$xreference aliasing, PHPuse (&$x)by-ref capture, and C#ref/outarguments (pervasive in cecil'sTryGet(out …)pattern). After the fix, cecil'smay-rate rose 378→987 (the previously-unsound exacts are now correctly conservative) and the FastRoute closure leak is gone.Language extension round (7 languages)
Extended the overlay beyond the v1 TS/JS·Python·Go set to Java, C++, Rust, and Ruby — the other native-grammar languages OpenLore covers. The builder is spec-driven (a per-language
CfgLangSpecof node-type names), so each language is a verified spec plus the infrastructure it needed:callNameField— Java uses anamefield for the callee (Rubymethod), so the receiver/args are still read while the method name is skipped.=and+=; a compound assignment now correctly reads its target first.caseParts()— per-language switch/match/case decomposition: Rustmatch_armand Rubywhen/elsecarry the body in a field, Java groups statements underswitch_labelmarkers, C/C++ usecase_statement. All modeled as parallel alternatives so cases never kill each other (verified sound on every language).if/while/matchare wrapped inexpression_statement), pluselse/then/do/body_statementblock types anddeclarator/pattern/valuefield fallbacks for the new grammars.Node types were determined empirically against each grammar (never guessed). All the soundness machinery carries over: closure mutation →
may, member/subscript →may, escape analysis, scope resolution, do/while post-test, fail-soft caps. Verified end-to-end (overlays stored for all four), deterministic, full suite 3598 passing with per-language tests. Unsupported languages still fail soft (no overlay).Adversarial validation on real repos. Two agents stress-tested the four new languages on gson, nlohmann/json, ripgrep, and sinatra (the same rigor applied to TS/Py/Go). Scale, structural invariants, and determinism were clean across 5,601 overlays (0 violations, byte-identical hashes). Rust came back fully sound (97.3% exact, correct on shadowing/re-let/if-as-expression/closures). Four unsound-
exactissues found and fixed — Java 14+ arrow switch (case N -> {}), C++ reference aliasing (int& r = x) and&xaddress-of escape, and Ruby statement modifiers (x = 2 if c, live in sinatra) — plus two coverage gaps (C++ parameters were nested underfunction_declarator; Java enhanced-for loop var). All verified with per-language tests;maycorrectly rose where aliasing was newly detected.Production-readiness round (guarantees proven, not asserted; scale-tested)
A final pass verified the guarantees we'd claimed but not measured, and stress-tested at scale via three more agents:
mainvs the feature branch on a real repo:analyze_impact/select_tests/get_subgraph/trace_execution_path/find_dead_codeoutputs are identical to the byte (35,621 bytes, zero diff). The overlay's presence changes nothing for an agent that doesn't opt in.llm-context.jsonis byte-identical (171,899 bytes) with the feature; the residentcallGraphcarries nocfgs/defUse. The overlay is DB-only and lazily loaded.tools/listexposesvalueLevel/valueParamwith correct schemas, value-level narrows over the wire, default calls omit the field, invalid input degrades gracefully, and the server exits clean.src/compiler, 11.2k fns), Django (12.1k), Kubernetes (pkg/, 37.9k Go fns), and a mixed TS+Py+Go tree. 56,441 overlays, 0 invariant violations, perfect determinism (identical content hashes across runs), no crash/hang. Overhead is linear ~0.7–0.9 ms/function (~17–34% of analyze wall-clock); the fail-soft caps never fired on real code. ~88% exact / 12% may at scale.Two issues found this pass, both fixed:
do/whileunsound-exact— it was modeled as a pre-test loop, leaking a pre-loop def into the condition/post-loop asexact. Now post-test wired (body runs first), so the body's defs correctly kill the pre-loop def. (Plainwhile/forunchanged.)buildCfgFor's body-dig didn't accept Python's innerfunction_definitionunder adecorated_definitionwrapper, so@property/@cached_property/Django-view functions were silently skipped. Fixed; verified end-to-end.Wider audit round (3 agents: value-level, robustness, deep idioms)
A third wave of parallel agents stress-tested the consumer seam, robustness, and deep language idioms. Four issues found, all fixed:
c === fn), but native tree-sitter returns fresh wrapper objects per access path — so the overlay flickered ~50% of builds. Fixed with a position-basedsameNode()comparator. Now 25–30 identical builds and identical full-analyze hashes; a strong many-build determinism test guards the class (the old 2-build test missed it).valueReachableLineschained taint by exact line co-location, so a parameter feeding a multi-line initializer (const ctx = {\n x: p \n}) dropped every downstream that object reached — telling an agent a change was safe when it wasn't (reproduced on real zodsafeParseAsync). Fixed by tagging RHS uses with the fed def's start line (useInDefLine) and chaining on it; verifiedhandleResultnow surfaces on the real zod function.exact: Pythonglobal/ same-levelnonlocalwereexactdespite hidden mutation; nowmay.with ... asbinding defs.Also widened test coverage to the integration seams: a watcher edit recomputes the overlay byte-equal to a fresh build, and the cross-call
maylabel is asserted. Final sweep across zod/cobra/requests: 1,596 overlays, 0 violations, 0 key leaks, ~91% exact / 9% may, deterministic.Deep correctness round (adversarial audit + real-repo dogfooding)
Two parallel agents — a real-repo stress test and an independent soundness audit — hunted for the one thing that must never happen: an
exactlabel that's actually wrong. They found a cluster of bugs, all now fixed and regression-tested:let/const/:=(TS/Go) or a Python comprehension loop var reached an outer same-named use asexactand dropped the correct outer edge. AresolveScopespre-pass now keys every identifier byname#scopeId; reaching-defs groups by the scoped key while emitting the bare name. Inner shadows no longer kill or link to outer variables.x++/x--are now recorded as a use+def (were invisible) and detected as closure mutations.expression_listLHS (x = 5), inc/dec, and address-of taken inside a closure — all correctly downgrade the outer var tomay.try/except/else— theelseruns on the no-exception path after the try body; previously ignored (lost its def, kept the overwritten try def).labeled_statementis unwrapped so the loop keeps its back-edge and loop-carried dependence.for_clause/range_clausechild, so loop counters and range vars were invisible (and a range var could alias an outer same-name asexact). AloopHeaderFieldhelper now descends into the clause; counters/range vars are defined, loop-carried, and scoped.A follow-up verification audit confirmed all fixes hold and TS/Python scope resolution is clean; it surfaced the Go loop-header gap above, now fixed. Validated on zod, requests, and cobra (1,596 overlays): 0 structural-invariant violations, ~90%
exact/ 10%may, 0 scope keys leaked into the output across 9,634 edges, deterministic across repeated runs. The earlier express/flask/mux/ky/type-fest sweep adds another 1,000+ overlays clean.Correctness & real-world validation
The
exactprecision label is the load-bearing safety signal — a wrongexactis the one thing that taxes an agent with incorrect context, so it gets adversarial scrutiny:&x), can change out of band. Both were silently labeledexact; they are now downgraded tomay. The escape set is over-approximated on purpose (a false inclusion only weakens precision, never yields an unsoundexact); read-only closure captures are not downgraded, so precision is preserved where sound.exactnow means soundly exact for the supported languages.isStructurallyValidruns on every built CFG (one entry/exit, edge endpoints reference real blocks, positive def/use lines, valid labels). A violation makes the builder emit no overlay rather than wrong context — so future grammar drift or builder bugs degrade safely.Design posture: the overlay is opt-in, DB-only, fail-soft, and label-gated — an agent that doesn't request value-level precision sees byte-identical output and zero new data; one that does gets an honest
exact/maysignal, never an over-claimed dependence.Verification
cfg.test.ts— every spec scenario (branch/join, loop back edge, early-return path termination, determinism, fail-soft; local-scalar def-use, reassignment-kills, both-branches-reach, field write labeledmay, exact/may distinguishable) across TypeScript, Python, and Go.cfg-overlay-storage.test.ts— overlay loadable from store, not in the resident serialized graph, schema bump rebuilds without migration, per-file delete is isolated.graph.test.tsvalue-level block — default unchanged, narrows to the data-dependent callee, falls back when no overlay.vitest run src examples): 3555 passed, 2 skipped.llm-context.json(resident memory unchanged); value-levelanalyze_impacton paramanarrows downstream to exactly the data-dependent callee.Spec / decisions
Spec delta merged into
openspec/specs/analyzer/spec.md+mcp-handlers/spec.mdvia approved decisionsc8f2b9bf(overlay-in-live-tree-extractors) andb6f04199(value-level opt-in). ADRadr-0007.Out of scope (as specified)
Inter-procedural/whole-program data-flow, sound aliasing/points-to, control-dependence/slicing as first-class outputs, statement-level nodes in the resident graph, and languages beyond TS/JS·Python·Go in v1.
🤖 Generated with Claude Code