Skip to content

feat(resolver): extract generator functions as definitions (JS/TS)#1333

Merged
carlos-alm merged 5 commits into
mainfrom
feat/generator-yield-resolution
Jun 6, 2026
Merged

feat(resolver): extract generator functions as definitions (JS/TS)#1333
carlos-alm merged 5 commits into
mainfrom
feat/generator-yield-resolution

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

  • function* gen() {} declarations were invisible to the graph — not registered as definitions, so calls inside generator bodies were attributed to the file level and yield* otherGen() edges were never emitted
  • Add generator_function_declaration and generator_function (variable-declared) to query patterns in both parser.ts and wasm-worker-entry.ts; mirror in the walk path and all export/return-type-map handlers
  • Native (Rust) engine updated in parallel for parity — match_js_node, handle_var_decl, match_js_return_type_map, handle_export_declaration
  • New jelly-micro/generators fixture: 9 named edges, 100% recall after fix (was 0%)

Test plan

  • npx vitest run tests/parsers/javascript.test.ts — 4 new generator tests, all pass
  • npx vitest run tests/benchmarks/resolution/jelly-micro.test.ts — generators fixture: 100% recall (9/9 TP, 0 FP, 0 FN)
  • npx vitest run --exclude "tests/benchmarks/**" — full suite green (2656 tests)

Closes #1319

Register function* declarations and const gen = function*() {} as
first-class function definitions in both the WASM and native engines.

Before this change, generator_function_declaration nodes were absent
from the definition registry, so:
- Calls inside generator bodies were attributed to the file level
- yield* gen8() and other inter-generator calls produced no edges
- The generators jelly-micro fixture scored 0% recall (0/9 named edges)

Changes:
- parser.ts + wasm-worker-entry.ts: add generator_function_declaration
  and generator_function (var-declared) to COMMON_QUERY_PATTERNS
- extractors/javascript.ts: walk-path case, handleVariableDecl,
  extractConstDeclarators skip guard, export kindMaps, return type map
- extractors/javascript.rs: equivalent changes for native engine parity
- Add generators jelly-micro fixture (9 named edges, 100% recall)
- Add 4 unit tests for generator extraction and yield* call capture

Closes #1319
docs check acknowledged
@claude
Copy link
Copy Markdown

claude Bot commented Jun 5, 2026

Claude finished @carlos-alm's task —— View job


PR Review: feat(resolver): extract generator functions as definitions (JS/TS)

  • Gather context and understand request
  • Read changed files
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity
  • Phase 4: Code quality
  • Phase 5: Backlog compliance

Root Cause Analysis

Problem Being Solved: Generator functions (function* gen() {}) were invisible to the dependency graph. They weren't registered as definitions, so (a) all calls inside generator bodies were incorrectly attributed to file level rather than to the generator, and (b) yield* otherGen() delegation edges were never emitted. This is a real correctness gap — the graph produced false source attributions for every generator-containing codebase.

Why This Approach: Correct approach — adds generator_function_declaration and generator_function to the same extraction paths already used for function_declaration and function_expression. Mirrors the existing pattern faithfully across all four layers (query patterns, walk path, export handler, Rust extractor). No architectural novelty required; the gap was purely an omission.

Risk Assessment: Low. The change is purely additive: new node types are registered in an existing dispatch table. No existing behavior is modified.


Phase 2: Configuration Change Detection

No linting rules relaxed, no TypeScript strict flags disabled, no test timeouts or coverage thresholds changed. Clean.


Phase 3: Test Integrity

Tests are new, additive, and appropriately strict. No assertions weakened, no existing tests modified.

One minor gap: the test at tests/parsers/javascript.test.ts:53 ("attributes calls inside generator body to the generator") verifies definitions and that gen8 appears in calls — but does not assert that the call is attributed to gen9 as the source scope. The test as written would also pass if gen8 appeared in calls attributed at file level. This is the exact bug being fixed, so a tighter assertion would be:

// Ideally: assert that calls[gen8].enclosingFunction === 'gen9'

However the calls array in ExtractorOutput doesn't carry enclosing function scope — that attribution happens during graph build. So the test is as precise as the data model allows at the unit level. This is acceptable; the benchmark fixture (jelly-micro/generators) covers the end-to-end attribution through the full pipeline.


Phase 4: Code Quality

WASM path (src/extractors/javascript.ts) — changes are correct and consistent:

  • FUNCTION_SCOPE_TYPES set at line 375–382 already included both generator_function_declaration and generator_function — these were pre-existing entries added in context of the constants-walk scope guard. The new work is in the query pattern additions.
  • handleExportCapture (line 191–193) and handleExportStmt (line 945–949) — generator_function_declaration added to both kindMaps. Consistent.
  • handleVariableDecl (line 811–815) — generator_function added alongside arrow_function / function_expression. Consistent.
  • extractReturnTypeMapWalk (line 1212) and extractConstDeclarators (line 487) — both already handle generators. Pre-existing; not changed in this PR but consistent.

Query patterns (src/domain/parser.ts:154–158 and src/domain/wasm-worker-entry.ts:111–115) — two new patterns added identically to both files:

(generator_function_declaration name: (identifier) @fn_name) @fn_node
(variable_declarator name: (identifier) @varfn_name value: (generator_function) @varfn_value)

Both files' COMMON_QUERY_PATTERNS are now in sync. ✓

Rust engine (crates/codegraph-core/src/extractors/javascript.rs) — changes mirror the WASM path:

  • match_js_node at line 337 — generator_function_declaration added alongside function_declaration. ✓
  • handle_var_decl at line 500 — generator_function added to the value-type check. ✓
  • match_js_return_type_map at line 203 — already handles generator_function_declaration. ✓
  • handle_export_declaration at line 654 — already handles generator_function_declaration. ✓
  • find_parent_of_types scopes at lines 515–516 and 526–527 — generator_function_declaration and generator_function added to the "don't extract constants inside function scopes" guard. ✓

One finding worth noting: The yield* expression itself is a yield_expression node in tree-sitter with a * operator. The PR description claims yield* otherGen() edges are now emitted — this works because gen8() is a call_expression nested inside the yield_expression, and the general call_expression walk (match_js_nodehandle_call_expr) recurses into generator bodies once generators are registered as definitions. The PR doesn't add special yield* handling — it relies on the existing call-expression extraction. This is correct and simpler than the alternative, but it means the source of the call is the generator function itself (once it's a definition), which is exactly the fix needed.

No issues found with the Rust parity. The diff is clean.


Phase 5: Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies. Builds on existing tree-sitter node type coverage.
  • Foundation-aligned: ✓ — Purely local graph extraction improvement; no cloud calls, no new infrastructure. Aligns with P1 (always-current accurate graph) and P4 (zero-cost core).
  • Problem-fit: 5/5 — Generator functions are real JS/TS code. Missing definitions meant AI agents using the graph got wrong attribution for calls inside generator bodies — exactly the "hallucinated dependencies" problem the tool exists to solve. This is a correctness fix, not a nice-to-have.
  • Breaking: No — purely additive. New edges will appear in the graph for codebases using generators; no existing edges are removed or changed.
  • Tier: 1 (zero-dep ✓ + foundation-aligned ✓)

Critical Concerns

None. The one thing I would flag for awareness (not blocking):

yield* delegation vs. direct call semantics: The fixture comment says yield* gen1() creates edge gen2 → gen1, and this works because gen1() is a call_expression inside the generator body. However, yield value (no *) where value is an identifier that happens to be a generator is not captured as a call edge — e.g., yield gen8 (without ()) would produce no edge. This is correct JS semantics (yield expr without * just yields a value; yield* iterable delegates), but worth knowing if future fixtures include such patterns. Not a bug in this PR.


Final Assessment

## Root Cause Analysis
Problem Being Solved: Generator functions were invisible — zero definitions registered,
  all interior calls mis-attributed to file level, yield* delegation edges never emitted.
Why This Approach: Minimal surgical fix — adds two tree-sitter node types to each
  existing extraction dispatch table across WASM and Rust paths. No new abstractions needed.
Risk Assessment: Low — purely additive to existing dispatch tables.

## Backlog Compliance
- Zero-dep: ✓
- Foundation-aligned: ✓
- Problem-fit: 5 — correctness fix that prevents wrong attribution for generator-heavy codebases
- Breaking: No
- Tier: 1

## Critical Concerns
None.

## Final Recommendation
- Rating: ⭐⭐⭐⭐⭐ (5/5)
- Action: APPROVE
- Reasoning: Clean, well-scoped correctness fix with full dual-engine parity, a hand-annotated
  benchmark fixture, and 4 targeted unit tests. The change is minimal relative to the bug
  severity (0% → 100% recall on generators). No regressions possible from the additive approach.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 5, 2026

Greptile Summary

This PR fixes a gap in JS/TS graph extraction where function* declarations were completely invisible — not registered as definitions, causing calls inside generator bodies to be attributed to file scope and yield* delegation edges to be dropped entirely.

  • TS extractor & query patterns (javascript.ts, parser.ts, wasm-worker-entry.ts): adds generator_function_declaration and generator_function to all query patterns, walk-path switch statements, variable-declaration handlers, export kind-maps, and return-type-map walks in lock-step.
  • Rust native engine (javascript.rs): mirrors every TS change — match_js_node, handle_var_decl, match_js_return_type_map, and handle_export_declaration all extended for parity.
  • Tests & fixture (javascript.test.ts, generators/): 4 new unit tests (declaration, variable-form, body attribution via line-range containment, yield* capture) and a 9-edge benchmark fixture that goes from 0 % to 100 % recall.

Confidence Score: 5/5

Safe to merge — the change is a targeted, additive extension that mirrors every existing function-type handler pattern without touching any shared logic paths.

Every modified surface (query patterns, walk-path switch, variable-decl handler, export kind-map, return-type-map, Rust counterparts) receives an identical, symmetric change alongside the existing function_declaration/function_expression cases. The extractConstDeclarators skip-guard is correctly extended to prevent duplicate constant entries. Tests cover declaration-form, variable-form, call attribution with line-range containment, and yield* capture. The benchmark fixture confirms end-to-end 9/9 edge recall. No existing code paths are altered.

No files require special attention.

Important Files Changed

Filename Overview
src/extractors/javascript.ts Adds generator_function_declaration and generator_function to walk-path switch, handleVariableDecl, extractConstDeclarators skip-guard, handleExportStmt/handleExportCapture kind-maps, and extractReturnTypeMapWalk — all symmetrically with existing function types.
crates/codegraph-core/src/extractors/javascript.rs Rust parity update: match_js_node, handle_var_decl, match_js_return_type_map, and handle_export_declaration each gain generator_function(_declaration) alongside the existing function equivalents.
src/domain/parser.ts Adds two new tree-sitter query patterns (generator_function_declaration and generator_function variable form) to COMMON_QUERY_PATTERNS, matching the wasm-worker-entry update.
src/domain/wasm-worker-entry.ts Mirrors the same two query-pattern additions as parser.ts in the worker's local COMMON_QUERY_PATTERNS copy.
tests/parsers/javascript.test.ts Adds 4 focused generator tests; the attribution test uses line-range containment to verify calls are scoped to the generator body, addressing the previously raised concern about weak assertions.
tests/benchmarks/resolution/fixtures/jelly-micro/generators/generators.js New benchmark fixture covering declaration-form generators, variable-declared generators, yield* delegation, mutual recursion, and regular-function callees — all 9 expected edges are hand-annotated.
tests/benchmarks/resolution/fixtures/jelly-micro/generators/expected-edges.json JSON fixture correctly specifies 9 call edges matching generators.js; all edge notes match the source-level comments in the fixture file.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph "Before (0% recall)"
        A1["function* gen(){}"] -->|"not recognised"| B1["No definition (file-scope attribution)"]
        C1["const g = function*(){}"] -->|"not recognised"| D1["No definition (file-scope attribution)"]
    end

    subgraph "After this PR (100% recall)"
        A2["function* gen(){}"] -->|"generator_function_declaration"| B2["Definition registered (walk + query)"]
        C2["const g = function*(){}"] -->|"generator_function"| D2["Definition registered (walk + query)"]
        B2 --> E2["Calls inside body scoped to generator"]
        D2 --> E2
        E2 --> F2["yield*/call edges emitted correctly"]
    end

    subgraph "Touched surfaces"
        G["parser.ts COMMON_QUERY_PATTERNS"]
        H["wasm-worker-entry.ts COMMON_QUERY_PATTERNS"]
        I["javascript.ts walk + export + return-type-map"]
        J["javascript.rs match_js_node + handle_var_decl + match_js_return_type_map + handle_export_declaration"]
    end
Loading

Reviews (7): Last reviewed commit: "Merge branch 'main' into feat/generator-..." | Re-trigger Greptile

Comment on lines +53 to +62
it('attributes calls inside generator body to the generator', () => {
const symbols = parseJS(`
function* gen9() { yield* gen8(); }
function* gen8() { yield 1; }
`);
expect(symbols.definitions).toContainEqual(expect.objectContaining({ name: 'gen9' }));
expect(symbols.definitions).toContainEqual(expect.objectContaining({ name: 'gen8' }));
expect(symbols.calls).toContainEqual(expect.objectContaining({ name: 'gen8' }));
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Call attribution not verified

The test attributes calls inside generator body to the generator only confirms that gen8 appears somewhere in symbols.calls — it does not assert that the call is scoped to gen9. The original bug was that calls fell through to file-level attribution, and this test would pass even with the old (broken) behavior since the call would still appear in the flat calls list. A stronger assertion would check that no definition with a matching line range contains the call, or use the graph resolution path to confirm the gen9 → gen8 edge actually appears.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — rewrote the test to use multi-line generator bodies so line ranges are distinct, then added line-range containment assertions: the call to gen8 must fall within gen9's [line, endLine] range (proving it is attributed to gen9's body), and must NOT fall within gen8's own range. This would have caught the original bug — with the old behavior, generators had no definitions and all calls had file-level attribution outside any definition's range.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Codegraph Impact Analysis

12 functions changed15 callers affected across 3 files

  • match_js_return_type_map in crates/codegraph-core/src/extractors/javascript.rs:354 (0 transitive callers)
  • match_js_node in crates/codegraph-core/src/extractors/javascript.rs:488 (0 transitive callers)
  • handle_var_decl in crates/codegraph-core/src/extractors/javascript.rs:642 (1 transitive callers)
  • handle_export_declaration in crates/codegraph-core/src/extractors/javascript.rs:805 (2 transitive callers)
  • handleExportCapture in src/extractors/javascript.ts:181 (3 transitive callers)
  • extractConstDeclarators in src/extractors/javascript.ts:473 (3 transitive callers)
  • walkJavaScriptNode in src/extractors/javascript.ts:635 (2 transitive callers)
  • handleVariableDecl in src/extractors/javascript.ts:806 (3 transitive callers)
  • handleExportStmt in src/extractors/javascript.ts:945 (3 transitive callers)
  • extractReturnTypeMapWalk in src/extractors/javascript.ts:1200 (3 transitive callers)
  • walk in src/extractors/javascript.ts:1204 (12 transitive callers)
  • gen4helper in tests/benchmarks/resolution/fixtures/jelly-micro/generators/generators.js:18 (1 transitive callers)

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@claude

@claude
Copy link
Copy Markdown

claude Bot commented Jun 5, 2026

Claude finished @carlos-alm's task —— View job


Review: Generator attribution test strengthening

  • Gather context and understand request
  • Read changed files to verify fix implementation
  • Assess test attribution assertion strength

The fix flagged by Greptile is already implemented in commit 037d326 ("test(parsers): strengthen generator call attribution assertion with line-range check").

What changed (tests/parsers/javascript.test.ts:53–77):

The test was rewritten to use multi-line generators so line ranges are non-overlapping, then added two containment assertions:

// The call's line must fall within gen9's range
expect(gen8Call!.line).toBeGreaterThanOrEqual(gen9Def!.line);
expect(gen8Call!.line).toBeLessThanOrEqual(gen9Def!.endLine!);

// Negative: the call must NOT fall within gen8's own range
const callIsInsideGen8 = gen8Call!.line >= gen8Def!.line && ...;
expect(callIsInsideGen8).toBe(false);

This is exactly what was requested. With the old (broken) behavior, generators had no definitions, so all calls had file-level attribution and gen9Def would be undefined — the test would have failed at line 61 (expect(gen9Def).toBeDefined()). The attribution check at lines 70–71 provides the additional depth to verify correct scoping.

Status: No changes needed — the fix is already in the branch and addresses the review feedback.


| Branch: feat/generator-yield-resolution

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit c62e3b8 into main Jun 6, 2026
28 checks passed
@carlos-alm carlos-alm deleted the feat/generator-yield-resolution branch June 6, 2026 06:23
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 6, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(resolver): track yielded values from generator functions (JS)

1 participant