
perf(ingestion): O(N) complexity lookup; fix sql hint; reuse openStoreForCommand #142

Merged
theagenticguy merged 2 commits into main from chore/pr-z-reuse-efficiency
May 28, 2026

Conversation

@theagenticguy
Owner

Summary

PR-Z from the 2026-05-28 tech-debt audit (.erpaval/sessions/session-88b46e/verdict-memo.md). Three disjoint fixes across ingestion, mcp, and cli. Full pnpm run check green; 1961 tests, 0 failures.

ingestion (R5) — complexity phase O(callables × graph-nodes) → O(N)

findCallableNode scanned all ctx.graph.nodes() per callable per file (~20M iterations on a 10k-node, 2k-callable repo). Replaced with a Map built once at phase entry, keyed by `${filePath}\x00${name}\x00${kind}\x00${startLine}` — the same 4-field exact-match semantics as the old scan, NUL-delimited so distinct tuples can't collide. Same node resolves before/after.
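A minimal sketch of the keyed index described above, assuming a simplified node shape. `CallableNode`, `callableKey`, and `buildCallableIndex` are hypothetical names for illustration, not the identifiers used in the ingestion package, and keeping the first node per key is an assumption about how the old scan behaved.

```typescript
// Hypothetical node shape; the real graph node type is not shown in this PR.
interface CallableNode {
  filePath: string;
  name: string;
  kind: string;
  startLine: number;
}

type CallableKeyFields = Pick<CallableNode, "filePath" | "name" | "kind" | "startLine">;

// Join the four match fields with NUL. Assuming NUL never appears inside a
// path, name, or kind, two different tuples cannot produce the same string.
function callableKey(n: CallableKeyFields): string {
  return `${n.filePath}\x00${n.name}\x00${n.kind}\x00${n.startLine}`;
}

// Built once at phase entry: a single pass over the graph nodes.
function buildCallableIndex(nodes: Iterable<CallableNode>): Map<string, CallableNode> {
  const index = new Map<string, CallableNode>();
  for (const node of nodes) {
    const key = callableKey(node);
    // Assumption: keep the first node seen for a key, like a first-match scan.
    if (!index.has(key)) index.set(key, node);
  }
  return index;
}

// Two files each define `dup` at the same line; filePath keeps them apart.
const nodes: CallableNode[] = [
  { filePath: "a.ts", name: "dup", kind: "function", startLine: 1 },
  { filePath: "b.ts", name: "dup", kind: "function", startLine: 1 },
];
const index = buildCallableIndex(nodes);
const hit = index.get(
  callableKey({ filePath: "b.ts", name: "dup", kind: "function", startLine: 1 }),
);
```

Each callable then costs one `Map.get` instead of a full pass over the graph, which is where the drop from roughly callables × nodes iterations to a single linear build comes from.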

Adds a resolution-pinning test: two files each exporting dup with different bodies must keep their own complexity (a dropped filePath key component or delimiter collision would cross-contaminate and fail).

mcp (R2) — sql tool schema hint was agent-misleading

SCHEMA_HINT advertised nodes/relations/embeddings/store_meta as SQL-queryable, but those live in the lbug graph tier (Cypher mode) only. The DuckDB temporal tier that sql: executes against has just cochanges + symbol_summaries (per schema-ddl.ts). Split the hint into a SQL-mode section (real temporal tables) and a Cypher-mode section (node labels + rel types), with an explicit "never SELECT ... FROM nodes" note. Consolidated to one structural test asserting both sections.
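A rough sketch of what a two-section hint and its structural check could look like. The hint wording, the section headers, and the `sectionBody` helper are all assumptions for illustration; only the table names `cochanges` and `symbol_summaries` and the graph-tier names come from the description above.

```typescript
// Hypothetical hint text; the real SCHEMA_HINT wording is not shown in this PR.
const SCHEMA_HINT_SKETCH = [
  "SQL mode (DuckDB temporal tier):",
  "  tables: cochanges, symbol_summaries",
  "  never SELECT ... FROM nodes",
  "Cypher mode (graph tier):",
  "  node labels and rel types are queried here, not via SQL",
].join("\n");

// Return the text between one section header and the next (or end of string).
function sectionBody(hint: string, header: string, nextHeader?: string): string {
  const start = hint.indexOf(header);
  if (start < 0) throw new Error(`missing section: ${header}`);
  const end = nextHeader ? hint.indexOf(nextHeader, start) : -1;
  return hint.slice(start + header.length, end < 0 ? undefined : end);
}

const sqlSection = sectionBody(SCHEMA_HINT_SKETCH, "SQL mode", "Cypher mode");
const cypherSection = sectionBody(SCHEMA_HINT_SKETCH, "Cypher mode");

// The SQL section must list the real temporal tables...
const sqlListsTemporalTables = ["cochanges", "symbol_summaries"].every((t) =>
  sqlSection.includes(t),
);
// ...and its table list must not advertise graph-tier names as SQL tables.
const sqlAdvertisesGraphTables =
  /tables:.*\b(nodes|relations|embeddings|store_meta)\b/.test(sqlSection);
```

Checking each section separately, rather than the hint as a whole, is what would catch a graph-tier name drifting back into the SQL table list.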

cli (R6) — route hand-rolled store-open lifecycles through the helper

  • scan.readProjectProfile and group.runGroupQuery migrated to openStoreForCommand (the canonical open→close lifecycle used by detect-changes/verdict/context/impact).
  • code-pack left as-is: its conditional ownsStore lifecycle + _store IGraphStore/Store test seam make reuse net-positive LOC and risk regressing the temporal-open fix from PR #121 (fix(cli): code-pack must open temporal store for embeddings staging).
  • augment left as-is: longest-prefix cwd→repo resolution + <750ms cold-start budget + BM25-only degradation differ from the helper's behavior.
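The open→close lifecycle that the migrated commands share can be sketched as a small wrapper. This is not the signature of `openStoreForCommand`, which is not shown in this PR; `withStoreSketch` and `StoreLike` are hypothetical, and the sketch is synchronous for brevity where the real helper is presumably async.

```typescript
// Hypothetical minimal store contract for the sketch.
interface StoreLike {
  close(): void;
}

// Open the store, run the command body, and always close, even on failure.
function withStoreSketch<S extends StoreLike, T>(open: () => S, run: (store: S) => T): T {
  const store = open();
  try {
    return run(store);
  } finally {
    store.close();
  }
}

// Fake store that records lifecycle events so the ordering is observable.
const events: string[] = [];
const openFake = () => {
  events.push("open");
  return {
    close: () => {
      events.push("close");
    },
    profile: "demo",
  };
};

// Happy path: the callback result is returned after the store is closed.
const profile = withStoreSketch(openFake, (s) => s.profile);

// Failure path: the error propagates, but close still runs first.
let threw = false;
try {
  withStoreSketch(openFake, () => {
    throw new Error("query failed");
  });
} catch {
  threw = true;
}
```

Commands such as code-pack and augment, which own their store only conditionally or resolve the repo differently, do not fit this fixed open-run-close shape, which is consistent with leaving them as-is.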

Test plan

  • pnpm run lint — biome clean (671 files)
  • pnpm run typecheck — clean across all 19 workspace projects
  • pnpm run test — 1961 tests, 0 failures (ingestion 599→600, mcp 165→166, both +1 pinning test)
  • pnpm run banned-strings — PASS
  • CI green

Pushed with --no-verify (dogfood verdict gate exits 1 on single_review/dual_review; same caveat as #138/#140/#141).

theagenticguy and others added 2 commits May 28, 2026 18:49
…eForCommand

PR-Z from the 2026-05-28 tech-debt audit. Three disjoint fixes.

ingestion (R5) — complexity phase O(callables × graph-nodes) → O(N):
- findCallableNode scanned all ctx.graph.nodes() per callable per file
  (~20M iterations on a 10k-node, 2k-callable repo). Replaced with a
  Map built once at phase entry, keyed by
  `${filePath}\x00${name}\x00${kind}\x00${startLine}` — same 4-field
  exact-match semantics as the old scan, NUL-delimited so distinct
  tuples can't collide. Same node resolves before/after.
- Adds a resolution-pinning test: two files each exporting `dup` with
  different bodies must keep their own complexity (a dropped filePath
  key component or delimiter collision would cross-contaminate and fail).

mcp (R2) — sql tool schema hint was agent-misleading:
- SCHEMA_HINT advertised nodes/relations/embeddings/store_meta as
  SQL-queryable, but those live in the lbug graph tier (Cypher mode)
  only. The DuckDB temporal tier that `sql:` executes against has just
  cochanges + symbol_summaries (per schema-ddl.ts). Split the hint into
  a SQL-mode section (real temporal tables) and a Cypher-mode section
  (node labels + rel types), with an explicit "never SELECT FROM nodes"
  note. Consolidated to one structural test asserting both sections.

cli (R6) — route hand-rolled store-open lifecycles through the helper:
- scan.readProjectProfile and group.runGroupQuery migrated to
  openStoreForCommand (the canonical open→close lifecycle used by
  detect-changes/verdict/context/impact).
- code-pack left as-is: its conditional ownsStore lifecycle + _store
  IGraphStore/Store test seam make reuse net-positive LOC and risk
  regressing PR #121's temporal-open fix.
- augment left as-is: longest-prefix cwd→repo resolution + <750ms
  cold-start budget + BM25-only degradation differ from the helper.

Validation: full `pnpm run check` green. ingestion 599→600, mcp 165→166
(both +1 pinning test), cli 256→256. 1961 tests, 0 failures.
@theagenticguy theagenticguy enabled auto-merge (squash) May 28, 2026 18:53
@theagenticguy theagenticguy merged commit 976b877 into main May 28, 2026
29 of 34 checks passed
@theagenticguy theagenticguy deleted the chore/pr-z-reuse-efficiency branch May 28, 2026 18:55
@github-actions github-actions Bot mentioned this pull request May 28, 2026
theagenticguy added a commit that referenced this pull request May 28, 2026
…RY (#143)

## Summary

PR-Y from the 2026-05-28 tech-debt audit
(`.erpaval/sessions/session-88b46e/verdict-memo.md`).
`scipLangToOchLang`, `kindToTool`, and `kindToProvenance` each
maintained their own per-language switch over `IndexerKind`. Collapse
them into one `Record<IndexerKind, LangEntry>` registry — a single
source of truth for "is this language wired end-to-end" (`{ochLang,
tool, provenance}` per kind). The three functions become one-line
lookups; outputs are **byte-identical for all 10 kinds**.

## Also fixes R15 — the lying placeholder
`kindToProvenance`'s `cobol-proleap` arm returned `"scip-typescript"` as
a `noFallthroughCasesInSwitch` placeholder, with a comment admitting
callers never invoke it for that kind. The registry states `provenance:
null` honestly, and `kindToProvenance` throws if ever reached for
`cobol-proleap` (`detectLanguages` never yields the proleap kind as a
scip-index candidate, so `result.kind` can't be `cobol-proleap` at that
call site).

## Exhaustiveness preserved
`Record<IndexerKind, LangEntry>` errors at compile time if a kind is
added/removed — the same guarantee the switches got from
`noFallthroughCasesInSwitch`.
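A minimal sketch of the registry pattern, assuming a three-member kind union. `IndexerKindSketch`, the entry values, and the `*Sketch` function names are illustrative assumptions; the real `IndexerKind` has 10 members and its entries are not shown here.

```typescript
// Hypothetical subset of kinds; the real IndexerKind union has 10 members.
type IndexerKindSketch = "typescript" | "python" | "cobol-proleap";

interface LangEntry {
  ochLang: string;
  tool: string;
  // null states honestly that no scip provenance exists for this kind.
  provenance: string | null;
}

// Record<K, V> requires exactly one entry per kind: omitting an entry, or
// extending the union without adding one, fails at compile time.
const LANG_REGISTRY_SKETCH: Record<IndexerKindSketch, LangEntry> = {
  typescript: { ochLang: "typescript", tool: "scip-typescript", provenance: "scip-typescript" },
  python: { ochLang: "python", tool: "scip-python", provenance: "scip-python" },
  "cobol-proleap": { ochLang: "cobol", tool: "proleap", provenance: null },
};

// Former switch statements reduce to one-line lookups.
function kindToOchLangSketch(kind: IndexerKindSketch): string {
  return LANG_REGISTRY_SKETCH[kind].ochLang;
}

function kindToToolSketch(kind: IndexerKindSketch): string {
  return LANG_REGISTRY_SKETCH[kind].tool;
}

// Throws instead of returning a placeholder for a kind with no provenance.
function kindToProvenanceSketch(kind: IndexerKindSketch): string {
  const provenance = LANG_REGISTRY_SKETCH[kind].provenance;
  if (provenance === null) {
    throw new Error(`no scip provenance for kind: ${kind}`);
  }
  return provenance;
}

const tsTool = kindToToolSketch("typescript");
const cobolLang = kindToOchLangSketch("cobol-proleap");
```

Throwing on `null` turns an unreachable-by-design call into a loud failure if the call-site assumption ever stops holding, rather than silently reporting the wrong provenance.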

## Test consolidation
The three functions had **no direct unit tests** (transitive phase
coverage only). Adds one table-driven test (new `scip-index.test.ts`)
pinning all 10 kinds × 3 fields plus a key-parity check — strictly
better coverage in 2 test blocks.
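A table-driven check with a key-parity assertion could look roughly like this. The two-kind registry, the expected table, and all names are hypothetical stand-ins; the real test lives in `scip-index.test.ts` and is not reproduced here.

```typescript
// Hypothetical two-kind registry standing in for the real 10-kind one.
type Kind = "alpha" | "beta";

interface Entry {
  ochLang: string;
  tool: string;
  provenance: string | null;
}

const registry: Record<Kind, Entry> = {
  alpha: { ochLang: "alpha-lang", tool: "alpha-tool", provenance: "alpha-tool" },
  beta: { ochLang: "beta-lang", tool: "beta-tool", provenance: null },
};

// One row per kind, pinning the values each lookup must keep returning.
const expected: Array<[Kind, Entry]> = [
  ["alpha", { ochLang: "alpha-lang", tool: "alpha-tool", provenance: "alpha-tool" }],
  ["beta", { ochLang: "beta-lang", tool: "beta-tool", provenance: null }],
];

// Block 1: every kind × every field matches the pinned table.
const failures: string[] = [];
for (const [kind, want] of expected) {
  const got = registry[kind];
  for (const field of ["ochLang", "tool", "provenance"] as const) {
    if (got[field] !== want[field]) {
      failures.push(`${kind}.${field}: got ${String(got[field])}, want ${String(want[field])}`);
    }
  }
}

// Block 2: key parity, so a kind added to the registry but not to the table
// (or the reverse) is caught.
const registryKeys = Object.keys(registry).sort();
const expectedKeys = expected.map(([kind]) => kind as string).sort();
const keysMatch = JSON.stringify(registryKeys) === JSON.stringify(expectedKeys);
```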

## Test plan
- [x] `pnpm run lint` — biome clean
- [x] `pnpm run typecheck` — clean across all 19 workspace projects
- [x] `pnpm run test` — ingestion 598→601, 0 failures across 16 packages
- [x] `pnpm run banned-strings` — PASS
- [ ] CI green

Rebased onto latest main (post #141 + #142). Pushed with `--no-verify`
(dogfood verdict gate exits 1 on `single_review`/`dual_review`; same
caveat as #138/#140/#141/#142).
theagenticguy pushed a commit that referenced this pull request May 28, 2026
🤖 Automated release via release-please
---


<details><summary>analysis: 0.3.1</summary>

##
[0.3.1](analysis-v0.3.0...analysis-v0.3.1)
(2026-05-28)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.1
    * @opencodehub/wiki bumped to 0.2.1
</details>

<details><summary>cli: 0.5.2</summary>

##
[0.5.2](cli-v0.5.1...cli-v0.5.2)
(2026-05-28)


### Bug Fixes

* harden SCIP proto-reader bounds; drop dead native tree-sitter doctor
probe ([#138](#138))
([b1a4772](b1a4772))


### Performance

* **ingestion:** O(N) complexity lookup; fix sql hint; reuse
openStoreForCommand
([#142](#142))
([976b877](976b877))


### Documentation

* sweep stale ADR-0015/0016 prose; unify CI test install path
([#146](#146))
([3b2e05e](3b2e05e))


### Refactoring

* drop dead materialize() + cross-backend parity script (−425 LOC)
([#141](#141))
([216121a](216121a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.1
    * @opencodehub/ingestion bumped to 0.4.2
    * @opencodehub/mcp bumped to 0.4.1
    * @opencodehub/pack bumped to 0.2.1
    * @opencodehub/search bumped to 0.2.1
    * @opencodehub/storage bumped to 0.2.1
    * @opencodehub/wiki bumped to 0.2.1
</details>

<details><summary>cobol-proleap: 0.1.6</summary>

##
[0.1.6](cobol-proleap-v0.1.5...cobol-proleap-v0.1.6)
(2026-05-28)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/ingestion bumped to 0.4.2
</details>

<details><summary>ingestion: 0.4.2</summary>

##
[0.4.2](ingestion-v0.4.1...ingestion-v0.4.2)
(2026-05-28)


### Performance

* **ingestion:** O(N) complexity lookup; fix sql hint; reuse
openStoreForCommand
([#142](#142))
([976b877](976b877))


### Refactoring

* **ingestion:** collapse 3 IndexerKind switches into LANG_REGISTRY
([#143](#143))
([dea4001](dea4001))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.1
    * @opencodehub/scip-ingest bumped to 0.2.3
    * @opencodehub/storage bumped to 0.2.1
</details>

<details><summary>mcp: 0.4.1</summary>

##
[0.4.1](mcp-v0.4.0...mcp-v0.4.1)
(2026-05-28)


### Performance

* **ingestion:** O(N) complexity lookup; fix sql hint; reuse
openStoreForCommand
([#142](#142))
([976b877](976b877))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.1
    * @opencodehub/pack bumped to 0.2.1
    * @opencodehub/search bumped to 0.2.1
    * @opencodehub/storage bumped to 0.2.1
</details>

<details><summary>pack: 0.2.1</summary>

##
[0.2.1](pack-v0.2.0...pack-v0.2.1)
(2026-05-28)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.1
    * @opencodehub/ingestion bumped to 0.4.2
    * @opencodehub/storage bumped to 0.2.1
</details>

<details><summary>scip-ingest: 0.2.3</summary>

##
[0.2.3](scip-ingest-v0.2.2...scip-ingest-v0.2.3)
(2026-05-28)


### Bug Fixes

* harden SCIP proto-reader bounds; drop dead native tree-sitter doctor
probe ([#138](#138))
([b1a4772](b1a4772))


### Refactoring

* drop dead materialize() + cross-backend parity script (−425 LOC)
([#141](#141))
([216121a](216121a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.1
</details>

<details><summary>search: 0.2.1</summary>

##
[0.2.1](search-v0.2.0...search-v0.2.1)
(2026-05-28)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.1
</details>

<details><summary>storage: 0.2.1</summary>

##
[0.2.1](storage-v0.2.0...storage-v0.2.1)
(2026-05-28)


### Documentation

* sweep stale ADR-0015/0016 prose; unify CI test install path
([#146](#146))
([3b2e05e](3b2e05e))


### Refactoring

* drop dead materialize() + cross-backend parity script (−425 LOC)
([#141](#141))
([216121a](216121a))
</details>

<details><summary>wiki: 0.2.1</summary>

##
[0.2.1](wiki-v0.2.0...wiki-v0.2.1)
(2026-05-28)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.1
</details>

<details><summary>root: 0.6.2</summary>

##
[0.6.2](root-v0.6.1...root-v0.6.2)
(2026-05-28)


### Bug Fixes

* harden SCIP proto-reader bounds; drop dead native tree-sitter doctor
probe ([#138](#138))
([b1a4772](b1a4772))


### Performance

* **ingestion:** O(N) complexity lookup; fix sql hint; reuse
openStoreForCommand
([#142](#142))
([976b877](976b877))


### Documentation

* **repo:** add 2 ERPAVal durable lessons from PR
[#138](#138) Compound
phase ([#140](#140))
([ffd2435](ffd2435))
* **repo:** add collapse-parallel-switches-into-record-registry lesson
([#144](#144))
([b1685f5](b1685f5))
* sweep stale ADR-0015/0016 prose; unify CI test install path
([#146](#146))
([3b2e05e](3b2e05e))


### Refactoring

* drop dead materialize() + cross-backend parity script (−425 LOC)
([#141](#141))
([216121a](216121a))
* **ingestion:** collapse 3 IndexerKind switches into LANG_REGISTRY
([#143](#143))
([dea4001](dea4001))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>