Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .erpaval/INDEX.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,9 @@ development sessions. Solutions are reusable; specs are per-feature.
- [Lift pure helpers to the deepest shared workspace dependency to break future cycles](solutions/architecture-patterns/lift-pure-functions-to-shared-dep-to-break-cycles.md) — `mcp → pack → mcp` was averted by lifting `classifyDependencies` into `@opencodehub/analysis` (the LCA dep). 30-LOC mechanical chore commit.
- [Worktree isolation — pin pwd at task start and exclude worktrees from biome v2](solutions/best-practices/worktree-isolation-pwd-pin-and-biome-exclusion.md) — gitignore is not enough for biome v2; scope to `packages/` or add `experimentalScannerIgnores`. Always `pwd && git rev-parse --show-toplevel` at task start.
- [Resolve milestone-old spec drifts inline with the implementing commit](solutions/best-practices/spec-drift-amend-inline-with-implementing-commit.md) — amend spec wording in the same commit that implements the resolution; record drifts with `recommend` in explore-delta so Gate 0 is a confirmation, not a fresh debate.
- [Segregate graph-only and tabular-only stores at the interface boundary](solutions/architecture-patterns/igraphstore-itemporalstore-segregation.md) — when one type extends multiple sub-interfaces and a concrete implementor can't honestly satisfy all, segregate at the interface, not the class. `IGraphStore` + `ITemporalStore` + `openStore()` composition factory.
- [Replace raw-SQL escape hatches with typed finders on the storage interface](solutions/architecture-patterns/typed-finders-replace-raw-sql-in-consumers.md) — 108 raw-SQL sites collapse into 15 named finders. Adapters internalize dialect; consumers stay backend-agnostic. Liskov-clean parity harness via public-method rebuilder.
- [Parallel Act subagents on a shared git tree — interleaving + cherry-pick discipline](solutions/best-practices/parallel-act-subagents-with-shared-git-tree.md) — verify branch state, spawn on non-overlapping packages, watch for stale dist + phantom test counts, watch the test-fixup tail.

## Specs

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
---
title: Segregate graph-only and tabular-only stores at the interface boundary
tags: [interface-segregation, liskov, storage, multi-backend, igraphstore]
session: session-33f24f
---

## Context

`IGraphStore` originally extended `CochangeStore + SymbolSummaryStore` and
exposed `query(sql, params)`. `GraphDbStore` (LadybugDB) couldn't honestly
satisfy `lookupCochangesForFile` — it threw `NotImplementedError` on six
methods. The "obvious" fix was to *implement* cochanges on the graph
adapter. The clean fix was to *delete* those signatures from the graph
interface entirely.

After AC-A-1 (split) + AC-A-3 (residue cleanup): `IGraphStore` is graph-only
(Cypher dialect or none). `ITemporalStore` is tabular-only (SQL `exec()` +
cochanges + symbol summaries). `openStore({path, backend}) -> {graph,
temporal, close, describe}` composes both. DuckDB-only deployments share
one connection between views via structural typing — no class split. LadybugDB
deployments open `graph.lbug` + `temporal.duckdb` as siblings.

## Lesson

When one type extends multiple sub-interfaces and a concrete implementor
can't honestly satisfy all of them, segregate at the interface boundary.
NOT at the class. The concrete that DOES satisfy both stays as one class
implementing both interfaces (structural typing); the concrete that only
satisfies one drops the other entirely from its `implements` list.

Procedure:

1. Name the two cohesive interfaces — pick the responsibility, not the
storage technology. Here: graph operations vs tabular operations.
2. Add a composition factory (`openStore`) that returns BOTH views in one
envelope. Callers needing both take the envelope; callers needing one
take the narrow interface.
3. Delete the cross-cutting methods from the narrow interface entirely.
Concrete adapters that don't implement them no longer need to throw
`NotImplementedError`.
4. Test contract for community adapters: only the narrow interface, with a
conformance suite that any implementor imports + runs.

## Why this matters

This pattern lets community contributors fork in adapters without
re-implementing concerns that don't belong on their backend. An AGE /
Memgraph / Neo4j / Neptune author implements `IGraphStore` only —
DuckDB stays as the temporal backend on every deployment. Two files to
fork in: implement IGraphStore + call `assertIGraphStoreConformance` in
their test. The pattern beats the alternative ("one mega-interface,
each adapter throws NotImplementedError on what it can't do") on type
honesty, conformance verifiability, and Liskov compliance.

## Example

- `packages/storage/src/interface.ts` — split into IGraphStore + ITemporalStore.
- `packages/storage/src/index.ts` — openStore factory composes views.
- `packages/storage/src/graphdb-adapter.ts` — implements IGraphStore only.
- `packages/storage/src/duckdb-adapter.ts` — implements both via structural typing.
- `packages/storage/src/test-utils/conformance.ts` (AC-A-11) — pre-baked test
suite that any IGraphStore implementor imports.
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
---
title: Replace raw-SQL escape hatches with typed finders on the storage interface
tags: [service-layer, dialect-leak, typed-finders, dry, igraphstore]
session: session-33f24f
---

## Context

108 raw-SQL call sites lived outside `packages/storage/`: 46 in mcp/, 27
in analysis/, 17 in cli/, 12 in wiki/, 4 in pack/, 2 in search/. Each
called `store.query("SELECT ... FROM nodes WHERE ...")`. After
`IGraphStore` split graph-only (no SQL), every one of those was a
silent breakage waiting to fire when the default backend flipped.

The clean fix wasn't `s/IGraphStore/DuckDbStore/` everywhere — that
preserves the abstraction leak. It was **a 13-finder service layer**
on the interface: `listNodesByKind`, `listEdges`, `listEdgesByType`,
`listFindings`, `listDependencies`, `listRoutes`, `getRepoNode`,
`countNodesByKind`, `countEdgesByType`, `traverseAncestors`,
`traverseDescendants`, `listEmbeddings`, `listConsumerProducerEdges`,
plus 2 specialized (`listNodesByEntryPoint`, `listNodesByName`).

Each adapter (DuckDB, GraphDb, future AGE/Memgraph/Neo4j/Neptune)
internalizes the dialect. Consumers call `store.listFindings({severity:
"error"})`. The 108 sites collapse into 15 named finders. SQL strings
never leave the adapter.

## Lesson

When raw-SQL escape hatches sprawl across a codebase, the migration
target is not the "right" type pin — it's the right service-layer API.
Pattern:

1. Audit raw call sites. Group by query shape. The grouping IS the
finder set.
2. Add finders to the interface. Each finder is the SMALLEST coherent
abstraction that covers a recurring query shape.
3. Implement on every adapter. Internalize the dialect. Determinism
(ORDER BY id ASC for nodes; (from_id, to_id, type) for edges).
4. Migrate consumers one package at a time. Per-package agent + write
protocol per AC.
5. Test contract: round-trip parity via a Liskov rebuilder that uses
ONLY public methods (no raw SQL/Cypher). Any new adapter slots in.

## Why this matters

Raw SQL in consumers is a leaky abstraction that fires the day the
default backend changes. Replacing it with typed finders:

- Makes the architecture honest at compile time, not runtime.
- Lets community adapters slot in without rewriting consumers.
- The 15-finder set is a SOLID-I balance — small enough to be coherent,
large enough to cover every read pattern.
- The Liskov-clean parity harness (`rebuildFromStore` using only public
methods) means a third-party adapter proves conformance by passing
the suite. No coupling to either flagship adapter.

## Example

- `packages/storage/src/interface.ts:144-215` — 15 finder signatures.
- `packages/storage/src/duckdb-adapter.ts`, `graphdb-adapter.ts` — 13 finder
impls each, dialect internalized.
- `packages/storage/src/test-utils/parity-harness.ts` — `rebuildFromStore`
uses listNodes + listEdges only.
- `packages/storage/src/test-utils/conformance.ts` —
`assertIGraphStoreConformance(name, factory)` for community adapters.
- 108 migration sites across analysis/mcp/pack/wiki/search/cli — see
commits `efa673c` through `e4131b3` on `feat/v1-finalize-track-a`.
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
---
title: Parallel Act subagents on a shared git tree — interleaving + cherry-pick discipline
tags: [erpaval, act-phase, worktrees, subagents, parallelism, cherry-pick]
session: session-33f24f
---

## Context

Track A of v1-finalize ran 13 ACs. Most ACs spawned a dedicated Act
subagent on an isolated worktree (`isolation: worktree`). Two recurring
behaviors emerged:

1. **Worktrees that branched off `main` instead of `feat/v1-finalize-track-a`.**
Several agents reported "fast-forwarded to feat/v1-finalize-track-a
before starting" — the worktree harness defaults the new branch off
the orchestrator's CURRENT HEAD, but if the orchestrator hasn't
pushed track-a, the harness picked up `origin/main` instead. Fix:
the agent's first action is `pwd && git rev-parse --show-toplevel
&& git log --oneline -10` to verify expected commits are in the
chain. If missing, `git fetch && git merge --ff-only feat/v1-finalize-track-a`.
Document in the packet's Work log.

2. **Worktree commits landing on the parent branch directly.** Several
agents committed to the worktree's local branch but their changes
appeared on `feat/v1-finalize-track-a` because the git dir is shared
across worktrees. The orchestrator's cherry-pick became a no-op
(commit already in branch); next cherry-pick of a NEW commit worked
normally. Net effect: orchestrator must verify branch state before
AND after each agent completion, not assume cherry-pick is required.

3. **Concurrent worktrees on overlapping packages.** Two agents both
editing `packages/storage/` produced merge friction even when their
files didn't overlap because lefthook + biome lock root state. Fix:
spawn parallel agents on NON-OVERLAPPING package boundaries.
`mcp/` parallel with `storage/` is fine; `mcp/` parallel with
`analysis/` is fine; two agents on `storage/` is not.

4. **Stale dist + test reports.** `pnpm -r test` runs `node --test
./dist/**/*.test.js`. Type-only changes update `.ts` but leave
`.js` stale. After every interface-touching commit, rebuild
(`pnpm -r build`) before trusting test counts. Several agents
reported phantom failure counts that resolved on rebuild.

## Lesson

For ERPAVal Act phase with parallel subagents on a shared git tree:

1. **Each Act subagent's first action is to verify branch state.**
Document `git log --oneline -10` in the Work log. If branched off
`main` instead of the feature branch, fast-forward before editing.

2. **Spawn parallel agents on non-overlapping package boundaries.**
Worktree isolation does NOT prevent biome / lefthook root-config
conflicts. Don't spawn 2+ agents on the same package.

3. **The orchestrator's cherry-pick may be a no-op.** Verify branch
HEAD post-completion via `git log --oneline -3 HEAD`. If the agent's
reported SHA is already at HEAD, the cherry-pick is redundant — log
it and move on.

4. **Rebuild before trusting test counts after interface changes.**
`pnpm -r build && pnpm -r test`. Stale `dist/` produces phantom
failures.

5. **Watch the test-fixup tail.** When production migrates to a new
interface (e.g. typed finders), per-test FakeStore mocks need
migration too. The packet that does the production migration should
either (a) hoist a shared fake to `<pkg>/src/test-utils.ts` or
(b) explicitly defer test-fixup as a follow-on packet. Don't let
it slip silently — the rebuild surfaces 50+ failing tests at once.

## Why this matters

Track A landed 25 commits across 13 ACs in one session via parallel
subagents. The patterns above are what kept the hash-parity invariant
green per-commit and prevented two-week debug sessions on phantom
failures. Future multi-AC tracks (Track C debt sweep, Track D dogfood
polish) inherit these.

## Example

- `feat/v1-finalize-track-a` HEAD `894d477` — 25 commits, all green.
- Two agents on storage/ in parallel produced the AC-A-3 / AC-A-7
sequencing fix that landed cleanly.
- Mass mcp test-fixup (`a2718d4f4bf486a57`) was a deferred follow-on
packet because AC-A-6c's per-AC scope didn't include the 17-file
test mass migration. Right call — the deferred packet had a clean
scope and landed in one commit (`d67f115`).
- Phantom 79-failure count appeared on first AC-A-6c rebuild;
resolved on full repo `pnpm -r build`.
14 changes: 14 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -81,3 +81,17 @@ This repo ships a Claude Code plugin at `plugins/opencodehub/` — it
provides `/probe`, `/verdict`, `/owners`, `/audit-deps`, `/rename` slash
commands plus a `code-analyst` subagent and 10 skills. Install via
`codehub init` (writes `.mcp.json` + links the plugin).

## Storage backend — graph-default

`CODEHUB_STORE` is unset by default. OpenCodeHub probes
`@ladybugdb/core` and uses the graph-database backend when the binding
is available; otherwise it falls back to DuckDB with a one-shot stderr
advisory (gated on TTY or `OCH_VERBOSE=1`). Set `CODEHUB_STORE=duck` to
force the legacy layout (single DuckDB file backs both graph + temporal
views) or `CODEHUB_STORE=lbug` to require the graph-database backend.

When both `graph.duckdb` and `graph.lbug` exist as siblings in the same
`<repo>/.codehub/`, the newer-mtime file wins. See ADR 0013
(`docs/adr/0013-m7-default-flip-and-abstraction.md`) for the rationale
and the AGE/Memgraph/Neo4j/Neptune community-adapter escape hatch.
14 changes: 14 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,20 @@ provides `/probe`, `/verdict`, `/owners`, `/audit-deps`, `/rename` slash
commands plus a `code-analyst` subagent and 10 skills. Install via
`codehub init` (writes `.mcp.json` + links the plugin).

## Storage backend — graph-default

`CODEHUB_STORE` is unset by default. OpenCodeHub probes
`@ladybugdb/core` and uses the graph-database backend when the binding
is available; otherwise it falls back to DuckDB with a one-shot stderr
advisory (gated on TTY or `OCH_VERBOSE=1`). Set `CODEHUB_STORE=duck` to
force the legacy layout (single DuckDB file backs both graph + temporal
views) or `CODEHUB_STORE=lbug` to require the graph-database backend.

When both `graph.duckdb` and `graph.lbug` exist as siblings in the same
`<repo>/.codehub/`, the newer-mtime file wins. See ADR 0013
(`docs/adr/0013-m7-default-flip-and-abstraction.md`) for the rationale
and the AGE/Memgraph/Neo4j/Neptune community-adapter escape hatch.

## Parse runtime — WASM default, native opt-in

`@opencodehub/ingestion` defaults to the `web-tree-sitter` (WASM) runtime
Expand Down
24 changes: 24 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -175,6 +175,30 @@ switching mid-project requires `codehub analyze --rebuild-embeddings`.
`--offline` refuses SageMaker and HTTP backends, so offline mode is
compatible only with the local ONNX path.

## Storage backend — graph-default

Starting with v1.0, OpenCodeHub picks the graph-database backend
(`@ladybugdb/core`) as the default whenever the binding is importable on
the current platform. DuckDB is retained as the temporal store
(cochanges + symbol summaries) and as the legacy graph fallback. The
`CODEHUB_STORE` environment variable controls selection:

| `CODEHUB_STORE` | Behaviour |
|---|---|
| *unset* (default) | Probe `@ladybugdb/core`. Available → graph artifact at `<repo>/.codehub/graph.lbug` + temporal sibling `temporal.duckdb`. Missing → fall back to `<repo>/.codehub/graph.duckdb` (one-shot stderr advisory under TTY / `OCH_VERBOSE=1`). |
| `duck` | Force the legacy DuckDB-only layout. One file backs both the graph and temporal views. |
| `lbug` | Force the graph-database layout. Surface a `GraphDbBindingError` at open time if the binding is unavailable. |

Two-artifact transition: when both `graph.duckdb` AND `graph.lbug` are
present in the same `<repo>/.codehub/`, the newer-mtime file wins and a
one-shot advisory fires. Remove the stale artifact to silence the
advisory.

See [`docs/adr/0011-graph-db-backend.md`](./docs/adr/0011-graph-db-backend.md)
for the M3 phase-1 rationale and
[`docs/adr/0013-m7-default-flip-and-abstraction.md`](./docs/adr/0013-m7-default-flip-and-abstraction.md)
for the M7 default-flip + interface segregation.

## Status

**v0.1.0 — initial public release.** The codebase is feature-complete
Expand Down
Loading
Loading