diff --git a/.erpaval/INDEX.md b/.erpaval/INDEX.md
index dac938fd..a98e4817 100644
--- a/.erpaval/INDEX.md
+++ b/.erpaval/INDEX.md
@@ -34,6 +34,10 @@ development sessions. Solutions are reusable; specs are per-feature.
- [No spec-coordinate leakage into source](solutions/best-practices/no-spec-coordinate-leakage-into-source.md) — ERPAVal `AC-*`, `M-*`, `W-*`, `CL-*` prefixes belong in commits, PR bodies, ADR refs sections — NOT in JSDoc, inline comments, CLI flag help, MCP tool descriptions, or test names. Sweep `rg -n "AC-[A-Z]-[0-9]" packages/` before every PR-open; LLM clients pick up the leakage and start citing it back.
- [release: published events need PAT or inline](solutions/conventions/release-published-event-needs-pat-or-inline.md) — release-please-action with default `GITHUB_TOKEN` does NOT fire downstream `release: [published]` workflows; inline asset-attach in `release-please.yml` gated on `steps.release.outputs.release_created`. Fixed AC-D-4; sbom.yml has same latent bug for follow-on.
- [Dogfood pre-push hook catches CLI spec drift on first push](solutions/best-practices/dogfood-prepush-hook-caught-cli-spec-mismatch.md) — the first `git push` of the commit that adds a self-targeting pre-push hook is where spec/CLI-flag mismatches and "missing index" foot-guns surface. Pattern: SKIP-with-message shape from `pack-determinism-audit.sh` for any gate that depends on a derived artifact.
+- [Cherry-pick verified bug fixes from a sibling testbed clone](solutions/best-practices/cherry-pick-from-sibling-testbed.md) — when a post-filter sibling has authored fix commits with file:line repro coordinates, fetch the sibling and cherry-pick directly; preserves authorship, halves review surface, defeats re-author drift.
+- [Bench dashboard ↔ acceptance script banner-text parity](solutions/architecture-patterns/bench-dashboard-acceptance-script-parity.md) — when a dashboard parses banners by exact-string match, the two artifacts must be edited together; add a roster-shape test that pulls the banner list from the script directly. Surfaced 9-of-17 gates rendering by Bug #4 in 2026-05-10 smoke campaign.
+- [Test env hermeticity for backend-precedence libraries](solutions/conventions/test-env-hermeticity-for-backend-precedence.md) — when an SDK picks a backend by env presence, tests must scope-stash every key in the chain via prefix glob, not just the one they assert on. Per Bug #7 in 2026-05-10 smoke: `CODEHUB_EMBEDDING_*` chain.
+- [Parallel docs subagents over-scrub ADR coordinates](solutions/best-practices/parallel-docs-subagent-overscrubs-adrs.md) — PR #74's carve-out for ADR text isn't visible in the durable lesson; brief docs subagents explicitly that `docs/adr/*` retains spec coordinates.
## Specs
diff --git a/.erpaval/solutions/architecture-patterns/bench-dashboard-acceptance-script-parity.md b/.erpaval/solutions/architecture-patterns/bench-dashboard-acceptance-script-parity.md
new file mode 100644
index 00000000..caac01ff
--- /dev/null
+++ b/.erpaval/solutions/architecture-patterns/bench-dashboard-acceptance-script-parity.md
@@ -0,0 +1,66 @@
+---
+name: A dashboard that parses banner-text from a script must mirror the script's banners verbatim
+description: Bench/dashboard tools that index gates/jobs by exact-title match against a script's banner output drift silently when the script grows new gates — both files must be edited together
+type: architecture-patterns
+---
+
+`packages/cli/src/commands/bench.ts` indexes gate rows by exact-string
+match against `scripts/acceptance.sh` banners (`N/17:
`). When
+the script grew from 9 to 17 gates and changed a few existing banner
+titles ("graphHash determinism" → "determinism (double-run graphHash)"),
+the dashboard didn't follow. Result: 8 gates never advance past
+"pending" and post-stream get stamped "skipped — script crashed" by the
+crash-fallback path; another 3 displayed under stale titles. Operators
+saw 9/17 gates with confusing detail strings.
+
+The original code shape:
+
+```ts
+export const MVP_GATES: readonly { id: string; title: string }[] = [
+ { id: "install", title: "pnpm install --frozen-lockfile" },
+ // ... 8 more, with stale titles
+];
+
+export function applyLine(rows: GateRow[], rawLine: string): void {
+ const banner = /^\d+\/\d+:\s+(.*)$/.exec(line);
+ if (banner) {
+ const idx = rows.findIndex((r) => r.title === banner[1]); // exact match
+ if (idx >= 0) currentGateIdx = idx;
+ }
+}
+```
+
+**Why:** the dashboard is a thin presenter over the script's stdout. Any
+banner text not in `MVP_GATES` is silently dropped. There is no compile-
+time signal — the build is green, the unit tests are green, only the
+runtime UX degrades. The same gap also caught `[SKIP]` markers: the
+original `applyLine` matched `[PASS]`/`[FAIL]` but not `[SKIP]`, so
+gracefully-degrading gates rendered as "skipped — script crashed" via
+the crash-fallback path with a misleading detail string.
+
+**How to apply:**
+
+1. **Treat banner titles as a contract** between the script and the
+ dashboard. Edit both files in the same commit.
+2. **Add a roster-shape test.** Assert `MVP_GATES.length === 17` AND
+ `MVP_GATES.map(g => g.title)` matches the banner sequence the script
+ emits. The test pulls the banner list from the script directly with
+ `grep -oE '^echo "\d+/\${TOTAL_GATES}: (.+)"$' scripts/acceptance.sh`
+ so the assertion follows the source of truth.
+3. **Match every marker the script emits.** If the script emits `[PASS]`,
+ `[FAIL]`, AND `[SKIP]`, the parser must handle all three. The
+ crash-fallback path must NOT fire for legitimate skips.
+4. **Order matters when index = listr2 row.** `MVP_GATES` order must
+ match script execution order — the dashboard advances rows by index
+ as banners stream in.
+
+Anti-pattern: a "we'll keep them in sync manually" comment without an
+enforcement test. The 9-gate / 17-gate drift sat in `main` undetected
+because no CI surface failed when the script grew. Surfacing it
+required an operator to run `codehub bench` and notice the visual
+mismatch.
+
+Cross-link: the `dogfood-prepush-hook-caught-cli-spec-mismatch` durable
+lesson covers a related pattern — the dogfood pre-push hook on this
+exact PR was where this bug was first surfaced (Bug #4 in
+UPSTREAM_BUGS.md, 2026-05-10 smoke).
diff --git a/.erpaval/solutions/best-practices/cherry-pick-from-sibling-testbed.md b/.erpaval/solutions/best-practices/cherry-pick-from-sibling-testbed.md
new file mode 100644
index 00000000..afd63246
--- /dev/null
+++ b/.erpaval/solutions/best-practices/cherry-pick-from-sibling-testbed.md
@@ -0,0 +1,52 @@
+---
+name: Cherry-pick verified bug fixes from a sibling testbed clone
+description: When a sibling/post-filter checkout has authored fix commits with file:line repro coordinates, fetch the sibling and cherry-pick directly — no need to re-author or re-test on upstream
+type: best-practices
+---
+
+When you maintain a "post-filter testbed" sibling repo for smoke / dogfood
+campaigns and you've already authored fix commits there with verified
+repros, do not re-write the fixes on upstream. Fetch the sibling as a
+local remote and cherry-pick.
+
+**Why:** The fix has already been authored, repro'd, verified. Re-authoring
+on upstream loses authorship metadata, doubles review surface, and
+introduces drift between what was fixed and what landed. Re-testing
+re-validates the same green path. The cherry-pick is provably equivalent
+when the file:line coordinates in the fix message match upstream HEAD.
+
+**How to apply:**
+
+1. **Verify file:line parity first.** Each fix in the testbed report
+ should cite file paths and line numbers; quickly grep upstream to
+ confirm the same lines exist there. Per Bug #2 in OCH 2026-05-10
+ campaign: `packages/cli/src/commands/scan.ts:162-171` was identical in
+ testbed and upstream — direct cherry-pick worked.
+2. **Fetch the sibling as a path remote.** No need to register it
+ permanently. One-shot:
+ ```bash
+ git fetch /efs/lalsaado/workplace/opencodehub.post-filter --no-tags
+ ```
+ `FETCH_HEAD` now points at the sibling's HEAD; commits referenced by
+ short-hash become resolvable.
+3. **Cherry-pick in severity order.** HIGH first, MEDIUM next, LOW last.
+ Each pick is one commit; do not squash them into a "umbrella fix"
+ commit — preserves blame and lets the PR reviewer see one
+ self-contained fix per scope.
+4. **Re-verify after each pick** with the package-scoped check:
+ `pnpm -F @opencodehub/ test` plus any smoke script the fix
+ targets (`bash scripts/smoke-mcp.sh`, `node ... doctor`, etc.).
+5. **Prefer one PR for the bundle** when the fixes are small and
+ thematically related (a "v1 upstream bug sweep") — reviewer context
+ stays coherent. Split only if the bundle exceeds reviewability.
+
+Anti-pattern: re-authoring the fix on upstream and citing the testbed
+commit in the body. That loses the original commit's authorship and
+makes blame point at the re-author for code that was thought-through
+elsewhere. If you need to adapt the fix to upstream divergence, do that
+as a follow-up commit on top of the cherry-pick, not a rewrite.
+
+Related: this pairs naturally with the durable lesson "Squash-merge
+masks pre-existing repo-wide debt" — run `mise run check` on upstream
+BEFORE the cherry-pick to baseline-clean, so any test regression after
+the pick is unambiguously attributable to the picked fix.
diff --git a/.erpaval/solutions/best-practices/no-spec-coordinate-leakage-into-source.md b/.erpaval/solutions/best-practices/no-spec-coordinate-leakage-into-source.md
index d60b5c7a..d5bf17be 100644
--- a/.erpaval/solutions/best-practices/no-spec-coordinate-leakage-into-source.md
+++ b/.erpaval/solutions/best-practices/no-spec-coordinate-leakage-into-source.md
@@ -48,11 +48,19 @@ LLM clients then pick up and start citing back; the leakage compounds.
("graphHash parity: medium-with-empty-keywords ([] vs absent)") over
the AC ("AC-C-2: graphHash..."). The behavior survives renames; AC
numbers don't.
-- **Sweep before commit.** Run `rg -n "AC-[A-Z]-[0-9]" packages/`
+- **Sweep before commit.** Run `rg -n "AC-[A-Z]-[0-9]" packages/ scripts/`
against your branch before PR-open. Anything that hits is a
candidate for rephrase. If the comment NEEDS to cite the AC, use a
short reference at the end like "(AC-C-5)" rather than leading with
it.
+- **Sweep scope is `packages/` and `scripts/`, NOT `docs/adr/*`.** PR #74
+ (`f09d804`) carved out `docs/adr/*` as the explicit place where
+ coordinates ARE permanent decision rationale. A docs-refresh subagent
+ that sees the sweep regex without the scope qualifier will scrub
+ ADRs by default — DO NOT. Brief docs subagents explicitly that ADR
+ text retains coordinates. See the
+ `parallel-docs-subagent-overscrubs-adrs.md` lesson for the failure
+ mode.
- **The test fakes are the trap.** When a Wave subagent edits a test
fake, it tends to add `// AC-XXX: stubs ...` because it's writing
the comment WITH the AC packet open in front of it. Sweep test files
diff --git a/.erpaval/solutions/best-practices/parallel-docs-subagent-overscrubs-adrs.md b/.erpaval/solutions/best-practices/parallel-docs-subagent-overscrubs-adrs.md
new file mode 100644
index 00000000..7cd307f6
--- /dev/null
+++ b/.erpaval/solutions/best-practices/parallel-docs-subagent-overscrubs-adrs.md
@@ -0,0 +1,61 @@
+---
+name: Parallel docs-refresh subagents must be told that ADR text is the carve-out where spec coordinates ARE allowed
+description: When a docs-refresh subagent inherits the "no spec-coordinate leakage" rule from durable lessons, it will scrub ADR text by default — but PR #74 carved out docs/adr/* as the place where coordinates ARE the durable rationale; brief explicitly
+type: best-practices
+---
+
+OCH PR #74 (`f09d804 chore(repo): scrub ERPAVal spec coordinates from
+source`) explicitly retained spec coordinates in `docs/adr/*` as
+"permanent decision rationale". The durable lesson
+`no-spec-coordinate-leakage-into-source.md` documents the scrub but
+does NOT crisply state the carve-out. When a parallel docs-refresh
+subagent reads the durable lesson and is told "no spec-coordinate
+leakage", it scrubs ADRs too — undoing PR #74's deliberate carve-out.
+
+Observed in OCH session 6c091d (2026-05-10 v1 upstream bug sweep): the
+docs-refresh subagent stripped `AC-A-1`, `AC-A-2`, `AC-A-6 a/b/c/d`,
+`AC-A-7`, `AC-A-9`, `AC-A-11` from ADR 0013-m7 and `AC-C-3`, `AC-C-5`,
+`E-C-3`, `W-A-2` from ADR 0014. Required a follow-up
+`docs(docs): restore ADR-permanent spec coordinates per PR #74 policy`
+commit on the same branch.
+
+**Why:** the durable lesson's scope says "production source, JSDoc,
+inline comments, CLI flag help, MCP tool option descriptions, test
+names" — but the ADR carve-out lives only in PR #74's body. Subagents
+read the lesson, not the PR archive. The carve-out is invisible to a
+fresh agent.
+
+**How to apply:**
+
+1. **Brief docs subagents explicitly.** When seeding a docs-refresh
+ subagent prompt, include both rules:
+ - "No spec-coordinate prefixes in production source (per durable
+ lesson)."
+ - "ADR text is the carve-out: spec coordinates in `docs/adr/*` are
+ intentional permanent rationale per PR #74. Do NOT scrub them
+ there."
+2. **Update the lesson itself.** Edit
+ `solutions/best-practices/no-spec-coordinate-leakage-into-source.md`
+ to add a "Scope" section that names `docs/adr/*` as the carve-out,
+ so future subagents reading the lesson see the constraint without
+ needing PR archaeology.
+3. **Sweep with a scope-aware regex.** When auditing leakage, exclude
+ `docs/adr/*` from the sweep:
+ `rg -n 'AC-[A-Z]-[0-9]' packages/ scripts/`
+ not
+ `rg -n 'AC-[A-Z]-[0-9]'` (which would falsely flag ADRs).
+4. **The reverse case is also valid.** `docs/adr/0014-*` originally
+ listed `.erpaval/specs/...` and `.erpaval/sessions/...` as
+ References — those paths are gitignored and rot once the packet
+ graduates. Replacing them with code-path citations IS correct, even
+ in ADR text. The carve-out is for spec-coordinate prefixes, not for
+ pointers to gitignored paths.
+
+Anti-pattern: writing a generic "scrub spec coords everywhere" rule and
+then surprised when ADR rationale gets vacuumed. The leakage rule
+exists to prevent rot; ADR rationale doesn't rot because the ADR is
+the rationale.
+
+Cross-link:
+[no-spec-coordinate-leakage-into-source](no-spec-coordinate-leakage-into-source.md) — the original rule.
+PR #74 (`f09d804`) — the carve-out's authoritative source.
diff --git a/.erpaval/solutions/conventions/test-env-hermeticity-for-backend-precedence.md b/.erpaval/solutions/conventions/test-env-hermeticity-for-backend-precedence.md
new file mode 100644
index 00000000..38e172e9
--- /dev/null
+++ b/.erpaval/solutions/conventions/test-env-hermeticity-for-backend-precedence.md
@@ -0,0 +1,71 @@
+---
+name: Tests for backend-precedence libraries must wipe all env keys in the precedence chain, not just the one they assert
+description: When an SDK picks a backend by env presence (CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT, CODEHUB_EMBEDDING_URL, ...), tests of "backend X is picked when only X's env is set" must scope-stash every key in the chain, not only the local one
+type: conventions
+---
+
+`packages/embedder/src/http-embedder.test.ts:441,458` asserted that
+`tryOpenHttpEmbedder` returns `null` when its specific env var is unset.
+The test only stashed `CODEHUB_HOME`. With
+`CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT` exported in the operator's shell,
+the higher-precedence SageMaker backend short-circuited, the assertion
+flipped, and the test failed — but only on the specific dev box where
+the operator was working through SageMaker integration.
+
+The fix: a `sanitizeEmbeddingEnv()` helper that snapshots and wipes
+every `CODEHUB_EMBEDDING_*` key plus `CODEHUB_HOME`, restored on
+teardown via `beforeEach`/`afterEach`:
+
+```ts
+function sanitizeEmbeddingEnv() {
+ const saved: Record = {};
+ for (const k of Object.keys(process.env)) {
+ if (k.startsWith("CODEHUB_EMBEDDING_") || k === "CODEHUB_HOME") {
+ saved[k] = process.env[k];
+ delete process.env[k];
+ }
+ }
+ return () => {
+ for (const [k, v] of Object.entries(saved)) {
+ if (v === undefined) delete process.env[k]; else process.env[k] = v;
+ }
+ };
+}
+```
+
+**Why:** the backend-precedence pattern is a chain — env-X-set → backend-X,
+else env-Y-set → backend-Y, else fallback. A test that asserts about
+backend Y must explicitly clear backend-X's env, otherwise the assertion
+silently tests the wrong code path under any operator who happens to
+have backend-X configured. The failure is non-reproducible on a clean
+laptop, fires on a dev box with the higher-precedence env exported.
+This is exactly the env-leak class that bedevils CI-vs-local divergence
+debugging.
+
+**How to apply:**
+
+1. **Identify the precedence chain.** For OCH embedder:
+ `CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT` → `CODEHUB_EMBEDDING_URL` →
+ `CODEHUB_EMBEDDING_*` (HTTP options) → `CODEHUB_HOME` (local ONNX).
+ Any test that asserts about backend selection must wipe the entire
+ chain, not just one key.
+2. **Stash with a prefix glob, not a fixed key list.** `Object.keys`
+ filtered by `startsWith("CODEHUB_EMBEDDING_")` catches keys added
+ later (e.g. a future `CODEHUB_EMBEDDING_AZURE_*`) without revisiting
+ every test.
+3. **Wire it as `beforeEach`/`afterEach`, not per-case try/finally.**
+ Easier to audit; harder to forget on the next case.
+4. **Apply defensively to sibling describe blocks.** Even cases that
+ don't care about the env can be poisoned by stale state from a prior
+ test that mutated `process.env`. Hermetic test suites don't pay a
+ cost for being defensive.
+
+Anti-pattern: per-case `originalKey = process.env[KEY]; ... finally
+process.env[KEY] = originalKey` for a single key. The single-key save
+worked when there was one env var; with a chain, every test that misses
+a sibling key in the chain becomes flaky on operator boxes.
+
+Cross-link: pairs with the existing `sagemaker-embedder-backend.md`
+durable lesson — that one covers the SDK-side dynamic-import + soft-fail
+pattern; this one covers the test-side env-hermeticity pattern that
+that pattern requires.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 7e833474..0a32504e 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -90,3 +90,34 @@ jobs:
--onlyAllow 'Apache-2.0;MIT;BSD-2-Clause;BSD-3-Clause;ISC;CC0-1.0;BlueOak-1.0.0;0BSD'
--excludePrivatePackages
--production
+
+ # GitHub code-scanning advanced-setup is configured to look for an `osv`
+ # job in `ci.yml`. The standalone `osv.yml` workflow does the same scan,
+ # but the configured pointer lives here, so we mirror it: install
+ # osv-scanner, emit SARIF, upload to code-scanning, then fail the run on
+ # vulnerabilities. The standalone workflow remains for the weekly cron.
+ osv:
+ runs-on: ubuntu-latest
+ permissions:
+ contents: read
+ security-events: write
+ steps:
+ - uses: actions/checkout@v6
+ - name: Install osv-scanner
+ run: |
+ curl -sL -o /tmp/osv-scanner \
+ https://github.com/google/osv-scanner/releases/download/v2.3.8/osv-scanner_linux_amd64
+ chmod +x /tmp/osv-scanner
+ - name: Scan pnpm-lock.yaml (SARIF output)
+ run: |
+ /tmp/osv-scanner scan source \
+ --lockfile=pnpm-lock.yaml \
+ --format=sarif \
+ --output=osv.sarif || true
+ - uses: github/codeql-action/upload-sarif@v4
+ if: always()
+ with:
+ sarif_file: osv.sarif
+ category: osv-scanner
+ - name: Fail on vulnerabilities
+ run: /tmp/osv-scanner scan source --lockfile=pnpm-lock.yaml
diff --git a/.github/workflows/osv.yml b/.github/workflows/osv.yml
index d58d10f9..dc7195d3 100644
--- a/.github/workflows/osv.yml
+++ b/.github/workflows/osv.yml
@@ -1,10 +1,12 @@
name: OSV-Scanner
+# The push/pull_request runs were moved to `ci.yml`'s `osv` job so the
+# GitHub code-scanning advanced-setup configuration (which is pinned to
+# `ci.yml:osv`) finds them. This standalone workflow keeps the weekly
+# cron + manual dispatch path so OSV advisory-data updates are picked
+# up between PRs.
on:
- push:
- branches: [main]
- pull_request:
- branches: [main]
+ workflow_dispatch:
schedule:
- cron: "33 5 * * 2"