Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions .erpaval/INDEX.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,10 @@ development sessions. Solutions are reusable; specs are per-feature.
- [No spec-coordinate leakage into source](solutions/best-practices/no-spec-coordinate-leakage-into-source.md) — ERPAVal `AC-*`, `M-*`, `W-*`, `CL-*` prefixes belong in commits, PR bodies, ADR refs sections — NOT in JSDoc, inline comments, CLI flag help, MCP tool descriptions, or test names. Sweep `rg -n "AC-[A-Z]-[0-9]" packages/` before every PR-open; LLM clients pick up the leakage and start citing it back.
- [release: published events need PAT or inline](solutions/conventions/release-published-event-needs-pat-or-inline.md) — release-please-action with default `GITHUB_TOKEN` does NOT fire downstream `release: [published]` workflows; inline asset-attach in `release-please.yml` gated on `steps.release.outputs.release_created`. Fixed AC-D-4; sbom.yml has same latent bug for follow-on.
- [Dogfood pre-push hook catches CLI spec drift on first push](solutions/best-practices/dogfood-prepush-hook-caught-cli-spec-mismatch.md) — the first `git push` of the commit that adds a self-targeting pre-push hook is where spec/CLI-flag mismatches and "missing index" foot-guns surface. Pattern: SKIP-with-message shape from `pack-determinism-audit.sh` for any gate that depends on a derived artifact.
- [Cherry-pick verified bug fixes from a sibling testbed clone](solutions/best-practices/cherry-pick-from-sibling-testbed.md) — when a post-filter sibling has authored fix commits with file:line repro coordinates, fetch the sibling and cherry-pick directly; preserves authorship, halves review surface, defeats re-author drift.
- [Bench dashboard ↔ acceptance script banner-text parity](solutions/architecture-patterns/bench-dashboard-acceptance-script-parity.md) — when a dashboard parses banners by exact-string match, the two artifacts must be edited together; add a roster-shape test that pulls the banner list from the script directly. Surfaced 9-of-17 gates rendering by Bug #4 in 2026-05-10 smoke campaign.
- [Test env hermeticity for backend-precedence libraries](solutions/conventions/test-env-hermeticity-for-backend-precedence.md) — when an SDK picks a backend by env presence, tests must scope-stash every key in the chain via prefix glob, not just the one they assert on. Per Bug #7 in 2026-05-10 smoke: `CODEHUB_EMBEDDING_*` chain.
- [Parallel docs subagents over-scrub ADR coordinates](solutions/best-practices/parallel-docs-subagent-overscrubs-adrs.md) — PR #74's carve-out for ADR text isn't visible in the durable lesson; brief docs subagents explicitly that `docs/adr/*` retains spec coordinates.

## Specs

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
---
name: A dashboard that parses banner-text from a script must mirror the script's banners verbatim
description: Bench/dashboard tools that index gates/jobs by exact-title match against a script's banner output drift silently when the script grows new gates — both files must be edited together
type: architecture-patterns
---

`packages/cli/src/commands/bench.ts` indexes gate rows by exact-string
match against `scripts/acceptance.sh` banners (`N/17: <title>`). When
the script grew from 9 to 17 gates and changed a few existing banner
titles ("graphHash determinism" → "determinism (double-run graphHash)"),
the dashboard didn't follow. Result: 8 gates never advance past
"pending" and post-stream get stamped "skipped — script crashed" by the
crash-fallback path; another 3 displayed under stale titles. Operators
saw 9/17 gates with confusing detail strings.

The original code shape:

```ts
export const MVP_GATES: readonly { id: string; title: string }[] = [
{ id: "install", title: "pnpm install --frozen-lockfile" },
// ... 8 more, with stale titles
];

export function applyLine(rows: GateRow[], rawLine: string): void {
const banner = /^\d+\/\d+:\s+(.*)$/.exec(line);
if (banner) {
const idx = rows.findIndex((r) => r.title === banner[1]); // exact match
if (idx >= 0) currentGateIdx = idx;
}
}
```

**Why:** the dashboard is a thin presenter over the script's stdout. Any
banner text not in `MVP_GATES` is silently dropped. There is no compile-
time signal — the build is green, the unit tests are green, only the
runtime UX degrades. The same gap also caught `[SKIP]` markers: the
original `applyLine` matched `[PASS]`/`[FAIL]` but not `[SKIP]`, so
gracefully-degrading gates rendered as "skipped — script crashed" via
the crash-fallback path with a misleading detail string.

**How to apply:**

1. **Treat banner titles as a contract** between the script and the
dashboard. Edit both files in the same commit.
2. **Add a roster-shape test.** Assert `MVP_GATES.length === 17` AND
`MVP_GATES.map(g => g.title)` matches the banner sequence the script
emits. The test pulls the banner list from the script directly with
`grep -oE '^echo "\d+/\${TOTAL_GATES}: (.+)"$' scripts/acceptance.sh`
so the assertion follows the source of truth.
3. **Match every marker the script emits.** If the script emits `[PASS]`,
`[FAIL]`, AND `[SKIP]`, the parser must handle all three. The
crash-fallback path must NOT fire for legitimate skips.
4. **Order matters when index = listr2 row.** `MVP_GATES` order must
match script execution order — the dashboard advances rows by index
as banners stream in.

Anti-pattern: a "we'll keep them in sync manually" comment without an
enforcement test. The 9-gate / 17-gate drift sat in `main` undetected
because no CI surface failed when the script grew. Surfacing it
required an operator to run `codehub bench` and notice the visual
mismatch.

Cross-link: the `dogfood-prepush-hook-caught-cli-spec-mismatch` durable
lesson covers a related pattern — the dogfood pre-push hook on this
exact PR was where this bug was first surfaced (Bug #4 in
UPSTREAM_BUGS.md, 2026-05-10 smoke).
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
---
name: Cherry-pick verified bug fixes from a sibling testbed clone
description: When a sibling/post-filter checkout has authored fix commits with file:line repro coordinates, fetch the sibling and cherry-pick directly — no need to re-author or re-test on upstream
type: best-practices
---

When you maintain a "post-filter testbed" sibling repo for smoke / dogfood
campaigns and you've already authored fix commits there with verified
repros, do not re-write the fixes on upstream. Fetch the sibling as a
local remote and cherry-pick.

**Why:** The fix has already been authored, repro'd, verified. Re-authoring
on upstream loses authorship metadata, doubles review surface, and
introduces drift between what was fixed and what landed. Re-testing
re-validates the same green path. The cherry-pick is provably equivalent
when the file:line coordinates in the fix message match upstream HEAD.

**How to apply:**

1. **Verify file:line parity first.** Each fix in the testbed report
should cite file paths and line numbers; quickly grep upstream to
confirm the same lines exist there. Per Bug #2 in OCH 2026-05-10
campaign: `packages/cli/src/commands/scan.ts:162-171` was identical in
testbed and upstream — direct cherry-pick worked.
2. **Fetch the sibling as a path remote.** No need to register it
permanently. One-shot:
```bash
git fetch /efs/lalsaado/workplace/opencodehub.post-filter --no-tags
```
`FETCH_HEAD` now points at the sibling's HEAD; commits referenced by
short-hash become resolvable.
3. **Cherry-pick in severity order.** HIGH first, MEDIUM next, LOW last.
Each pick is one commit; do not squash them into a "umbrella fix"
commit — preserves blame and lets the PR reviewer see one
self-contained fix per scope.
4. **Re-verify after each pick** with the package-scoped check:
`pnpm -F @opencodehub/<pkg> test` plus any smoke script the fix
targets (`bash scripts/smoke-mcp.sh`, `node ... doctor`, etc.).
5. **Prefer one PR for the bundle** when the fixes are small and
thematically related (a "v1 upstream bug sweep") — reviewer context
stays coherent. Split only if the bundle exceeds reviewability.

Anti-pattern: re-authoring the fix on upstream and citing the testbed
commit in the body. That loses the original commit's authorship and
makes blame point at the re-author for code that was thought-through
elsewhere. If you need to adapt the fix to upstream divergence, do that
as a follow-up commit on top of the cherry-pick, not a rewrite.

Related: this pairs naturally with the durable lesson "Squash-merge
masks pre-existing repo-wide debt" — run `mise run check` on upstream
BEFORE the cherry-pick to baseline-clean, so any test regression after
the pick is unambiguously attributable to the picked fix.
Original file line number Diff line number Diff line change
Expand Up @@ -48,11 +48,19 @@ LLM clients then pick up and start citing back; the leakage compounds.
("graphHash parity: medium-with-empty-keywords ([] vs absent)") over
the AC ("AC-C-2: graphHash..."). The behavior survives renames; AC
numbers don't.
- **Sweep before commit.** Run `rg -n "AC-[A-Z]-[0-9]" packages/`
- **Sweep before commit.** Run `rg -n "AC-[A-Z]-[0-9]" packages/ scripts/`
against your branch before PR-open. Anything that hits is a
candidate for rephrase. If the comment NEEDS to cite the AC, use a
short reference at the end like "(AC-C-5)" rather than leading with
it.
- **Sweep scope is `packages/` and `scripts/`, NOT `docs/adr/*`.** PR #74
(`f09d804`) carved out `docs/adr/*` as the explicit place where
coordinates ARE permanent decision rationale. A docs-refresh subagent
that sees the sweep regex without the scope qualifier will scrub
ADRs by default — DO NOT. Brief docs subagents explicitly that ADR
text retains coordinates. See the
`parallel-docs-subagent-overscrubs-adrs.md` lesson for the failure
mode.
- **The test fakes are the trap.** When a Wave subagent edits a test
fake, it tends to add `// AC-XXX: stubs ...` because it's writing
the comment WITH the AC packet open in front of it. Sweep test files
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
---
name: Parallel docs-refresh subagents must be told that ADR text is the carve-out where spec coordinates ARE allowed
description: When a docs-refresh subagent inherits the "no spec-coordinate leakage" rule from durable lessons, it will scrub ADR text by default — but PR #74 carved out docs/adr/* as the place where coordinates ARE the durable rationale; brief explicitly
type: best-practices
---

OCH PR #74 (`f09d804 chore(repo): scrub ERPAVal spec coordinates from
source`) explicitly retained spec coordinates in `docs/adr/*` as
"permanent decision rationale". The durable lesson
`no-spec-coordinate-leakage-into-source.md` documents the scrub but
does NOT crisply state the carve-out. When a parallel docs-refresh
subagent reads the durable lesson and is told "no spec-coordinate
leakage", it scrubs ADRs too — undoing PR #74's deliberate carve-out.

Observed in OCH session 6c091d (2026-05-10 v1 upstream bug sweep): the
docs-refresh subagent stripped `AC-A-1`, `AC-A-2`, `AC-A-6 a/b/c/d`,
`AC-A-7`, `AC-A-9`, `AC-A-11` from ADR 0013-m7 and `AC-C-3`, `AC-C-5`,
`E-C-3`, `W-A-2` from ADR 0014. Required a follow-up
`docs(docs): restore ADR-permanent spec coordinates per PR #74 policy`
commit on the same branch.

**Why:** the durable lesson's scope says "production source, JSDoc,
inline comments, CLI flag help, MCP tool option descriptions, test
names" — but the ADR carve-out lives only in PR #74's body. Subagents
read the lesson, not the PR archive. The carve-out is invisible to a
fresh agent.

**How to apply:**

1. **Brief docs subagents explicitly.** When seeding a docs-refresh
subagent prompt, include both rules:
- "No spec-coordinate prefixes in production source (per durable
lesson)."
- "ADR text is the carve-out: spec coordinates in `docs/adr/*` are
intentional permanent rationale per PR #74. Do NOT scrub them
there."
2. **Update the lesson itself.** Edit
`solutions/best-practices/no-spec-coordinate-leakage-into-source.md`
to add a "Scope" section that names `docs/adr/*` as the carve-out,
so future subagents reading the lesson see the constraint without
needing PR archaeology.
3. **Sweep with a scope-aware regex.** When auditing leakage, exclude
`docs/adr/*` from the sweep:
`rg -n 'AC-[A-Z]-[0-9]' packages/ scripts/`
not
`rg -n 'AC-[A-Z]-[0-9]'` (which would falsely flag ADRs).
4. **The reverse case is also valid.** `docs/adr/0014-*` originally
listed `.erpaval/specs/...` and `.erpaval/sessions/...` as
References — those paths are gitignored and rot once the packet
graduates. Replacing them with code-path citations IS correct, even
in ADR text. The carve-out is for spec-coordinate prefixes, not for
pointers to gitignored paths.

Anti-pattern: writing a generic "scrub spec coords everywhere" rule and
then surprised when ADR rationale gets vacuumed. The leakage rule
exists to prevent rot; ADR rationale doesn't rot because the ADR is
the rationale.

Cross-link:
[no-spec-coordinate-leakage-into-source](no-spec-coordinate-leakage-into-source.md) — the original rule.
PR #74 (`f09d804`) — the carve-out's authoritative source.
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
---
name: Tests for backend-precedence libraries must wipe all env keys in the precedence chain, not just the one they assert
description: When an SDK picks a backend by env presence (CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT, CODEHUB_EMBEDDING_URL, ...), tests of "backend X is picked when only X's env is set" must scope-stash every key in the chain, not only the local one
type: conventions
---

`packages/embedder/src/http-embedder.test.ts:441,458` asserted that
`tryOpenHttpEmbedder` returns `null` when its specific env var is unset.
The test only stashed `CODEHUB_HOME`. With
`CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT` exported in the operator's shell,
the higher-precedence SageMaker backend short-circuited, the assertion
flipped, and the test failed — but only on the specific dev box where
the operator was working through SageMaker integration.

The fix: a `sanitizeEmbeddingEnv()` helper that snapshots and wipes
every `CODEHUB_EMBEDDING_*` key plus `CODEHUB_HOME`, restored on
teardown via `beforeEach`/`afterEach`:

```ts
function sanitizeEmbeddingEnv() {
const saved: Record<string, string | undefined> = {};
for (const k of Object.keys(process.env)) {
if (k.startsWith("CODEHUB_EMBEDDING_") || k === "CODEHUB_HOME") {
saved[k] = process.env[k];
delete process.env[k];
}
}
return () => {
for (const [k, v] of Object.entries(saved)) {
if (v === undefined) delete process.env[k]; else process.env[k] = v;
}
};
}
```

**Why:** the backend-precedence pattern is a chain — env-X-set → backend-X,
else env-Y-set → backend-Y, else fallback. A test that asserts about
backend Y must explicitly clear backend-X's env, otherwise the assertion
silently tests the wrong code path under any operator who happens to
have backend-X configured. The failure is non-reproducible on a clean
laptop, fires on a dev box with the higher-precedence env exported.
This is exactly the env-leak class that bedevils CI-vs-local divergence
debugging.

**How to apply:**

1. **Identify the precedence chain.** For OCH embedder:
`CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT` → `CODEHUB_EMBEDDING_URL` →
`CODEHUB_EMBEDDING_*` (HTTP options) → `CODEHUB_HOME` (local ONNX).
Any test that asserts about backend selection must wipe the entire
chain, not just one key.
2. **Stash with a prefix glob, not a fixed key list.** `Object.keys`
filtered by `startsWith("CODEHUB_EMBEDDING_")` catches keys added
later (e.g. a future `CODEHUB_EMBEDDING_AZURE_*`) without revisiting
every test.
3. **Wire it as `beforeEach`/`afterEach`, not per-case try/finally.**
Easier to audit; harder to forget on the next case.
4. **Apply defensively to sibling describe blocks.** Even cases that
don't care about the env can be poisoned by stale state from a prior
test that mutated `process.env`. Hermetic test suites don't pay a
cost for being defensive.

Anti-pattern: per-case `originalKey = process.env[KEY]; ... finally
process.env[KEY] = originalKey` for a single key. The single-key save
worked when there was one env var; with a chain, every test that misses
a sibling key in the chain becomes flaky on operator boxes.

Cross-link: pairs with the existing `sagemaker-embedder-backend.md`
durable lesson — that one covers the SDK-side dynamic-import + soft-fail
pattern; this one covers the test-side env-hermeticity pattern that
that pattern requires.
31 changes: 31 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -90,3 +90,34 @@ jobs:
--onlyAllow 'Apache-2.0;MIT;BSD-2-Clause;BSD-3-Clause;ISC;CC0-1.0;BlueOak-1.0.0;0BSD'
--excludePrivatePackages
--production

# GitHub code-scanning advanced-setup is configured to look for an `osv`
# job in `ci.yml`. The standalone `osv.yml` workflow does the same scan,
# but the configured pointer lives here, so we mirror it: install
# osv-scanner, emit SARIF, upload to code-scanning, then fail the run on
# vulnerabilities. The standalone workflow remains for the weekly cron.
osv:
runs-on: ubuntu-latest
permissions:
contents: read
security-events: write
steps:
- uses: actions/checkout@v6
- name: Install osv-scanner
run: |
curl -sL -o /tmp/osv-scanner \
https://github.com/google/osv-scanner/releases/download/v2.3.8/osv-scanner_linux_amd64
chmod +x /tmp/osv-scanner
- name: Scan pnpm-lock.yaml (SARIF output)
run: |
/tmp/osv-scanner scan source \
--lockfile=pnpm-lock.yaml \
--format=sarif \
--output=osv.sarif || true
- uses: github/codeql-action/upload-sarif@v4
if: always()
with:
sarif_file: osv.sarif
category: osv-scanner
- name: Fail on vulnerabilities
run: /tmp/osv-scanner scan source --lockfile=pnpm-lock.yaml
10 changes: 6 additions & 4 deletions .github/workflows/osv.yml
Original file line number Diff line number Diff line change
@@ -1,10 +1,12 @@
name: OSV-Scanner

# The push/pull_request runs were moved to `ci.yml`'s `osv` job so the
# GitHub code-scanning advanced-setup configuration (which is pinned to
# `ci.yml:osv`) finds them. This standalone workflow keeps the weekly
# cron + manual dispatch path so OSV advisory-data updates are picked
# up between PRs.
on:
push:
branches: [main]
pull_request:
branches: [main]
workflow_dispatch:
schedule:
- cron: "33 5 * * 2"

Expand Down
Loading