chore: v1 upstream bug sweep + docs refresh#76

theagenticguy merged 10 commits into main from chore/v1-upstream-bug-sweep on May 10, 2026

@theagenticguy
Owner

Summary

V1-launch readiness sweep: cherry-picks three known-good upstream bug fixes from the post-filter testbed, closes two residual smoke gaps, and deeply refreshes the v1 docs against current reality.

Bug fixes (5 of 7 from UPSTREAM_BUGS.md)

| Severity | Bug | Fix |
|---|---|---|
| HIGH (data corruption) | #2: `codehub scan <path>` ingested SARIF into operator's CWD instead of the scanned repo | `c43c5aa fix(cli): scan ingests SARIF into the scanned repo, not CWD` |
| HIGH (CI gate) | #3: `scripts/smoke-mcp.sh` asserted EXPECTED_TOOLS=19; server registers 29 | `433f684 fix(repo): smoke-mcp asserts 29 tools, matching the v1.0 server` |
| HIGH (CI dashboard) | #4: `codehub bench` surfaced 9 of 17 acceptance gates (some titles also stale) | `c5f9047 fix(cli): bench dashboard surfaces all 17 acceptance gates` |
| MEDIUM | #1 + #6: `codehub doctor` false-WARN on tree-sitter / @duckdb / @LadybugDB under pnpm strict isolation; `duckdb close()` undefined on `@duckdb/node-api@1.x` | `c218c31 fix(cli): doctor resolves native bindings from owner workspaces` |
| LOW (test hygiene) | #7: `http-embedder.test.ts` cases failed when `CODEHUB_EMBEDDING_*` env was set in operator's shell | `317bdf1 fix(embedder): isolate http-embedder tests from operator env` |

Bug #5 (testbed-only pytest-timeout) does not apply upstream. Bug fixes #1+#6, #2, #3 are direct cherry-picks of def988b, 6924b1b, ec66d4a from the post-filter sibling — every changed file:line coordinate verified to match upstream HEAD before pick.

Spec-coordinate hygiene

  • fad766f — scrub AC-A-7 / AC-A-10 from scripts/m7-parity-audit.sh header (per the durable lesson; scripts are not ADRs).
  • e186aea — restore ADR-permanent spec coordinates in docs/adr/0013-m7-default-flip-and-abstraction.md and docs/adr/0014-scip-references-and-embedder-fingerprint.md after an earlier docs-sweep commit over-scrubbed them. Per the carve-out in PR #74 (chore(repo): scrub ERPAVal spec coordinates from source), ADR text is the explicit place where coordinates ARE allowed.

Final sweep: rg -n 'AC-[A-Z]-[0-9]' packages/ scripts/ returns zero hits.

Docs refresh

  • 898192e — README: status flipped from "v0.1.0 initial public release" to "v1 — feature-complete on M1–M7" (the prerelease caveat stays since package.json is still 0.1.x); 28 → 29 MCP tools across the mermaid diagram, table heading, and mcp-package row; new "Parse runtime — WASM default" section cross-linking ADR 0013-parse-runtime-wasm-default.md; Repository Layout regenerated against ls packages/ (now 17 packages — adds cobol-proleap, frameworks, pack, policy, wiki; drops eval and gym with a sibling-testbed note); 14 → 15 GA languages (COBOL via regex provider); requirements bumped to Node 22-or-24; tool table expanded to enumerate the cross-repo federation tools and pack_codebase.
  • 69eac8f — ADR 0011 Proposed → Accepted; ADR 0013-m7 Proposed → Accepted; sibling-ADR cross-link banner on the duplicate-0013 collision (0013-parse-runtime-wasm-default.md and 0013-m7-default-flip-and-abstraction.md both landed concurrently); ADR 0014 References block swapped from .erpaval/specs/... (gitignored, will rot once packet graduates) to durable code-path citations.
  • edb362e — CHANGELOG [Unreleased] entry summarizing this PR; AGENTS.md 28 → 29 tools and a divergence banner where it intentionally drops session-local coordinates that CLAUDE.md still carries; OBJECTIVES.md tool count + language count + sibling-testbed note.

Validation

  • pnpm install --frozen-lockfile ✅
  • mise run check (lint + typecheck + test + banned-strings + verdict) ✅
  • pnpm -F @opencodehub/cli test: 236/236 pass (was 235; +1 from the new [SKIP] parsing case in bench.test.ts)
  • pnpm -F @opencodehub/embedder test: 79 pass / 0 fail / 1 skipped
  • bash scripts/smoke-mcp.sh: PASS (29 tools listed)
  • node packages/cli/dist/index.js doctor reports tree-sitter native binding: OK, duckdb native binding: OK, graph-db native binding: FAIL (the real opt-in build status: the @ladybugdb/core binding is not installed on this dev box, which is what doctor is supposed to surface; the false-WARN this PR fixes is gone)
  • rg -n 'AC-[A-Z]-[0-9]' packages/ scripts/ returns zero hits

Test plan

  • CI green on chore/v1-upstream-bug-sweep
  • codehub doctor reports OK on tree-sitter + duckdb in CI matrix (Node 22 + Node 24)
  • codehub scan /tmp/<fixture> ingests into <fixture> not CWD (manual verification on a downstream repo)
  • codehub bench table now renders all 17 rows, none stuck on "skipped — script crashed"
  • License audit / banned-strings / commitlint stay green

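Out of scope

  • Bug #5 (testbed-only pytest-timeout). Listed for reference in UPSTREAM_BUGS.md; does not affect upstream.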

c43c5aa fix(cli): scan ingests SARIF into the scanned repo, not CWD

`codehub scan <path>` only forwarded the `--repo NAME` flag to its
inner `runIngestSarif` call. When operators passed a positional
`<path>` instead, ingest-sarif fell back to `process.cwd()`, so
findings landed in the operator's CWD repo graph rather than the
repo that was actually scanned.

Fix: pass the already-resolved `repoPath` through. `runIngestSarif`
treats absolute paths as a registry-name fallback
(ingest-sarif.ts:351-352), so this works for both `--repo NAME` and
positional `<path>` invocations.

Found during the 2026-05-10 overnight smoke campaign on a TS auth
fixture under /tmp/. See journal.md (Bug #2) for the repro.
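
The routing change can be pictured with a small sketch. This is an illustration under stated assumptions, not the CLI's code: `ScanArgs`, `Registry`, `resolveRepoPath`, and both `ingestTarget*` functions are hypothetical names, and the precedence of a positional path over `--repo` is assumed for the example.

```typescript
// Hypothetical argument shape for `codehub scan`.
interface ScanArgs {
  repo?: string; // value of --repo NAME
  path?: string; // positional <path>
}

// Hypothetical registry: repo name -> absolute path.
type Registry = Record<string, string>;

// Resolve the repo being scanned (assumed precedence: path, then --repo, then CWD).
function resolveRepoPath(args: ScanArgs, registry: Registry, cwd: string): string {
  if (args.path !== undefined) return args.path;
  if (args.repo !== undefined) return registry[args.repo] ?? args.repo;
  return cwd;
}

// Pre-fix shape: only --repo reached ingest, so a positional path fell back to CWD.
function ingestTargetBefore(args: ScanArgs, registry: Registry, cwd: string): string {
  if (args.repo !== undefined) return registry[args.repo] ?? args.repo;
  return cwd;
}

// Post-fix shape: ingest receives the same path the scan already resolved.
function ingestTargetAfter(args: ScanArgs, registry: Registry, cwd: string): string {
  return resolveRepoPath(args, registry, cwd);
}
```

With a positional path and no `--repo`, the "before" function returns the working directory while the "after" function returns the scanned path, which is the mismatch described above.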

c218c31 fix(cli): doctor resolves native bindings from owner workspaces

Under pnpm strict isolation, `tree-sitter*` is a direct dep of
`packages/ingestion`, `@duckdb/node-api` of `packages/storage`, and
`@ladybugdb/core` of `packages/storage` — none are direct deps of
`packages/cli` or the workspace root. The previous resolveFromRoot()
only tried the CLI's own require chain and the root package.json,
both of which fail under pnpm. Result: doctor printed
`tree-sitter or tree-sitter-typescript not installed` and similar
WARNs even when bindings were healthy and `pnpm -r test` was green.

Fix: extend resolveFromRoot with a per-workspace fallback that maps
native package families to their owner workspace's package.json,
so `createRequire(<owner>/package.json).resolve(pkg)` reliably
walks into the .pnpm store.
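
A rough sketch of the per-workspace fallback, assuming a prefix-based owner map. The names `OWNER_WORKSPACES`, `ownerWorkspaceFor`, and `resolveWithOwnerFallback` are hypothetical, and the resolver is injected so the lookup order can be exercised without a real `node_modules` tree; the actual `resolveFromRoot` may be organised differently.

```typescript
// Resolver contract: return the resolved entry path or throw.
// In Node this could be wired as
//   (dir, pkg) => createRequire(join(dir, "package.json")).resolve(pkg)
// using createRequire from "node:module" and join from "node:path".
type ResolveFrom = (workspaceDir: string, pkg: string) => string;

// Assumed owner map, taken from the ownership described above.
const OWNER_WORKSPACES: ReadonlyArray<readonly [string, string]> = [
  ["tree-sitter", "packages/ingestion"],
  ["@duckdb/", "packages/storage"],
  ["@ladybugdb/", "packages/storage"],
];

// Find the workspace that declares the package as a direct dependency.
function ownerWorkspaceFor(pkg: string): string | undefined {
  const hit = OWNER_WORKSPACES.find(([prefix]) => pkg.startsWith(prefix));
  return hit === undefined ? undefined : hit[1];
}

// Try the original base directories first, then the owner workspace.
// Returns null when every candidate fails, so the caller can WARN.
function resolveWithOwnerFallback(
  pkg: string,
  baseDirs: readonly string[],
  resolveFrom: ResolveFrom,
): string | null {
  const owner = ownerWorkspaceFor(pkg);
  const candidates = owner === undefined ? [...baseDirs] : [...baseDirs, owner];
  for (const dir of candidates) {
    try {
      return resolveFrom(dir, pkg);
    } catch {
      // Not resolvable from this directory; try the next candidate.
    }
  }
  return null;
}
```

Keeping the owner workspace last preserves the earlier behaviour for packages that do resolve from the CLI or the root, and only adds a path for the native families that pnpm isolates.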

Also fix a latent crash in duckdbWorksCheck: @duckdb/node-api 1.x
exposes Sync teardown helpers (`disconnectSync`, `closeSync`); the
async `.close()` was dropped. Probe both, prefer Sync when present.
The previous fall-through-to-WARN had been masking this.
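
The teardown probe might look roughly like the following. It is a sketch only: `Closable`, `pickTeardown`, and `teardown` are hypothetical names, the handle is modelled structurally rather than with the real `@duckdb/node-api` types, and the relative order of the two Sync helpers is an arbitrary choice for the example.

```typescript
// Structural view of a handle across API generations; every member is optional.
interface Closable {
  closeSync?: () => void;
  disconnectSync?: () => void;
  close?: () => void | Promise<void>;
}

type TeardownMethod = "closeSync" | "disconnectSync" | "close";

// Pick the first teardown method the handle actually exposes, Sync helpers first.
function pickTeardown(handle: Closable): TeardownMethod | null {
  const order: TeardownMethod[] = ["closeSync", "disconnectSync", "close"];
  const found = order.find((name) => typeof handle[name] === "function");
  return found === undefined ? null : found;
}

// Call the chosen method; awaiting is harmless for the Sync variants.
async function teardown(handle: Closable): Promise<TeardownMethod | null> {
  const method = pickTeardown(handle);
  if (method !== null) {
    await handle[method]?.();
  }
  return method;
}
```

Probing with `typeof` rather than calling `.close()` unconditionally means a missing method yields `null` instead of a thrown `TypeError`, so a health check can report the real state.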

Found during the 2026-05-10 smoke campaign. See journal.md
(Bug #1, Bug #6) for the repros.

433f684 fix(repo): smoke-mcp asserts 29 tools, matching the v1.0 server

The MCP server registers 29 tools at packages/mcp/src/server.ts:
list_repos, pack_codebase, query, context, impact, detect_changes,
rename, sql, group_list, group_query, group_status, group_contracts,
group_cross_repo_links, group_sync, project_profile, dependencies,
license_audit, owners, list_findings, list_findings_delta,
list_dead_code, remove_dead_code, scan, verdict, risk_trends,
route_map, api_impact, shape_check, tool_map.

scripts/smoke-mcp.sh still hard-coded EXPECTED_TOOLS=19, so
`codehub bench` reported the MCP-stdio gate as FAIL even when
the server was healthy. The EXPECTED_TOOLS env-var override remains
available for mid-migration windows.

Found during the 2026-05-10 smoke campaign. See journal.md (Bug #3).
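
The check itself lives in a bash script; the sketch below restates the same idea in TypeScript so the default and the override are easy to see. `REGISTERED_TOOLS` copies the list above, while `expectedToolCount`, `checkToolCount`, and the FAIL message wording are hypothetical.

```typescript
// The 29 tool names listed above.
const REGISTERED_TOOLS: readonly string[] = [
  "list_repos", "pack_codebase", "query", "context", "impact",
  "detect_changes", "rename", "sql", "group_list", "group_query",
  "group_status", "group_contracts", "group_cross_repo_links", "group_sync",
  "project_profile", "dependencies", "license_audit", "owners",
  "list_findings", "list_findings_delta", "list_dead_code",
  "remove_dead_code", "scan", "verdict", "risk_trends", "route_map",
  "api_impact", "shape_check", "tool_map",
];

type Env = Record<string, string | undefined>;

// Read EXPECTED_TOOLS from the environment, defaulting to 29.
function expectedToolCount(env: Env, fallback: number = 29): number {
  const raw = env["EXPECTED_TOOLS"];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`EXPECTED_TOOLS must be a non-negative integer, got "${raw}"`);
  }
  return parsed;
}

// Compare the listed tools against the expected count.
function checkToolCount(listed: readonly string[], env: Env): { ok: boolean; message: string } {
  const expected = expectedToolCount(env);
  const ok = listed.length === expected;
  const message = ok
    ? `PASS (${listed.length} tools listed)`
    : `FAIL (expected ${expected} tools, got ${listed.length})`;
  return { ok, message };
}
```

Passing the environment as a parameter keeps the sketch testable; a caller in Node would hand it `process.env`.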

fad766f (scrub AC-A-7 / AC-A-10 from scripts/m7-parity-audit.sh header)

Per the durable lesson "no spec-coordinate leakage into source" —
spec-coordinate prefixes belong in PR bodies and commit messages, not in
script comment headers where LLM clients pick them up and start citing
them back. Cleaned two stale references in scripts/m7-parity-audit.sh.
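
For readers who want to see what the sweep pattern catches, here is a small TypeScript equivalent of the `rg -n 'AC-[A-Z]-[0-9]'` check used in this PR. The function name `findSpecCoordinates` and the sample text in the note below are made up for illustration.

```typescript
// Same pattern the PR's ripgrep sweep uses.
const SPEC_COORDINATE = /AC-[A-Z]-[0-9]/;

interface Hit {
  line: number; // 1-based, like `rg -n`
  text: string;
}

// Return every line of `source` that contains a spec coordinate.
function findSpecCoordinates(source: string): Hit[] {
  return source
    .split("\n")
    .map((text, index) => ({ line: index + 1, text }))
    .filter((hit) => SPEC_COORDINATE.test(hit.text));
}
```

A header comment such as `# covers AC-A-7 and AC-A-10` is reported once with its line number, while identifiers like `W-M3-1` are not matched by this pattern.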

c5f9047 fix(cli): bench dashboard surfaces all 17 acceptance gates

Before: MVP_GATES held 9 entries with stale titles
("graphHash determinism", "incremental reindex timings (soft)",
"Python eval harness") that diverged from the script banners,
so even those rows never advanced past pending. Net: 0 of 17
gates rendered correctly in `codehub bench`.

After: MVP_GATES mirrors all 17 banners from
scripts/acceptance.sh verbatim (gates 1-17 in script order),
with stable kebab-case ids. applyLine now also recognizes the
[SKIP] marker so graceful-degrade gates (eval, embeddings,
scanner-smoke, etc.) render as skipped rather than pending.
Updated unit tests to assert the new 17-gate roster, banner
format (N/17:), and SKIP-handling path.
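
The parsing idea can be sketched as a small line-by-line state machine. Only the `N/17:` banner prefix and the `[SKIP]` marker come from the description above; the `[PASS]` and `[FAIL]` markers, the `GateState` shape, and this version of `applyLine` are assumptions made for the example.

```typescript
type GateStatus = "pending" | "running" | "pass" | "fail" | "skipped";

interface GateState {
  statuses: GateStatus[]; // index 0 is gate 1
  current: number | null; // index of the gate whose banner was seen last
}

const TOTAL_GATES = 17;
const BANNER = /^\s*(\d+)\/17:/;
const MARKER = /\[(PASS|FAIL|SKIP)\]/;

function initialState(): GateState {
  return { statuses: new Array<GateStatus>(TOTAL_GATES).fill("pending"), current: null };
}

// Fold one line of acceptance-script output into the dashboard state.
function applyLine(state: GateState, line: string): GateState {
  const banner = BANNER.exec(line);
  if (banner !== null) {
    const gate = Number(banner[1]);
    if (gate < 1 || gate > TOTAL_GATES) return state;
    const statuses = [...state.statuses];
    statuses[gate - 1] = "running";
    return { statuses, current: gate - 1 };
  }
  const marker = MARKER.exec(line);
  if (marker === null || state.current === null) return state;
  const statuses = [...state.statuses];
  statuses[state.current] =
    marker[1] === "PASS" ? "pass" : marker[1] === "FAIL" ? "fail" : "skipped";
  return { statuses, current: state.current };
}
```

Matching on the numeric prefix rather than the full banner title is one way to keep a dashboard from drifting when titles are reworded; the PR itself mirrors the banners verbatim instead.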

317bdf1 fix(embedder): isolate http-embedder tests from operator env

The `tryOpenHttpEmbedder` describe block had two cases that asserted a
`null` return when the HTTP env vars (`CODEHUB_EMBEDDING_URL`,
`CODEHUB_EMBEDDING_MODEL`) were absent. The pre-fix `beforeEach` only
deleted those two keys, leaving the SageMaker family
(`CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT`) untouched. Because
`tryOpenHttpEmbedder` consults SageMaker env first, an operator shell
exporting `CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT` flipped the assertion
target from `null` to `Promise<Embedder>` and the cases failed.

Fix: introduce `sanitizeEmbeddingEnv()` — a snapshot-and-wipe helper
that walks `process.env` and removes every `CODEHUB_EMBEDDING_*` key at
test entry, returning a restorer the `afterEach` calls. Wire it into
the `readHttpEmbedderConfigFromEnv`, `openEmbedder factory`, and
`tryOpenHttpEmbedder` describe blocks so all three are hermetic against
operator-shell leakage.
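
A minimal sketch of such a snapshot-and-wipe helper, assuming the environment is passed in explicitly (a test would pass `process.env`). The signature and the choice to also drop keys a test added under the prefix are assumptions; the helper in the repo may differ.

```typescript
type Env = Record<string, string | undefined>;

// Remove every key with the given prefix and return a function that restores them.
function sanitizeEmbeddingEnv(env: Env, prefix: string = "CODEHUB_EMBEDDING_"): () => void {
  const snapshot = new Map<string, string>();
  for (const key of Object.keys(env)) {
    if (!key.startsWith(prefix)) continue;
    const value = env[key];
    if (value !== undefined) snapshot.set(key, value);
    delete env[key];
  }
  return () => {
    // Drop anything the test itself set under the prefix...
    for (const key of Object.keys(env)) {
      if (key.startsWith(prefix)) delete env[key];
    }
    // ...then put the operator's original values back.
    snapshot.forEach((value, key) => {
      env[key] = value;
    });
  };
}
```

In a test file the restorer would typically be captured in `beforeEach` and invoked in `afterEach`, so each case starts with no `CODEHUB_EMBEDDING_*` keys regardless of what the operator's shell exports.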

Verified with:
  CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT=fake-endpoint \
    pnpm -F @opencodehub/embedder test  # 79 pass, 0 fail
  pnpm -F @opencodehub/embedder test     # 79 pass, 0 fail

898192e …ckage list

- Status flips from "v0.1.0 initial release" to "v1 feature-complete on
  M1-M7" with the 0.1.1 tag still shipped pending 1.0.0 sign-off.
- MCP tool surface bumped 28 -> 29 tools (matches
  packages/mcp/src/server.ts), with the federation, pack, and remaining
  tools enumerated explicitly.
- Repository layout regenerated against `ls packages/` -- now lists 17
  packages (cobol-proleap, frameworks, pack, policy, wiki added; eval
  and gym dropped to a sibling testbed).
- 14 -> 15 GA languages (COBOL via the regex provider).
- New "Parse runtime" section mirrors CLAUDE.md: WASM default, native
  opt-in via OCH_NATIVE_PARSER=1, complexity phase still native.
- Quick start: Node 22 or 24 (was Node 20+), Python 3.12 only for the
  SCIP indexers.

69eac8f …ults

- Flip ADR 0011 (LadybugDB phase-1) status from "Proposed" to "Accepted"
  -- M3 has merged. Add a forward link to ADR 0013 (M7 phase-2).
- Flip ADR 0013-m7 (default-flip + interface segregation) status to
  "Accepted" -- the Track A PR has merged.
- Cross-link the two ADR 0013 files (m7 default-flip + parse-runtime
  WASM default) -- both numbers landed concurrently on the same release;
  the next ADR uses 0014.
- Scrub session-local spec coordinates from ADR text so the docs read
  as durable architecture rationale, not work-tracking artefacts. The
  underlying decisions and code paths remain.

edb362e

- CHANGELOG: add an [Unreleased] block summarizing this PR's bug sweep
  (cli/scan SARIF ingest, cli/doctor binding resolution, smoke-mcp
  29-tool assertion) and docs refresh.
- AGENTS: bump 28 -> 29 tools; drop session-local spec coordinates from
  the AMBIGUOUS_REPO worked example so AGENTS.md reads cleanly as a
  contributor reference (CLAUDE.md keeps the original prose for now).
- OBJECTIVES: bump 28 -> 29 tools, 14 -> 15 GA languages, note that the
  retrieval / F1 gym is now a sibling testbed.

e186aea (restore ADR-permanent spec coordinates in ADR 0013-m7 and ADR 0014)

PR #74 (`f09d804`) explicitly carved out `docs/adr/*` as the place
where ERPAVal spec coordinates ARE allowed: "ADR text and docs/adr/*
files retain coordinates where they cite the permanent decision
rationale". Commit 69eac8f over-scrubbed those references.

Restored:
- ADR 0013-m7: AC-A-1 / AC-A-2 / AC-A-6 (a-d) / AC-A-7 / AC-A-9 /
  AC-A-11 in section headers + body, the four-row sub-commit table
  with sub-commit IDs, the W-M3-1 byte-identity invariant citation,
  and the architecture-revised.md spec-cross-link block.
- ADR 0014: AC-C-3, AC-C-5, E-C-3, W-A-2 in section headers, hint
  strings, and the alternatives section.

Kept from 69eac8f:
- ADR 0011 + 0013-m7 status flips (Proposed -> Accepted) since both
  PRs have merged.
- Sibling-ADR cross-link banner on the duplicate 0013 collision.
- ADR 0014 References block stays as code paths (the gitignored
  .erpaval/specs/... and .erpaval/sessions/... entries rot once the
  packet graduates -- swap is per the no-spec-coordinate-leakage
  durable lesson, with code paths as the durable substitute).

theagenticguy merged commit c67294e into main on May 10, 2026
32 checks passed
theagenticguy deleted the chore/v1-upstream-bug-sweep branch on May 10, 2026 at 16:47

theagenticguy added a commit that referenced this pull request May 10, 2026
## Summary

Compound phase from session-6c091d (PR #76). Four new durable lessons
extracted from the v1 upstream bug sweep, plus a clarification of the
existing leakage lesson's sweep scope.

### New lessons

| File | Category | Surfaced by |
|---|---|---|
| `cherry-pick-from-sibling-testbed.md` | best-practices | Whole campaign — fetched the post-filter sibling, picked 3 fix commits directly |
| `bench-dashboard-acceptance-script-parity.md` | architecture-patterns | Bug #4 — dashboard parsed banners by exact-string match; 9-of-17 gates rendered |
| `test-env-hermeticity-for-backend-precedence.md` | conventions | Bug #7 — `CODEHUB_EMBEDDING_*` precedence chain leaked from operator's shell |
| `parallel-docs-subagent-overscrubs-adrs.md` | best-practices | The docs subagent stripped AC-* from `docs/adr/0013-m7` and `0014` despite PR #74's ADR carve-out — required a follow-up restore commit |

### Updated

- `no-spec-coordinate-leakage-into-source.md` — added a "Sweep scope is
`packages/` and `scripts/`, NOT `docs/adr/*`" rule that names PR #74's
carve-out, so future subagents reading the lesson see the constraint
without PR archaeology.
- `INDEX.md` — pointers for the four new lessons.

## Test plan

- [ ] CI green on `chore/v1-compound-lessons`
- [ ] No spec-coordinate leakage in source: `rg -n 'AC-[A-Z]-[0-9]'
packages/ scripts/` returns zero hits.
- [ ] Future ERPAVal sessions that load `INDEX.md` at session start
surface these four lessons.