From b61633d680053c2436cb70aa946efcf948a91ae5 Mon Sep 17 00:00:00 2001 From: Laith Al-Saadoon <9553966+theagenticguy@users.noreply.github.com> Date: Mon, 27 Apr 2026 15:22:33 -0700 Subject: [PATCH 1/7] feat(plugin): ship artifact factory (spec 001 P0) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds the artifact-generation skill family inside plugins/opencodehub/ that turns the code graph into committed Markdown via the codeprobe-pattern 4-phase orchestration. Four new skills (all P0 per spec 001): - codehub-document — primary generator, single + group mode - codehub-pr-description — PR body from detect_changes/verdict/owners - codehub-onboarding — ONBOARDING.md with centrality-ranked reading order - codehub-contract-map — group-only cross-repo contract matrix Six new doc-* subagents (8-section scaffold, sonnet): doc-{architecture,reference,behavior,analysis,diagrams,cross-repo} New PostToolUse hook emits a non-blocking systemMessage after git mutations when .codehub/docs/.docmeta.json is stale. Never auto-regenerates — regeneration requires user consent. Starlight docs get a new Skills category with per-skill pages and an index, plus a reference page for the .docmeta.json schema. ADRs 0007 (artifact factory), 0008 (codeprobe pattern port), 0009 (output conventions) record the load-bearing decisions and are indexed in architecture/adrs.md. Scope exclusions locked durably (not timeline): - no hosted/managed/SaaS tier - no remote/HTTP MCP server (stdio-only) - no agent SDK (python/ts/claude-hooks) - no grounding_pack compositor tool - no own coding agent; no LLM-based PR review; no IDE/LSP Spec 001 and spec 002 (CI-action surface, deferred P1) committed under .erpaval/specs/. Roadmap SPA at .erpaval/roadmap/ reflects locked scope. 
--- .../001-opencodehub-next-strategy.md | 57 ++ .../002-opencodehub-artifact-skills-prd.md | 155 +++++ .../003-opencodehub-skill-interface-design.md | 113 ++++ .../004-opencodehub-subagent-prompts.md | 402 +++++++++++ .../005-opencodehub-output-conventions.md | 158 +++++ .../brainstorms/006-synthesis-whats-next.md | 139 ++++ .../007-agents-at-scale-strategy.md | 84 +++ .../brainstorms/008-agent-grounding-prd.md | 223 +++++++ .../009-grounding-plane-interfaces.md | 372 +++++++++++ .erpaval/brainstorms/010-agent-sdk-design.md | 361 ++++++++++ .../011-ci-integration-playbook.md | 233 +++++++ .../brainstorms/012-competitive-landscape.md | 162 +++++ .../013-synthesis-v2-two-surface-product.md | 160 +++++ .erpaval/roadmap/app.js | 408 ++++++++++++ .erpaval/roadmap/data.js | 622 ++++++++++++++++++ .erpaval/roadmap/index.html | 213 ++++++ .erpaval/roadmap/styles.css | 404 ++++++++++++ .../001-claude-code-artifact-surface/spec.md | 91 +++ .../specs/002-agent-grounding-plane/spec.md | 124 ++++ docs/adr/0007-artifact-factory.md | 110 ++++ docs/adr/0008-codeprobe-pattern-port.md | 131 ++++ docs/adr/0009-artifact-output-conventions.md | 253 +++++++ packages/docs/astro.config.mjs | 4 + .../src/content/docs/architecture/adrs.md | 52 ++ .../content/docs/reference/docmeta-schema.mdx | 98 +++ .../docs/skills/codehub-contract-map.mdx | 89 +++ .../content/docs/skills/codehub-document.mdx | 121 ++++ .../docs/skills/codehub-onboarding.mdx | 86 +++ .../docs/skills/codehub-pr-description.mdx | 72 ++ .../docs/src/content/docs/skills/index.mdx | 84 +++ plugins/opencodehub/agents/doc-analysis.md | 69 ++ .../opencodehub/agents/doc-architecture.md | 70 ++ plugins/opencodehub/agents/doc-behavior.md | 62 ++ plugins/opencodehub/agents/doc-cross-repo.md | 69 ++ plugins/opencodehub/agents/doc-diagrams.md | 69 ++ plugins/opencodehub/agents/doc-reference.md | 66 ++ plugins/opencodehub/hooks.json | 7 +- plugins/opencodehub/hooks/docs-staleness.sh | 45 ++ .../skills/codehub-contract-map/SKILL.md | 144 
++++ .../skills/codehub-document/SKILL.md | 107 +++ .../references/cross-reference-spec.md | 105 +++ .../references/data-source-map.md | 101 +++ .../references/document-templates.md | 347 ++++++++++ .../references/mermaid-patterns.md | 181 +++++ .../skills/codehub-onboarding/SKILL.md | 111 ++++ .../skills/codehub-pr-description/SKILL.md | 122 ++++ .../skills/opencodehub-guide/SKILL.md | 19 +- 47 files changed, 7273 insertions(+), 2 deletions(-) create mode 100644 .erpaval/brainstorms/001-opencodehub-next-strategy.md create mode 100644 .erpaval/brainstorms/002-opencodehub-artifact-skills-prd.md create mode 100644 .erpaval/brainstorms/003-opencodehub-skill-interface-design.md create mode 100644 .erpaval/brainstorms/004-opencodehub-subagent-prompts.md create mode 100644 .erpaval/brainstorms/005-opencodehub-output-conventions.md create mode 100644 .erpaval/brainstorms/006-synthesis-whats-next.md create mode 100644 .erpaval/brainstorms/007-agents-at-scale-strategy.md create mode 100644 .erpaval/brainstorms/008-agent-grounding-prd.md create mode 100644 .erpaval/brainstorms/009-grounding-plane-interfaces.md create mode 100644 .erpaval/brainstorms/010-agent-sdk-design.md create mode 100644 .erpaval/brainstorms/011-ci-integration-playbook.md create mode 100644 .erpaval/brainstorms/012-competitive-landscape.md create mode 100644 .erpaval/brainstorms/013-synthesis-v2-two-surface-product.md create mode 100644 .erpaval/roadmap/app.js create mode 100644 .erpaval/roadmap/data.js create mode 100644 .erpaval/roadmap/index.html create mode 100644 .erpaval/roadmap/styles.css create mode 100644 .erpaval/specs/001-claude-code-artifact-surface/spec.md create mode 100644 .erpaval/specs/002-agent-grounding-plane/spec.md create mode 100644 docs/adr/0007-artifact-factory.md create mode 100644 docs/adr/0008-codeprobe-pattern-port.md create mode 100644 docs/adr/0009-artifact-output-conventions.md create mode 100644 packages/docs/src/content/docs/reference/docmeta-schema.mdx create mode 100644 
packages/docs/src/content/docs/skills/codehub-contract-map.mdx create mode 100644 packages/docs/src/content/docs/skills/codehub-document.mdx create mode 100644 packages/docs/src/content/docs/skills/codehub-onboarding.mdx create mode 100644 packages/docs/src/content/docs/skills/codehub-pr-description.mdx create mode 100644 packages/docs/src/content/docs/skills/index.mdx create mode 100644 plugins/opencodehub/agents/doc-analysis.md create mode 100644 plugins/opencodehub/agents/doc-architecture.md create mode 100644 plugins/opencodehub/agents/doc-behavior.md create mode 100644 plugins/opencodehub/agents/doc-cross-repo.md create mode 100644 plugins/opencodehub/agents/doc-diagrams.md create mode 100644 plugins/opencodehub/agents/doc-reference.md create mode 100755 plugins/opencodehub/hooks/docs-staleness.sh create mode 100644 plugins/opencodehub/skills/codehub-contract-map/SKILL.md create mode 100644 plugins/opencodehub/skills/codehub-document/SKILL.md create mode 100644 plugins/opencodehub/skills/codehub-document/references/cross-reference-spec.md create mode 100644 plugins/opencodehub/skills/codehub-document/references/data-source-map.md create mode 100644 plugins/opencodehub/skills/codehub-document/references/document-templates.md create mode 100644 plugins/opencodehub/skills/codehub-document/references/mermaid-patterns.md create mode 100644 plugins/opencodehub/skills/codehub-onboarding/SKILL.md create mode 100644 plugins/opencodehub/skills/codehub-pr-description/SKILL.md diff --git a/.erpaval/brainstorms/001-opencodehub-next-strategy.md b/.erpaval/brainstorms/001-opencodehub-next-strategy.md new file mode 100644 index 00000000..9342649e --- /dev/null +++ b/.erpaval/brainstorms/001-opencodehub-next-strategy.md @@ -0,0 +1,57 @@ +# OpenCodeHub — What's Next: A Strategy Thesis + +*Author: CSO pass, 2026-04-27. Audience: Laith (owner). Frame: Rumelt kernel (diagnosis → guiding policy → coherent actions).* + +## 1. 
Diagnosis + +OpenCodeHub has shipped a remarkable retrieval surface: 28 MCP tools across six families, five prompts, a six-skill Claude Code plugin with PreToolUse context injection and PostToolUse reindex hooks, an Astro Starlight doc site with llms.txt and per-page Copy-as-Markdown / Open-in-Claude affordances, and — crucially — a working LLM-assisted wiki generator (`codehub wiki --llm`) that already routes through `@opencodehub/summarizer` and Bedrock Converse to emit Markdown. The single-repo version of "produce good docs from the graph" is *already solved* in this repo, buried one layer below the agent. + +And yet: Claude Code users do not spontaneously produce artifacts from OpenCodeHub. The plugin's six skills are all analysis-flavored (guide, exploring, impact-analysis, debugging, refactoring, pr-review). None of them emit a document. The five slash commands (`/probe`, `/verdict`, `/audit-deps`, `/rename`, `/owners`) are investigative, not generative. The `generate-map` MCP prompt sketches an ARCHITECTURE.md but no command or skill invokes it. The wiki generator is CLI-only and therefore invisible to an agent session. The cross-repo `group_*` tools retrieve but do not synthesize. + +The symptom is "low artifact throughput." The root cause is a **missing artifact layer**: OpenCodeHub has retrieval primitives and a group abstraction, and it has a synthesis engine (`summarizer` + wiki generator), but nothing in the Claude Code surface area connects them. Users would have to know the CLI, know the summarizer exists, know the group tools exist, and hand-compose the orchestration themselves. + +**The crux is (a) no artifact-producing skill exists in the plugin — and specifically, no skill that operates at the *group* level.** Candidates (b) and (c) are downstream of (a): exposing the wiki generator as MCP only matters if something drives it; group synthesis only matters if a skill calls for it. 
(d) and (e) are framing issues that dissolve once a concrete artifact skill ships. Solve (a) at the group level and the rest become execution. *Assumption: codeprobe's `/document` pattern generalizes — the same 4-phase orchestration (precompute → parallel doc-* subagents → cross-ref assembler) holds at the group level because the shared-context file pattern is topology-agnostic.* + +## 2. Guiding policy + +**Ship the codeprobe `/document` pattern at the group level — an artifact-producing skill family backed by a group-synthesis MCP surface, with the existing wiki generator as the per-repo building block.** + +This is the wedge. codeprobe owns single-repo documentation; OpenCodeHub owns *multi-repo* documentation by inheriting codeprobe's orchestration and adding the only thing codeprobe structurally cannot do — cross-repo synthesis over the group graph. The group_* tools become the retrieval substrate; the wiki generator becomes the per-repo leaf; a new group-synthesis MCP tool becomes the cross-repo join; a new skill family becomes the user-facing driver. + +**What this policy rules out:** + +- **No new retrieval tools until the artifact layer ships.** The surface is already large. Adding more retrieval without synthesis deepens the gap between capability and workflow. +- **No web UI, no hosted service, no non-Claude-Code clients.** Every surface ships as MCP tool, slash command, or skill. If Claude Code cannot drive it, it does not exist this quarter. +- **No analysis-flavored skills.** The next six skills must all emit artifacts to disk. +- **No head-on competition with Copilot / Cursor.** Those tools do not do multi-repo artifact synthesis from a cross-repo graph. That is the defensible seam. + +## 3. Coherent actions + +All actions mutually reinforce the policy: each produces an artifact or exposes the synthesis path Claude Code needs to generate one. Priority tiers: **P0** = this quarter, **P1** = next, **P2** = later. + +**A. 
[P0, single-track] Ship `/document-group` skill at `plugins/opencodehub/skills/opencodehub-document-group/`.** The flagship. Mirrors codeprobe's `/document` 4-phase orchestration (Phase 0 precompute → AB parallel doc-* subagents → CD round-two → E inline cross-ref assembler), but the unit of work is a *group* not a repo. Emits an `.opencodehub/docs/` tree rooted at the group: `00-group-overview.md`, `10-/*.md` per member repo, `90-contracts.md` for inter-repo contracts, `99-glossary.md`. Uses `references/*.md` for progressive disclosure (document-templates, group-data-source-map, cross-repo-reference-spec, mermaid-patterns). Reads shared-context files `.context.md` and `.opencodehub-prefetch.md` written by Phase 0. Depends on action B. + +**B. [P0, single-track, unblocks A] Expose group synthesis as MCP: `group_wiki` and `group_synthesize`.** `group_wiki` lifts the existing `generateWiki` from the CLI into an MCP tool that fans out across group members and returns a manifest of per-repo Markdown paths; `group_synthesize` consumes `group_contracts` + `group_query` output and emits the cross-repo join sections (shared types, boundary contracts, call graphs that cross repo lines). Both tools live in `packages/mcp-server/`. *Assumption: `@opencodehub/summarizer` is stateless enough to be invoked from an MCP worker — the CLI path proves the Bedrock wiring.* + +**C. [P0, parallel-track with A] `.opencodehub-prefetch.md` precompute contract.** A typed, deterministic prefetch file written by Phase 0 of `/document-group`. Contains: group manifest, per-repo symbol counts, cross-repo edges, owners map, top modules by in-degree. Every doc-* subagent reads it — prompt dedup via filesystem, same pattern codeprobe validated. Schema and writer live in `packages/analysis/src/prefetch.ts`. + +**D. 
[P0, parallel-track with A] `.docmeta.json` sidecar + `--refresh` semantics.** Compare source-artifact mtimes (graph hash, prefetch hash, member repo HEADs) to section mtimes; regenerate only stale sections. Reuses codeprobe's sidecar shape verbatim. Drives freshness. + +**E. [P1, deferred-blocked-by A, B] PostToolUse hook: auto-refresh group docs on `git commit`/`merge`.** Extend `plugins/opencodehub/hooks.json`. On commit-to-main in any group member, enqueue `/document-group --refresh` for that group. Makes freshness free. Compete on *documentation freshness* as the second-order moat. + +**F. [P1, parallel-tracks] Three satellite artifact skills.** `/document-repo` (single-repo, wraps the existing wiki generator — proves the pattern before committing to group scope at scale), `/document-contracts` (API/type boundaries only — the thinnest useful slice of group synthesis), `/document-onboarding` (new-engineer "read these N files in this order" walkthrough generated from graph centrality). Each under 300 lines of skill prose; all reuse the prefetch contract from C. + +**G. [P1, single-track] Discoverability: rewrite `plugins/opencodehub/README.md` around artifacts-first.** Lead with "OpenCodeHub produces docs for Claude Code, it doesn't just retrieve them." Update `/probe` help text to surface `/document-group` as the next step. Update the Starlight doc site landing page. *Assumption: llms.txt should also list the artifact commands, not just retrieval tools — mirror the wedge in the LLM-consumable index.* + +## 4. What we are not doing + +- **No web UI or hosted dashboard.** Claude Code is the client. Every feature ships as MCP / slash command / skill or it doesn't ship. +- **No more retrieval tools this quarter.** 28 is enough. Stop expanding the surface until the artifact layer catches up. +- **No indexer rewrite in Rust or Go.** Commit f8454b5 (cross-node batching + worker pool) bought the perf headroom; further performance work is premature. 
+- **No non-Claude-Code LLM integrations.** No Cursor plugin, no Continue.dev adapter, no OpenAI Assistants flavor. The wedge is Claude Code + cross-repo; splitting focus across clients forfeits it. +- **No direct competition with Copilot / Cursor on inline autocomplete or chat.** Those tools are agentic editors; OpenCodeHub is an artifact factory. Staying out of their lane is the point. +- **No generative-UX experiments (diagrams-as-a-service, custom viewers, embedded iframes).** Mermaid in Markdown is sufficient. Markdown is the format Claude Code already writes best. + +## 5. One-sentence strategy thesis + +**OpenCodeHub's wedge is becoming the artifact factory for Claude Code at the group level: ship a `/document-group` skill backed by `group_wiki` + `group_synthesize` MCP tools that lift codeprobe's single-repo documentation pattern into multi-repo cross-repo synthesis, and compete on documentation freshness via commit-driven auto-refresh.** diff --git a/.erpaval/brainstorms/002-opencodehub-artifact-skills-prd.md b/.erpaval/brainstorms/002-opencodehub-artifact-skills-prd.md new file mode 100644 index 00000000..08b13a3b --- /dev/null +++ b/.erpaval/brainstorms/002-opencodehub-artifact-skills-prd.md @@ -0,0 +1,155 @@ +# OpenCodeHub Artifact-Generation Skills — PRD + +**Owner:** Laith Al-Saadoon (AGS Tech AI Engineering NAMER) +**Status:** Draft v1 — 2026-04-27 +**Surface:** `plugins/opencodehub/` (Claude Code plugin) + +--- + +## Problem + +OpenCodeHub ships 28 MCP tools and a Claude Code plugin that covers five *analytical* commands (`/probe`, `/verdict`, `/audit-deps`, `/rename`, `/owners`) plus six exploration skills. Every current surface answers questions at the speed of chat. None of them produce a committed Markdown artifact — the durable unit of output that Principal engineers actually ship: ADRs, architecture maps, onboarding guides, PR descriptions, release notes, cross-repo contract matrices. 
+ +Two concrete gaps for Claude-Code-as-artifact-producer: + +1. **Artifacts are invisible.** `codehub wiki --llm` already exists inside the CLI (Bedrock + `@opencodehub/summarizer`) and emits Markdown, but it is not wrapped as an MCP tool, not reachable from Claude Code, and not composed with the graph queries an agent has just run. The committed-file workflow lives only in the terminal, behind a flag nobody invokes. +2. **The multi-repo lever is idle.** The `group_list` / `group_query` / `group_status` / `group_contracts` / `group_sync` tools are the single feature no other code-graph tool has (codeprobe is single-repo, GitNexus is single-repo, SCIP graphs are per-package). Yet there is zero plugin surface that synthesizes a cross-repo artifact. Platform architects still hand-draw the same contract-drift diagrams every quarter. + +The codeprobe `/document` skill proved the pattern works: single skill, 8 parallel subagents, 33 cross-linked Markdown files, `.docmeta.json` sidecar, source citations with LOC, Mermaid instead of PNG. We port it, adapt it to OpenCodeHub's graph + supply-chain tools, and extend it with group mode. + +--- + +## Users and job stories + +**Repo onboarder** (new contributor or an LLM inheriting the repo). +*When* I clone a repo or get assigned a group, *I want* auto-generated onboarding docs tied to the current graph, *so that* I navigate the system without the 3-day ramp. **Bad outcome to avoid:** a stale handwritten `ONBOARDING.md` that cites deleted files and teaches me the wrong entry points. + +**Platform architect** (owns a cross-repo group — e.g., `gts-platform`). +*When* a contract changes across repos, *I want* a cross-repo architecture artifact regenerated from `group_contracts` + `group_query`, *so that* my design reviews don't drift off the true consumer/producer topology. **Bad outcome to avoid:** approving a breaking change because my mental model of who calls `/v1/verdict` was six months out of date. 
+ +**Release manager / PR author**. +*When* I cut a PR or a release, *I want* draft-quality Markdown generated from `detect_changes` + `verdict` + `owners` + `list_findings_delta`, *so that* I stop re-explaining what shipped. **Bad outcome to avoid:** a PR description that says "refactor" while `impact` shows tier-2 blast radius on three services. + +--- + +## Solution shape + +One primary skill — **`/codehub-map`** — plus a cluster of four specialized skills. I chose `map` over `document` for three reasons: (1) `/document` collides with codeprobe verbatim, which will confuse users running both plugins; (2) "map" foregrounds the graph-origin of the artifacts (this is a *map of the code graph*, not prose description); (3) it scales cleanly to group mode ("map this group"). + +| Skill | Invocation | Argument hint | Precondition | Output path(s) | Primary MCP tools | Shared-context phases reused | +|---|---|---|---|---|---|---| +| **`/codehub-map`** (P0) | `/codehub-map` | `[output-dir] [--group ] [--section ] [--since ] [--committed] [--refresh]` | `codehub status` is fresh; in group mode, every repo in the group is fresh | `.codehub/docs/` (repo) or `.codehub/groups//docs/` (group) | `list_repos`, `project_profile`, `query`, `context`, `impact`, `owners`, `route_map`, `tool_map`, `risk_trends`, `group_contracts`, `group_query`, `group_status`, `list_findings`, `license_audit`, `verdict` | Phases 0 / AB / CD / E — full | +| **`/codehub-pr-description`** (P0) | `/codehub-pr-description` | `[--base ] [--out ]` | Repo has uncommitted or PR-range changes | `.codehub/pr/PR-.md` (default) or user-supplied | `detect_changes`, `verdict`, `owners`, `list_findings_delta`, `api_impact`, `shape_check` | Phase 0 only (lightweight precompute) | +| **`/codehub-onboarding`** (P0) | `/codehub-onboarding` | `[--committed] [--group ]` | Index is fresh | `.codehub/ONBOARDING.md` or `docs/ONBOARDING.md` with `--committed` | `project_profile`, `route_map`, `tool_map`, `owners`, `query`, 
`group_status` | Phase 0 + a single specialized subagent | +| **`/codehub-contract-map`** (P1) | `/codehub-contract-map ` | ` [--out ]` | `group_status` reports all repos fresh | `.codehub/groups//contracts.md` | `group_list`, `group_contracts`, `group_query`, `route_map`, `shape_check` | Phase 0 + Phase CD specialty (Mermaid) | +| **`/codehub-adr`** (P1) | `/codehub-adr ""` | `"" [--target ]` | Repo index fresh | `docs/adr/NNNN-.md` (committed by default — ADRs are durable) | `impact`, `context`, `risk_trends`, `owners` | Phase 0 lightweight precompute | + +**P0 = `/codehub-map`, `/codehub-pr-description`, `/codehub-onboarding`.** Justification: `/codehub-map` is the flagship analogue of codeprobe `/document` and unlocks the entire 4-phase pattern; `/codehub-pr-description` has the highest frequency of use (every PR) and the shortest agent path; `/codehub-onboarding` is the lowest-effort v1 output that immediately showcases the graph. `/codehub-contract-map` is P1 because it depends on having two or more indexed repos in a group — the install base on day one is small. `/codehub-adr` is P1 because the template market is crowded; we ship once the core pattern has landed. + +--- + +## Architecture — codeprobe 4-phase pattern, adapted + +**Phase 0 — precompute shared context to disk.** Replace codeprobe's `.codeprobe/heuristic_summary.json` requirement with `codehub status` + a Phase-0 writer. The skill writes two files: + +- `.codehub/.context.md` — project name, `project_profile` output, top-level dirs, stack detection, and (in group mode) the member-repo list from `group_list`. Under 200 lines. +- `.codehub/.prefetch.md` — graph pre-fetch, six blocks (the last three conditional): + 1. `query` top-20 symbols by score (grouped by process), for breadth. + 2. `risk_trends` last-30-days summary (per-community trend lines). + 3. `owners` table for the top-30 hotspot files (resolved from `risk_trends` + `sql`). + 4. `route_map` full HTTP surface if any Route nodes exist. + 5. 
`tool_map` full MCP-tool surface if any Tool nodes exist. + 6. Group mode only: `group_contracts` consumer-producer matrix + `group_status` staleness table. + +**Phase AB — content generation, 4 subagents in parallel** (codeprobe runs 6; OpenCodeHub's surface is narrower because supply-chain tools already pre-digest). Dispatched in a single message with 4 `Agent` tool calls: + +| Subagent | Output files | +|---|---| +| `doc-architecture` | `project-overview.md`, `architecture/system-overview.md`, `architecture/components.md`, `architecture/dependencies.md` | +| `doc-behavior` | `behavior/processes.md`, `behavior/routes.md`, `behavior/tools.md` | +| `doc-supply-chain` | `supply-chain/findings.md`, `supply-chain/licenses.md`, `supply-chain/dead-code.md` | +| `doc-hotspots` | `hotspots/risk-trends.md`, `hotspots/owners.md`, `hotspots/co-changes.md` | + +Each subagent reads `.context.md` + `.prefetch.md` first. Tool access: `Read`, `Grep`, `Glob`, plus the `mcp__opencodehub__*` tools named in its row; no Bash except to run `sql` queries via the MCP tool. + +**Phase CD — diagrams + specialty, 2 subagents in parallel.** + +| Subagent | Output files | Mode | +|---|---|---| +| `doc-diagrams` | `diagrams/architecture.md`, `diagrams/data-flow.md`, `diagrams/process-map.md` | Both | +| `doc-cross-repo` | `cross-repo/portfolio-map.md`, `cross-repo/contracts.md`, `cross-repo/dependency-flow.md` | Group mode only | + +Mermaid is sourced from `sql` queries over `relations` (CONTAINS, CALLS, HANDLES_ROUTE, FETCHES). Never rendered to PNG/SVG. + +**Phase E — cross-reference assembler (inline, no subagent).** Preserve the codeprobe algorithm: extract H1 + backtick `[:]` references, build co-occurrence (≥2 shared refs), append `See also` footers (3-5 links, bidirectional override), write `README.md` + `.docmeta.json`. 
**Novel element:** when two or more repos are mapped together, Phase E builds a **cross-repo link graph** — for every `docs/` tree it finds under `.codehub/groups///`, it links per-repo documents to the sibling repo's equivalent section (e.g., `repo-a/behavior/routes.md` ↔ `repo-b/behavior/routes.md` when `group_contracts` shows `repo-a` FETCHES a route produced by `repo-b`). This is the payoff: navigating one repo's docs jumps you to the consumer/producer on the other side, something no single-repo generator can do. + +--- + +## Multi-repo strategy — the wedge + +**Single-repo mode** (default, no `--group`). Phase 0 reads one repo; Phases AB/CD/E behave exactly like the codeprobe analogue. Output at `.codehub/docs/`. + +**Group mode** (`--group ` OR the cwd matches a registered group root — autodetected via `group_list`). Phase 0 calls `group_contracts`, `group_query`, `group_status`; Phase AB fans out 4 × N subagents (4 per repo). Claude Code's parallel-agent ceiling is ~10 concurrent tool calls per message, so for groups of 3+ repos we batch: all `doc-architecture` agents in message 1, all `doc-behavior` in message 2, etc. Phase CD's `doc-cross-repo` synthesizes the portfolio-level artifacts. Output at `.codehub/groups//docs/` with per-repo subtrees + one `cross-repo/` root. + +**Incremental mode** (`--since `). Reads `.docmeta.json`, compares each section's `generated_at` to the mtime of its declared `data_sources` resolved to `.codehub/` artifacts + git blame range since ``. Regenerates only the dirty sections. Always re-runs Phase E because cross-links shift. + +--- + +## Freshness + hooks + +Extend `plugins/opencodehub/hooks.json`: + +- After the existing PostToolUse auto-reindex on `git commit|merge|rebase|pull`, if `.codehub/docs/.docmeta.json` exists, emit a `systemMessage`: *"Docs at `.codehub/docs/` may be stale (graph hash changed). 
Run `/codehub-map --refresh` when convenient."* **Non-blocking.** We never auto-regenerate — regeneration spends Bedrock credits and takes 30-90 s, both of which the user must consent to. +- Precondition gate inside every artifact skill: call `codehub status` first. If staleness > 0 commits, the skill refuses and prints `Run 'codehub analyze' first.` The frontmatter `description` advertises this gate. +- Group-mode freshness: `group_status` must return `fresh: true` for every member. One stale member fails loudly with the repo name — no silent partial regeneration. + +--- + +## Output contracts + +- **Default location:** `.codehub/docs/` (gitignored, colocated with every other codehub artifact). Flag `--committed` writes to `docs/codehub/` instead and omits the `.gitignore` entry. ADRs are the single exception — they default to committed, because an ADR that isn't in git isn't an ADR. +- **Citations:** backtick `:` inline, exactly like codeprobe. Phase E's regex extends to accept the `:LOC` suffix. +- **No YAML frontmatter** on output docs. H1 is the identifier. +- **`.docmeta.json` schema** (extended): + ```json + { + "generated_at": "ISO-8601", + "codehub_graph_hash": "sha256:…", + "mode": "single|group", + "group": "", + "sections": { "": { "agent": "…", "generated_at": "…", "data_sources": [...], "files_produced": [...] } }, + "cross_repo_refs": [ { "from": "repo-a/behavior/routes.md", "to": "repo-b/behavior/routes.md", "reason": "FETCHES->HANDLES_ROUTE" } ] + } + ``` +- **Mermaid code blocks allowed, SVG/PNG never generated.** + +--- + +## Acceptance criteria (EARS) + +1. When the user invokes `/codehub-map` and the index is fresh, the system shall write at least 10 Markdown files under the output directory and a valid `.docmeta.json`. +2. When `codehub status` reports staleness, the system shall refuse to run `/codehub-map`, `/codehub-onboarding`, `/codehub-contract-map`, or `/codehub-adr` and shall print a single-line remediation hint. +3. 
When `/codehub-map --group ` is invoked and any member repo is stale, the system shall abort and shall name the stale repo(s) in the error. +4. When Phase E completes, every non-diagram document shall end with a `See also:` footer containing between 3 and 5 links and every link shall resolve to a file that exists. +5. When group mode is active, the system shall produce a `cross-repo/portfolio-map.md` that includes one Mermaid block sourced from `group_contracts`. +6. When `/codehub-map --refresh` is invoked, the system shall regenerate only sections whose declared `data_sources` have an mtime newer than the section's `generated_at`, and shall always re-run Phase E. +7. When `/codehub-map --committed` is invoked, the system shall write under `docs/codehub/` rather than `.codehub/docs/` and shall not add any entry to `.gitignore`. +8. When `/codehub-pr-description` is invoked inside a branch with changes, the system shall write a Markdown file whose body cites `verdict` tier, `detect_changes` affected symbols, and the `owners` reviewers for every touched file. + +--- + +## Open questions / risks + +- **Precompute size vs Bedrock cost.** Phase 0 `.prefetch.md` can balloon on large repos. Cap each section at ~500 lines and emit a truncation notice. Decide: does Phase 0 call the summarizer MCP tool (wrap `codehub wiki --llm` at last) or keep it as raw graph output? Leaning raw + let Phase AB agents summarize inline. +- **Parallel subagent ceiling.** Claude Code's practical ceiling is around 10 concurrent Agent tool calls per message. Groups larger than 2 repos must batch by subagent role, not by repo. Need to verify against current Claude Code release. +- **Naming collision.** `/document` is taken by codeprobe. `/codehub-map` avoids it. Consider prefixing all five skills with `codehub-` consistently for namespace hygiene. +- **Starlight site duplication.** The repo already has an Astro Starlight docs site with `llms.txt`. 
Generated artifacts should stay per-repo (under `.codehub/` or `docs/codehub/`) — the site stays meta and curated. We do not auto-publish generated docs into Starlight; that would couple generation to the site build and invalidate `--committed` semantics. +- **Bedrock credentials for the summarizer.** Any skill that invokes the summarizer needs AWS credentials on the host. Document the failure mode: if Bedrock is unreachable, skills degrade to raw graph output, never block. + +--- + +## Scope — v1 + +**IN:** `/codehub-map` (single + group), `/codehub-pr-description`, `/codehub-onboarding`, the Phase 0 / AB / CD / E pipeline, shared-context precompute to disk, `.docmeta.json` schema with `cross_repo_refs`, the post-reindex systemMessage hook extension, `--refresh` + `--committed` + `--since` flags. + +**OUT:** hosted web UI, SVG/PNG diagram generation, non-Claude-Code LLM support, custom/fine-tuned summarizer models, Starlight auto-publish, `/codehub-adr` (moved to P1), `/codehub-contract-map` as a standalone skill (folded into `/codehub-map --group` for v1). diff --git a/.erpaval/brainstorms/003-opencodehub-skill-interface-design.md b/.erpaval/brainstorms/003-opencodehub-skill-interface-design.md new file mode 100644 index 00000000..ee4e9de9 --- /dev/null +++ b/.erpaval/brainstorms/003-opencodehub-skill-interface-design.md @@ -0,0 +1,113 @@ +# 003 — OpenCodeHub Skill Interface Design + +VP-of-Design pass on the five artifact-generation skills shipping in the `opencodehub` plugin. Each entry gives the full SKILL.md frontmatter, natural-language + slash invocation examples, and what the user sees in the transcript. + +*Scope: user-facing surface only. Subagent prompts live in `004`, output contract in `005`.* + +## Skill 1 — `codehub-document` + +Primary artifact generator. Ports the codeprobe `/document` choreography to OpenCodeHub's graph. 
+ +```yaml +--- +name: codehub-document +description: "Use when the user asks to generate, regenerate, or refresh long-form codebase documentation, an architecture book, a module map, or a per-repo reference — especially after `codehub analyze` finishes or after a large merge. Examples: \"document this repo\", \"regenerate the architecture docs\", \"write a module map for the monorepo\", \"produce a group-wide portfolio doc\". DO NOT use if the repo is not indexed — run `codehub analyze` first and confirm `mcp__opencodehub__list_repos` returns the repo. DO NOT use for PR descriptions (use `codehub-pr-description`) or ADRs (use `codehub-adr`)." +allowed-tools: "Read, Write, Edit, Glob, Grep, Bash(codehub:*), mcp__opencodehub__list_repos, mcp__opencodehub__project_profile, mcp__opencodehub__query, mcp__opencodehub__context, mcp__opencodehub__impact, mcp__opencodehub__dependencies, mcp__opencodehub__owners, mcp__opencodehub__risk_trends, mcp__opencodehub__route_map, mcp__opencodehub__tool_map, mcp__opencodehub__list_dead_code, mcp__opencodehub__group_list, mcp__opencodehub__group_query, mcp__opencodehub__group_status, mcp__opencodehub__group_contracts, mcp__opencodehub__sql, Task" +argument-hint: "[output-dir] [--group ] [--committed] [--refresh] [--section ]" +color: indigo +model: opus +--- +``` + +*Rationale: `opus` for the orchestrator because it routes to six subagents and must reason about which artifacts are still fresh; subagents themselves run on `sonnet` to stay cheap. The negative rule forces a `list_repos` pre-check — our single biggest support failure today is running against an unindexed repo.* + +**Invocation examples** +- Natural: "regenerate the OpenCodeHub docs" / "document the `platform` group" +- Slash: `/codehub-document docs/` or `/codehub-document docs/ --group platform --refresh` + +**Transcript shape** +- *Phase 0 (precompute):* single status line — `Prefetching graph context for opencodehub… 18 KB → .codehub/.context.md`. 
+- *Phase AB (fan-out 4):* `Dispatching doc-architecture, doc-reference, doc-behavior, doc-analysis in parallel…`. Each subagent emits a one-line summary on completion; the orchestrator does not echo their tool calls. +- *Phase CD (fan-out 2):* `Dispatching doc-diagrams, doc-cross-repo…` (cross-repo skipped silently in single-repo mode). +- *Phase E (assembler):* `Linking 33 docs · 241 citations · 18 See-also blocks · wrote .docmeta.json`. + +## Skill 2 — `codehub-pr-description` + +```yaml +--- +name: codehub-pr-description +description: "Use when the user asks for a PR description, a pull request summary, a merge write-up, or a release note for a branch or diff. Examples: \"write the PR description\", \"summarize this branch for review\", \"draft release notes for HEAD\". Runs `mcp__opencodehub__detect_changes` + `verdict` + `owners` and writes Markdown. DO NOT use for open-ended architecture docs (use `codehub-document`). DO NOT use when no diff exists — the skill refuses on a clean tree." +allowed-tools: "Read, Write, Bash(git diff:*), Bash(git log:*), Bash(git rev-parse:*), mcp__opencodehub__detect_changes, mcp__opencodehub__verdict, mcp__opencodehub__owners, mcp__opencodehub__impact, mcp__opencodehub__signature, mcp__opencodehub__list_findings_delta" +argument-hint: "[--base ] [--head ]" +color: teal +model: sonnet +--- +``` + +*Rationale: `sonnet` is enough because the skill is linear — one `detect_changes` call threads into a fixed template. The negative rule "refuses on a clean tree" prevents the annoying failure mode where a user fires it at `main` by mistake.* + +- Natural: "write the PR description" / "summarize this branch" +- Slash: `/codehub-pr-description --base main --head HEAD` +- Transcript: one-shot; user sees `detect_changes → 7 symbols · verdict: review_recommended · owners: 2` then the rendered Markdown. 
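
The clean-tree refusal for `codehub-pr-description` could be sketched as below. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the function name `has_diff` and the injectable `run` parameter are hypothetical. The only real interface relied on is `git diff --quiet`, which exits 0 when the two revisions are identical and 1 when they differ.

```python
import subprocess
from typing import Callable, Sequence


def has_diff(base: str, head: str,
             run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> bool:
    """Return True when `git diff --quiet base head` reports differences.

    `run` is injectable (hypothetical design choice) so the check can be
    exercised without a real repository.
    """
    cmd: Sequence[str] = ["git", "diff", "--quiet", base, head]
    result = run(cmd, check=False)
    if result.returncode == 0:
        return False  # identical revisions: the skill should refuse
    if result.returncode == 1:
        return True   # differences found: safe to call detect_changes
    # Any other exit code is a git error (bad ref, not a repository, ...).
    raise RuntimeError(f"git diff failed with exit code {result.returncode}")
```

A caller would run this with the `--base` and `--head` arguments before any MCP call and stop with a one-line refusal when it returns `False`.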
+ +## Skill 3 — `codehub-onboarding` + +```yaml +--- +name: codehub-onboarding +description: "Use when the user asks for an ONBOARDING, getting-started, or new-engineer guide for the current repo or group. Examples: \"write ONBOARDING.md\", \"generate an onboarding doc for new hires\", \"what should a new engineer read first\". Produces a ranked reading order from `project_profile` + top processes + entry points. DO NOT use for full architecture books (use `codehub-document`)." +allowed-tools: "Read, Write, Glob, mcp__opencodehub__project_profile, mcp__opencodehub__query, mcp__opencodehub__route_map, mcp__opencodehub__tool_map, mcp__opencodehub__owners, mcp__opencodehub__sql" +argument-hint: "[output-path]" +color: green +model: sonnet +--- +``` + +*Rationale: scoped to a single file so we can keep it tight. The ranked reading order is the deliverable — it is the wedge over a generic README scaffold.* + +- Natural: "write an onboarding guide" — slash: `/codehub-onboarding ONBOARDING.md`. + +## Skill 4 — `codehub-contract-map` + +```yaml +--- +name: codehub-contract-map +description: "Use when the user asks for a cross-repo contract map, an API-consumer matrix, or a service-interaction diagram across a repo group. Examples: \"map the HTTP contracts between services\", \"which services call the billing API\", \"show the contract matrix for the platform group\". GROUP MODE ONLY — requires a named group. DO NOT use on a single repo (use `codehub-document` with `reference/public-api.md`). DO NOT use if `mcp__opencodehub__group_list` does not include the group." 
+allowed-tools: "Read, Write, mcp__opencodehub__group_list, mcp__opencodehub__group_status, mcp__opencodehub__group_contracts, mcp__opencodehub__group_query, mcp__opencodehub__route_map" +argument-hint: " [--output ]" +color: purple +model: sonnet +--- +``` + +*Rationale: this is the skill that advertises the group wedge — the negative rule gates it to groups only so users do not waste a turn on it in single-repo mode.* + +- Natural: "map the contracts across the platform group" — slash: `/codehub-contract-map platform --output .codehub/groups/platform/contract-map.md`. + +## Skill 5 — `codehub-adr` + +```yaml +--- +name: codehub-adr +description: "Use when the user asks to draft an Architecture Decision Record, ADR, or design decision document for a concrete code change or refactor. Examples: \"draft an ADR for splitting the ingestion pipeline\", \"write a decision record for deprecating the legacy handler\". Takes a problem statement, grounds consequences in `mcp__opencodehub__impact` on the target symbol. DO NOT use without a problem statement argument. DO NOT use for retrospective docs of shipped work (use `codehub-document`)." +allowed-tools: "Read, Write, mcp__opencodehub__query, mcp__opencodehub__context, mcp__opencodehub__impact, mcp__opencodehub__owners, mcp__opencodehub__risk_trends, mcp__opencodehub__signature" +argument-hint: "\"\" [--target ] [--adr-number ]" +color: amber +model: sonnet +--- +``` + +*Rationale: forcing the problem statement as a positional arg dodges the empty-ADR failure mode. Impact query seeds the "Consequences" section with real blast-radius data instead of LLM speculation.* + +- Natural: "draft an ADR for removing the v1 auth middleware" — slash: `/codehub-adr "Remove v1 auth middleware" --target legacyAuth`. + +## Discoverability design + +Four reinforcing paths, ranked by hit-rate: + +1. **`opencodehub-guide` adds a Skills table** — the guide is already the on-ramp; add a row per artifact skill with trigger examples. 
*Zero-cost, highest reach.* +2. **`codehub analyze` completion hint (PostToolUse hook)** — after analyze finishes, the hook prints `Indexed opencodehub (1,248 symbols). Try: /codehub-document docs/ · /codehub-onboarding`. *Catches the moment the user is most likely to want docs.* +3. **`verdict` / `detect_changes` next-step hints** — existing `next_steps` arrays on these responses already flow through `packages/mcp/src/next-step-hints.ts`; append `{suggest: "codehub-pr-description"}` when a diff is present. *Reuses infrastructure we already ship.* +4. **Starlight docs site `/skills/` page** — one page per skill with the frontmatter rendered as a card, trigger examples, and a 10-second demo asciicast. *Backstop for Googling users.* + +*Design bet: the PostToolUse hook is the decisive surface. Users who just ran `codehub analyze` are in exactly the mental state that predicts a documentation request — meet them there.* diff --git a/.erpaval/brainstorms/004-opencodehub-subagent-prompts.md b/.erpaval/brainstorms/004-opencodehub-subagent-prompts.md new file mode 100644 index 00000000..c4b4c680 --- /dev/null +++ b/.erpaval/brainstorms/004-opencodehub-subagent-prompts.md @@ -0,0 +1,402 @@ +# 004 — `codehub-document` Subagent Prompt Templates + +Drop-in prompt templates for the six `doc-*` subagents dispatched by the `codehub-document` skill. Each prompt follows the codeprobe 8-section scaffold. Every agent reads `.codehub/.context.md` and `.codehub/.prefetch.md` first — Phase 0 is responsible for writing them. + +*Placement: `plugins/opencodehub/agents/doc-*.md`. The skill invokes them via the `Task` tool with `subagent_type: "doc-architecture"` (etc.).* + +## Phase 0 — Shared context precompute spec + +The orchestrator runs Phase 0 inline before dispatching any subagent. It writes two files into `.codehub/` (single-repo) or `.codehub/groups//` (group mode). 
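
A minimal sketch of how Phase 0 might pick its output directory and enforce per-section caps before writing `.context.md`. The helper names `phase0_dir` and `cap_sections`, and the shape of the `caps` mapping, are illustrative assumptions. Only the two directory shapes and the idea of recording a per-section `truncated` flag come from this spec.

```python
from pathlib import Path
from typing import Optional


def phase0_dir(repo_root: Path, group: Optional[str] = None) -> Path:
    """Single-repo mode writes under .codehub/; group mode under .codehub/groups/<group>/."""
    base = repo_root / ".codehub"
    return base / "groups" / group if group else base


def cap_sections(sections: dict[str, list[str]], caps: dict[str, int]):
    """Truncate each section to its cap and report which sections were cut."""
    kept: dict[str, list[str]] = {}
    truncated: dict[str, bool] = {}
    for name, lines in sections.items():
        cap = caps.get(name, len(lines))  # no cap configured: keep everything
        kept[name] = lines[:cap]
        truncated[name] = len(lines) > cap
    return kept, truncated
```

Returning the flags separately from the kept lines lets the writer put the capped text in `.context.md` and the `truncated` markers in the ledger file.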
+ +### `.context.md` (hard 200-line cap) + +```markdown +# Codehub context — +generated_at: +graph_hash: + +## Repo profile # from project_profile +- languages: TypeScript 87%, Rust 11%, Python 2% +- stacks: Node 22, pnpm 9, Vitest, Axum +- entry points: packages/mcp/src/index.ts, packages/cli/src/bin.ts + +## Top communities (≤ 10) # from sql over nodes WHERE kind='Community' ORDER BY cohesion DESC +| name | inferred_label | cohesion | symbols | + +## Top processes (≤ 10) # from sql over nodes WHERE kind='Process' ORDER BY step_count DESC +| name | entry_point | step_count | + +## Routes # from route_map — truncated to 25 rows +## MCP tools # from tool_map — truncated to 25 rows +## Owners summary # from owners on top 5 folders +## Staleness envelope # from list_repos._meta.codehub/staleness +``` + +*Enforcement: the Phase 0 writer truncates each subsection to its cap and records a `truncated: true` flag per section in `.prefetch.md`. Subagents see the cap, not the raw firehose.* + +### `.prefetch.md` (no cap, structured tool-call log) + +Newline-delimited JSON records of the exact tool calls Phase 0 made and their response digests. Subagents reuse these digests instead of re-calling the same tool. Example line: + +```json +{"tool":"project_profile","args":{"repo":"opencodehub"},"sha256":"…","keys":["languages","stacks","entryPoints"],"cached_at":"2026-04-27T18:04:11Z"} +``` + +*Rationale: two files, not one. `.context.md` is human-readable and LLM-primable; `.prefetch.md` is the de-dup ledger. Splitting them keeps the 200-line cap meaningful — ledger growth does not crowd out context.* + +## Agent 1 — `doc-architecture` + +```markdown +--- +name: doc-architecture +description: "Generates architecture/system-overview.md, architecture/module-map.md, architecture/data-flow.md for codehub-document. Invoked by the orchestrator — not user-facing." 
+model: sonnet +tools: Read, Write, Grep, Glob, mcp__opencodehub__project_profile, mcp__opencodehub__query, mcp__opencodehub__context, mcp__opencodehub__sql, mcp__opencodehub__route_map, mcp__opencodehub__dependencies +--- + +You are the architecture documenter. Produce three Markdown files that describe the static shape of this repository. + +## Output Files +- `/architecture/system-overview.md` +- `/architecture/module-map.md` +- `/architecture/data-flow.md` + +## Input Specification +| Source artifact | Read how | +| ------------------------- | ------------------------------------------------------------- | +| `.codehub/.context.md` | `Read` (always, first) | +| `.codehub/.prefetch.md` | `Read` — reuse digests, do not re-call identical tools | +| project profile | `mcp__opencodehub__project_profile({repo})` | +| communities (modules) | `sql` over `nodes WHERE kind='Community' ORDER BY cohesion` | +| entry points | `sql` over `nodes WHERE kind='Process'` joined to `entry_point_id` | +| imports / dependencies | `mcp__opencodehub__dependencies({repo})` | + +## Process +1. Read the two shared-context files. Treat them as canonical; do not re-call `project_profile`. +2. `sql({query: "SELECT name, inferred_label, cohesion, symbol_count, keywords FROM nodes WHERE kind='Community' ORDER BY cohesion DESC LIMIT 20"})` — these are the modules. +3. For each of the top 8 modules, `context({symbol: })` to pull inbound/outbound relation counts. Cache the summary. +4. `query({text: "system entry point", limit: 10})` — reconcile against community members to find the bootstrap files. +5. `dependencies({repo})` — extract top 15 external packages for `system-overview.md` stack table. +6. Draft `system-overview.md` (H1 = repo identifier, 400-600 words, one Mermaid `flowchart LR` of top-6 modules). +7. Draft `module-map.md` (one H2 per module, bullet list of files cited as `` `path:LOC` ``). +8. Draft `data-flow.md` — walk top 3 processes, each as a Mermaid `sequenceDiagram`. +9. 
Write all three files. Do not emit YAML frontmatter on outputs. + +## Document Format Rules +- H1 = identifier of the repo or module (no decorative titles). +- Every claim backed by a backtick citation `` `path:LOC` `` with `(N LOC)` suffix for file-level cites. +- Mermaid blocks use fenced ```mermaid. +- No emojis. No filler adverbs. + +## Tool Usage Guide +| Need | Tool | Why | +| ------------------------------------- | ------------------------------------ | ------------------------------------ | +| Module list with cohesion score | `sql` over `nodes` | Communities are the module proxy | +| Symbol neighborhood | `context` | Gives inbound/outbound + cochanges | +| Cross-module concept search | `query` | Hybrid BM25+vector, process-grouped | +| File line ranges for citations | `Read` (then count) | Graph does not store LOC count | +| External dependency list | `dependencies` | Authoritative over grepping manifests | + +## Fallback Paths +- If `sql` over `nodes WHERE kind='Community'` returns zero rows: the repo predates communities. Fall back to `sql` over `nodes WHERE kind='File'` grouped by top folder. +- If `dependencies` errors: `Read` the root `package.json` / `Cargo.toml` / `pyproject.toml`. +- If a module has fewer than 3 files: collapse into a "Supporting code" trailing section. + +## Quality Checklist +- [ ] All three output files written. +- [ ] Each file has H1 = identifier, no YAML frontmatter. +- [ ] Every factual claim has a backtick citation. +- [ ] `system-overview.md` has exactly one Mermaid flowchart. +- [ ] `data-flow.md` has one sequenceDiagram per top process, max 3. +- [ ] No re-calls of tools whose digest is in `.prefetch.md`. +``` + +## Agent 2 — `doc-reference` + +```markdown +--- +name: doc-reference +description: "Generates reference/public-api.md, reference/cli.md (if CLI package present), reference/mcp-tools.md (if MCP package present)." 
+model: sonnet +tools: Read, Write, Glob, Grep, mcp__opencodehub__query, mcp__opencodehub__context, mcp__opencodehub__signature, mcp__opencodehub__route_map, mcp__opencodehub__tool_map, mcp__opencodehub__sql, mcp__opencodehub__project_profile +--- + +You document the public API, CLI surface, and MCP tool surface of this repo. + +## Output Files +- `/reference/public-api.md` (always) +- `/reference/cli.md` (conditional) +- `/reference/mcp-tools.md` (conditional) + +## Input Specification +| Source | Read how | +| --------------------- | ---------------------------------------------------------- | +| shared context | `Read .codehub/.context.md` | +| exported symbols | `sql` over `nodes` filtered to exports (see Process #2) | +| route inventory | `route_map({repo})` | +| MCP tool inventory | `tool_map({repo})` | +| signatures | `signature({symbol})` per public function | + +## Process +1. Read shared context. Identify CLI / MCP presence from `project profile → entry points`. +2. `sql({query: "SELECT name, kind, file_path, start_line FROM nodes WHERE kind IN ('Function','Class','Method') AND name NOT LIKE '\\_%' ORDER BY file_path LIMIT 500"})` — public-ish surface. +3. Filter to symbols whose file path is under `packages/*/src/index.ts` or an equivalent barrel. These are the real exports. +4. For the top 30 exports: `signature({symbol: })` then `context({symbol: })` to pick up usage count. +5. `route_map({repo})` — render into `cli.md` if the repo is a CLI, else into `public-api.md` under HTTP section. +6. `tool_map({repo})` — if non-empty, write `reference/mcp-tools.md` with one H2 per tool. +7. Quote signatures verbatim from `signature`. Never paraphrase. + +## Document Format Rules +- Each public symbol: H3 name, signature in a ```ts or ```py fence, 1-paragraph description, `Defined at: ` `path:LOC`. +- Routes: Markdown table `| Method | Path | Handler | Middleware |`. +- MCP tools: H2 per tool name, bullet list of input keys + output keys. 
+ +## Tool Usage Guide +| Need | Tool | Why | +| -------------------------------- | ----------------- | --------------------------------------------- | +| Verbatim signature | `signature` | Never paraphrase — quote | +| Usage count | `context` | Inbound call count = is-it-actually-public | +| HTTP routes | `route_map` | Authoritative, includes middleware chain | +| MCP tools | `tool_map` | Enumerates `mcp__*__*` surface | + +## Fallback Paths +- No CLI package → skip `cli.md` entirely (do not write an empty file). +- No MCP package → skip `mcp-tools.md`. +- `signature` returns no result → cite `path:LOC` and the symbol name only; do not invent a signature. + +## Quality Checklist +- [ ] Every signature is quoted from `signature`, never paraphrased. +- [ ] Conditional files omitted when their package is absent. +- [ ] Route table sorted by path then method. +- [ ] Every symbol has a `Defined at:` citation. +``` + +## Agent 3 — `doc-behavior` + +```markdown +--- +name: doc-behavior +description: "Generates behavior/processes.md and behavior/state-machines.md — runtime/behavioral view of the repo." +model: sonnet +tools: Read, Write, Grep, mcp__opencodehub__query, mcp__opencodehub__context, mcp__opencodehub__sql, mcp__opencodehub__api_impact +--- + +You document how the system behaves at runtime — discovered processes, state machines, retry/error handling. + +## Output Files +- `/behavior/processes.md` +- `/behavior/state-machines.md` + +## Input Specification +| Source | Read how | +| ---------------- | ----------------------------------------------------------------------- | +| shared context | `Read .codehub/.context.md` | +| process nodes | `sql` over `nodes WHERE kind='Process' ORDER BY step_count DESC` | +| process steps | `sql` over `relations WHERE type='PROCESS_STEP'` | +| state shapes | `query({text: "state machine|enum|status transition"})` then `context` | + +## Process +1. Read shared context. +2. 
`sql({query: "SELECT id, name, inferred_label, step_count, entry_point_id FROM nodes WHERE kind='Process' ORDER BY step_count DESC LIMIT 15"})`. +3. For each process, `sql` the `PROCESS_STEP` relations ordered by step index; resolve each step to `{file_path, start_line, name}` via a join against `nodes`. +4. For state machines: `query({text: "state transition enum", limit: 20})`, filter matches to `kind IN ('Enum','TypeAlias')`, then `context` each to find referencing functions. +5. Group state-transition call sites into a state diagram per Enum. +6. Write `processes.md`: H2 per process, bulleted ordered step list, one Mermaid `sequenceDiagram` for the top 3. +7. Write `state-machines.md`: H2 per Enum, Mermaid `stateDiagram-v2`, table of transition sites. + +## Document Format Rules +- Step lines: `1. — `path:LOC` — <1-line why>`. +- Mermaid `stateDiagram-v2` only for enums with ≥ 3 distinct values reached from code. +- Skip `state-machines.md` entirely if zero candidates survive filtering. + +## Tool Usage Guide +| Need | Tool | Why | +| ---------------------------- | --------------- | ----------------------------------------- | +| Process step list | `sql` | `PROCESS_STEP` relations carry the order | +| Transition call sites | `context` | Inbound callers of the enum variant | +| Concept → code | `query` | Finds state shapes by description | +| HTTP-level runtime impact | `api_impact` | Consumer chain for a route | + +## Fallback Paths +- No processes indexed → describe top 5 entry points from project profile as pseudo-processes with a "Processes not yet extracted — run `codehub analyze`" admonition. +- Zero state-machine candidates → omit `state-machines.md`. + +## Quality Checklist +- [ ] Each process cites its entry point line. +- [ ] Step lists preserve graph order, not alphabetical. +- [ ] No sequence diagram exceeds 12 lines (truncate with `… (N more)`). +- [ ] No invented transitions — every arrow has a call-site citation. 
+``` + +## Agent 4 — `doc-analysis` + +```markdown +--- +name: doc-analysis +description: "Generates analysis/risk-hotspots.md, analysis/ownership.md, analysis/dead-code.md — the risk-and-governance view." +model: sonnet +tools: Read, Write, mcp__opencodehub__risk_trends, mcp__opencodehub__owners, mcp__opencodehub__list_dead_code, mcp__opencodehub__list_findings, mcp__opencodehub__license_audit, mcp__opencodehub__sql, mcp__opencodehub__context +--- + +You document risk, ownership, and dead code. All three artifacts must be fully data-driven — no LLM speculation about "this looks risky". + +## Output Files +- `/analysis/risk-hotspots.md` +- `/analysis/ownership.md` +- `/analysis/dead-code.md` + +## Input Specification +| Source | Read how | +| ----------------- | ----------------------------------- | +| shared context | `Read .codehub/.context.md` | +| risk trends | `risk_trends({repo})` | +| ownership | `owners({repo, scope: "folder"})` | +| dead code | `list_dead_code({repo})` | +| findings | `list_findings({repo, limit: 200})` | +| license audit | `license_audit({repo})` | + +## Process +1. Read shared context. +2. `risk_trends({repo})` — rank communities by 30-day projection; top 10 into `risk-hotspots.md`. +3. For each hotspot, `context({symbol: })` for cochanges count; cite the cochange signal explicitly as "git co-change, not call dependency". +4. `owners({repo, scope: "folder"})` — top-contributor table per top-level folder. +5. `list_dead_code({repo})` — group by folder, emit a table. +6. `license_audit({repo})` — append a "License posture" section to `risk-hotspots.md` (copyleft / unknown counts). +7. `list_findings({repo, severity: "high,critical", limit: 50})` — inline-quote top 10 into `risk-hotspots.md`. + +## Document Format Rules +- Each hotspot: H3 = community name, bullet list of `metric: value`, `Top co-changing files:` sublist. +- Ownership table: `| Folder | Owner | Trailing 90d commits | Bus factor |`. 
+- Dead code: one H2 per folder, bullet list of `path:LOC — symbol (exported, 0 callers)`. + +## Tool Usage Guide +| Need | Tool | Why | +| --------------------------------- | ------------------- | --------------------------------- | +| 30-day risk projection | `risk_trends` | First-class trend | +| Folder ownership + bus factor | `owners` | CODEOWNERS + blame fusion | +| Unreferenced exports | `list_dead_code` | Deterministic | +| Active SARIF findings | `list_findings` | Cites severity and rule | +| License tier | `license_audit` | Copyleft / unknown / proprietary | + +## Fallback Paths +- `risk_trends` empty → write "Trends unavailable — requires ≥ 30 days of indexed history" and skip projections. +- `list_dead_code` empty → write an empty-state paragraph, not a blank file. + +## Quality Checklist +- [ ] Every hotspot has a numeric metric (not a vibe). +- [ ] Cochange signal labeled as git-history, not call-dependency. +- [ ] License posture table present. +- [ ] Dead code table sorted by folder then LOC descending. +``` + +## Agent 5 — `doc-diagrams` + +```markdown +--- +name: doc-diagrams +description: "Generates diagrams/architecture/components.md, diagrams/behavioral/sequences.md, diagrams/structural/dependency-graph.md." +model: sonnet +tools: Read, Write, mcp__opencodehub__query, mcp__opencodehub__context, mcp__opencodehub__dependencies, mcp__opencodehub__sql +--- + +You render Mermaid diagrams. Structure first, narrative second. Every node in every diagram has a citation in an accompanying legend. 
+ +## Output Files +- `/diagrams/architecture/components.md` +- `/diagrams/behavioral/sequences.md` +- `/diagrams/structural/dependency-graph.md` + +## Input Specification +| Source | Read how | +| --------------- | ---------------------------------------------------------------------- | +| shared context | `Read .codehub/.context.md` | +| modules | `sql` over `nodes WHERE kind='Community'` | +| edges | `sql` over `relations WHERE type IN ('CALLS','IMPORTS','DEPENDS_ON')` | +| processes | reuse prefetch digest from `doc-behavior` if present | + +## Process +1. Read shared context; check `.prefetch.md` for behavior-agent digests before re-querying. +2. Components: `sql` top 12 communities + their inter-community edge counts; render as Mermaid `classDiagram` with cardinality labels on edges. +3. Sequences: reuse top 3 processes from shared context; render one `sequenceDiagram` each, max 10 participants. +4. Dependency graph: `sql` for `IMPORTS` edges folded to folder level; render as Mermaid `flowchart LR` with clickable node links `click A "…/path"`. + +## Document Format Rules +- One Mermaid block per H2 section. +- Legend table directly below each diagram: `| Node | Path | Role |`. +- Truncate diagrams to 20 nodes; list omitted with "… and N more" in the legend. + +## Tool Usage Guide +| Need | Tool | Why | +| ------------------------------ | ------------------- | ------------------------------ | +| Edge counts between modules | `sql` | Raw graph, cheap | +| Process step list | prefetch digest | Avoid re-call | +| External package grouping | `dependencies` | Group imports by ecosystem | + +## Fallback Paths +- Fewer than 3 communities → emit a single flowchart, skip classDiagram. +- No `IMPORTS` edges → fold to file-level rather than folder-level. + +## Quality Checklist +- [ ] Every diagram ≤ 20 nodes. +- [ ] Every node in legend has `path:LOC`. +- [ ] Mermaid syntax validates (no stray punctuation in labels). 
+``` + +## Agent 6 — `doc-cross-repo` (GROUP MODE ONLY) + +```markdown +--- +name: doc-cross-repo +description: "GROUP MODE ONLY. Generates cross-repo/portfolio-map.md, cross-repo/contracts-matrix.md, cross-repo/dependency-flow.md from group-scope MCP tools." +model: sonnet +tools: Read, Write, mcp__opencodehub__group_list, mcp__opencodehub__group_status, mcp__opencodehub__group_contracts, mcp__opencodehub__group_query, mcp__opencodehub__sql +--- + +You document how the repos in a named group relate. This agent is dispatched only when the orchestrator invokes `codehub-document --group `. If the input does not identify a group, exit with a one-line "Group mode not requested; skipping." message and write no files. + +## Output Files +- `/cross-repo/portfolio-map.md` +- `/cross-repo/contracts-matrix.md` +- `/cross-repo/dependency-flow.md` + +## Input Specification +| Source | Read how | +| ----------------- | ----------------------------------------------------------- | +| group membership | `group_list()` → filter to requested group | +| per-repo freshness| `group_status({group})` | +| HTTP contracts | `group_contracts({group})` — consumer FETCHES → producer Route | +| concept fan-out | `group_query({group, text: "…"})` | + +## Process +1. `group_list()` — confirm the group exists; refuse if not. +2. `group_status({group})` — capture per-repo `graph_hash` and staleness for later inclusion in `.docmeta.json` `cross_repo_refs`. +3. Portfolio map: one H2 per member repo, table `| Repo | Role | Languages | Top communities |` sourced from each repo's shared context (read `/.codehub/.context.md` when present; otherwise call `project_profile`). +4. Contracts matrix: `group_contracts({group})` → Markdown table `| Consumer repo | Consumer call site | HTTP method+path | Producer repo | Producer handler | Confidence |`, sorted by producer then path. +5. 
Dependency flow: Mermaid `flowchart LR` with one node per repo, one edge per distinct consumer→producer contract pair; edge label = count. +6. In every artifact, prefix each citation with `:` — `` `billing:src/client.ts:42` `` — since paths are now ambiguous across repos. + +## Document Format Rules +- Repo-qualified citations mandatory: `` `::` ``. +- Each cross-repo edge notes confidence from `group_contracts`. +- Portfolio map ordered by role (producer → consumer → infra). + +## Tool Usage Guide +| Need | Tool | Why | +| -------------------------- | ------------------- | --------------------------------------- | +| Group members | `group_list` | Authoritative | +| Per-repo staleness | `group_status` | Drives the `cross_repo_refs` sidecar | +| HTTP contract edges | `group_contracts` | Consumer FETCHES → Producer Route join | +| Concept fan-out | `group_query` | BM25 RRF across repos | + +## Fallback Paths +- `group_contracts` returns zero rows → write a "No HTTP contracts detected — rerun `codehub analyze` with HTTP scanners" admonition and skip the matrix file. +- A member repo is stale (from `group_status`) → include a staleness admonition at the top of `portfolio-map.md` naming the stale repos. + +## Quality Checklist +- [ ] Every citation is repo-qualified. +- [ ] Contracts matrix sorted by producer repo then path. +- [ ] Stale member repos called out at the top of `portfolio-map.md`. +- [ ] `cross_repo_refs` manifest emitted to stdout for the Phase E assembler to pick up. +``` + +*Cross-agent rationale: every prompt opens with `.context.md` + `.prefetch.md` Read, ends with a Quality Checklist the agent self-verifies, and names exact `mcp__opencodehub__*` tools — no generic "search the code" instructions. 
That's the single biggest correctness lever.* diff --git a/.erpaval/brainstorms/005-opencodehub-output-conventions.md b/.erpaval/brainstorms/005-opencodehub-output-conventions.md new file mode 100644 index 00000000..c1d86269 --- /dev/null +++ b/.erpaval/brainstorms/005-opencodehub-output-conventions.md @@ -0,0 +1,158 @@ +# 005 — OpenCodeHub Output Conventions + +The on-disk contract for everything `codehub-document` writes. This spec binds the subagents in `004` and the Phase E assembler together. + +*Source of truth: this file. If a subagent output disagrees with this spec, the assembler is authoritative and rewrites the section.* + +## Directory layout + +### Single-repo mode (`.codehub/docs/`) + +``` +.codehub/ +├── .context.md # Phase 0 shared context (200-line cap) +├── .prefetch.md # Phase 0 tool-call digest ledger +└── docs/ + ├── README.md # Landing page — generated by Phase E + ├── .docmeta.json # Manifest (see schema below) + ├── architecture/ + │ ├── system-overview.md + │ ├── module-map.md + │ └── data-flow.md + ├── reference/ + │ ├── public-api.md + │ ├── cli.md # Conditional + │ └── mcp-tools.md # Conditional + ├── behavior/ + │ ├── processes.md + │ └── state-machines.md # Conditional + ├── analysis/ + │ ├── risk-hotspots.md + │ ├── ownership.md + │ └── dead-code.md + └── diagrams/ + ├── architecture/components.md + ├── behavioral/sequences.md + └── structural/dependency-graph.md +``` + +### Group mode (`.codehub/groups//docs/`) + +Each member repo keeps its own single-repo tree; the group tree adds cross-repo artifacts. + +``` +.codehub/groups// +├── .context.md +├── .prefetch.md +└── docs/ + ├── README.md + ├── .docmeta.json + └── cross-repo/ + ├── portfolio-map.md + ├── contracts-matrix.md + └── dependency-flow.md +``` + +*Rationale: mirroring `.codehub/` for both scopes keeps the mental model flat. 
Group docs never duplicate per-repo content — they link into `/.codehub/docs/…` via `See also (other repos in group)` footers.* + +## Citation grammar + +Backtick-wrapped inline citations. Two forms, both recognized by the Phase E assembler. + +- **Single-repo**: `` `:` `` or `` `:-` ``; file-level citations append ` (N LOC)`. +- **Group-qualified**: `` `::` `` — mandatory in any file under `cross-repo/`. + +### Phase E regex (the assembler contract) + +``` +(?P[a-zA-Z0-9_-]+:)?(?P[^\s`:]+\.[a-zA-Z0-9]+)(?::(?P\d+)(?:-(?P\d+))?)?(?:\s*\((?P\d+)\s*LOC\))? +``` + +*Rationale: exactly one regex for every citation form keeps the assembler 40 lines of deterministic code. Matches must be inside backticks — the assembler scans between `` ` `` pairs only, never raw prose.* + +## `.docmeta.json` schema + +Written by Phase E. Drives `--refresh` and `codehub status` staleness reporting. + +```json +{ + "$schema": "https://opencodehub.dev/schemas/docmeta-v1.json", + "generated_at": "2026-04-27T18:12:04Z", + "codehub_graph_hash": "sha256:a1b2c3…", + "mode": "single-repo", + "repo": "opencodehub", + "staleness_at": "2026-04-27T18:12:04Z", + "sections": [ + { + "path": "architecture/system-overview.md", + "agent": "doc-architecture", + "sources": [ + "packages/mcp/src/server.ts", + "packages/mcp/src/index.ts" + ], + "mtime": "2026-04-27T18:11:58Z", + "citation_count": 18, + "mermaid_count": 1 + } + ], + "cross_repo_refs": [ + { + "repo": "billing", + "from_doc": "cross-repo/contracts-matrix.md", + "to_doc": "../../../billing/.codehub/docs/reference/public-api.md", + "contract_count": 4 + } + ] +} +``` + +*`cross_repo_refs` is emitted only in group mode. 
`staleness_at` is copied from the `_meta.codehub/staleness` envelope on the last MCP response the assembler observed — letting `codehub status` compare graph_hash drift without rereading the whole manifest.* + +## Cross-reference rules + +- **Within a single repo**: if two docs share ≥ 2 citations to the same source files, Phase E appends a `## See also` footer to both, listing the sibling path. Threshold enforced by the assembler, not the subagents. +- **Group mode**: `cross-repo/*` files additionally receive a `## See also (other repos in group)` section linking into sibling repos' generated docs via relative paths rooted at the group directory. +- **Link form**: Markdown reference links, not inline URLs — keeps footers tidy when the list grows. +- **Dedup**: a sibling path appears at most once across both footer sections. + +## Mermaid patterns + +One diagram type per artifact. Examples below; subagents in `004` generate the full versions. + +- **Dependency graph (`flowchart LR`)** — `flowchart LR; mcp-->core; cli-->core; storage-->duckdb` +- **Component view (`classDiagram`)** — `classDiagram; class mcp { +server.ts }; mcp --> core` +- **Top process (`sequenceDiagram`)** — `sequenceDiagram; CLI->>MCP: analyze; MCP->>Storage: write` +- **State machine (`stateDiagram-v2`)** — `stateDiagram-v2; [*] --> Pending; Pending --> Running; Running --> Done` +- **Data flow (`flowchart TB`)** — `flowchart TB; Source[Repo]-->Parse[tree-sitter]-->Graph[DuckDB]` + +*Rationale: one canonical diagram per artifact makes the docs scannable. Diagrams are capped at 20 nodes — any overflow goes into the legend table, never into the diagram.* + +## `--refresh` algorithm + +Deterministic, per-section. Avoids regenerating unchanged sections on every invocation. + +1. Load `.docmeta.json`. +2. Fetch current `codehub_graph_hash` from `list_repos`. If it matches the manifest's hash exactly, skip straight to step 5. +3. 
For each `section`: + - Compute `max(mtime(source))` across `sources[]` (via `stat`, not `Read`). + - If `max(source_mtime) > section.mtime`: mark section stale. +4. Collect the union of stale sections and their subagent owners from `section.agent`. Dispatch only the owning subagents; pass them a `sections_to_refresh` list so they write only those files. +5. Re-run Phase E over the full tree (cross-reference assembly is cheap and idempotent). + +*Rationale: source-mtime comparison is tolerant of the common case where `codehub analyze` updates the graph but touches only a few files. Falling back to a full regen when `graph_hash` churns avoids subtle staleness when node IDs shift.* + +## Determinism guarantees + +- **Deterministic**: file list, directory layout, section ordering, diagram node set, citation targets, `.docmeta.json` structure. Given the same `codehub_graph_hash`, two runs produce the same *structure*. +- **Non-deterministic**: prose sentences, diagram edge ordering within a node (Mermaid renderers are stable, but LLM-emitted source ordering is not), choice of which 3 processes to render as sequence diagrams among ties. +- **Explicit call-out**: the generated `README.md` landing page includes a one-line "Prose is LLM-generated; structure is graph-derived" note so reviewers treat the diff accordingly. + +*Design bet: readers need to know which parts of the doc to trust as deterministic. Flagging it once in the landing page is cheaper than littering every section with a disclaimer.* + +## Staleness signals + +- **`codehub status`** reads `.docmeta.json.codehub_graph_hash` and compares against the live graph hash; if different, reports `docs stale at `. Implementation note: bolt into the existing status command at `packages/cli/src/commands/status.ts`; no new command surface. +- **Phase E writes `staleness_at`** from the last MCP `_meta.codehub/staleness` envelope observed during assembly. 
Consumers can detect "the graph itself was stale when these docs were generated" without rereading every envelope. +- **Per-section drift** is visible through `section.mtime` vs. `section.sources[].mtime` — `codehub status --verbose` will render a per-section stale/fresh table. + +*Follow-up for `006`: CI workflow to run `codehub-document --refresh` on `push to main` and open a PR when any section's mtime moves — keeps the docs honest without a human in the loop.* diff --git a/.erpaval/brainstorms/006-synthesis-whats-next.md b/.erpaval/brainstorms/006-synthesis-whats-next.md new file mode 100644 index 00000000..fe0f3f9a --- /dev/null +++ b/.erpaval/brainstorms/006-synthesis-whats-next.md @@ -0,0 +1,139 @@ +# 006 — What's Next for OpenCodeHub: Synthesis + +*Draft: 2026-04-27. Inputs: 001 Strategy (Rumelt kernel), 002 PRD (product discovery), 003–005 Design (interfaces, subagent prompts, output conventions). Run via erpaval with product + strategy + design cycles in parallel.* + +This memo is the single-source recommendation. All three cycles agree on the wedge and the pattern. They disagreed on three things — naming, P0 scope, and orchestrator model. I resolve each below, then hand off to an EARS spec at `.erpaval/specs/001-claude-code-artifact-surface/spec.md` that a later `/act` pass can compile into tasks. + +## The thesis (refined) + +**OpenCodeHub becomes the artifact factory for Claude Code at the group level.** We port codeprobe's `/document` choreography — Phase 0 precompute → parallel `doc-*` subagents → deterministic cross-reference assembler → `.docmeta.json` sidecar — into a plugin skill family that treats a *group of repos* as a first-class scope. Every other code-graph tool is single-repo. We are the only retrieval surface with cross-repo graph primitives (`group_contracts`, `group_query`, `group_status`, `group_sync`), and we have a latent wiki/summarizer engine in the CLI already. The wedge writes itself. + +Two reinforcing moats: + +1. 
**Group-level synthesis** — artifacts that cite across repos with a `See also (other repos in group)` footer. Nobody else can do this. +2. **Freshness** — PostToolUse hook notices a `.docmeta.json` and flags staleness after `git commit|merge|pull|rebase`. Users feel the docs track the code. + +## The crux + +No artifact-producing skill exists in the plugin. All 6 current skills are analytical (guide, exploring, impact, debugging, refactoring, pr-review). The `codehub wiki --llm` generator and Bedrock summarizer exist in the CLI but are invisible to Claude Code. The `generate-map` MCP prompt sketches an ARCHITECTURE.md template but no command invokes it. Users literally have no way to ask Claude Code "document this" and get a committed Markdown tree back. + +Remove the crux by shipping the skill. Everything else is downstream. + +## Three tensions resolved + +### Tension 1 — Naming + +| Cycle | Name | +|---|---| +| Strategy (001) | `/document-group` | +| PRD (002) | `/codehub-map` | +| Design (003) | `/codehub-document` | + +**Resolution: `/codehub-document`.** Reasons: (a) users who already know codeprobe expect "document" verbs; (b) prefix `codehub-` sidesteps the literal collision codeprobe owns at `/document`; (c) "map" foregrounds graph-origin but misframes the output — the artifact is a *document set*, not a map; (d) this also aligns with the agent naming convention in 004 (`doc-architecture`, `doc-reference`, `doc-cross-repo`). Group mode is a flag, not a separate skill. The full family becomes `codehub-document`, `codehub-pr-description`, `codehub-onboarding`, `codehub-adr`, `codehub-contract-map`. 
+ +### Tension 2 — P0 scope + +| Cycle | P0 skills | +|---|---| +| Strategy (001) | `/document-group` only | +| PRD (002) | `/codehub-map` + `/codehub-pr-description` + `/codehub-onboarding` | +| Design (003) | Designs all 5 | + +**Resolution: PRD's P0 shape, with `codehub-document` doing single- and group-mode from day one.** Justification: + +- `codehub-document` alone doesn't demonstrate the pattern's reach. Users need to see the skill family to understand what OpenCodeHub-as-artifact-factory means. +- `codehub-pr-description` has the highest invocation frequency (every PR) and the shortest agent path. It proves the MCP→Markdown pipeline in 10 seconds, not 90. +- `codehub-onboarding` is the lowest-effort v1 output that shows the graph doing something prose can't — ranked reading order from centrality, owners table, entry point trace. +- `codehub-contract-map` folds into `codehub-document --group` for v1. If standalone demand emerges, split later. +- `codehub-adr` is P1 — the template market is crowded, and the group wedge matters more first. + +### Tension 3 — Orchestrator model + +Design flagged `codehub-document` running on **Opus** while every sibling runs on Sonnet. Cost posture inversion. + +**Resolution: Sonnet as default; Opus only when `--refresh` with `--group` is passed.** Rationale: the Opus routing argument is real but narrow — refresh logic that prunes sections by mtime comparison and fans out a partial subagent set requires judgment. Full-scan single-repo generation does not. Cost-but-lazy tenet (global CLAUDE.md) says spend when it matters; `--refresh --group` is where it matters. Single-repo first-run does not need Opus. + +## What ships this quarter (P0) + +Ordered by critical path. + +### 1. `codehub-document` skill — `plugins/opencodehub/skills/codehub-document/` + +Single- and group-mode from v1. 
4-phase orchestration per codeprobe: Phase 0 precompute → Phase AB four subagents parallel → Phase CD two subagents parallel → Phase E inline assembler. `references/` for progressive disclosure: `document-templates.md`, `data-source-map.md`, `cross-reference-spec.md`, `mermaid-patterns.md`. Frontmatter per 003. Precondition: `list_repos` contains the target and `codehub status` is fresh. Argument-hint: `[output-dir] [--group ] [--committed] [--refresh] [--section ]`. + +### 2. Six `doc-*` subagents — `plugins/opencodehub/agents/doc-*.md` + +`doc-architecture`, `doc-reference`, `doc-behavior`, `doc-analysis`, `doc-diagrams`, `doc-cross-repo`. 8-section scaffold per 004. All Sonnet. All read `.codehub/.context.md` + `.codehub/.prefetch.md` first. `doc-cross-repo` is group-mode-only and is skipped silently in single-repo mode. + +### 3. Shared-context precompute — `packages/analysis/src/prefetch.ts` (or equivalent location) + +Phase 0 writer. Emits `.codehub/.context.md` (200-line cap, human-readable project digest) and `.codehub/.prefetch.md` (newline-delimited JSON ledger of tool calls with response digests — the dedup substrate). Per-subsection truncation with `truncated: true` flag. Group mode writes to `.codehub/groups//`. + +### 4. `.docmeta.json` schema + Phase E assembler + +Schema per 005: `generated_at`, `codehub_graph_hash`, `mode`, `sections[]` with `agent`/`sources[]`/`mtime`/`citation_count`, `cross_repo_refs[]` for group mode, `staleness_at` lifted from the MCP `_meta.codehub/staleness` envelope. Phase E is a single regex pass over citations followed by a co-occurrence join — 40 lines of deterministic code, no LLM call. + +### 5. `codehub-pr-description` skill — `plugins/opencodehub/skills/codehub-pr-description/` + +Sonnet, linear (no subagents). Reads `detect_changes` + `verdict` + `owners` + `list_findings_delta`. Outputs `.codehub/pr/PR-.md` by default or user path. Refuses on a clean tree. + +### 6. 
`codehub-onboarding` skill — `plugins/opencodehub/skills/codehub-onboarding/` + +Sonnet, one specialty subagent (`doc-onboarding`) that walks `project_profile` + `query` on entry-point concepts + `owners` + `route_map`/`tool_map`. Output: `.codehub/ONBOARDING.md` or `docs/ONBOARDING.md` with `--committed`. + +### 7. PostToolUse hook extension — `plugins/opencodehub/hooks.json` + +After the existing auto-reindex on `git commit|merge|rebase|pull`, if `.codehub/docs/.docmeta.json` exists and its `codehub_graph_hash` disagrees with the live hash, emit a non-blocking `systemMessage` suggesting `/codehub-document --refresh`. No auto-regeneration — regeneration spends Bedrock credits, user must consent. + +### 8. Discoverability patches + +- `opencodehub-guide` skill gains a Skills table listing the five artifact skills with trigger examples. +- `packages/cli/src/commands/analyze.ts` completion message appends `Try: /codehub-document · /codehub-onboarding`. +- `packages/mcp/src/next-step-hints.ts` — `verdict` and `detect_changes` responses append `{suggest: "codehub-pr-description"}` when a diff is present. +- Starlight site gains `/skills/` page rendered from each skill's frontmatter. + +## What moves to P1 + +- `codehub-contract-map` as a standalone skill (folded into `--group` for v1) +- `codehub-adr` +- `codehub-document --group --auto` mode inside the PostToolUse hook (auto-regenerate on merge-to-main) +- `group_wiki` + `group_synthesize` MCP tools (Strategy action B). Deferred because the Phase 0 precompute + existing `group_*` tools + `codehub wiki --llm` already cover the data path; promote to MCP if v1 proves the pattern + +## What we are NOT doing (explicit exclusions) + +Copy forward from 001 verbatim so they are reviewable: + +- **No web UI or hosted dashboard.** Claude Code is the client. +- **No new retrieval tools this quarter.** 28 is enough. +- **No indexer rewrite in Rust/Go.** Commit f8454b5 bought the headroom. 
+- **No non-Claude-Code LLM integrations.** No Cursor plugin, no Continue, no OpenAI Assistants. +- **No head-to-head with Copilot/Cursor.** They are agentic editors; we are an artifact factory. +- **No SVG/PNG diagrams.** Mermaid in Markdown is sufficient. +- **No Starlight auto-publish of generated docs.** Per-repo artifacts live under `.codehub/` or `docs/codehub/`; the site stays meta. + +## Risks (escalate if they block) + +1. **Parallel subagent ceiling** — Claude Code caps concurrent `Agent` calls at ~10 per message. Groups of 3+ repos require batching by role (all `doc-architecture` agents in message 1, all `doc-behavior` in message 2, …). Design 004 codified this; verify against the current Claude Code release before committing the group fan-out shape. If the ceiling is lower than 10, consider a `doc-supervisor` meta-agent per repo instead of per-role fan-out. +2. **Subagent tool sprawl** — each `doc-*` carries 6–10 `mcp__opencodehub__*` tools plus Read/Write/Grep/Glob. Tool-metadata context bloat is the realistic failure mode. Mitigation is already baked into 004: every agent opens with "do not re-call tools whose digest is in `.prefetch.md`" plus a Tool Usage Guide table. +3. **Bedrock credential gating** — the summarizer path requires AWS credentials on the host. Any skill that invokes it must degrade gracefully to raw graph output when Bedrock is unreachable. Document the failure mode in each SKILL.md. +4. **Precompute size** — `.prefetch.md` can balloon on large repos. Per-section caps (500 lines per block, hard) in the Phase 0 writer prevent the ledger from crowding out the context window. + +## Follow-on work once v1 lands + +1. `codehub-document --since ` for git-range-scoped regeneration. +2. CI workflow that runs `--refresh` on push-to-main and opens a PR when section mtimes move. +3. `group_wiki` + `group_synthesize` MCP tools if usage data shows Phase 0 precompute is the bottleneck. +4. `codehub-adr` with `impact`-sourced Consequences section. 
+5. A `codehub-release-notes` skill that consumes `list_findings_delta` across a range. + +## Open questions for you to decide + +These are the places where I made a call but could be wrong: + +- **Is "codehub-document" the right name?** Short enough, no collision, keeps the verb from codeprobe. Alternatives I rejected: `codehub-map`, `codehub-wiki`, `codehub-book`. If you hate the verb, flag it before we ship the frontmatter. +- **Gitignored default vs committed default.** I kept the PRD's call: `.codehub/docs/` gitignored by default, `--committed` writes to `docs/codehub/`. The one exception is `codehub-adr` — ADRs default to committed because an ADR that isn't in git isn't an ADR. +- **Does `codehub-onboarding` warrant its own skill or should it be a `--section onboarding` flag on `codehub-document`?** I kept it as its own skill to get the invocation phrase "write onboarding" directly. If you'd rather ship only one skill in v1 and fold onboarding + pr-description into flags, that's smaller but erodes the "artifact family" framing. My bet is the family signals the wedge better. + +## Ready for implementation + +Spec at `.erpaval/specs/001-claude-code-artifact-surface/spec.md` (EARS). When you want to start, invoke `/erpaval` (Act phase) against that spec — the task derivation and subagent packets follow the standard orchestrator runbook. If you want `/codehub-document` as a proof of concept first (single-repo, no group, no `--refresh`), skip the spec and say "do the POC" — I'll scaffold just Phase 0/AB/E plus the `doc-architecture` subagent as a 500-line demo. diff --git a/.erpaval/brainstorms/007-agents-at-scale-strategy.md b/.erpaval/brainstorms/007-agents-at-scale-strategy.md new file mode 100644 index 00000000..866a673c --- /dev/null +++ b/.erpaval/brainstorms/007-agents-at-scale-strategy.md @@ -0,0 +1,84 @@ +# 007 — OpenCodeHub at Agent Scale: Strategy Thesis + +*CSO pass, 2026-04-27. Audience: Laith (owner). 
Frame: Rumelt kernel, scoped to the autonomous-coding-agents-at-scale regime. Extends, does not replace, `.erpaval/brainstorms/001-opencodehub-next-strategy.md` and `.erpaval/brainstorms/006-synthesis-whats-next.md`.* + +The developer-laptop artifact factory — `/codehub-document`, `/codehub-pr-description`, `/codehub-onboarding`, and the PostToolUse freshness hook locked down in 006 — remains the wedge. This memo opens the second surface: OpenCodeHub as the grounding and guardrail substrate for coding agents that run off-laptop, at volume, on somebody else's infrastructure. + +## 1. Diagnosis + +The bounding challenge for an org that moves from "Claude Code at the engineer's terminal" to "thousands of agent-authored PRs per week across a repo fleet" is a fused pathology: **agents write uninformed and reviewers cannot catch the consequences deterministically, because grounding and guardrails are not services**. I pick the grounding gap and the review collapse as a single crux. They are not independent problems; they are the two ends of the same broken pipe. + +The grounding-gap half. A GitHub Action runner, a Claude-for-GitHub session, a Devin sandbox, an Amazon Q Developer job — each spins up, reads the files the task description pointed at, and writes. None of them consult a cross-repo symbol graph before the first token, because consulting one requires a graph to exist and a service to return it. The existing OpenCodeHub design assumes `.codehub/graph.duckdb` is sitting on the developer's disk from a prior `codehub analyze` run. That assumption dies the moment the writer is ephemeral. 
*Assumption: the median agent invocation today reads 3–10 open files and inlines their contents as context; a graph slice would replace most of that lookup with a one-shot structured payload, but only if the slice is one RPC away.* The consequence is that agents trip on the exact class of bug the graph would surface: a function signature changed, a consumer in a sibling repo broken, an invariant declared in an ADR violated, a license transitively pulled in. + +The review-collapse half. At 1000s of agent-authored PRs per week per org, humans cannot gate each merge with the attention budget they spent on the 10–50 human PRs per week they used to see. "LGTM" becomes a reflex; the signal-to-noise of CI checks drops because the checks in place (unit tests, typecheck, lint) do not catch cross-repo semantic regressions. OpenCodeHub has `verdict`, `impact`, blast-radius, owners, and license scanners (see `packages/analysis/src/verdict.ts`, `impact.ts`, `risk.ts`, `risk-snapshot.ts`), but they are invoked manually from `/verdict` on a laptop, not wired into a merge gate anyone's CI system enforces. The graph can answer "is this PR safe to auto-approve?" deterministically — nobody is asking it that question in the merge path. + +The review-collapse half feeds the grounding-gap half. If agents had been grounded pre-write, fewer of the 1000 PRs would carry latent cross-repo regressions, and the gate would have less to catch. Conversely, if the gate were deterministic and graph-backed, agents would learn (via fast red builds and structured findings) what grounding they should have pulled. Fixing either alone is half the win. Fixing both is OpenCodeHub's unique position — we already own the graph and the verdict primitives; we do not own the pre-write and pre-merge surfaces through which agents meet them. + +Secondary framings (provenance vacuum, fleet incoherence, policy enforcement gap) are real but downstream. Provenance without grounding is theater. 
Fleet coherence without a graph is impossible. Policy-as-code without a graph to evaluate against is a linter. The graph is the primitive; the missing layer is the service shape around it. + +## 2. Guiding policy + +**OpenCodeHub ships as the grounding plane for coding agents: a stateless/stateful service that every agent platform — Claude, Cursor, Copilot, Devin, Amazon Q, Cody, Greptile, CodeRabbit, Diamond — calls to get pre-write context, mid-write invariants, and pre-merge deterministic gates. We do not build an agent. We ground everyone else's.** + +The policy follows from the diagnosis. If the pathology is uninformed writes plus non-deterministic reviews, the intervention is a graph-backed service that sits on both sides of the write. Pre-write: a grounding pack delivered as MCP-over-HTTP to whichever agent runtime made the call. Pre-merge: a GitHub Action (and GitLab CI template) that runs `codehub analyze` on the PR head, consults the policy file, emits a Checks verdict with auto-approve/block/route signals. Post-merge: a provenance manifest every agent PR carries, so an audit trail exists. + +Three things this policy rules out, each a real option I am rejecting: + +- **We do not build our own coding agent.** No OpenCodeHub-branded autonomous PR author. Every cycle we spend competing with Devin or Claude-for-GitHub is a cycle we do not spend being the grounding surface they all depend on. Composability wins against consolidation when the primitive is hard and the runtime is easy. +- **We do not build a hosted review UI.** GitHub PR comments and Checks are the review UI. Building a dashboard competes with the customer's existing tool and loses on integration cost. +- **We do not fine-tune models or ship our own LLMs.** The value is the graph and the policy shape. Routing model choice back to the agent platform preserves our neutrality — we can be the grounding layer for a Claude agent and an Amazon Q agent in the same org without picking a side. 
+ +The developer-laptop artifact factory from 001/006 stays in the product. It is the wedge that makes OpenCodeHub visible to engineers; the agent-scale grounding plane is the surface that makes OpenCodeHub structural to their org. Two surfaces, one graph. + +## 3. Coherent actions + +Ten moves. P0 = ship this quarter alongside the 006 artifact surface. P1 = next quarter. P2 = follow-up. + +**A. [P0] MCP-over-HTTP server at `packages/mcp-http/`.** The existing stdio server at `packages/mcp/src/server.ts` is laptop-only. Fork a `packages/mcp-http/` flavor that speaks Streamable HTTP per the MCP spec, authenticates via short-lived OAuth tokens (GitHub App installation tokens are the natural issuer), and exposes a narrower tool set: `grounding_pack`, `query`, `context`, `impact`, `detect_changes`, `verdict`, `group_contracts`, `group_query`, `list_findings_delta`. Destructive tools (`rename`, raw `sql`) stay local-only; remote callers get read-only graph surface plus pre-computed gates. *Assumption: short-lived GitHub-App-minted tokens scoped to repo + group are sufficient auth for v1; SSO/OIDC lands in P1.* Allocation: one engineer, full quarter. + +**B. [P0] `opencodehub/analyze-action@v1` and `opencodehub/verdict-action@v1` — GitHub Action pair.** Home them under `packages/actions/analyze/` and `packages/actions/verdict/` with a thin Node action shell that shells out to the CLI already in `packages/cli/src/commands/`. `analyze-action` runs `codehub analyze` on checkout, uploads the resulting `.codehub/graph.duckdb` + sidecars to S3/R2/GitHub Actions Cache keyed by `graph_hash`. `verdict-action` pulls the graph by hash, runs `codehub verdict` + policy evaluation, posts a GitHub Checks run with structured annotations. Companion GitLab template at `packages/cli/src/ci-templates/` (directory already scaffolded). Allocation: one engineer, six weeks. + +**C. 
[P0] `grounding_pack` MCP tool — `packages/mcp/src/tools/grounding-pack.ts` + `packages/mcp-http/` surface.** Signature: `grounding_pack({repo, task_description, target_files?, group?}) → { symbol_slice, blast_radius_hint, owners, recent_findings_on_touched_files, group_contracts_if_crosses_boundary, invariants }`. Implementation composes existing primitives (`query`, `context`, `impact`, `owners`, `list_findings`, `group_contracts`) into one JSON payload an agent prepends to its system prompt. This is the single most important tool in this memo — it is the pre-write intervention made concrete. Allocation: one engineer, four weeks. + +**D. [P0] `@opencodehub/agent-sdk` — thin Node/Python client at `packages/agent-sdk-node/` and `packages/agent-sdk-python/`.** Drop-in for Claude Agent SDK, Vercel AI SDK, LangGraph, Strands, and a generic OpenAI tool-use loop. Wraps the MCP-over-HTTP endpoint, handles token refresh, and exposes `groundingPack()` plus a `withGrounding(agent)` decorator that auto-injects on every turn. Ships with example integrations in `examples/`. *Assumption: MCP-over-HTTP adoption by agent frameworks is uneven; an SDK wrapper is the cheap accelerant.* Allocation: one engineer, six weeks. + +**E. [P0] `opencodehub.policy.yaml` + evaluator — `packages/analysis/src/policy/`.** Declarative rules at the repo or group root: blast-radius tiers that auto-approve or block, license allowlists, required owners, architectural invariants as graph queries (e.g., `no_import_from: [packages/storage/**] unless target_path: [packages/storage/**, packages/service-*/**]`). The evaluator consumes `verdict`, `impact`, and `sql` output; emits a structured result with `decision: approve|block|route`, `reasons[]`, `policy_version`. Wired into `verdict-action` from B. Allocation: one engineer, full quarter. + +**F. 
[P1] Grounding provenance manifest — `.opencodehub/grounding.json` in every agent PR.** Schema: `graph_hash`, `tools_called[]` with digests, `findings_received[]`, `policy_evaluation`, `agent_identity`, `signed_by`. Written by the `agent-sdk` on each turn, committed as a PR artifact, verified by `verdict-action`. Two deterministic gate signals drop out of this: (1) "did the agent call grounding before writing?" (presence check), (2) "did the grounding content match post-merge reality?" (audit replay). Allocation: one engineer, four weeks, P1 only because the SDK (D) must land first. + +**G. [P0] Graph-as-service storage — `packages/graph-store/` with S3/R2 and GitHub Actions Cache backends.** Ephemeral CI jobs must not re-index a 2M-LOC monorepo on every PR. Per-commit graph uploaded by `analyze-action` keyed by `graph_hash = sha256(commit_hash + analyzer_version + config_hash)`; group-scoped manifests keyed by `group_hash`. Content-addressed so cache collisions are effectively impossible and coherence is trivial. *Assumption: GitHub Actions Cache 10 GB per-repo quota is enough for the median customer; S3/R2 is the overflow path and the path for self-hosted runners.* Allocation: one engineer, six weeks. + +**H. [P1] Fleet-coherence primitive — `detect_changes` and `impact` gain `session_id` + `open_branches[]` params.** Question answered: "what changes would this PR conflict with if it merges while sibling PRs X, Y, Z are still open?" Requires a cross-branch graph merge over the graph-store in G. This is the uniquely hard move and the one nobody else has. Ships behind a feature flag in v1 because the merge semantics need field data. Allocation: two engineers, full quarter — the hardest work in the plan. + +**I. [P1] GitHub App webhook subscriber — `packages/github-app/`.** Listens for PR events, invokes `analyze-action` + `verdict-action` logic in-process, posts Checks and structured comments. Unlocks zero-config usage for customers who don't want to edit `.github/workflows/`. 
Kept P1 to avoid forcing a hosted-service commitment in the v1 quarter; customers can still self-host the App. Allocation: one engineer, six weeks. + +**J. [P0] Auto-approval policy primitives — `codehub policy evaluate --pr ` and `opencodehub/auto-merge-action@v1`.** Rule classes shipped out of the box: `label:agent-authored + verdict.tier<=2 + all-policies-pass + required-owners-approved auto-merges after N hours`. The action writes a review and either approves or requests changes; merge itself stays with the customer's branch-protection rules. Allocation: one engineer, four weeks. + +**Critical path.** A → C → D is the pre-write spine. B → E → J is the pre-merge spine. G unblocks both. H is the differentiator once both spines exist. F is provenance once D exists. I is distribution once everything else works. + +## 4. What we are NOT doing + +- **No OpenCodeHub-branded coding agent.** We do not compete with Devin, Claude-for-GitHub, Amazon Q Developer, or Cursor agents. +- **No hosted review UI.** GitHub PR comments, GitHub Checks, and GitLab MR widgets are the review surface. We post into them; we do not replace them. +- **No head-on competition with CodeRabbit / Greptile / Diamond.** Those tools are agents. We ground them — they become customers of `grounding_pack`. +- **No model fine-tuning.** Model choice stays with the agent platform. +- **No LSP or IDE plugin.** Claude Code plugin, MCP-over-HTTP, and GitHub Action are the three surfaces; an LSP is a fourth we do not need. +- **No head-on competition with Sourcegraph's enterprise-code-search SKU.** We are narrower and more opinionated: graph-grounding for agents, not general code search. +- **No hosted cloud service in v1.** Everything is self-hostable — the Action runs in the customer's CI, the graph store is their S3/R2, the policy file is in their repo. Hosted-OpenCodeHub is a P2 SaaS play contingent on v1 pull. 
+- **No new retrieval MCP tools beyond `grounding_pack`.** The 28-tool surface is frozen per 001; `grounding_pack` is a compositor, not a new primitive. + +## 5. Moat analysis + +The underlying graph tools (tree-sitter, SCIP, tsserver, pyright) are commodities. Moats are shape moats, not capability moats. + +- **Group-level (multi-repo) graph joins.** The `group_*` surface — `group_contracts`, `group_query`, `group_status`, `group_sync` — is the delta over every single-repo code-intel tool (Sourcegraph Cody is closest; their group shape is weaker). Cross-repo blast radius and cross-repo invariants are table-stakes at agent scale, and single-repo tools structurally cannot answer them. Action H extends this into cross-branch, which is another order of hard. +- **Policy + provenance shape.** `opencodehub.policy.yaml` plus `.opencodehub/grounding.json` is a product-design problem dressed as an engineering one. The schema choices — what counts as an auto-approve signal, what provenance the gate requires, what reviewers see — compound over deployments. Customers who adopt the schema are expensive to migrate off. +- **Offline-safe and air-gap-friendly.** SPECS.md already commits to offline guarantees on the core graph path. In regulated orgs (finance, defense, health) the ability to run the grounding plane fully inside a VPC with no outbound calls is a deal qualifier, not a feature. Devin and Claude-for-GitHub cannot offer this today; neither can Sourcegraph's hosted tier. +- **Composability with every agent platform.** Because we ship no agent, every agent vendor is a potential distribution partner rather than a competitor. The positioning is symmetric to the way Vercel's AI SDK became the substrate for model-agnostic agents — be the neutral layer, get integrated by everyone. +- **Open-source and self-hostable.** The GitLab trajectory versus Sourcegraph's closed model. Enterprise procurement prefers self-hostable OSS substrate with an optional hosted SKU. 
Action G makes self-hosting operationally cheap. + +The moats compound. Group graph joins are hard to replicate technically; policy-and-provenance shape is hard to replicate productively; offline guarantees are hard to replicate organizationally; composability is hard to replicate strategically once you have built a rival agent. The intersection is the defensible position. + +## 6. Updated one-sentence strategy thesis + +**OpenCodeHub is a two-surface product — the Claude Code artifact factory that produces group-level documentation on the developer's laptop, and the MCP-over-HTTP grounding plane plus CI merge gate that every coding agent platform calls to write and merge safely at org scale — unified by a single offline-safe cross-repo graph that nobody else ships.** diff --git a/.erpaval/brainstorms/008-agent-grounding-prd.md b/.erpaval/brainstorms/008-agent-grounding-prd.md new file mode 100644 index 00000000..79e09202 --- /dev/null +++ b/.erpaval/brainstorms/008-agent-grounding-prd.md @@ -0,0 +1,223 @@ +# OpenCodeHub as Agent Grounding Plane — Lifecycle PRD + +**Owner:** Laith Al-Saadoon (AGS Tech AI Engineering NAMER) +**Status:** Draft v1 — 2026-04-27 +**Scope:** OpenCodeHub repositioned as a grounding + guardrail plane for autonomous coding agents running off-laptop at PR-scale. This PRD supersedes the single-repo artifact framing of `002-opencodehub-artifact-skills-prd.md` at the *lifecycle* level; the artifact-skill work remains in-scope and becomes one of the tools the plane exposes. + +--- + +## Problem + +Three concrete failure modes orgs hit today once agent-authored PRs cross ~100/week per repo fleet. + +- **Agent writes blind.** The agent's prompt never sees the call graph. It refactors `normalizeInvoiceId` in `billing-core`, unaware that four sibling repos import it. CI 10 minutes later screams across three pipelines, humans context-switch to triage, and the agent has already moved to the next task. The graph existed; it just wasn't in the prompt. 
+- **Review collapse.** A mid-size platform org fields 600 agent-authored PRs per week. Human reviewers rubber-stamp the ones that "look fine," the occasional subtle bug ships, and blast radius goes unchecked because no reviewer traces five hops of downstream impact by hand. The review budget is a fixed human-hour line; agent output is elastic. +- **No provenance.** Post-incident, the security lead asks: which agent made this commit, what did it read first, was its graph view stale, did the policy gate actually run? Today that question is unanswerable — git blame points at `agent-bot@example.com`, CI logs have rolled off, and no signed manifest exists. + +The gap: OpenCodeHub has the graph and the tools — `verdict`, `impact`, `detect_changes`, `list_findings_delta`, `group_contracts`, `license_audit`, `owners`, plus 20+ more — but the surface is stdio-MCP + laptop CLI only. No HTTP endpoint, no CI action, no policy-as-code, no provenance manifest, no graph-storage service. The assets exist; the *plane* doesn't. Off-laptop agents cannot reach us, cannot gate on us, and cannot be audited through us. + +--- + +## Agent lifecycle — 6 stages + +Every agent interaction with code flows through these stages. For each we name the status quo and what OpenCodeHub should offer. + +### 1. Task intake + +The agent is handed "fix bug in billing" or "add endpoint X" via a GitHub issue, a Slack message, or a PR-bot webhook. **Today:** agents route tasks on natural-language matching; no repo-aware triage. **OpenCodeHub offers:** `list_repos` + `project_profile` + `owners` resolve the task to a target repo and a responsible owner set before the agent claims the task. A `task_route({description})` helper wraps the triage call. + +### 2. Pre-write grounding + +Before editing, the agent should pull a graph slice: relevant symbols, blast radius for the likely target, owners, prior findings, group-level contracts it must respect, arch invariants.
**Today:** rarely happens; agents rely on in-prompt file reads and training-data recall. **OpenCodeHub offers:** a new `grounding_pack({repo, task_description, target_files?})` MCP tool that returns one bundle — top-ranked symbols (via existing `query`), first- and second-order `impact`, `owners` table, recent `list_findings` for the area, relevant `group_contracts`, the policy file digest, and a `graph_hash` for the manifest. + +### 3. Write + +Agent edits files. Can consult MCP tools mid-write for spot-check questions (`context(symbol)`, `impact(target)`). **Today:** some agents wire a handful of tools; many don't. **OpenCodeHub offers:** the existing 28 tools, unchanged, now reachable over HTTP. The Agent SDK exposes them as a typed client so framework authors don't hand-roll JSON-RPC. + +### 4. Pre-PR gate + +Before opening the PR, the agent should run policy evaluation locally: blast-radius budget, license allowlist, arch invariants, required-owner coverage, findings-delta severity. **Today:** policy runs post-open, in CI, wasting a round trip. **OpenCodeHub offers:** `policy_evaluate({repo, pr_ref, policy_path?})` as an MCP tool the agent calls before `gh pr create`. Same evaluator runs identically in CI (stage 5). Deterministic: same inputs, same verdict. + +### 5. PR review + merge gate + +The PR is open. CI must produce a deterministic verdict driven by graph policy, not LLM vibes. **Today:** `verdict`, `impact`, `detect_changes`, `list_findings_delta` exist as separate tools with no integrated gate. **OpenCodeHub offers:** `opencodehub/verdict-action@v1` posts a GitHub Check with per-rule pass/fail, auto-applies an `opencodehub:auto-approve` label when all rules pass, and uploads the signed provenance manifest. Humans review only the PRs the policy flagged. + +### 6. Post-merge + +The graph must re-index; downstream agents must pick up the new state before their next grounding call. **Today:** `codehub analyze` runs per-laptop on PostToolUse. 
**OpenCodeHub offers:** `opencodehub/analyze-action@v1` runs on `push` to main, writes the graph to the configured object store keyed by `(repo, commit_sha)`, and emits a webhook that group-member repos can subscribe to for cross-repo freshness. + +--- + +## Users — 4 archetypes + +### Agent framework author + +*Builds an autonomous coding agent on Claude Agent SDK, LangGraph, or Strands.* + +- **When I** build an autonomous coding agent that edits code in repos I don't own, +- **I want** a grounding SDK that injects graph context, blast radius, and owners into every prompt via one call, +- **so that** my agent writes code aware of cross-repo effects without me re-implementing code-graph retrieval. +- **Bad outcome to avoid:** shipping an agent that ignores blast radius, then getting banned from an org's merge automation after one bad refactor cascade. + +### Platform engineer / DevEx lead + +*Owns CI and merge automation for an org with 500 repos and a growing fleet of agent-authored PRs.* + +- **When I** turn on agent-authored PRs across my repo fleet, +- **I want** deterministic merge gates driven by graph policy, not LLM review, +- **so that** safe PRs auto-approve and human reviewers focus on the ~15% the policy flags. +- **Bad outcome to avoid:** human reviewers rubber-stamping 600 PRs/week, a real blast-radius-5 refactor ships, and I'm the one rolling it back Sunday night. + +### Security / governance lead + +*Accountable for SBOM, license compliance, arch invariants, and post-incident audit.* + +- **When I** let agents ship PRs across production repos, +- **I want** a signed grounding manifest attached to every agent PR, +- **so that** I can audit what the agent knew, prove the policy ran, and enforce invariants at merge time. +- **Bad outcome to avoid:** an incident where I cannot reconstruct which agent changed what, what it read first, or whether our arch invariants were checked. 
+ +### Individual repo owner + +*Library maintainer on the receiving end of cross-repo agent PRs.* + +- **When I** get an agent PR that touches my library, +- **I want** the agent to have already consumed my `group_contracts`, owners file, and invariants, +- **so that** I'm not spending review cycles re-teaching the agent context it could have fetched. +- **Bad outcome to avoid:** approving a PR that breaks three downstream consumers because the agent never saw my contract surface. + +--- + +## Solution surface — the grounding plane + +| Surface | What it does | +|---|---| +| `packages/mcp-http/` | MCP-over-HTTP server using the Model Context Protocol streamable-HTTP transport on `/mcp`. Exposes all 28 existing tools plus three new ones below. Single endpoint, `Mcp-Session-Id` header for stateful sessions, `Origin` validation for DNS-rebinding safety. *Assumption: streamable-HTTP remains the current MCP transport per the March 2025 spec revision; SSE is the deprecated fallback we do not ship.* | +| `grounding_pack(tool)` | New MCP tool. Input: `{repo, task_description, target_files?}`. Output: bundle of ranked symbols, first- and second-order impact, owners, recent findings, relevant group contracts, policy digest, `graph_hash`. This is the single call a pre-write agent makes. | +| `policy_evaluate(tool)` | New MCP tool. Input: `{repo, pr_ref, policy_path?}`. Output: structured verdict per rule (pass/fail/skip) with citations back into the graph. Deterministic — same inputs, same verdict. Runs identically locally (agent pre-PR) and in CI (merge gate). | +| `provenance_record(tool)` | New MCP tool. Input: `{repo, pr_ref, tools_called, graph_hash}`. Writes `.opencodehub/grounding.json` to the PR branch. | +| `opencodehub/analyze-action@v1` | GitHub Action. Runs `codehub analyze`, uploads the graph DB to the configured backend (S3, R2, GitHub Artifact), keyed by `(repo, commit_sha)`. | +| `opencodehub/verdict-action@v1` | GitHub Action. 
Fetches the cached graph, runs `policy_evaluate`, posts a GitHub Check, applies auto-approve label on full pass. | +| `opencodehub/ground-action@v1` (P1) | GitHub Action. Runs `grounding_pack` on an open PR, posts a human-readable summary as a PR comment so reviewers see what the agent saw. | +| GitLab CI templates (P1) | `.opencodehub/gitlab/analyze.yml`, `verdict.yml`, `ground.yml`. Same semantics, mirror of the GH actions. | +| `opencodehub.policy.yaml` | Policy-as-code schema at repo or group root. v1 rule types: `blast_radius_max`, `license_allowlist`, `ownership_required`, `arch_invariants` (constrained YAML that compiles to cypher against the graph), `finding_severity_blocking`. JSON-Schema validated; invalid policies fail fast. | +| `@opencodehub/agent-sdk` | Thin typed wrapper over MCP-over-HTTP. Python (`opencodehub_agent_sdk`) and TypeScript. Three primary calls: `ground(task)`, `verdict(pr)`, `provenance(pr, grounding_result)`. Plus framework adapters for Claude Agent SDK (Python), Vercel AI SDK (TS), LangGraph, Strands, and a generic OpenAI/Anthropic tool-use loop. | +| `.opencodehub/grounding.json` | Provenance manifest the agent commits to the PR branch. Schema: `graph_hash`, `tools_called[]` each with `{tool, args_hash, response_hash, ts}`, `policy_evaluated`, `findings_received`, `agent_identity`. Rendered as a human-readable summary by `ground-action`. | +| Graph storage service | Object-store-backed (S3 / R2 / GitHub Artifact adapter). Keyed by `(repo, commit_sha)`. TTL-ed. Local file (`.codehub/graph.duckdb`) remains the laptop fallback. Paired with a short-lived signed-URL fetch so actions never see bucket creds directly. | +| `conflict_forecast(tool)` (P2) | Fleet-coherence primitive. Input: `{pr_ref, open_sibling_prs[]}`. Projects merge conflicts across multiple open agent PRs on the same graph. 
| + +--- + +## Job stories in detail + +### JS-1 Agent framework author wires grounding into a Claude Agent SDK loop + +**Trigger:** framework author is building `billing-refactor-agent`, a Claude Agent SDK agent that takes a Jira ticket and opens a PR. + +**Steps:** (1) install `@opencodehub/agent-sdk` in the agent project; (2) set `OPENCODEHUB_URL` and a short-lived token scoped to a GitHub App install; (3) on every task, call `sdk.ground(task)` before the first model turn and feed the returned bundle into the system prompt as the "codebase context" section; (4) expose the full MCP tool set to the model via the SDK's tool adapter so the agent can mid-write call `context(symbol)` or `impact(target)`; (5) before the agent opens the PR, call `sdk.verdict(pr_ref)` and abort if verdict is `blocked`; (6) on successful open, call `sdk.provenance(pr_ref, grounding_result)`. + +**Surfaces touched:** `packages/mcp-http/`, `grounding_pack`, `policy_evaluate`, `provenance_record`, `@opencodehub/agent-sdk`, `.opencodehub/grounding.json`. + +**Outcome:** agent prompts are reproducibly grounded, PRs arrive with a signed manifest, and the framework author did not hand-roll retrieval. + +**Failure modes and detection:** (a) grounding_pack timeout on a fresh repo — SDK logs `grounding_stale=true`, agent falls back to empty context and the manifest records the failure. (b) policy_evaluate diverges between local and CI — both runs record `policy_hash`; a mismatch is a P0 bug. (c) expired token — 401 at SDK boundary; the SDK's retry wrapper refreshes from the configured OIDC source. + +### JS-2 Platform engineer turns on deterministic merge gates for a 500-repo fleet + +**Trigger:** platform lead rolls out agent-authored PRs to a second tier of repos and needs the human-review line to hold. 
+ +**Steps:** (1) drop `.github/workflows/opencodehub.yml` referencing `analyze-action@v1` on push-to-main and `verdict-action@v1` on pull_request; (2) author `opencodehub.policy.yaml` at the org's shared-config repo and reference it from member repos; (3) configure auto-approve rule: `verdict == pass && label == opencodehub:auto-approve` merges via the existing branch-protection bot; (4) wire the Checks API output to a Slack digest. + +**Surfaces touched:** `analyze-action`, `verdict-action`, `opencodehub.policy.yaml`, graph storage service. + +**Outcome:** the review queue drops from 600/week to roughly the ~15% of PRs the policy flags. Mean-time-to-merge for safe PRs falls to minutes; human attention concentrates on real blast radius. + +**Failure modes and detection:** (a) graph cache miss in CI — `analyze-action` re-runs analysis on the fly with a visible latency hit; alert fires after 3 consecutive misses. (b) a policy rule has a false-positive pattern — per-rule pass/fail is itemized, so the platform lead can silence one rule without disabling the gate. (c) agent bypasses the label — branch protection requires the Check, not the label; bypass requires explicit human override logged in audit. + +### JS-3 Security lead audits an incident traced to an agent PR + +**Trigger:** Sev-2 incident, rollback points to commit `abc123` authored by an agent. + +**Steps:** (1) fetch `.opencodehub/grounding.json` from the PR branch (or its archived copy in the audit bucket); (2) verify the manifest signature against the agent-identity key; (3) read `tools_called[]` to reconstruct the agent's view; (4) compare `graph_hash` to the graph at commit time — was the grounding fresh? (5) replay `policy_evaluate` with the captured inputs to confirm the gate ran as advertised. + +**Surfaces touched:** provenance manifest, graph storage (historical `(repo, commit_sha)` entries), `policy_evaluate`, signed-URL fetch. 
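
The audit steps above could be expressed as a small check over the manifest. The sketch below is a hypothetical helper, not part of any shipped SDK. Field names follow the `.opencodehub/grounding.json` schema described in this PRD; the boolean type of `policy_evaluated`, the optional `signed` flag, the timestamp format, and the gap labels are assumptions. Signature verification and the `policy_evaluate` replay are deliberately left out.

```typescript
// Hypothetical audit helper; field types beyond the PRD's field names are assumptions.
interface ToolCall {
  tool: string;
  args_hash: string;
  response_hash: string;
  ts: string; // assumed to be an ISO-8601 timestamp
}

interface GroundingManifest {
  graph_hash: string;
  tools_called: ToolCall[];
  policy_evaluated: boolean; // assumed to be a simple flag
  agent_identity: string;
  signed?: boolean; // v1 may record signed=false
}

// Returns gap labels; an empty list means these particular checks found nothing.
function auditManifest(m: GroundingManifest, graphHashAtCommit: string): string[] {
  const gaps: string[] = [];
  // Was the agent grounded on the graph that matched the commit?
  if (m.graph_hash !== graphHashAtCommit) gaps.push("stale-grounding");
  // Did the agent call grounding_pack at all before writing?
  if (!m.tools_called.some((c) => c.tool === "grounding_pack")) gaps.push("no-grounding-call");
  // Did the policy gate run?
  if (!m.policy_evaluated) gaps.push("policy-not-evaluated");
  // Unsigned manifests are accepted in v1 but still worth surfacing in an audit.
  if (m.signed !== true) gaps.push("unsigned-manifest");
  return gaps;
}

const incidentManifest: GroundingManifest = {
  graph_hash: "sha256:aaa",
  tools_called: [{ tool: "grounding_pack", args_hash: "h1", response_hash: "h2", ts: "2026-04-27T00:00:00Z" }],
  policy_evaluated: true,
  agent_identity: "agent-bot@example.com",
  signed: false,
};
console.log(auditManifest(incidentManifest, "sha256:bbb"));
```

Returning labels rather than a boolean keeps the post-mortem specific: each label maps to one of the numbered audit steps.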
+ +**Outcome:** incident post-mortem names a specific grounding gap or policy gap, not "the agent." Remediation is a new rule or a richer grounding bundle, not a blanket rollback of agent-authored PRs. + +**Failure modes and detection:** (a) graph for the incident commit aged out of the cache — TTL policy tunable per repo criticality; audit-tier repos pin forever. (b) missing manifest — CI fails closed if `provenance_record` did not run, so missing manifests are a pre-merge signal, not a post-incident surprise. + +### JS-4 Repo owner receives an agent PR on a shared library + +**Trigger:** `platform-sdk` maintainer sees an agent PR on their library from a downstream product repo. + +**Steps:** (1) the PR comment from `ground-action` already summarizes which contracts the agent consumed and which owners were notified; (2) maintainer scans the blast-radius section (two downstream repos); (3) reviews the one file the agent touched against the contract the agent cited; (4) merges or requests changes; no context re-teaching. + +**Surfaces touched:** `ground-action`, `grounding_pack`, `group_contracts`, PR comment renderer. + +**Outcome:** review time per agent PR drops to minutes because the agent's context is visible in-thread. + +**Failure modes and detection:** (a) agent cited an outdated contract — `group_contracts` staleness flag is rendered in the comment; maintainer sees it immediately and requests a re-ground. (b) agent touched a file with no contract — the comment flags `ungoverned_surface=true` and routes the PR to the `arch` team. + +--- + +## Acceptance criteria (EARS) + +1. When an agent sends a request to `POST /mcp` with a valid session header, the HTTP server **shall** respond using the Model Context Protocol streamable-HTTP transport and **shall** reject requests with an invalid `Origin` header. +2. 
When `grounding_pack` is called with `{repo, task_description}`, the system **shall** return JSON conforming to the `grounding_pack.schema.json` and **shall** include a non-empty `graph_hash`. +3. When `policy_evaluate` is called against a `pr_ref` with an `opencodehub.policy.yaml` present, the system **shall** return a structured verdict with a per-rule pass/fail/skip entry for every rule declared in the policy. +4. When `policy_evaluate` runs twice on unchanged inputs (same `graph_hash`, same `pr_ref`, same policy file), the two verdicts **shall** be byte-identical. +5. When `analyze-action@v1` runs on push-to-main for a repo up to 500k LOC, it **shall** upload the graph to the configured backend within 10 minutes and **shall** emit the `(repo, commit_sha)` cache key in the action output. +6. When `verdict-action@v1` runs on a PR and all policy rules pass, it **shall** post a GitHub Check named `OpenCodeHub / verdict` with conclusion `success` and **shall** apply the `opencodehub:auto-approve` label. +7. When an agent commits via the SDK, the PR branch **shall** contain `.opencodehub/grounding.json` validating against the provenance schema before the PR is marked ready-for-review. +8. When `sdk.ground(task)` is invoked against a repo up to 100k LOC with a warm graph cache, it **shall** return within 5 seconds at p50 and 15 seconds at p95. +9. When `opencodehub.policy.yaml` fails JSON-Schema validation, `policy_evaluate` **shall** exit with a non-zero code and **shall not** return a pass verdict. +10. When `provenance_record` completes, the written manifest **shall** be signed with the agent-identity key (P1 requirement; v1 accepts unsigned with a `signed=false` flag). +11. When the graph storage service receives a request for a `(repo, commit_sha)` already in cache, it **shall** serve the cached artifact without re-running analysis and **shall** emit a `cache_hit=true` metric. +12. 
When two or more open PRs against the same repo are passed to `conflict_forecast`, the system **shall** return a list of files predicted to conflict with at least the precision of a three-way `git merge --no-commit` dry-run (P2). +13. When the agent calls any tool over MCP-HTTP without a valid short-lived token, the server **shall** return 401 and **shall** log the attempt to the audit sink. +14. When the policy file at the group root and the repo root both exist, the repo root **shall** take precedence and the merged effective policy **shall** be recorded in the manifest. +15. When `ground-action@v1` posts a PR comment, the rendered comment **shall** include the `graph_hash`, the list of tools called, the blast-radius tier, and the owner set — in ≤150 lines. + +--- + +## Scope — v1 / P1 / P2 + +### v1 (this quarter) + +- `packages/mcp-http/` with streamable-HTTP `/mcp` endpoint and all 28 existing tools. +- `grounding_pack` MCP tool (new) with JSON-Schema output contract. +- `analyze-action@v1` and `verdict-action@v1` GitHub Actions, published to the GitHub Marketplace under `opencodehub/`. +- `@opencodehub/agent-sdk` Python client first; TS client P1. +- `opencodehub.policy.yaml` skeleton with three v1 rule types: `blast_radius_max`, `license_allowlist`, `ownership_required`. +- Graph storage service — S3/R2 adapter, `(repo, commit_sha)` keying, 30-day TTL default. +- Short-lived-token auth keyed to a GitHub App install. + +### P1 (next quarter) + +- `provenance_record` + `.opencodehub/grounding.json` schema + signing. +- `ground-action@v1` PR-comment renderer. +- GitLab CI templates mirroring the three actions. +- TypeScript SDK (`@opencodehub/agent-sdk` npm package). +- Two additional policy rule types: `arch_invariants` (constrained YAML → cypher) and `finding_severity_blocking`. +- Framework adapters: Claude Agent SDK, Vercel AI SDK, LangGraph, Strands, generic OpenAI tool-use. + +### P2 + +- `conflict_forecast` fleet-coherence primitive. 
+- GitHub App (not just Action) with native Checks + auto-merge surface. +- Hosted graph storage (OpenCodeHub-operated tier for teams that don't want to run S3 themselves). +- Signed/tamper-evident provenance with Sigstore-compatible cosign signatures. +- Cross-org policy federation. + +--- + +## Risks and open questions + +- **Transport choice.** MCP streamable-HTTP on `/mcp` is the current spec default (March 2025 revision, superseding SSE). v1 ships streamable-HTTP only. WebSockets are ruled out because they are not in the spec. *Assumption: the Anthropic streamable-HTTP spec holds through the next 12 months; re-check before GA.* +- **Graph privacy.** We never upload source. We do upload the graph DB — nodes, relations, symbol names, file paths, LOC ranges. For some orgs the graph itself is sensitive (internal service topology). v1: BYO bucket with CMK support. P1: per-repo encryption-key binding so the plane never sees plaintext. +- **Policy-DSL vs raw cypher.** Raw cypher is maximally expressive and maximally dangerous — a bad query on a large graph burns minutes. v1 ships a constrained YAML schema that compiles to a curated cypher subset. Raw cypher is explicitly out of scope for v1; platform teams who need it can run the `sql` tool directly. +- **Agent authentication.** Short-lived OIDC tokens keyed to a GitHub App install. TTL 15 minutes. *Assumption: the consuming orgs already run GitHub App auth for their agents; the Cursor / Devin / Jules vendors each have their own identity model and we expose the GitHub App path first.* +- **Cost scaling for graph storage on monorepos.** A 5M-LOC monorepo graph is ~1-3 GB. At 30-day TTL and commit-level keying, retained footprint is ~(commits/day × 30 × GB per graph). Mitigation: commit-sha dedup on unchanged subgraphs (content-addressed graph blocks) lands in P1; v1 accepts the naive footprint and surfaces a `graph_bytes` metric so platform leads can set per-repo TTL. +- **Framework breadth vs depth.** Claude Agent SDK is the first-class client.
Being framework-agnostic at the wire protocol (MCP-over-HTTP) keeps every other agent framework unblocked at the cost of shipping N framework adapters in P1. My call: ship Claude first; let community PRs fill the rest. +- **Fleet coherence's real cost.** `conflict_forecast` on N open PRs is O(N²) in the naive implementation. P2 is the right timing to research an incremental projection model rather than ship the quadratic version. + diff --git a/.erpaval/brainstorms/009-grounding-plane-interfaces.md b/.erpaval/brainstorms/009-grounding-plane-interfaces.md new file mode 100644 index 00000000..23893978 --- /dev/null +++ b/.erpaval/brainstorms/009-grounding-plane-interfaces.md @@ -0,0 +1,372 @@ +# 009 — Agent-Grounding Plane: Remote API, CI Integrations, Policy Schema + +*Draft: 2026-04-27. Inputs: 001 Strategy kernel, 002 PRD, 003–005 Design, 006 Synthesis (artifact plane). This memo extends OpenCodeHub past the laptop and into CI — the off-desktop surface that grounds any agent in any pipeline.* + +Prior brainstorms locked the **artifact plane** (single-repo Markdown-factory skills on Claude Code). 006 closed that thread and called out four P1 items, including "CI workflow that runs `--refresh` on push-to-main". This memo designs the **grounding plane**: the HTTP MCP surface, two new agent-facing tools, a policy-verdict DSL, and the two GitHub Actions that stitch them together. + +The wedge is the same — **graph-aware retrieval + blast-radius + group contracts** — but the consumer shifts from Claude Code to any agent (Claude Agent SDK, Vercel AI SDK, LangGraph, bespoke OpenAI loops) running inside CI or a remote runtime. The artifact plane writes Markdown *to* a repo; the grounding plane feeds *graph evidence* to whatever agent is editing the repo, then emits a verdict the pipeline can enforce. + +## Section 1 — MCP-over-HTTP server: `packages/mcp-http/` + +`packages/mcp/` is stdio-only today — 28 tools, single process per repo. 
The remote form is a new sibling package that reuses the tool registry verbatim and swaps the transport. + +**Transport.** *Assumption: the current Anthropic MCP spec names the remote transport "Streamable HTTP" and provides `NodeStreamableHTTPServerTransport` in `@modelcontextprotocol/sdk`. That matches the docs I verified against the SDK's llms.txt on 2026-04-27.* We implement Streamable HTTP as the primary surface and keep SSE as a compatibility fallback for clients older than spec revision 2025-03-26. + +**Entrypoint.** `packages/mcp-http/src/server.ts` runs an Express app with a single POST `/mcp` route plus `/healthz` and `/.well-known/oauth-protected-resource`. Bearer auth is enforced at middleware level before the transport touches the request body. + +```typescript +// packages/mcp-http/src/server.ts (sketch, ~30 lines) +import express from "express"; +import { randomUUID } from "node:crypto"; +import { NodeStreamableHTTPServerTransport } from "@modelcontextprotocol/node"; +import { McpServer, isInitializeRequest } from "@modelcontextprotocol/server"; +import { registerAllTools } from "@opencodehub/mcp/registry"; +import { authMiddleware, AuthedRequest } from "./auth.js"; +import { rateLimit } from "./rate-limit.js"; +import { GraphCache } from "./graph-cache.js"; + +const app = express(); +app.use(express.json({ limit: "2mb" })); +app.use("/mcp", authMiddleware, rateLimit); +const cache = new GraphCache({ maxBytes: 2 * 1024 * 1024 * 1024 }); +const sessions = new Map(); + +app.post("/mcp", async (req: AuthedRequest, res) => { + const sid = req.headers["mcp-session-id"] as string | undefined; + if (sid && sessions.has(sid)) { + return sessions.get(sid)!.handleRequest(req, res, req.body); + } + if (!isInitializeRequest(req.body)) return res.status(400).json({ error: "missing session" }); + + const server = new McpServer({ name: "opencodehub", version: "0.3.0" }); + const graph = await cache.load(req.scope.repo, req.scope.graphHash); + 
+ registerAllTools(server, { graph, scope: req.scope }); // reuses packages/mcp registry + + const transport = new NodeStreamableHTTPServerTransport({ + sessionIdGenerator: () => randomUUID(), + onsessioninitialized: id => sessions.set(id, transport), + }); + transport.onclose = () => transport.sessionId && sessions.delete(transport.sessionId); + await server.connect(transport); + await transport.handleRequest(req, res, req.body); +}); + +app.listen(Number(process.env.PORT ?? 8787)); +``` + +**Auth.** Bearer JWT issued by a lightweight service (`packages/mcp-http/src/auth.ts`). Token payload: `{ install_id, repo, group?, pr_ref?, scope: "install" | "group" | "repo", allowlist: string[], exp }`. `allowlist` is the tool-name subset this token may invoke; absent means "all 28+3". Middleware: + +```typescript +// packages/mcp-http/src/auth.ts +export async function authMiddleware(req, res, next) { + const raw = (req.headers.authorization ?? "").replace(/^Bearer /, ""); + if (!raw) return res.status(401).json({ error: "missing bearer" }); + const claims = await verifyJwt(raw, process.env.OPENCODEHUB_JWT_PUBKEY!).catch(() => null); // verifyJwt is assumed to reject on a bad signature + if (!claims || claims.exp * 1000 < Date.now()) return res.status(401).json({ error: "invalid or expired" }); + req.scope = { + installId: claims.install_id, repo: claims.repo, group: claims.group, + prRef: claims.pr_ref, graphHash: req.headers["x-codehub-graph-hash"], + allowlist: new Set(claims.allowlist ?? []), + }; + next(); +} +``` + +The registry wrapper in `registerAllTools` consults `scope.allowlist` per call; tools outside the allowlist return a structured `method not allowed` error rather than being silently dropped. + +**Graph connection.** `GraphCache` is keyed by `{repo, graph_hash}`. Miss path: download `graph.duckdb` from the configured backend (S3, R2, GitHub Artifacts, or file://), open it read-only, and pin it in an LRU. 2 GB ceiling by default; spill to tmpfs above that. Cache-hit latency is the p50 target: sub-150 ms from POST `/mcp` to first tool result on a warm instance.
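
The LRU bookkeeping described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the `GraphCache` implementation: `GraphLru` and `GraphHandle` are hypothetical names, the download and DuckDB open steps are left to the caller, and the tmpfs spill path is not modeled.

```typescript
// Hypothetical sketch of a byte-bounded LRU keyed by repo + graph_hash.
// GraphHandle stands in for an opened read-only graph.duckdb; it is not a real API.
interface GraphHandle {
  path: string;
  bytes: number;
}

class GraphLru {
  // A Map iterates in insertion order, so the first key is the least recently used.
  private entries = new Map<string, GraphHandle>();
  private usedBytes = 0;
  private readonly maxBytes: number;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  private static key(repo: string, graphHash: string): string {
    return `${repo}@${graphHash}`;
  }

  get(repo: string, graphHash: string): GraphHandle | undefined {
    const k = GraphLru.key(repo, graphHash);
    const hit = this.entries.get(k);
    if (hit === undefined) return undefined;
    // Re-insert so the entry becomes the most recently used.
    this.entries.delete(k);
    this.entries.set(k, hit);
    return hit;
  }

  // Pins a handle and returns the keys evicted to stay under the ceiling.
  put(repo: string, graphHash: string, handle: GraphHandle): string[] {
    const k = GraphLru.key(repo, graphHash);
    const prev = this.entries.get(k);
    if (prev !== undefined) {
      this.usedBytes -= prev.bytes;
      this.entries.delete(k);
    }
    this.entries.set(k, handle);
    this.usedBytes += handle.bytes;

    const evicted: string[] = [];
    // Evict from the oldest end; the entry just added is last, so it is never evicted.
    while (this.usedBytes > this.maxBytes && this.entries.size > 1) {
      const oldest = this.entries.keys().next().value as string;
      const victim = this.entries.get(oldest) as GraphHandle;
      this.entries.delete(oldest);
      this.usedBytes -= victim.bytes;
      evicted.push(oldest);
    }
    return evicted;
  }
}

// Usage: a 2 GB ceiling, matching the default named in the text.
const graphLru = new GraphLru(2 * 1024 * 1024 * 1024);
graphLru.put("github.com/acme/payments-api", "sha256:8f3c", { path: "/tmp/graph.duckdb", bytes: 512 });
```

Keying on the pair means a new commit never overwrites the graph an in-flight session was grounded on; the old entry simply ages out.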
+ +**Rate limits.** Token-bucket per `install_id` (default 120 rpm, burst 20) and per `repo` (60 rpm). Excess returns HTTP 429 + `Retry-After` seconds. Limits are advertised in `/healthz` for preflight. + +**Surface.** All 28 existing tools (`list_repos`, `query`, `context`, `impact`, `detect_changes`, `rename`, `sql`, `owners`, `route_map`, `tool_map`, `list_findings`, `license_audit`, `group_contracts`, `group_query`, `group_status`, `group_sync`, `project_profile`, `verdict`, and the rest) stay byte-identical. Three new tools ship in this package: `grounding_pack`, `policy_evaluate`, `provenance_record`. + +## Section 2 — `grounding_pack` tool + +Composes the existing retrieval primitives into a single LLM-ready payload. Inputs: + +```json +{ + "repo": "github.com/acme/payments-api", + "task_description": "add rate limiting to the GraphQL mutation handlers", + "target_files": ["packages/api/src/graphql/mutations.ts"], + "max_tokens": 8192 +} +``` + +Output (truncated at `max_tokens` by pruning `relevant_symbols` first, then `prior_findings`): + +```json +{ + "graph_hash": "sha256:8f3c…", + "repo_profile": { + "summary": "Node 22 monorepo, GraphQL API over Postgres, 42 packages.", + "languages": {"typescript": 0.87, "sql": 0.08, "shell": 0.05}, + "entrypoints": ["packages/api/src/server.ts", "packages/worker/src/main.ts"] + }, + "relevant_symbols": [ + {"name": "createPayment", "kind": "function", + "path": "packages/api/src/graphql/mutations.ts", "loc": "L42-L91", + "summary": "Mutation resolver; calls PaymentService.create; no throttling."}, + {"name": "refundPayment", "kind": "function", + "path": "packages/api/src/graphql/mutations.ts", "loc": "L93-L140", + "summary": "Mutation resolver; calls PaymentService.refund."} + ], + "blast_radius": { + "upstream": [{"symbol": "graphqlServer", "path": "packages/api/src/server.ts"}], + "downstream": [{"symbol": "PaymentService.create", "path": "packages/core/src/payment.ts"}, + {"symbol": "metricsEmit", "path": 
"packages/obs/src/metrics.ts"}], + "tier": 2 + }, + "owners": [{"path": "packages/api/**", "owners": ["@api-team"]}, + {"path": "packages/core/**", "owners": ["@payments-core"]}], + "prior_findings": [ + {"rule_id": "no-unbounded-loops", "severity": "warning", + "path": "packages/api/src/graphql/mutations.ts", "summary": "L67 unbounded forEach over user input."} + ], + "group_contracts": null, + "arch_invariants": [ + {"name": "db-access-only-in-storage", "query": "MATCH (f:Function)-[:CALLS]->(:Module {name:'db'}) …", + "description": "Only packages/storage/** may touch db directly."} + ] +} +``` + +**Internal pipeline.** `grounding_pack` is pure composition over existing tools, no new retrieval: + +1. `project_profile({repo})` → `repo_profile`. +2. `query({repo, text: task_description, k: 20})` → candidate symbols, filtered by `target_files` when present. +3. For each top-k symbol: `context({repo, symbol})` for inbound/outbound refs and participating flows. +4. `impact({repo, targets: [...], depth: 2})` → union upstream/downstream; tier from the existing risk tiering in `packages/search`. +5. `owners({repo, paths: [...]})` → owners table. +6. `list_findings({repo, paths: [...]})` → `prior_findings`. +7. If token scope includes a group: `group_contracts({group, repo})` → `group_contracts`. +8. Read `opencodehub.policy.yaml#arch_invariants` entries verbatim → `arch_invariants`. + +`graph_hash` is stamped at step 1 and carried through; drift during the session surfaces as an SDK-side refusal (see 010). + +## Section 3 — `policy_evaluate` tool + +Inputs: `{repo, pr_ref: "base..head", policy_path?: "opencodehub.policy.yaml"}`. 
+ +Output: + +```json +{ + "graph_hash": "sha256:8f3c…", + "pr_ref": "main..feat/rate-limit", + "overall": "fail", + "rules": [ + {"id": "no-direct-db-access", "type": "arch_invariant", "outcome": "pass", + "evidence": {"matched_rows": 0}, "blocked_merge": false}, + {"id": "disallow-gpl", "type": "license", "outcome": "pass", + "evidence": {"new_deps": []}, "blocked_merge": false}, + {"id": "blast-radius-tier", "type": "blast_radius", "outcome": "fail", + "evidence": {"tier": 1, "touched": ["packages/core/src/payment.ts"]}, + "blocked_merge": true}, + {"id": "require-owner-approval", "type": "ownership", "outcome": "needs-review", + "evidence": {"paths": ["packages/storage/**"], "required": ["@storage-team"]}, + "blocked_merge": false} + ], + "auto_approve": false, + "required_reviewers": ["@storage-team"] +} +``` + +**Overall resolution.** Any `fail` with `blocked_merge: true` → `fail`. Else any `needs-review` or `fail` → `needs-review`. Else `pass`. `auto_approve` is `overall === "pass"` AND the policy's `auto_approve.require` clauses all match. In the example above, `blast-radius-tier` fails with `blocked_merge: true`, so `overall` resolves to `fail` even though the ownership rule alone would only yield `needs-review`. + +**Compilation.** The policy YAML is parsed once per call, and each rule is compiled to a tool call: + +| rule `type` | compiles to | +|------------------|-----------------------------------------------------------------------------| +| `arch_invariant` | `sql` tool with rule's Cypher-over-DuckDB query, row count ≥ 1 ⇒ fail | +| `license` | `license_audit({repo, pr_ref})` filtered by `deny` list | +| `ownership` | `owners({repo, paths})` × `detect_changes({pr_ref})` intersection | +| `blast_radius` | `detect_changes` → `impact(depth=2)` → tier compared against threshold | + +## Section 4 — `opencodehub.policy.yaml` schema + +Committed to the repo root (or group root for monorepos). 
Realistic example: + +```yaml +version: 1 +auto_approve: + require: + - blast_radius.tier: ">= 3" # tier 1 = highest risk, 5 = lowest + - license_audit.violations: 0 + - findings.severity_critical: 0 +rules: + - id: no-direct-db-access + type: arch_invariant + severity: error + query: | + MATCH (n:Function)-[:CALLS]->(db:Module {name:'db'}) + WHERE NOT n.path STARTS WITH 'packages/storage/' + RETURN n.path, n.name + - id: disallow-gpl + type: license + severity: error + deny: ["GPL-3.0", "AGPL-3.0", "SSPL-1.0"] + - id: require-owner-approval + type: ownership + severity: warning + paths: ["packages/storage/**", "packages/core/src/payment.ts"] + require_approval_from: ["@storage-team", "@payments-core"] + - id: blast-radius-tier + type: blast_radius + severity: error + max_tier: 2 # fail if touched symbols land at tier 1 or 2 + depth: 2 +``` + +**Rule-type reference.** + +- **`arch_invariant`.** Input: `query` (Cypher-over-DuckDB string), optional `allow_rows: int`. Compiles to `sql({repo, query, readonly: true})`. Pass when `rows.length <= allow_rows` (default 0). +- **`license`.** Input: `deny: string[]` (SPDX ids), optional `allow: string[]`. Compiles to `license_audit({repo, pr_ref})`, filter `.violations[].spdx ∈ deny`. Pass when filtered length is 0. +- **`ownership`.** Input: `paths: glob[]`, `require_approval_from: string[]`. Compiles to `owners({repo, paths})` intersected with `detect_changes({pr_ref})` file set. Pass when no intersection; `needs-review` when intersection is non-empty (verdict-action posts the required-reviewers list as a PR comment). +- **`blast_radius`.** Input: `max_tier: 1..5`, `depth: int` (default 2). Compiles to `detect_changes({pr_ref})` → `impact({targets, depth})` → worst tier across touched symbols. Pass when `tier > max_tier` (higher tier = lower risk in OpenCodeHub's convention). 
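The pass conditions in the rule-type reference, combined with the overall-resolution order stated in Section 3, can be sketched in a few lines. This is a minimal illustration rather than the server's implementation: `RuleOutcome`, `blast_radius_outcome`, and `resolve_overall` are hypothetical names introduced here, and only the comparison logic is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleOutcome:
    """Hypothetical mirror of one entry in the policy_evaluate `rules` array."""
    id: str
    outcome: str          # "pass" | "fail" | "needs-review"
    blocked_merge: bool


def blast_radius_outcome(worst_tier: int, max_tier: int) -> str:
    """Tier 1 is the highest risk, so the rule passes only when tier > max_tier."""
    return "pass" if worst_tier > max_tier else "fail"


def resolve_overall(rules: list[RuleOutcome]) -> str:
    """Resolution order: a blocking fail wins, then needs-review, then pass."""
    if any(r.outcome == "fail" and r.blocked_merge for r in rules):
        return "fail"
    if any(r.outcome in ("fail", "needs-review") for r in rules):
        return "needs-review"
    return "pass"


if __name__ == "__main__":
    rules = [
        RuleOutcome("blast-radius-tier", blast_radius_outcome(1, 2), True),
        RuleOutcome("require-owner-approval", "needs-review", False),
    ]
    print(resolve_overall(rules))
```

Keeping the resolution a pure function of the rule outcomes means the same logic could be reused by the action and by the SDK without another round trip, assuming both receive the full `rules` array.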
+ +## Section 5 — `opencodehub/analyze-action@v1` + +```yaml +# action.yml +name: OpenCodeHub Analyze +description: Build a code graph and publish it to a storage backend. +inputs: + repo-path: { description: "Path to checkout", required: false, default: "." } + storage-backend: { description: "s3 | r2 | artifact | local", required: true } + bucket: { description: "Bucket (s3/r2)", required: false } + prefix: { description: "Key prefix", required: false, default: "codehub" } + codehub-version: { description: "npm tag", required: false, default: "latest" } +outputs: + graph-hash: { description: "sha256 of graph.duckdb" } + graph-url: { description: "Backend-resolvable URL or artifact id" } +runs: + using: composite + steps: + - uses: actions/setup-node@v4 + with: { node-version: "22" } + - name: Install codehub + shell: bash + run: npm i -g @opencodehub/cli@${{ inputs.codehub-version }} + - name: Analyze + shell: bash + working-directory: ${{ inputs.repo-path }} + run: codehub analyze --emit-hash + - name: Publish graph + id: publish + shell: bash + env: + STORAGE: ${{ inputs.storage-backend }} + BUCKET: ${{ inputs.bucket }} + PREFIX: ${{ inputs.prefix }} + run: codehub publish --backend "$STORAGE" --bucket "$BUCKET" --prefix "$PREFIX" --out "$GITHUB_OUTPUT" +``` + +Workflow usage: + +```yaml +jobs: + analyze: + runs-on: ubuntu-latest + outputs: + graph-url: ${{ steps.ch.outputs.graph-url }} + graph-hash: ${{ steps.ch.outputs.graph-hash }} + steps: + - uses: actions/checkout@v4 + - id: ch + uses: opencodehub/analyze-action@v1 + with: + storage-backend: s3 + bucket: acme-codehub-graphs +``` + +## Section 6 — `opencodehub/verdict-action@v1` + +```yaml +name: OpenCodeHub Verdict +description: Evaluate a policy against a PR and post a GitHub Check. 
+inputs: + graph-url: { required: true } + pr-ref: { required: true } + policy-path: { required: false, default: "opencodehub.policy.yaml" } + endpoint: { required: false, default: "https://mcp.opencodehub.dev" } + token: { required: true } +outputs: + verdict: { description: "pass | needs-review | fail" } + auto-approve-eligible: { description: "true | false" } +runs: + using: composite + steps: + - name: Fetch graph + shell: bash + run: codehub fetch-graph --url "${{ inputs.graph-url }}" --out "$RUNNER_TEMP/graph.duckdb" + - name: Evaluate + id: eval + shell: bash + env: + OPENCODEHUB_ENDPOINT: ${{ inputs.endpoint }} + OPENCODEHUB_TOKEN: ${{ inputs.token }} + run: codehub mcp call policy_evaluate + --repo "$GITHUB_REPOSITORY" + --pr-ref "${{ inputs.pr-ref }}" + --policy-path "${{ inputs.policy-path }}" + --out "$GITHUB_OUTPUT" + - name: Post Check + uses: actions/github-script@v7 + with: + script: | + const verdict = JSON.parse(process.env.VERDICT_JSON); + await github.rest.checks.create({ + owner: context.repo.owner, repo: context.repo.repo, + name: "opencodehub/verdict", head_sha: context.payload.pull_request.head.sha, + status: "completed", + conclusion: verdict.overall === "pass" ? "success" : verdict.overall === "fail" ? "failure" : "neutral", + output: { title: `OpenCodeHub: ${verdict.overall}`, summary: renderMd(verdict) }, + }); +``` + +## Section 7 — Grounding provenance: `.opencodehub/grounding.json` + +Committed to the PR branch by the agent (via `provenance_record` tool). One file per PR under `.opencodehub/grounding.json`; `.opencodehub/history/.json` for historical PRs. 
+ +```json +{ + "$schema": "https://opencodehub.dev/schemas/grounding.v1.json", + "schema_version": 1, + "agent_identity": {"runtime": "claude-agent-sdk", "model": "claude-opus-4-7", "run_id": "cr_01HXZ…"}, + "graph_hash": "sha256:8f3c…", + "tools_called": [ + {"name": "grounding_pack", "at": "2026-04-27T14:12:03Z", "input_digest": "sha256:…", "output_digest": "sha256:…"}, + {"name": "impact", "at": "2026-04-27T14:12:41Z", "input_digest": "sha256:…", "output_digest": "sha256:…"} + ], + "policy_result": {"overall": "needs-review", "rules": [ /* as in policy_evaluate output */ ]}, + "generated_at": "2026-04-27T14:13:02Z" +} +``` + +JSON Schema sketch (`packages/core-types/src/schemas/grounding.v1.json`): + +```json +{ + "$id": "https://opencodehub.dev/schemas/grounding.v1.json", + "type": "object", + "required": ["schema_version", "agent_identity", "graph_hash", "tools_called", "generated_at"], + "properties": { + "schema_version": {"const": 1}, + "agent_identity": {"type": "object", + "required": ["runtime", "model"], + "properties": {"runtime": {"type": "string"}, "model": {"type": "string"}, "run_id": {"type": "string"}}}, + "graph_hash": {"type": "string", "pattern": "^sha256:[0-9a-f]{64}$"}, + "tools_called": {"type": "array", "items": {"type": "object", + "required": ["name", "at", "input_digest", "output_digest"]}}, + "policy_result": {"type": "object"}, + "generated_at": {"type": "string", "format": "date-time"} + } +} +``` + +Signing is P2 — detached JWS over the canonical JSON, public key resolved from the install's GitHub App. For v1 the manifest is unsigned but content-addressed via the digests, which is sufficient for audit and correlation with CI logs. + +--- + +This closes the public surface. 010 covers the SDK agents drop into their framework; 011 wires the two actions above into a copy-pasteable workflow playbook. 
diff --git a/.erpaval/brainstorms/010-agent-sdk-design.md b/.erpaval/brainstorms/010-agent-sdk-design.md new file mode 100644 index 00000000..ea2d0f3e --- /dev/null +++ b/.erpaval/brainstorms/010-agent-sdk-design.md @@ -0,0 +1,361 @@ +# 010 — `@opencodehub/agent-sdk`: The Thin Grounding Wrapper + +*Draft: 2026-04-27. Inputs: 009 (remote MCP surface, `grounding_pack` / `policy_evaluate` / `provenance_record`). This memo designs the client-side SDK any agent framework drops in. Target audience: framework authors and in-house agent teams.* + +Agent frameworks don't need a new retrieval system. They need a single call that returns a grounded prompt block, a single call that returns a merge verdict, and a context manager that writes the provenance manifest on exit. The SDK is thin on purpose — any intelligence lives server-side in `packages/mcp-http/`. + +## Package layout + +- `packages/agent-sdk-py/` → published as `opencodehub-agent-sdk` on PyPI. +- `packages/agent-sdk-ts/` → published as `@opencodehub/agent-sdk` on npm. + +Python is primary because Claude Agent SDK and LangGraph are Python-first; TypeScript is secondary for Vercel AI SDK and LangGraph JS. Both expose the same surface in idiomatic form. 
+ +```bash +pip install opencodehub-agent-sdk +pnpm add @opencodehub/agent-sdk +``` + +## Python core API + +```python +# opencodehub_agent_sdk/grounding.py +from contextlib import asynccontextmanager +from pydantic import BaseModel, Field +from datetime import datetime + +class Symbol(BaseModel): + name: str + kind: str + path: str + loc: str + summary: str + +class BlastRadius(BaseModel): + upstream: list[dict] + downstream: list[dict] + tier: int = Field(ge=1, le=5) + +class GroundingResult(BaseModel): + graph_hash: str + repo_profile: dict + relevant_symbols: list[Symbol] + blast_radius: BlastRadius + owners: list[dict] + prior_findings: list[dict] + group_contracts: list[dict] | None = None + arch_invariants: list[dict] = Field(default_factory=list) + + def as_system_block(self) -> str: + """Render the grounded prompt block (see Section: Prompt injection).""" + ... + +class VerdictRule(BaseModel): + id: str + type: str + outcome: str # "pass" | "fail" | "needs-review" + evidence: dict + blocked_merge: bool + +class VerdictResult(BaseModel): + graph_hash: str + pr_ref: str + overall: str # "pass" | "fail" | "needs-review" + rules: list[VerdictRule] + auto_approve: bool + required_reviewers: list[str] + +class ToolCall(BaseModel): + name: str + at: datetime + input_digest: str + output_digest: str + +class Grounding: + def __init__( + self, + endpoint: str, + token: str, + repo: str, + group: str | None = None, + strict: bool = True, + ) -> None: + self.endpoint, self.token, self.repo, self.group, self.strict = ( + endpoint, token, repo, group, strict + ) + self._client = _McpHttpClient(endpoint, token) + self._session_graph_hash: str | None = None + self._calls: list[ToolCall] = [] + + async def ground( + self, + task: str, + target_files: list[str] | None = None, + max_tokens: int = 8192, + ) -> GroundingResult: + result = await self._client.call("grounding_pack", { + "repo": self.repo, "task_description": task, + "target_files": target_files, "max_tokens": 
max_tokens, + }) + if self._session_graph_hash is None: + self._session_graph_hash = result["graph_hash"] + elif self.strict and result["graph_hash"] != self._session_graph_hash: + raise GraphDriftError(self._session_graph_hash, result["graph_hash"]) + self._record("grounding_pack", result) + return GroundingResult.model_validate(result) + + async def verdict( + self, pr_ref: str, policy_path: str | None = None, + ) -> VerdictResult: + result = await self._client.call("policy_evaluate", { + "repo": self.repo, "pr_ref": pr_ref, "policy_path": policy_path, + }) + self._record("policy_evaluate", result) + return VerdictResult.model_validate(result) + + async def record_provenance( + self, pr_ref: str, grounding: GroundingResult, + tools_called: list[ToolCall], + ) -> None: + await self._client.call("provenance_record", { + "repo": self.repo, "pr_ref": pr_ref, + "graph_hash": grounding.graph_hash, + "tools_called": [t.model_dump(mode="json") for t in tools_called], + }) + + @asynccontextmanager + async def session(self, pr_ref: str): + sess = GroundingSession(self, pr_ref) + try: + yield sess + finally: + if sess.last_grounding is not None: + await self.record_provenance(pr_ref, sess.last_grounding, list(self._calls)) + +class GroundingSession: + def __init__(self, parent: Grounding, pr_ref: str) -> None: + self._g, self.pr_ref = parent, pr_ref + self.last_grounding: GroundingResult | None = None + + async def ground(self, task: str, **kw) -> GroundingResult: + self.last_grounding = await self._g.ground(task, **kw) + return self.last_grounding + + async def verdict(self) -> VerdictResult: + return await self._g.verdict(self.pr_ref) +``` + +`_record` appends to `self._calls` with input/output SHA-256 digests so that the session exit can reconstruct the provenance manifest without replaying tools. `GraphDriftError` fires when the index moves mid-session; agents can catch it, re-ground, or override with `strict=False`. 
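`_record` is referenced above but not shown. A minimal sketch of what it might compute follows, under stated assumptions: the canonical form is taken to be JSON with sorted keys and compact separators (the memo does not define it), the `sha256:` prefix mirrors the `graph_hash` format, and `canonical_digest` and `make_tool_call` are hypothetical helper names. Unlike the call sites above, which pass only the result, this version also takes the tool input so that both digests can be derived.

```python
import hashlib
import json
from datetime import datetime, timezone


def canonical_digest(payload: object) -> str:
    """SHA-256 over an assumed canonical JSON encoding (sorted keys, compact)."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(encoded).hexdigest()


def make_tool_call(name: str, tool_input: dict, tool_output: dict) -> dict:
    """Build one ledger entry with the same four fields as the ToolCall model."""
    return {
        "name": name,
        "at": datetime.now(timezone.utc).isoformat(),
        "input_digest": canonical_digest(tool_input),
        "output_digest": canonical_digest(tool_output),
    }


if __name__ == "__main__":
    entry = make_tool_call(
        "grounding_pack",
        {"repo": "github.com/acme/payments-api", "max_tokens": 8192},
        {"graph_hash": "sha256:example"},
    )
    print(json.dumps(entry, indent=2))
```

Sorting keys before hashing makes the digest independent of dict ordering, so two calls with the same logical input produce the same `input_digest`.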
+ +TypeScript mirrors this surface: `class Grounding`, `async ground()`, `async verdict()`, `async withSession(prRef, async (session) => { … })`. Types are generated from the same JSON schemas that the server uses. + +## Integration examples + +### 1. Claude Agent SDK (Python) + +```python +# agent.py +import os, asyncio +from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions +from opencodehub_agent_sdk import Grounding + +async def main(): + g = Grounding( + endpoint=os.environ["OPENCODEHUB_ENDPOINT"], + token=os.environ["OPENCODEHUB_TOKEN"], + repo=os.environ["GITHUB_REPOSITORY"], + ) + async with g.session(pr_ref=os.environ["GITHUB_PR_REF"]) as sess: + pack = await sess.ground( + task="add rate limiting to the GraphQL mutation handlers", + target_files=["packages/api/src/graphql/mutations.ts"], + ) + + opts = ClaudeAgentOptions( + model="claude-opus-4-7", + system_prompt=f"{DEFAULT_SYSTEM}\n\n{pack.as_system_block()}", + allowed_tools=["Read", "Edit", "Bash"], + ) + async with ClaudeSDKClient(options=opts) as client: + await client.query("Implement the task described in the grounding block.") + async for msg in client.receive_response(): + print(msg) + + verdict = await sess.verdict() + if verdict.overall == "fail": + raise SystemExit(f"policy failed: {verdict.rules}") + +asyncio.run(main()) +``` + +### 2. 
Vercel AI SDK (TypeScript) + +```typescript +// agent.ts +import { writeFile } from "node:fs/promises"; +import { generateText } from "ai"; +import { anthropic } from "@ai-sdk/anthropic"; +import { Grounding } from "@opencodehub/agent-sdk"; + +const g = new Grounding({ + endpoint: process.env.OPENCODEHUB_ENDPOINT!, + token: process.env.OPENCODEHUB_TOKEN!, + repo: process.env.GITHUB_REPOSITORY!, +}); + +await g.withSession(process.env.GITHUB_PR_REF!, async (sess) => { + const pack = await sess.ground({ + task: "add rate limiting to the GraphQL mutation handlers", + targetFiles: ["packages/api/src/graphql/mutations.ts"], + }); + + const { text } = await generateText({ + model: anthropic("claude-opus-4-7"), + system: pack.asSystemBlock(), + prompt: "Produce a unified diff implementing the task.", + }); + await writeFile(".opencodehub/plan.diff", text); + + const verdict = await sess.verdict(); + if (verdict.overall === "fail") process.exit(1); +}); +``` + +### 3. Framework-agnostic OpenAI tool loop (Python) + +```python +# openai_loop.py +import json, os +from openai import OpenAI +from opencodehub_agent_sdk import Grounding, ToolCall +from datetime import datetime, UTC + +client = OpenAI() +g = Grounding(endpoint=os.environ["OPENCODEHUB_ENDPOINT"], + token=os.environ["OPENCODEHUB_TOKEN"], + repo=os.environ["GITHUB_REPOSITORY"]) + +async def run(task: str, pr_ref: str): + async with g.session(pr_ref) as sess: + pack = await sess.ground(task=task) + messages = [ + {"role": "system", "content": pack.as_system_block()}, + {"role": "user", "content": task}, + ] + while True: + resp = client.chat.completions.create( + model="gpt-4.1", messages=messages, + tools=[{"type": "function", "function": {"name": "edit_file", + "parameters": {"type": "object", + "properties": {"path": {"type": "string"}, + "patch": {"type": "string"}}}}}]) + choice = resp.choices[0] + if choice.finish_reason == "stop": + break + # The assistant turn carrying tool_calls must precede the tool results. + messages.append(choice.message) + for call in choice.message.tool_calls or []: + # apply_patch is a user-supplied helper, not part of the SDK. + apply_patch(json.loads(call.function.arguments)) + 
messages.append({"role": "tool", "tool_call_id": call.id, "content": "ok"}) + + verdict = await sess.verdict() + return verdict +``` + +### 4. LangGraph node (Python) + +```python +# langgraph_nodes.py +from langgraph.graph import StateGraph +from opencodehub_agent_sdk import Grounding, GroundingResult + +class GroundingNode: + def __init__(self, grounding: Grounding) -> None: + self.g = grounding + + async def __call__(self, state: dict) -> dict: + task = state["task"] + pack: GroundingResult = await self.g.ground( + task=task, target_files=state.get("target_files"), + ) + return {**state, "grounding": pack, "system_prompt": pack.as_system_block()} + +class VerdictNode: + def __init__(self, grounding: Grounding) -> None: + self.g = grounding + + async def __call__(self, state: dict) -> dict: + v = await self.g.verdict(pr_ref=state["pr_ref"]) + return {**state, "verdict": v, "should_merge": v.auto_approve} + +graph = StateGraph(dict) +graph.add_node("ground", GroundingNode(grounding)) +graph.add_node("plan", plan_node) # user-defined LLM node +graph.add_node("execute", execute_node) +graph.add_node("verdict", VerdictNode(grounding)) +graph.add_edge("ground", "plan") +graph.add_edge("plan", "execute") +graph.add_edge("execute", "verdict") +``` + +## Prompt injection pattern + +`GroundingResult.as_system_block()` produces clean Markdown that LLMs parse reliably: + +```markdown +# Repository grounding (OpenCodeHub) + +You are editing **github.com/acme/payments-api** (graph_hash `sha256:8f3c…`). +Node 22 monorepo, GraphQL API over Postgres, 42 packages. +Entrypoints: `packages/api/src/server.ts`, `packages/worker/src/main.ts`. + +## Task +Add rate limiting to the GraphQL mutation handlers. + +## Relevant symbols (top 2) +- `createPayment` — function at `packages/api/src/graphql/mutations.ts:L42-L91`. + Mutation resolver; calls `PaymentService.create`; no throttling. +- `refundPayment` — function at `packages/api/src/graphql/mutations.ts:L93-L140`. 
+ Mutation resolver; calls `PaymentService.refund`. + +## Blast radius — tier 2 (high) +Touching these files affects 1 upstream and 2 downstream symbols. +- Upstream: `graphqlServer` (packages/api/src/server.ts) +- Downstream: `PaymentService.create` (packages/core/src/payment.ts), + `metricsEmit` (packages/obs/src/metrics.ts) + +## Owners to notify +- `packages/api/**` → @api-team +- `packages/core/**` → @payments-core + +## Prior findings on touched files +- [warning] **no-unbounded-loops** at `packages/api/src/graphql/mutations.ts` + L67 unbounded forEach over user input. + +## Architectural invariants (must not violate) +- **db-access-only-in-storage** — only `packages/storage/**` may touch `db` directly. + +## Rules for your output +1. Do not modify files outside `packages/api/**` without explicit owner approval. +2. Preserve the listed invariants. Your plan will be re-evaluated by `policy_evaluate` before merge. +3. Cite file paths and line ranges you touched in your final summary. +``` + +Sections are elided when empty (no group contracts in this example). The block is stable across calls so prompt caches hit. + +## Auth flow + +1. Org installs the **OpenCodeHub GitHub App** on the relevant repos/groups. +2. At workflow start, the `opencodehub/verdict-action@v1` action exchanges the GitHub OIDC token for a short-lived OpenCodeHub JWT against the auth service. Scope = `(install_id, repo, pr_ref)`, TTL 60 minutes. +3. The JWT lands in the workflow env as `OPENCODEHUB_TOKEN`. The SDK reads it on construction. +4. The SDK passes `Authorization: Bearer ` on every MCP call plus `X-Codehub-Graph-Hash` when the caller wants to pin a specific graph version. + +No long-lived secrets in workflows. Token rotation is automatic because CI re-runs mint fresh tokens. + +## Triggers and telemetry + +- Every `ground()` / `verdict()` call appends a `ToolCall` record to the in-memory ledger with input/output SHA-256 digests. 
+- `graph_hash` is captured on first `ground()`; subsequent calls compare and raise `GraphDriftError` under `strict=True` (default). This maps the "reproducibility boundary" contract from 005 onto the remote plane — if the index moved, the session is not reproducible and the agent must decide. +- On session exit (`__aexit__`), `record_provenance()` fires, writing the manifest described in 009 §7. +- A `Grounding(debug=True)` constructor flag emits OTel spans (`otel.semconv: llm.*`) per MCP call for observability stacks that already sample them. + +--- + +This is a 500-line implementation at most. The complexity is on the server (009) and in the CI playbook (011). The SDK's job is to make the pattern a two-line import for any agent author. diff --git a/.erpaval/brainstorms/011-ci-integration-playbook.md b/.erpaval/brainstorms/011-ci-integration-playbook.md new file mode 100644 index 00000000..063b1ab4 --- /dev/null +++ b/.erpaval/brainstorms/011-ci-integration-playbook.md @@ -0,0 +1,233 @@ +# 011 — CI/CD Integration Playbook + +*Draft: 2026-04-27. Inputs: 009 (remote MCP surface + both actions), 010 (agent SDK). This memo is copy-pasteable: an org drops these three workflows into `.github/workflows/` and they have the grounding plane wired to PRs and main-branch pushes.* + +The playbook assumes the two actions from 009 are published (`opencodehub/analyze-action@v1`, `opencodehub/verdict-action@v1`) and the GitHub App is installed. Per 001 § "offline-safe by SPECS.md" and 009 §5, the storage backend is user-selectable so air-gapped orgs can pin to on-prem object storage. + +## Workflow 1 — `opencodehub-analyze.yml` + +Builds the graph on every push to default and every PR sync. Concurrency grouped per ref so the latest commit wins. 
+ +```yaml +# .github/workflows/opencodehub-analyze.yml +name: opencodehub-analyze +on: + push: + branches: [main] + pull_request: + types: [opened, synchronize, reopened] +concurrency: + group: codehub-analyze-${{ github.ref }} + cancel-in-progress: true +permissions: + contents: read + id-token: write # for OIDC → JWT exchange +jobs: + analyze: + runs-on: ubuntu-latest + outputs: + graph-url: ${{ steps.ch.outputs.graph-url }} + graph-hash: ${{ steps.ch.outputs.graph-hash }} + steps: + - uses: actions/checkout@v4 + with: { fetch-depth: 0 } # full history for detect_changes + - uses: actions/setup-node@v4 + with: { node-version: "22" } + - id: ch + uses: opencodehub/analyze-action@v1 + with: + storage-backend: s3 + bucket: ${{ vars.CODEHUB_BUCKET }} + prefix: graphs/${{ github.repository }} + - name: Annotate + run: echo "graph_hash=${{ steps.ch.outputs.graph-hash }}" >> "$GITHUB_STEP_SUMMARY" +``` + +Runtime on a typical 200k-LOC TypeScript monorepo is 2-4 minutes cold, <90 s on warm incremental (the phase DAG in `packages/ingestion/` skips unchanged files). The graph is keyed in storage by `{repo, sha}` so downstream jobs look it up by commit. + +## Workflow 2 — `opencodehub-verdict.yml` + +Runs `policy_evaluate`, posts a GitHub Check, labels the PR with the verdict tier. + +```yaml +# .github/workflows/opencodehub-verdict.yml +name: opencodehub-verdict +on: + pull_request: + types: [opened, synchronize, reopened] +concurrency: + group: codehub-verdict-${{ github.event.pull_request.number }} + cancel-in-progress: true +permissions: + contents: read + pull-requests: write + checks: write + issues: write + id-token: write +jobs: + verdict: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - name: Resolve graph URL for PR head + id: resolve + env: + REPO: ${{ github.repository }} + SHA: ${{ github.event.pull_request.head.sha }} + run: | + url="s3://${{ vars.CODEHUB_BUCKET }}/graphs/${REPO}/${SHA}.duckdb" + if ! 
aws s3 ls "$url" > /dev/null; then + echo "miss=true" >> "$GITHUB_OUTPUT" + else + echo "miss=false" >> "$GITHUB_OUTPUT" + echo "url=$url" >> "$GITHUB_OUTPUT" + fi + - name: Force re-analyze on cache miss + if: steps.resolve.outputs.miss == 'true' + id: reanalyze + uses: opencodehub/analyze-action@v1 + with: + storage-backend: s3 + bucket: ${{ vars.CODEHUB_BUCKET }} + prefix: graphs/${{ github.repository }} + - name: Mint OpenCodeHub JWT + id: token + uses: opencodehub/token-action@v1 + with: + endpoint: https://auth.opencodehub.dev + - id: v + uses: opencodehub/verdict-action@v1 + with: + graph-url: ${{ steps.resolve.outputs.url || steps.reanalyze.outputs.graph-url }} + pr-ref: ${{ github.event.pull_request.base.ref }}..${{ github.event.pull_request.head.ref }} + policy-path: opencodehub.policy.yaml + token: ${{ steps.token.outputs.jwt }} + - name: Label PR + uses: actions/github-script@v7 + env: + VERDICT: ${{ steps.v.outputs.verdict }} + TIER: ${{ steps.v.outputs.tier }} + with: + script: | + const labels = [`codehub/verdict:${process.env.VERDICT}`, `codehub/tier:${process.env.TIER}`]; + await github.rest.issues.addLabels({ + ...context.repo, issue_number: context.payload.pull_request.number, labels, + }); +``` + +Timing target: <30 s p50 from PR synchronize to posted check, assuming warm cache and a policy with 5-10 rules. The `policy_evaluate` tool parallelizes rule execution server-side. + +## Workflow 3 — `opencodehub-auto-merge.yml` + +Consumes verdict-action's `auto-approve-eligible` output, checks required reviewers are satisfied, enables auto-merge through the GitHub CLI. 
+ +```yaml +# .github/workflows/opencodehub-auto-merge.yml +name: opencodehub-auto-merge +on: + pull_request_review: + types: [submitted] + check_run: + types: [completed] +permissions: + pull-requests: write + contents: write + checks: read +jobs: + maybe-auto-merge: + runs-on: ubuntu-latest + if: github.event.check_run.name == 'opencodehub/verdict' || github.event_name == 'pull_request_review' + steps: + - uses: actions/checkout@v4 + - name: Read verdict from check + id: read + # Expressions inside a YAML flow mapping must be quoted. + env: { GH_TOKEN: "${{ secrets.GITHUB_TOKEN }}" } + run: | + pr=$(gh pr list --head "${{ github.event.pull_request.head.ref }}" --json number -q '.[0].number') + eligible=$(gh pr checks "$pr" --json name,summary -q \ + '.[] | select(.name=="opencodehub/verdict") | .summary' \ + | jq -r '.auto_approve_eligible') + echo "eligible=$eligible" >> "$GITHUB_OUTPUT" + echo "pr=$pr" >> "$GITHUB_OUTPUT" + - name: Enable auto-merge + if: steps.read.outputs.eligible == 'true' && !contains(github.event.pull_request.labels.*.name, 'codehub/hold') + env: { GH_TOKEN: "${{ secrets.GITHUB_TOKEN }}" } + run: | + gh pr merge --auto --squash "${{ steps.read.outputs.pr }}" + gh pr edit "${{ steps.read.outputs.pr }}" --add-label auto-merge +``` + +**Human overrides.** Two escape hatches: the `codehub/hold` label blocks auto-merge for that PR; `opencodehub.policy.yaml#auto_approve.require` can be edited to raise the bar globally. To cancel a pending auto-merge, run `gh pr merge --disable-auto` on the PR; removing the `auto-merge` label only updates the label. + +## Group mode — monorepo with declared groups + +Per 006 § group-mode and 001 "group contracts are the moat", orgs with `group_list` declarations fan out analyze per repo and run one consolidated verdict. 
+ +```yaml +# .github/workflows/opencodehub-group-verdict.yml +name: opencodehub-group-verdict +on: + pull_request: + types: [opened, synchronize, reopened] +permissions: + contents: read + id-token: write # for OIDC → JWT exchange +jobs: + discover: + runs-on: ubuntu-latest + # Expressions inside a YAML flow mapping must be quoted. + outputs: { repos: "${{ steps.l.outputs.repos }}" } + steps: + - uses: actions/checkout@v4 + - id: l + # -I=0 keeps the JSON on one line, as $GITHUB_OUTPUT requires. + run: echo "repos=$(yq -o=json -I=0 '.repos' opencodehub.group.yaml)" >> "$GITHUB_OUTPUT" + analyze: + needs: discover + runs-on: ubuntu-latest + strategy: + matrix: { repo: "${{ fromJSON(needs.discover.outputs.repos) }}" } + steps: + - uses: actions/checkout@v4 + with: { repository: "${{ matrix.repo }}", path: "repos/${{ matrix.repo }}" } + - uses: opencodehub/analyze-action@v1 + with: { repo-path: "repos/${{ matrix.repo }}", storage-backend: s3, + bucket: "${{ vars.CODEHUB_BUCKET }}" } + verdict: + needs: analyze + runs-on: ubuntu-latest + steps: + - name: Mint OpenCodeHub JWT # same step as in Workflow 2 + id: token + uses: opencodehub/token-action@v1 + with: + endpoint: https://auth.opencodehub.dev + - uses: opencodehub/verdict-action@v1 + with: + graph-url: group://${{ vars.CODEHUB_BUCKET }}/${{ github.event.pull_request.head.sha }} + pr-ref: ${{ github.event.pull_request.base.ref }}..${{ github.event.pull_request.head.ref }} + policy-path: opencodehub.group.policy.yaml + token: ${{ steps.token.outputs.jwt }} +``` + +The `group://` URL scheme tells the server to load all member-repo graphs and run `group_contracts` rules (see 009 §4 — `arch_invariant` rules can name `MATCH` patterns that cross repo boundaries). This is the one rule class impossible on a single-repo tool. + +## Self-hosted runner considerations + +- **Storage.** GitHub Artifacts caps at 500 MB per artifact and 90-day retention, which saturates on repos above ~1 M LOC once you add multiple SHAs. Recommend MinIO or Cloudflare R2 for any org with more than 20 active repos. S3 Intelligent-Tiering handles cold graphs cheaply. +- **JWT minting.** The OpenCodeHub GitHub App lives in the org. 
On-prem orgs either (a) run the auth service themselves — `packages/mcp-http/src/auth.ts` exports a standalone `opencodehub-auth` binary — or (b) mint JWTs from their existing IdP with the `codehub:*` claim scope that the server verifies. +- **Air-gap pattern.** Point the action at the on-prem endpoint via `with: endpoint: https://codehub.internal.corp`. The action image bundles no code that reaches out to opencodehub.dev; all telemetry defaults are off. The storage backend can be a local mount on the self-hosted runner: `storage-backend: local` with `prefix: /srv/codehub/graphs`. + +## Failure modes and fallbacks + +| Failure | Behavior | +|---------------------------------------------|--------------------------------------------------------------------------| +| Graph not in cache for PR head SHA | `verdict.yml` runs `analyze-action` inline before evaluating (step shown) | +| `policy_evaluate` exceeds action timeout | Action exits with `verdict=needs-review` (not `fail`) + posts a warning | +| `opencodehub.policy.yaml` has invalid YAML | Action fails loud with `line: N, col: M, msg: …`; does not post a check | +| MCP-HTTP endpoint 5xx | SDK retries with exponential backoff (3 tries, 250/500/1000 ms) | +| Graph hash drift mid-session (agent mode) | `GraphDriftError` per 010; agent re-grounds or sets `strict=False` | +| GitHub App lost install permissions | Token action fails with a clear message to reinstall the app | + +Loud failure on policy syntax is deliberate — a silent drop would let a misconfigured gate look like a pass. Timeout → `needs-review` is also deliberate: blocking merges on transient MCP unavailability punishes the user for our infrastructure. + +## Observability + +- Every `policy_evaluate` call emits a structured log line to stdout: + `{ "ts": "...", "install": "...", "repo": "...", "pr_ref": "...", "graph_hash": "...", "overall": "pass", "duration_ms": 1840, "rules": [{ "id": "...", "outcome": "..." }] }`. 
+- Optional Prometheus push: `verdict-action` honors `OPENCODEHUB_METRICS_URL`, posts counters per outcome and histograms per rule. Empty env var = no network egress. +- The `.opencodehub/grounding.json` manifest per PR is the durable audit surface — every `gh pr view` can link to it, and the schema in 009 §7 makes post-incident forensics concrete. + +--- + +This playbook closes the trilogy. 009 defines the surface, 010 defines the SDK, 011 wires both into CI. A Day-1 adopter commits `opencodehub.policy.yaml`, pastes the three workflows, installs the App, and has grounding + verdicts running within an hour. diff --git a/.erpaval/brainstorms/012-competitive-landscape.md b/.erpaval/brainstorms/012-competitive-landscape.md new file mode 100644 index 00000000..3baa1019 --- /dev/null +++ b/.erpaval/brainstorms/012-competitive-landscape.md @@ -0,0 +1,162 @@ +# 012 — Competitive Landscape: Agent + Grounding + Guardrail Ecosystem + +*Draft: 2026-04-27. Scope: autonomous coding agents, PR review tooling, code-graph grounding, MCP ecosystem, policy-as-code, agent provenance. Sources cited inline; products move fast so everything below is dated to April 2026.* + +## 1. 
The Map + +| Player | Category | What they do | Surface (LSP/IDE/CI/SDK/MCP) | Who runs it | Differentiator | Gap we can exploit | +|---|---|---|---|---|---|---| +| Claude Code (Anthropic) | Autonomous agent | Agent loop over local repo; `/loop` scheduler, Auto Mode, Linear-triggered background agents [1][2] | CLI + GitHub Action + IDE ext + MCP client | Laptop, cloud VM, CI runner | Best-in-class agent loop; native MCP consumer | Deep graph grounding — reads files, doesn't precompute blast radius | +| Claude Agent SDK (Anthropic) | Agent framework | Programmatic agent loop w/ hooks (PreToolUse/PostToolUse), subagents, MCP servers, permissions [3] | Python + TS SDK | CI / server-side | `PreToolUse` hook is a perfect seam for deterministic gates | No built-in code-graph; hooks are empty unless someone ships policy packs | +| GitHub Copilot coding agent | Autonomous agent | Issue → PR w/ Actions runner; `@copilot` PR edits; Autopilot CLI w/ `--max-autopilot-continues` [4][5] | GitHub-native + CLI | Cloud (GitHub Actions) | Incumbent distribution; owns the merge button | Grounding is "read files in the runner"; no graph, no cross-repo | +| Cursor background agents | Autonomous agent | Automations + Cloud Agents via Graphite; sandboxed PR authoring; open-sourced security agents [6][7] | IDE + web + GH integration | Cursor Cloud sandbox | $2B ARR, Composer 2.5 model, deep IDE loop | CI surface is thin; not the canonical choice where PRs are gated by compliance | +| Devin (Cognition) | Autonomous agent | ACU-billed autonomous SWE; Windsurf IDE, Devin Wiki, Devin Search [8][9] | Web + Slack + GitHub + VPC | Cognition cloud / VPC | Enterprise VPC; multi-product (Wiki = graph-adjacent) | Wiki is single-repo; no graph primitives exposed as tools to other agents | +| Jules (Google) | Autonomous agent | Async agent; Gemini 3.1 Pro; Jitro KPI-driven variant coming [10] | Web + GitHub + CLI (`jules`) | Google Cloud VM | Full-filesystem sandbox; async discipline | No graph 
context; explicitly struggles on large files/edge cases | +| Amazon Q Developer `/dev` | Autonomous agent | Multi-file feature agent; CLI auto-runs validation [11] | IDE + CLI + AWS Console | AWS | IP indemnity; AWS ecosystem hooks | Single-repo context; no blast radius; no cross-service graph | +| Replit Agent 4 | Autonomous agent | Parallel frontend/backend/DB agents; browser-based [12] | Web-only | Replit Cloud | Zero-setup full-stack | Not oriented toward CI/merge gates on external repos | +| v0 (Vercel) | Autonomous agent | React/Next.js UI generator; Figma + Shadcn [12] | Web + Vercel | Vercel Cloud | UI-first, not general SWE | Out of our lane — UI generation, not grounding | +| Bolt / StackBlitz | Autonomous agent | Prompt-to-preview browser dev | Web | StackBlitz WebContainers | Instant dev loop | Out of lane — prototyping, not merge gating | +| CodeRabbit | PR review | Inline PR review + pre-merge checks; YAML policies; 40+ linters [13][14] | GitHub App + CLI + IDE | SaaS | Largest install base; policy framework in YAML | LLM-grounded review; no precomputed graph; no blast radius tier | +| Greptile v4 | PR review | Codebase-graph-based review; multi-repo context; severity badges [15][16] | GitHub App | SaaS | Claims graph; 66.2% precision benchmark | Graph is proprietary + closed; no open export; no determinism guarantee | +| Graphite Diamond / Agent | PR review | Real-time PR review + stacked PRs + merge queue [17] | GitHub App + VS Code | SaaS | Stacked-PR UX + merge queue + review | Diamond deprecated into Graphite Agent; LLM review, not policy-graph | +| Ellipsis | PR review | Async PR review + bug fixes via `@ellipsis-dev` [18] | GitHub App | SaaS | Low-friction; async fix bot | Smaller; no graph story; style-focused | +| Qodo PR-Agent | PR review | Open-source PR review (Apache 2.0), now community-governed [19][20] | Self-host + SaaS (Qodo Merge) | Self-host or SaaS | Apache-2.0 license; self-host; benchmark-leading F1 60.1% | No code graph; 
multi-repo "context engine" is embeddings, not graph | +| Sweep AI | PR review / agent | Pivoted away from coding PRs; now ESG/sustainability [21] | N/A | N/A | — | De-facto exited the market | +| Sourcegraph (Cody + Code Intelligence) | Graph + agent | SCIP-backed code graph; Cody chat+search; auto-edit; code review agents [22][23] | IDE ext + web + API | Self-host + Sourcegraph Cloud | Deepest graph incumbency; SCIP governance now open (Uber/Meta steering) [24] | SCIP governance move signals they're less possessive; Cody-as-surface is fading vs Code Intelligence-as-substrate | +| SCIP / LSIF specs | Graph infra | Compiler-grade index format for code intelligence [24] | File format + indexers | Anyone | Neutral spec; now community-governed | Ours consumes SCIP natively — we're downstream-compatible, not a fork | +| GitHub Code Search | Graph-lite | Semantic + symbol search across repos | Web + API | GitHub | Universal; zero-setup | No blast radius; no process clusters; not agent-shaped | +| Context7 | Docs grounding | Library-docs MCP; `resolve-library-id` → `query-docs` | MCP | SaaS | Best-in-class library-docs grounding | Docs only — doesn't know *your* repo | +| Repomix / pack-codebase | Grounding | Pack repo into a single token blob | CLI | Laptop | Simple; supports agent onboarding | Flat; no relations; token-bloated for any real repo | +| Aider context | Grounding | `repo-map` embedded into Aider agent | CLI | Laptop | Built-in to Aider | Aider-specific; not a surface | +| E2B | Sandbox | MicroVM code-exec sandboxes; 200M+ executions; OpenAI Agents SDK integration [25] | SDK | SaaS | Dominant execution substrate; Firecracker isolation | Execution-only; no grounding; we'd sit above them | +| Modal | Sandbox | gVisor sandboxes + GPU infra [25] | SDK | SaaS | Infra breadth | Same — execution substrate | +| GitHub MCP Server (official) | MCP ecosystem | 51 tools over OAuth; Streamable HTTP mode [26] | MCP | github.copilot.com/mcp | Official; OAuth; Lockdown Mode 
| 7-32× token bloat vs `gh` CLI; read-oriented, not graph-oriented | +| Linear / Sentry / PagerDuty MCP | MCP ecosystem | Remote MCP servers over HTTP+OAuth [27] | MCP (remote) | Vendor hosted | Remote MCP is real and production | All per-vendor; nobody ships a code-graph remote MCP | +| OPA / Conftest | Policy-as-code | Rego policies against structured inputs (Terraform plans, K8s) in CI [28] | CLI + GH Action | CI runner | Mature; deterministic; auditable | Targets IaC, not code-graph diffs | +| Semgrep Supply Chain | Policy-as-code | Reachability + malicious-dep detection + license + SAST [29] | CI | SaaS | Reachability analysis = graph-lite | Single-repo SAST; no cross-repo contract gate | +| Mergify / Dependabot / Renovate | Merge automation | Queue + auto-merge based on labels/statuses [30] | GitHub App | SaaS | Default plumbing for PR flow | No semantic gate — just label + status check plumbing | +| GitHub Rulesets + Checks API | Merge automation | Required status checks + branch protection [31] | GitHub native | GitHub | Owns the merge primitive | Any status we publish here becomes a gate | +| SLSA + Sigstore | Provenance | OIDC → Fulcio → Rekor attestations via GH Actions [32] | GH Action | Public good | Mature; OpenSSF-backed | Nobody wires SLSA attestations to *who-the-agent-was* — attestation target is always the build, not the agent identity | + +## 2. Segment analysis + +### Autonomous coding agents — who writes PRs, and where + +Volume is clear. **Copilot, Claude Code, Cursor, and Jules** write most of the agent-authored PRs today. Copilot rides GitHub distribution, Claude Code ships `/loop` for cron-scheduled autonomous work [2], Cursor's Automations drive hands-off maintenance PRs claimed at "20–40% review reduction" [6], and Jules runs async in a Google Cloud VM with filesystem access [10]. Devin and Q Developer are real but sit in enterprise-deal volumes, not raw PR count. + +Where they run matters more than who they are. 
Copilot, Cursor, Jules, and Claude Code-in-background-mode all execute in **ephemeral cloud runners** (GitHub Actions, Cursor Cloud, Google Cloud VMs, Anthropic-managed). This is the shift since early 2025: the PR-filing agent is no longer on a laptop. That's the bet OpenCodeHub must play to — the agent that writes the diff never sees a developer's IDE. + +Grounding today is embarrassingly primitive. Every agent except Greptile and Sourcegraph Cody says some variant of "read files + maybe Grep." Claude Agent SDK literally lists `Read, Write, Edit, Bash, Glob, Grep` as its built-ins [3]. Nobody is asking for external graph grounding *explicitly*, because nobody has shipped it as an MCP tool that's easy to wire. The demand is latent: every post-mortem of a bad agent PR ("it broke the callers it couldn't see") is demand for blast-radius-in-one-call. + +### PR review tools — how they gate merges + +CodeRabbit, Greptile, Diamond, Ellipsis, and Qodo all publish **GitHub Check runs** that can be wired as required status checks [31]. CodeRabbit explicitly ships "pre-merge checks" — built-in + custom rules in YAML [14]. But every one of them produces an **LLM verdict** on a diff. Greptile markets "graph-based review" [15], but their graph is closed and the verdict is still LLM-synthesized text. + +Nobody composes policy-as-code over a real code graph. Nobody publishes a Check that says `blast_radius=HIGH AND contract_version_bumped=false AND license_added=AGPL-3.0 → block`. CodeRabbit gets closest with YAML custom checks but has no graph substrate underneath. This is a wide-open seam. + +### Code graph + grounding infrastructure + +Sourcegraph is shifting. The March 2026 SCIP governance move — handing the spec to a Steering Committee with Uber and Meta engineers [24] — is Sourcegraph signaling that SCIP is infrastructure, not a product moat. 
Cody's positioning has softened; public comms emphasize **Code Intelligence** (the graph-as-substrate) and **review agents** rather than Cody-the-chatbot [22]. They're enterprise-priced, require hosted infra (Sourcegraph Cloud or self-managed), and are not lightweight enough for per-repo CI. + +Practically, Sourcegraph owns "enterprise code graph on a server." OpenCodeHub owns "offline deterministic code graph in a CI runner's filesystem." Those are different products. We consume SCIP natively; we're downstream-compatible, not competitive on format. + +### MCP ecosystem maturity + +MCP 2025-03-26 / 2025-11-25 consolidated on **Streamable HTTP** (POST/GET, no persistent SSE) and **OAuth 2.1** for remote server auth [33]. Production deployments exist: Linear (May 2025 remote MCP), Sentry (production-ready MCP with Seer integration), PagerDuty (250+ customers within a month of launch) [27]. AWS Bedrock and Lambda-hosted MCP servers are documented patterns [33]. Auth is settled enough that enterprise teams now buy it. + +The one soft spot is cost. GitHub's MCP server consumes **7–32× more tokens** than equivalent `gh` CLI calls [26]. This is pertinent to us — our MCP tools need to be token-lean per call, which is a design constraint we already meet (one-shot blast radius vs ten round-trips). + +### Policy-as-code + +Conftest/OPA is mature for Terraform plans and K8s manifests [28]. Semgrep Supply Chain is the closest thing to "graph-aware gating" with reachability analysis and license compliance [29]. But nobody blends **code-graph blast radius + contract diffs + SBOM license risk + scanner findings** into a single deterministic verdict. The market composes gates the way vi composes shortcuts: everyone wires their own. + +This is the second wide-open seam. A policy-as-code primitive that takes a graph diff and emits a verdict with evidence is a product, not a feature. 
+ +### Provenance + +SLSA Level 3 via GitHub Actions + Sigstore's Fulcio/Rekor pipeline is mature [32]. But the attestation subject is always **the build artifact**, not **the agent that authored the commit**. Claude Code's `Co-Authored-By: Claude` is literally a freeform git-trailer string and inconsistently respected even by Anthropic's own tool [34]. No agent platform I can find signs its outputs with a verifiable agent-identity attestation. This is a third open seam, though one with slower commercial pull. + +## 3. Seams for OpenCodeHub + +Each seam below describes where no player is credibly present, the closest almost-competitor, and why our existing assets give us a head start. + +1. **Blast-radius-as-a-Check (deterministic merge gate).** Publish a GitHub Check named `codehub/impact` that maps a diff → affected symbols → risk tier in one call. Closest: Greptile (graph-claimed LLM review); CodeRabbit (YAML policies but no graph). Our head start: `impact()` + `detect_changes()` are already MCP tools returning deterministic structured output, and `SPECS.md §1.2` mandates byte-identical `graphHash` across runs [local]. No one else has determinism guarantees a compliance team can audit. + +2. **Cross-repo contract gate.** `group_contracts` surfaces API contracts shared across a group of repos. Breaking changes become a Check that blocks merge. Closest: Sourcegraph Code Connect (enterprise-only, server-hosted); nobody in the PR-review segment does cross-repo. Our head start: `group_contracts`, `group_query`, `group_status`, `group_sync` already exist as cross-repo MCP primitives; nobody else has even a single-surface cross-repo graph tool. + +3. **Policy-as-code over a code graph.** Expose the graph as a Rego input so teams write `deny[msg]` rules like "AGPL introduced" or "blast_radius=HIGH requires 2 reviewers." Closest: Conftest (IaC only) + Semgrep (pattern-only, not graph). 
Our head start: `sql` tool already exposes read-only graph access with 5s timeout [CLAUDE.md]; `verdict` tool gives us a natural point to plug a Rego evaluator. + +4. **Agent-scoped grounding server for CI runners.** A remote (or in-runner-colocated) MCP server that any agent framework — Claude Agent SDK, Copilot agent runners, Cursor Cloud Agents — can consume without installing. Closest: GitHub's MCP (read-only of GH metadata, not code graph); Context7 (library docs only). Our head start: the MCP server is already stdio; we need a Streamable-HTTP mode and OIDC auth to be a drop-in in any runner. + +5. **Claude Agent SDK hook pack.** Ship a published `@codehub/claude-hooks` that wires `PreToolUse(Edit)` → impact check → block-if-HIGH. Closest: nobody has shipped hook packs — the hook API exists and is empty [3]. Our head start: we already produce the exact structured output a hook would consume; this is a 200-line adapter. + +6. **Agent-attributable provenance.** Sign PR diffs with an attestation whose subject is the *agent identity* (Claude Code Auto Mode, Cursor Cloud Agent #X), not the build. Closest: SLSA attests builds, not authorship; Claude Code's freeform git trailer [34]. Our head start: `detect_changes` already fingerprints the diff and our output envelope is versioned; adding a signed attestation over `{diff-hash, agent-id, graphHash, verdict}` is a natural extension. + +7. **Staleness-aware grounding.** Most MCP tools silently serve stale data. Our `_meta["codehub/staleness"]` envelope makes staleness first-class; a CI gate can refuse to trust an agent that operated against a stale index. Closest: nobody — not even Sourcegraph exposes index-freshness per tool call. Our head start: already shipped [CLAUDE.md, OBJECTIVES.md §7]. + +8. **Deterministic cross-run graph hash for audit.** Compliance teams can verify that two PRs on identical commits got identical verdicts. Closest: nobody. 
Our head start: `graphHash` invariant is already an acceptance gate [SPECS §1.2]. + +## 4. Risks — who moves into this space fastest + +### GitHub +GitHub owns the Checks API, the Rulesets primitive, the MCP server for its own surface, and the coding agent's execution environment [4][26][31]. Most likely move: a GitHub-native "Code Intelligence Check" that publishes a first-party Check run from an internal SCIP-like index, bundled with Copilot Enterprise. Countermove: be offline-first, deterministic, and license-open *now* — GitHub won't ship Apache-2.0 offline, won't cover non-GitHub SCMs, and will charge per-seat. Our wedge is "self-hostable, offline, cross-repo, any SCM." + +### Sourcegraph +Sourcegraph has graph incumbency and enterprise trust. Most likely move: double down on Cody review agents + Code Connect, pitched as "Sourcegraph inside your CI." Countermove: we're a tenth the weight — no server to operate, one DuckDB file, embedded in the runner. The SCIP governance move [24] tells us Sourcegraph sees the protocol as a commons; that's good for us because our indexer stays interchangeable. + +### Anthropic +Anthropic controls the agent side — Claude Agent SDK, Claude Code, the hook API [3]. Most likely move: ship a first-party "repo understanding" tool in the SDK that quietly replaces `Read + Grep` with a lightweight graph. Countermove: *become the reference implementation of that tool*. Ship the Claude hook pack (seam #5) so that when Anthropic looks for a graph backend, they integrate with us rather than rebuild. Mirror: same posture with Cursor and Jules. + +### Secondary risk: Greptile / CodeRabbit +They could reposition from "LLM PR review" to "graph PR review" if they invest in indexing. Greptile v4 already markets graph-based [15]. Countermove: their graph is closed and SaaS-only. Ours is Apache-2.0 and offline. Position against them on auditability (deterministic verdict, graphHash invariant) and license (Apache-2.0 vs SaaS lock-in). 
Their enterprise buyers already ask for both. + +## 5. Bet recommendations + +1. **Ship Streamable-HTTP MCP transport with OIDC auth within 8 weeks.** *Why now:* remote MCP is production-real in 2026 [27][33]; every runner agent needs a no-install grounding endpoint; Claude Agent SDK + Copilot agent runners are hiring for exactly this shape. *Risk of being wrong:* stdio covers 80% of laptop use; if runner agents take longer to converge on remote MCP we ship ahead of demand. Low risk — this is table stakes. + +2. **Ship `codehub/impact` as a first-party GitHub Check with a signed verdict by Q3.** *Why now:* CodeRabbit, Greptile, and Diamond own the GitHub App slot; we can't beat them on LLM review, but the "deterministic graph verdict" slot is empty [13][15][17]. *Risk of being wrong:* if GitHub ships its own Code Intelligence Check first, we become a complement, not a competitor — still fine if our depth is greater. + +3. **Ship the Claude Agent SDK hook pack this quarter.** *Why now:* hooks API is live [3]; Claude Code's `/loop` and Auto Mode [1][2] mean agents are already running unattended; the hook is the last place to prevent a bad edit before it lands. *Risk of being wrong:* if Anthropic ships a first-party graph tool, our hook becomes redundant — but our backend is still consumable, and shipping now means we're the reference. + +4. **Ship Rego-over-the-graph (`codehub/verdict --policy file.rego`) by Q4.** *Why now:* OPA/Conftest have a deep bench of policy authors [28]; Semgrep reachability [29] proves the appetite for graph-aware gating; nobody composes across the whole stack. *Risk of being wrong:* teams may prefer YAML over Rego for simplicity (see CodeRabbit's YAML success) — mitigate by shipping a YAML-subset frontend that compiles to Rego. + +5. 
**Ship agent-identity attestations (SLSA-adjacent) in 2026-H2.** *Why now:* SLSA + Sigstore pipeline is mature [32]; no agent platform signs its outputs; Claude Code's co-author trailer [34] is evidence the market feels the gap but hasn't solved it. *Risk of being wrong:* buyers may not care yet. Slower commercial pull than seams 1–4, so sequence it last, but it's the moat when compliance catches up to agent-authored commits. + +--- + +## Sources + +1. sfeir.com — "Claude Code Auto Mode: Permissions & Autonomy" (March 2026). https://institute.sfeir.com/en/articles/claude-code-auto-mode-permissions-autonomy/ +2. winbuzzer.com — "Anthropic Claude Code cron scheduling background worker loop" (March 2026). https://winbuzzer.com/2026/03/09/anthropic-claude-code-cron-scheduling-background-worker-loop-xcxwbn/ +3. Anthropic — "Claude Agent SDK overview" (retrieved April 2026). https://code.claude.com/docs/en/agent-sdk/overview +4. GitHub — "Copilot CLI autopilot" (2026). https://docs.github.com/en/copilot/concepts/agents/copilot-cli/autopilot +5. GitHub — "Copilot direct edits via @mention" (March 2026). https://blockchain.news/news/github-copilot-pull-request-direct-edits +6. digitalapplied.com — "Cursor Automations guide" (early 2026). https://www.digitalapplied.com/blog/cursor-automations-always-on-agentic-coding-agents-guide +7. graphite.com — "Cursor Cloud Agents in Graphite" (March 2026). https://www.graphite.com/blog/cursor-cloud-agents +8. eesel.ai — "Cognition AI pricing" (2026). https://www.eesel.ai/blog/cognition-ai-pricing +9. vibecoding.app — "Devin review" (2026). https://vibecoding.app/blog/devin-review +10. testingcatalog.com — "Google prepares Jules V2 agent" (April 6 2026). https://www.testingcatalog.com/google-prepares-jules-v2-agent-capable-of-taking-bigger-tasks/ +11. AWS — "Amazon Q Developer FAQs" (2026). https://aws.amazon.com/q/developer/faqs/ +12. mindstudio.ai — "Replit Agent 4 vs Bolt" (early 2026). 
https://www.mindstudio.ai/blog/replit-agent-4-vs-bolt +13. coderabbit.ai — "How CodeRabbit delivers accurate AI code reviews on massive codebases" (2025). https://www.coderabbit.ai/blog/how-coderabbit-delivers-accurate-ai-code-reviews-on-massive-codebases +14. coderabbit.ai — "Pre-merge checks built-in and custom" (2025). https://www.coderabbit.ai/blog/pre-merge-checks-built-in-and-custom-pr-enforced +15. greptile.com — "Greptile v4 release" (2026). https://www.greptile.com/blog/greptile-v4 +16. morphllm.com — "Greptile vs Copilot comparison" (2026). https://www.morphllm.com/comparisons/greptile-vs-copilot +17. devclass.com — "Graphite debuts Diamond AI code reviewer" (March 2025). https://devclass.com/2025/03/19/graphite-debuts-diamond-ai-code-reviewer-insists-ai-will-never-replace-human-code-review/ +18. docs.ellipsis.dev — "Ellipsis features" (retrieved 2026). https://docs.ellipsis.dev/features +19. qodo.ai — "Qodo hands PR-Agent to the community" (April 2026). https://www.qodo.ai/blog/qodo-is-handing-pr-agent-over-to-the-community/ +20. github.com/qodo-ai/pr-agent — Qodo PR-Agent repo (2026). https://github.com/qodo-ai/pr-agent +21. prnewswire.com — "Sweep raises $22.5M Series B" (May 2025). https://www.prnewswire.com/news-releases/sweep-raises-22-5m-in-series-b-funding-led-by-insight-partners-302460023.html +22. sourcegraph.com — "Cody: better, faster, stronger" (2025). https://sourcegraph.com/blog/cody-better-faster-stronger +23. infoworld.com — "Sourcegraph unveils AI coding agents" (2025). https://www.infoworld.com/article/3812799/sourcegraph-unveils-ai-coding-agents.html +24. sourcegraph.com — "The future of SCIP" (March 2026). https://webflow.sourcegraph.com/blog/the-future-of-scip +25. northflank.com — "E2B vs Modal" (2026). https://northflank.com/blog/e2b-vs-modal +26. github.com/github/github-mcp-server — GitHub official MCP server (2026). https://github.com/github/github-mcp-server +27. linear.app — "Linear MCP changelog" (May 2025). 
https://linear.app/changelog/2025-05-01-mcp ; sentry.io — "Sentry MCP docs." https://docs.sentry.io/ai/mcp/ ; pagerduty.github.io — "PagerDuty remote MCP server" (2026). https://pagerduty.github.io/pagerduty-mcp-server/docs/remote-server/setup +28. policyascode.dev — "GitHub Actions policies with OPA/Conftest" (2025). https://policyascode.dev/guides/github-actions-policies +29. semgrep.dev — "Block malicious dependencies with Semgrep Supply Chain" (2025). https://semgrep.dev/blog/2025/block-malicious-dependencies-with-semgrep-supply-chain +30. docs.mergify.com — "Mergify Dependabot integration" (2025). https://docs.mergify.com/integrations/dependabot +31. docs.github.com — "Checks API & Rulesets" (2026). https://docs.github.com/enterprise-cloud@latest/rest/guides/getting-started-with-the-checks-api +32. github.blog — "SLSA 3 compliance with GitHub Actions" (updated 2024+). https://github.blog/security/supply-chain-security/slsa-3-compliance-with-github-actions/ +33. aws.amazon.com — "Open protocols for agent interoperability: authentication on MCP" (2025). https://aws.amazon.com/blogs/opensource/open-protocols-for-agent-interoperability-part-2-authentication-on-mcp/ +34. github.com/anthropics/claude-code — Issues #1653, #4224, #6848 on Co-Authored-By attribution (2025–2026). https://github.com/anthropics/claude-code/issues/1653 diff --git a/.erpaval/brainstorms/013-synthesis-v2-two-surface-product.md b/.erpaval/brainstorms/013-synthesis-v2-two-surface-product.md new file mode 100644 index 00000000..0a89d7b0 --- /dev/null +++ b/.erpaval/brainstorms/013-synthesis-v2-two-surface-product.md @@ -0,0 +1,160 @@ +# 013 — Synthesis v2: OpenCodeHub as a Two-Surface Product + +*Rewritten 2026-04-27 to reflect locked scope. Supersedes the earlier draft that included HTTP MCP + agent SDK + `grounding_pack`. 
Inputs carried forward: 001 Strategy, 002 PRD (artifact skills), 003–005 Design, 006 Synthesis v1, 007 Strategy v2 (diagnosis retained, transport rejected), 008 PRD v2 (lifecycle retained, SDK dropped), 009–011 Design v2 (HTTP/SDK rescinded, CI workflows retained), 012 Competitive Landscape. This memo is the current unified recommendation and the handoff to the Act phase.* + +## Locked scope decisions (2026-04-27) + +Three product-scope decisions constrain this synthesis and every follow-on: + +1. **Self-hosted OSS only.** No hosted service, no managed tier, no OpenCodeHub-operated infrastructure. Ever. (Memory: `project_opencodehub_no_saas.md`.) +2. **No remote / HTTP MCP server.** MCP stays stdio-only, for the Claude Code plugin on the developer's laptop. (Memory: `project_opencodehub_no_http_mcp_no_sdk.md`.) +3. **No agent SDK.** No `@opencodehub/agent-sdk` (Python or TypeScript), no `@opencodehub/claude-hooks`, no framework adapters. Agents that want OpenCodeHub grounding either use the Claude Code plugin on a laptop or shell out to the `codehub` CLI inside CI. (Same memory.) + +These decisions are load-bearing. Every item below is derivable from them. + +## The thesis in one paragraph + +**OpenCodeHub is a self-hosted OSS product with two surfaces, unified by a single offline-safe cross-repo graph.** + +- **Surface one — Laptop artifact factory.** A Claude Code plugin that turns the graph into committed Markdown. `codehub-document`, `codehub-pr-description`, `codehub-onboarding`, `codehub-contract-map` all ship in the P0 family. Stdio MCP, Claude Code plugin, local filesystem output. This is the visible, immediate wedge. + +- **Surface two — CI action surface (CLI-wrapping, deferred).** OSS GitHub Actions (and GitLab templates) that shell out to the `codehub` CLI inside the customer's own runner. `analyze-action`, `verdict-action`, a `codehub verdict` CLI subcommand, `opencodehub.policy.yaml` schema. 
No HTTP server, no SDK install, no OpenCodeHub-operated infrastructure. This is the structural, slower wedge. + +The two surfaces share the graph, the CLI, and the codebase. They are two skins on one primitive. Spec 001 ships first; spec 002 follows after adoption signal. + +## Why drop HTTP MCP + the agent SDK + +The earlier draft argued the HTTP + SDK combo as the fastest way to reach runner-resident agents. Three reasons to unwind that bet: + +1. **CLI-wrapping actions cover the same ground at a fraction of the surface area.** A GitHub Action that shells `codehub verdict --policy file.yaml` gives the customer the same deterministic merge gate as an HTTP-MCP-backed `policy_evaluate` tool. No authentication flow, no presigned URLs pinging a server, no SDK version compatibility — just a CLI call inside the customer's runner against a graph blob cached in Actions Cache. + +2. **HTTP MCP forces operational commitments that don't compose with self-hosted OSS.** A remote MCP server implies an OAuth issuer, a JWKS endpoint, a JWT issuer the customer operates, graph access from a networked service. Each of those is a customer-run component we'd have to document, support, and stabilize — while offering no capability the CLI doesn't already deliver inside a runner. + +3. **An SDK without HTTP is an SDK over stdio — which is what the Claude Code plugin already is.** The SDK was only valuable if it sat in front of a server. Without the server, the SDK either duplicates the plugin or shells the CLI — in both cases we prefer direct consumption. + +The competitive reframe from 012 still holds: agents are running in ephemeral cloud runners. The reframe does **not** dictate HTTP as the transport. A CI action that runs `codehub analyze` + `codehub verdict` inside that same ephemeral runner is a cleaner fit than an HTTP server the runner dials out to. + +## What ships · P0 (spec 001 — laptop artifact factory) + +The complete P0 family. Ships first. Ships together. 
Not blocked by spec 002. + +1. **`codehub-document`** — primary skill. Single-repo and group mode. 4-phase orchestration (Phase 0 precompute → AB parallel → CD parallel → E assembler). +2. **Six `doc-*` subagents** — `doc-architecture`, `doc-reference`, `doc-behavior`, `doc-analysis`, `doc-diagrams`, `doc-cross-repo` (group-only). +3. **Phase 0 precompute** — writes `.codehub/.context.md` (200-line cap) and `.codehub/.prefetch.md` (JSON tool-call ledger). Shared across every subagent. +4. **`.docmeta.json` + Phase E assembler** — deterministic citation regex, See-also footers, `--refresh` algorithm, cross-repo link graph in group mode. +5. **`codehub-pr-description`** — linear skill, no subagents. Markdown PR body from `detect_changes` + `verdict` + `owners` + `list_findings_delta`. +6. **`codehub-onboarding`** — one specialty subagent. `ONBOARDING.md` with ranked reading order from graph centrality. +7. **`codehub-contract-map`** — promoted from P1 on 2026-04-27. Group-only standalone skill. Renders `group_contracts` + `group_query` + `route_map` into Markdown + Mermaid. Fires on "map the contracts" / "contract matrix" invocations without needing the full `codehub-document` orchestration. +8. **PostToolUse staleness hook** — non-blocking `systemMessage` after `git commit|merge|rebase|pull` when `graph_hash` drifts and `.docmeta.json` exists. +9. **Discoverability patches** — guide-skill Skills table, `codehub analyze` completion hint, `next_steps[]` suggestions on `verdict` / `detect_changes`, Starlight `/skills/` index page. + +Spec: `.erpaval/specs/001-claude-code-artifact-surface/spec.md`. Updated this session with `codehub-contract-map` promoted and three new ACs (AC-2-7, AC-3-4, AC-5-5). + +## What ships · P1 (spec 002 — CI action surface, deferred) + +Only after spec 001 has traction. All CLI-wrapping; zero HTTP server; zero SDK. + +1. **`opencodehub/analyze-action@v1`** — shells `codehub analyze`, uploads graph to configured storage backend. +2. 
**`opencodehub/verdict-action@v1`** — shells `codehub verdict --policy ...`, posts GitHub Check with per-rule annotations, applies `opencodehub:auto-approve` label on full pass. +3. **`opencodehub/token-action@v1`** — OIDC → JWT for the customer's own storage-backend presign flow (only relevant when the customer opts into bucket-backed storage). +4. **`codehub verdict` CLI subcommand** — new subcommand; consumes `opencodehub.policy.yaml`, emits structured verdict JSON. Byte-identical on unchanged inputs. +5. **`opencodehub.policy.yaml` schema v1** — four rule types: `blast_radius_max`, `license_allowlist`, `ownership_required`, `arch_invariants` (scaffolded, feature-flagged). +6. **Graph storage · Tier 0 (Actions Cache)** — default backend via `actions/cache@v4`. Zero customer infrastructure. +7. **`codehub-adr` skill** — pushed from spec 001 P1 into the laptop family's P1 backlog. Ships when there's appetite. + +Spec: `.erpaval/specs/002-agent-grounding-plane/spec.md` (rewritten this session — directory name retained for history; contents are now CI-action-surface, CLI-wrapping). + +## What ships · P2 (later) + +- `arch_invariants` flag flipped default-on (after field data from design partners) +- GitLab CI templates (after GitHub Actions prove the shape) +- Customer-self-hosted GitHub App (not an OpenCodeHub-operated App — a container the customer deploys themselves) +- Graph storage · Tier 1 (customer S3/R2/MinIO) — presign done inside the customer's CI, not via an OpenCodeHub endpoint +- `codehub provenance record` CLI + `.opencodehub/grounding.json` sidecar +- Sigstore-signed provenance with agent-identity as attestation subject +- Cross-org policy federation (git-based, no central registry) +- `codehub-document --group --auto` on merge-to-main + +## What we are NOT doing (consolidated, no exceptions) + +- **No hosted / managed / SaaS / OpenCodeHub-operated tier.** Ever. +- **No remote / HTTP MCP server.** Stdio MCP on the laptop only. 
+- **No agent SDK** (Python, TS, `claude-hooks`, or framework adapters). +- **No `grounding_pack` MCP compositor tool.** Its value was SDK consumption. +- **No OpenCodeHub-branded coding agent.** We don't compete with Devin, Claude-for-GitHub, Copilot, Cursor, Q Developer, Jules. +- **No LLM-based PR review.** CodeRabbit/Greptile/Diamond territory. We compete on deterministic verdict, not LLM verdict. +- **No hosted review UI.** GitHub Checks + PR comments are the review surface. +- **No IDE plugin or LSP.** +- **No model fine-tuning.** + +## Three tensions and how they resolved under the new scope + +### Tension 1 — Auth and graph-URL plumbing + +Before: OIDC → JWT minted once per workflow, consumed by `analyze-action`, `verdict-action`, and an HTTP MCP server via presigned URLs. + +Under locked scope: OIDC → JWT is still fine, but its only consumer is the customer's own storage backend (when they pick Tier 1). Default Tier 0 (Actions Cache) doesn't need a JWT at all — `actions/cache@v4` handles credentials natively. The token story collapses to "use OIDC if and only if you run bucket-backed storage." + +### Tension 2 — Graph storage scope in v1 + +Before: argued about whether to ship S3/R2 in v1 vs P1. + +Under locked scope: Tier 0 (Actions Cache) is the only default. Tier 1 (customer bucket via presigned URLs minted in the customer's own workflow) is P2. There is no Tier 2 (hosted) — ruled out permanently. + +### Tension 3 — Policy rules in v1 + +Unchanged. Three evaluated rule types in v1 (`blast_radius_max`, `license_allowlist`, `ownership_required`); `arch_invariants` scaffolded, feature-flagged. v1 reserves the schema slot; flag flips in P2. + +## Competitive posture (unchanged in substance, sharpened in framing) + +From 012 §3, the seams that survive the scope decision: + +- **Blast-radius-as-a-Check.** First-party GitHub Check from `verdict-action` with deterministic `graph_hash`-backed verdict. Still wide open. 
+- **Cross-repo contract gate.** `group_contracts` + `group_query` surface through the CLI and through the `codehub-contract-map` laptop skill. Still uniquely ours. +- **Policy-as-code over a code graph.** `opencodehub.policy.yaml` evaluated by the CLI. Still wide open; we ship without the OPA-style runtime weight. +- **Staleness-aware grounding.** Every CLI response carries the existing `_meta.codehub/staleness` envelope. +- **Deterministic cross-run verdict.** Audit guarantee from `graphHash` invariant. Buyers can prove two runs on identical inputs returned identical verdicts. + +Seams we explicitly forfeit (scope decision): + +- ~~Agent-scoped grounding server for CI runners~~ — forfeited by dropping HTTP MCP. +- ~~Claude Agent SDK hook pack~~ — forfeited by dropping the SDK. +- ~~Agent-attributable provenance via SDK~~ — the CLI can still record provenance (`codehub provenance record` P2), just without an SDK in front of it. + +The forfeits are real. Counter: being first with the HTTP + SDK shape would have meant operating server code for our customers, or shipping SDK versions that break every time Claude Agent SDK moves, or both. The CLI + Actions posture is cheaper to maintain, and the laptop surface still reaches every Claude Code user directly. + +## Risks carried forward + +From 012 §4, filtered against the new scope: + +1. **GitHub ships a first-party "Code Intelligence Check."** Countermove: be license-open (Apache-2.0), self-hostable, cross-SCM, deterministic. Spec 002 is the countermove; ship it in P1. +2. **Sourcegraph doubles down on Cody review agents.** Countermove: one-tenth the weight (DuckDB + CLI, all self-hosted). Stay downstream-compatible with SCIP. +3. **Anthropic ships a first-party repo-understanding tool in Claude Agent SDK.** This risk *increases* under the scope decision (we don't own the SDK surface). 
Countermove: the Claude Code plugin on laptops + artifact factory reach gives us an engineer-facing foothold that an Anthropic tool would complement, not replace. +4. **Greptile/CodeRabbit reposition as "graph PR review."** Their graph stays closed, SaaS-only. Ours is Apache-2.0 and self-hostable. Compete on auditability and license. + +## Timeline + +- **Weeks 1–8: Spec 001 ships.** Artifact factory end-to-end on this repo, then released in the plugin. Four skills (doc, pr-description, onboarding, contract-map), six subagents, Phase 0–E, `.docmeta.json`, staleness hook, discoverability. +- **Weeks 9–?: Adoption signal.** At least one external user running the plugin on a group with ≥ 2 repos. No spec 002 work until signal exists. +- **Spec 002: P1.** CI actions + CLI verdict subcommand. Begin only after spec 001 is proven. + +## Open questions for you + +These are the remaining judgment calls in spec 001 — the places the current spec made a call I want you to sign off on before Act phase: + +1. **`codehub-contract-map` output path.** `.codehub/groups//contracts.md` by default; `--committed` writes to `docs//contracts.md`. Consistent with the other skills. OK to lock? +2. **Orchestrator model for `codehub-document`.** Sonnet default; Opus only when `--refresh --group` is passed. PRD tension #3 from synthesis v1. OK to lock? +3. **Gitignored vs committed default.** `.codehub/docs/` gitignored by default; `--committed` opts in. ADRs would have been the one exception (ADR must be in git to be an ADR), but `codehub-adr` moved to P1 backlog so this is moot for v1. OK to lock? 
+ +## Handoff + +Two specs live and consistent with the locked scope: + +- `.erpaval/specs/001-claude-code-artifact-surface/spec.md` — **9 P0 items, ready for Act phase.** +- `.erpaval/specs/002-agent-grounding-plane/spec.md` — **rewritten 2026-04-27, deferred to P1, waits on spec 001.** + +Roadmap SPA live at `.erpaval/roadmap/index.html` — 31 items across both specs and the Never column, all views (Overview / Timeline / Board / Dependencies / Pillars) reflecting the locked scope. + +Project memory updated at `/Users/lalsaado/.claude/projects/-Users-lalsaado-Projects-open-code-hub/memory/` with the three scope decisions. Future sessions start with these constraints already in context. + +Say the word and `/erpaval` Act phase kicks off against spec 001. diff --git a/.erpaval/roadmap/app.js b/.erpaval/roadmap/app.js new file mode 100644 index 00000000..121e6bf4 --- /dev/null +++ b/.erpaval/roadmap/app.js @@ -0,0 +1,408 @@ +/* OpenCodeHub Roadmap SPA — jQuery renderer + interactions */ + +(function ($) { + "use strict"; + + var DATA = window.RoadmapData; + var state = { + view: "overview", + filters: { surface: "all", tier: "all", search: "" }, + selectedId: null + }; + + // ─── Bootstrapping ─────────────────────────────────────── + $(function () { + renderOverview(); + renderTimeline(); + renderBoard(); + renderPillars(); + renderDeps(); + bindEvents(); + applyFilters(); + updateCounter(); + // Deep link support + if (window.location.hash) { + var id = window.location.hash.replace("#", ""); + setTimeout(function () { openDrawer(id); }, 60); + } + }); + + // ─── Event bindings ────────────────────────────────────── + function bindEvents() { + $(".view-btn").on("click", function () { + var v = $(this).data("view"); + state.view = v; + $(".view-btn").removeClass("is-active").attr("aria-selected", "false"); + $(this).addClass("is-active").attr("aria-selected", "true"); + $(".view").removeClass("is-active"); + $(".view[data-view='" + v + "']").addClass("is-active"); + 
// Dependency view needs positions re-computed after display + if (v === "deps") setTimeout(layoutDeps, 30); + }); + + $(".chip").on("click", function () { + var filter = $(this).data("filter"); + var value = String($(this).data("value")); + state.filters[filter] = value; + $(".chip[data-filter='" + filter + "']").removeClass("is-active"); + $(this).addClass("is-active"); + applyFilters(); + }); + + var searchDebounce; + $("#search").on("input", function () { + var v = $(this).val(); + clearTimeout(searchDebounce); + searchDebounce = setTimeout(function () { + state.filters.search = String(v || "").toLowerCase(); + applyFilters(); + }, 80); + }); + + $(document).on("click", "[data-open-item]", function (e) { + e.preventDefault(); + var id = $(this).data("open-item"); + openDrawer(id); + }); + + $("#drawer-close, #drawer-scrim").on("click", closeDrawer); + $(document).on("keydown", function (e) { + if (e.key === "Escape") closeDrawer(); + }); + + $(window).on("resize", function () { + if (state.view === "deps") layoutDeps(); + }); + } + + // ─── Overview view ─────────────────────────────────────── + function renderOverview() { + $.each(DATA.items, function (_, item) { + if (item.tier === "never") return; + var selector = "[data-tier-grid='" + item.surface + "-" + item.tier + "']"; + $(selector).append(cardHtml(item)); + }); + } + + function cardHtml(item) { + var tags = (item.tags || []).slice(0, 4).map(function (t) { + return "" + t + ""; + }).join(""); + return ( + "
" + + "" + + "
" + + "" + item.id + "" + + "" + item.tier + "" + + "
" + + "
" + escapeHtml(item.title) + "
" + + "
" + escapeHtml(item.blurb) + "
" + + "
" + tags + "
" + + "
" + ); + } + + // ─── Timeline view ─────────────────────────────────────── + function renderTimeline() { + var $tracks = $("#timeline-tracks").empty(); + var totalWeeks = 10; + $.each(DATA.tracks, function (_, track) { + var itemsInTrack = DATA.items.filter(function (i) { + return i.track === track.id && i.week; + }); + if (!itemsInTrack.length) return; + var $row = $("
"); + $row.append("" + track.label + ""); + $.each(itemsInTrack, function (_, item) { + var left = ((item.week.start - 1) / totalWeeks) * 100; + var width = ((item.week.end - item.week.start + 1) / totalWeeks) * 100; + var $bar = $( + "
" + ); + $bar.css({ left: left + "%", width: "calc(" + width + "% - 4px)" }); + if (item.critical) $bar.addClass("is-critical"); + $bar.html( + "" + item.id + "" + + "" + escapeHtml(truncate(item.title, 46)) + "" + ); + $row.append($bar); + }); + $tracks.append($row); + }); + } + + // ─── Board view ────────────────────────────────────────── + function renderBoard() { + var buckets = { backlog: [], next: [], after: [], later: [], never: [] }; + $.each(DATA.items, function (_, item) { + if (item.tier === "never") buckets.never.push(item); + else if (item.tier === "P0") buckets.next.push(item); + else if (item.tier === "P1") buckets.after.push(item); + else if (item.tier === "P2") buckets.later.push(item); + else buckets.backlog.push(item); + }); + $.each(buckets, function (col, list) { + var $col = $("[data-col-list='" + col + "']").empty(); + if (!list.length) { + $col.append("
empty
"); + return; + } + $.each(list, function (_, item) { $col.append(cardHtml(item)); }); + }); + } + + // ─── Dependencies view ─────────────────────────────────── + function renderDeps() { + var $nodes = $("#deps-nodes").empty(); + // Build depth buckets via topo order on 'depends' + var byId = {}; + $.each(DATA.items, function (_, it) { byId[it.id] = it; }); + var depth = {}; + function computeDepth(id, seen) { + if (depth[id] !== undefined) return depth[id]; + if (seen && seen[id]) return 0; + seen = $.extend({}, seen || {}); seen[id] = true; + var item = byId[id]; + if (!item) return 0; + var maxDep = -1; + $.each(item.depends || [], function (_, did) { + var d = computeDepth(did, seen); + if (d > maxDep) maxDep = d; + }); + return (depth[id] = maxDep + 1); + } + $.each(DATA.items, function (_, it) { + if (it.tier === "never") return; + computeDepth(it.id); + }); + + // Place items + var columns = {}; + $.each(DATA.items, function (_, it) { + if (it.tier === "never") return; + var col = depth[it.id] || 0; + (columns[col] = columns[col] || []).push(it); + }); + + var colW = 260; + var rowH = 78; + var colMargin = 24; + var maxCol = 0; + $.each(columns, function (c) { if (+c > maxCol) maxCol = +c; }); + var maxRows = 0; + $.each(columns, function (_, list) { if (list.length > maxRows) maxRows = list.length; }); + + $.each(columns, function (colIdx, list) { + $.each(list, function (i, item) { + var $n = $( + "
" + + "
" + item.id + "
" + + "
" + escapeHtml(item.title) + "
" + + "
" + ); + $n.css({ + left: (colIdx * (colW + colMargin)) + "px", + top: (i * rowH) + "px" + }); + $nodes.append($n); + }); + }); + + $("#deps-nodes").css({ + minHeight: (maxRows * rowH + 40) + "px", + minWidth: ((maxCol + 1) * (colW + colMargin)) + "px" + }); + + layoutDeps(); + } + + function layoutDeps() { + var $svg = $("#deps-svg"); + var svgEl = $svg.get(0); + if (!svgEl) return; + $svg.empty(); + // size svg to match container + var $wrap = $(".deps-wrap"); + var wrapW = Math.max($wrap.width(), $("#deps-nodes").prop("scrollWidth") || 0); + var wrapH = Math.max($wrap.height(), $("#deps-nodes").prop("scrollHeight") || 0); + $svg.attr("viewBox", "0 0 " + wrapW + " " + wrapH); + $svg.attr("width", wrapW).attr("height", wrapH); + + // Collect node positions + var positions = {}; + $(".dep-node").each(function () { + var $n = $(this); + var pos = $n.position(); + positions[$n.data("id")] = { + x: pos.left, + y: pos.top, + w: $n.outerWidth(), + h: $n.outerHeight() + }; + }); + + // Draw edges + $.each(DATA.items, function (_, item) { + if (item.tier === "never") return; + $.each(item.depends || [], function (_, depId) { + var from = positions[depId]; + var to = positions[item.id]; + if (!from || !to) return; + var x1 = from.x + from.w; + var y1 = from.y + from.h / 2; + var x2 = to.x; + var y2 = to.y + to.h / 2; + var cx1 = x1 + 40; + var cx2 = x2 - 40; + var pathD = "M" + x1 + "," + y1 + " C" + cx1 + "," + y1 + " " + cx2 + "," + y2 + " " + x2 + "," + y2; + var path = document.createElementNS("http://www.w3.org/2000/svg", "path"); + path.setAttribute("d", pathD); + path.setAttribute("class", "dep-edge"); + path.setAttribute("data-from", depId); + path.setAttribute("data-to", item.id); + svgEl.appendChild(path); + }); + }); + } + + // ─── Pillars view ──────────────────────────────────────── + function renderPillars() { + var $wrap = $("#pillars-grid").empty(); + $.each(DATA.pillars, function (_, pillar) { + var items = pillar.items.map(function (iid) { + var item = 
DATA.items.find(function (i) { return i.id === iid; }); + if (!item) return ""; + return ( + "
" + + "" + escapeHtml(item.title) + "" + + "" + item.id + "" + item.tier + "" + + "
" + ); + }).join(""); + var surfaceColor = pillar.surface === "laptop" ? "var(--laptop)" : "var(--runner)"; + $wrap.append( + "
" + + "
" + + "" + escapeHtml(pillar.title) + "
" + + "
" + escapeHtml(pillar.body) + "
" + + "
" + items + "
" + + "
" + ); + }); + } + + // ─── Filtering ─────────────────────────────────────────── + function applyFilters() { + var f = state.filters; + var visible = 0; + $(".card, .dep-node, .timeline-bar, .pillar-item").each(function () { + var $el = $(this); + var id = $el.data("id") || $el.data("open-item"); + var item = DATA.items.find(function (i) { return i.id === id; }); + if (!item) return; + var show = true; + if (f.surface !== "all" && item.surface !== f.surface) show = false; + if (f.tier !== "all" && item.tier !== f.tier) show = false; + if (f.search) { + var hay = (item.title + " " + item.blurb + " " + (item.tags || []).join(" ") + " " + item.id).toLowerCase(); + if (hay.indexOf(f.search) === -1) show = false; + } + $el.toggleClass("is-filtered-out", !show); + if (show && $el.hasClass("card")) visible++; + }); + updateCounter(visible); + } + + function updateCounter(visibleCards) { + var total = DATA.items.filter(function (i) { return i.tier !== "never"; }).length; + var v = typeof visibleCards === "number" ? visibleCards : total; + var uniqueVisible = v / 3; // cards rendered in overview + board + pillars, avg + $("#foot-counter").text("Showing " + Math.round(uniqueVisible) + " of " + total + " tracked items · 5 never-items excluded"); + } + + // ─── Drawer ────────────────────────────────────────────── + function openDrawer(id) { + var item = DATA.items.find(function (i) { return i.id === id; }); + if (!item) return; + state.selectedId = id; + window.history.replaceState(null, "", "#" + id); + $("#drawer-eyebrow").text(item.id + " · " + (item.surface === "laptop" ? "Laptop surface" : "Runner surface")); + $("#drawer-title").text(item.title); + $("#drawer-meta").html( + "" + item.tier + "" + + (item.critical ? "critical path" : "") + + (item.week ? 
"W" + item.week.start + "–W" + item.week.end + "" : "") + + (item.tags || []).map(function (t) { return "" + t + ""; }).join("") + ); + $("#drawer-why").html(escapeHtml(item.why || item.blurb)); + var $scope = $("#drawer-scope").empty(); + (item.scope || []).forEach(function (s) { + $scope.append("
  • " + escapeHtml(s) + "
  • "); + }); + if (!item.scope || !item.scope.length) $scope.append("
  • "); + + var $deps = $("#drawer-deps").empty(); + (item.depends || []).forEach(function (did) { + var dep = DATA.items.find(function (i) { return i.id === did; }); + if (!dep) return; + $deps.append( + "" + ); + }); + if (!item.depends || !item.depends.length) $deps.append("
  • "); + + var $un = $("#drawer-unblocks").empty(); + (item.unblocks || []).forEach(function (uid) { + var un = DATA.items.find(function (i) { return i.id === uid; }); + if (!un) return; + $un.append( + "" + ); + }); + if (!item.unblocks || !item.unblocks.length) $un.append("
  • "); + + $("#drawer-source").text(item.source || ""); + + // Highlight edges in deps graph + $(".dep-edge").removeClass("is-active"); + $(".dep-edge[data-from='" + id + "'], .dep-edge[data-to='" + id + "']").addClass("is-active"); + $(".dep-node").removeClass("is-dim"); + if (state.view === "deps") { + var connected = { [id]: true }; + (item.depends || []).forEach(function (d) { connected[d] = true; }); + (item.unblocks || []).forEach(function (u) { connected[u] = true; }); + $(".dep-node").each(function () { + var nid = $(this).data("id"); + if (!connected[nid]) $(this).addClass("is-dim"); + }); + } + + $("#drawer").addClass("is-open").attr("aria-hidden", "false"); + $("#drawer-scrim").addClass("is-open"); + } + + function closeDrawer() { + $("#drawer").removeClass("is-open").attr("aria-hidden", "true"); + $("#drawer-scrim").removeClass("is-open"); + $(".dep-edge").removeClass("is-active"); + $(".dep-node").removeClass("is-dim"); + state.selectedId = null; + if (window.history.replaceState) { + window.history.replaceState(null, "", window.location.pathname); + } + } + + // ─── Utils ─────────────────────────────────────────────── + function escapeHtml(s) { + return String(s == null ? "" : s) + .replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;") + .replace(/"/g, "&quot;").replace(/'/g, "&#39;"); + } + function truncate(s, n) { return s.length > n ? s.slice(0, n - 1) + "…" : s; } +})(jQuery); diff --git a/.erpaval/roadmap/data.js b/.erpaval/roadmap/data.js new file mode 100644 index 00000000..e932917c --- /dev/null +++ b/.erpaval/roadmap/data.js @@ -0,0 +1,622 @@ +/* Roadmap data — scope locked 2026-04-27. + * + * Distribution model: self-hosted OSS only. No SaaS, no managed tier, no + * OpenCodeHub-operated infrastructure. + * + * Scope exclusions (per user directive 2026-04-27): + * - NO remote / HTTP MCP server + * - NO agent SDK (Python or TypeScript) + * - NO claude-hooks SDK wrapper + * - NO grounding_pack MCP tool as an agent-SDK compositor + * + * Surfaces that remain: + * 1. 
Claude Code plugin on the laptop via stdio MCP (artifact factory) + * 2. OSS GitHub Actions + GitLab templates wrapping the codehub CLI + * (policy, verdict, analyze — CLI under the hood, no HTTP, no SDK) + * + * Sources: + * .erpaval/brainstorms/006-synthesis-whats-next.md + * .erpaval/brainstorms/013-synthesis-v2-two-surface-product.md + * .erpaval/specs/001-claude-code-artifact-surface/spec.md + * (Spec 002 HTTP + SDK portions rescinded; CLI-wrapping actions survive) + */ + +window.RoadmapData = { + pillars: [ + { + id: "pillar-laptop-artifacts", + title: "Laptop · Artifact factory", + surface: "laptop", + body: "Claude Code plugin turns the graph into committed Markdown. The visible wedge — developers touch it, it demos well, it builds trust in the graph. All four artifact skills ship P0.", + items: ["l-doc-skill", "l-doc-agents", "l-precompute", "l-docmeta", "l-pr-desc", "l-onboarding", "l-contract-map", "l-hooks", "l-discover"] + }, + { + id: "pillar-laptop-followups", + title: "Laptop · Follow-ups", + surface: "laptop", + body: "P1/P2 additions to the laptop plugin once the P0 family lands and design partners give us field data.", + items: ["l-adr", "l-doc-auto"] + }, + { + id: "pillar-runner-ci", + title: "Runner · OSS GitHub Actions (CLI-wrapping)", + surface: "runner", + body: "Thin CI shells around the codehub CLI. No HTTP MCP, no agent SDK — the action runs codehub analyze or codehub verdict inside the customer's runner and posts a GitHub Check. 
Deferred until after the laptop surface has traction.", + items: ["r-analyze-action", "r-verdict-action", "r-token-action", "r-policy-schema", "r-policy-cli", "r-storage-tier0", "r-storage-tier1", "r-gitlab", "r-arch-invariants"] + }, + { + id: "pillar-runner-later", + title: "Runner · Later", + surface: "runner", + body: "P2 runner items that only make sense once the P1 action + policy surface has adoption.", + items: ["r-provenance-cli", "r-sigstore-prov", "r-self-gh-app", "r-federation"] + }, + { + id: "pillar-exclusions", + title: "Explicit exclusions", + surface: "laptop", + body: "Shape choices we are rejecting, recorded as first-class cards so they are visible in every view. Stability and focus come from what we say no to.", + items: ["x-saas", "x-http-mcp", "x-agent-sdk", "x-agent", "x-llm-review", "x-ide", "x-fine-tune"] + } + ], + + items: [ + /* ───────── Laptop · artifact factory · P0 ───────── */ + { + id: "l-doc-skill", + title: "codehub-document skill", + surface: "laptop", tier: "P0", + track: "laptop-skills", + week: { start: 3, end: 7 }, + critical: true, + tags: ["skill", "plugin", "codeprobe-pattern"], + blurb: "Primary artifact generator. Single-repo and group mode, 4-phase orchestration (Phase 0 precompute → AB parallel → CD parallel → E assembler).", + why: "The flagship of the laptop surface. Ports codeprobe's proven /document pattern to OpenCodeHub's graph, extended with group mode as the multi-repo wedge. 
If this skill does not ship, the laptop surface remains analytical-only.", + scope: [ + "plugins/opencodehub/skills/codehub-document/SKILL.md", + "references/ directory for progressive disclosure", + "Argument hints: [output-dir] [--group ] [--committed] [--refresh] [--section ]", + "Opus on --refresh --group; Sonnet otherwise" + ], + depends: ["l-doc-agents", "l-precompute", "l-docmeta"], + unblocks: ["l-discover", "l-hooks", "l-contract-map"], + source: "Spec 001 AC-2-1 · brainstorm 003/006/013" + }, + { + id: "l-doc-agents", + title: "Six doc-* subagents", + surface: "laptop", tier: "P0", + track: "laptop-skills", + week: { start: 3, end: 6 }, + critical: true, + tags: ["agents", "sonnet", "plugin"], + blurb: "doc-architecture, doc-reference, doc-behavior, doc-analysis, doc-diagrams, doc-cross-repo (group-only). Eight-section scaffold per codeprobe.", + why: "Parallel subagents are the cost-efficient way to produce 30+ documents — each runs Sonnet, reads the Phase 0 precompute from disk, writes its own files in isolation.", + scope: [ + "plugins/opencodehub/agents/doc-architecture.md", + "plugins/opencodehub/agents/doc-reference.md", + "plugins/opencodehub/agents/doc-behavior.md", + "plugins/opencodehub/agents/doc-analysis.md", + "plugins/opencodehub/agents/doc-diagrams.md", + "plugins/opencodehub/agents/doc-cross-repo.md (group mode)" + ], + depends: [], + unblocks: ["l-doc-skill"], + source: "Spec 001 AC-1-2 · brainstorm 004" + }, + { + id: "l-precompute", + title: "Phase 0 shared-context precompute", + surface: "laptop", tier: "P0", + track: "laptop-substrate", + week: { start: 2, end: 4 }, + critical: true, + tags: ["precompute", "substrate"], + blurb: "Writes .codehub/.context.md (200-line cap) and .codehub/.prefetch.md (JSON tool-call ledger). Prompt-dedup via filesystem.", + why: "The single design choice that makes parallel subagents affordable — every doc-* reads the same precomputed context instead of re-calling tools. 
Copied from codeprobe and extended for group mode.", + scope: [ + "packages/analysis/src/prefetch.ts (or equivalent)", + "200-line cap with per-section truncation flags", + "Group mode writes to .codehub/groups//" + ], + depends: [], + unblocks: ["l-doc-skill", "l-doc-agents"], + source: "Spec 001 AC-6-1 · brainstorm 004/005" + }, + { + id: "l-docmeta", + title: ".docmeta.json + Phase E assembler", + surface: "laptop", tier: "P0", + track: "laptop-substrate", + week: { start: 4, end: 7 }, + tags: ["assembler", "determinism"], + blurb: "Phase E: regex over backtick citations → co-occurrence join → See-also footers + cross-repo link graph. Sidecar drives --refresh.", + why: "Deterministic 40-line code that makes the artifact tree machine-navigable. Without it, --refresh is impossible and cross-repo links never get computed.", + scope: [ + "JSON Schema for .docmeta.json (generated_at, graph_hash, mode, sections[], cross_repo_refs[])", + "Phase E citation regex + co-occurrence assembler", + "--refresh diff algorithm" + ], + depends: ["l-doc-skill"], + unblocks: ["l-hooks"], + source: "Spec 001 AC-4-1/4-3 · brainstorm 005" + }, + { + id: "l-pr-desc", + title: "codehub-pr-description skill", + surface: "laptop", tier: "P0", + track: "laptop-skills", + week: { start: 4, end: 6 }, + tags: ["skill", "linear"], + blurb: "Generates Markdown PR body from detect_changes + verdict + owners + list_findings_delta. Refuses on a clean tree.", + why: "Highest-frequency use case (every PR). Shortest agent path in the family. 
Demonstrates the MCP→Markdown pipeline in 10 seconds, not 90.", + scope: [ + "plugins/opencodehub/skills/codehub-pr-description/", + "Sonnet, linear (no subagents)", + "Writes to .codehub/pr/PR-.md by default" + ], + depends: [], + unblocks: [], + source: "Spec 001 AC-2-5 · brainstorm 003" + }, + { + id: "l-onboarding", + title: "codehub-onboarding skill", + surface: "laptop", tier: "P0", + track: "laptop-skills", + week: { start: 5, end: 7 }, + tags: ["skill", "subagent"], + blurb: "Produces ONBOARDING.md with ranked reading order from project_profile + graph centrality + owners + entry points.", + why: "Lowest-effort v1 output that immediately showcases the graph doing something prose can't — ranked reading order from centrality.", + scope: [ + "plugins/opencodehub/skills/codehub-onboarding/", + "One specialty subagent (doc-onboarding)", + "Default .codehub/ONBOARDING.md; --committed writes to docs/" + ], + depends: ["l-precompute"], + unblocks: [], + source: "Spec 001 AC-2-6 · brainstorm 003" + }, + { + id: "l-contract-map", + title: "codehub-contract-map skill", + surface: "laptop", tier: "P0", + track: "laptop-skills", + week: { start: 6, end: 9 }, + tags: ["skill", "group-mode", "mermaid"], + blurb: "Cross-repo-only skill that renders group_contracts into Markdown + Mermaid. Ships standalone alongside codehub-document --group.", + why: "Promoted to P0. Group-level contract artifacts are the uniquely-ours wedge — nobody else exposes cross-repo graph primitives as a skill. 
Shipping standalone lets the skill fire on direct invocations (\"map the contracts\") without requiring the full codehub-document flow.", + scope: [ + "plugins/opencodehub/skills/codehub-contract-map/", + "Required: positional arg", + "Uses group_list + group_contracts + group_query + route_map", + "Refuses on single-repo scope with a single-line hint", + "Output: .codehub/groups//contracts.md with Mermaid" + ], + depends: ["l-doc-skill"], + unblocks: [], + source: "Promoted P1→P0 on 2026-04-27 · brainstorm 003/006" + }, + { + id: "l-hooks", + title: "PostToolUse staleness hook", + surface: "laptop", tier: "P0", + track: "laptop-hooks", + week: { start: 6, end: 8 }, + tags: ["hook", "freshness"], + blurb: "After git commit/merge/rebase/pull + auto-reindex, emits a non-blocking systemMessage when graph_hash changed and .docmeta.json exists.", + why: "Makes freshness free without spending Bedrock credits automatically. Users see the suggestion and opt in when convenient.", + scope: [ + "plugins/opencodehub/hooks.json extension", + "Non-blocking systemMessage format", + "Precondition check: .codehub/docs/.docmeta.json exists" + ], + depends: ["l-docmeta"], + unblocks: [], + source: "Spec 001 AC-2-7 · brainstorm 006" + }, + { + id: "l-discover", + title: "Discoverability patches", + surface: "laptop", tier: "P0", + track: "laptop-skills", + week: { start: 7, end: 9 }, + tags: ["discovery", "docs"], + blurb: "opencodehub-guide skills table · analyze-completion hint · verdict/detect_changes next_steps · Starlight /skills/ page.", + why: "Users in the mental state of just having run codehub analyze are exactly the ones who want docs. 
Meeting them there is the decisive surface.", + scope: [ + "opencodehub-guide skills table", + "packages/cli/src/commands/analyze.ts completion hint", + "packages/mcp/src/next-step-hints.ts adds codehub-pr-description suggestion", + "Starlight /skills/ page rendering frontmatter as cards" + ], + depends: ["l-doc-skill", "l-pr-desc", "l-onboarding", "l-contract-map"], + unblocks: [], + source: "Spec 001 AC-7-* · brainstorm 003" + }, + + /* ───────── Laptop · P1 / P2 ───────── */ + { + id: "l-adr", + title: "codehub-adr skill", + surface: "laptop", tier: "P1", + track: "laptop-skills", + tags: ["skill", "adr"], + blurb: "Drafts an ADR from a problem statement + impact query. Consequences section grounded in blast-radius data.", + why: "Impact-grounded consequences differentiate from generic ADR templates. Deferred because the template market is crowded; revisit once the P0 family has adoption.", + scope: [ + "plugins/opencodehub/skills/codehub-adr/", + "Required: \"\" positional arg", + "Defaults to committed (docs/adr/NNNN-.md)" + ], + depends: [], + unblocks: [], + source: "Brainstorm 003" + }, + { + id: "l-doc-auto", + title: "codehub-document --group --auto on merge", + surface: "laptop", tier: "P2", + track: "laptop-hooks", + tags: ["auto-refresh"], + blurb: "PostToolUse hook auto-runs --refresh on merge-to-main for group members. Crosses into CI territory.", + why: "Docs that track code without human gesture. Deferred because Bedrock-credit cost makes auto-regeneration a user-consent issue.", + scope: [ + "plugins/opencodehub/hooks.json extension", + "Merge-to-main detection", + "Customer opt-in flag" + ], + depends: ["l-hooks"], + unblocks: [], + source: "Brainstorm 013 P2 list" + }, + + /* ───────── Runner · OSS Actions · deferred tier ───────── + * All runner items wrap the `codehub` CLI. No HTTP MCP, no agent SDK — + * the action is a thin Node/container shell that shells out to the CLI. 
+ * Tier kept at P1/P2 because the laptop surface is the only priority now. + */ + { + id: "r-token-action", + title: "opencodehub/token-action@v1 (OIDC→JWT)", + surface: "runner", tier: "P1", + track: "runner-actions", + tags: ["action", "oidc", "auth"], + blurb: "Exchanges a GitHub OIDC token for a short-lived JWT signed by the customer-operated issuer and verifiable with the customer's own key. Only needed if analyze/verdict actions mint presigned URLs.", + why: "Cleans up the credential story when actions need to read/write storage. Only worth shipping once the analyze+verdict pair has adoption.", + scope: [ + "packages/actions/token/action.yml + dist/", + "OIDC → JWT exchange against customer-operated issuer", + "15-min TTL, RS256 default" + ], + depends: [], + unblocks: ["r-analyze-action", "r-verdict-action"], + source: "Brainstorm 011/013 (scope reduced: no HTTP endpoint to authenticate against; JWT used for storage presign only)" + }, + { + id: "r-analyze-action", + title: "opencodehub/analyze-action@v1 (CLI wrapper)", + surface: "runner", tier: "P1", + track: "runner-actions", + tags: ["action", "indexing", "cli-wrapper"], + blurb: "Runs codehub analyze on the checkout and uploads the graph blob to the configured storage backend. CLI-under-the-hood; no HTTP, no SDK.", + why: "Makes indexing a CI concern. 
Needed once customers want verdict gates in their pipeline — without it every verdict run re-indexes from scratch.", + scope: [ + "packages/actions/analyze/action.yml + dist/", + "storage-backend input: actions-cache | s3 | r2 | minio", + "Outputs: graph-hash, graph-uri, cache_hit", + "Thin shell that execs `codehub analyze` + uploads output" + ], + depends: ["r-token-action", "r-storage-tier0"], + unblocks: ["r-verdict-action"], + source: "Brainstorm 011 §1 (HTTP MCP removed)" + }, + { + id: "r-verdict-action", + title: "opencodehub/verdict-action@v1 (CLI wrapper)", + surface: "runner", tier: "P1", + track: "runner-actions", + tags: ["action", "checks", "cli-wrapper"], + blurb: "Runs `codehub verdict --policy opencodehub.policy.yaml` and posts a GitHub Check with per-rule annotations. Applies auto-approve label on full pass.", + why: "The CI surface that replaces LGTM-as-reflex with deterministic, auditable merge gating. Humans review only what the policy flags. CLI under the hood — no HTTP call, no SDK install.", + scope: [ + "packages/actions/verdict/action.yml + dist/", + "Shells out to `codehub verdict --policy ...`", + "Posts GitHub Check 'OpenCodeHub / verdict'", + "Applies opencodehub:auto-approve label when outcome=pass && auto_approve=true" + ], + depends: ["r-analyze-action", "r-policy-cli"], + unblocks: [], + source: "Brainstorm 011 §2 (HTTP MCP + grounding_pack removed)" + }, + { + id: "r-policy-schema", + title: "opencodehub.policy.yaml schema v1", + surface: "runner", tier: "P1", + track: "runner-policy", + tags: ["schema", "yaml"], + blurb: "JSON Schema for four rule types. Constrained YAML that compiles to a curated cypher subset run by the codehub CLI. Raw cypher explicitly out of scope.", + why: "Policy-as-code is the moat even without HTTP MCP. 
The schema choices (what counts as auto-approve, what reviewers see) compound over deployments.", + scope: [ + "packages/policy/schemas/policy-v1.json", + "Four rule types with input schemas (blast_radius_max, license_allowlist, ownership_required, arch_invariants)", + "auto_approve.require gate" + ], + depends: [], + unblocks: ["r-policy-cli"], + source: "Brainstorm 009 (HTTP-side tooling removed)" + }, + { + id: "r-policy-cli", + title: "codehub verdict CLI (policy evaluator)", + surface: "runner", tier: "P1", + track: "runner-policy", + tags: ["cli", "policy", "determinism"], + blurb: "New CLI subcommand: `codehub verdict --policy file.yaml --pr `. Consumes the policy schema, compiles rules against the graph, emits structured verdict JSON.", + why: "With HTTP MCP off the table, policy evaluation lives in the CLI. Same deterministic guarantees (byte-identical on unchanged inputs) — different consumer shape (actions shell out instead of calling a server).", + scope: [ + "packages/policy/src/evaluator.ts (CLI-side)", + "Rule types v1: blast_radius_max, license_allowlist, ownership_required", + "arch_invariants scaffolded behind OPENCODEHUB_EXPERIMENTAL_ARCH_INVARIANTS", + "Command: codehub verdict --policy file.yaml --pr base..head" + ], + depends: ["r-policy-schema"], + unblocks: ["r-verdict-action"], + source: "Brainstorm 009 refactored to CLI" + }, + { + id: "r-arch-invariants", + title: "arch_invariants rule evaluation", + surface: "runner", tier: "P2", + track: "runner-policy", + tags: ["policy", "feature-flag"], + blurb: "Flip OPENCODEHUB_EXPERIMENTAL_ARCH_INVARIANTS=1 by default. 
Constrained YAML compiles to curated cypher subset.", + why: "Scaffolded in v1 schema to reserve the slot; P2 flips the flag once we have field data from design partners on safe cypher patterns.", + scope: [ + "packages/policy/src/rules/arch-invariants.ts", + "Cypher subset whitelist", + "Query timeout + result-size caps" + ], + depends: ["r-policy-cli"], + unblocks: [], + source: "Brainstorm 013 tension #3" + }, + { + id: "r-storage-tier0", + title: "Graph storage · Tier 0 (Actions Cache)", + surface: "runner", tier: "P1", + track: "runner-storage", + tags: ["storage", "zero-setup"], + blurb: "GitHub Actions Cache backend via actions/cache@v4. graph_hash-derived key. 10 GB per-repo quota. Zero customer infra.", + why: "Without caching the story collapses — every CI run re-indexing a large monorepo is unworkable. Tier 0 is the on-ramp for the analyze+verdict action pair.", + scope: [ + "packages/graph-store/src/backends/actions-cache.ts", + "Integrates with actions/cache@v4 directly in the workflow", + "Content-addressed key format: opencodehub:{repo}:{graph_hash}" + ], + depends: [], + unblocks: ["r-analyze-action", "r-verdict-action"], + source: "Brainstorm 013 tension #2" + }, + { + id: "r-storage-tier1", + title: "Graph storage · Tier 1 (customer S3/R2/MinIO)", + surface: "runner", tier: "P2", + track: "runner-storage", + tags: ["storage", "self-hosted"], + blurb: "Customer supplies bucket + optional KMS. Signed URLs minted by the customer's own CI host step. Runner never sees raw creds.", + why: "Growth-stage customers outgrow Actions Cache quotas. 
Self-hosted bucket is the upgrade path — still no OpenCodeHub-operated infrastructure.", + scope: [ + "packages/graph-store/src/backends/s3.ts", + "KMS/CMK binding support", + "Presigned-URL minting runs in the customer's CI, not via an OpenCodeHub HTTP service" + ], + depends: ["r-storage-tier0"], + unblocks: [], + source: "Brainstorm 013 tension #2" + }, + { + id: "r-gitlab", + title: "GitLab CI templates", + surface: "runner", tier: "P2", + track: "runner-actions", + tags: ["gitlab", "ci"], + blurb: "Mirror of the GitHub Actions as GitLab CI templates. Same semantics, same CLI-wrapping shape.", + why: "Drops reliance on GitHub as the only supported forge. Composition with the customer's existing GitLab runners.", + scope: [ + "packages/cli/src/ci-templates/gitlab/analyze.yml", + "packages/cli/src/ci-templates/gitlab/verdict.yml" + ], + depends: ["r-analyze-action", "r-verdict-action"], + unblocks: [], + source: "Brainstorm 011 §Self-hosted-runner considerations" + }, + { + id: "r-provenance-cli", + title: "codehub provenance CLI + .opencodehub/grounding.json", + surface: "runner", tier: "P2", + track: "runner-provenance", + tags: ["cli", "manifest", "audit"], + blurb: "CLI subcommand writes a signed JSON manifest with graph_hash, tools_called[], policy_result, agent_identity. No SDK — the agent or workflow calls the CLI.", + why: "Incident forensics become possible. 
The agent-SDK path is closed, so provenance is recorded by invoking `codehub provenance record` at the end of an agent turn (or as a CI step).", + scope: [ + "New subcommand: codehub provenance record", + ".opencodehub/grounding.json JSON Schema", + "Verification step inside verdict-action" + ], + depends: ["r-verdict-action"], + unblocks: ["r-sigstore-prov"], + source: "Brainstorm 007 Action F (reshaped from SDK to CLI)" + }, + { + id: "r-sigstore-prov", + title: "Sigstore-signed provenance", + surface: "runner", tier: "P2", + track: "runner-provenance", + tags: ["sigstore", "slsa", "signing"], + blurb: "OIDC → Fulcio → Rekor signing with agent-identity as the attestation subject. Closes the audit loop.", + why: "Competitive research: nobody signs agent output. SLSA attests builds; we attest the agent that authored. Compliance-tier moat.", + scope: [ + "codehub provenance sign subcommand", + "verdict-action verifies signatures", + "in-toto predicate for agent identity" + ], + depends: ["r-provenance-cli"], + unblocks: [], + source: "Brainstorm 012 §3 seam 6 / 013 P2 list" + }, + { + id: "r-self-gh-app", + title: "Customer-self-hosted GitHub App", + surface: "runner", tier: "P2", + track: "runner-actions", + tags: ["github-app", "self-hosted"], + blurb: "Webhook subscriber customers deploy on their own infra. Native Checks + PR comments without editing workflows.", + why: "Zero-config onboarding for orgs that already operate GitHub Apps. 
Critically — runs on the customer's infrastructure, never OpenCodeHub's.", + scope: [ + "packages/github-app/ — deployable container", + "Install flow docs for customer-hosted deployment", + "No OpenCodeHub-operated endpoint", + "App itself shells out to the codehub CLI" + ], + depends: ["r-verdict-action"], + unblocks: [], + source: "Brainstorm 013 P2 list" + }, + { + id: "r-federation", + title: "Cross-org policy federation (git-based)", + surface: "runner", tier: "P2", + track: "runner-policy", + tags: ["policy", "federation", "self-hosted"], + blurb: "Mechanism for policies shared across related orgs without a central registry. Git-based federation via customer-controlled mirrors.", + why: "Orgs with multi-subsidiary or open-source-consortium topologies. Self-hosted substrate: policy files flow through customer-controlled git mirrors.", + scope: [ + "packages/policy/src/federation/", + "Git-based policy inheritance", + "No central registry, no OpenCodeHub-operated service" + ], + depends: ["r-policy-cli"], + unblocks: [], + source: "Brainstorm 013 P2 list" + }, + + /* ───────── Never · explicit exclusions ───────── */ + { + id: "x-saas", + title: "Hosted · Managed · SaaS · OpenCodeHub-operated tier", + surface: "laptop", tier: "never", + track: "never", + tags: ["distribution-model", "self-hosted-oss"], + blurb: "OpenCodeHub is self-hosted OSS. No hosted service, no managed SaaS, no OpenCodeHub-operated infrastructure. Ever.", + why: "Durable product-distribution decision, stated 2026-04-27. Not a timeline call. 
Every surface is customer-deployable.", + scope: [ + "No OpenCodeHub-operated webhook receiver", + "No hosted graph store", + "No managed policy evaluator", + "No SaaS tier" + ], + depends: [], + unblocks: [], + source: "User directive / project memory / spec 002 scope block" + }, + { + id: "x-http-mcp", + title: "Remote / HTTP MCP server", + surface: "runner", tier: "never", + track: "never", + tags: ["scope-exclusion", "stdio-only"], + blurb: "No Streamable HTTP MCP, no `/mcp` endpoint, no remote MCP transport. MCP stays stdio-only for the Claude Code plugin on the laptop.", + why: "Scope decision 2026-04-27. The agent-framework-plays-nice-with-HTTP-MCP story is deprioritized. CI integrations happen via OSS GitHub Actions that shell out to the codehub CLI, not via remote MCP.", + scope: [ + "No packages/mcp-http/", + "No OAuth/JWT flow for remote MCP callers", + "No SSE or Streamable-HTTP transport" + ], + depends: [], + unblocks: [], + source: "User directive 2026-04-27" + }, + { + id: "x-agent-sdk", + title: "Agent SDK (Python / TypeScript)", + surface: "runner", tier: "never", + track: "never", + tags: ["scope-exclusion", "cli-only"], + blurb: "No @opencodehub/agent-sdk, no opencodehub_agent_sdk Python, no @opencodehub/claude-hooks, no Vercel AI SDK or LangGraph adapters.", + why: "Scope decision 2026-04-27. Without HTTP MCP, an agent SDK has nothing to call. Agent frameworks that want OpenCodeHub grounding can shell out to the codehub CLI directly or use the Claude Code stdio MCP.", + scope: [ + "No packages/agent-sdk-python/", + "No packages/agent-sdk-ts/", + "No packages/claude-hooks/", + "No framework adapters" + ], + depends: [], + unblocks: [], + source: "User directive 2026-04-27" + }, + { + id: "x-agent", + title: "OpenCodeHub-branded coding agent", + surface: "runner", tier: "never", + track: "never", + tags: ["no-compete", "composability"], + blurb: "We don't compete with Devin, Claude-for-GitHub, Amazon Q, Cursor, Copilot. 
We provide the graph; they use it (via Claude Code plugin or CLI).", + why: "Composability wins against consolidation when the primitive is hard and the runtime is easy.", + scope: [], + depends: [], + unblocks: [], + source: "Brainstorm 007 §4 / 013 exclusions" + }, + { + id: "x-llm-review", + title: "LLM-based PR review", + surface: "runner", tier: "never", + track: "never", + tags: ["no-compete"], + blurb: "We don't compete with CodeRabbit / Greptile / Diamond on LLM verdict quality. We compete on deterministic verdict quality.", + why: "Deterministic graph verdict + graphHash invariant is the auditability wedge. LLM review is a crowded market with no moat.", + scope: [], + depends: [], + unblocks: [], + source: "Brainstorm 012 §3 seam 1 / 013 exclusions" + }, + { + id: "x-ide", + title: "IDE plugin / LSP", + surface: "laptop", tier: "never", + track: "never", + tags: ["no-compete"], + blurb: "Three surfaces are enough: Claude Code plugin, OSS GitHub Action, GitLab template.", + why: "LSPs and IDE plugins are Sourcegraph/Copilot territory. Our distribution channel is Claude Code and CI, not IDEs.", + scope: [], + depends: [], + unblocks: [], + source: "Brainstorm 007 §4 / 013 exclusions" + }, + { + id: "x-fine-tune", + title: "Fine-tuned models", + surface: "runner", tier: "never", + track: "never", + tags: ["no-compete"], + blurb: "Model choice stays with the agent platform. 
We are model-neutral grounding, not a model-provider.", + why: "Picking a model forfeits the neutrality that makes every agent vendor a distribution partner.", + scope: [], + depends: [], + unblocks: [], + source: "Brainstorm 007 §4 / 013 exclusions" + } + ], + + // Named tracks for the Timeline view + tracks: [ + { id: "laptop-substrate", label: "Laptop · substrate", surface: "laptop" }, + { id: "laptop-skills", label: "Laptop · skills", surface: "laptop" }, + { id: "laptop-hooks", label: "Laptop · hooks", surface: "laptop" }, + { id: "runner-policy", label: "Runner · policy (CLI)", surface: "runner" }, + { id: "runner-actions", label: "Runner · actions (CLI-wrap)", surface: "runner" }, + { id: "runner-storage", label: "Runner · storage", surface: "runner" }, + { id: "runner-provenance", label: "Runner · provenance", surface: "runner" } + ] +}; diff --git a/.erpaval/roadmap/index.html b/.erpaval/roadmap/index.html new file mode 100644 index 00000000..f7b7152d --- /dev/null +++ b/.erpaval/roadmap/index.html @@ -0,0 +1,213 @@ + + + + + + OpenCodeHub Roadmap — Two-Surface Product + + + +
    +
    + +
    +
    OpenCodeHub Roadmap
    +
    Self-hosted OSS · Two-surface product
    +
    +
    + +
    + +
    +
    Thesis · 2026-04-27
    +

    + OpenCodeHub is a self-hosted, two-surface OSS product, unified by a single offline-safe cross-repo graph. Surface one is the + Claude Code artifact factory on the developer's laptop. Surface two is the + CLI-based runner surface for CI at PR-scale (OSS actions that shell out to the codehub CLI, with no remote or HTTP MCP server), deployed entirely on the customer's own infrastructure. + No hosted tier, no managed service, no SaaS; customers run everything themselves. +

    +
    + +
    +
    + Surface + + + +
    +
    + Tier + + + + +
    +
    + +
    +
    + +
    + +
    +
    +
    +
    +
    L
    +
    +
    Laptop · Artifact Factory
    +
    Spec 001 · Claude Code plugin
    +
    +
    +
    +
    P0 · Ship this quarter
    +
    +
    +
    +
    P1 · Next quarter
    +
    +
    +
    +
    P2 · Later
    +
    +
    +
    +
    +
    +
    R
    +
    +
    Runner · Grounding Plane
    +
    Spec 002 · Self-hosted OSS
    +
    +
    +
    +
    P0 · Ship this quarter
    +
    +
    +
    +
    P1 · Next quarter
    +
    +
    +
    +
    P2 · Later
    +
    +
    +
    +
    +
    + + +
    +
    +
    W1
    +
    W3
    +
    W5
    +
    W7
    +
    W9
    +
    W10
    +
    +
    +
    + Laptop + Runner + Critical path +
    +
    + + +
    +
    +
    +
    Backlog
    +
    +
    +
    +
    Next · P0
    +
    +
    +
    +
    After · P1
    +
    +
    +
    +
    Later · P2
    +
    +
    +
    +
    Never
    +
    +
    +
    +
    + + +
    +
    + +
    +
    +
    + Laptop + Runner + Blocks +
    +
    + + +
    +
    +
    +
    + + + +
    + +
    +
    + Sources: .erpaval/specs/001, .erpaval/specs/002, + .erpaval/brainstorms/001–013 +
    +
    +
    + + + + + + diff --git a/.erpaval/roadmap/styles.css b/.erpaval/roadmap/styles.css new file mode 100644 index 00000000..45acc2de --- /dev/null +++ b/.erpaval/roadmap/styles.css @@ -0,0 +1,404 @@ +/* OpenCodeHub Roadmap SPA — Void Design System inflection */ + +:root { + --bg: #0a0c10; + --bg-elev: #10131a; + --bg-elev-2: #161a24; + --bg-elev-3: #1d2230; + --line: #232836; + --line-2: #2d3342; + --ink: #e6eaf3; + --ink-2: #aab3c8; + --ink-3: #7a8398; + --ink-dim: #545b6f; + + --accent: #8ab4ff; + --accent-2: #c4a6ff; + --laptop: #7dd3fc; + --runner: #c4a6ff; + --p0: #5dd6a8; + --p1: #f6c177; + --p2: #ff9e9e; + --never: #6c7280; + --danger: #ff5a78; + + --radius: 10px; + --radius-sm: 6px; + --shadow: 0 10px 40px rgba(0,0,0,.45); + + --font-ui: system-ui, -apple-system, "Segoe UI", Roboto, Helvetica, Arial, sans-serif; + --font-mono: "JetBrains Mono", "SF Mono", Menlo, Consolas, monospace; +} + +* { box-sizing: border-box; } +html, body { margin: 0; padding: 0; } +body { + background: radial-gradient(ellipse at top, #141a2a 0%, var(--bg) 55%); + color: var(--ink); + font-family: var(--font-ui); + font-size: 14px; + line-height: 1.5; + min-height: 100vh; + letter-spacing: 0.01em; +} + +/* ─── Topbar ─────────────────────────────────────────────── */ +.topbar { + position: sticky; top: 0; z-index: 30; + display: flex; align-items: center; justify-content: space-between; + padding: 14px 24px; + background: rgba(10,12,16,0.85); + backdrop-filter: blur(14px) saturate(140%); + -webkit-backdrop-filter: blur(14px) saturate(140%); + border-bottom: 1px solid var(--line); +} +.brand { display: flex; align-items: center; gap: 12px; } +.brand-mark { + width: 28px; height: 28px; border-radius: 6px; + background: conic-gradient(from 220deg, #7dd3fc, #c4a6ff, #8ab4ff, #7dd3fc); + box-shadow: 0 0 0 1px var(--line-2), 0 6px 16px rgba(125,211,252,.25); +} +.brand-title { font-weight: 600; letter-spacing: 0.02em; } +.brand-sub { color: var(--ink-3); font-size: 12px; } + +.viewswitch 
{ display: flex; gap: 4px; padding: 4px; background: var(--bg-elev); border: 1px solid var(--line); border-radius: 10px; } +.view-btn { + background: transparent; border: 0; color: var(--ink-2); + padding: 7px 14px; font: inherit; cursor: pointer; + border-radius: 7px; transition: all .15s ease; +} +.view-btn:hover { color: var(--ink); background: var(--bg-elev-2); } +.view-btn.is-active { color: #0a0c10; background: var(--ink); font-weight: 600; } + +/* ─── Thesis banner ───────────────────────────────────────── */ +.thesis { padding: 22px 24px 4px; max-width: 1400px; } +.thesis-eyebrow { + font-family: var(--font-mono); font-size: 11px; + color: var(--ink-3); letter-spacing: 0.12em; text-transform: uppercase; + margin-bottom: 8px; +} +.thesis-body { color: var(--ink-2); font-size: 15px; max-width: 920px; margin: 0; } +.thesis-body strong { color: var(--ink); font-weight: 600; } +.thesis-body em { font-style: normal; color: var(--ink); } + +/* ─── Filters ─────────────────────────────────────────────── */ +.filters { + display: flex; flex-wrap: wrap; gap: 18px; align-items: center; + padding: 16px 24px 8px; max-width: 1400px; +} +.filter-group { display: flex; align-items: center; gap: 6px; } +.filter-label { + font-family: var(--font-mono); font-size: 11px; + color: var(--ink-3); letter-spacing: 0.1em; text-transform: uppercase; + margin-right: 4px; +} +.chip { + background: var(--bg-elev); color: var(--ink-2); + border: 1px solid var(--line); border-radius: 999px; + padding: 5px 12px; font: inherit; font-size: 12.5px; + cursor: pointer; transition: all .15s ease; +} +.chip:hover { color: var(--ink); border-color: var(--line-2); } +.chip.is-active { background: var(--ink); color: #0a0c10; border-color: var(--ink); font-weight: 600; } +.search-group { margin-left: auto; } +#search { + background: var(--bg-elev); color: var(--ink); + border: 1px solid var(--line); border-radius: 8px; + padding: 7px 12px; font: inherit; font-size: 13px; width: 260px; + transition: 
border-color .15s ease; +} +#search:focus { outline: none; border-color: var(--accent); } + +/* ─── Stage ───────────────────────────────────────────────── */ +.stage { padding: 8px 24px 40px; max-width: 1400px; margin: 0 auto; } +.view { display: none; animation: fade-in .22s ease; } +.view.is-active { display: block; } +@keyframes fade-in { from { opacity: 0; transform: translateY(4px); } to { opacity: 1; transform: none; } } + +/* ─── Overview: two-surface columns ──────────────────────── */ +.surface-cols { + display: grid; grid-template-columns: repeat(2, 1fr); gap: 18px; +} +@media (max-width: 1040px) { + .surface-cols { grid-template-columns: 1fr; } +} +.surface-col { + background: linear-gradient(180deg, var(--bg-elev) 0%, rgba(16,19,26,0.5) 100%); + border: 1px solid var(--line); border-radius: 14px; padding: 16px; +} +.surface-col[data-surface="laptop"] { box-shadow: inset 2px 0 0 var(--laptop); } +.surface-col[data-surface="runner"] { box-shadow: inset 2px 0 0 var(--runner); } +.surface-head { + display: flex; align-items: center; gap: 12px; + padding: 6px 4px 14px; border-bottom: 1px solid var(--line); + margin-bottom: 14px; +} +.surface-icon { + width: 34px; height: 34px; border-radius: 8px; + display: flex; align-items: center; justify-content: center; + font-family: var(--font-mono); font-weight: 600; font-size: 15px; + background: var(--bg-elev-2); border: 1px solid var(--line-2); +} +.surface-col[data-surface="laptop"] .surface-icon { color: var(--laptop); } +.surface-col[data-surface="runner"] .surface-icon { color: var(--runner); } +.surface-title { font-weight: 600; font-size: 15px; } +.surface-sub { color: var(--ink-3); font-size: 12px; } + +.tier-band { margin-bottom: 18px; } +.tier-head { + font-family: var(--font-mono); font-size: 11px; + color: var(--ink-3); letter-spacing: 0.14em; text-transform: uppercase; + margin-bottom: 10px; display: flex; align-items: center; gap: 8px; +} +.tier-head::before { + content: ""; width: 6px; height: 6px; 
border-radius: 50%; + background: var(--ink-3); +} +.tier-band[data-tier="P0"] .tier-head::before { background: var(--p0); } +.tier-band[data-tier="P1"] .tier-head::before { background: var(--p1); } +.tier-band[data-tier="P2"] .tier-head::before { background: var(--p2); } +.tier-grid { + display: grid; gap: 10px; + grid-template-columns: repeat(auto-fill, minmax(240px, 1fr)); +} + +/* ─── Item card ──────────────────────────────────────────── */ +.card { + background: var(--bg-elev-2); + border: 1px solid var(--line); border-radius: var(--radius); + padding: 12px 14px; cursor: pointer; + transition: transform .12s ease, border-color .12s ease, background .12s ease; + position: relative; +} +.card:hover { + border-color: var(--line-2); background: var(--bg-elev-3); + transform: translateY(-1px); +} +.card.is-filtered-out { display: none; } +.card.is-highlighted { + border-color: var(--accent); + box-shadow: 0 0 0 1px var(--accent), 0 0 24px rgba(138,180,255,0.2); +} +.card-head { display: flex; align-items: baseline; justify-content: space-between; gap: 8px; } +.card-id { + font-family: var(--font-mono); font-size: 11px; + color: var(--ink-3); letter-spacing: 0.04em; +} +.card-tier { + font-family: var(--font-mono); font-size: 10px; font-weight: 600; + padding: 2px 6px; border-radius: 4px; letter-spacing: 0.08em; +} +.card-tier[data-tier="P0"] { background: rgba(93,214,168,.12); color: var(--p0); } +.card-tier[data-tier="P1"] { background: rgba(246,193,119,.12); color: var(--p1); } +.card-tier[data-tier="P2"] { background: rgba(255,158,158,.12); color: var(--p2); } +.card-tier[data-tier="never"] { background: rgba(108,114,128,.18); color: var(--never); } +.card-title { font-weight: 600; margin: 6px 0 4px; line-height: 1.3; color: var(--ink); } +.card-blurb { color: var(--ink-2); font-size: 12.5px; margin-bottom: 8px; } +.card-tags { display: flex; flex-wrap: wrap; gap: 4px; } +.card-tag { + font-family: var(--font-mono); font-size: 10px; + padding: 2px 6px; 
border-radius: 4px; + background: var(--bg-elev); color: var(--ink-3); + border: 1px solid var(--line); +} +.card-surface-dot { + position: absolute; top: 10px; right: 10px; + width: 6px; height: 6px; border-radius: 50%; +} +.card[data-surface="laptop"] .card-surface-dot { background: var(--laptop); } +.card[data-surface="runner"] .card-surface-dot { background: var(--runner); } + +/* ─── Timeline view ───────────────────────────────────────── */ +.timeline-ruler { + position: relative; height: 28px; + margin-top: 8px; margin-bottom: 14px; + border-bottom: 1px solid var(--line); +} +.ruler-tick { + position: absolute; bottom: 0; transform: translateX(-50%); + font-family: var(--font-mono); font-size: 10px; color: var(--ink-3); +} +.ruler-tick::after { + content: ""; display: block; width: 1px; height: 8px; + background: var(--line-2); margin: 2px auto 0; +} +#timeline-tracks { display: flex; flex-direction: column; gap: 8px; } +.timeline-track { + position: relative; height: 44px; + background: var(--bg-elev); border: 1px solid var(--line); + border-radius: 8px; +} +.timeline-track-label { + position: absolute; left: 12px; top: 50%; transform: translateY(-50%); + font-family: var(--font-mono); font-size: 11px; + color: var(--ink-3); letter-spacing: 0.08em; text-transform: uppercase; + pointer-events: none; z-index: 1; +} +.timeline-bar { + position: absolute; top: 8px; height: 28px; + border-radius: 6px; padding: 5px 10px; + font-size: 12px; font-weight: 500; + white-space: nowrap; overflow: hidden; text-overflow: ellipsis; + cursor: pointer; transition: transform .12s ease, box-shadow .12s ease; + display: flex; align-items: center; gap: 6px; + min-width: 80px; +} +.timeline-bar:hover { transform: translateY(-1px); box-shadow: 0 6px 20px rgba(0,0,0,.4); } +.timeline-bar[data-surface="laptop"] { + background: rgba(125,211,252,0.14); color: var(--laptop); + border: 1px solid rgba(125,211,252,0.35); +} +.timeline-bar[data-surface="runner"] { + background: 
rgba(196,166,255,0.14); color: var(--runner); + border: 1px solid rgba(196,166,255,0.35); +} +.timeline-bar.is-critical { box-shadow: inset 3px 0 0 #ff5a78; } +.timeline-bar-id { + font-family: var(--font-mono); font-size: 10px; opacity: 0.75; +} +.timeline-legend { + display: flex; gap: 16px; align-items: center; + margin-top: 14px; padding: 10px 14px; + background: var(--bg-elev); border: 1px solid var(--line); + border-radius: 8px; font-size: 12px; color: var(--ink-2); +} +.lg { display: inline-block; width: 14px; height: 10px; border-radius: 3px; margin-right: 4px; vertical-align: middle; } +.lg-laptop { background: rgba(125,211,252,0.5); border: 1px solid var(--laptop); } +.lg-runner { background: rgba(196,166,255,0.5); border: 1px solid var(--runner); } +.lg-critical { background: transparent; border-left: 3px solid var(--danger); width: 10px; } +.lg-dashed { background: transparent; border-top: 2px dashed var(--ink-2); height: 0; } + +/* ─── Board view ──────────────────────────────────────────── */ +.board { + display: grid; gap: 12px; + grid-template-columns: repeat(5, 1fr); +} +@media (max-width: 1200px) { .board { grid-template-columns: repeat(3, 1fr); } } +@media (max-width: 820px) { .board { grid-template-columns: 1fr; } } +.board-col { + background: var(--bg-elev); border: 1px solid var(--line); + border-radius: 10px; padding: 10px; +} +.board-col-never { opacity: 0.75; } +.board-head { + font-family: var(--font-mono); font-size: 11px; + color: var(--ink-3); letter-spacing: 0.12em; text-transform: uppercase; + padding: 4px 4px 10px; border-bottom: 1px solid var(--line); + margin-bottom: 10px; +} +.board-list { display: flex; flex-direction: column; gap: 8px; } + +/* ─── Dependencies view ──────────────────────────────────── */ +.deps-wrap { position: relative; min-height: 700px; } +#deps-svg { + position: absolute; inset: 0; width: 100%; height: 100%; + pointer-events: none; +} +.dep-edge { stroke: var(--line-2); stroke-width: 1.5; fill: none; } 
+.dep-edge.is-active { stroke: var(--accent); stroke-width: 2; } +.deps-nodes { position: relative; min-height: 700px; } +.dep-node { + position: absolute; width: 220px; + padding: 10px 12px; + background: var(--bg-elev-2); + border: 1px solid var(--line); border-radius: 8px; + cursor: pointer; + transition: transform .12s ease, border-color .12s ease; + box-shadow: var(--shadow); +} +.dep-node:hover { border-color: var(--accent); transform: translateY(-1px); } +.dep-node[data-surface="laptop"] { border-left: 3px solid var(--laptop); } +.dep-node[data-surface="runner"] { border-left: 3px solid var(--runner); } +.dep-node.is-dim { opacity: 0.35; } +.dep-node-id { font-family: var(--font-mono); font-size: 10px; color: var(--ink-3); } +.dep-node-title { font-size: 12.5px; font-weight: 600; margin-top: 4px; } +.deps-legend { + display: flex; gap: 16px; align-items: center; + margin-top: 14px; padding: 10px 14px; + background: var(--bg-elev); border: 1px solid var(--line); + border-radius: 8px; font-size: 12px; color: var(--ink-2); +} + +/* ─── Pillars view ────────────────────────────────────────── */ +.pillars { + display: grid; gap: 14px; + grid-template-columns: repeat(auto-fit, minmax(260px, 1fr)); +} +.pillar { + background: var(--bg-elev); border: 1px solid var(--line); + border-radius: 12px; padding: 18px; +} +.pillar-head { display: flex; align-items: center; gap: 10px; margin-bottom: 10px; } +.pillar-dot { width: 10px; height: 10px; border-radius: 50%; background: var(--accent); } +.pillar-title { font-weight: 600; font-size: 15px; } +.pillar-body { color: var(--ink-2); font-size: 13px; margin: 8px 0 12px; } +.pillar-items { display: flex; flex-direction: column; gap: 6px; } +.pillar-item { + background: var(--bg-elev-2); border: 1px solid var(--line); + border-radius: 6px; padding: 7px 10px; + font-size: 12.5px; cursor: pointer; color: var(--ink-2); + display: flex; align-items: center; justify-content: space-between; + transition: border-color .12s ease, 
color .12s ease; +} +.pillar-item:hover { border-color: var(--accent); color: var(--ink); } +.pillar-item .pi-id { font-family: var(--font-mono); font-size: 10px; color: var(--ink-3); } + +/* ─── Drawer ──────────────────────────────────────────────── */ +.drawer-scrim { + position: fixed; inset: 0; background: rgba(0,0,0,0.5); + opacity: 0; pointer-events: none; transition: opacity .2s ease; + z-index: 40; +} +.drawer-scrim.is-open { opacity: 1; pointer-events: all; } +.drawer { + position: fixed; top: 0; right: 0; bottom: 0; width: min(520px, 94vw); + background: var(--bg-elev); border-left: 1px solid var(--line); + z-index: 50; overflow-y: auto; + padding: 22px 22px 40px; + transform: translateX(100%); transition: transform .25s ease; + box-shadow: -30px 0 80px rgba(0,0,0,0.6); +} +.drawer.is-open { transform: none; } +.drawer-head { display: flex; align-items: center; justify-content: space-between; margin-bottom: 8px; } +.drawer-eyebrow { + font-family: var(--font-mono); font-size: 11px; + color: var(--ink-3); letter-spacing: 0.12em; text-transform: uppercase; +} +.drawer-close { + background: transparent; border: 0; color: var(--ink-2); + font-size: 28px; line-height: 1; cursor: pointer; padding: 4px 10px; + border-radius: 6px; +} +.drawer-close:hover { background: var(--bg-elev-3); color: var(--ink); } +.drawer-title { font-size: 20px; font-weight: 600; margin: 4px 0 12px; line-height: 1.25; } +.drawer-meta { display: flex; gap: 6px; flex-wrap: wrap; margin-bottom: 18px; } +.drawer-section { margin-bottom: 18px; } +.drawer-label { + font-family: var(--font-mono); font-size: 10px; + color: var(--ink-3); letter-spacing: 0.12em; text-transform: uppercase; + margin-bottom: 6px; +} +.drawer-body { color: var(--ink-2); font-size: 13.5px; line-height: 1.55; } +.drawer-body code { + font-family: var(--font-mono); font-size: 12px; + background: var(--bg-elev-2); padding: 1px 5px; border-radius: 3px; + color: var(--ink); +} +.drawer-list { margin: 0; padding: 0; 
list-style: none; display: flex; flex-direction: column; gap: 6px; } +.drawer-list li { + background: var(--bg-elev-2); border: 1px solid var(--line); + border-radius: 6px; padding: 8px 10px; + font-size: 13px; color: var(--ink-2); +} +.drawer-list li.is-link { cursor: pointer; color: var(--ink); } +.drawer-list li.is-link:hover { border-color: var(--accent); } +.drawer-list li .li-id { font-family: var(--font-mono); font-size: 10px; color: var(--ink-3); margin-right: 8px; } + +/* ─── Footbar ─────────────────────────────────────────────── */ +.footbar { + display: flex; justify-content: space-between; align-items: center; + padding: 14px 24px; border-top: 1px solid var(--line); + color: var(--ink-3); font-size: 12px; + max-width: 1400px; margin: 0 auto; +} +.footbar code { font-family: var(--font-mono); font-size: 11px; color: var(--ink-2); } diff --git a/.erpaval/specs/001-claude-code-artifact-surface/spec.md b/.erpaval/specs/001-claude-code-artifact-surface/spec.md new file mode 100644 index 00000000..5c2cc764 --- /dev/null +++ b/.erpaval/specs/001-claude-code-artifact-surface/spec.md @@ -0,0 +1,91 @@ +# Spec 001 — Claude Code Artifact Surface + +*EARS form. Feeds `/erpaval` Act phase. Source memo: `.erpaval/brainstorms/006-synthesis-whats-next.md`. Cycle references: 001 strategy, 002 PRD, 003-005 design.* + +## Scope + +Ship an artifact-generation skill family inside `plugins/opencodehub/` that ports codeprobe's `/document` choreography to OpenCodeHub's graph + supply-chain surface, with first-class group (multi-repo) support. v1 covers `codehub-document` (single + group), `codehub-pr-description`, `codehub-onboarding`, **`codehub-contract-map`** (standalone group-only skill, promoted from P1 on 2026-04-27), plus the shared-context precompute, `.docmeta.json` sidecar, cross-reference assembler, and PostToolUse staleness hook. 
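As a rough illustration of the `.docmeta.json` sidecar named in the scope, the following sketch models the fields listed in AC-4-3 and the staleness comparison described in AC-2-3. It is a minimal sketch under stated assumptions, not the shipped implementation: the `SourceRef` shape, the numeric `mtime` representation, the string-typed `mode`, the example paths, and the helper name `staleSections` are all hypothetical, and "stale" is read here as "any source is newer than the section".

```typescript
// Sketch only. Field names follow AC-4-3; every type below is an assumption.
interface SourceRef {
  path: string;   // hypothetical: the spec does not list the fields of a source entry
  mtime: number;  // assumed to be epoch milliseconds
}

interface Section {
  path: string;
  agent: string;
  sources: SourceRef[];
  mtime: number;
  citation_count: number;
  mermaid_count: number;
}

interface DocMeta {
  generated_at: string;
  codehub_graph_hash: string;
  mode: string;                // allowed values are not stated in the spec
  sections: Section[];
  cross_repo_refs?: string[];  // group mode only; element type assumed
}

// AC-2-3, as interpreted here: a section is stale when any of its
// sources carries an mtime newer than the section's own mtime.
function staleSections(meta: DocMeta): Section[] {
  return meta.sections.filter((section) =>
    section.sources.some((source) => source.mtime > section.mtime),
  );
}

// Hypothetical sidecar with one stale and one fresh section.
const exampleMeta: DocMeta = {
  generated_at: "2026-04-27T00:00:00Z",
  codehub_graph_hash: "example-hash",
  mode: "single",
  sections: [
    {
      path: "architecture/overview.md",
      agent: "doc-architecture",
      sources: [{ path: "src/index.ts", mtime: 200 }],
      mtime: 100,
      citation_count: 3,
      mermaid_count: 1,
    },
    {
      path: "reference/api.md",
      agent: "doc-reference",
      sources: [{ path: "src/api.ts", mtime: 50 }],
      mtime: 100,
      citation_count: 5,
      mermaid_count: 0,
    },
  ],
};

// Only the first section has a source newer than itself.
console.log(staleSections(exampleMeta).map((section) => section.path));
```

Keeping the comparison a pure function over the sidecar means a `--refresh` run could decide what to regenerate without calling any MCP tool, which matches the spec's intent that Phase E always re-runs while subagents are dispatched only for stale sections.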
+ +## Out of scope for v1 + +- `codehub-adr` (deferred P1 — ADR template market is crowded, revisit after the P0 family has adoption) +- Auto-regeneration on merge-to-main +- `group_wiki` / `group_synthesize` MCP tools (Phase 0 precompute + existing `group_*` tools cover the data path) +- SVG/PNG diagram generation +- Starlight auto-publish of generated docs + +## Acceptance criteria (EARS) + +### Ubiquitous + +- **AC-1-1** The system shall ship four new skill directories under `plugins/opencodehub/skills/`: `codehub-document/`, `codehub-pr-description/`, `codehub-onboarding/`, `codehub-contract-map/`. The existing `opencodehub-guide/`, `opencodehub-exploring/`, `opencodehub-impact-analysis/`, `opencodehub-debugging/`, `opencodehub-refactoring/`, `opencodehub-pr-review/` skills remain unchanged. [P] +- **AC-1-2** The system shall ship six subagent files under `plugins/opencodehub/agents/`: `doc-architecture.md`, `doc-reference.md`, `doc-behavior.md`, `doc-analysis.md`, `doc-diagrams.md`, `doc-cross-repo.md`. [P] Dependencies: AC-1-1 +- **AC-1-3** Every generated Markdown artifact shall have H1 identifier, no YAML frontmatter, and at least one backtick citation of form `` `:` `` or `` `::` ``. [P] + +### Event-driven + +- **AC-2-1** When the user invokes `/codehub-document` and `codehub status` reports a fresh index, the system shall execute Phase 0 (precompute) then Phase AB (four subagents parallel) then Phase CD (two subagents parallel, `doc-cross-repo` skipped in single-repo mode) then Phase E (assembler) and write at least 10 Markdown files under the output directory plus a valid `.docmeta.json`. Dependencies: AC-1-1, AC-1-2, AC-3-1 +- **AC-2-2** When the user invokes `/codehub-document --group `, Phase 0 shall call `group_list` + `group_status` + `group_contracts` + `group_query`, and Phase CD shall dispatch `doc-cross-repo` with the group manifest. 
Dependencies: AC-2-1 +- **AC-2-3** When the user invokes `/codehub-document --refresh`, the system shall compare each `sections[].sources[].mtime` against `sections[].mtime` and regenerate only stale sections; Phase E shall always re-run. Dependencies: AC-2-1, AC-4-2 +- **AC-2-4** When the user invokes `/codehub-document --committed`, the system shall write under `docs/codehub/` (or user-supplied path) and shall not add any entry to `.gitignore`. Dependencies: AC-2-1 +- **AC-2-5** When the user invokes `/codehub-pr-description` inside a branch with changes, the system shall call `detect_changes` + `verdict` + `owners` + `list_findings_delta` and write Markdown citing verdict tier, affected symbols, owner-reviewers, and findings-delta summary. Dependencies: AC-1-1 +- **AC-2-6** When the user invokes `/codehub-onboarding`, the system shall dispatch one subagent that reads `project_profile` + `query` + `owners` + `route_map` + `tool_map` and writes `.codehub/ONBOARDING.md` with a ranked reading order section. Dependencies: AC-1-1 +- **AC-2-7** When the user invokes `/codehub-contract-map `, the system shall call `group_list` (to validate the group exists) + `group_status` (to validate freshness) + `group_contracts` + `group_query` + `route_map`, and shall write a Markdown artifact containing a contracts matrix table, at least one Mermaid diagram of consumer→producer flows, and a `See also` footer linking to each member repo's generated docs when present. Output defaults to `.codehub/groups//contracts.md`. Dependencies: AC-1-1, AC-3-4 +- **AC-2-8** When the `PostToolUse` hook observes `git commit|merge|rebase|pull` and `.codehub/docs/.docmeta.json` exists whose `codehub_graph_hash` disagrees with the live hash, the hook shall emit a `systemMessage` suggesting `/codehub-document --refresh` without auto-regenerating. 
Dependencies: AC-2-1 + +### State-driven + +- **AC-3-1** While the index is stale (per `codehub status`), `codehub-document`, `codehub-onboarding`, and `codehub-contract-map` shall refuse to run and shall emit a single-line remediation hint naming the stale repo. Dependencies: AC-1-1 +- **AC-3-2** While `codehub-document --group ` is invoked and any member repo is stale, the skill shall abort and name each stale repo in the error. Dependencies: AC-2-2, AC-3-1 +- **AC-3-3** While Bedrock is unreachable from the skill host, any skill that would summarize via `@opencodehub/summarizer` shall degrade to raw graph output and shall not block. Dependencies: AC-1-1 +- **AC-3-4** While `/codehub-contract-map` is invoked without a `` argument or against a group `group_list` does not return, the skill shall refuse to run with `Contract map requires a named group — run 'codehub group list' to see registered groups.` and shall not consume any additional tool budget. Dependencies: AC-2-7 + +### Optional feature + +- **AC-4-1** Where Phase E detects ≥2 shared source citations between two sibling documents, the assembler shall append a `## See also` footer listing 3–5 sibling links to both documents. [P] Dependencies: AC-2-1 +- **AC-4-2** Where group mode is active and a per-repo `.codehub/docs/` tree exists under `.codehub/groups///`, Phase E shall emit a `## See also (other repos in group)` section in every `cross-repo/*.md` file linking into the sibling repo's equivalent section. Dependencies: AC-2-2, AC-4-1 +- **AC-4-3** Where `codehub-document` exits successfully, `.docmeta.json` shall include `generated_at`, `codehub_graph_hash`, `mode`, `sections[]` with `path`/`agent`/`sources[]`/`mtime`/`citation_count`/`mermaid_count`, and in group mode also `cross_repo_refs[]`. 
[P] Dependencies: AC-2-1 + +### Unwanted behavior + +- **AC-5-1** If Phase AB dispatches more than 10 subagents in a single message, the orchestrator shall batch by subagent role (all `doc-architecture` first, then `doc-behavior`, etc.) and shall not exceed 10 concurrent `Agent` tool calls per message. Dependencies: AC-2-1, AC-2-2 +- **AC-5-2** If a subagent attempts to call an MCP tool whose response digest is already present in `.prefetch.md`, the subagent prompt shall instruct it to reuse the cached result; compliance shall be enforced by prompt text and verified by the `Quality Checklist` block in each agent file. Dependencies: AC-1-2, AC-6-1 +- **AC-5-3** If any generated document contains a YAML frontmatter block, Phase E shall strip it and log a `frontmatter_removed` entry in `.docmeta.json`. Dependencies: AC-2-1 +- **AC-5-4** If `codehub-pr-description` is invoked on a clean working tree with no diff, the skill shall refuse to run and emit `No diff detected — resolve base/head or stage changes.` Dependencies: AC-2-5 +- **AC-5-5** If `codehub-contract-map` finds zero inter-repo contracts in `group_contracts` output, the skill shall still write the artifact file with a `No inter-repo contracts detected` banner and the empty matrix, rather than erroring. Dependencies: AC-2-7 + +### Precompute + +- **AC-6-1** The Phase 0 writer shall emit `.codehub/.context.md` (hard-capped at 200 lines, with truncation indicators per subsection) and `.codehub/.prefetch.md` (newline-delimited JSON ledger: one record per tool call with `tool`, `args`, `sha256`, `keys`, `cached_at`). [P] Dependencies: AC-1-1 +- **AC-6-2** Where group mode is active, Phase 0 shall write precompute files under `.codehub/groups//` instead of `.codehub/`. Dependencies: AC-6-1, AC-2-2 + +### Discoverability + +- **AC-7-1** The `opencodehub-guide` skill shall include a Skills table with one row per artifact skill (name, trigger example, one-line purpose). 
[P] Dependencies: AC-1-1 +- **AC-7-2** After `codehub analyze` completes, `packages/cli/src/commands/analyze.ts` shall print `Try: /codehub-document · /codehub-onboarding · /codehub-contract-map ` as the last status line (the third hint only appears if the analyzed repo is a member of at least one group). [P] Dependencies: AC-1-1 +- **AC-7-3** The `verdict` and `detect_changes` MCP tools' `next_steps[]` arrays shall include `{suggest: "codehub-pr-description"}` when the call was executed on a branch with a non-empty diff. [P] Dependencies: AC-1-1 +- **AC-7-4** The Starlight docs site shall include a `/skills/` index page rendering each skill's frontmatter as a card with trigger examples. [P] Dependencies: AC-1-1 + +## Validation + +- **Static layer**: `tsc --noEmit` over `packages/mcp`, `packages/cli`, `packages/analysis` must pass. Every new file typed. +- **Plugin layer**: invoke `plugin-dev:plugin-validator` against `plugins/opencodehub/`. Must report zero errors on frontmatter, tool allowlists, and manifest. +- **Behavioral layer**: self-test inside `/Users/lalsaado/Projects/open-code-hub` itself — run `/codehub-document` against this repo, assert `.codehub/docs/` contains ≥10 files, assert `.docmeta.json` validates, assert every `See also` link resolves. Spot-check three citations resolve to real files. +- **Regression layer**: existing `/probe`, `/verdict`, `/audit-deps`, `/rename`, `/owners` must still work. Run each once post-change. + +## Risks (see synthesis §Risks) + +1. Parallel subagent ceiling — verify current release +2. Subagent tool sprawl context bloat +3. Bedrock credential gating +4. 
Precompute size explosion on large repos + +## References + +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/001-opencodehub-next-strategy.md` — Rumelt kernel +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/002-opencodehub-artifact-skills-prd.md` — PRD +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/003-opencodehub-skill-interface-design.md` — SKILL.md frontmatter +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/004-opencodehub-subagent-prompts.md` — doc-* agent prompts +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/005-opencodehub-output-conventions.md` — output contract +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/006-synthesis-whats-next.md` — synthesis + tension resolutions +- `/Users/lalsaado/Projects/codeprobe/src/codeprobe/bootstrap/templates/claude-plugin/skills/document/SKILL.md` — pattern reference diff --git a/.erpaval/specs/002-agent-grounding-plane/spec.md b/.erpaval/specs/002-agent-grounding-plane/spec.md new file mode 100644 index 00000000..2d6bbec2 --- /dev/null +++ b/.erpaval/specs/002-agent-grounding-plane/spec.md @@ -0,0 +1,124 @@ +# Spec 002 — CI Action Surface (CLI-wrapping, deferred) + +*EARS form. Feeds `/erpaval` Act phase once spec 001 ships. Rewritten 2026-04-27 to remove the HTTP/SDK framing. Source memos: `007-agents-at-scale-strategy.md` (diagnosis retained, transport rejected), `011-ci-integration-playbook.md` (workflows retained, HTTP server removed), `013-synthesis-v2-two-surface-product.md` §rescinded sections. Companion spec: `.erpaval/specs/001-claude-code-artifact-surface/spec.md` (artifact factory, laptop — ships first).* + +## Spec directory note + +The directory is named `002-agent-grounding-plane/` for historical reasons — the original v1 of this spec designed an HTTP MCP "grounding plane" with an agent SDK. Both are now explicitly out of scope (see project memory: no HTTP MCP, no agent SDK). 
The renamed content is a **CI action surface**: OSS GitHub Actions (and GitLab templates) that shell out to the `codehub` CLI. + +## Scope + +Ship OSS CI integrations that let a customer run OpenCodeHub's graph-backed verdict + blast-radius gates in their own runners. Every integration is a **thin wrapper around the `codehub` CLI** — no HTTP server, no MCP-over-HTTP, no remote transport, no SDK for agents. The shape is: + +1. Customer authors `opencodehub.policy.yaml` in their repo (or group root). +2. CI action checks out code, runs `codehub analyze` inside the runner, uploads the graph blob to their chosen storage tier. +3. CI action pulls the graph on the PR side, runs `codehub verdict --policy opencodehub.policy.yaml --pr base..head`, posts a GitHub Check with per-rule annotations, applies an `opencodehub:auto-approve` label on full pass. +4. Customer's branch-protection rules decide whether the Check gates merge. + +**Distribution model (re-stated):** OpenCodeHub is self-hosted OSS. Every action runs inside the customer's CI runner. Storage lives in the customer's GitHub Actions Cache, bucket, or self-operated MinIO. No OpenCodeHub-operated infrastructure. + +**Priority:** This entire spec is P1 — deferred until spec 001 (laptop artifact factory) ships and has adoption. Some items slide to P2. + +## Out of scope for this spec (now and forever for most) + +- **Remote / HTTP MCP server** — ruled out permanently (project memory). MCP stays stdio-only for the Claude Code plugin on the laptop. +- **Agent SDK (Python or TypeScript)** — ruled out permanently. Agents call OpenCodeHub via (a) the Claude Code plugin on a laptop, or (b) shelling out to the `codehub` CLI. +- **`@opencodehub/claude-hooks` SDK wrapper** — ruled out permanently. +- **`grounding_pack` MCP tool** — ruled out; its value was SDK consumption. Composable equivalents already exist as individual MCP tools accessible from Claude Code. +- **Hosted / managed / SaaS tier** — ruled out permanently. 
+- **OpenCodeHub-branded coding agent** — ruled out permanently. +- **LLM-based PR review** — we compete on deterministic verdict, not LLM verdict. +- **IDE plugin / LSP** — out. +- **Fine-tuned models** — out. + +## Acceptance criteria (EARS) + +### Ubiquitous + +- **AC-1-1** The system shall ship three GitHub Actions under `packages/actions/`: + - `opencodehub/token-action@v1` (OIDC → signed JWT used only for storage-backend presign; no HTTP MCP endpoint consumes it), + - `opencodehub/analyze-action@v1` (shells `codehub analyze`, uploads graph blob), + - `opencodehub/verdict-action@v1` (shells `codehub verdict`, posts GitHub Check). + Each is publishable to the GitHub Marketplace under the `opencodehub/` org. [P] +- **AC-1-2** The `codehub` CLI shall gain a `codehub verdict` subcommand that consumes `opencodehub.policy.yaml`, compiles rules against the local graph, and writes a structured verdict JSON to stdout. The subcommand shall not open any network connections beyond what existing analyzers already require. [P] +- **AC-1-3** The system shall define `opencodehub.policy.yaml` schema version 1 at `packages/policy/schemas/policy-v1.json` with four rule types (`blast_radius_max`, `license_allowlist`, `ownership_required`, `arch_invariants`). [P] +- **AC-1-4** The system shall ship GitLab CI templates at `packages/cli/src/ci-templates/gitlab/` mirroring the two primary actions with equivalent semantics. [P] + +### Event-driven + +- **AC-2-1** When `token-action@v1` runs inside a workflow with `permissions: id-token: write`, it shall exchange the GitHub OIDC token for a signed JWT (15-minute TTL) scoped to `{install_id, repo, pr_ref?}` and shall write it to `$GITHUB_ENV` as `OPENCODEHUB_TOKEN`. The JWT shall be used only by the `analyze-action` and `verdict-action` storage-backend presign endpoints that the customer themselves operates. 
+- **AC-2-2** When `analyze-action@v1` runs with `OPENCODEHUB_TOKEN` set, it shall execute `codehub analyze` on the working tree, capture the resulting `graph_hash`, and upload the graph blob to the configured storage backend. Step outputs shall include `graph-hash`, `graph-uri`, and `cache_hit` (boolean). Dependencies: AC-2-1, AC-4-1 +- **AC-2-3** When `verdict-action@v1` runs with `OPENCODEHUB_TOKEN` and `graph-uri` inputs, it shall download the graph blob, execute `codehub verdict --policy --pr `, parse the structured verdict JSON, post a GitHub Check named `OpenCodeHub / verdict` with per-rule annotations, and apply the `opencodehub:auto-approve` label if and only if the verdict is `pass` with `auto_approve: true`. Dependencies: AC-1-2, AC-2-2 +- **AC-2-4** When `codehub verdict` is invoked with a valid `opencodehub.policy.yaml` and a fresh graph, it shall return a structured JSON verdict with top-level `outcome` in `pass|fail|needs-review`, an `auto_approve` boolean, and a `rules[]` array with per-rule entries `{id, type, outcome, evidence, blocked_merge}`. Dependencies: AC-1-2, AC-1-3 +- **AC-2-5** When `codehub verdict` is invoked twice on unchanged inputs (same `graph_hash`, same `pr_ref`, same policy file, same `policy_version`), the two verdicts shall be byte-identical. Dependencies: AC-2-4 + +### State-driven + +- **AC-3-1** While the customer has configured `storage-backend: actions-cache` (default), the `analyze-action` and `verdict-action` shall route all blob I/O through `actions/cache@v4` keyed by `opencodehub:{repo}:{graph_hash}`. No external service is contacted. Dependencies: AC-1-1 +- **AC-3-2** While `OPENCODEHUB_EXPERIMENTAL_ARCH_INVARIANTS` is unset or `0`, `codehub verdict` shall return `outcome: skipped, reason: feature_flag` for every `arch_invariants` rule and shall evaluate the other three rule types normally. 
Dependencies: AC-1-2, AC-1-3 +- **AC-3-3** While the graph blob referenced by `(repo, commit_sha)` is missing from the configured storage backend when `verdict-action@v1` runs, the action shall post a GitHub Check with conclusion `neutral` and message `graph not yet indexed — analyze-action must run first`; the workflow shall not exit `failure`. Dependencies: AC-2-3 + +### Optional feature + +- **AC-4-1** Where the workflow configures `storage-backend: s3 | r2 | minio`, the action shall accept customer-supplied bucket credentials via GitHub Secrets and shall mint presigned URLs inside the runner using the customer's own tooling (e.g., `aws-actions/configure-aws-credentials`). No OpenCodeHub-operated service mints URLs. Dependencies: AC-2-2 +- **AC-4-2** Where `opencodehub.policy.yaml` declares `auto_approve.require: [...]`, the `codehub verdict` output shall compute `auto_approve: true` if and only if every `require` clause passes. Dependencies: AC-2-4 +- **AC-4-3** Where a repo-root `opencodehub.policy.yaml` and a group-root policy both exist, the repo-root shall take precedence and the merged effective policy (with inheritance chain) shall appear in the verdict output. Dependencies: AC-2-4 +- **AC-4-4** Where the workflow wants `arch_invariants` rule evaluation in production, setting `OPENCODEHUB_EXPERIMENTAL_ARCH_INVARIANTS=1` in the action `env:` shall unlock evaluation; before the flag is flipped, the schema slot shall still be reserved so customers may author rules. Dependencies: AC-3-2 + +### Unwanted behavior + +- **AC-5-1** If `opencodehub.policy.yaml` fails JSON Schema validation, `codehub verdict` shall exit with a non-zero code, shall not return a pass verdict, and shall print the schema error path and message. Dependencies: AC-1-3, AC-2-4 +- **AC-5-2** If `analyze-action@v1` runs without `permissions: id-token: write`, it shall exit with a clear error message naming the missing permission, before attempting any upload. 
Dependencies: AC-2-1, AC-2-2 +- **AC-5-3** If `verdict-action@v1` encounters a rule that the current CLI build cannot evaluate (e.g., a rule type introduced in a newer schema), it shall mark that rule `outcome: error, reason: unsupported_rule_type` but shall continue evaluating other rules. Dependencies: AC-2-4 +- **AC-5-4** If any action attempts to contact an OpenCodeHub-operated endpoint (i.e., a hostname not matching `github.com`, GitHub Marketplace artifact hosts, the customer's configured storage bucket, or the customer's own JWT issuer), the action shall fail fast. No calls home. Dependencies: AC-1-1 + +### Policy schema (four rule types, reused across CLI + actions) + +- **AC-6-1** `blast_radius_max` rule: input `{max_tier: 1..5}`; evaluated by calling `impact(detect_changes(pr_ref))` through the in-process graph client and asserting `max(tier) <= max_tier`. Evidence: affected symbols + max tier. [P] Dependencies: AC-1-3, AC-2-4 +- **AC-6-2** `license_allowlist` rule: input `{allow: [SPDX-ids], deny: [SPDX-ids]}`; evaluated by the CLI's existing `license_audit` pipeline filtered by allow/deny. Evidence: package + resolved license + decision. [P] Dependencies: AC-1-3, AC-2-4 +- **AC-6-3** `ownership_required` rule: input `{paths: [globs], require_approval_from: [teams | @users]}`; evaluated via `owners(pr_ref.changed_paths)` plus an approval-state lookup through the GitHub API (the action already has a token from the workflow context, so the CLI reads the approval state from the action rather than from a remote service). Evidence: path → required reviewer set → approval state. [P] Dependencies: AC-1-3, AC-2-4 +- **AC-6-4** `arch_invariants` rule (schema-v1 scaffolded, feature-flagged): input `{id, query: , severity}`; compiled to a curated cypher subset evaluated against the local graph. Only evaluated when `OPENCODEHUB_EXPERIMENTAL_ARCH_INVARIANTS=1`. 
Dependencies: AC-1-3, AC-2-4, AC-3-2 + +## Success criteria (beyond ACs) + +- At least one design-partner org runs `analyze-action` + `verdict-action` on at least 10 real PRs per week with no false-block incidents for two consecutive weeks. +- Published Marketplace actions with ≥ 100 workflow runs across external repos in the month post-launch. +- `codehub verdict` p50 latency ≤ 5s on a warm graph for a 20-file PR. +- Zero calls home — audit the action manifests against AC-5-4. + +## Validation + +- **Static layer**: `tsc --noEmit` across `packages/actions/*/`, `packages/policy/`, `packages/cli/`. All typed. +- **Action layer**: GitHub Actions metadata validation via `actions/validate-workflow-schema`. Each action's `action.yml` lints clean. +- **Behavioral layer**: end-to-end synthetic PR test — a workflow on this repo runs `token → analyze → verdict` through the matrix (policy pass, policy fail per rule type, missing graph, missing token, expired token). Asserts Check annotations, labels, and exit codes. +- **Regression layer**: spec 001 artifact factory must still operate; existing `/probe`, `/verdict`, `/audit-deps`, `/rename`, `/owners` skill flows must still work laptop-side. +- **No-call-home layer**: action manifests audited; tcpdump on a test runner confirms no traffic leaves the (runner, GitHub, customer bucket) triad. + +## Risks (see synthesis 013 §Risks, filtered) + +1. **GitHub ships a first-party "Code Intelligence Check"** — countermove: be license-open, self-hostable, cross-SCM. +2. **Graph-privacy pushback from regulated orgs** — countermove (P2): per-repo CMK binding for graph storage, documented air-gap deployment pattern. +3. **Policy DSL expressiveness vs safety** — mitigated by v1 constrained YAML; `arch_invariants` feature-flagged. +4. **CLI invocation overhead dwarfing the verdict itself** — mitigated by graph blob caching via `actions/cache@v4`. + +## Priority and sequencing + +- **All items in this spec: P1** unless marked below. 
+- **P2 items** (deferred further): + - `arch_invariants` flag-flip default-on + - GitLab templates (ship GitHub Actions first) + - Customer-self-hosted GitHub App webhook subscriber (wraps the same actions) + - Sigstore-signed provenance attestations + - Cross-org policy federation (git-based, no central registry) + - `codehub provenance record` + `.opencodehub/grounding.json` sidecar + +None of this spec begins until spec 001 (laptop artifact factory) has landed and produced adoption signal. + +## References + +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/007-agents-at-scale-strategy.md` — diagnosis retained; Actions A/C/D/F rescinded (HTTP/SDK); Actions B/E/G/J reshaped to CLI-wrapping +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/011-ci-integration-playbook.md` — workflow YAML shapes retained; HTTP server steps removed +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/012-competitive-landscape.md` — deterministic-verdict wedge positioning +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/brainstorms/013-synthesis-v2-two-surface-product.md` — synthesis (revised 2026-04-27 to remove HTTP + SDK) +- `/Users/lalsaado/Projects/open-code-hub/.erpaval/specs/001-claude-code-artifact-surface/spec.md` — companion spec, ships first +- Project memory: `project_opencodehub_no_http_mcp_no_sdk.md`, `project_opencodehub_no_saas.md` diff --git a/docs/adr/0007-artifact-factory.md b/docs/adr/0007-artifact-factory.md new file mode 100644 index 00000000..2a47e695 --- /dev/null +++ b/docs/adr/0007-artifact-factory.md @@ -0,0 +1,110 @@ +# ADR 0007 — Artifact Factory: Claude Code plugin turns the graph into committed Markdown + +- Status: accepted +- Date: 2026-04-27 +- Authors: Laith Al-Saadoon + Claude +- Branch: `feat/artifact-factory` + +## Context + +The OpenCodeHub plugin ships six analytical skills (`opencodehub-guide`, +`opencodehub-exploring`, `opencodehub-impact-analysis`, +`opencodehub-debugging`, `opencodehub-refactoring`, 
+`opencodehub-pr-review`) and five slash commands (`/probe`, `/verdict`, +`/audit-deps`, `/rename`, `/owners`). Every current surface answers +questions. None of them emit a committed Markdown artifact — the +durable unit of output Principal engineers actually ship. + +The CLI already has `codehub wiki --llm` wired to `@opencodehub/summarizer` +and Bedrock, and the MCP server has a `generate-map` prompt that sketches +an `ARCHITECTURE.md`. Both are invisible to a Claude Code session. The +group primitives (`group_contracts`, `group_query`, `group_status`, +`group_sync`) are the single feature no other code-graph tool has and +they have no artifact-producing skill on top. + +codeprobe (`/Users/lalsaado/Projects/codeprobe`) has validated the +pattern at single-repo scope: one `/document` skill with Phase 0–E +orchestration produces ~33 Markdown files in `.codeprobe/docs/` via 8 +parallel subagents with a shared-context precompute on disk. That +pattern is portable and compositional. + +The broader "grounding plane for runner-resident agents" design (an +MCP-over-HTTP server + `@opencodehub/agent-sdk` + `@opencodehub/claude-hooks`) +was explored across brainstorms 007–013 and **rejected as scope** in +favor of the simpler shape below. + +## Decision + +Ship an **artifact factory** inside the existing `plugins/opencodehub/` +Claude Code plugin that ports codeprobe's pattern to OpenCodeHub's graph +and extends it with first-class **group mode**. + +### What ships in v1 + +Nine components, tracked in `.erpaval/specs/001-claude-code-artifact-surface/spec.md`: + +1. **`codehub-document`** — primary skill (single + group mode, 4-phase orchestration) +2. **Six `doc-*` subagents** — `doc-architecture`, `doc-reference`, `doc-behavior`, `doc-analysis`, `doc-diagrams`, `doc-cross-repo` +3. **Phase 0 shared-context precompute** — writes `.codehub/.context.md` (200-line cap) and `.codehub/.prefetch.md` (JSON tool-call ledger) +4. 
**`.docmeta.json` + Phase E deterministic assembler** — citation regex → co-occurrence join → See-also footers → cross-repo link graph +5. **`codehub-pr-description`** — linear skill (no subagents) +6. **`codehub-onboarding`** — one specialty subagent +7. **`codehub-contract-map`** — group-only standalone skill (promoted from P1) +8. **PostToolUse staleness hook** — non-blocking `systemMessage` after git mutations +9. **Discoverability patches** — guide-skill Skills table, `codehub analyze` completion hint, `next_steps[]` suggestions, Starlight `/skills/` index + +### Scope exclusions (durable) + +| Excluded surface | Reason | +|---|---| +| Hosted / managed / SaaS / OpenCodeHub-operated tier | Self-hosted OSS only. Product-distribution decision, not a timeline call. | +| Remote / HTTP MCP server | Stdio MCP on the laptop only. No `packages/mcp-http/`, no Streamable HTTP, no remote transport. | +| Agent SDK (Python or TS) | No `@opencodehub/agent-sdk`, no `@opencodehub/claude-hooks`, no framework adapters. Agents consume OpenCodeHub via the Claude Code plugin or the `codehub` CLI. | +| `grounding_pack` MCP compositor tool | Its value was SDK consumption. Individual tools remain accessible. | +| OpenCodeHub-branded coding agent | We don't compete with Devin / Copilot / Cursor / Amazon Q / Claude-for-GitHub. | +| LLM-based PR review | CodeRabbit / Greptile / Diamond territory. We compete on deterministic verdict. | +| IDE plugin / LSP | Out. | +| Model fine-tuning | Out. | + +### Three locked defaults (from synthesis 013 §Open questions) + +1. **`codehub-contract-map` output path**: `.codehub/groups//contracts.md` gitignored default; `--committed` writes to `docs//contracts.md`. +2. **Orchestrator model**: Sonnet default; Opus only when `--refresh --group` is passed (refresh logic that prunes by mtime + fans out partial subagent set needs judgment; first-run single-repo does not). +3. 
**Output default**: `.codehub/docs/` gitignored; `--committed` opt-in to `docs/codehub/`. See ADR 0009 for the full output contract. + +## Consequences + +### Positive + +- **Composability over consolidation.** Being the neutral graph substrate that Claude Code (and later, any CLI consumer) calls beats shipping a competing agent runtime. +- **Self-hosted posture is a moat in regulated orgs.** No customer contract lists "OpenCodeHub Cloud" as a dependency. +- **The `group_*` primitives finally have a Claude Code surface.** `codehub-contract-map` alone demonstrates the uniquely-ours cross-repo artifact — nobody else does this. +- **Low operational commitment.** No server to run, no SDK versions to support, no OAuth issuer, no JWKS endpoint. +- **Deterministic artifacts with a machine-readable `.docmeta.json`.** `--refresh` works on mtime comparison; audit-tier guarantees follow from `graph_hash` invariance. + +### Negative + +- **We forfeit two competitive seams from brainstorm 012 §3**: + - Agent-scoped grounding server for CI runners (would have required HTTP MCP) + - Claude Agent SDK hook pack as reference implementation (would have required the SDK) +- **Distribution reach is narrower.** Without the SDK, agent frameworks outside Claude Code must shell out to the CLI or skip us. +- **Anthropic may ship a first-party repo-understanding tool in the Claude Agent SDK at some point.** We would complement rather than compete, which means a smaller share of that audience. + +Counter: both forfeits would have meant operating server code for our +customers or shipping SDK versions that break every time Claude Agent +SDK moves. The CLI + plugin posture is cheaper to maintain, and the +laptop surface still reaches every Claude Code user directly. + +### Neutral + +- The CI action surface (spec 002, now rewritten as CLI-wrapping) is + deferred to P1 and only begins after spec 001 has adoption signal. 
+ +## References + +- `.erpaval/specs/001-claude-code-artifact-surface/spec.md` — the EARS spec driving this work +- `.erpaval/brainstorms/006-synthesis-whats-next.md` — earlier synthesis (artifact factory only) +- `.erpaval/brainstorms/013-synthesis-v2-two-surface-product.md` — current unified recommendation +- `docs/adr/0008-codeprobe-pattern-port.md` — records the pattern we are porting +- `docs/adr/0009-artifact-output-conventions.md` — output contract for every generated artifact +- `/Users/lalsaado/Projects/codeprobe/src/codeprobe/bootstrap/templates/claude-plugin/skills/document/SKILL.md` — pattern source diff --git a/docs/adr/0008-codeprobe-pattern-port.md b/docs/adr/0008-codeprobe-pattern-port.md new file mode 100644 index 00000000..817d5d5e --- /dev/null +++ b/docs/adr/0008-codeprobe-pattern-port.md @@ -0,0 +1,131 @@ +# ADR 0008 — Port the codeprobe `/document` pattern (Phase 0–E orchestration) + +- Status: accepted +- Date: 2026-04-27 +- Authors: Laith Al-Saadoon + Claude +- Branch: `feat/artifact-factory` + +## Context + +ADR 0007 committed to shipping an artifact factory inside the Claude Code +plugin. The orchestration pattern — how Claude drives the skill → precompute → +parallel subagents → assembler flow — is the hard part. It is also already +solved. + +`codeprobe` (sibling project at `/Users/lalsaado/Projects/codeprobe`) ships a +`/document` skill that produces 33 cross-linked Markdown files in 45–90 s via +8 parallel subagents, with a deterministic `.docmeta.json` sidecar that powers +`--refresh`. The pattern is: + +- **Phase 0 (pre-flight, inline in the skill)** — read the 14 enumerated data + artifacts plus 3 GitNexus cypher queries, persist their combined context to + `/.context.md` (200-line cap) and `/.gitnexus-prefetch.md`. + Subagents read these files instead of re-calling tools. 
+- **Phase AB (6 subagents in parallel, single message with 6 Agent tool calls)** — + each reads the two shared-context files first, then writes its section-specific + files. +- **Phase CD (2 subagents in parallel)** — diagrams + migration. +- **Phase E (inline deterministic assembler)** — regex over backtick source + citations builds a co-occurrence index, appends `See also:` footers, writes + `README.md` + `.docmeta.json`. + +The pattern resolves the three hardest problems in generative documentation: +**prompt dedup at fan-out** (filesystem, not copy-paste), **determinism of +structure** (Phase E is LLM-free), and **refresh without full regen** (mtime +comparison against declared `data_sources`). + +## Decision + +Adopt codeprobe's four-phase pattern for OpenCodeHub's `codehub-document` skill +with three deliberate adaptations. + +### Adaptation 1 — six subagents, not eight + +codeprobe runs 8 doc-* subagents. OpenCodeHub's supply-chain tools +(`verdict`, `list_findings`, `license_audit`, `list_dead_code`, +`list_findings_delta`) pre-digest a lot of analysis output, so we +consolidate into six: + +| Subagent | Replaces / consumes | +|---|---| +| `doc-architecture` | codeprobe's `doc-architecture` | +| `doc-reference` | codeprobe's `doc-reference` + exports from SCIP | +| `doc-behavior` | codeprobe's `doc-behavior` + route/tool inventories | +| `doc-analysis` | codeprobe's `doc-technical-debt` + `doc-analysis`, driven by `verdict` / `risk_trends` / `owners` / `list_dead_code` | +| `doc-diagrams` | codeprobe's `doc-diagrams` | +| `doc-cross-repo` | **new** — group mode only. Consumes `group_contracts` + `group_query`. | + +### Adaptation 2 — group mode is a first-class topology + +codeprobe is single-repo only. Our Phase 0 writer emits context either to +`.codehub/` (single-repo) or `.codehub/groups//` (group mode). 
Phase AB +fans out 4 subagents per repo in group mode — batched by role if the +cardinality exceeds Claude Code's concurrent-subagent ceiling (~10 per +message, per brainstorm 004 and spec 001 AC-5-1). Phase E builds a +cross-repo link graph in addition to the single-repo See-also footers. + +### Adaptation 3 — the assembler contract + +codeprobe's Phase E regex matches backtick citations in `path:LOC` form. +We extend the grammar to also match `repo:path:LOC` for group mode, and +the `.docmeta.json` schema adds a `cross_repo_refs[]` array. See ADR 0009 +for the full output contract. + +### Pattern invariants we preserve verbatim + +- Phase 0 writes two files on disk; subagents read them before touching + any MCP tool. +- Every subagent prompt follows codeprobe's 8-section scaffold: frontmatter, + output files, input specification, process, document format rules, tool + usage guide, fallback paths, quality checklist. +- Phase E is **deterministic Markdown assembly, no LLM call**. Regex → join → + footer → manifest. +- `.docmeta.json` is the source of truth for `--refresh`. Mtime comparison + against `section.sources[]` decides what to regenerate. +- No YAML frontmatter on generated outputs. H1 is the identifier. +- Generated docs have backtick source citations (`` `path:LOC` `` or + `` `repo:path:LOC` ``). + +## Consequences + +### Positive + +- **Proven pattern.** codeprobe has run it in production. Risk is + adaptation, not invention. +- **Prompt dedup is a filesystem property, not a prompt-engineering + property.** Per-subagent prompts stay small; the context lives in + `.context.md` + `.prefetch.md`. +- **Determinism where it matters.** Phase E is regex + join. Same inputs, + same cross-links. Prose is LLM-generated (non-deterministic), but + structure and citations are deterministic. +- **Group mode comes for free** once the topology is named. The unique + codehub wedge (`group_contracts` + `group_query`) fits as a single + additional subagent. 
+ +### Negative + +- **Subagent tool sprawl.** Each doc-* carries 6–10 MCP tools plus + Read/Write/Grep/Glob. Context-bloat-from-tool-metadata is the realistic + failure mode, not bad prompts. Mitigation is baked into the subagent + prompts: each opens with "do not re-call tools whose digest is in + `.prefetch.md`" plus a Tool Usage Guide table. +- **Parallel subagent ceiling.** Claude Code caps concurrent Agent calls + at ~10 per message. Groups of 3+ repos require role-batched dispatch + (all `doc-architecture` first, then `doc-behavior`, etc.). To be verified + against the current Claude Code release when `codehub-document --group` + ships. + +### Neutral + +- **codeprobe stays the pattern source.** We cite it in subagent prompts + and the skill README. Pattern divergences are recorded here and in ADR + 0009. + +## References + +- `docs/adr/0007-artifact-factory.md` — the parent decision +- `docs/adr/0009-artifact-output-conventions.md` — output contract +- `.erpaval/brainstorms/004-opencodehub-subagent-prompts.md` — per-agent 8-section scaffolds +- `.erpaval/brainstorms/005-opencodehub-output-conventions.md` — citation grammar + `.docmeta.json` schema +- `/Users/lalsaado/Projects/codeprobe/src/codeprobe/bootstrap/templates/claude-plugin/skills/document/SKILL.md` — pattern source +- `/Users/lalsaado/Projects/codeprobe/src/codeprobe/bootstrap/templates/claude-plugin/agents/doc-*.md` — 8-section scaffold reference diff --git a/docs/adr/0009-artifact-output-conventions.md b/docs/adr/0009-artifact-output-conventions.md new file mode 100644 index 00000000..d8fb15b4 --- /dev/null +++ b/docs/adr/0009-artifact-output-conventions.md @@ -0,0 +1,253 @@ +# ADR 0009 — Artifact output conventions (paths, citations, `.docmeta.json`, Mermaid) + +- Status: accepted +- Date: 2026-04-27 +- Authors: Laith Al-Saadoon + Claude +- Branch: `feat/artifact-factory` + +## Context + +ADR 0007 committed to shipping four artifact-generation skills. ADR 0008 +locked the orchestration pattern.
This ADR records the **output contract** +every generated artifact must satisfy — the on-disk shape, the citation +grammar, the `.docmeta.json` schema, and the diagram conventions. + +Without a single authoritative contract, Phase E's deterministic assembler +has no regex to run against, `--refresh` cannot compare section mtimes to +source-artifact mtimes, and cross-repo `See also` footers are impossible +to compute. + +## Decision + +### Directory layout + +**Single-repo mode** (default, no `--group`): + +``` +.codehub/ +├── .context.md # Phase 0 shared context (200-line cap) +├── .prefetch.md # Phase 0 tool-call digest ledger +└── docs/ + ├── README.md # Landing page — written by Phase E + ├── .docmeta.json # Manifest (schema below) + ├── architecture/ + │ ├── system-overview.md + │ ├── module-map.md + │ └── data-flow.md + ├── reference/ + │ ├── public-api.md + │ ├── cli.md # Conditional (CLI package present) + │ └── mcp-tools.md # Conditional (MCP package present) + ├── behavior/ + │ ├── processes.md + │ └── state-machines.md # Conditional + ├── analysis/ + │ ├── risk-hotspots.md + │ ├── ownership.md + │ └── dead-code.md + └── diagrams/ + ├── architecture/components.md + ├── behavioral/sequences.md + └── structural/dependency-graph.md +``` + +**Group mode** (`--group ` or autodetected at a group root): + +Each member repo keeps its own single-repo tree at `/.codehub/docs/`. +The group tree adds cross-repo artifacts only: + +``` +.codehub/groups// +├── .context.md +├── .prefetch.md +└── docs/ + ├── README.md + ├── .docmeta.json + └── cross-repo/ + ├── portfolio-map.md + ├── contracts-matrix.md + └── dependency-flow.md +``` + +**`codehub-contract-map` standalone** writes to +`.codehub/groups//contracts.md` (gitignored) or `docs//contracts.md` +when invoked with `--committed`. + +### Gitignored by default, `--committed` opt-in + +The `.codehub/` tree is gitignored by default. 
The `--committed` flag on +every artifact skill writes under `docs/codehub/` (or the user-supplied +path) without adding a `.gitignore` entry. Regeneration stays safe by +default; users who want version-controlled artifacts make an active choice. + +This is an exception-free rule in v1 — ADRs were the earlier proposed +exception, but `codehub-adr` moved to P1 backlog so the issue is deferred. + +### Citation grammar + +Every factual claim in a generated artifact carries an inline +backtick-wrapped citation. Two forms, both recognized by Phase E: + +- **Single-repo**: `` `:` `` or `` `:-` ``. File-level cites append ` (N LOC)`. +- **Group-qualified**: `` `::` `` — **mandatory** in any file under `cross-repo/` or `contracts.md`. + +#### The Phase E assembler regex + +``` +(?P[a-zA-Z0-9_-]+:)?(?P[^\s`:]+\.[a-zA-Z0-9]+)(?::(?P\d+)(?:-(?P\d+))?)?(?:\s*\((?P\d+)\s*LOC\))? +``` + +One regex for every citation form keeps the assembler ~40 lines of +deterministic code. Matches are scanned only between backtick pairs. + +### `.docmeta.json` schema + +Written by Phase E at the end of every run. Drives `--refresh` and +`codehub status` staleness reporting. + +```json +{ + "$schema": "https://opencodehub.dev/schemas/docmeta-v1.json", + "generated_at": "2026-04-27T18:12:04Z", + "codehub_graph_hash": "sha256:a1b2c3…", + "mode": "single-repo" , + "repo": "opencodehub", + "staleness_at": "2026-04-27T18:12:04Z", + "sections": [ + { + "path": "architecture/system-overview.md", + "agent": "doc-architecture", + "sources": [ + "packages/mcp/src/server.ts", + "packages/mcp/src/index.ts" + ], + "mtime": "2026-04-27T18:11:58Z", + "citation_count": 18, + "mermaid_count": 1 + } + ], + "cross_repo_refs": [ + { + "repo": "billing", + "from_doc": "cross-repo/contracts-matrix.md", + "to_doc": "../../../billing/.codehub/docs/reference/public-api.md", + "contract_count": 4 + } + ] +} +``` + +`cross_repo_refs[]` is emitted only in group mode. 
+`staleness_at` is copied from the `_meta.codehub/staleness` envelope on the +last MCP response the assembler observed. + +### Cross-reference rules + +- **Within a single repo**: if two docs share ≥ 2 citations to the same + source files, Phase E appends `## See also` (3–5 links) to both. + Threshold enforced by the assembler. +- **Group mode**: `cross-repo/*` files additionally receive a + `## See also (other repos in group)` section linking into sibling repos' + generated docs via relative paths rooted at the group directory. +- **Link form**: Markdown reference-style links, not inline URLs — keeps + footers tidy when lists grow. +- **Dedup**: a sibling path appears at most once across both footer + sections. + +### Mermaid conventions + +One diagram type per artifact. Diagrams capped at 20 nodes; overflow goes +into a legend table, never into the diagram. + +| Diagram | Type | Lives in | +|---|---|---| +| Dependency graph | `flowchart LR` | `diagrams/structural/dependency-graph.md` | +| Component view | `classDiagram` | `diagrams/architecture/components.md` | +| Top process | `sequenceDiagram` | `diagrams/behavioral/sequences.md` | +| State machine | `stateDiagram-v2` | `behavior/state-machines.md` (conditional) | +| Data flow | `flowchart TB` | `architecture/data-flow.md` | + +No SVG or PNG generation. Mermaid in fenced ```mermaid blocks only. + +### Determinism guarantees + +- **Deterministic**: file list, directory layout, section ordering, + diagram node set, citation targets, `.docmeta.json` structure. Given + the same `codehub_graph_hash`, two runs produce the same *structure*. +- **Non-deterministic**: prose sentences, diagram edge ordering within a + node (Mermaid renderers are stable, but LLM-emitted source ordering is not), + choice of which 3 processes to render as sequence diagrams among ties.
+- **Explicit call-out**: every generated `README.md` landing page + includes a one-line "Prose is LLM-generated; structure is graph-derived" + note so reviewers treat the diff accordingly. + +### `--refresh` algorithm + +Deterministic, per-section. Avoids regenerating unchanged sections. + +1. Load `.docmeta.json`. +2. Fetch current `codehub_graph_hash` from `list_repos`. If it matches + the manifest's hash exactly, skip to step 5. +3. For each `section`: + - Compute `max(mtime(source))` across `sources[]` (via `stat`). + - If `max(source_mtime) > section.mtime`: mark section stale. +4. Collect the union of stale sections and their `section.agent` owners. + Dispatch only the owning subagents; pass them a `sections_to_refresh` + list so they write only those files. +5. Re-run Phase E over the full tree (cross-reference assembly is cheap + and idempotent). + +Source-mtime comparison is tolerant of the common case where +`codehub analyze` updates the graph but touches only a few files. +Falling back to a full regen when `graph_hash` churns avoids subtle +staleness when node IDs shift. + +### Staleness signals + +- **`codehub status`** reads `.docmeta.json.codehub_graph_hash` and + compares against the live graph hash; reports `docs stale at ` + when different. Bolts into the existing status command at + `packages/cli/src/commands/status.ts`; no new command surface. +- **Phase E writes `staleness_at`** from the last MCP + `_meta.codehub/staleness` envelope observed during assembly. +- **PostToolUse hook** (per spec 001 AC-2-8) emits a non-blocking + `systemMessage` suggesting `/codehub-document --refresh` when the + graph hash changes and `.docmeta.json` exists. + +## Consequences + +### Positive + +- **Single authoritative contract.** The Phase E assembler, the `--refresh` + algorithm, and the staleness hook all read `.docmeta.json` against one + schema. +- **Citations are regex-inverse-indexable.** No AST, no parser, no LLM call + for cross-referencing. 
+- **Gitignored default is low-friction.** Users who don't want to commit + generated docs never have to configure anything. +- **Mermaid-only keeps docs LLM-consumable.** Every diagram round-trips + through Claude Code's paste-as-Markdown flow without binary assets. + +### Negative + +- **Deterministic structure + non-deterministic prose** is a subtle + contract. Reviewers may mistake prose variance for a generation bug. + Mitigated by the one-line disclaimer on every `README.md`. +- **The 20-node Mermaid cap truncates very large graphs.** Overflow + legend tables are readable but less scannable than a rendered diagram. + Accepted tradeoff — Mermaid rendering past 40 nodes is unreliable + across viewers. + +### Neutral + +- **ADR staleness exception dropped for v1.** `codehub-adr` is P1 backlog; + when it ships, the `--committed` default will flip so the ADR + convention (ADRs must be in git) is respected. + +## References + +- `docs/adr/0007-artifact-factory.md` — parent decision +- `docs/adr/0008-codeprobe-pattern-port.md` — orchestration pattern +- `.erpaval/brainstorms/005-opencodehub-output-conventions.md` — original design memo +- `.erpaval/specs/001-claude-code-artifact-surface/spec.md` — AC-4-3 (`.docmeta.json`), AC-3-4 (contract-map output), AC-7-2 (analyze-completion hint) +- `/Users/lalsaado/Projects/codeprobe/src/codeprobe/bootstrap/templates/claude-plugin/skills/document/references/cross-reference-spec.md` — pattern source diff --git a/packages/docs/astro.config.mjs b/packages/docs/astro.config.mjs index 36581825..b2fe006b 100644 --- a/packages/docs/astro.config.mjs +++ b/packages/docs/astro.config.mjs @@ -117,6 +117,10 @@ export default defineConfig({ label: "MCP Server", autogenerate: { directory: "mcp" }, }, + { + label: "Skills", + autogenerate: { directory: "skills" }, + }, { label: "Reference", autogenerate: { directory: "reference" }, diff --git a/packages/docs/src/content/docs/architecture/adrs.md 
b/packages/docs/src/content/docs/architecture/adrs.md index b50ce304..43ce14ee 100644 --- a/packages/docs/src/content/docs/architecture/adrs.md +++ b/packages/docs/src/content/docs/architecture/adrs.md @@ -100,6 +100,58 @@ resolves to the `scip-code` fork rather than upstream `sourcegraph`. [Read ADR 0006](https://github.com/theagenticguy/opencodehub/blob/main/docs/adr/0006-scip-indexer-pins.md) +### ADR 0007 — Artifact factory + +**Status:** Accepted (2026-04-27). + +**Decision:** Ship an artifact-generation skill family inside +`plugins/opencodehub/` that turns the graph into committed Markdown. +Four P0 skills (`codehub-document`, `codehub-pr-description`, +`codehub-onboarding`, `codehub-contract-map`), six `doc-*` subagents, +Phase 0 precompute, `.docmeta.json` + Phase E assembler, PostToolUse +staleness hook, discoverability patches. + +Scope exclusions (durable, not timeline): no hosted/managed/SaaS tier, +no remote/HTTP MCP server, no agent SDK, no `grounding_pack` +compositor tool, no own coding agent, no LLM-based PR review, no +IDE plugin/LSP, no model fine-tuning. + +[Read ADR 0007](https://github.com/theagenticguy/opencodehub/blob/main/docs/adr/0007-artifact-factory.md) + +### ADR 0008 — codeprobe pattern port + +**Status:** Accepted (2026-04-27). + +**Decision:** Port codeprobe's four-phase `/document` pattern (Phase 0 +precompute → Phase AB parallel content → Phase CD parallel diagrams + +specialty → Phase E deterministic assembler) to OpenCodeHub, with +three adaptations: six subagents instead of eight (supply-chain tools +pre-digest), group mode as a first-class topology, and an extended +assembler contract that handles both `path:LOC` and `repo:path:LOC` +citation forms. + +Preserves the pattern invariants verbatim: shared-context files on +disk (not in-prompt copy-paste), eight-section agent scaffold, +deterministic Phase E (no LLM call), `.docmeta.json` as source of +truth for `--refresh`, no YAML frontmatter on outputs. 
+ +[Read ADR 0008](https://github.com/theagenticguy/opencodehub/blob/main/docs/adr/0008-codeprobe-pattern-port.md) + +### ADR 0009 — Artifact output conventions + +**Status:** Accepted (2026-04-27). + +**Decision:** Single authoritative output contract. `.codehub/docs/` +gitignored default; `--committed` opts in to `docs/codehub/`. Backtick +citation grammar with a single Phase E regex covering both single-repo +and group-qualified forms. `.docmeta.json` schema v1 with +`cross_repo_refs[]` for group mode. Mermaid-only diagrams (no +SVG/PNG). 20-node diagram cap with a Legend table for overflow. +Deterministic structure; non-deterministic prose; disclaimer on every +generated `README.md`. + +[Read ADR 0009](https://github.com/theagenticguy/opencodehub/blob/main/docs/adr/0009-artifact-output-conventions.md) + ## Superseded ### ADR 0003 — CI toolchain pins (gopls ↔ Go, pnpm build-script allowlist) diff --git a/packages/docs/src/content/docs/reference/docmeta-schema.mdx b/packages/docs/src/content/docs/reference/docmeta-schema.mdx new file mode 100644 index 00000000..ebfb8df2 --- /dev/null +++ b/packages/docs/src/content/docs/reference/docmeta-schema.mdx @@ -0,0 +1,98 @@ +--- +title: ".docmeta.json schema" +description: "Manifest written by Phase E of codehub-document. Drives --refresh and cross-reference assembly." +--- + +import { Aside, Code } from "@astrojs/starlight/components"; + +`codehub-document` writes a `.docmeta.json` sidecar alongside the generated +Markdown tree at the end of every Phase E run. The file is the source of truth +for `--refresh` and for `codehub status` staleness reporting. + + + +## Schema (v1) + + + +### Top-level fields + +| Field | Type | Meaning | +|---|---|---| +| `$schema` | string | JSON Schema URL for v1. Locked. | +| `generated_at` | ISO-8601 | When Phase E completed. | +| `codehub_graph_hash` | `sha256:` | Taken from `list_repos` at orchestration start. The hash that anchors this doc tree. 
| +| `mode` | `"single-repo" \| "group"` | Whether the tree was produced by single-repo or group invocation. | +| `repo` | string \| null | The target repo (single mode) or the group root's registered repo reference (group mode). | +| `group` | string \| null | The group name (group mode only). | +| `staleness_at` | ISO-8601 | Lifted from the last MCP response's `_meta.codehub/staleness` envelope observed during assembly. | +| `sections[]` | array | One entry per generated Markdown file. | +| `cross_repo_refs[]` | array | Cross-repo links computed by Phase E. Only populated in group mode. | +| `frontmatter_removed[]` | string[] | Paths where Phase E stripped stray YAML frontmatter. Normally empty. | + +### `sections[]` entries + +| Field | Type | Meaning | +|---|---|---| +| `path` | string | Relative path from the docs root. | +| `agent` | string | The subagent that wrote this section (`doc-architecture`, `doc-reference`, etc.). Identifies ownership for `--refresh` dispatch. | +| `sources[]` | string[] | Source-file paths this section cites. Used by `--refresh` to decide staleness via mtime comparison. | +| `mtime` | ISO-8601 | When this section file was last written. | +| `citation_count` | number | Total backtick citations extracted by Phase E. | +| `mermaid_count` | number | Fenced ```` ```mermaid ```` blocks detected. | + +### `cross_repo_refs[]` entries (group mode only) + +| Field | Type | Meaning | +|---|---|---| +| `repo` | string | The sibling repo being linked. | +| `from_doc` | string | Relative path (from the group docs root) of the source doc. | +| `to_doc` | string | Relative path into the sibling repo's generated docs. | +| `contract_count` | number | Number of contracts sharing source citations across this cross-repo pair. Computed from `group_contracts`. | + +## How `--refresh` uses the schema + +1. Load `.docmeta.json`. +2. Compare the manifest's `codehub_graph_hash` against `list_repos`. If they match exactly, skip to step 5. +3. 
For each section, `stat` every `sources[i]`. If `max(source_mtime) > section.mtime`, mark it stale. +4. Collect stale sections + owners (`section.agent`); dispatch only the owning subagents with a `sections_to_refresh` list. +5. Always re-run Phase E (cross-reference assembly is cheap and idempotent). + +See [`references/cross-reference-spec.md`](https://github.com/theagenticguy/opencodehub/blob/main/plugins/opencodehub/skills/codehub-document/references/cross-reference-spec.md) inside the plugin for the Phase E algorithm. + +## Validation + +The JSON Schema is locked at v1. Breaking changes bump to v2 and keep v1 readers working for one release cycle. Run-time validation lives in `packages/analysis/src/docmeta.ts` (written as part of spec 001 Act phase). + +## See also + +- [ADR 0009 — Artifact output conventions](/architecture/adr-0009/) +- [Skills index](/skills/) +- [`codehub-document` skill](/skills/codehub-document/) diff --git a/packages/docs/src/content/docs/skills/codehub-contract-map.mdx b/packages/docs/src/content/docs/skills/codehub-contract-map.mdx new file mode 100644 index 00000000..29050609 --- /dev/null +++ b/packages/docs/src/content/docs/skills/codehub-contract-map.mdx @@ -0,0 +1,89 @@ +--- +title: "codehub-contract-map" +description: "Group-only. Consumer/producer contract matrix across a repo group, with Mermaid flow." +--- + +import { Aside } from "@astrojs/starlight/components"; + +Standalone group-only skill. Renders `group_contracts` into a Markdown + +Mermaid artifact. Fires on direct invocations ("map the contracts") +without needing the full `codehub-document` orchestration. + + + +## Frontmatter + +```yaml +name: codehub-contract-map +argument-hint: " [--output ] [--committed]" +color: magenta +model: sonnet +``` + +## Preconditions + +1. A `` positional argument is required. Missing or unknown group: + `Contract map requires a named group — run 'codehub group list' to see registered groups.` +2. 
Every member repo must be `fresh` per `mcp__opencodehub__group_status`. Stale members abort with named repos. + +## Process + +1. `mcp__opencodehub__group_list` — confirm ``. +2. `mcp__opencodehub__group_status({group})` — confirm freshness per member. +3. `mcp__opencodehub__group_contracts({group})` — the spine. +4. If zero contracts: write the artifact with a "No inter-repo contracts detected" banner. **Don't error** (spec 001 AC-5-5). +5. `mcp__opencodehub__group_query({group, text: "api handlers"})` — disambiguate producer-side locations. +6. `mcp__opencodehub__route_map({repo})` per member — for handler citations. +7. Build the N×N consumer/producer matrix + Mermaid flow + notable-contracts list. +8. Write to the resolved output path. + +## Output shape + +```markdown +# · Contract map + +## Contracts matrix +Rows = producers, columns = consumers. Cell = contract count. + +| | billing | core | web | +|-------|---------|------|-----| +| billing | — | 3 | 5 | +| core | — | — | 12 | +| web | — | — | — | + +## Flow +```mermaid +flowchart LR + web -->|5| billing + web -->|12| core + billing -->|3| core +``` + +## Notable contracts +- **`web:packages/checkout/src/api.ts:22`** → **`billing:packages/api/src/handlers/invoice.ts:45`** + - Method: `POST /v1/invoices` + - Shape: `{amount, userId, idempotencyKey}` +... +``` + + + +## Arguments + +| Flag | Meaning | +|---|---| +| `` (required) | The group to map. Must appear in `group_list`. | +| `--output ` | Override output path. | +| `--committed` | Write to `docs//contracts.md` instead of `.codehub/groups//contracts.md`.
| + +## Related + +- [codehub-document](/opencodehub/skills/codehub-document/) — full group-mode docs +- [ADR 0007 — Artifact factory](/opencodehub/architecture/adr-0007/) +- [Skills index](/opencodehub/skills/) diff --git a/packages/docs/src/content/docs/skills/codehub-document.mdx b/packages/docs/src/content/docs/skills/codehub-document.mdx new file mode 100644 index 00000000..408dafbc --- /dev/null +++ b/packages/docs/src/content/docs/skills/codehub-document.mdx @@ -0,0 +1,121 @@ +--- +title: "codehub-document" +description: "Primary artifact generator. Single-repo and group mode, 4-phase orchestration, .docmeta.json sidecar." +--- + +import { Aside, Tabs, TabItem } from "@astrojs/starlight/components"; + +Primary artifact generator. Ports codeprobe's proven `/document` pattern to +OpenCodeHub's graph and extends it with first-class **group mode**. + +Writes a tree of cross-linked Markdown under `.codehub/docs/` (single-repo) +or `.codehub/groups//docs/` (group mode) plus a `.docmeta.json` +sidecar that drives `--refresh`. + +## Frontmatter + +```yaml +name: codehub-document +argument-hint: "[output-dir] [--group ] [--committed] [--refresh] [--section ]" +color: indigo +model: sonnet +``` + + + +## Preconditions + +1. `mcp__opencodehub__list_repos` returns the target. Otherwise: run `codehub analyze`. +2. `codehub status` reports fresh. Otherwise: run `codehub analyze`. +3. Group mode only: every member repo must be `fresh` per `mcp__opencodehub__group_status`. Stale members abort with named repos. + +## Four-phase orchestration + + + + Inline, no subagent. Writes two shared-context files on disk: + + - **`/.context.md`** (hard 200-line cap) — repo profile, top communities, top processes, routes, MCP tools, owners summary, staleness envelope. Group mode adds the manifest + contracts matrix + freshness table. + - **`/.prefetch.md`** — newline-delimited JSON ledger of tool calls with `{tool, args, sha256, keys, cached_at, truncated}`. 
Subagents read this instead of re-calling tools. + + Prompt dedup via filesystem, not copy-paste. + + + Four subagents dispatched in a single message: + + - `doc-architecture` → `architecture/{system-overview,module-map,data-flow}.md` + - `doc-reference` → `reference/{public-api,cli,mcp-tools}.md` + - `doc-behavior` → `behavior/{processes,state-machines}.md` + - `doc-analysis` → `analysis/{risk-hotspots,ownership,dead-code}.md` + + In group mode, fan-out multiplies by member count (4 × N subagents). + Claude Code's concurrent-Agent ceiling is ~10 per message — groups of + 3+ repos batch by role. + + + Two subagents in parallel: + + - `doc-diagrams` → `diagrams/{architecture,behavioral,structural}/*.md` + - `doc-cross-repo` → `cross-repo/{portfolio-map,contracts-matrix,dependency-flow}.md` *(group mode only)* + + Skipped silently in single-repo mode. + + + **Deterministic Markdown assembly. No LLM call.** + + 1. Regex over backtick `path:LOC` (or `repo:path:LOC`) citations. + 2. Build co-occurrence index: `source_file → [docs_citing_it]`. + 3. For any two docs sharing ≥ 2 common sources, append `## See also` footers. + 4. In group mode: add `## See also (other repos in group)` to every `cross-repo/*.md`. + 5. Write `README.md` (landing page with determinism disclaimer) + `.docmeta.json`. + + Same inputs, same output. See [`.docmeta.json` schema](/opencodehub/reference/docmeta-schema/). + + + +## Arguments + +| Flag | Meaning | +|---|---| +| `[output-dir]` | Where to write. Default `.codehub/docs/` (gitignored). With `--committed`, default flips to `docs/codehub/`. | +| `--group ` | Enable group mode. Phase 0 calls `group_list` + `group_status` + `group_contracts` + `group_query`. Phase CD dispatches `doc-cross-repo`. | +| `--committed` | Write to a committed path instead of `.codehub/docs/`. Does not touch `.gitignore`. | +| `--refresh` | Regenerate only sections whose `sources[]` mtimes are newer than the section's `mtime`. Phase E always re-runs. 
| +| `--section ` | Regenerate one named section (e.g., `architecture/system-overview`). | + +## Invocation examples + +```bash +# Single-repo, default gitignored output +/codehub-document + +# Group mode with an explicit output +/codehub-document docs/platform --group platform --committed + +# Refresh stale sections only +/codehub-document --refresh + +# One-section regenerate +/codehub-document --section architecture/system-overview +``` + +## Output contract + +See [ADR 0009](/opencodehub/architecture/adr-0009/) for the full contract. + +- No YAML frontmatter on outputs. +- Every factual claim carries a backtick `path:LOC` citation (or `repo:path:LOC` in group mode). +- Mermaid diagrams only (no SVG/PNG). +- `.docmeta.json` is the source of truth for `--refresh` and staleness. + +## Related + +- [ADR 0007 — Artifact factory](/opencodehub/architecture/adr-0007/) +- [ADR 0008 — codeprobe pattern port](/opencodehub/architecture/adr-0008/) +- [ADR 0009 — Output conventions](/opencodehub/architecture/adr-0009/) +- [`.docmeta.json` schema](/opencodehub/reference/docmeta-schema/) +- [Skills index](/opencodehub/skills/) diff --git a/packages/docs/src/content/docs/skills/codehub-onboarding.mdx b/packages/docs/src/content/docs/skills/codehub-onboarding.mdx new file mode 100644 index 00000000..10d9074c --- /dev/null +++ b/packages/docs/src/content/docs/skills/codehub-onboarding.mdx @@ -0,0 +1,86 @@ +--- +title: "codehub-onboarding" +description: "ONBOARDING.md with a graph-centrality-ranked reading order and an end-to-end process walk." +--- + +import { Aside } from "@astrojs/starlight/components"; + +Produces a single ONBOARDING.md. The wedge is the **ranked reading order** +drawn from graph centrality — a generic README scaffold cannot produce this. + +## Frontmatter + +```yaml +name: codehub-onboarding +argument-hint: "[output-path] [--committed]" +color: green +model: sonnet +``` + +## Preconditions + +- `mcp__opencodehub__list_repos` must return the target. 
+- `codehub status` must be fresh.
+
+Both refuse loudly with a one-line remediation hint per spec 001 AC-3-1.
+
+## Process
+
+1. `mcp__opencodehub__project_profile` — languages, stacks, entry points.
+2. `mcp__opencodehub__route_map` / `mcp__opencodehub__tool_map` — HTTP / MCP surface.
+3. `mcp__opencodehub__sql` for top-centrality nodes:
+   ```sql
+   SELECT name, file_path, in_degree + out_degree AS centrality
+   FROM nodes
+   WHERE kind IN ('File','Module','Class')
+   ORDER BY centrality DESC
+   LIMIT 15
+   ```
+4. `mcp__opencodehub__context` on the top 8 for one-line summaries.
+5. `mcp__opencodehub__owners` on top 3 folders → "ask these humans" table.
+6. Dispatch one specialty `doc-onboarding` subagent.
+7. Assemble ONBOARDING.md and write to the resolved output path.
+
+## Output shape
+
+```markdown
+# · Onboarding
+
+## TL;DR
+2 sentences — what this repo does + the mental model to hold.
+
+## Stack
+| Layer | Tech | Source |
+
+## Read these 10 files first (in order)
+1. `packages/cli/src/bin.ts` — CLI entry point. (45 LOC)
+2. `packages/mcp/src/server.ts` — MCP bootstrap. (320 LOC)
+... (ranked by centrality)
+
+## Walk one process end-to-end
+(the highest-step-count process, traced step by step)
+
+## Ask these humans
+| Area | Owner | Share |
+
+## Next steps
+- Concrete first actions.
+```
+
+## Arguments
+
+| Flag | Meaning |
+|---|---|
+| `[output-path]` | Where to write. Default: `.codehub/ONBOARDING.md` (gitignored). With `--committed`: `docs/ONBOARDING.md`. |
+| `--committed` | Opt in to a committed path.
|
+
+
+
+## Related
+
+- [codehub-document](/opencodehub/skills/codehub-document/) — for the full architecture book
+- [Skills index](/opencodehub/skills/)
diff --git a/packages/docs/src/content/docs/skills/codehub-pr-description.mdx b/packages/docs/src/content/docs/skills/codehub-pr-description.mdx
new file mode 100644
index 00000000..1f08d605
--- /dev/null
+++ b/packages/docs/src/content/docs/skills/codehub-pr-description.mdx
@@ -0,0 +1,72 @@
+---
+title: "codehub-pr-description"
+description: "Draft a PR body from detect_changes + verdict + owners + findings-delta. Refuses on a clean tree."
+---
+
+Linear skill. No subagents. Sonnet. Writes a Markdown PR body you can
+paste into `gh pr create --body-file` (or let the Claude Code session
+drive the GitHub CLI directly).
+
+## Frontmatter
+
+```yaml
+name: codehub-pr-description
+argument-hint: "[--base ] [--head ] [--out ]"
+color: teal
+model: sonnet
+```
+
+## Preconditions
+
+- `git diff --name-only ..` must return ≥ 1 path. **Refuses on a clean tree** with `No diff detected — resolve base/head or stage changes.`
+
+## Process
+
+1. Resolve `--base` (default `main`) and `--head` (default `HEAD`).
+2. `mcp__opencodehub__detect_changes({base, head})` → affected symbols + processes.
+3. `mcp__opencodehub__verdict({base, head})` → 5-tier merge recommendation.
+4. `mcp__opencodehub__owners({paths})` → required reviewers per path.
+5. `mcp__opencodehub__list_findings_delta({base, head})` → new / resolved scanner findings.
+6. For verdict tier ≥ 3: `mcp__opencodehub__impact({symbol, direction: "downstream", depth: 2})` — spell out who breaks.
+7. For public API changes: `mcp__opencodehub__api_impact({route})` when the diff touches a handler.
+8. Assemble the Markdown body and write to `` (default `.codehub/pr/PR-.md`).
+
+## Output shape
+
+```markdown
+#
+
+## Summary
+2–3 sentences — what changes, why.
+
+## Verdict
+**Tier