diff --git a/.claude/memory/MEMORY.md b/.claude/memory/MEMORY.md index 2535f15..5cd594b 100644 --- a/.claude/memory/MEMORY.md +++ b/.claude/memory/MEMORY.md @@ -7,7 +7,8 @@ Keep this file under 200 lines — anything longer is content bloat, not memory. - [learnings/sentry-mcp-no-comment-tool](learnings/2026-05-26-sentry-mcp-no-comment-tool.md) — Sentry MCP can't comment on issues; back-link from Linear only - [learnings/vercel-blocks-unknown-author-email](learnings/2026-05-26-vercel-blocks-unknown-author-email.md) — Vercel preview deploys block when commit author email has no GitHub account; use the noreply alias - [learnings/sandbox-cant-clone-private-repo](learnings/2026-05-26-sandbox-cant-clone-private-repo.md) — Don't `git clone` from sandbox bash; the `github_repository` resource is auth'd, raw `git clone` is not -- [learnings/github-mcp-strips-html-comments](learnings/2026-05-26-github-mcp-strips-html-comments.md) — `update_pull_request` silently strips `` from PR bodies; session-id marker can't be set by agent (ENG-25) +- [learnings/github-mcp-strips-html-comments](learnings/2026-05-26-github-mcp-strips-html-comments.md) — `create_pull_request` AND `update_pull_request` strip `` from PR bodies; ENG-25 resolved by moving the session-id marker to a plain-text line +- [learnings/regex-last-match-semantics](learnings/2026-05-26-regex-last-match-semantics.md) — `String.match(/.../m)` returns FIRST match; for "last line of body" use `matchAll(...).at(-1)` (PR #20 bug) - [learnings/recheck-open-prs-at-pr-open](learnings/2026-05-26-recheck-open-prs-at-pr-open.md) — session-start grep is insufficient; re-grep `list_pull_requests` right before `create_pull_request` to catch parallel-session races (PR #18 vs #20 ENG-25) - [learnings/closed-pr-receives-review-after-close](learnings/2026-05-26-closed-pr-receives-review-after-close.md) — Closed PR can still get a REQUEST_CHANGES review seconds after dup-close; don't reopen, surface to sibling PR + Linear (PR #18 vs #20 ENG-25 race) @@ -16,6 +17,6 @@ Keep this file under 200 lines — anything longer is content bloat, not memory. - [decisions/two-agent-builder-reviewer](decisions/2026-05-25-two-agent-builder-reviewer.md) — Separate agents for build vs. review so the reviewer reads diffs cold ## Conventions -- [conventions/pr-session-id-marker](conventions/pr-session-id-marker.md) — PR body MUST end with `` (or legacy `sthr_...`) so webhooks can resume +- [conventions/pr-session-id-marker](conventions/pr-session-id-marker.md) — PR body MUST end with plain-text `session-id: sesn_...` on its own line; legacy `` shape also matched but stripped by MCP (ENG-25) - [conventions/agent-review-marker](conventions/agent-review-marker.md) — Reviewer's verdict goes on the first line as `AGENT_REVIEW: APPROVED|REQUEST_CHANGES|ESCALATE — ` - [conventions/check-open-pr-before-ticket-pickup](conventions/check-open-pr-before-ticket-pickup.md) — Before branching for a Linear ticket, grep open PRs for the ticket ID; abort if one already exists (PR #15 vs #16 ENG-26 race) diff --git a/.claude/memory/conventions/pr-session-id-marker.md b/.claude/memory/conventions/pr-session-id-marker.md index 4bad662..2df86a9 100644 --- a/.claude/memory/conventions/pr-session-id-marker.md +++ b/.claude/memory/conventions/pr-session-id-marker.md @@ -1,10 +1,10 @@ # PR body session-id marker -Every PR opened by the manager MUST end with this HTML comment as the -**last line** of the PR body: +Every PR opened by the manager MUST end with a `session-id:` marker on +its own line as the last non-empty line of the PR body: ``` - +session-id: sesn_xxxxxxxxxxxxxxxx ``` The `/api/github-webhook` route extracts this marker when a webhook @@ -12,17 +12,21 @@ fires for the PR (e.g. `issue_comment.created` with `AGENT_REVIEW: APPROVED`). With it, the webhook resumes the original manager session — full implementation context, no re-explaining. -The webhook regex accepts both the legacy `sthr_` prefix and the -current `sesn_` prefix returned by `client.beta.sessions.create()`. -Use whichever prefix the kickoff `user.message` carries. +## Accepted shapes -Without it, the webhook falls back to creating a fresh session. The -fresh session loses all design rationale and re-derives everything. -Functional but wasteful. +The webhook regex accepts: -The kickoff `user.message` includes the actual session id. Substitute -it verbatim; don't paraphrase or omit. +- Plain-text line: `session-id: sesn_...` — **preferred**. Survives + the GitHub MCP `update_pull_request` and `create_pull_request` body + filters (ENG-25). Always use this on new PRs. +- Legacy HTML comment: `` — still + matched for back-compat with PRs opened before ENG-25 landed, but + the agent cannot write this shape on a new PR because MCP strips it + on body create AND update. Read-only acceptance. -The regex in `app/api/github-webhook/route.ts` accepts either -`sesn_` (current SDK prefix) or `sthr_` (legacy) — both are valid. -Use whatever the kickoff hands you. +Both `sesn_` (current SDK prefix) and `sthr_` (legacy) are accepted. +Use whatever the kickoff `user.message` carries, verbatim. + +Without the marker, the webhook falls back to creating a fresh +session. The fresh session loses all design rationale and re-derives +everything. Functional but wasteful. diff --git a/.claude/memory/learnings/2026-05-26-github-mcp-strips-html-comments.md b/.claude/memory/learnings/2026-05-26-github-mcp-strips-html-comments.md index 5e8fde0..645bf8a 100644 --- a/.claude/memory/learnings/2026-05-26-github-mcp-strips-html-comments.md +++ b/.claude/memory/learnings/2026-05-26-github-mcp-strips-html-comments.md @@ -1 +1 @@ -The GitHub MCP `update_pull_request` tool unconditionally strips `` HTML comments from the `body` field before persisting — the PATCH fires (other prose changes land, `updated_at` advances), but any HTML comment line silently disappears. Confirmed twice on PR #12 review-feedback round 2 with both `sesn_...` and `sthr_...` prefixes; the filter isn't gated on content. This makes the `pr-session-id-marker` convention unenforceable by the agent via the standard transport — there's no `GITHUB_TOKEN` in the sandbox env and `api.github.com` is outside the egress allowlist, so you can't route around MCP with a direct PATCH. Don't waste review rounds on it — see ENG-25 for the fix proposal and the open question of whether `create_pull_request` shares the same filter (worth opportunistic testing whenever you open a new PR). +The GitHub MCP `update_pull_request` AND `create_pull_request` tools both unconditionally strip `` HTML comments from the `body` field before persisting — the PATCH/POST fires (other prose changes land, `updated_at` advances), but any HTML comment line silently disappears. Confirmed across PR #12 review-feedback round 2 (`update_pull_request`, both `sesn_...` and `sthr_...` prefixes) and PR #17 initial open (`create_pull_request`, `sesn_...`). The filter is unconditional, not gated on content. Worse: the strip is **greedy** — an unclosed `` in the same paragraph) causes the MCP to drop everything from that point to the end of the body. PR #20's first body version hit this and ate its own trailing session-id marker mid-PR-creation. Workaround in prose: refer to the token by phrase (\"HTML-comment opener\", \"the HTML-comment shape\") rather than as a literal substring. ENG-25 resolved the convention problem by moving the `pr-session-id-marker` to a plain-text line shape that MCP doesn't filter; the webhook regex still accepts the legacy HTML form for back-compat with already-merged PRs. Don't write HTML-comment markers on new PRs — they'll vanish and waste review rounds. diff --git a/.claude/memory/learnings/2026-05-26-regex-last-match-semantics.md b/.claude/memory/learnings/2026-05-26-regex-last-match-semantics.md new file mode 100644 index 0000000..5ef561c --- /dev/null +++ b/.claude/memory/learnings/2026-05-26-regex-last-match-semantics.md @@ -0,0 +1,18 @@ +# Regex "last match" needs `/g` + matchAll, not just `/m` + +`String.prototype.match(/.../m)` returns the **first** occurrence; the `m` +flag only line-anchors `^` / `$`. If your convention says "the marker is +the last line of the body" (e.g. `pr-session-id-marker`), you must take +the last match explicitly: + +```ts +const all = [...text.matchAll(/^\s*marker:\s*(\w+)\s*$/gm)]; +const value = all.length ? all[all.length - 1][1] : null; +``` + +PR #20 shipped the buggy form first and a docs-only code-block placeholder +(`marker: xxxxxxxxxxxxxxxx`) won against the real trailer, silently. The +unit test at `tests/unit/extract-session-id.spec.ts` now guards this case. + +General rule: any time the contract is "the LAST occurrence of X in a +multi-line body," reach for `matchAll(...).at(-1)`, not `match(... /m)`. diff --git a/app/api/github-webhook/route.ts b/app/api/github-webhook/route.ts index 15fc659..79df217 100644 --- a/app/api/github-webhook/route.ts +++ b/app/api/github-webhook/route.ts @@ -1,5 +1,6 @@ import Anthropic from '@anthropic-ai/sdk'; import { createHmac, timingSafeEqual } from 'node:crypto'; +import { extractSessionId } from '@/lib/extract-session-id'; export const runtime = 'nodejs'; export const maxDuration = 60; @@ -15,12 +16,6 @@ function verifySignature(rawBody: string, sigHeader: string | null, secret: stri return timingSafeEqual(a, b); } -function extractSessionId(text: string | undefined | null): string | null { - if (!text) return null; - const m = text.match(//); - return m?.[1] ?? null; -} - async function fetchPrBody(prNumber: number, token: string): Promise { const res = await fetch( `https://api.github.com/repos/WillTaylor22/self-managing-codebase/pulls/${prNumber}`, diff --git a/lib/extract-session-id.ts b/lib/extract-session-id.ts new file mode 100644 index 0000000..2df4fbb --- /dev/null +++ b/lib/extract-session-id.ts @@ -0,0 +1,21 @@ +// Pulled out of `app/api/github-webhook/route.ts` so it can be unit-tested +// without instantiating Next's route module. Pure, no I/O. +// +// Accepts two shapes (see .claude/memory/conventions/pr-session-id-marker.md): +// 1. Plain-text line: session-id: sesn_xxxx (preferred — MCP-survivable, ENG-25) +// 2. HTML comment: (legacy, read-only-accepted) +// +// Convention says the marker is the LAST non-empty line of the PR body. We +// honor that for the plain shape so a placeholder inside a fenced code block +// earlier in the body doesn't beat the real trailer (PR #20 review). +// +// The HTML shape is left first-wins: distinctive token, and we don't author +// it anymore so duplicate-in-prose risk is low. +export function extractSessionId(text: string | undefined | null): string | null { + if (!text) return null; + const html = text.match(//); + if (html) return html[1]; + const plain = [...text.matchAll(/^\s*session-id:\s*((?:sthr_|sesn_)[A-Za-z0-9]+)\s*$/gm)]; + if (plain.length) return plain[plain.length - 1][1]; + return null; +} diff --git a/playwright.config.ts b/playwright.config.ts index 8a2f65c..b50210f 100644 --- a/playwright.config.ts +++ b/playwright.config.ts @@ -3,7 +3,12 @@ import { defineConfig, devices } from '@playwright/test'; const baseURL = process.env.BASE_URL ?? 'http://localhost:3000'; export default defineConfig({ - testDir: './tests/e2e', + // Picks up both `tests/e2e/*.spec.ts` and `tests/unit/*.spec.ts`. + // Unit specs (e.g. extract-session-id) are pure-function tests that + // simply don't touch the `page` fixture; they still run under the + // chromium project but never spawn a browser context. + testDir: './tests', + testMatch: ['e2e/**/*.spec.ts', 'unit/**/*.spec.ts'], globalSetup: './tests/global-setup.ts', fullyParallel: true, forbidOnly: !!process.env.CI, diff --git a/tests/unit/extract-session-id.spec.ts b/tests/unit/extract-session-id.spec.ts new file mode 100644 index 0000000..78de6b1 --- /dev/null +++ b/tests/unit/extract-session-id.spec.ts @@ -0,0 +1,92 @@ +import { test, expect } from '@playwright/test'; +import { extractSessionId } from '../../lib/extract-session-id'; + +// Pure-function unit tests for the webhook's session-id marker extractor. +// No browser, no network — Playwright is just our test runner here. +// +// The shape contract is documented in +// .claude/memory/conventions/pr-session-id-marker.md. + +test.describe('extractSessionId', () => { + test('null body → null', () => { + expect(extractSessionId(null)).toBeNull(); + }); + + test('undefined body → null', () => { + expect(extractSessionId(undefined)).toBeNull(); + }); + + test('empty body → null', () => { + expect(extractSessionId('')).toBeNull(); + }); + + test('plain-text marker as the last line', () => { + const body = ['# Some PR', '', 'Body text.', '', 'session-id: sesn_abc123'].join('\n'); + expect(extractSessionId(body)).toBe('sesn_abc123'); + }); + + test('plain-text marker tolerates trailing whitespace / blank lines', () => { + const body = '# Some PR\n\nsession-id: sesn_abc123 \n\n\n'; + expect(extractSessionId(body)).toBe('sesn_abc123'); + }); + + test('sthr_ legacy prefix accepted on the plain shape', () => { + const body = 'Body.\n\nsession-id: sthr_legacy999'; + expect(extractSessionId(body)).toBe('sthr_legacy999'); + }); + + test('HTML-comment marker (legacy shape) still matches', () => { + const body = 'Body.\n\n'; + expect(extractSessionId(body)).toBe('sesn_html42'); + }); + + test('HTML shape wins over plain when both are present (legacy precedence)', () => { + // Documented behavior — legacy PRs may have both during transition. + const body = 'session-id: sesn_plain1\n\n'; + expect(extractSessionId(body)).toBe('sesn_html2'); + }); + + test('placeholder in a fenced code block does NOT beat the real trailer (PR #20)', () => { + // This is the exact regression: an earlier `session-id:` line that's + // documentation (e.g. inside a ```code``` block) used to win because + // .match without /g returns the first occurrence. The real marker is + // the last non-empty line; that must be what we return. + const body = [ + '## Fix', + '', + 'The canonical shape is now a plain-text line on its own:', + '', + '```', + 'session-id: sesn_xxxxxxxxxxxxxxxx', + '```', + '', + 'More prose here.', + '', + 'session-id: sesn_012j21sUvdmnhx3baX6ivYLW', + ].join('\n'); + expect(extractSessionId(body)).toBe('sesn_012j21sUvdmnhx3baX6ivYLW'); + }); + + test('multiple plain markers → last one wins', () => { + const body = 'session-id: sesn_first\nstuff\nsession-id: sesn_second\nmore\nsession-id: sesn_third'; + expect(extractSessionId(body)).toBe('sesn_third'); + }); + + test('inline mention of "session-id:" in prose does NOT match', () => { + // The marker must be on its own line. A sentence like "the session-id: foo line" + // should not produce a false match — note the `^\s* ... \s*$` anchors. + const body = 'In the body we mention the session-id: sesn_inline in prose. End.'; + expect(extractSessionId(body)).toBeNull(); + }); + + test('unknown prefix (not sthr_/sesn_) does not match', () => { + const body = 'session-id: zzzz_notvalid'; + expect(extractSessionId(body)).toBeNull(); + }); + + test('plain marker indented by leading whitespace is still matched', () => { + // Tolerated by `^\s*` — quoted-block / list-indented bodies still work. + const body = 'Body.\n\n session-id: sesn_indented'; + expect(extractSessionId(body)).toBe('sesn_indented'); + }); +});