fix(cache): revalidate route hardening & ordering (#5 review) by DevHDI · Pull Request #15 · DevHDI/my-bmad

DevHDI · 2026-05-13T20:35:34Z

Summary

Addresses 3 cache/revalidation findings surfaced by the Raven's Verdict review on bmad-method-ui#5.

Each commit is atomic and references its finding.

Findings addressed

#	Severity	Issue	Fix
#8	🔴 Critical	`request.json()` not caught — malformed body returned 500 instead of 400	Wrap in try/catch returning 400 "Invalid JSON"
#6	🔴 Critical	`revalidateTag` called before the GitHub fetch — a failed fetch left the cache invalidated with no recovery	Move `revalidateTag` after both fetch and `prisma.repo.update` succeed
#13	🟡 Moderate	`/api/revalidate` had no rate limit; if the shared secret leaked, an attacker could hammer it	Add per-IP rate limit (30/min) via the existing `checkRateLimit` helper

Commits

15bfa26 fix(api): handle malformed JSON on /api/revalidate as 400
ff0b729 fix(actions): revalidate cache only after successful refresh
93023cf fix(api): rate-limit /api/revalidate per client IP

Verification

pnpm tsc --noEmit — clean (after clearing stale .next/types cache)
pnpm lint — 0 errors (2 pre-existing warnings unrelated)
pnpm test — 151/151 passing

Test plan

curl -X POST /api/revalidate -H "x-revalidate-secret: …" -d 'not-json' → expect HTTP 400 (was 500)
Simulate a GitHub getTree failure in refreshGitHubRepo → cache tag is NOT invalidated; existing cached entry remains
40 rapid POST requests to /api/revalidate from one IP → expect 429 after request 30 within 60s

Context

PR 2 of 4 addressing bmad-method-ui#5 review. Follow-ups:

PR 1 ✅ security & input hardening (feat(bmad): BMAD V1 compatibility — dynamic output, frontmatter-driven status, epic-folder layout #7, fix(security): authorization & input hardening (#5 review) #14, feat: local folder support + email/password auth #2) — open as #14
PR 3 ⏳ docker & env hygiene (fix(local): scan only BMAD dirs to avoid file count limit #3, Test #10, Test #11)
PR 4 ⏳ observability (fix(cache): revalidate route hardening & ordering (#5 review) #15, fix(infra): docker fail-fast & env hygiene (#5 review) #16, chore(observability): admin metrics & error logging (#5 review) #17)

await request.json() throws SyntaxError on empty bodies, wrong Content-Type, or truncated payloads. The error was not caught, so the endpoint returned an unstructured 500 instead of a 400, and bypassed any error-monitoring middleware that expects structured 4xx responses. Wrap the parse in try/catch and return a clear 400 "Invalid JSON". Addresses finding #8 from bmad-method-ui#5 review.

refreshGitHubRepo called revalidateTag *before* the GitHub fetch. When the fetch threw (network error, rate limit, 404), the cache was already invalidated: subsequent renders fetched stale or empty data with no recovery until a future successful refresh. Move revalidateTag to the end of the function, after the GitHub fetch and the prisma.repo.update both succeed. If either step fails, the existing cached entry remains valid. Addresses finding #6 from bmad-method-ui#5 review.

The endpoint was protected by a shared secret with no rate limit of its own. If the secret were ever obtained (logs, accidental commit, compromised CI), an attacker could issue unlimited revalidateTag calls to drive continuous cache invalidation and elevate upstream load. Add a per-IP limit using the existing in-memory checkRateLimit helper: 30 requests per minute, which comfortably exceeds any legitimate build/deploy cadence. The IP is read from x-forwarded-for (first hop) with x-real-ip and a literal "unknown" fallback so a missing header cannot bypass the bucket. Defense in depth — the secret remains the primary protection. Addresses finding #13 from bmad-method-ui#5 review.

The per-IP pre-auth limit added in 93023cf trusted the x-forwarded-for header, which is client-controlled. In the very threat the limit was meant to mitigate ("secret leaked"), the attacker can rotate x-forwarded-for on every request and obtain a fresh bucket each time, neutralising the cap. Restructure the handler: 1. Validate the shared secret first. timingSafeEqual on SHA-256 digests is computationally resistant to brute force, so no pre-auth quota is needed. 2. Apply a *post-auth* rate limit with a fixed key `revalidate:authenticated`. This caps total invalidations across all valid-secret holders, so even with a leaked secret an attacker is bounded to 30/min regardless of source IP. Drop the now-unused getClientIp helper. Addresses follow-up review on 93023cf.

DevHDI · 2026-05-13T21:04:36Z

Follow-up fix: rate-limit was bypassable via x-forwarded-for spoofing

Reviewer's finding was correct. The per-IP pre-auth limit added in 93023cf trusted a client-controlled header, so the very scenario it was meant to mitigate (secret leaked) could be bypassed by rotating x-forwarded-for on every request.

Fix in `8dfe59d`

Restructured src/app/api/revalidate/route.ts:

Validate the shared secret first. timingSafeEqual on SHA-256 digests makes brute force computationally infeasible (2^256 search space), so no pre-auth quota is meaningful.
Apply a post-auth rate limit with a fixed global key (revalidate:authenticated). This caps total invalidations across all valid-secret holders — even with a leaked secret, an attacker is bounded to 30/min regardless of source IP.
Dropped the now-unused getClientIp helper.

Verification

pnpm tsc --noEmit — clean
pnpm lint — 0 errors (2 pre-existing warnings)
pnpm test — 151/151

Note

Considered keeping a layered approach (pre-auth IP-or-global + post-auth global) but rejected it: pre-auth limits without trusted IP source are theatre, and the secret check itself is fast enough that pathological 401-spam is bounded by the runtime's request handling capacity. Single post-auth global bucket is the honest fix.

fetchFileContent wraps provider.extendBmadDirs in `try { ... } catch {}` with a comment relying on the downstream getFileContent → assertSafePath check to enforce the whitelist invariant. That reasoning is still correct (assertSafePath refuses any path whose top-level segment is not in the whitelist, so an extension that failed to apply cannot be exploited), but discarding the error gives zero visibility when the validation path is exercised in production. Replace the silent catch with a console.warn that captures the owner / name / segment / error message. No behavior change — the authoritative refusal still happens in assertSafePath if the segment was not added. Addresses finding #15 from bmad-method-ui#5 review.

* fix(admin): show parsingErrorRate as N/A until tracking exists The admin dashboard displayed `0%` for "Parsing Errors" with the description "Target KPI: < 1%", suggesting the metric was working when in fact no parsing-error log table exists yet — the value was a hardcoded zero. Admins reading "0%" would conclude the system has no parsing errors when the reality is the data is not collected. Change `parsingErrorRate` from `number` to `number | null` in the UsageMetrics interface, return `null` from getUsageMetrics(), and have the stats card render "N/A" with the description "Not yet tracked" when the value is null. The existing display path (number + "%") is preserved for when a real metric is wired up. Addresses finding #17 from bmad-method-ui#5 review. * fix(actions): represent detectBmadRepos batch failures as null When a GraphQL batch fails (transient network error, GitHub hiccup), the catch handler set every repo in that chunk to `false`, which is the same value used for "confirmed no BMAD". A caller asking the UI "does this repo have BMAD?" got "no" — when the truthful answer was "we don't know, the API call failed". Change the result map type from `Record<string, boolean>` to `Record<string, boolean | null>`: - true → BMAD detected - false → confirmed absent - null → detection failed for this repo (unknown) The single in-repo consumer (add-repo-dialog) already coerces `bmadResult.data[r.fullName] ?? false` at the boundary so existing behavior is preserved; surfacing the unknown count in the UI is a follow-up that can build on this contract. Addresses finding #16 from bmad-method-ui#5 review. * chore(actions): log extendBmadDirs failures instead of silently catching fetchFileContent wraps provider.extendBmadDirs in `try { ... } catch {}` with a comment relying on the downstream getFileContent → assertSafePath check to enforce the whitelist invariant. That reasoning is still correct (assertSafePath refuses any path whose top-level segment is not in the whitelist, so an extension that failed to apply cannot be exploited), but discarding the error gives zero visibility when the validation path is exercised in production. Replace the silent catch with a console.warn that captures the owner / name / segment / error message. No behavior change — the authoritative refusal still happens in assertSafePath if the segment was not added. Addresses finding #15 from bmad-method-ui#5 review. * fix(dialog): surface unknown-detection count to the user Commit 8090e03 had the server return `null` for repos in a failed batch, but add-repo-dialog still applied `?? false` at the boundary and rendered the repo identically to a confirmed "no BMAD". The data contract was honest but the user saw the same lie as before. Count the null values returned by detectBmadRepos and surface that count via a new `unknownDetectCount` state. When > 0 the dialog shows an amber warning ("BMAD detection failed for N repositor(y/ies)…") in the same area as the existing detection status. The repo list itself stays usable; only the misleading badge gets the disclaimer. State name is intentionally distinct from existing/planned warning state in PR #14 so the two surfaces can coexist when both PRs merge. Addresses the follow-up review on 8090e03. * fix(dialog): use complete words for repository pluralization The previous JSX split the word as `repositor` + `{count === 1 ? "y" : "ies"}`. React preserves whitespace between text and expressions, so the rendered output had a space in the middle of the word ("repositor y" / "repositor ies"). Move the whole word into the conditional ("repository" / "repositories") and use an explicit `{" "}` to control the space between the count and the word. The dash and trailing text are also on their own lines, which JSX collapses to single spaces correctly. Addresses the follow-up review on 56567db.

DevHDI added 4 commits May 13, 2026 22:34

This was referenced May 13, 2026

fix(infra): docker fail-fast & env hygiene (#5 review) #16

Merged

chore(observability): admin metrics & error logging (#5 review) #17

Merged

DevHDI merged commit 2c6c85d into main May 13, 2026
2 checks passed

DevHDI deleted the fix/revalidate-route-hardening branch May 13, 2026 22:00

DevHDI mentioned this pull request May 13, 2026

fix(security): authorization & input hardening (#5 review) #14

Merged

6 tasks

DevHDI mentioned this pull request May 13, 2026

feat: import MyBMAD web dashboard alongside VS Code extension bmad-code-org/bmad-method-ui#5

Merged

9 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cache): revalidate route hardening & ordering (#5 review)#15

fix(cache): revalidate route hardening & ordering (#5 review)#15
DevHDI merged 4 commits into
mainfrom
fix/revalidate-route-hardening

DevHDI commented May 13, 2026

Uh oh!

DevHDI commented May 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

DevHDI commented May 13, 2026

Summary

Findings addressed

Commits

Verification

Test plan

Context

Uh oh!

DevHDI commented May 13, 2026

Follow-up fix: rate-limit was bypassable via x-forwarded-for spoofing

Fix in 8dfe59d

Verification

Note

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Fix in `8dfe59d`