Skip to content

chore(proto): auto-commit generated artifacts for internal PRs#27

Open
lspassos1 wants to merge 5 commits into
mainfrom
chore/sebuf-codegen-ci
Open

chore(proto): auto-commit generated artifacts for internal PRs#27
lspassos1 wants to merge 5 commits into
mainfrom
chore/sebuf-codegen-ci

Conversation

@lspassos1

Copy link
Copy Markdown
Owner

Summary

Updates the proto generation workflow so same-repository PR branches can receive an automatic generated-artifacts commit after make generate, reducing generated-code churn in human-authored PRs. Fork PRs keep a safe verification-only path because CI should not push privileged commits back to fork branches.

Closes #22.
Refs koala73#3340.

Root cause

The existing proto workflow only checks generated artifact freshness and fails when src/generated/ or docs/api/ drift. That forces contributors to carry generated diffs in the same PR and creates review noise around Sebuf/OpenAPI changes.

Changes

  • Splits checkout behavior for internal PR branches and fork-origin PRs.
  • Grants contents: write so internal PR branches can receive a bot commit when generated artifacts drift.
  • Keeps fork-origin PRs on a safe make generate + diff verification path with clearer failure messaging.
  • Documents the generated-artifact workflow in CONTRIBUTING.md.

Validation

  • Confirmed the fork main branch is aligned with koala73/main before branching.
  • Reviewed the GitHub compare diff: 2 files changed, 29 workflow additions, 8 docs additions.
  • Local shell execution is currently unavailable in this session, so make generate was not run locally. This PR should validate the workflow path through GitHub Actions.

Risk

Moderate workflow risk: this changes CI behavior and allows bot pushes only for same-repository PRs. Fork PRs remain verification-only to avoid privileged execution on untrusted branches.

@vercel

vercel Bot commented Apr 27, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
worldmonitor Ignored Ignored Preview Apr 28, 2026 4:41pm

greptile-apps[bot]

This comment was marked as off-topic.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cee229746f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .github/workflows/proto-check.yml Outdated
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
git add src/generated/ docs/api/
git commit -m "chore(proto): update generated artifacts"
git push origin HEAD:${{ github.event.pull_request.head.ref }}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Quote branch ref in git push command

The git push origin HEAD:${{ github.event.pull_request.head.ref }} call interpolates an unquoted branch name directly into a shell command. Git branch names can include shell metacharacters (for example ;), so a same-repo PR branch with a crafted name can inject additional commands in this step and execute them with the job’s contents: write token. This makes the auto-commit path unsafe for internal PR branches unless the ref is quoted/escaped or validated before use.

Useful? React with 👍 / 👎.

Repository owner deleted a comment from coderabbitai Bot Apr 27, 2026
Repository owner deleted a comment from coderabbitai Bot Apr 27, 2026
Repository owner deleted a comment from coderabbitai Bot Apr 27, 2026
coderabbitai[bot]

This comment was marked as off-topic.

@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Repository owner deleted a comment from coderabbitai Bot Apr 27, 2026
@lspassos1

Copy link
Copy Markdown
Owner Author

Addressed the bot review item in 1dc9e0e.

Change:

  • Moved the PR head ref into the step environment and quoted it in the git push target so the shell does not evaluate an interpolated branch name directly.

Validation run locally:

  • ruby -e "require 'yaml'; YAML.load_file('.github/workflows/proto-check.yml'); puts 'yaml ok'"
  • git diff --check

@coderabbitai

coderabbitai Bot commented Apr 28, 2026

Copy link
Copy Markdown
📝 Walkthrough

Summary by CodeRabbit

  • Chores
    • Enhanced CI/CD workflow to automatically manage and verify consistency of generated artifacts
    • Updated contributor documentation to clarify expected workflows for handling generated artifacts in pull requests

Walkthrough

The CI proto-check workflow is split by PR origin: same-repo PRs run generation and may auto-commit generated artifacts back to the branch; fork-origin PRs only verify generated artifacts against the merge result and fail on missing/stale generated files. CONTRIBUTING.md documents this contract.

Changes

Cohort / File(s) Summary
Proto Code Generation Workflow
.github/workflows/proto-check.yml
Refactors proto-check into three behaviors based on PR origin: fork PRs run freshness checks against the merge commit and fail on stale/untracked generated outputs; internal PRs run make generate, auto-commit/push regenerated artifacts using contents: write, then re-run verification against the merge result (skipping enforcement when auto-push occurred). Verification messaging updated to "fresh against the PR merge result."
Contribution Guidelines
CONTRIBUTING.md
Adds contributor guidance requiring committed generated artifacts (src/generated/, docs/api/), explains CI auto-generation and auto-commit behavior for same-repo PRs, and clarifies fork PRs must run make generate locally and commit updates before review.

Sequence Diagram

sequenceDiagram
    participant GH as GitHub (PR)
    participant CI as CI Workflow
    participant Repo as Repository
    participant Git as Git API

    alt Internal PR (same-repo)
        GH->>CI: trigger proto-check
        CI->>Repo: checkout PR head
        CI->>CI: run make generate
        CI->>CI: diff `proto/` vs `src/generated/` & `docs/api/`
        alt changes detected
            CI->>Git: write updated generated artifacts (contents: write)
            Git->>Repo: push commit to PR branch
            CI->>CI: skip merge-result enforcement or re-run verification marked as satisfied
        else no changes
            CI->>CI: report "fresh against the PR merge result"
        end
    else Fork-origin PR
        GH->>CI: trigger proto-check
        CI->>Repo: checkout merge commit
        CI->>CI: run make generate
        CI->>CI: diff `proto/` vs generated outputs
        alt stale or untracked outputs
            CI->>CI: fail with explicit message instructing local generation & commit
        else fresh
            CI->>CI: report "fresh against the PR merge result"
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I nibble on bytes and spin up a cheer,

Generated protos now tidy and near.
Internal branches get a gentle shove,
Forks stay safe — run generate, commit with love.
Hops of joy for CI that helps us move.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Description check ❓ Inconclusive The PR description covers the summary, root cause, changes, validation approach, and risk assessment. However, it does not follow the repository's standard template structure (Type of change, Affected areas, Checklist, Screenshots). Consider whether the custom description format is acceptable for CI/infrastructure changes, or align with the standard template by adding Type of change (CI/Build/Infrastructure) and relevant checklist items.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'chore(proto): auto-commit generated artifacts for internal PRs' clearly and directly describes the main change—enabling automatic commits of generated artifacts for internal PR branches.
Linked Issues check ✅ Passed The PR fulfills all coding requirements from issue #22: internal PRs auto-commit generated artifacts, fork PRs remain verification-only, generated files stay checked in, and contributor docs are updated. All acceptance criteria are marked complete.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the linked issue #22: workflow refactoring for proto generation, permission updates for internal PR branches, and documentation updates. No unrelated changes detected.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch chore/sebuf-codegen-ci

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/proto-check.yml:
- Around line 50-64: Add a concurrency group to the workflow job that performs
the "Commit generated artifacts for internal PRs" push so stale runs for the
same PR are cancelled before they attempt to push; add a top-level concurrency
block on the job (jobs.<that job id>.concurrency) using a key based on the PR
number or head ref (e.g., github.event.pull_request.number or github.head_ref)
and set cancel-in-progress: true to ensure only the latest run for that PR
proceeds to the git push step.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: d7cabb1a-0f80-4c4d-bd21-dc5566ccf66f

📥 Commits

Reviewing files that changed from the base of the PR and between 3741676 and 1dc9e0e.

📒 Files selected for processing (2)
  • .github/workflows/proto-check.yml
  • CONTRIBUTING.md

Comment thread .github/workflows/proto-check.yml Outdated
@lspassos1

Copy link
Copy Markdown
Owner Author

Addressed the latest CodeRabbit item in d3dff9a.

Change:

  • Added job-level concurrency for proto-freshness keyed by pull request number with cancel-in-progress enabled, so stale runs for the same PR are cancelled before they can race on the generated-artifact push.

Validation run locally:

  • ruby -e "require 'yaml'; YAML.load_file('.github/workflows/proto-check.yml'); puts 'yaml ok'"
  • git diff --check

@lspassos1

Copy link
Copy Markdown
Owner Author

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Nice work!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/proto-check.yml:
- Around line 39-42: Update the codegen-detection logic so changes to the
Makefile trigger regeneration: include "Makefile" in the GENERATED_CHANGED check
(or add a separate grep for Makefile) so the existing if condition that uses
PROTO_CHANGED and GENERATED_CHANGED will treat Makefile edits as codegen inputs;
specifically modify the GENERATED_CHANGED assignment that currently checks
'^(src/generated/|docs/api/)' to also match 'Makefile' so the if [ -n
"$PROTO_CHANGED" ] && [ -z "$GENERATED_CHANGED" ] branch will block PRs that
change Makefile without regenerated artifacts.
- Around line 97-117: The current workflow pushes generated-artifact commits in
the "Commit generated artifacts" step (id: commit) using the default
GITHUB_TOKEN which will not trigger subsequent workflow runs, so the downstream
job internal-merge-freshness may never run; fix this by moving the merge-result
verification into the same job before the git push (i.e., run the
freshness/merge checks in-band after generation and before the git commit/push
in the commit step), or alternatively replace the push with a trigger that will
fire workflows (use a repository PAT or a workflow_dispatch/dispatch API call
from this job to create a new run) and adjust the logic that checks
needs.internal-auto-generate.outputs.pushed to reflect the new trigger
mechanism.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: 7f74b6a1-411d-4199-ae95-9a1392a0aa2e

📥 Commits

Reviewing files that changed from the base of the PR and between 1dc9e0e and ddac671.

📒 Files selected for processing (2)
  • .github/workflows/proto-check.yml
  • CONTRIBUTING.md
✅ Files skipped from review due to trivial changes (1)
  • CONTRIBUTING.md

Comment on lines +39 to +42
PROTO_CHANGED=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^proto/' || true)
GENERATED_CHANGED=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^(src/generated/|docs/api/)' || true)

if [ -n "$PROTO_CHANGED" ] && [ -z "$GENERATED_CHANGED" ]; then

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Treat Makefile changes as codegen inputs here.

Line 42 only fails when ^proto/ changed, but make generate also depends on Makefile. A fork PR that updates generator versions or flags in Makefile can currently pass with stale src/generated/ / docs/api/ outputs, even though the same change on an internal branch would auto-commit regenerated artifacts.

Suggested fix
-          PROTO_CHANGED=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^proto/' || true)
+          CODEGEN_INPUT_CHANGED=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^(proto/|Makefile$)' || true)
           GENERATED_CHANGED=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^(src/generated/|docs/api/)' || true)

-          if [ -n "$PROTO_CHANGED" ] && [ -z "$GENERATED_CHANGED" ]; then
+          if [ -n "$CODEGEN_INPUT_CHANGED" ] && [ -z "$GENERATED_CHANGED" ]; then
             echo ""
             echo "============================================================"
-            echo "ERROR: Proto files changed without committed generated artifacts."
+            echo "ERROR: Codegen inputs changed without committed generated artifacts."
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
PROTO_CHANGED=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^proto/' || true)
GENERATED_CHANGED=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^(src/generated/|docs/api/)' || true)
if [ -n "$PROTO_CHANGED" ] && [ -z "$GENERATED_CHANGED" ]; then
CODEGEN_INPUT_CHANGED=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^(proto/|Makefile$)' || true)
GENERATED_CHANGED=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^(src/generated/|docs/api/)' || true)
if [ -n "$CODEGEN_INPUT_CHANGED" ] && [ -z "$GENERATED_CHANGED" ]; then
echo ""
echo "============================================================"
echo "ERROR: Codegen inputs changed without committed generated artifacts."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/proto-check.yml around lines 39 - 42, Update the
codegen-detection logic so changes to the Makefile trigger regeneration: include
"Makefile" in the GENERATED_CHANGED check (or add a separate grep for Makefile)
so the existing if condition that uses PROTO_CHANGED and GENERATED_CHANGED will
treat Makefile edits as codegen inputs; specifically modify the
GENERATED_CHANGED assignment that currently checks '^(src/generated/|docs/api/)'
to also match 'Makefile' so the if [ -n "$PROTO_CHANGED" ] && [ -z
"$GENERATED_CHANGED" ] branch will block PRs that change Makefile without
regenerated artifacts.

Comment on lines +97 to +117
- name: Commit generated artifacts
id: commit
env:
PR_HEAD_REF: ${{ github.event.pull_request.head.ref }}
run: |
if git diff --quiet -- src/generated/ docs/api/ && [ -z "$(git ls-files --others --exclude-standard src/generated/ docs/api/)" ]; then
echo "Proto-generated code is up to date on the PR branch."
echo "pushed=false" >> "$GITHUB_OUTPUT"
exit 0
fi

git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
git add src/generated/ docs/api/
git commit -m "chore(proto): update generated artifacts"
git push origin HEAD:"$PR_HEAD_REF"
echo "pushed=true" >> "$GITHUB_OUTPUT"

internal-merge-freshness:
needs: internal-auto-generate
if: github.event.pull_request.head.repo.full_name == github.repository && needs.internal-auto-generate.outputs.pushed != 'true'

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

According to the GitHub Actions documentation, do pushes made with the default GITHUB_TOKEN trigger new push or pull_request workflow runs?

💡 Result:

No, pushes made with the default GITHUB_TOKEN do not trigger new push or pull_request workflow runs. According to the GitHub Actions documentation, events triggered by the repository's GITHUB_TOKEN (except workflow_dispatch and repository_dispatch) will not create a new workflow run. This prevents recursive workflow runs. For example, if a workflow pushes code using GITHUB_TOKEN, no new workflow runs on push events, even if configured to do so. The same applies to pull_request events, as they are not exceptions.

Citations:


The bot commit won't trigger the follow-up freshness check.

When Line 112 pushes a generated-artifact commit using secrets.GITHUB_TOKEN, GitHub's security model prevents that push from triggering new push or pull_request workflow runs (by design, to prevent recursion). The condition on Line 117 skips internal-merge-freshness whenever a commit was pushed (pushed != 'true'). This means the updated PR branch receives no merge-result verification despite the contributor docs promising one.

Either run the verification in-band before pushing (within the same job), or use a token/dispatch mechanism that reliably triggers the follow-up run.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/proto-check.yml around lines 97 - 117, The current
workflow pushes generated-artifact commits in the "Commit generated artifacts"
step (id: commit) using the default GITHUB_TOKEN which will not trigger
subsequent workflow runs, so the downstream job internal-merge-freshness may
never run; fix this by moving the merge-result verification into the same job
before the git push (i.e., run the freshness/merge checks in-band after
generation and before the git commit/push in the commit step), or alternatively
replace the push with a trigger that will fire workflows (use a repository PAT
or a workflow_dispatch/dispatch API call from this job to create a new run) and
adjust the logic that checks needs.internal-auto-generate.outputs.pushed to
reflect the new trigger mechanism.

lspassos1 pushed a commit that referenced this pull request May 20, 2026
 C1+C2) (koala73#3811)

* feat(forecast): trigger-simulation endpoint + runId filter (koala73#3734 C1+C2)

Closes koala73#3734. Implements the C1+C2 scope of v3 plan
(docs/plans/2026-05-18-003-feat-simulation-trigger-and-runid-filter-plan.md;
also tracked as koala73#3798):

  - POST /api/forecast/v1/trigger-simulation — PRO-gated, mirrors run-scenario
    shape (isCallerPremium → queue-depth → server-derived runId → idempotency
    pre-check → enqueue). Returns { queued, runId, pkgFingerprint, reason }.
    External reason taxonomy collapses to {'', 'no_package', 'already-handled'}
    — server logs retain the distinction between already-queued vs
    already-completed-this-cycle to avoid a cron-timing oracle (Sec4).

  - GET /api/forecast/v1/get-simulation-outcome?runId=X — filter now actually
    fires. New behavior: by-run hit → return outcome; tombstone hit → fall
    through to :latest with distinct note; queued-but-not-completed → return
    processing=true (uses listQueuedSimulationTasks; prevents queued-vs-expired
    indistinguishability from contaminating the 30-day demand signal); miss →
    fall through with expiry note.

  - Worker (writeSimulationOutcome) writes BOTH :latest (canonical, 60d TTL)
    AND :by-run:{runId} (secondary index, 24h TTL). By-run failure logs +
    attempts tombstone payload so reads can distinguish transient failure
    from true expiry (D9).

  - pkgFingerprint = sha256(pkgKey).slice(0,16) opaque token. Replaces what
    v2 had as raw R2 path (raw path leaked bucket layout to Pro callers per
    Sec7). Worker verifies task.pkgFingerprint vs current pointer fingerprint
    with truthy-guard predicate to avoid spurious package_rotated logs on
    pre-upgrade in-flight tasks (Feas-F4).

  - .mjs seeder enqueueSimulationTask backports redis_error reason code so
    the auto-trigger path can surface transport failures cleanly (Adv-F4 /
    SG-2). No documented asymmetry between TS and .mjs paths.

  - Shared shim at server/_shared/_simulation-queue-constants.mjs (with .d.ts)
    is single source of truth for queue key constants + 24h TTL +
    pkgFingerprint helper + MAX_QUEUE_DEPTH. Both the seeder and the new TS
    handler module import natively from it.

  - PREMIUM_RPC_PATHS entry added so the gateway's Pro-bearer gate fires
    before per-IP rate-limit (matches run-scenario precedent).

  - ENDPOINT_RATE_POLICIES entry: { limit: 10, window: '60 s' } per-IP,
    identical to run-scenario.

  - Proto: TriggerSimulationRequest carries an optional client_version field
    (sebuf v0.11.1 emits a typecheck-broken POST client for fully-empty
    request messages; one field works around it). Server logs but never
    branches on it.

  - Plan + tracking metadata: TODO #21 transitioned from pending → complete;
    TODO #27 work log updated noting the filter is now actually active
    (the original 2026-03-24 "complete" was docs-only).

Tests (26 total, all passing):
  - tests/simulation-queue-parity.test.mts (7) — TS ↔ .mjs parity for all
    reason codes (no documented asymmetry); structural lock on the worker's
    truthy-guard predicate
  - tests/forecast-trigger-simulation.test.mts (9) — handler test for 403 /
    429 / no_package / already-handled (both states collapse to single
    external code) / happy path with opaque fingerprint / 503 transport on
    pointer read AND enqueue / regression test that internal codes never
    leak externally
  - tests/forecast-get-simulation-outcome.test.mts (6) — all 5 read paths
    + NOT_FOUND
  - tests/simulation-outcome-by-run-write.test.mjs (4) — structural locks on
    :latest canonicality, :by-run TTL + no-NX flag, tombstone-on-failure
    pattern, return shape

Follow-up: koala73#3809 filed for the audit-reframe docs/marketing work (surface
"simulations auto-run hourly via cron" in OpenAPI / Pro marketing / future
MCP tool descriptions). Closing koala73#3734 in code without that docs change leaves
the audit's framing defensible to future reviewers.

* docs(simulation): operator notes for process-simulation-tasks.mjs --run-id

Inline operator notes on the script that runs the simulation worker:
  - The `--run-id=X` format must match `/^\d{13,}-[a-z0-9-]{1,64}$/i`;
    invalid IDs are silently skipped (pre-existing footgun koala73#3734 Adv-F5).
  - Force-reprocess via `--run-id` overwrites both :latest and :by-run
    keys with no NX, resets the 24h TTL; CDN cache may serve stale for
    up to 30min (D6 trade-off documented in v3 plan).

No behavior change — comments only. Discoverable by operators reading
the script they invoke, rather than requiring knowledge of a separate
runbook file.

* fix(deps): restore platform-optional peer deps in package-lock.json

The local pre-push hook ran 'npm install --prefer-offline' (its safety net
when node_modules is missing in a worktree) and that pruned five
'utf-8-validate@5.0.10' optional peer-dep entries that exist for non-macOS
platforms. CI's 'npm ci' on Linux then refused the lockfile.

Restoring the lockfile to origin/main's version — those entries belong
there for cross-platform compatibility.

* fix(simulation): switch pkgFingerprint to Web Crypto (Edge-compat)

Vercel Edge bundler hoisted `server/_shared/_simulation-queue-constants.mjs`
into Edge function bundles (transitively via cache-keys.ts → various Edge
handlers). The shim's `import { createHash } from 'node:crypto'` is not
available in Edge runtime, breaking deploy with:

  The Edge Function "api/discord/oauth/callback" is referencing
  unsupported modules:
    - __vc__ns__/5/server/_shared/_simulation-queue-constants.mjs:
      node:crypto

Switch to `globalThis.crypto.subtle.digest('SHA-256', ...)` — the only
crypto API available in BOTH Edge runtime AND Node 19+. The seeder
(scripts/seed-forecasts.mjs) keeps working without a polyfill.

Web Crypto's subtle.digest is async-only, so pkgFingerprint() is now
async. Both call sites (handler's getSimulationPackagePointer in the TS
module + worker's currentFingerprint in the .mjs seeder) already run
inside async functions; updated to await.

Output is byte-identical to the previous Node-crypto version (16-char
sha256 hex prefix), so payloads and parity tests remain stable.

* review(pr-3811): address all 4 Greptile findings

P1 — Tombstone SET now uses NX (scripts/seed-forecasts.mjs).
A by-run primary SET that landed server-side but threw a network error
on the response would have been overwritten by the tombstone on the
catch path, corrupting the real outcome for the 24h TTL window. NX
preserves the real write when it actually landed; if Redis is fully
down the tombstone also fails and the read path falls back to :latest
with the standard "may have expired" note (acceptable).

P2 — Single source of truth for SIMULATION_OUTCOME_LATEST_KEY
(server/worldmonitor/forecast/v1/get-simulation-outcome.ts).
Was importing the key from both _simulation-queue-constants.mjs (the
designated shim) AND cache-keys.ts. Drift on a future schema-version
bump could silently break the :latest fallback path. Import only from
the shim now; cache-keys still has the const (unchanged, kept for
non-simulation consumers).

P2 — clientVersion now logged per the proto comment promise
(server/worldmonitor/forecast/v1/trigger-simulation.ts).
The proto field's doc says "Server LOGS this with the success
breadcrumb" but the handler was marking req as unused. Echoes the
field in the success log line; sanitized (truncated to 64 chars,
restricted charset) since it's a free-text caller-provided string.

P2 — User-Agent header on Upstash GET in simulation-queue.ts
(server/_shared/simulation-queue.ts).
AGENTS.md requires User-Agent on all server-side fetch calls.
redisGetThrowing was missing it.

Tests: 26/26 simulation tests still pass. Typecheck clean.

* review(pr-3811): address 2 human-review blocking findings

Fix 1 — processing=true responses can be cached for minutes (P1).
  get-simulation-outcome.ts returned {processing: true} on the by-run-miss
  + runId-queued branch without marking no-cache. The gateway's `slow`
  cache tier would cache the transient state for up to 30 min, so polling
  clients would never see the outcome land after the worker finished.
  Added markNoCacheResponse on that branch + regression test asserting
  X-No-Cache=1 on the response.

Fix 2 — enqueueSimulationTaskForServer + .mjs enqueueSimulationTask
  could return queued:true even when ZADD failed (P1, both sides).
  Worker discovers tasks EXCLUSIVELY via ZRANGE on the queue ZSET
  (scripts/seed-forecasts.mjs listQueuedSimulationTasks at line 17052,
  then processNextSimulationTask iterates queuedRunIds). A task key
  without a corresponding ZSET member is invisible to the worker until
  the 4h task TTL — silent stuck-invisible failure mode.

  Previously labeled "best-effort"; ZADD is actually load-bearing. Now:
  - Capture the ZADD result (transport throw OR non-numeric reply both
    count as failure)
  - On failure: issue compensating DEL on the task key so a future trigger
    isn't blocked by SET NX and no stale half-state lingers for the 4h TTL
  - Return {queued: false, reason: 'redis_error'}
  - EXPIRE failure stays non-fatal (TTL hint only — queue member is
    durable; logged but doesn't roll back)

  Added parity regression test covering both .mjs and .ts: simulates SET
  NX success + ZADD non-numeric reply, asserts both implementations
  surface redis_error AND issue the compensating DEL.

All 9223 tests pass; typecheck + typecheck:api green.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[koala73#3340] Move sebuf codegen out of PRs into CI/CD

1 participant