Conversation
Extracts the SSRF blocklist data (CIDR ranges, hostname patterns,
NAT64 well-known prefix) to a new lib/ssrf-blocklist.ts module. Both
consumers read from this one file:
- lib/safe-fetch.ts imports the data and builds its node:net
BlockList and Set<string> at module-load time.
- lib/sandbox/child-source.ts interpolates the data into the
grandchild source template via JSON.stringify, so the inlined
literals in the rendered "node -e" string are automatically in
sync with the main-app blocklist.
Closes the drift this PR was originally responding to. Before this
commit the sandbox guard was a hand-maintained copy that stopped at
KEEP-314 - missing the four IPv6 ranges added in the KEEP-603 PR
(64:ff9b:1::/48 site-local NAT64, 2001::/32 Teredo, 2002::/16 6to4,
2001:db8::/32 documentation) and missing the entire pre-DNS hostname
denylist (localhost, *.local, *.internal, *.svc.cluster.local,
*.pod.cluster.local). It now picks up the same set automatically.
The sandbox docblock is updated: imports are now allowed in this
module, but restricted to pure data consumed at template-render time;
runtime behaviour still cannot cross into the grandchild because the
grandchild has no access to npm or to the rest of the codebase.
Validation: pnpm vitest passes (189/189). Lint and type-check clean
on the changed files. The rendered SANDBOX_CHILD_SOURCE string was
piped through node --check to confirm the interpolated template is
syntactically valid JavaScript, and the key CIDR and host-pattern
markers are present in the rendered output.
No public API change. The sandbox guard remains always-enforce; no
SAFE_FETCH_ENFORCE wiring is needed (child-source.ts does not read
the env var).
Fixes the standalone @keeperhub/sandbox package's tsc build, which failed on the previous commit's @/-prefixed import because that package's tsconfig maps `@/*` to its own `./src/*` (not to the keeperhub workspace root). The new import in child-source.ts is switched to a relative path that resolves identically from both build contexts; an inline comment notes the asymmetry with safe-fetch.ts. Adds tests/unit/sandbox-child-source.test.ts that asserts every entry from lib/ssrf-blocklist.ts appears inline in the rendered SANDBOX_CHILD_SOURCE string, plus a parse-only vm.Script check. This locks the SoT contract: a future regression that reverts to hardcoded literals in child-source.ts would fail the test, not slip through. Validation: sandbox tsc build now emits dist/lib/ssrf-blocklist.js alongside dist/lib/sandbox/child-source.js, with the compiled require target resolving correctly at runtime. sandbox package tests 25/25 pass; keeperhub touched suites 207/207 pass (was 197 before this file). Lint clean on changed files; no new type errors.
…ontext
The sandbox Dockerfile's build stage takes a "copy only what is needed"
approach and explicitly named `lib/sandbox/child-source.ts` as the one
shared file pulled in from the keeperhub workspace. After this branch's
SoT extraction, child-source.ts also imports `../ssrf-blocklist`, but
that file wasn't being copied into the build context - so the sandbox
tsc step failed with "Cannot find module '../ssrf-blocklist'" inside
the Docker build (local pnpm builds passed because the full workspace
tree was available).
Adds the file as an explicit COPY alongside the existing child-source.ts
copy, with a comment explaining the dependency. The sandbox tsc emits
dist/lib/ssrf-blocklist.js next to dist/lib/sandbox/child-source.js;
the latter's compiled `require("../ssrf-blocklist")` resolves at
runtime.
CI's test-unit-sandbox-remote still failed after the Dockerfile COPY addition: the build stage now succeeded but the runtime container hit Error [ERR_MODULE_NOT_FOUND]: Cannot find module '/app/dist/lib/ssrf-blocklist' imported from /app/dist/lib/sandbox/child-source.js The standalone @keeperhub/sandbox package is `"type": "module"` and runs its compiled output via plain `node`, whose strict ESM loader requires explicit file extensions in import specifiers. The extension- less `"../ssrf-blocklist"` import worked at tsc time (sandbox tsconfig uses moduleResolution: "bundler") and on the keeperhub side (Next.js bundler) but blew up at sandbox runtime. Switches the import to `"../ssrf-blocklist.js"` - the NodeNext / strict ESM convention. This matches how `sandbox/src/run-code.ts` already imports `"../../lib/sandbox/child-source.js"`. TS still resolves to the matching .ts source at compile time; the emitted output keeps the .js suffix verbatim so Node's resolver finds the compiled file. Validated locally: sandbox tsc emits the .js suffix into dist/lib/sandbox/child-source.js, and `node dist/sandbox/src/index.js` launches without module-resolution errors. Keeperhub side: 197/197 tests pass; type-check unchanged.
… import
CI surfaced an irreducible conflict between the two consumers of the
shared SSRF blocklist:
- lib/safe-fetch.ts is bundled by Next.js Turbopack, which resolves
extension-less relative imports to .ts source files but refuses
.js-suffixed imports back to the .ts source ("Module not found:
Can't resolve '../ssrf-blocklist.js'" during the keeperhub build).
- lib/sandbox/child-source.ts is also compiled by the standalone
@keeperhub/sandbox package, whose strict ESM runtime requires
explicit .js extensions and rejects extension-less imports with
ERR_MODULE_NOT_FOUND at startup.
A .ts data file is unreachable from one consumer or the other with any
single import specifier. Switches the data file to .json, which both
bundlers resolve identically.
- lib/ssrf-blocklist.json owns the data.
- lib/ssrf-blocklist.ts is now a thin typed wrapper that imports the
JSON and re-exports the values with proper Cidr tuple types. The
keeperhub side imports paths are unchanged; safe-fetch.ts and
connection-host-guard.ts keep their existing module specifiers.
- lib/sandbox/child-source.ts imports the JSON directly with
`with { type: "json" }`, skipping the TS wrapper entirely - the
sandbox doesn't need TS types because the values are interpolated
into the template literal as JSON.stringify(...).
- sandbox/Dockerfile copies the .json (not the .ts) into the build
context.
- sandbox/package.json's build script copies the .json into dist
after tsc, because tsc does not emit .json files even with
resolveJsonModule enabled.
Validated locally: sandbox tsc + JSON copy emits dist/lib/ssrf-blocklist.json
alongside dist/lib/sandbox/child-source.js; `node dist/sandbox/src/index.js`
starts without module-resolution errors; keeperhub unit suites 197/197
pass; lint clean; sandbox typecheck clean.
The workflow execution path that actually runs user step code (and therefore calls safeFetch) is the ephemeral workflow-* runner pod spawned by the executor controller via createWorkflowJob. These pods do not read Helm values directly - their env is constructed in keeperhub-executor/k8s-job.ts from a hardcoded explicit list plus the RUNNER_SYSTEM_ENV_VARS allowlist. Pre-fix posture: SAFE_FETCH_ENFORCE was on neither path, so runner pods got the env unset, isShadowMode() returned true, and the SSRF guard logged blocks but let requests through. The IMDS-401 case originally reported in production was almost certainly a runner pod - the controller pod can be hardened via Helm (PR #1331) but that does not propagate to its children. Hardcodes `{ name: "SAFE_FETCH_ENFORCE", value: "true" }` in the explicit list rather than adding to RUNNER_SYSTEM_ENV_VARS. The hardcoded form makes the runner enforce regardless of controller config drift; flipping it back to shadow mode now requires a code change here, which is the intended posture for an SSRF guard in an incident-response context. Tests added: - runner pod gets SAFE_FETCH_ENFORCE=true when controller env is unset - runner pod still gets SAFE_FETCH_ENFORCE=true (hardcoded value wins) when controller env says "false", and the entry appears exactly once
…etch-forward fix: force-enable SAFE_FETCH_ENFORCE on workflow runner pods
The six test files under keeperhub-executor/ (k8s-job.test.ts, runner-env.test.ts, startup-checks.test.ts, api-execute.test.ts, lib/metrics-shipping.test.ts, lib/ship-metrics.test.ts - 44 tests total) have not been running in CI. They are discoverable by the root vitest config, but the pr-checks test-unit job invokes `pnpm test:unit` which is `vitest run tests/unit` - a positional path filter that excludes everything outside tests/unit/. Adds a new `test:executor` package.json script (`vitest run keeperhub-executor`) and a follow-on step in the test-unit job that invokes it. The new step runs in the same job as the existing unit tests so it reuses the Node + pnpm setup and does not add a fresh runner; failure attribution stays clean because GitHub Actions reports each step separately. Verified locally: `pnpm test:executor` discovers and passes all 44 tests in ~10s. Why not a separate workflow file like pr-checks-scheduler.yml or pr-checks-events.yml: keeperhub-scheduler and keeperhub-events are separate pnpm workspaces with their own lockfiles, tsconfigs, and vitest configs. keeperhub-executor is part of the root workspace and shares dependencies, so a follow-on step in the existing job is the lighter-weight match. Why hardcoded as a separate script rather than expanding test:unit: `test:unit` is reused by the sandbox-remote job (`pnpm test:unit tests/unit/code-run-code.test.ts`) which appends a path arg to narrow the run. Expanding test:unit's base path to include keeperhub-executor would broaden that job too. Keeping the executor invocation in its own script preserves the existing test:unit semantics.
Mirrors the pr-checks-events.yml pattern: a standalone workflow file that fires only when keeperhub-executor/** (or the workflow itself) change, instead of an extra step on every PR's main test-unit job. The previous commit's approach (a follow-on step in pr-checks.yml test-unit) ran the executor suite on every PR regardless of whether any executor code changed, which wastes runner time and clutters the status checks UI. The dedicated workflow is cheap to skip when nothing matches the path filter. Differences from pr-checks-events.yml: events is a separate pnpm workspace with its own lockfile and setup-events-deps action; the executor lives in the root workspace, so the existing setup-node-pnpm action is reused. Executor tests are pure unit (no docker, no anvil), so only a single test-unit job is needed - no test-integration mirror. Verified locally: pnpm test:executor still passes 44/44. The test:executor package.json script added in the previous commit is retained because it is the entry point this workflow invokes.
…ntain permissions' Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…n-ci ci: run keeperhub-executor unit tests in pr-checks
Spawn the grandchild via node -e and assert await fetch(...) is rejected for every blocklist category (IMDSv2, RFC 1918, loopback, ULA, link-local, Teredo, 6to4, doc range, site-local NAT64, multicast, and pre-DNS hostname patterns). Existing data-only tests proved the list was inlined but not that the guard actually fires. Also tighten CIDR data tests to assert the full tuple, add a guard that locks child-source.ts to a single data-only import, and list the JSON dependency in sandbox/tsconfig.json.
…-parity hotfix: single source of truth for SSRF blocklist
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Release
staging->prod. Notable security-critical content in this release:Security incident 2026-05-21 follow-up (PR #1335 merged into staging)
169.254.0.0/16,fe80::/10,fc00::/7(IMDS + link-local + ULA).NEXT_PUBLIC_AI_PROMPT_ENABLED(defaultsfalse). UI +/api/ai/generateboth gated.0082):block_database_integrations_security_2026_05_21declared in migrations (the trigger was already ad-hoc on prod).block_executions_security_2026_05_21rejects INSERT intoworkflow_executionswhen the workflow owner isdeactivated_at IS NOT NULLor the workflow isdeleted_at IS NOT NULL. Raises42501.CREATE OR REPLACE FUNCTION+DROP TRIGGER IF EXISTS / CREATE TRIGGER).Other content
Test plan
SELECT tgname FROM pg_trigger WHERE tgname LIKE '%security_2026_05_21%')workflow_executionsfor a soft-deleted workflow on prod returns42501error