feat(oci): build-cache/v1 storage-key fix + post-falsification hardening (consolidates #84-#87)#90
Open
jamestexas wants to merge 4 commits into
Open
feat(oci): build-cache/v1 storage-key fix + post-falsification hardening (consolidates #84-#87)#90jamestexas wants to merge 4 commits into
jamestexas wants to merge 4 commits into
Conversation
…ss + caller-provided storage key Closes the conformance loop opened by cloister-4d376e (PR #83). Three pieces: 1. **cloister-spec/build-cache/v1/** — vendor-neutral spec authored per cloister-spec/LAYOUT.md. README.md covers wire summary, digest-encoding decision (BLAKE3-in-sha256:), media-type namespace (application/vnd.cloister.build-cache.v1.*), and the conformance contract ("round-trip byte-equality on every vector in vectors/"). Vectors generated deterministically by LLO's gen_build_cache_vectors example (rs/ll-core/schema-capnp/examples/) — chunk-001.bin, chunk-002.bin, lockfile-config.bin, manifest.json, plus digests.json (per-file BLAKE3 + SHA-256) and VECTORS.sha256 (integrity sidechannel). 2. **Caller-provided storage key on BlobStore.put** — fixes a latent bug in PR #83. The OCI route was verifying body-against-BLAKE3-claim and returning 201, but `blob.put(body)` always stored under SHA-256 internally. A subsequent `GET /v2/.../blobs/sha256:<BLAKE3-hex>` then 404'd because the lookup key didn't match the storage key. `BlobStore.put` now accepts an optional `key: Digest` that overrides the default content-addressed key, used by the OCI route to honor the build-cache/v1 wire (the route has already verified the body matches the key under either SHA-256 or BLAKE3, so the substrate's content-addressed invariant holds under the union). All three verify-on-write sites updated (monolithic upload, chunked finalize, manifest PUT by digest) — the manifest PUT was also still on the SHA-256-only path in PR #83, which this PR also migrates through verifyClaimedDigest(). 3. **test/routes/oci-build-cache-conformance.test.ts** — the conformance harness. Loads each vector (base64-inlined; cloudflare:test pool has no fs), asserts BLAKE3 integrity against the committed digests.json, then runs the v1 round-trip contract: push every blob under sha256:<blake3-hex>, pull back, assert byte-equal. Plus an end-to-end consumer walk: push everything, then fetch manifest → parse → fetch config + each layer like a real `mache cache fetch` consumer would. 9 tests, all green. Verification: task lint → 1136/1136 tests pass (was 1127, +9 conformance) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…dening for cloister-667ea6)
Closes the highest-priority finding from cloister-667ea6 adversarial
review (dos-resilience-auditor): push paths called arrayBuffer()
unconditionally, with no Content-Length precheck and no size cap. A
single push of an N-byte body materialized ~3N bytes in isolate
memory before any hashing ran, exposing the shared isolate to
cross-tenant memory-exhaustion attacks.
Approach (two-layer defense):
1. **Cheap header check** (checkContentLengthHeader): rejects on the
client's Content-Length header alone, BEFORE arrayBuffer() runs.
Standard clients send this; if it claims > cap, return 413
PAYLOAD_TOO_LARGE (OCI SIZE_INVALID) without paying any
buffering cost.
2. **Post-buffer actual-size check** (checkActualSize): catches
clients that omit Content-Length or spoof it under the cap then
send more. Returns the same 413.
Both checks applied at all five push call sites in OciRegistryRoute:
- Monolithic POST .../blobs/uploads/?digest=... (line 472)
- Chunked-begin POST .../blobs/uploads/ optional seed body (line 494)
- PATCH .../blobs/uploads/<uuid> — CUMULATIVE size check on
(session.size + chunk.byteLength), so paced PATCHes can't grow
a session past the cap incrementally (line 540)
- Chunked-finalize PUT .../blobs/uploads/<uuid>?digest=... — also
cumulative on trailing body (line 591)
- Manifest PUT .../manifests/<reference> (line 645)
Per-instance configurable via `new OciRegistryRoute({ maxBlobBytes })`,
default 256 MiB (DEFAULT_MAX_BLOB_BYTES). The default fits realistic
mache .db chunks and OCI image layers while keeping one push under
the workerd per-isolate heap budget. Env-binding override surface
deferred to a follow-up bead.
Tests: 5 new cases in oci-registry-push.test.ts §"body-size cap":
- monolithic body > cap → 413
- Content-Length header > cap (small actual body) → 413
- chunked PATCH cumulative > cap → 413
- manifest PUT body > cap → 413
- body exactly == cap → 201 (boundary inclusive)
Verification: task lint → 1141/1141 tests pass (was 1136, +5 body-cap).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…BlobStore.put (cloister-7e631b) Closes cloister-7e631b — defense-in-depth seam flagged by adversarial review of cloister-667ea6 (bundle-isolation-tester + trust-root-adversary). Context: PR #84 added `put(bytes, key?)` so the OCI route could honor the build-cache/v1 BLAKE3-in-`sha256:` wire (the storage key is BLAKE3 of body, not SHA-256). All current callers — the three OCI verify sites + bead-create-orchestrator — verify body-against-key BEFORE calling put. But that moved the content-addressed invariant from substrate-enforced to caller-discipline: a future contributor adding a put-with-key call site could skip verification and store bytes under an arbitrary unverified key. This PR restores the substrate-side guarantee: `WorkerdBlobStore.put`, when given a key, now re-verifies the body matches it under SHA-256 OR BLAKE3 (same dual-algorithm check the OCI route uses, kept in sync via the shared `digestBytes`/`blake3HexBytes` helpers in storage/canonical.ts). Mismatch throws — substrate refuses to store under an unverified key. Cost: one additional hash computation per put-with-key (the same hash the caller just computed; in practice the OCI route already ran verifyClaimedDigest, so this is duplicated work). The reverse trade is the alternative — trust callers forever — which the adversarial review flagged as fragile. Defense-in-depth wins. Behavior unchanged for default callers (`put(bytes)` with no key): the substrate computes SHA-256 as before; the default path is the content-addressed contract that's existed since the file was written. Tests: 4 new cases in storage/workerd.test.ts §"caller-provided key verification": - put(bytes, sha256-of-bytes) accepts - put(bytes, blake3-of-bytes) accepts (build-cache/v1 path) - put(bytes, wrong-key) throws - put(bytes) default path unchanged Verification: task lint → 1145/1145 tests pass (was 1141, +4). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…uted hash hex (cloister-667ea6 P2) Adversarial review of cloister-667ea6 (enumeration-oracle-hunter, dos-resilience-auditor) flagged the DIGEST_INVALID response body as a free hash-as-a-service oracle: posting bytes with a known-wrong sha256: claim returned BOTH the computed sha256= AND blake3= of the caller-controlled body. Both hashes are publicly computable (the caller could just hash locally), so the leak isn't a confidentiality break — but it's computation amplification (offload hashing to cloister), and the BLAKE3 disclosure is more interesting since some OCI-native client toolchains don't carry a BLAKE3 library. Fix: keep the error message naming WHICH algorithms ran (so client diagnostics aren't useless — "I checked both sha256 and blake3, neither matched your claim") but drop the hex values. Four error sites scrubbed: - oci-registry.ts:482 monolithic upload digest mismatch - oci-registry.ts:607 chunked-finalize digest mismatch - oci-registry.ts:662 manifest PUT by-digest ref mismatch - oci-registry.ts:683 manifest PUT Docker-Content-Digest header mismatch Tests: 4 new cases in oci-registry-push.test.ts §"no hash disclosure" assert that the DIGEST_INVALID body still names "sha256" and "blake3" as algorithms but contains no 64-hex string OTHER than the client's own claim (echoing the claim is fine — they already know it). Uses a strict regex-based hex scanner so future regressions fail loudly. Verification: task lint → 1149/1149 tests pass (was 1145, +4). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This was referenced May 24, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Consolidates #84, #85, #86, #87 into one unit. Will close those PRs in favor of this one.
Builds on #83 (BLAKE3-in-
sha256:digest fallback, now merged to main) and on PR #82 (build-cache/v1 spec + vectors, also already merged). Adds the runtime pieces and the hardening surfaced by the cloister-667ea6 adversarial review.Why one PR not four
Each underlying commit was a self-contained TDD cycle, but they're logically one unit of work: "make cloister's build-cache/v1 push surface actually work end-to-end AND close the falsification findings." Splitting was iteration cadence, not separation of concerns.
Four commits, one unit of work
1. build-cache/v1 storage-key fix + conformance harness (was #84, bead cloister-667ea6)
The latent bug: PR #83's BLAKE3 fallback verified body-against-claim and returned 201, but
BlobStore.put(body)always stored under SHA-256. A subsequentGET .../blobs/sha256:<BLAKE3>then 404'd because lookup key ≠ storage key.Fix:
BlobStore.put(bytes, key?)accepts an optional storage key. Three OCI verify sites (monolithic, chunked finalize, manifest-by-digest PUT) now passverified.key. Manifest PUT also migrated throughverifyClaimedDigest().test/routes/oci-build-cache-conformance.test.ts: 9 tests round-trip every vector incloister-spec/build-cache/v1/vectors/byte-equal — chunks + config + manifest, plus an end-to-end consumer walk (manifest → config → layers).2. Body-size cap on push paths (was #85, cloister-667ea6 P0)
DoS finding from adversarial review: push paths called
arrayBuffer()unconditionally — single push could exhaust the V8 isolate. Two-layer defense:Content-Lengthheader fast-reject + post-buffer actual-size check. Cumulative on chunked PATCH. Per-instance configurable (new OciRegistryRoute({ maxBlobBytes })); default 256 MiB.3. Substrate-level key verification on
BlobStore.put(was #86, cloister-7e631b)Defense-in-depth:
WorkerdBlobStore.put, when given a key, re-verifies the body matches under SHA-256 OR BLAKE3. Mismatch throws — substrate refuses unverified keys. Restores the content-addressed invariant as substrate-enforced (rather than caller-discipline) after thekey?parameter loosened it.4. DIGEST_INVALID errors no longer leak hex (was #87, cloister-667ea6 P2)
Free-hash-oracle finding: error responses included both
sha256=<hex>andblake3=<hex>of the caller-posted body. Algorithm names preserved for diagnostics, hex values dropped.Test plan
task lint→ 1149/1149 tests pass (was 1136 on main; +9 conformance + 5 body-cap + 4 substrate-verify + 4 no-leak − some pre-existing tests redundant after merge)Will close
Once this merges, #84 / #85 / #86 / #87 are no-ops on main and will be closed referencing this PR.
🤖 Generated with Claude Code