Conversation
…rom cache sync Extend `isInternalCacheFile` to exclude any path with basename `.lock` (in addition to the existing top-level `.datastore.lock`). Without this, the self-healing `discoverIndexFromBucket` path added in #122 captures transient per-target lock files (`data/<kind>/<type>/<id>/.lock`, written by the data tier's lock subsystem via S3 PUT/DELETE) into the synthesized index. The lock subsystem then deletes them on release, leaving the remote index referencing missing objects. A fresh reader running `datastore setup` 404s during hydration, fails to persist the `datastore:` block to `.swamp.yaml`, and the next `datastore sync --pull` falls back to the filesystem datastore and reports `Current datastore type: filesystem`. Surfaced by the swamp-uat sync_test.ts cross-repo pull tests (`sync pull retrieves data written by another repo …`) on @swamp/s3-datastore@2026.05.04.4. Reproduced end-to-end in a Linux container with a mock S3 server: writer runs a workflow (acquires the per-target lock, pushes data, releases the lock), reader's setup hits the 404 on the stale `.lock` index entry, `.swamp.yaml` never gets the datastore block, sync --pull surfaces the filesystem-fallback error. With the fix the reader hydrates 19 files cleanly and the pull succeeds. Mirrored in @swamp/gcs-datastore — same `discoverIndexFromBucket` path, same latent bug. Test plan: - 1 new unit test per extension pinning the `.lock` exclusion at any depth, plus negative cases (`data/@m/.locked.yaml`, `data/@m/lock`). - All existing s3 (45) + gcs (43) cache_sync tests still pass. - Linux+mock-S3 end-to-end repro (writer workflow run → reader setup → reader sync --pull) goes from `errors: ["…/.lock - 404 NoSuchKey"]` + filesystem-fallback to `filesPulled: 19` + a successful pull returning the writer's 4 data records. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
-
Misleading comment in s3_cache_sync_test.ts — The new test includes:
// The top-level distributed lock stays excluded (already covered by the // exact-match branch, but the basename branch catches it too). assertEquals(isInternalCacheFile(".datastore.lock"), true);
The claim "the basename branch catches it too" is factually wrong.
".datastore.lock".split("/").pop()yields".datastore.lock", not".lock", so thebase === ".lock"guard does not fire for it. The assertion is correct (the exact-match branch covers it), but the comment misstates the mechanism and could mislead a future reader into thinking the exact-match check is redundant. The GCS test omits this comment entirely — worth fixing the S3 version to match. -
The
.datastore.lockassertion is already covered by the existingisInternalCacheFile: excludes the sync-state sidecartest (line 1487). Repeating it for context is fine, but it's redundant.
The fix is correct and minimal: a single exact basename check base === ".lock" is the right tool — precise enough not to catch .locked.yaml or lock, confirmed by the negative test cases. The root-cause explanation in the PR description and the comment blocks in both source files are thorough and accurate. Version bumps are appropriate. No CLAUDE.md violations, no security concerns, no type issues.
There was a problem hiding this comment.
Adversarial Review
I traced every code path through the modified isInternalCacheFile in both S3 and GCS, verified the discoverIndexFromBucket filter application at s3_cache_sync.ts:781 and gcs_cache_sync.ts:744, and tested edge cases against the basename extraction logic (rel.split("/").pop() ?? "").
Critical / High
None found.
Medium
None found.
Low
-
No test for bare
.lockat root level (both test files) — The tests cover deep paths (data/command/shell/<uuid>/.lock), shallow paths (data/@m/.lock), and the existing.datastore.lockexact-match, but not a bare.lockas the entire relative path. This case is handled correctly by the basename branch (since".lock".split("/").pop()is".lock"), and it's unlikely to occur in practice since the data tier nests lock files underdata/<kind>/<type>/<id>/. Purely theoretical gap in test coverage. -
Trailing-slash keys would bypass the filter — If a cloud storage listing ever returned a key like
data/foo/.lock/(trailing slash),".lock/".split("/").pop()yields"", not".lock", so the filter would miss it. This can't happen in practice: S3ListObjectsV2and GCSlistreturn exact object keys, and.lockfiles are regular objects, not prefix placeholders. Not actionable.
Verdict
PASS — The fix is minimal, correct, and well-targeted. The single-line if (base === ".lock") return true addition in each file correctly excludes per-target FileLock files at any depth from cache sync. The exact-string comparison avoids false positives on similar names (.locked.yaml, lock). Both S3 and GCS implementations are identical, matching the mirrored bug. Tests cover the primary positive and negative cases. Manifest versions follow CalVer convention. No logic errors, no security concerns, no resource leaks.
Summary
isInternalCacheFile(s3 + gcs) to exclude any path with basename.lock, in addition to the existing top-level.datastore.lockexact-match.2026.05.05.1.data/@m/.locked.yaml,data/@m/lockare NOT filtered).Why
The self-healing
discoverIndexFromBucketpath added in #122 lists the bucket and synthesizes an index from whatever it finds. The data tier writes per-target FileLock files atdata/<kind>/<type>/<id>/.lockvia the s3-datastore's owncreateLock— which PUTs and DELETEs them directly through S3 outside cache sync. OldisInternalCacheFileonly excluded the top-level.datastore.lock, so:setup-timepullIndex404s →discoverIndexFromBucket→ LISTs the bucket → catches the freshly PUTdata/.../<id>/.lock→ builds an index containing it → uploads.DELETEremoves the.lockobject.setuppulls the now-stale index, hydrates → 404 on the missing.lock→ setup reports an error and never persistsdatastore:to.swamp.yaml.swamp datastore sync --pullreads.swamp.yaml, sees no datastore section, falls back to filesystem, and the sync command's preflight emitsCurrent datastore type: filesystem.Surfaced by the swamp-uat
tests/cli/datastore/sync_test.tscross-repo pull tests on@swamp/s3-datastore@2026.05.04.4: https://github.com/systeminit/swamp-uat/actions/runs/25343366047. GCS got the parallel fix in #122; the latent bug is identical there.Test plan
deno test— 1 new unit test per extension pinning.lockexclusion at any depth, plus negative cases.sync --pull):errors: ["…/.lock - S3 getObject failed HTTP 404 NoSuchKey"],.swamp.yamlmissing datastore block,sync --pullreportsCurrent datastore type: filesystem.filesPulled: 19, errors: [],.swamp.yamlhas the s3 datastore block,sync --pullsucceeds,data list shell-echoreturns the writer's 4 records.deno fmt --checkanddeno lintclean for both extensions.🤖 Generated with Claude Code