Conversation
…er non-zero pullChanged

Pre-fix, `pullChanged`'s slow path skipped `markSynced` when `pulled > 0` and left the on-disk index file carrying the original pusher's `localMtime` values (the raw remote payload `pullIndex` wrote at the start of the slow path). A subsequent fresh-process `pushChanged` then took the slow path (no sidecar), `pullIndex(forceRemote)` reloaded the pusher's mtimes into `this.index`, and the walk pushed every file — size matched the index but `localMtime` didn't (pusher's mtime ≠ local `stat.mtime` from `Deno.writeFile`). The result: N redundant byte-identical PUTs per fresh-cache bootstrap (swamp-club #222).

The fix mirrors the existing `pulled === 0` `markSynced` path: when the local cache matches the remote index whose ETag we GET'd (whether the walk finds zero diffs or the missing files are downloaded), persist the in-memory index back to disk (carrying the puller's stat-derived `localMtime` values) AND mark synced so the next `pushChanged` hits the fast path.

Ordering invariant preserved: `atomicWriteTextFile` runs BEFORE `markSynced` because `markSynced` derives `lastVerifiedAt` from `indexPath` mtime + 1ms.

Out-of-scope follow-up: when a concurrent writer bumps the remote ETag between `pullChanged` and `pushChanged`, the sidecar mismatch forces the slow path, `pullIndex(forceRemote)` overwrites `this.index` with the new remote payload (carrying the writer's `localMtime` values), and the walk re-fires redundant pushes for any local file whose mtime differs. Pre-existing behavior, not widened by this fix; will be filed separately post-merge.

Verified:
- 5-file regression test in each of the s3 and gcs `cache_sync_test.ts` suites: fresh-process `pushChanged` after fresh-cache `pullChanged` uploads zero bytes, with positive assertions on sidecar presence/cleanliness and on-disk index `localMtime` values
- s3 suite: 61/61 passing; gcs suite: 64/64 passing
- MinIO end-to-end: pulled=5, pushed=0
- GCS end-to-end: pulled=5, pushed=0

Manifests bumped to 2026.05.04.1 in both extensions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code Review
Blocking Issues
None.
Summary
This is a targeted, correct fix for a real bandwidth-waste bug. The root cause analysis in the PR description is accurate and matches the code: pullChanged was updating this.index in memory with the locally-derived stat.mtime values after each download, but only flushing the in-memory state to disk and calling markSynced when pulled === 0. A fresh-process pushChanged loaded the stale on-disk index (carrying the original pusher's localMtime epoch values), found a mismatch against the local stat.mtime, and re-uploaded all files unnecessarily.
Fix correctness:
- The change from `if (pulled === 0 && indexETag)` → `if (indexETag)` with an inner `if (pulled > 0 && this.index)` guard to write the index first is correct.
- The ordering invariant (index write before `markSynced`) is essential and clearly documented: `markSynced` derives `lastVerifiedAt` from `Deno.stat(indexPath).mtime + 1ms`, so writing the index after would bump the mtime past `lastVerifiedAt` and cause spurious slow-path fallbacks.
- If `atomicWriteTextFile` throws, it's caught by the outer try/catch and `markSynced` is correctly skipped — we don't record a clean sidecar when the on-disk index is stale.
- The `this.index` null guard (`pulled > 0 && this.index`) is sound: if the index were null, no files would be in `toPull`, so `pulled` would remain 0 and the guard would never fire.
- Mirrored symmetrically between `s3_cache_sync.ts` and `gcs_cache_sync.ts` as expected.
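The restructured guard can be sketched as a standalone function. This is a hypothetical simplification — synchronous stand-ins for `atomicWriteTextFile` and `markSynced` (the real code is async), and illustrative parameter names; the actual implementation lives in `s3_cache_sync.ts` / `gcs_cache_sync.ts`:

```typescript
// Sketch of the slow-path tail after the fix. Pre-fix, the whole block
// was gated on `pulled === 0 && indexETag`; post-fix it runs whenever we
// hold the index's ETag, writing the index first when files were pulled.
type Index = { entries: Record<string, { size: number; localMtime: number }> };

export function finishSlowPath(
  pulled: number,
  indexETag: string | null,
  index: Index | null,
  writeIndex: (index: Index) => void, // stand-in for atomicWriteTextFile
  markSynced: (etag: string) => void, // derives lastVerifiedAt from index mtime
): void {
  if (indexETag) {
    try {
      // Ordering invariant: the index must hit disk BEFORE markSynced,
      // because markSynced reads the index file's mtime (+1ms).
      if (pulled > 0 && index) writeIndex(index);
      markSynced(indexETag);
    } catch {
      // Opportunistic: if the write throws, markSynced is skipped, so we
      // never record a clean sidecar over a stale on-disk index.
    }
  }
}
```

Note that the `pulled === 0` path is unchanged under this shape: the write is skipped and `markSynced` still runs, which is the pre-existing behavior.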
Tests:
- Both new regression tests correctly use in-memory mock clients (no live cloud services).
- Temp dirs are cleaned up in `finally` blocks; no env vars need to be set/restored.
- The two positive assertions (sidecar contents + on-disk index mtimes) are load-bearing: they pin the persistence mechanism so a future refactor can't satisfy only the symptom assertion while silently breaking the fix.
- ETag normalization is handled correctly: the mock stores `'"index-pre-pull"'` (quoted HTTP ETag format), `normalizeETag` strips the quotes before storage, and the assertion checks the unquoted form. The GCS test uses a raw generation string (`"42"`), which doesn't need normalization.
- No `sanitizeResources: false` needed — mock clients don't open connection pools.
- CLAUDE.md rules: no `any` types in new hand-written code (the typed cast `as { entries: ... }` is correct), named exports, no model files touched.
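For illustration, the quote-stripping behavior the tests depend on might look like the following. This is an assumption about the shape of the project's `normalizeETag`, not its actual source:

```typescript
// Strip the surrounding double quotes that HTTP ETags carry on the wire,
// so '"index-pre-pull"' is stored and compared as 'index-pre-pull'.
// Unquoted values (e.g. GCS generation strings) pass through untouched.
export function normalizeETag(raw: string): string {
  return raw.replace(/^"(.*)"$/, "$1");
}

console.log(normalizeETag('"index-pre-pull"')); // index-pre-pull
console.log(normalizeETag("42")); // 42 — no normalization needed
```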
Suggestions
- Minor naming oddity: the second service in each test is named `serviceC` (skipping `serviceB`), which is slightly confusing. Not a bug — the comment explains it simulates a cross-process scenario — but `serviceB` or `freshService` would be clearer.
- Minor inconsistency: the GCS test reads the sidecar via `Deno.readTextFile` + `JSON.parse` directly, while the S3 test uses the `readSidecar` helper. Functionally identical; using the helper in both would be marginally cleaner.
Adversarial Review
Traced the full data flow through pullChanged → atomicWriteTextFile → markSynced → (next process) pushChanged → tryFastPushChanged / slow-path walk for both S3 and GCS implementations. Examined edge cases around null indexETag/indexGeneration, null this.index, multipart ETags, generation "0", concurrent writers, and atomicWriteTextFile failures.
Critical / High
None found.
Medium
None found.
Low
- `s3_cache_sync.ts:840` / `gcs_cache_sync.ts:813` — `atomicWriteTextFile` failure skips `markSynced`: if the index rewrite throws (disk full, permissions), the catch block swallows it and also skips `markSynced`. The on-disk index retains the remote payload's `localMtime` values, so a subsequent fresh-process `pushChanged` still sees the pre-fix mismatch and redundantly pushes. This matches pre-fix behavior (self-healing on the next slow-path sync), and the comment correctly describes it as "opportunistic." Not a regression — just an inherent limitation of the opportunistic design that prevents full self-healing on disk-full.
- `s3_cache_sync.ts:836` / `gcs_cache_sync.ts:809` — `pulled === 0` with reconciled in-memory mtimes not persisted: during the walk phase (before downloads), the code reconciles `this.index.entries[rel].localMtime` in memory when a local file's `stat.mtime` differs from the remote entry's `localMtime` but the size matches. When `pulled === 0`, these reconciled values are NOT written to disk (the `if (pulled > 0 && this.index)` guard skips the rewrite). The sidecar IS written, so the next sync hits the fast path — meaning the stale on-disk mtimes are never consulted. If the fast path later fails (a concurrent writer bumps the ETag), `pullIndex(forceRemote)` reloads from remote anyway. No practical impact, but worth noting for future refactors that might remove the fast path.
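To make the "never consulted" claim concrete, here is a hypothetical sketch of fast-path eligibility — simplified shapes, not the project's actual `tryFastPushChanged` / `tryFastPullChanged` logic. The point is that only the sidecar's ETag and `lastVerifiedAt` versus the index file's mtime are examined; per-entry `localMtime` values on disk never enter the decision:

```typescript
// Hypothetical sidecar shape: the remote ETag we last verified against,
// plus a timestamp derived from the index file's mtime + 1ms.
interface Sidecar {
  etag: string;
  lastVerifiedAt: number; // epoch ms
}

// The fast path is taken only when the sidecar exists, the remote ETag
// is unchanged, and the index file has not been rewritten since
// lastVerifiedAt. Per-entry localMtime values are not consulted here.
export function fastPathEligible(
  sidecar: Sidecar | null,
  remoteETag: string,
  indexFileMtimeMs: number,
): boolean {
  if (!sidecar) return false;
  if (sidecar.etag !== remoteETag) return false;
  return indexFileMtimeMs <= sidecar.lastVerifiedAt;
}

// Sidecar written right after the index (mtime 1000, lastVerifiedAt 1001):
console.log(fastPathEligible({ etag: "e1", lastVerifiedAt: 1001 }, "e1", 1000)); // true
// A concurrent writer bumping the remote ETag forces the slow path:
console.log(fastPathEligible({ etag: "e1", lastVerifiedAt: 1001 }, "e2", 1000)); // false
```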
Verdict
PASS. The fix is well-targeted: it extends the existing pulled === 0 sidecar-persistence block to also handle pulled > 0 by (a) rewriting the on-disk index with the puller's stat-derived localMtime values before (b) calling markSynced. The ordering invariant (atomicWriteTextFile before markSynced) is correctly maintained — markSynced derives lastVerifiedAt from Deno.stat(indexPath).mtime + 1ms, so reversing would capture a stale baseline. The error handling preserves the existing opportunistic/self-healing contract. The S3 and GCS implementations are structurally identical modulo indexETag ↔ indexGeneration. Both regression tests correctly pin the persistence machinery (sidecar contents + on-disk index mtimes) in addition to the symptom assertion (zero redundant PUTs).
Summary
Closes swamp-club #222.
The first explicit `pushChanged` after a fresh-cache `pullChanged` against a populated remote re-uploaded every just-downloaded file with byte-identical content. Not a correctness issue (data converges after the first redundant push), but a per-bootstrap bandwidth waste of N S3/GCS PUTs and a confusing `filesPushed: N` immediately after `filesPulled: N`.
Root cause
In `pullChanged`'s slow path, the index rewrite and `markSynced` only ran when `pulled === 0`, so a non-zero pull left the on-disk index carrying the original pusher's `localMtime` values (the raw remote payload `pullIndex` wrote at the start of the slow path).
In a subsequent process, `pushChanged`'s `tryFastPushChanged` returned null (no sidecar), the slow path's `pullIndex(forceRemote)` reloaded the pusher's mtimes into `this.index`, and the walk fell through to pushing every file: sizes matched the index, but each entry's `localMtime` differed from the local `stat.mtime`.
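The walk's decision can be sketched as a pure predicate (hypothetical entry/stat shapes and a hypothetical `shouldPush` name, not the project's actual types):

```typescript
// Index entry as persisted in the on-disk index file.
interface IndexEntry {
  size: number;
  localMtime: number; // epoch ms recorded by whichever process wrote the entry
}

// Result of statting the local cache file.
interface LocalStat {
  size: number;
  mtimeMs: number; // epoch ms from the local filesystem
}

// A file is re-pushed when sizes differ, OR when sizes match but the
// entry's localMtime disagrees with the local stat mtime — exactly the
// state a stale on-disk index produces after a fresh-cache pull.
export function shouldPush(entry: IndexEntry, stat: LocalStat): boolean {
  if (entry.size !== stat.size) return true;
  return entry.localMtime !== stat.mtimeMs;
}

// Pre-fix: index still carries the ORIGINAL pusher's mtime (1000), but
// this process's Deno.writeFile produced mtime 2000 locally.
console.log(shouldPush({ size: 42, localMtime: 1000 }, { size: 42, mtimeMs: 2000 })); // true → redundant PUT

// Post-fix: the puller persisted its own stat-derived localMtime (2000).
console.log(shouldPush({ size: 42, localMtime: 2000 }, { size: 42, mtimeMs: 2000 })); // false → no upload
```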
Fix
Extend the existing `pulled === 0` branch to also handle `pulled > 0`: persist the in-memory index back to disk (with the puller's `stat`-derived `localMtime` values) and call `markSynced(indexETag)`. Wrapped in the same opportunistic try/catch as the pre-existing path.
Ordering invariant preserved: `atomicWriteTextFile(indexPath, ...)` runs before `markSynced` because `markSynced` derives `lastVerifiedAt` from `Deno.stat(indexPath).mtime + 1ms`. Reversing would spuriously fail `tryFastPullChanged`'s mtime check on every subsequent sync.
Mirrored line-for-line in `gcs_cache_sync.ts` (uses `indexGeneration` instead of `indexETag`), per the precedent set in PRs #105/#106/#108/#111-#114.
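A small simulation of the invariant, using an abstract counter in place of real file mtimes (illustrative only; the real `markSynced` reads `Deno.stat(indexPath).mtime`):

```typescript
// Simulate why atomicWriteTextFile must precede markSynced. markSynced
// snapshots lastVerifiedAt = index file mtime + 1ms; if the index is
// rewritten AFTER that snapshot, its mtime exceeds lastVerifiedAt and
// every subsequent tryFastPullChanged falls back to the slow path.
let clock = 0;
let indexMtime = 0;
let lastVerifiedAt = 0;

function writeIndex(): void {
  clock += 10; // any later write advances the file's mtime
  indexMtime = clock;
}

function markSynced(): void {
  lastVerifiedAt = indexMtime + 1;
}

function fastPathOk(): boolean {
  return indexMtime <= lastVerifiedAt;
}

// Correct order: write, then mark.
writeIndex();
markSynced();
console.log(fastPathOk()); // true

// Reversed order: mark captures a stale baseline, then the write bumps mtime.
markSynced();
writeIndex();
console.log(fastPathOk()); // false → spurious slow path on every sync
```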
Out-of-scope follow-up
When a concurrent writer bumps the remote ETag between this process's `pullChanged` and `pushChanged`, the sidecar mismatch forces the slow path, `pullIndex(forceRemote)` overwrites `this.index` with the new remote payload (carrying the writer's `localMtime` values), and the walk re-fires redundant pushes. Pre-existing behavior, not widened by this fix; will file as a separate swamp-club issue post-merge.
Test plan
- 5-file regression test in each of the s3 and gcs `cache_sync_test.ts` suites: a fresh-process `pushChanged` after a fresh-cache `pullChanged` uploads zero bytes, with positive assertions on sidecar presence/cleanliness and on-disk index `localMtime` values
- s3 suite: 61/61 passing; gcs suite: 64/64 passing
- MinIO end-to-end: pulled=5, pushed=0; GCS end-to-end: pulled=5, pushed=0

Manifest bumps: 2026.05.04.1 in both extensions.
🤖 Generated with Claude Code