Conversation
…-indexed buckets (swamp-club#225) Residual case from swamp-club#220: when a bucket holds data under standard prefixes but no `.datastore-index.json`, the post-#220 hydrate step reported `Hydrated: 0 pulled` and silently left the cache empty. The user's contributors saw a successful `swamp datastore setup` that delivered no data. `pullIndex()`'s NotFound branch now lists the bucket via ListObjectsV2, filters via `isInternalCacheFile`, builds an index from the listing, publishes it back unconditionally, and continues. Both `pullChanged` and `pushChanged` callers inherit discovery — push against an unindexed populated bucket also now produces a correct merged index instead of losing remote entries. The unconditional PutObject is deliberate: discovery is idempotent (same listing → same synthesized index across racing peers), matches the existing `pushChanged` writeback semantics, and `If-None-Match: *` is not portable across all S3-compatible backends this extension supports. Inline comments document the rationale and the failure-state ordering (`this.index` set before put, atomicWriteTextFile after) so the post-fetch path's contract isn't accidentally diverged from. Mirrors the same fix in @swamp/gcs-datastore. Test plan: - 5 new unit tests per extension pinning the discovery path (discovery populates+writes back, pullChanged hydrates, internal-file filter, empty-bucket fallthrough, pushChanged side benefit). - 1 new programmable-server integration test proving the AWS SDK surfaces the missing-index 404 with the discriminant the catch branch matches against. - End-to-end MinIO repro: single-peer setup hydrates 3 files and publishes the index; second-peer setup hydrates 3 files with no redundant writeback (index ETag stable). Closes swamp-club#225. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Adversarial Review
Reviewed all 8 changed files: s3_cache_sync.ts, gcs_cache_sync.ts, s3_client.ts, gcs_client.ts, their test files, and both manifests. Traced every new code path, checked error handling on all branches, verified the API migration has no orphaned callers, and stress-tested the concurrency claims.
Critical / High
None found.
Medium
-
Inaccurate idempotency claim across racing peers —
s3_cache_sync.ts:731,gcs_cache_sync.ts:700Both doc comments state "same listing → same synthesized index across racing peers" to justify the unconditional
PutObject. This is technically false:lastPulled(line 787 S3 / line 746 GCS) usesnew Date().toISOString(), which differs per peer. Additionally, thelastModifiedfallback(entry.lastModified ?? new Date()).toISOString()(S3 line 782) and(entry.updated ?? new Date()).toISOString()(GCS line 741) produce different timestamps if the listing entry lacks the field.Concrete scenario: Two peers race through discovery at timestamps T1 and T2. Both see the same 3 objects (with
lastModifiedpopulated), but the serialized JSONs differ in thelastPulledfield. Last-writer-wins overwrites the first peer's index.Impact: Functionally benign — the entry keys, sizes, and (usually) modification timestamps are identical, so sync behavior is correct. No data loss. But a future maintainer reading "same… index" could rely on byte-level idempotency for a conditional write optimization and be surprised. Consider softening the comment to "same listing → functionally equivalent synthesized index (entry keys and sizes match; only metadata timestamps may differ)".
-
S3ListResult.keys→S3ListResult.entriesandGcsListResult.keys→GcsListResult.entriesare breaking interface changes —s3_client.ts:71,gcs_client.ts:72These are exported interfaces. No in-tree callers use the old
.keysshape (verified via grep), but any external consumer importingS3ListResultorGcsListResultfrom the_libmodule will get a compile-time break. The_libpath convention suggests internal-only, so this is likely fine, but worth confirming no external extensions depend on these types.
Low
-
No upper bound on
listAllObjectsin the discovery path —s3_cache_sync.ts:758,gcs_cache_sync.ts:727A bucket with millions of legacy objects (not inconceivable for a long-running datastore) produces an unbounded in-memory array, an O(n) filter pass, an O(n) entries map, and a potentially multi-MB JSON string — all held simultaneously. For the migration/setup path this targets, this is acceptable, but a bucket with 1M+ objects would cause significant memory pressure. A
maxKeys/maxResultsguard with a warning log would be a belt-and-suspenders safety net. -
Unnecessary optional chaining on
retryWithBackoffreturn —s3_cache_sync.ts:798,gcs_cache_sync.ts:761putResult?.etagandputResult?.generationuse optional chaining, butretryWithBackoff<T>returnsPromise<T>(neverundefined) — it either resolves with the value or throws. The?.is harmless but suggestsputResultcould be nullish, which it can't be. Cosmetic only; not blocking.
Verdict
PASS — The core logic is correct. The discovery path properly handles both sub-cases (empty bucket vs. populated-but-unindexed), error propagation is sound (listing/put failures throw rather than leaving corrupted state), isInternalCacheFile covers all known internal files including .datastore-index.json itself (defensive against concurrent writes), the this.index-before-put ordering mirrors the existing post-fetch path, and the API migration from keys: string[] to entries: Entry[] is complete with no orphaned callers. Tests are thorough — 5 unit tests per extension plus an integration test proving the real SDK error shape. The one substantive note (Medium #1) is a documentation accuracy issue, not a correctness issue.
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
-
Missing inline comment on the new S3 integration test — CLAUDE.md requires a comment explaining why is needed whenever it appears. The DEF-2 section comment at line 1076 covers the earlier integration tests, but the new
pullIndex discoverytest at the end of the file is in its own block and has no comment. The GCS test file has a standalone comment block (around line 1508) that serves exactly this purpose; a similar one-liner near the new S3 test would satisfy the rule (e.g. "AWS SDK connection pooling trips Deno's resource leak detection"). -
Synthesized falls back to when listing lacks timestamps — in both implementations, / sets the index entry's to the time discovery ran rather than the object's actual modification time. This is documented and reconciles real mtimes during download, so it's not incorrect — just worth noting that a discovery index is not a faithful reflection of remote object timestamps.
The implementation is sound. The idempotency argument (same listing → same synthesized index across racing peers, so unconditional is benign) is correctly reasoned and documented. The interface changes to / (adding , /, / to list entries) are fully migrated with no orphaned callers. The five unit tests per extension cover all the cases claimed (discovery write-back with exact-match assertion, hydration via , internal-file filter, empty-bucket fallthrough regression pin, and the side benefit). The S3 integration test additionally proves the real AWS SDK surfaces the missing-index 404 with the discriminant the catch branch relies on — a valuable guard against future SDK reclassification. No types, no default exports, no live cloud services, env vars restored in via 's cleanup. All good.
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
-
Missing inline sanitizeResources comment on new S3 integration test — CLAUDE.md requires a comment explaining why sanitizeResources: false is needed whenever it appears. The DEF-2 section comment at line 1076 covers the earlier integration tests, but the new "pullIndex discovery" test added at the end of the file is in its own block and has no comment. The GCS test file has a standalone comment block (around line 1508) that serves exactly this purpose; a similar one-liner near the new S3 test would satisfy the rule (e.g. "AWS SDK connection pooling trips Deno resource leak detection").
-
Synthesized lastModified falls back to now when listing lacks timestamps — in both discoverIndexFromBucket implementations, the fallback
(entry.updated ?? new Date()).toISOString()sets the index entry's lastModified to the time discovery ran rather than the object's actual modification time. This is documented and pullChanged reconciles real mtimes during download, so it is not incorrect — just worth noting that a discovery index is not a faithful reflection of remote object timestamps.
The implementation is sound. The idempotency argument (same listing produces same synthesized index across racing peers, so unconditional PutObject is benign) is correctly reasoned and documented. The interface changes to S3ListResult / GcsListResult (adding size, etag/generation, lastModified/updated to list entries) are fully migrated with no orphaned callers. The five unit tests per extension cover all the cases claimed: discovery write-back with exact-match assertion, hydration via pullChanged, internal-file filter, empty-bucket fallthrough regression pin, and the pushChanged side benefit. The S3 integration test additionally proves the real AWS SDK surfaces the missing-index 404 with the name === "NoSuchKey" discriminant the catch branch relies on — a valuable guard against future SDK reclassification. No any types, no default exports, no live cloud services, env vars restored in finally via withProgrammableServer's cleanup.
- Soften the "idempotent across racing peers" claim in both discoverIndexFromBucket inline notes (s3 + gcs). Discovery is FUNCTIONALLY idempotent — entry keys and sizes match, but per-peer metadata timestamps (`lastPulled`, the `lastModified` fallback when the SDK didn't surface one) make the JSON bodies byte-different. Sync behaviour depends on keys + sizes, not timestamps, so last-writer-wins is still benign — but the comment now warns future maintainers NOT to add a content-fingerprint optimization on the assumption of byte-equality. - Add a comment block above the new s3 integration test explaining why `sanitizeResources: false` is set (mirrors the same setting used by every test in the DEF-2 integration block — AWS SDK keep-alive connections trip Deno's resource-leak detector and are reclaimed at runtime teardown). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
-
GCS fallback on zero-byte objects: In
gcs_client.ts,parseInt(obj.size, 10)returnsNaNonly ifobj.sizeis present but non-numeric — GCS always returns a numeric string so this is safe in practice, but an explicitNumber(obj.size) || 0would be more defensive. Non-blocking. -
ETag double-quotes note: The integration test comments that the raw ETag from the PUT includes surrounding double-quotes (e.g.
'"index-etag"') and correctly documents thatnormalizeETag()is what callers use for comparison. Consider adding a similar note indiscoverIndexFromBucketitself for future readers who may be surprised the returned fingerprint is not normalized. Non-blocking.
Overall: The fix is correct and well-designed. The discoverIndexFromBucket method is properly symmetric between S3 and GCS, the documented invariants (unconditional put, in-memory write before remote put, all callers inherit discovery) are sound, and the test suite covers all meaningful cases including the regression pin for empty buckets and the pushChanged side-benefit. The S3ListResult/GcsListResult API change from keys: string[] to entries: S3ListEntry[]/GcsListEntry[] is internal to each extension and all callers are updated. CLAUDE.md rules are followed — no model files touched, no any types, named exports, exact version pins, sanitizeResources: false with comment. Approved.
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
-
GCS size fallback: In gcs_client.ts, parseInt(obj.size, 10) is safe since GCS always returns a numeric string, but Number(obj.size) || 0 would be more defensive if obj.size is ever an unexpected value. Non-blocking.
-
ETag double-quotes: The integration test correctly notes the raw ETag from PUT includes surrounding double-quotes and that normalizeETag() is used for comparisons. Adding a similar note in discoverIndexFromBucket itself would help future readers who may be surprised the returned fingerprint is not normalized. Non-blocking.
Overall: The fix is correct and well-designed. discoverIndexFromBucket is properly symmetric between S3 and GCS, the documented invariants (unconditional put, in-memory write before remote put, all callers inherit discovery) are sound, and the test suite covers all meaningful cases including regression pins for empty buckets and the pushChanged side-benefit. The S3ListResult/GcsListResult API change from keys: string[] to entries: S3ListEntry[]/GcsListEntry[] is internal and all callers are updated. CLAUDE.md rules are followed: no model files touched, no any types, named exports, exact version pins, sanitizeResources: false with comment. Approved.
There was a problem hiding this comment.
Adversarial Review
Reviewed all 6 hand-written files (2 sync services, 2 client modules, 2 test files); skipped manifest YAMLs (version bumps only, no logic).
Critical / High
None found.
Medium
-
datastore/gcs/extensions/datastores/_lib/gcs_client.ts:1009—parseIntonobj.sizeis safe but fragile against a hypothetical empty string.
parseInt("", 10)returnsNaN. The guardobj.size ? parseInt(obj.size, 10) : 0handlesundefinedand empty-string (both falsy), so this is safe today. However, a GCS JSON API response with"size": "0"(truthy string) correctly produces0. No actual breakage scenario exists with the real API — noting only because the S3 sibling getsSizeas a nativenumberfrom the SDK and avoids parsing altogether. No action required. -
Both
discoverIndexFromBucketimplementations skipscrubIndex()after building the index.
The normalpullIndexfetch path callsscrubIndex()to remove zombie internal-file entries (swamp-club#29). The discovery path builds the index by filteringisInternalCacheFilebefore creating entries, which is functionally equivalent to a post-hoc scrub. This is correct — but it's an implicit invariant: ifscrubIndexever gains additional scrub criteria beyondisInternalCacheFile, the discovery path won't inherit them. Acceptable because both functions live in the same file and the relationship is documented. No action required.
Low
-
Test mocks for
listAllObjectsdon't returnupdated(GCS) orlastModified(S3) timestamps.
The mocks always exercise the?? new Date()fallback path indiscoverIndexFromBucket. The happy path where timestamps are present and propagated correctly is untested. Unlikely to break (it's a direct assignment), but a gap. Cosmetic — no action required. -
S3 mock
listAllObjectsdoesn't accept/checksignalparameter.listAllObjects(): Promise<Array<{...}>>
vs the GCS mock which accepts and checks it:
listAllObjects(_subPrefix?: string, signal?: AbortSignal): Promise<...>
JavaScript silently ignores extra arguments, so no test breakage, but the S3 mock wouldn't catch a bug where discovery fails to propagate abort signals. Cosmetic inconsistency — no action required.
Verified non-issues
-
Return type change (
keys: string[]→entries: S3ListEntry[]/GcsListEntry[]). Confirmed all callers oflistObjectsandlistAllObjectsacross thedatastore/tree. Only internal pagination (listAllObjects → listObjects) and the newdiscoverIndexFromBucketmethods call them. No external consumers reference the old.keysproperty. No breakage. -
isInternalCacheFilecorrectly excludes.datastore-index.jsonfrom the listing. During discovery the index is absent (that's the trigger), so the listing won't include it. After the put-back, any subsequent listing would include it, butisInternalCacheFilefilters it. No self-referential index entry possible. -
Race between concurrent peers discovering simultaneously. Both list the same objects, build functionally-equivalent indexes (same keys, same sizes, different metadata timestamps), and put unconditionally. Last writer wins. Since sync depends on keys + sizes (not timestamps), the last-writer-wins outcome is benign. The unconditional put is the correct choice given the documented cross-backend portability concern (
If-None-Match: *not portable). -
indexMutatednot set in discovery path. Correct:pushChangedusesindexMutatedto trigger writeback when scrub removed zombie entries. Discovery produces a clean index by construction, so no writeback is needed beyond what discovery itself performs. IfpushChangedsubsequently pushes files,pushed > 0triggers writeback independently. -
Error propagation. If
listAllObjectsorputObjectthrows insidediscoverIndexFromBucket, the error propagates topullIndex's caller. This is correct:this.indexis set in-memory butatomicWriteTextFilehasn't run, so no orphaned local state. The nextforceRemotecall re-triggers discovery.
Verdict
PASS. The discovery logic is sound, the two provider implementations are properly symmetric, the listObjects return-type change is fully contained, and the test coverage (5 unit tests + 1 integration test per provider) pins the important invariants. The edge cases I probed (empty bucket, internal-only-files bucket, concurrent peers, partial failure mid-discovery) are all handled correctly.
- Swap `obj.size ? parseInt(obj.size, 10) : 0` for `Number(obj.size) || 0` in gcs_client.ts. More defensive against a hypothetical empty-string `size` field; matches the S3 sibling's spirit (which gets a native number from the SDK and avoids parsing entirely). Comment explains the asymmetry between the two clients. - Add inline notes in both `discoverIndexFromBucket` implementations documenting that the returned fingerprint is the raw form from the PUT response — S3's quoted ETag, GCS's numeric `generation` string — matching the post-fetch path's contract. `normalizeETag()` is what callers apply for byte-level comparison; don't strip quotes here or the discovery path would diverge from the existing fingerprint convention. Both observations from the reviewer were "no action required" but worth documenting for future readers. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Summary
This PR correctly implements self-healing index discovery for the @swamp/s3-datastore and @swamp/gcs-datastore extensions. When pullIndex hits a 404/NotFound for .datastore-index.json, it now lists the bucket, filters internal cache files, synthesizes an index, publishes it, and returns a fingerprint — restoring the pre-existing empty-bucket semantics as a fallthrough. The side benefit (preventing push from clobbering remote-only entries on unindexed buckets) is also correctly handled since both pullChanged and pushChanged go through pullIndex.
CLAUDE.md compliance — no model/ files touched, no any types in hand-written code, all exports are named, no new npm deps introduced.
Testing — comprehensive unit coverage (A: discovery+writeback, B: pullChanged hydration, C: internal-file filter, D: empty-bucket fallthrough, E: pushChanged merged view), plus a programmable-server integration test proving the real AWS SDK surfaces name === "NoSuchKey". The integration test correctly uses sanitizeResources: false with an explanatory comment. All mocks use in-memory storage and temp dirs; no live cloud services.
API surface change — listAllObjects / listObjects return types changed from string[] / { keys: string[] } to richer entry structs. Both methods have exactly one external caller each (the respective cache_sync.ts files), both updated correctly. TypeScript strict mode would catch any missed callers at check time.
Security — no credential exposure, no shell interpolation, path traversal protection unchanged (assertSafePath still applied on all file-path operations).
Minor observation (not blocking): In the GCS discoverIndexFromBucket, putResult?.generation uses optional chaining even though retryWithBackoff returns GcsWriteResult (non-nullable) and GcsWriteResult.generation is a required string. The ?. and ?? null will never trigger for GCS. Harmless defensive code — mirrors the S3 sibling where etag is genuinely optional — so no action needed.
There was a problem hiding this comment.
Adversarial Review
I traced every code path in the new discoverIndexFromBucket methods (both S3 and GCS), the caller integration in pullIndex, the interactions with pullChanged and pushChanged, the fast-path sidecar, signal propagation, error handling, and concurrent-peer scenarios. The client return-type changes (listObjects/listAllObjects from string[] to structured entries) have no other callers in production code — no breakage.
Critical / High
None found.
Medium
-
listAllObjectsin discovery not wrapped inretryWithBackoff—s3_cache_sync.ts:765andgcs_cache_sync.ts:732. ThePutObjectthat publishes the discovered index IS retried (line 799 / 760), but the listing call that precedes it is not. A transient 5xx or timeout during listing would fail the entire sync without retry, while the adjacent PutObject would have been retried up to 3 times. This is inconsistent — if the retry budget exists because these operations are flaky, the listing is equally flaky. Breaking example: S3 returns a 503ServiceUnavailableon theListObjectsV2call during discovery; the sync fails immediately instead of retrying. Mitigation: wrap thelistAllObjectscall inretryWithBackoff, same as theputObjectcall two lines below. Not blocking because: (a) the discovery path is only reached on a 404 (uncommon), and (b) the user retries the sync at the CLI level. -
S3ListResultandGcsListResultinterfaces changed from{ keys: string[] }to{ entries: S3ListEntry[]/GcsListEntry[] }—s3_client.ts:71andgcs_client.ts:72. These are exported interfaces. While no current callers broke (I verified), any downstream consumers of these clients who referenceresult.keyswould get a compile-time error. ThelistAllObjectsreturn type also changed fromstring[]toS3ListEntry[]/GcsListEntry[]. This is a breaking API contract change. Not blocking because: the_lib/location signals internal use, there are no external callers, and this repo pins exact versions.
Low
-
Theoretical: discovery races with concurrent index creation — Between the initial
getObject404 and thelistAllObjectscall, another peer could publish a valid index. Discovery would still list, build, and unconditionally overwrite with a functionally-equivalent index. This is documented as intentional (note i in both implementations), and the overwrite is benign (entry keys + sizes match, timestamps differ). No action needed. -
GCS
Number(obj.size) || 0treats a genuine size-zero object as0—gcs_client.ts:1017.Number("0") || 0evaluates to0, which is correct, but arrives via the|| 0fallback rather than theNumber()conversion. If a future change replaced||with??(to distinguish0fromNaN), the semantics would silently change. This is a standard JS idiom and not worth fixing, but worth noting thatNumber(obj.size) ?? 0would be more precise.
Verdict
PASS. The code is well-structured with careful documentation of invariants and failure-state ordering. Discovery is functionally idempotent across racing peers, the internal-file filter is correctly applied before index construction (not just after), empty-bucket semantics are preserved, both pullChanged and pushChanged callers inherit discovery transparently, and the test coverage is thorough (5 new unit tests per extension + 1 integration test). The medium findings are minor consistency and API hygiene issues, neither of which risks data loss or incorrect sync behavior.
…rom cache sync (#123) Extend `isInternalCacheFile` to exclude any path with basename `.lock` (in addition to the existing top-level `.datastore.lock`). Without this, the self-healing `discoverIndexFromBucket` path added in #122 captures transient per-target lock files (`data/<kind>/<type>/<id>/.lock`, written by the data tier's lock subsystem via S3 PUT/DELETE) into the synthesized index. The lock subsystem then deletes them on release, leaving the remote index referencing missing objects. A fresh reader running `datastore setup` 404s during hydration, fails to persist the `datastore:` block to `.swamp.yaml`, and the next `datastore sync --pull` falls back to the filesystem datastore and reports `Current datastore type: filesystem`. Surfaced by the swamp-uat sync_test.ts cross-repo pull tests (`sync pull retrieves data written by another repo …`) on @swamp/s3-datastore@2026.05.04.4. Reproduced end-to-end in a Linux container with a mock S3 server: writer runs a workflow (acquires the per-target lock, pushes data, releases the lock), reader's setup hits the 404 on the stale `.lock` index entry, `.swamp.yaml` never gets the datastore block, sync --pull surfaces the filesystem-fallback error. With the fix the reader hydrates 19 files cleanly and the pull succeeds. Mirrored in @swamp/gcs-datastore — same `discoverIndexFromBucket` path, same latent bug. Test plan: - 1 new unit test per extension pinning the `.lock` exclusion at any depth, plus negative cases (`data/@m/.locked.yaml`, `data/@m/lock`). - All existing s3 (45) + gcs (43) cache_sync tests still pass. - Linux+mock-S3 end-to-end repro (writer workflow run → reader setup → reader sync --pull) goes from `errors: ["…/.lock - 404 NoSuchKey"]` + filesystem-fallback to `filesPulled: 19` + a successful pull returning the writer's 4 data records. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Closes swamp-club#225. Paired follow-up to swamp-club#220.
When a bucket held data under standard prefixes (
data/,workflow-runs/, …) but no.datastore-index.json, the post-#220 hydrate step inswamp datastore setupreportedHydrated: 0 pulledand silently left the local cache empty. From the contributor's view, setup exited 0 andswamp datastore statuswas healthy, butswamp data listandswamp workflow run searchreturned nothing.pullIndex()'sNotFound/NoSuchKeybranch now lists the bucket viaListObjectsV2, filters viaisInternalCacheFile, builds an index from the listing, publishes it back, and continues. The same fix is mirrored in@swamp/gcs-datastore. BothpullChangedandpushChangedcallers inherit discovery automatically.Side benefit
Previously,
pushChangedagainst an unindexed populated bucket would write an index reflecting only LOCAL files, dropping the existing remote entries from the index even though the storage still held them. With discovery inpullIndex, push first builds a complete merged view, then walks local against it. Pinned by Test E.Design choices documented in inline comments
PutObject— discovery is idempotent (same listing → same synthesized index across racing peers), matchespushChangedwriteback (line 1029), andIf-None-Match: *is not portable across all S3-compatible backends this extension supports.this.indexset BEFORE put — mirrors the existing post-fetch order. If put throws, no local file is written and the nextforceRemote: truere-triggers discovery — idempotent, no orphaned state.pushChangedis needed.Invariant note (regression pin)
pullChangedalone never writes the remote index — onlypushChangeddoes. Setup's hydrate callspullChangedbut notpushChanged, so a second peer's setup against an already-discovered bucket must NOT modify the remote index. The second-peer ETag-stability check below pins this property.MinIO verification
Container running on
:9100; bucketrepro-bucketseeded directly withaws s3 cp(no.datastore-index.json).Before fix (origin/main):
swamp workflow run search→ empty..datastore-index.jsonstill absent in bucket.After fix, peer 1:
.datastore-index.jsonnow present in bucket (size 602, ETag8771228f99da99eeaa91a198860bf0ed).After fix, peer 2 (
/tmp/swamp-repro-issue-225-peer2, fresh repo, same bucket):.datastore-index.jsonETag UNCHANGED (still8771228f99da99eeaa91a198860bf0ed) — no redundant discovery write fired.Test plan
pullChangedhydrates, C internal-file filter for.datastore.lock/.push-queue.json/_catalog.db, D empty-bucket fallthrough regression pin, EpushChangedside benefit)name === "NoSuchKey"discriminant our catch branch expectsdeno fmt --checkanddeno lintclean on all 8 changed filesDEF-2 integration: 403 throws immediately with credential hinttest for the new credential-hint format from #226 (rebase merge artifact, not a behaviour change)Manifest bumps
@swamp/s3-datastore:2026.05.04.3→2026.05.04.4@swamp/gcs-datastore:2026.05.04.3→2026.05.04.4Post-merge follow-ups (human-side bookkeeping, NOT a lifecycle gate)
These live outside this PR and the issue-225 lifecycle:
systeminit/swamp-uat(labeladversarial): "datastore setup against bucket with data and no index file should hydrate, not silently report 0 pulled". The existingtests/cli/datastore/setup_test.tscovers basic setup but not this edge case.systeminit/swamp-club(content/manual/reference/datastore-configuration.md) noting the self-healing behaviour under the Hydrate step.systeminit/swampadding an "Initial Index Discovery" subsection todesign/datastores.md.🤖 Generated with Claude Code