Conversation
…mp-club#231) (swamp-club#233)
## Summary
Pure persistence-layer refactor that consolidates the scattered
upstream_extensions.json read/write surface into a single
`LockfileRepository`. No user-visible CLI behavior changes; lockfile
JSON shape is byte-identical pre/post.
Filed and lifecycled separately as a refactor-prequel for W2
(swamp-club#231) per its Phase A audit. The pre-committed mechanical
threshold rule (>7 callsites → SPLIT) chose this carve-out: ~160 LOC
new + 16 callsites touched.
## What ships
- `src/infrastructure/persistence/lockfile_repository.ts` — sole
gateway for upstream_extensions.json. ASYMMETRIC semantics:
- Reads serve from a snapshot captured at construction time
(preserves W1b's "snapshot frozen at construction" race-window
contract documented in `extension_repository.ts:88-95`).
- Writes acquire the advisory file lock, re-read disk, merge, atomic-
write, release, and update the local cache. Two concurrent writers
don't clobber each other.
- 12 unit tests including a cross-instance snapshot-divergence
regression and concurrent-writer contention.
- `ExtensionRepository` constructor migrated to take
`LockfileRepository` (replaces the W1b `getLockedVersion` closure).
All 5 construction sites + the stub helper move in the same commit.
- 9 `installExtension` callsites and 16 lockfile reader files migrated
to the repository surface.
- Duplicate `acquireLock` and `removeUpstreamExtension` helpers in
pull.ts and rm.ts deleted (consolidated into the repository).
- Three DI surfaces (`doctor.ts`, `list.ts`, `update.ts`) take
`LockfileRepository` directly instead of function-shape DIs.
- `emptyLockedVersionLookup` deleted (zero callers verified).
## Why split from W2
W2 lifecycle services (Install/Remove/Upgrade) need to write the
lockfile as part of their unit-of-work; W1b only exposed read access.
Bundling this refactor with W2's ~1300 LOC of new service code would
have pushed the W2 PR over the auto-ship-on-merge soak threshold.
Splitting matches the W1a → W1b precedent (each ships and soaks before
the next builds on it).
## Test plan
- [x] deno check, deno lint, deno fmt, deno run test (5397 passed, 0 failed)
- [x] deno run compile (binary built)
- [x] Author smoke against a real repo: pull + rm round-trip, list
(log + JSON), doctor extensions, model type search, sequential
pulls (writeEntry re-read-under-lock), extension update --check.
All green.
- [x] Cross-process concurrency: 50 concurrent Deno subprocesses
writing distinct entries to the same lockfile, repeated 3 times.
All 150 operations completed; no failures, no leftover .lock
files, no structural corruption (~600-800ms per 50-worker run).
- [ ] Reviewer smoke on a different repo
- [ ] ~2-day diversity-matrix soak (Linux + macOS, mixed local/pulled,
version transitions). REQUIRED before merge per the prequel-
specific gate from swamp-club#231 Phase A.
## Forward-only revert posture
No schema changes; no migration. Lockfile JSON shape is unchanged so a
binary downgrade reads the same file back. Reverting is purely a code
revert.
## Out of scope
- W2 lifecycle services (swamp-club#231) — depends on this landing.
- Cross-platform lockfile concurrency improvements — W3 territory.
- ADV-13 test-seam audit (SQLite-commit-vs-lockfile-write fault
injection) — stays in W2's plan v4 step 10; this prequel doesn't
introduce that boundary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
CLI UX Review
Blocking
None.
Suggestions
None.
Verdict
PASS — This is a pure persistence-layer refactor. No user-visible changes: no flags, no help text, no output formats, no error messages (beyond the minor improvement adding "Please retry." to the lock-contention error). Lockfile JSON shape is unchanged. The audit of command files (extension_pull.ts, extension_rm.ts, extension_update.ts, extension_search.ts, doctor_extensions.ts, open.ts) confirms all user-facing surfaces are identical before and after.
There was a problem hiding this comment.
Code Review
Clean persistence-layer refactor that consolidates scattered lockfile read/write into a single LockfileRepository. The migration is thorough — 16 callsites migrated, duplicate acquireLock helpers consolidated, tests comprehensive (12 new unit tests including cross-instance snapshot regression and concurrent writer tests).
DDD assessment
LockfileRepository is well-placed in infrastructure/persistence/ as a persistence gateway. It encapsulates the advisory file lock, atomic writes, and snapshot-caching semantics behind a clean interface. The asymmetric read/write design (cached reads, lock-re-read-merge writes) is clearly documented and protected by a regression test. The constructor/static-factory split (public constructor for test seams, create() for production) is a good pattern.
No blocking issues
The code adheres to CLAUDE.md conventions: license headers present, named exports, no any types, test files co-located, libswamp import boundary respected (CLI files importing LockfileRepository directly from infrastructure/persistence/ follows the established pattern for ExtensionRepository and ExtensionCatalogStore). All async signature changes (createExtensionPullDeps, createExtensionRmDeps, createExtensionUpdateDeps, createInstallContext, createExtensionListDeps) have their callers properly updated.
Suggestions
-
acquireLockerror type: The oldacquireLockinpull.tsthrewUserError(clean CLI message), but the consolidated version inLockfileRepositorythrows plainError. For the pull/rm paths where this surfaces to users, aUserErrorwould give a cleaner experience. Minor since the lock timeout is already rare in practice. -
removeEntrymissingmkdir:writeEntrycallsDeno.mkdir(dirname(...))beforeacquireLock, butremoveEntrydoes not. IfremoveEntryis called when the parent directory doesn't exist,acquireLockwould fail trying to create the.lockfile. This is safe in practice (you can't remove an extension without a lockfile), but adding the samemkdirguard would make it symmetric. -
shorthandUpstreamduplication:update_test.tsdefinesshorthandUpstream()which duplicatesfixedLockedVersions()fromstub_extension_repository.ts. Could be consolidated into a shared test helper.
There was a problem hiding this comment.
Adversarial Review
Thorough code-path trace of all 39 changed files. This is a well-executed mechanical refactoring that correctly preserves the existing concurrency and snapshot semantics. The asymmetric read/write design is sound, the lock-re-read-merge-write pattern prevents clobbering, and the cache-update-after-write keeps the writer's own reads consistent.
Critical / High
None found.
Medium
-
Error type regression in lock acquisition failure —
src/infrastructure/persistence/lockfile_repository.ts:192: The consolidatedacquireLockthrows plainError, but the oldacquireLockinpull.tsthrewUserError. The top-level error handler inerror_output.ts:64-68rendersUserErroras a clean message ("Error: {message}") andErroras the full object with stack trace. This means a lock-exhaustion failure duringextension pullnow produces a noisy stack trace instead of a clean error message.Breaking example: Two concurrent
swamp extension pulloperations contend for the lock, the loser retries 10 times and fails — user sees a stack trace instead of"Error: Could not acquire lock on upstream_extensions.json. Another pull may be in progress. Please retry.".Suggested fix: Import
UserErrorand throw it instead of plainError, matching the old pull.ts behavior. The old rm.ts also threw plainError, so this was an inconsistency even before — but theUserErrorpath is the better UX.
Low
-
removeEntrylacksDeno.mkdirfor parent directory —lockfile_repository.ts:162:writeEntry(line 130) callsDeno.mkdir(dirname(this.lockfilePath), { recursive: true })before acquiring the lock, butremoveEntrydoes not. IfremoveEntryis ever called when the lockfile's parent directory doesn't exist,acquireLockwill throw aNotFounderror fromDeno.open(notAlreadyExists), which gets rethrown as an unhandled error. In practice this is unreachable — callers confirm the entry exists before calling removeEntry, which implies the lockfile was readable — but the asymmetry withwriteEntryis notable. -
getAllEntriesshallow copy doesn't protect nested entry mutation —lockfile_repository.ts:106: The method returns{ ...this.cache }which copies top-level keys but shares innerUpstreamExtensionEntryobject references. A caller doingentries["@foo/bar"].files!.push("injected")would mutate the internal cache. I verified all current callers are read-only, so this is theoretical. The regression test only verifies top-level key deletion, not nested mutation.
Verdict
PASS — No critical or high-severity issues. The one medium finding (error type regression) is on an uncommon failure path and the error message content is still descriptive; it's a cosmetic presentation regression, not a logic error. The refactoring is mechanically correct, all 16 callsites are properly migrated, the snapshot-frozen-at-construction contract is preserved, and the concurrent-writer protection (lock → re-read → merge → atomic-write) is faithfully ported. The cross-instance snapshot regression test is a good guardrail.
Summary
Pure persistence-layer refactor. Consolidates the scattered
upstream_extensions.jsonread/write surface into a singleLockfileRepository. No user-visible CLI behavior changes; lockfile JSON shape is byte-identical pre/post.W2 prequel for swamp-club#231 — the lifecycle services planned in W2 (Install/Remove/Upgrade) need to write the lockfile as part of their unit-of-work, but W1b only exposed read access via the
getLockedVersionclosure. Filed and lifecycled separately (swamp-club#233) per W2's Phase A audit; pre-committed threshold rule (>7 callsites → SPLIT) chose this carve-out.Net: ~160 LOC for the new repository, 16 callsites migrated, 2 duplicate
acquireLockhelpers consolidated.Architectural decision: asymmetric read/write semantics
This is the load-bearing design choice and it's documented in the JSDoc —
LockfileRepositorydeliberately splits read and write semantics:getEntry,getAllEntries,getLockedVersion) serve from a cache populated once at construction. The cache does NOT auto-refresh. This preserves W1b's "snapshot frozen at construction" race-window contract documented inextension_repository.ts:88-95— relocated one layer out, contract identical.writeEntry,removeEntry) acquire the advisory file lock, re-read CURRENT disk state, merge, atomic-write, release, and update the local cache. Two concurrent writers don't clobber each other.A regression test (
LockfileRepository: cross-instance snapshot regression) constructsrepoA, mutates the lockfile via a siblingLockfileRepository, assertsrepoAstill serves the OLD value, then constructsrepoBand asserts it sees the NEW value. Future contributors who try to "fix" the cached read to be live will hit this test.What changed
src/infrastructure/persistence/lockfile_repository.ts+ 12 unit tests.ExtensionRepositoryconstructor takesLockfileRepository(replaces W1b'sgetLockedVersionclosure). All 5 construction sites + the stub helper migrate in the same commit.installExtensioncallsites and 16 lockfile reader files migrated.doctor.ts,list.ts,update.ts) takeLockfileRepositorydirectly instead of function-shape DIs.emptyLockedVersionLookupdeleted (zero callers verified by grep).createInstallContext/createExtensionPullDeps/createExtensionRmDeps/createExtensionUpdateDepsare now async (they construct the snapshot-capturing repository); JSDoc pins the single-use rule (don't reuse a context across multiple operations).Test plan
deno checkcleandeno lintcleandeno fmtcleandeno run test— 5397 passed, 0 faileddeno run compile— binary built/tmp/swamp-smoke-233):swamp repo init✓swamp extension pull @stack72/system-extensions✓ (lockfile written with all expected fields)swamp extension list(log mode) ✓swamp extension list --json✓ (JSON shape unchanged)swamp doctor extensions✓ — all 5 registries pass (validates DI surface change)swamp model type search system✓ — types resolve viaExtensionRepositorylockfile-fallback path through new injectionswamp extension rm --force✓ — lockfile becomes{}, files deleted, dirs prunedextension pull --force✓ — second pull'swriteEntryre-reads disk under lockswamp extension update --check✓.lockfiles, 0 structural corruption (574–766ms per 50-worker run).Forward-only revert posture
No schema changes; no migration. Lockfile JSON shape unchanged → binary downgrade reads the same file back. Reverting this PR is purely a code revert.
Out of scope (deferred by design)
acquireLockrace window.🤖 Generated with Claude Code