Skip to content

feat(extensions): W2 — lifecycle services (Install/Remove/Upgrade) — closes swamp-club#231 + swamp-club#201#1310

Merged
stack72 merged 15 commits intomainfrom
feat/w2-extension-lifecycle-services
May 5, 2026
Merged

feat(extensions): W2 — lifecycle services (Install/Remove/Upgrade) — closes swamp-club#231 + swamp-club#201#1310
stack72 merged 15 commits intomainfrom
feat/w2-extension-lifecycle-services

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented May 5, 2026

Summary

W2 of the extension catalog rearchitecture (swamp-club#231). Introduces three narrow lifecycle services through which all catalog writes flow:

  • InstallExtensionService — owns the catalog-write surface for install. Phase 8 walks the per-extension subtree, calls each loader's bundleAndIndexOne, and commits via repository.saveAll([extension]) in one SQLite transaction. I-Repo-1 fires synchronously at install time rather than at the next steady-state loader pass.
  • RemoveExtensionService — owns rm. Closes swamp-club#201: today's rm leaves stale catalog rows; this service makes the catalog tombstone-save the FIRST step. Asymmetric ordering (catalog → lockfile → FS) is pinned in plan v4.
  • UpgradeExtensionService — thin facade that expresses upgrade-intent at the call site. Atomic upgrade pattern lives inside InstallExtensionService phase 8: saveAll([tombstoneAll(vN), vN+1]) commits in one transaction so I-Repo-1 evaluates against the post-save state.

CLI commands (extension_pull, extension_rm, extension_update) and 4 internal CONVERT callsites (auto_resolver_adapters, open, resolve_datastore, extension_update closure) now construct services rather than reaching into the catalog directly. The 5 KEEP callsites preserve byte-identical event-stream output (Pin 2).

Closes

  • swamp-club#231 — W2 phase of the catalog rearchitecture.
  • swamp-club#201 — extension rm now prunes catalog rows.

Plan v4 mapping

Step Description Status
1 Phase A audit + SPLIT decision done
2 LockfileRepository (split prequel) shipped externally (PR #1298 + #1299)
3 InstallExtensionService commits 2a–2c
4 Migrate 4 install callsites + KEEP wrappers commit 3
5 RemoveExtensionService commit 4
6 swamp-club#201 lifecycle-layer reproducer commit 4
7 rm.ts migration commit 4
8 UpgradeExtensionService commit 5
9 Order-independence regression commit 7 (bulk-upgrade form — see note below)
10 Crash-state recovery + FaultingStubRepository commit 8
11 DuplicateTypeError UserError surface commits 6 + 10
12 CLI routing test commit 7
13–15 docs + verify + soak commit 9 + this PR + post-merge

Pinned architectural contracts

Three contracts have load-bearing tests so a future regression fails the suite:

  • Pin 1bundleAndIndexOne MUST NOT call catalog.upsert. One regression test per loader (5 total) asserting catalog row count is unchanged after a bundleAndIndexOne call.
  • Pin 2 — KEEP callsite event streams are byte-identical to pre-W2. Renderer-level baseline tests pin the ExtensionPullEvent shape.
  • Pin 3 — W4-inherits surface documented inline so the W4 planning agent doesn't have to re-derive the deletion list.

Three deliberate deviations from plan v4 (per architecture-agent review)

  1. DuplicateTypeError JSON shape. Plan v4 sketched { error: "duplicate_type", existing: ..., conflicting: ... } (string discriminator). Implementation uses { error: "<human-readable message>", duplicateType: { kind, type, existing, conflicting } } (side-car object). Functionally equivalent for jq consumers and AI agents; the side-car shape lets the human-readable message stay in the conventional error field. Pinned by a parity test (log message and JSON error agree).
  2. Order-independence test framing. Tests bulk-upgrade across two distinct extensions (distinct catalog rows) rather than the literal saveAll([v2, tombstoneAll(v1)]) within a single extension. The catalog's primary key is source_path — v1 and v2 of the same extension share a row, so the literal "inverted order" is a no-op concept. Documented inline.
  3. CLI routing test for extension update. Source-text grep asserting UpgradeExtensionService is constructed in the command's .action(...) closure. A runtime test would require full registry standup; the source-text form catches the regression mechanically.

W4-inherits surface

When the unified loader (KindAdapter) lands in W4, the deletion surface is:

  • 5 bundleAndIndexOne public methods → 1 dispatch.
  • 4 CONVERT callsites' direct InstallExtensionService construction stays (services persist past W4).
  • 5 KEEP callsites' wrapper-internal service-construction stays.
  • bundleAndIndexOne deliberately does NOT touch legacyStore so W4's legacyStore deletion stays clean.

Verification

  • deno fmt / deno lint / deno check / deno run test (5433 passed, 0 failed, 28 ignored).
  • deno run compile produces a working binary.
  • Author smoke test in /tmp/swamp-w2-smoke-*: init → pull two extensions → list → model type search → rm one → catalog cleaned + types removed from search → force-pull the other → atomic upgrade pattern handles same-version reinstall → rm → double-rm yields clean Extension <name> is not installed. UserError.
  • swamp-uat extension suite (tests/cli/extension/): 87 passed against the freshly compiled binary, including sibling_coinstall_test (~2m exercising multi-extension install + rm of one).

Follow-up

Filed as separate swamp-club issues (none blocking W2 merge — all are post-merge or future-architecture work):

  • swamp-club#251 — swamp doctor extensions repair for catalog-only orphans (W6 territory). Surfaces when an extension has catalog rows but no lockfile entry — extension rm short-circuits today.
  • swamp-club#253 — UAT additions for the W2 lifecycle services. Tracks new tests/cli/extension/ files for feat: Selective telemetry argument recording #201 closure visibility, atomic upgrade, force-pull regression, half-state recovery messages, and DuplicateTypeUserError JSON shape.
  • swamp-club#254 — Cross-process concurrency stress for the lifecycle services, mirroring swamp-club#234s pattern. Architecturally safe (one SQLite txn per saveAll, WAL on, per-extension atomicity), but the explicit stress test is missing.
  • swamp-club#255 — Plan v4 step 9 literal saveAll([v2, tombstoneAll(v1)]) test is structurally untestable under the current catalog PK semantics (source_path is the PK; v1 and v2 of the same extension share a row). The bulk-upgrade order-independence test in this PR covers the meaningful generalization. Issue captures the open architectural question for a future pass.

Test plan

  • CI green
  • Auto-merge

🤖 Generated with Claude Code

stack72 and others added 14 commits May 5, 2026 18:35
…oaders

W2 of the extension catalog rearchitecture (swamp-club#231) introduces
three lifecycle services that own catalog writes end-to-end. This
commit lays the foundation: a public per-file entry point on each of
the 5 user loaders that the lifecycle services will call during
install to extract type metadata WITHOUT writing to the catalog.

## What ships

`bundleAndIndexOne({ absolutePath, relativePath, baseDir })` on:
- UserModelLoader   — returns { kind: "model" | "extension", typeNormalized, bundlePath, fingerprint } | null
- UserDriverLoader  — returns { kind: "driver",    ... } | null
- UserVaultLoader   — returns { kind: "vault",     ... } | null
- UserDatastoreLoader — returns { kind: "datastore", ... } | null
- UserReportLoader  — returns { kind: "report",    ... } | null

Each method does the bundle build + dynamic import + Zod schema
validation that the existing private `rebundleAndUpdateCatalog`
methods do, but stops there — no catalog row is written, no runtime
registry binding fires. The lifecycle service is the new owner of
those writes (via repository.save(extension)) so I-Repo-1 evaluates
the post-save state and can fire DuplicateTypeError synchronously at
install time.

## Pin 1 — load-bearing for I-Repo-1-fires-at-install

The architecture-agent's pre-W2 review pinned a contract on this
method: bundleAndIndexOne MUST NOT call catalog.upsert. If a future
refactor sneaks an upsert back in, I-Repo-1 silently stops firing at
install time — the user-visible payoff regresses without the test
suite noticing.

Each loader gets a regression test asserting that
catalog.findAll().length is unchanged after a bundleAndIndexOne call:
- UserModelLoader.bundleAndIndexOne: returns model metadata without
  writing catalog rows (Pin 1)
- UserModelLoader.bundleAndIndexOne: returns null for files without a
  model/extension export
- UserDriverLoader / UserVaultLoader / UserDatastoreLoader /
  UserReportLoader: returns <kind> metadata without writing catalog
  rows (Pin 1)

Six new tests total. All pass.

## Why this is its own commit

Plan v4 step 3 prerequisite. The InstallExtensionService (commit 2)
calls these methods to build the Extension aggregate's Sources before
calling repository.save(extension). Splitting this commit out keeps
each subsequent W2 commit focused on one architectural concern.

## LOC budget

Production code: 334 LOC across 5 loader files — within the
pre-committed ≤ 400 LOC threshold for commit 1's loader-API additions.
Test code: 426 LOC across 5 test files (Pin 1 regression net per
loader).

## Out of scope

- The existing rebundleAndUpdateCatalog methods are unchanged and
  still write to the catalog directly via legacyStore. Refactoring
  them to delegate to bundleAndIndexOne is a future cleanup (W4
  KindAdapter territory) — keeping them parallel for now is the lower-
  risk shape and the duplication is small.
- The lifecycle services themselves (InstallExtensionService etc.)
  ship in subsequent commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… event-stream baseline

W2 of the extension catalog rearchitecture (swamp-club#231). This is
the first of three commits implementing plan v4 step 3+4 KEEP-side
(InstallExtensionService + pull.ts migration + Pin 2 regression test).

## What ships

1. **Test seam on `ExtensionPullDeps`.** New optional
   `installExtensionFn?: (ref, ctx) => Promise<InstallResult | undefined>`
   field. Defaults to the real `installExtension` from this module;
   `extensionPull` falls back automatically when unset, so production
   callers are unaffected. Tests inject a stub to drive the
   `ExtensionPullEvent` stream without a real registry, tarball, or
   filesystem write.

2. **Pin 2 event-stream regression tests** in `pull_test.ts`. Three
   tests lock in the current shape:
   - `installing → completed` for a successful install
   - `installing → orphans-pruned → completed` when prior version files
     were pruned
   - `installing` only when the install short-circuits (alreadyPulled)

   Each test asserts the exact event-kind sequence AND the structural
   shape of the payload (key fields present with expected values).

## Why a separate commit

Plan v4 commit 2 is the architecturally heavy piece (service skeleton +
pull.ts migration + phase 8 type extraction). The architecture-agent's
Pin 2 review demanded a regression test that asserts BYTE-IDENTICALITY
of the event stream pre/post refactor. The natural shape: capture
current behavior FIRST (this commit), refactor in subsequent commits,
watch the test stay green.

If the test ever fails during 2b/2c, that's the canary the architecture-
agent designed it to be — the inner refactor leaked into the event
payload.

## What's next

- 2b: Create `InstallExtensionService` class and migrate
  `pull.ts:installExtension` body into it. Pure refactor; Pin 2 test
  must remain green.
- 2c: Add phase 8 — walk the extracted per-extension subtree, call
  each loader's `bundleAndIndexOne`, build the `Extension` aggregate,
  call `repository.save(extension)` with FS rollback on
  `DuplicateTypeError`. The architectural payoff that activates I-Repo-1
  at install time.

Test plan:
- [x] deno check, lint, fmt clean
- [x] deno run test — 5420 passed (3 new), 0 failed

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…xtensionPull routing

W2 of the extension catalog rearchitecture (swamp-club#231). Second of
three commits implementing plan v4 step 3+4 KEEP-side (service +
pull.ts wrapper migration). This is a pure refactor — Pin 2 event-
stream regression test (added in 2a) verifies byte-identicality.

## What ships

1. **`src/libswamp/extensions/install_extension_service.ts`** — new
   file with the `InstallExtensionService` class. Single public method
   `execute(ref, ctx)` that delegates to the existing `installExtension`
   free function unchanged. The service is the architectural boundary
   the W2 lifecycle owns; phase 8 (synchronous type extraction +
   `repository.save` + FS rollback on `DuplicateTypeError`) lands in
   commit 2c.

2. **`extensionPull` routes through the service.** The default fallback
   for `deps.installExtensionFn` switches from the bare
   `installExtension` free function to
   `(ref, ctx) => new InstallExtensionService().execute(ref, ctx)`.
   Tests still inject their own stub via `installExtensionFn`.

## Verification — Pin 2 holds

The architecture-agent's Pin 2 contract requires byte-identical event
stream pre/post the W2 refactor. The 3 Pin 2 baseline tests added in 2a
all pass after this commit:
- `extensionPull: emits installing → completed for a successful install`
- `extensionPull: emits installing → orphans-pruned → completed when prior version files are pruned`
- `extensionPull: emits only installing when install short-circuits (alreadyPulled)`

If 2c (phase 8) ever inadvertently changes the event payload, these
tests catch it before it leaks to renderers.

## What's next

- 2c: Add phase 8 to `InstallExtensionService.execute()` —
  bundleAndIndexOne walk per kind, build Extension aggregate,
  repository.save with FS rollback on DuplicateTypeError. The
  architectural payoff that activates I-Repo-1 at install time.

Test plan:
- [x] deno check, lint, fmt clean
- [x] deno run test — 5420 passed, 0 failed (Pin 2 baseline holds)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
….save + FS rollback

W2 of the extension catalog rearchitecture (swamp-club#231). Third of
three commits implementing plan v4 step 3+4 KEEP-side. **This commit
delivers the architectural payoff for W2** — I-Repo-1 fires
synchronously at install time.

## What ships

1. **`InstallExtensionService.execute()` runs phase 8.** After the
   wrapped install completes (filesystem write, lockfile write), the
   service walks the per-extension subtree on disk and builds the
   `Extension` aggregate's Sources by calling each loader's
   `bundleAndIndexOne` (Pin 1 contract). Then commits via
   `repository.saveAll([topLevelExt, ...freshlyInstalledDeps])` so
   I-Repo-1 evaluates against the post-save state across the entire
   install operation as a unit.

2. **FS rollback on `DuplicateTypeError`** — the architecture-agent's
   "expensive miss #2". SQLite ROLLBACK does not undo filesystem
   mutations. The service catches `DuplicateTypeError` from `saveAll`,
   deletes every just-installed file (top-level + deps) via the
   `extractedFiles` lists, restores the lockfile to its pre-install
   state, and rethrows as a `UserError` that names both conflicting
   extensions.

3. **Service constructor takes `denoRuntime` + `repository`** — the
   deps phase 8 needs (loaders construct lazily per kind directory).
   Optional `installExtensionFn` test seam mirrors
   `ExtensionPullDeps`'s pattern; lets phase 8 be exercised against a
   pre-staged on-disk subtree without driving a real registry/tarball.

4. **`installZodGlobal()` added to all 5 loaders' `bundleAndIndexOne`**
   so the bundle-import path runs cold-start-safe outside the
   `loadModels` / `buildIndex` entry points.

5. **Pull.ts wiring.** `ExtensionPullDeps` gains optional
   `denoRuntime` + `repository` fields. When BOTH are provided,
   `extensionPull` routes through the service (W2 contract fires).
   When either is missing, falls back to the pre-W2 free-function path
   (catalog rows populated lazily on next loader pass — same behavior
   as before W2). Migrating callers to provide both is plan v4 commit
   3 (CONVERT callsites + KEEP wrapper-internal swap).

6. **Two integration tests** (`install_extension_service_test.ts`)
   exercise the service end-to-end with a real catalog and real
   bundling:
   - `phase 8 populates the catalog with Indexed Sources` — proves the
     bundleAndIndexOne walk + repository.save chain works against a
     real on-disk fixture.
   - `DuplicateTypeError triggers FS rollback and surfaces as
     UserError` — proves the architecture-agent's expensive-miss-#2
     contract holds. Stages two extensions claiming the same type;
     second install fails with UserError, B's files are gone, B's
     lockfile entry is absent, A's catalog row is preserved.

## Pin 2 holds

The 3 Pin 2 baseline tests added in 2a still pass. Event stream byte-
identical pre/post the W2 refactor.

## Budget check (architecture-agent's threshold rule)

Cumulative diff through this commit: ~1713 LOC. Pre-committed budget
was ≤1500 LOC for the entire W2 PR. We've crossed the threshold
mid-W2 — only 1 of 3 lifecycle services (Install) is done. Per the
rule, this is the signal to **SPLIT before continuing**.

Options for the next step (surfaced to the human):
- Open this branch as the "W2 install" PR; file Remove/Upgrade as
  follow-ups.
- Or push through and split later — the threshold was a heuristic, not
  a hard invariant.

Test plan:
- [x] deno check, lint, fmt clean
- [x] deno run test — 5422 passed (2 new), 0 failed (Pin 1 + Pin 2
      regression tests both green)
- [x] Phase 8 verified against real bundling + real catalog + real
      I-Repo-1 firing + real FS rollback

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…InstallExtensionService

W2 of the extension catalog rearchitecture (swamp-club#231). With the
service skeleton + phase 8 done in commit 2, this commit **activates
phase 8 in production**. After this lands, every install path through
the CLI fires I-Repo-1 synchronously at install time —
`DuplicateTypeError` becomes the user-visible error for
cross-extension `(kind, type)` collisions instead of silent first-wins.

## What ships

### Pull.ts factory + KEEP-side

- `createExtensionPullDeps` accepts optional `args.denoRuntime` and
  `args.repository` and threads them onto the returned
  `ExtensionPullDeps`. Backwards-compatible — callers that don't pass
  them see no change.
- `PullContext` (CLI-side wrapper) gains optional `denoRuntime` +
  `repository` fields. `pullExtension` threads them onto the
  `ExtensionPullDeps` it constructs internally.

### Production callsites — 4 CONVERT, 2 KEEP-wired

CONVERT (build their own InstallContext + call `installExtension`
inline):
- **`src/cli/auto_resolver_adapters.ts`**: auto-install on missing
  type. Routes through `InstallExtensionService` when the parent
  context has a repository (via the existing `repository?` field).
  Falls back to free-function `installExtension` when no repository
  is available.
- **`src/cli/resolve_datastore.ts`**: datastore auto-update. Constructs
  catalog + repository inline, executes via the service, closes the
  catalog in `finally`.
- **`src/cli/commands/extension_update.ts`** (bulk upgrade): catalog
  stays open for the duration of the bulk run, repository is freshly
  constructed per upgrade with a fresh lockfile snapshot.

KEEP (use the `extensionPull` async generator wrapper, get phase 8 by
having `denoRuntime` + `repository` plumbed through `PullContext`):
- **`src/cli/commands/extension_pull.ts`** (`swamp extension pull` CLI
  entry).
- **`src/cli/commands/open.ts`** (web UI `pullExtension` callback).

### Lifetime + cleanup

Each new construction of `ExtensionCatalogStore` is paired with a
`try { ... } finally { catalog.close(); }` so the SQLite handle is
released cleanly. The bulk-update path keeps the catalog open across
all upgrades for atomic-per-upgrade semantics.

### Pin 2 holds

Event-stream byte-identicality verified. All 3 Pin 2 baseline tests
remain green:
- `installing → completed` happy path
- `installing → orphans-pruned → completed` with prior version files
- `installing` only when alreadyPulled short-circuits

The W2 service routes through the same `extensionPull` event sequence;
phase 8 is internal to `InstallExtensionService.execute()` and emits no
new events (the renderer surface is unchanged).

## What this delivers user-visibly

- `swamp extension pull <name>` against a colliding type now emits
  `DuplicateTypeError` with both source paths named, exit code 1, and
  filesystem rollback. Previously: silent first-wins, no error.
- `swamp open` UI install path inherits the same contract.
- `swamp extension update` bulk upgrade fires phase 8 per upgrade —
  per-extension atomic transitions; A's collision-rollback does NOT
  roll back B's already-completed upgrade in the same bulk run.
- Auto-resolver (lazy install on unknown type) and datastore
  auto-update get phase 8 too.

## Out of scope (next commits)

- `swamp extension rm` still leaves catalog rows behind (closes
  swamp-club#201 — needs RemoveExtensionService, commit 4).
- Atomic upgrade as a single saveAll([tombstoneAll(v1), v2])
  transaction — needs UpgradeExtensionService, commit 5. Currently
  upgrade is "save v2 over v1" via the per-upgrade install path.

Test plan:
- [x] deno check, lint, fmt clean
- [x] deno run test — 5422 passed, 0 failed (Pin 1 + Pin 2 + service
      tests all green)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lub#201

W2 of the extension catalog rearchitecture (swamp-club#231). This
commit delivers the second user-visible W2 payoff: **`swamp extension
rm` now prunes catalog rows** (closes swamp-club#201). Pre-W2,
`extension rm` deleted files + the lockfile entry but never touched
the catalog, leaving stale `(kind, type)` rows that survived removal.

## What ships

1. **`src/libswamp/extensions/remove_extension_service.ts`** — new
   `RemoveExtensionService` class. Single public method `execute(name)`
   with the **inverted ordering** that's load-bearing for safety:

   - Catalog tombstone-save FIRST (`saveAll([tombstoneAll(ext)])`) —
     DELETEs every row owned by this extension in one SQLite
     transaction.
   - Lockfile remove SECOND.
   - Filesystem delete LAST.

   The architecture-agent's "expensive miss #1": going FS-first leaves
   a window where the catalog points at deleted bundles and resolution
   crashes for that window. Catalog-first means a mid-rm crash leaves
   files on disk but the catalog clean — next loader pass surfaces
   orphans via the existing `findStaleFiles` fallback. Pinned in plan
   v4.

   **Idempotency contract.** Double-rm yields a clean
   `UserError("not installed")` on the second call, NOT silent success
   and NOT an ambiguous error. Plan v4 challenge #9.

2. **`rm.ts:extensionRm` migrated to a thin wrapper.** The async
   generator surface and `ExtensionRmEvent` shape are preserved so
   renderers and the CLI two-phase prompt flow are untouched. Internal
   work is delegated to `RemoveExtensionService`.

3. **`ExtensionRmDeps` evolved.** Old per-op stubs (`removeFile`,
   `readDirEntries`, `removeDir`) replaced by a required
   `repository: ExtensionRepository` field. Filesystem deletion is now
   internal to the service, not a deps concern. `createExtensionRmDeps`
   constructs the catalog + repository; the CLI command closes the
   catalog handle in `finally`.

4. **swamp-club#201 lifecycle-layer reproducer test** at
   `remove_extension_service_test.ts`. Stages an extension, installs
   via `InstallExtensionService` (catalog populated), removes via
   `RemoveExtensionService`, asserts:
   - `repository.loadByName(name).length === 0` (catalog empty)
   - `lockfileRepository.getEntry(name) === null`
   - tracked files deleted

5. **Idempotency, never-installed, and preserve-others tests.** Triple
   coverage of the safety contracts.

6. **rm_test.ts pruned of 3 stub-based tests** that depended on the
   removed deps fields. Coverage shifts to the real-fixture service
   tests where filesystem behaviour is exercised honestly. The
   wrapper-specific tests (event shape, missing-extension error path,
   #120 sibling-preservation) remain.

## Pin 1 + Pin 2 hold

All Pin 1 (loader catalog-row-count regression) and Pin 2
(`extensionPull` event-stream byte-identicality) tests still pass.

## Test plan

- [x] deno check, lint, fmt clean
- [x] deno run test — 5424 passed, 0 failed
- [x] swamp-club#201 reproducer green at the lifecycle layer

## What this delivers user-visibly

- `swamp extension rm <name>` now prunes the catalog rows for that
  extension. `swamp doctor extensions` no longer reports the orphan
  rows. `swamp model type search` no longer returns the type after rm.
- `swamp extension rm` of a never-installed (or already-removed)
  extension yields a clean error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ade pattern

W2 of the extension catalog rearchitecture (swamp-club#231). This
commit completes the third lifecycle service AND fixes a regression
introduced earlier in the W2 series.

## Atomic upgrade — what changed

`InstallExtensionService.execute` phase 8 now tombstones any existing
aggregates with the SAME name but a DIFFERENT version BEFORE calling
`repository.saveAll`:

  saveAll([...tombstoneAll(currentVersions), ...newExtensions])

This commits in one SQLite transaction so I-Repo-1 evaluates only the
post-save state where the new version holds the type identifier.
Without this, force-pulling an already-installed extension OR any
version-bump pull would have failed with `DuplicateTypeError` because
the prior version's rows still claimed the type.

**This is also the fix for a regression introduced in commit 3.**
Before this commit, `swamp extension pull foo --force` against an
already-installed foo would have failed at install time. Caught by the
new force-reinstall regression test.

## What ships

1. **`InstallExtensionService` phase 8 tombstone-of-prior-versions.**
   Same-version re-installs skip the tombstone — the diff-save in
   `saveAll` handles overwrite semantics.
2. **`src/libswamp/extensions/upgrade_extension_service.ts`** — new
   `UpgradeExtensionService` class. Thin facade over
   `InstallExtensionService` so callers that mean "upgrade" can express
   the intent at the call site. Internally delegates; the atomic
   tombstone happens in the install service's phase 8.
3. **`src/cli/commands/extension_update.ts`** routes through
   `UpgradeExtensionService` instead of `InstallExtensionService`. Same
   underlying semantics; clearer intent.
4. **Two integration tests:**
   - `v1 → v2 upgrade tombstones v1's catalog rows in one transaction`
     — pins the atomic-upgrade contract; after upgrade, only v2's rows
     are in the catalog.
   - `force-reinstall of already-installed extension does not fail
     with DuplicateTypeError` — the regression net for the issue
     described above.

## Bulk-upgrade behaviour (plan v4 ADV-5)

Each upgrade is its own atomic transition via its own `saveAll`. If
A's upgrade rolls back due to a collision with unchanged B, every
other extension already-upgraded in the bulk run STAYS upgraded — there
is no all-or-nothing rollback across the bulk run. Pinned in
`UpgradeExtensionService` JSDoc.

## Pin 1 + Pin 2 hold

All Pin 1 (loader catalog-row-count regression) and Pin 2
(`extensionPull` event-stream byte-identicality) tests still pass.

## Test plan

- [x] deno check, lint, fmt clean
- [x] deno run test — 5426 passed, 0 failed (2 new regression tests)

## What this delivers user-visibly

- `swamp extension update` bulk upgrade of v1 → v2 succeeds atomically;
  the catalog never observes a half-installed state.
- `swamp extension pull foo --force` against an already-installed foo
  succeeds (regression fix).
- Cross-extension `(kind, type)` collision still fires
  `DuplicateTypeError` at install/upgrade time as designed; only
  same-name-different-version "collisions" (the upgrade case) are
  handled transparently via the atomic tombstone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ctured JSON shape

Plan v4 step 11: surface I-Repo-1 collisions as a UserError subclass that
the top-level error handler renders as a clean single-line message in log
mode (no stack trace) and as a structured `duplicateType` object in JSON
mode.

- New `DuplicateTypeUserError` extends `UserError`, carries `kind`,
  `typeNormalized`, `existing`, `conflicting`. Pinned message format
  names both extensions and points the user at `swamp extension rm
  <name>` to recover.
- Lives under `src/domain/extensions/` (not next to the persistence-
  internal `DuplicateTypeError`) so the presentation layer can import
  it without violating DDD layer rules.
- `install_extension_service.ts` `mapDuplicateTypeErrorToUserError`
  returns the new subclass so phase 8 collisions surface with full
  context after FS rollback.
- `error_output.ts` `buildErrorJson` recognises the subclass and emits
  the pinned `duplicateType` shape for `--json` consumers (jq, AI
  agents, CI). Return type widened from `Record<string, string>` to
  `Record<string, unknown>` to allow the nested object.
- 2 new tests in `error_output_test.ts`: structured shape + log/JSON
  parity (the human-readable message and the structured fields agree).

Verification: deno fmt / lint / check / test (5428 passed).
…ession tests

Plan v4 step 9 + step 12.

- `extension_repository_test.ts`: bulk-upgrade order-independence test.
  Two extensions A and B each upgrading v1 → v2 in one saveAll. Submit
  the four entries as `[tombstone(A1), A2, tombstone(B1), B2]` and as
  the inverted order; assert the loaded catalog is identical. Guards
  against a future refactor where saveAll lets an intermediate diff
  state leak across extensions or fires I-Repo-1 mid-loop on a transient
  state. (Source paths are deliberately distinct between v1 and v2 so
  the test exercises the cross-extension ordering, not the within-
  extension path-collision case which is already pinned by the canonical
  upgrade test.)
- `extension_update_test.ts`: CLI routing regression. Asserts the
  command source imports and constructs `UpgradeExtensionService` so a
  future refactor can't silently route through `installExtension(...)`
  directly and bypass phase 8's atomic-tombstone pattern.

Verification: deno fmt / lint / check / test (5430 passed).
…StubRepository

Plan v4 step 10. Pin the SQLite transaction-rollback contract that the
W2 lifecycle services depend on: a generic non-`DuplicateTypeError`
failure inside `repository.saveAll` must leave the catalog in its
pre-save state so a retry succeeds.

- New `FaultingStubRepository` wrapper (~50 LOC under the audit's
  ≤150 threshold). Subclasses `ExtensionRepository` and overrides
  `saveAll` with a one-shot fault-injection slot. Slots into the
  lifecycle services without an interface refactor.
- Install crash-recovery test: inject fault on `saveAll`. Assert the
  catalog is clean (SQLite ROLLBACK), lockfile + FS still hold the
  partial install (intentional — only DuplicateTypeError triggers FS
  rollback), and a retry succeeds via the diff-save reconciling the
  catalog against the on-disk + lockfile state.
- Remove crash-recovery test: install via the real repository, then
  swap in the faulting wrapper for rm. Inject fault on the tombstone
  saveAll (the FIRST mutation of rm, before lockfile + FS are touched).
  Assert all state survives the fault and a retry yields a clean rm.

The other two enumerated boundary states (install pre-save FS fault,
remove post-save FS fault) live below the catalog layer and aren't
usefully covered by a stub repository — they're mechanical operations
on the local filesystem and have no transaction-semantics claim to pin.

Verification: deno fmt / lint / check / test (5432 passed, +2 from
commit 7).
…s section

Documents the W2 lifecycle layer end-to-end so a reader can understand
the catalog-write surface without reading the source.

- Removal section: updated to describe catalog → lockfile → filesystem
  ordering (closes swamp-club#201) and the lifecycle's "not installed"
  semantics for partial-state extensions.
- New "Lifecycle Services" section covers:
  - The three services (Install / Remove / Upgrade) and the rule that
    CLI commands construct services rather than reaching into the
    catalog directly.
  - Asymmetric ordering and the rationale (catalog-first rm vs
    catalog-last install — pinned in plan v4 challenge #3).
  - Phase 8: synchronous type extraction at install via per-loader
    `bundleAndIndexOne`, with the Pin 1 contract that loaders never
    write to the catalog.
  - The atomic upgrade pattern `saveAll([tombstoneAll(v1), v2])` and
    the I-Repo-1-evaluates-post-state semantics that make it safe.
  - FS rollback on DuplicateTypeError + the pinned
    `DuplicateTypeUserError` JSON shape consumers can rely on.
  - Bounded atomicity: per-extension is the unit, not the bulk run.
  - Crash-state recovery posture for both install and rm.
  - W3+ inheritance: services persist; per-loader methods are the
    eventual deletion surface.

No code changes.
…fault

Closes the documented-but-unimplemented gap from commit 6: a non-
DuplicateTypeError fault inside InstallExtensionService phase 8 (or the
RemoveExtensionService catalog tombstone-save) previously rethrew a
plain Error with no recovery guidance. The service's JSDoc had pinned a
user-visible message; this commit actually surfaces it.

- InstallExtensionService.execute: wrap non-DuplicateTypeError saveAll
  faults in a UserError. Message names the extension, the underlying
  fault for diagnostics, and points at `swamp doctor extensions` /
  retry `swamp extension pull <name>` to reconcile.
- RemoveExtensionService.execute: wrap saveAll faults in a UserError.
  SQLite ROLLBACK already kept the catalog in its pre-rm state, so the
  message tells the user nothing changed and to retry.
- Crash-state recovery tests now assert the new UserError type and that
  the message names the extension, includes the underlying fault for
  diagnostics, and carries the recovery action.

Verification: deno fmt / lint / check / test (5432 passed) + compile +
swamp-uat extension suite (87 passed) against the freshly compiled
binary.
Closes the cosmetic "Dropping orphan row" warning observed during
`extension rm` in real-world smoke testing on a fresh repo.

**Root cause.** Phase 8's `buildExtensionFromDisk` joined the
per-extension subtree off the user-supplied `repoDir`, which the CLI
passes through unchanged from `Deno.cwd()`. With a relative `repoDir`
(or one whose canonicalization differs from a later filesystem walk),
the `source_path` written to the catalog was relative — but the legacy
`user_*_loader.ts` `populateCatalogFromDir` bootstrap (which runs the
first time any model command fires after pull) walks with absolute
paths. The two forms didn't UPSERT onto each other, leaving:

  - 8 rows with relative source_path + full identity (from phase 8)
  - 8 rows with absolute source_path + EMPTY identity (from the
    legacy bootstrap; legacy `upsert()` never touches identity columns)

`loadByName` materialised the full table and ran the empty-identity
fallback for the 8 stragglers — which on macOS where /tmp symlinks to
/private/tmp couldn't always resolve, producing one "Dropping orphan
row" warning per source file before the rm succeeded.

**Fix.** Resolve `repoDir` to absolute in `buildExtensionFromDisk`
before computing `extRoot`. Phase 8 now writes the same absolute
`source_path` form the legacy bootstrap uses; the second `upsert()`
hits the existing row's `ON CONFLICT(source_path)` clause and the
identity columns survive (they're not in the legacy upsert's UPDATE
list). No more empty-identity duplicates, no more orphan warnings on
rm.

Smoke test result: catalog goes from 16 rows (8 named + 8 ghost)
to 8 rows; `extension rm` now produces 0 orphan warnings.

Regression test: new `catalog source_path is absolute` test pins the
contract so a future refactor can't reintroduce relative paths.

Verification: deno fmt / lint / check / test (5433 passed, +1 for the
new regression) + recompiled binary + UAT extension suite (87 passed)
+ smoke retest (0 warnings on rm).
The order-independence test exercises the bulk-upgrade case (two
extensions, distinct rows). Plan v4 step 9's literal form
`saveAll([v2, tombstoneAll(v1)])` within a single extension is a no-op
concept under the catalog's `source_path` primary key — v1 and v2 of
the same extension at the same path share the same row. Document the
reason inline so a future reader doesn't try to re-add the literal
test and find it untestable.

Per architecture-agent review of W2.
@stack72 stack72 force-pushed the feat/w2-extension-lifecycle-services branch from 039ade4 to f594361 Compare May 5, 2026 17:46
…dows EBUSY

Two tests in `rm_test.ts` (\`extensionRm: removing one sibling…\` and
\`extensionRmPreview: resolves dependents…\`) call \`createExtensionRmDeps\`
directly instead of going through the \`fakeDeps\` helper. After commit 4
the deps factory opens an \`ExtensionCatalogStore\` (\`_extension_catalog.db\`),
which on Windows holds the file open until \`.close()\` is called. The
tests' \`Deno.remove(tmpDir, { recursive: true })\` then fails with
\`os error 32: The process cannot access the file because it is being used
by another process\`.

Fix: close \`deps.repository.legacyStore\` before the recursive remove and
wrap the remove in the standard Windows-only \`.catch(() => {})\` per the
CLAUDE.md guidance — same pattern \`fakeDeps\` already uses.

CI green on ubuntu-latest; Windows job (windows-latest) was the sole
failing leg of PR #1310.
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. Remove failed during catalog write exposes raw SQLite error text. In remove_extension_service.ts, the catch block interpolates error.message into the user-facing message: Remove failed for X during catalog write (SQLITE_BUSY: database is locked). .... The retry guidance is excellent, but internal database error strings may confuse users. Consider a fixed fallback like (catalog error — see logs with --verbose) instead of the raw error.message, reserving the full message for a debug log line.

  2. DuplicateTypeUserError message includes full absolute canonicalPath values. The resulting single-line message can be very long on deeply-nested repos. Since the structured duplicateType object in JSON mode already carries canonicalPath, the log-mode message could reference just extensionName@version without the path, trimming noise for the common case.

Verdict

PASS — Command signatures, flags, and renderer output are unchanged. The new DuplicateTypeUserError surfaces correctly in both log mode (clean single-line message, no stack trace) and JSON mode (new structured duplicateType sidecar). The extension rm idempotency improvement (Extension X is not installed.) and catalog-write failure guidance (Retry \swamp extension rm X`.`) are both clear and actionable. The two suggestions above are minor polish and do not block merge.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Well-architected PR that introduces three lifecycle services for the extension catalog write surface. The DDD principles are correctly applied throughout.

Architecture (DDD)

  • InstallExtensionService, RemoveExtensionService, and UpgradeExtensionService are properly placed as Application Services in the libswamp layer — they orchestrate domain objects (Extension aggregates, Sources, tombstones) and own the transaction boundary via repository.saveAll.
  • The Extension aggregate is correctly the only entity with a repository. The tombstone pattern (tombstoneAll) is a domain primitive used by the services — good separation.
  • DuplicateTypeUserError correctly lives in the domain layer (domain/extensions/) rather than infrastructure, avoiding a cross-layer dependency. The DuplicateTypeKind type duplication is an appropriate DDD tradeoff to keep the domain free of infrastructure imports.
  • The asymmetric ordering (install: FS→lockfile→catalog; remove: catalog→lockfile→FS) is well-reasoned for crash-state recovery and is consistently documented/tested.

Import Boundary

All CLI commands and presentation renderers correctly import from src/libswamp/mod.ts. The three new services are properly re-exported from mod.ts (lines 667-673). No new violations of the libswamp import boundary introduced.

Test Coverage

Comprehensive and well-structured:

  • InstallExtensionService: Phase 8 catalog population, absolute source_path regression, DuplicateTypeError FS rollback, crash-state recovery via FaultingStubRepository
  • RemoveExtensionService: swamp-club#201 lifecycle reproducer, double-rm idempotency, never-installed error, isolation from other extensions, catalog saveAll fault recovery
  • UpgradeExtensionService: Atomic v1→v2 tombstone contract, force-reinstall regression
  • Pin 1: Regression tests across all 5 loaders verifying bundleAndIndexOne does not write catalog rows
  • Pin 2: Renderer-level baseline tests for ExtensionPullEvent shape
  • DuplicateTypeUserError: JSON shape pinned with structural assertions, log+JSON output parity test
  • CLI routing: Source-text assertion that extension update constructs UpgradeExtensionService

Tests follow project conventions: @std/assert, @std/path, Windows EBUSY cleanup, proper test naming.

Code Quality

  • AGPLv3 license headers present on all new .ts files
  • No any types introduced (pre-existing AnyOptions in CLI commands is unchanged)
  • Named exports used throughout
  • Promises properly awaited — no fire-and-forget patterns
  • @std/path used for all path operations
  • Error messages include actionable recovery guidance (swamp doctor extensions, retry commands)

Suggestions

  1. The test helper functions (withFixtureRepo, stageModel, makeStubInstallResult, makeInstallContext) are duplicated across the three service test files. A future pass could extract these into a shared test helper — not blocking since each test file is self-contained and the duplication is limited to test scaffolding.

  2. pruneEmptyDirs in remove_extension_service.ts:194 uses a bare catch {} — this is fine for the "stop walking upward on any error" pattern, but a future lint pass might flag it. Consider binding the error even if unused, or adding a brief inline comment.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. install_extension_service.ts:236 — mixed path representations in Source aggregate. buildExtensionFromDisk resolves absoluteRepoDir = resolvePath(repoDir) at line 226 and uses it for extRoot and makeSourceLocation (producing absolute source paths), but passes the original repoDir (possibly relative, e.g. via --repo-dir ./foo) to makeLoaderForKind at line 236. The loaders compute bundlePath using join(this.repoDir, ".swamp", "bundles", ...), which can produce a relative path like ./.swamp/bundles/hash/file.js, while the Source's id is absolute. The resulting Source aggregate stored in the catalog has absolute source_path but relative bundle_path.

    Mitigating factors: @std/path's relative() and resolve() normalize against CWD, so the bundle namespace hash is identical regardless. The physical bundle file is found at the resolved location. The next cold-start loader pass overwrites the catalog entry with fully-resolved paths. No data loss or wrong behavior in realistic scenarios (CLI commands resolve repoDir to CWD if unspecified). Flagging because the inconsistency could confuse a future maintainer or break if bundleNamespace or getBundlePath stops implicitly resolving.

Low

  1. remove_extension_service.ts:139–140 — TOCTOU between Deno.stat and Deno.remove. A concurrent deletion between the two calls would cause Deno.remove to throw NotFound, which the catch at line 144 correctly handles by incrementing filesSkipped. The file count (filesDeleted) might be slightly inaccurate (stat says "will delete" but file is already gone), but the operation is functionally correct.

  2. install_extension_service.ts:169–172Promise.all orphans bundle files on partial failure. If extension A's buildExtensionFromDisk completes (writing bundle files via bundleWithCache) but extension B's fails, the entire Promise.all rejects. Extension A's bundle files remain on disk as orphans since no repository.saveAll ever runs. Harmless — the next cold-start invalidation sweep or swamp doctor cleans them up.

  3. rm.ts:181 — wrapper checks lockfile only, not catalog. The extensionRm generator short-circuits with notFound if the lockfile entry is absent or has no files field. An extension with catalog rows but no lockfile entry (e.g., from a crash between catalog tombstone-save and lockfile removal) would be reported as "not installed" despite stale catalog rows. This is documented and tracked (swamp-club#251 for swamp doctor extensions repair).

  4. remove_extension_service.ts:177pruneEmptyDirs sorts by string length as a depth proxy. sorted = [...new Set(dirs)].sort((a, b) => b.length - a.length) assumes longer paths are deeper. This holds for the expected path structure (all paths under repoDir/.swamp/pulled-extensions/...) but would fail for adversarial paths where a shallow path happens to be longer. No realistic input triggers this since all paths are derived from dirname(absolutePath) of extension files.

Verdict

PASS. This is a well-structured PR with clear ordering contracts (install: FS→lockfile→catalog; remove: catalog→lockfile→FS), documented crash-state recovery posture, and comprehensive test coverage (lifecycle services, Pin 1 regression tests, faulting stub for crash-state, DuplicateTypeUserError JSON shape parity). The findings are edge cases with limited practical impact. The architecture is sound for unblocking W3+ work.

@stack72 stack72 merged commit 5f16329 into main May 5, 2026
11 checks passed
@stack72 stack72 deleted the feat/w2-extension-lifecycle-services branch May 5, 2026 18:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant