Skip to content

feat(extensions): W1b extension catalog rearchitecture — domain aggregate + repository abstraction (swamp-club#223)#1295

Merged
stack72 merged 1 commit intomainfrom
worktree-223
May 4, 2026
Merged

feat(extensions): W1b extension catalog rearchitecture — domain aggregate + repository abstraction (swamp-club#223)#1295
stack72 merged 1 commit intomainfrom
worktree-223

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented May 4, 2026

Summary

Second of two PRs for the extension catalog rearchitecture (parent issue swamp-club#211). W1a (#1292) shipped the schema migration plus the state column. W1b finishes the rearchitecture with the domain layer (Extension aggregate + Source entity + RowState 7-tag DU + value objects), the ExtensionRepository abstraction (diff-based saves, I-Repo-1 cross-aggregate uniqueness on every commit, lockfile-backed empty-version fallback, explicit legacyStore escape hatch flagged for W4 removal), all 5 loaders migrated to (a-2) wiring with collapsed cold-start guards, callsite migration of forceCatalogRescan + DELETION, and the validation_failed column drop via SQLite recreate-table dance.

Pure plumbing. The only user-visible improvement is the silent fix to cold-start guard parity: pre-W1b only the model loader detected layout-version, datastore-base-path, and source-dirs-fingerprint changes; the other 4 detected only source-dirs-fingerprint changes. After W1b, all 5 kinds invalidate uniformly — closing the audit's "model has 3 guards, siblings have 1" gap.

This is groundwork for swamp-club#201, NOT "fixes #201." The diff-aware repository plumbing for extension rm lives here; the user-facing wiring is W2's RemoveExtensionService.

Out of scope (deferred to W2-W6 by design)

  • swamp-club#201 user-facing fix → W2 lifecycle services
  • Cross-extension DuplicateType errors at lifecycle time → W2
  • ReconcileFromDisk / freshness-as-aggregate-query → W3
  • KindAdapter loader unification → W4 (legacyStore removal grep target)
  • Per-fingerprint import URLs + subprocess test harness → W5
  • swamp doctor extensions rendering aggregate state → W6

Forward-only revert

Post-merge, downgrading to a pre-W1b binary requires deleting <repo>/.swamp/_extension_catalog.db (the loader bootstraps a fresh catalog on next run). The pre-W1b binary doesn't know how to read a catalog whose schema dropped a column it expects.

Windows

Not a W1b merge gate (parallel workstream per W1a precedent).

Architectural debt this introduces

The DDD ratchet test bumps from 26 to 30. The 4 new domain → infrastructure imports are canonicalizePath (×2) and ExtensionKind (×2). These are accepted as transitional ports — the canonicalizer should move to a shared path-utility module (W3 territory) and ExtensionKind should hoist to the domain layer when the catalog gets fully replaced (W4).

Test Plan

Automated:

  • 5,348 unit/integration tests pass / 0 fail / 28 ignored (vs 5,294/0/28 on main — net +54 = 33 domain + 15 repository + 2 drop migration + 4 supporting tests)
  • 13 load-bearing tests per plan v3 step 17:
    • round-trip save/load
    • diff-save INSERT
    • diff-save DELETE — labelled as swamp-club#201 reproducer at repository layer
    • diff-save UPDATE
    • saveAll([vN.tombstoneAll(), vN+1]) upgrade pattern
    • saveAll cross-extension reject + ROLLBACK + both source paths named
    • I-Repo-1 fires on save(ext) directly (distinct from saveAll)
    • lockfile fallback happy path with write-back
    • lockfile fallback orphan path with DELETE + warn
    • cold-start guard parity: 5 kinds × 4 triggers (20-assertion matrix)
    • drop verification — pragma_table_info, index survival, mid-dance ROLLBACK, atomicity on Deno's node:sqlite
    • drop idempotency
    • W3-corruption boundary (two pulled versions on disk → DuplicateType naming both paths)

Manual verification (compiled binary against real extensions):

  • Fresh repo + 4 extensions pulled (@swamp/aws/ec2, @swamp/aws/s3, @swamp/digitalocean, @john/k8s); EC2/S3/k8s models load cleanly
  • Pre-W1b vs post-W1b binary comparison against same scenario: DigitalOcean failures are identical on both binaries — confirmed pre-existing bug in @swamp/digitalocean@2026.05.02.1's upgrades chain, NOT a W1b regression
  • Real upgrade path: pre-W1b catalog (164 rows, validation_failed column populated) opened with post-W1b binary → drop migration runs, all 164 rows survive, all 3 indexes recreated, both markers set, EC2 model creation works on the migrated catalog
  • All 5 kinds get per-kind bundle_meta keys after first W1b boot; swamp extension source add triggers invalidation across all 5 kinds (silent fix verified empirically)

🤖 Generated with Claude Code

…gate + repository abstraction (swamp-club#223)

Second of two PRs for the extension catalog rearchitecture (parent issue
swamp-club#211). W1a (#1292) shipped the schema migration plus the `state`
column and migrated all readers/writers from `validation_failed` to `state`.
W1b finishes the rearchitecture with the domain layer, the repository
abstraction, and the `validation_failed` column drop.

Domain layer (`src/domain/extensions/`):
- `findRepoRoot` + `RepoRootNotFoundError` — lexical-only ancestor walk
- `SourceLocation`, `BundleLocation`, `SourceFingerprint` value objects
- `RowState` 7-tag discriminated union (Indexed, Bundled, BundleBuildFailed,
  ValidationFailed, EntryPointUnreadable, OrphanedBundleOnly, Tombstoned)
  with literal markdown state-machine table in module-level comment
- `Source` entity (fully immutable per ADV-7 — every transition produces a
  new instance)
- `Extension` aggregate root keyed `(name, version)` with invariants I1+I2,
  8 transitions including `tombstoneAll(): Extension` (load-bearing for the
  upgrade-as-atomic-transition pattern)

Infrastructure layer (`src/infrastructure/persistence/`):
- `ExtensionRepository` — composition over `ExtensionCatalogStore`, with
  diff-based transactional saves, lockfile-backed empty-version fallback
  for pulled rows (info-log once per row per boot, write-back makes
  subsequent boots silent), explicit `legacyStore` escape hatch with
  do-not-alias JSDoc (W4 will grep `.legacyStore` to remove)
- I-Repo-1 (cross-aggregate `(kind, typeNormalized)` uniqueness) fires on
  every commit — `save(ext)` is sugar for `saveAll([ext])` and runs the
  same check; ROLLBACK + DuplicateTypeError naming both source paths
- `invalidationGuards(kind)` returning explicit reason; `invalidateAll()`
  with best-effort error semantics
- 5 catalog support methods: `findAll`, `findByExtension`,
  `updateExtensionIdentity`, `runInTransaction`, `upsertWithIdentity`
- `BUNDLE_LAYOUT_VERSION` hoisted to shared location; per-kind
  `getDatastoreBasePath`/`setDatastoreBasePath`
- `dropValidationFailedColumn()` migration phase via SQLite recreate-table
  pattern, gated on `migration_applied:validation-failed-dropped-v1`
  marker AND pragma_table_info probe; all 3 indexes recreated explicitly

Loaders migrated to (a-2) wiring per ADV-V2-1:
- All 5 loaders (model/vault/driver/datastore/report) take `repository`
  as a long-lived constructor field; `buildIndex`, `loadSingleType`, and
  (model only) `attachPendingExtensionsForType` drop their per-call
  catalog/repository params
- Cold-start guards collapse to `repository.invalidationGuards(kind)`,
  closing the audit's "model has 3 guards, siblings have 1" coverage gap
- Model loader migrated from legacy global `source_dirs_fingerprint` to
  per-kind `"model"` key (legacy global codepath retained for one
  release-window of backward-compat per ADV-9)

CLI wiring:
- 8 loader construction sites + 2 repository constructions in `cli/mod.ts`
  and `auto_resolver_adapters.ts`; lockfile pre-read once per
  `configureExtensionLoaders`
- `open.ts:109` and `doctor_extensions.ts:105` migrated from
  `forceCatalogRescan` to a temporary `ExtensionRepository.invalidateAll()`
  with hoisted lockfilePath; standalone `forceCatalogRescan` DELETED

Schema cleanup:
- `validation_failed` column dropped from fresh schema; recreate-table
  dance for old DBs (single transaction with ROLLBACK on any failure)
- `ExtensionTypeRow.validation_failed?` field + `mapRow` reader removed

W1b is pure plumbing. The only user-visible improvement is the silent fix
to cold-start guard parity: pre-W1b, only the `model` loader detected
layout-version, datastore-base-path, and source-dirs-fingerprint changes;
the other 4 loaders only detected source-dirs-fingerprint changes. After
W1b, all 5 kinds invalidate uniformly. `swamp extension source add` now
correctly forces a rescan of vaults/drivers/datastores/reports too — not
just models.

The plumbing for `swamp extension rm` to actually clean up catalog rows
(swamp-club#201) is in place via the diff-aware repository, but the
user-facing wiring is W2's `RemoveExtensionService`. This is groundwork
for #201, NOT "fixes #201."

- swamp-club#201 — `RemoveExtensionService` (W2)
- Cross-extension `DuplicateType` errors at lifecycle time (W2)
- `ReconcileFromDisk` / freshness-as-aggregate-query (W3)
- `KindAdapter` loader unification (W4)
- Per-fingerprint import URLs + subprocess test harness (W5)
- `swamp doctor extensions` rendering aggregate state (W6)

Post-merge, downgrading to a pre-W1b binary requires deleting
`<repo>/.swamp/_extension_catalog.db`. The pre-W1b binary doesn't know
how to read a catalog whose schema dropped `validation_failed`. The
loader bootstraps a fresh catalog on next run.

Not a W1b merge gate (parallel workstream per W1a precedent).

The DDD ratchet test (`integration/ddd_layer_rules_test.ts`) bumps from
26 to 30. The 4 new domain → infrastructure imports are
`canonicalizePath` (in `bundle_location.ts`, `source_location.ts`) and
`ExtensionKind` (in `source.ts`, `extension.ts`). These are accepted
as transitional ports — the canonicalizer should move to a shared
path-utility module (W3 territory) and `ExtensionKind` should hoist
to the domain layer when the catalog gets fully replaced (W4).

Tests: 5,348 pass / 0 fail / 28 ignored (vs 5,294 / 0 / 28 on main).
Net new: +33 domain tests, +15 repository tests, +2 drop migration
tests, +4 supporting tests.

Load-bearing tests covered (per plan v3 step 17 — 13 distinct tests):
1. round-trip save/load
2. diff-save INSERT
3. diff-save DELETE — labelled as the swamp-club#201 reproducer at the
   repository layer
4. diff-save UPDATE
5. saveAll([vN.tombstoneAll(), vN+1]) upgrade pattern
6. saveAll cross-extension reject + ROLLBACK + both source paths named
7. I-Repo-1 fires on save(ext) directly (distinct from saveAll)
8. lockfile fallback happy path with write-back
9. lockfile fallback orphan path with DELETE + warn
10. cold-start guard parity over all 5 kinds × 4 triggers
11. drop verification — pragma_table_info, index survival, mid-dance
    ROLLBACK, atomicity on Deno's node:sqlite
12. drop idempotency
13. W3-corruption boundary (two pulled versions on disk → DuplicateType)

Manual verification:
- Fresh repo + 4 extensions pulled (@swamp/aws/ec2, @swamp/aws/s3,
  @swamp/digitalocean, @john/k8s); EC2/S3/k8s models load cleanly
- Pre-W1b vs post-W1b binary comparison: identical DigitalOcean errors
  (pre-existing bug in @swamp/digitalocean@2026.05.02.1's `upgrades`
  chain — NOT a W1b regression)
- Real upgrade path: pre-W1b catalog (164 rows, validation_failed
  column populated) opened with post-W1b binary → drop migration runs,
  all 164 rows survive, all 3 indexes recreated, both markers set,
  EC2 model creation works on the migrated catalog
- All 5 kinds get per-kind `bundle_meta` keys after first boot;
  `swamp extension source add` triggers invalidation across all 5
  (the silent fix verified empirically)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. ** message uses internal jargon** (): The error message starts with I-Repo-1 violation: — an internal invariant label. This error isn't wired to any user-facing path yet (W2 lifecycle services are out of scope), so it's not a current issue. Before W2 ships, consider wrapping or translating it into a UserError with plain language (e.g. "Two installed extensions claim the same type …; remove one with swamp extension rm") so users never see the raw internal label.

  2. Warn-level log messages in reference internal paths (): Messages like "Dropping orphan row at ${row.source_path}" and "Dropping orphan pulled row … lockfile has no entry for ${name}" are accurate but action-free — the user can't act on a raw source_path. Low priority since these fire at warn level (not shown by default), and the conditions are transient, but a follow-up could add a hint pointing at swamp extension rm or swamp doctor extensions.

Verdict

PASS — Pure internal plumbing. No new commands, flags, help text, or user-visible output are added. The forceCatalogRescan → repository.invalidateAll() swap in doctor_extensions.ts and open.ts is behaviour-preserving (best-effort, silent). JSON and log output modes are unaffected. No exit-code changes. Good to merge.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. DuplicateTypeError message uses internal jargon (src/infrastructure/persistence/duplicate_type_error.ts): The error message starts with I-Repo-1 violation: — an internal invariant label. This error is not wired to any user-facing path yet (W2 lifecycle services are out of scope), so it is not a current issue. Before W2 ships, consider wrapping or translating it into a UserError with plain language (e.g. "Two installed extensions claim the same type …; remove one with swamp extension rm") so users never see the raw internal label.

  2. Warn-level log messages in resolveIdentity reference internal paths (extension_repository.ts lines ~382, ~406): Messages like "Dropping orphan row at ${row.source_path}" are accurate but action-free — the user can't act on a raw source_path. Low priority since these fire at warn level (not shown by default) and the conditions are transient, but a follow-up could add a hint pointing at swamp extension rm or swamp doctor extensions.

Verdict

PASS — Pure internal plumbing. No new commands, flags, help text, or user-visible output are added. The forceCatalogRescan → repository.invalidateAll() swap in doctor_extensions.ts and open.ts is behaviour-preserving (best-effort, silent). JSON and log output modes are unaffected. No exit-code changes. Good to merge.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is a well-structured PR that introduces the Extension aggregate, Source entity, value objects (SourceLocation, BundleLocation, SourceFingerprint, RowState), and the ExtensionRepository abstraction as part of the W1b catalog rearchitecture. The DDD patterns are sound and the code quality is high.

Blocking Issues

None found.

Verification

  • License headers: All 18 new .ts files have the AGPLv3 copyright header. ✅
  • Test coverage: Every new domain file has a corresponding _test.ts file (except source_fingerprint.ts which is a pure type alias). Test counts: extension(17), source(3), row_state(3), bundle_location(4), source_location(6), find_repo_root(5), repository(15), catalog_store(36 total). ✅
  • No any types in new source files. ✅
  • Named exports only (no default exports). ✅
  • libswamp import boundary: CLI commands (mod.ts, doctor_extensions.ts, open.ts) import from libswamp/mod.ts only — no internal path violations. ✅
  • DDD ratchet: Count 27 → 31 is correct. The test counts violating files (breaks after first violation per file). 4 new files: bundle_location.ts, source_location.ts, source.ts, extension.ts. ✅
  • Domain→infrastructure violations: The 4 transitional ports (canonicalizePath ×3, ExtensionKind type ×2) are documented and planned for W3/W4 resolution. ✅
  • Path operations: Uses @std/path (dirname, join) in find_repo_root.ts. No hand-rolled path manipulation. ✅
  • No fire-and-forget promises in new code. ✅
  • LogTape tagged templates: Bare interpolation used correctly (no double-quoting). ✅

DDD Assessment

  • Extension is a proper aggregate root — immutable transitions return new instances, invariants (I1: extensionRoot match, I2: intra-aggregate type uniqueness) enforced on every construction/transition. The functional-immutable style is a valid DDD pattern.
  • Source is an entity within the aggregate (identity by SourceLocation).
  • SourceLocation, BundleLocation, SourceFingerprint are well-formed value objects (immutable, equality by value, factory functions).
  • RowState is a 7-tag discriminated union with a documented state machine — clean domain modeling.
  • ExtensionRepository correctly serves as the sole persistence gateway for the aggregate, with diff-based saves and the I-Repo-1 cross-aggregate invariant evaluated post-save.
  • The legacyStore escape hatch for transitional code is appropriately scoped with a W4 removal grep target.

Suggestions

  1. Ratchet comment precision (integration/ddd_layer_rules_test.ts:75): The comment says "added 4 new domain→infrastructure imports" but extension.ts has 2 infrastructure imports (ExtensionKind and canonicalizePath). Since the ratchet counts files (not individual imports), saying "4 new domain files with infrastructure imports" would be more precise. Minor documentation nit.

  2. loadByName full-table scan (extension_repository.ts:169): Currently reads all rows and filters in memory. The comment explains the constraint (empty-identity fallback must run per-row). Acceptable for W1b since it's not on a hot path, but a findByExtensionName(name) SQL query that returns all versions for a name could avoid materializing unrelated extensions once the empty-identity fallback is removed in W3+.

Overall this is clean, well-documented plumbing with thorough test coverage and proper DDD structure. The transitional boundaries (legacyStore, domain→infra violations) are bounded and tracked for removal.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Reviewed all 36 changed files (~4800 additions). Traced logic paths through the Extension aggregate, ExtensionRepository, ExtensionCatalogStore migration, all 5 loaders, and the CLI wiring callsites.

Critical / High

None found.

Medium

  1. ReadonlyMap<SourceLocation, Source> uses reference equality, not value equalitysrc/domain/extensions/extension.ts:100,184,193,434

    The Extension.sources map is typed ReadonlyMap<SourceLocation, Source> where SourceLocation is a plain interface. JavaScript Map uses SameValueZero (reference equality) for object keys, but the module defines a separate sourceLocationEquals function (source_location.ts:78) that compares by canonicalPath. These two equality semantics diverge.

    Breaking example: When a W2/W3 caller constructs a fresh SourceLocation via makeSourceLocation(samePath, sameRoot) and passes it to observeFreshSource (line 267: const existing = next.get(args.location)), the lookup returns undefined even though a source with the same canonicalPath exists in the map. The function takes the "add new source" branch (line 280), creating a duplicate entry under a distinct key. enforceI2 (line 290) then throws IntraExtensionDuplicateType — a confusing error when the real issue is key identity mismatch.

    W1b impact: None — observeFreshSource and the other transition methods that take external SourceLocation args are not called in W1b production paths. The repository constructs aggregates from rows and the only post-construction transition is tombstoneAll, which iterates existing keys. But W2/W3 implementers will hit this on first use.

    Suggested fix: Either (a) use a Map<string, Source> keyed by canonicalPath (the actual identity), or (b) document prominently on the Extension interface that callers MUST pass the exact Source.id reference from the map, never a freshly-constructed SourceLocation.

Low

  1. loadByName full-table scanextension_repository.ts:166-171. Calls findAll() then filters in-memory because the empty-identity fallback has side effects that require seeing all rows. Performance is O(total_rows) instead of O(matching_rows). Correctness is fine; the comment explains the rationale. Worth revisiting in W3 when ReconcileFromDisk eliminates the fallback path.

  2. PR body DDD ratchet count is stale — Body says "bumps from 26 to 30" but actual code shows 27 → 31 (integration/ddd_layer_rules_test.ts:88). The code is correct (the test counts per-file per the break at line 112 and CI passes). The body text is just out of sync — not a merge concern.

  3. Read-path side effects without transaction wrappingextension_repository.ts:379-416. resolveIdentity performs removeBySourcePath and updateExtensionIdentity writes during loadAll()/loadByName(). Each write auto-commits individually. A crash mid-backfill leaves partially-resolved rows, but the next loadAll() re-resolves idempotently. Acceptable convergence design for W1b; noted for W3 when the reconcile service takes ownership.

Verdict

PASS. The domain aggregate is properly immutable with invariants enforced at construction and every transition. The repository's diff-based save with I-Repo-1 verification inside a transaction is correct. The SQLite recreate-table migration dance follows best practices (single transaction, explicit ROLLBACK, defence-in-depth via marker + pragma probe). The cold-start guard parity fix closes a real gap. The one medium finding (Map reference-vs-value equality) is a design trap for future work phases but is unreachable in W1b production paths.

@stack72 stack72 merged commit 191e8d1 into main May 4, 2026
11 checks passed
@stack72 stack72 deleted the worktree-223 branch May 4, 2026 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant