95 changes: 92 additions & 3 deletions design/extension.md
@@ -1020,12 +1020,101 @@ no lockfile-retry-budget or SQLite-busy exhaustion at iteration end.

### W3+ inheritance

The lifecycle services' shape is W3-stable. The unified loader
(`KindAdapter`) work in W3 collapses the five per-loader
The lifecycle services' shape is W4-stable. The unified loader
(`KindAdapter`) work in W4 collapses the five per-loader
`bundleAndIndexOne` methods to one dispatch, but the install/remove/
upgrade services keep their current public surface. CLI command files'
direct service construction (in `extension_pull.ts`,
`extension_update.ts`, `extension_rm.ts`, etc.) persists past W3.
`extension_update.ts`, `extension_rm.ts`, etc.) persists past W4.

### W3: ReconcileFromDisk + freshness as aggregate query

W3 introduces `ReconcileFromDiskService`
(`src/libswamp/extensions/reconcile_from_disk_service.ts`) and rewrites
the freshness contract as a two-layer model.

**Two-layer freshness model.** The freshness contract has two distinct
concerns:

1. **Type resolution layer** (W3 makes this trivial):
`isFresh(state) = state === "Indexed"`. Constant-time aggregate
query. All other RowState tags are not visible to type resolution.

2. **State maintenance layer** (split between two paths):
- **Cold-start / explicit reconcile:** `ReconcileFromDiskService`.
Full disk walk across all three origin types (locals, pulled,
source-mounted). Post-hoc state repair. Fires when
`anyKindNeedsInvalidation()` returns true (i.e. any kind's
`isPopulated` flag is false).
- **Warm-start / hot path:** `findStaleFiles` (preserved from
pre-W3). Incremental fingerprint comparison. Fires per-loader
`buildIndex` when the catalog is already populated.
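
A minimal sketch of how the two layers fit together, assuming the shapes described above. `isFresh` mirrors the function this change adds to `bundle_freshness.ts`; `KindStatus`, the standalone `anyKindNeedsInvalidation`, and `chooseMaintenancePath` are hypothetical stand-ins for illustration (the real `anyKindNeedsInvalidation()` is a repository method that takes no arguments).

```typescript
// Layer 1: type resolution. Only `Indexed` rows are visible.
function isFresh(state: string | undefined): boolean {
  return state === "Indexed";
}

// Hypothetical per-kind status; the real catalog tracks an
// `isPopulated` flag per kind, but its exact shape is an assumption.
interface KindStatus {
  kind: string;
  isPopulated: boolean;
}

// Standalone stand-in for repository.anyKindNeedsInvalidation():
// true when any kind has not been populated yet.
function anyKindNeedsInvalidation(kinds: KindStatus[]): boolean {
  return kinds.some((k) => !k.isPopulated);
}

type MaintenancePath = "reconcile-from-disk" | "find-stale-files";

// Layer 2: state maintenance. Cold-start takes the full disk walk;
// warm-start takes the incremental fingerprint comparison.
function chooseMaintenancePath(kinds: KindStatus[]): MaintenancePath {
  return anyKindNeedsInvalidation(kinds)
    ? "reconcile-from-disk"
    : "find-stale-files";
}
```

The split keeps the type-resolution check constant-time while confining the expensive disk walk to cold-start.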

The original W3 plan targeted slimming `findStaleFiles` to a ~20 LOC
deletion-sweep shim. Ground truth showed that warm-start incremental
detection is load-bearing for the development workflow — 12 loader
tests exercise this path. `findStaleFiles` retains its fingerprint
comparison. The scope change is deliberate.

**ReconcileFromDisk semantics.** The service:

- Walks on-disk source trees for all origin types.
- Loads current aggregate state via `repository.loadAll()`.
- Diffs disk vs aggregate and emits RowState transitions using the
existing Extension aggregate methods.
- Delegates to per-loader `bundleAndIndexOne` for type extraction —
NOT `InstallExtensionService`. The source is already on disk and the
lockfile already exists; reconcile is post-hoc state repair.
- Saves via `repository.saveAll()` inside a single SQLite transaction.

**Locals vs pulled reconcile matrix:**

| Origin | Source on disk | Source in aggregate / lockfile | Transition |
|--------|---------------|--------------------|-|
| Local | present | absent | `bundleAndIndexOne` → `Indexed` |
| Local | absent | present | `markSourceMissing` → `OrphanedBundleOnly` or `Tombstoned` |
| Pulled | present | absent | `bundleAndIndexOne` → `Indexed` |
| Pulled | absent | lockfile present | `recordEntryPointUnreadable` (re-fetch is W4) |
| Pulled | absent | lockfile absent | `Tombstoned` (orphan from failed rm) |
| Source-mounted | — | — | Follows local semantics |
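
The matrix can be read as a pure decision function. The sketch below is illustrative only: `Observation`, `Action`, and `planAction` are hypothetical names, and the `"none"` result for combinations the table does not list (for example, a local source present both on disk and in the aggregate) is an assumption rather than documented behaviour.

```typescript
type Origin = "local" | "pulled" | "source-mounted";

// Hypothetical input: what one reconcile pass observed for a source.
interface Observation {
  origin: Origin;
  sourceOnDisk: boolean;
  sourceInAggregate: boolean;
  lockfilePresent: boolean; // only consulted for pulled sources
}

// Action names follow the Transition column of the matrix.
type Action =
  | "bundleAndIndexOne"
  | "markSourceMissing"
  | "recordEntryPointUnreadable"
  | "tombstone"
  | "none"; // assumption: combinations the matrix does not list

function planAction(o: Observation): Action {
  // Source-mounted follows local semantics.
  const origin = o.origin === "source-mounted" ? "local" : o.origin;

  if (o.sourceOnDisk) {
    // On disk but unknown to the aggregate: index it.
    return o.sourceInAggregate ? "none" : "bundleAndIndexOne";
  }
  if (origin === "local") {
    // Gone from disk but still tracked: mark the source missing.
    return o.sourceInAggregate ? "markSourceMissing" : "none";
  }
  // Pulled and absent from disk: the lockfile decides.
  return o.lockfilePresent ? "recordEntryPointUnreadable" : "tombstone";
}
```

Keeping the decision separate from the side effects is also what makes a dry run straightforward: the same plan can be computed with or without applying it.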

**Trigger points:** cold-start (when `anyKindNeedsInvalidation()`
returns true) + explicit `swamp doctor extensions` call. NOT on every
command — reconcile would dominate the hot-path performance.

**dryRun mode:** `execute({ dryRun: true })` collects transitions
without calling `repository.saveAll()`. Returns structured
`ReconcileTransition` records (`{ source, fromState, toState, reason }`)
that W6's `swamp doctor extensions` will render directly.

**Transition-count guardrail:** if a reconcile run would transition
more than 50% of existing rows (minimum 10 rows), the run aborts and
returns the transitions without applying them. Catches mass-tombstone
bugs.
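
One possible reading of the dry-run and guardrail rules, sketched as code. The `ReconcileTransition` fields come from the text above; everything else (`exceedsGuardrail`, `finishReconcile`, the `save` callback, the result shape) is hypothetical. The sketch also assumes that "minimum 10 rows" means the guardrail only applies once at least 10 rows exist, which the text does not state explicitly.

```typescript
interface ReconcileTransition {
  source: string;
  fromState: string;
  toState: string;
  reason: string;
}

const MAX_TRANSITION_RATIO = 0.5;
// Assumption: below this many existing rows the guardrail is skipped.
const MIN_ROWS_FOR_GUARDRAIL = 10;

function exceedsGuardrail(
  existingRows: number,
  transitions: ReconcileTransition[],
): boolean {
  if (existingRows < MIN_ROWS_FOR_GUARDRAIL) return false;
  return transitions.length / existingRows > MAX_TRANSITION_RATIO;
}

// Hypothetical result shape, for illustration only.
interface ReconcileResult {
  applied: boolean;
  aborted: boolean;
  transitions: ReconcileTransition[];
}

function finishReconcile(
  existingRows: number,
  transitions: ReconcileTransition[],
  opts: { dryRun?: boolean; save: (t: ReconcileTransition[]) => void },
): ReconcileResult {
  // Guardrail: return the transitions without applying them.
  if (exceedsGuardrail(existingRows, transitions)) {
    return { applied: false, aborted: true, transitions };
  }
  // Dry run: collect transitions but never persist.
  if (opts.dryRun) {
    return { applied: false, aborted: false, transitions };
  }
  opts.save(transitions);
  return { applied: true, aborted: false, transitions };
}
```

In both non-applying cases the caller still receives the full list of transitions, so it can be rendered or logged without touching the catalog.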

**enforceI2 transform.** W3 replaces the `IntraExtensionDuplicateType`
throw in the Extension aggregate's I2 enforcement with a
deterministic-winner + tombstone-loser transform. The Source with the
lexicographically smaller `canonicalPath` wins; the loser is tombstoned
with reason `"renamed"`. Cross-aggregate uniqueness (I-Repo-1) still
throws `DuplicateTypeError` at the repository layer.
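
A rough sketch of the deterministic-winner rule, not the aggregate's actual implementation. `SourceRow`, `Loser`, and `pickWinners` are hypothetical names, and "lexicographically smaller" is taken here to mean JavaScript's default string ordering (UTF-16 code units), which is an assumption.

```typescript
// Hypothetical minimal view of a Source inside one Extension aggregate.
interface SourceRow {
  canonicalPath: string;
  type: string;
}

interface Loser {
  source: SourceRow;
  reason: "renamed";
}

function pickWinners(
  sources: SourceRow[],
): { winners: SourceRow[]; losers: Loser[] } {
  // Sort a copy by canonicalPath so the smallest path is seen first.
  const ordered = [...sources].sort((a, b) =>
    a.canonicalPath < b.canonicalPath
      ? -1
      : a.canonicalPath > b.canonicalPath
      ? 1
      : 0
  );

  const winnerByType = new Map<string, SourceRow>();
  const losers: Loser[] = [];
  for (const s of ordered) {
    if (winnerByType.has(s.type)) {
      // A smaller path already claimed this type: tombstone this one.
      losers.push({ source: s, reason: "renamed" });
    } else {
      winnerByType.set(s.type, s);
    }
  }
  return { winners: [...winnerByType.values()], losers };
}
```

Because the outcome depends only on the paths, repeated reconcile runs over the same tree pick the same winner regardless of the order in which the disk walk discovers the files.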

**UNREADABLE_DEP_SENTINEL removal.** The sentinel constant was renamed
to `UNREADABLE_PLACEHOLDER` (internal to `computeSourceFingerprint`).
No external code compares against it. Broken transitive deps produce a
stable fingerprint; the failure surfaces at `bundleAndIndexOne` as
`BundleBuildFailed`. Existing catalog rows with the old sentinel value
are caught by the first reconcile run — no schema migration needed.

**Forward-only revert posture.** Same as W1b/W2: revert means deleting
`_extension_catalog.db` and rebuilding from disk on the next cold-start.

**Out of scope (deferred):**

- Bundle cache file eviction (W3 detects `OrphanedBundleOnly` but does
NOT delete bundle files)
- Loader unification / `KindAdapter` → W4
- `legacyStore` escape hatch removal → W4
- `swamp doctor extensions` aggregate-state rendering → W6

## Lazy Per-Bundle Loading

14 changes: 14 additions & 0 deletions src/cli/mod.ts
@@ -77,6 +77,7 @@ import "../domain/datastore/datastore_types.ts";
// Import builtin reports to trigger registration
import "../domain/reports/builtin/mod.ts";
import { EmbeddedDenoRuntime } from "../infrastructure/runtime/embedded_deno_runtime.ts";
import { ReconcileFromDiskService } from "../libswamp/mod.ts";
import {
type RepoMarkerData,
RepoMarkerRepository,
@@ -274,6 +275,19 @@ export async function configureExtensionLoaders(
repoRoot: repoDir,
});

// W3: cold-start reconcile. Runs once when any kind is not yet
// populated — repairs aggregate state from disk before the loaders
// fire. NOT on every command; only on cold-start.
if (repository.anyKindNeedsInvalidation()) {
const reconciler = new ReconcileFromDiskService({
denoRuntime,
repository,
lockfileRepository,
repoDir,
});
await reconciler.execute();
}

modelRegistry.setLoader(() =>
loadUserModels(
repoDir,
100 changes: 40 additions & 60 deletions src/domain/extensions/bundle_freshness.ts
@@ -17,12 +17,9 @@
// You should have received a copy of the GNU Affero General Public License
// along with Swamp. If not, see <https://www.gnu.org/licenses/>.

import { getLogger } from "@logtape/logtape";
import { relative, resolve } from "@std/path";
import { resolveLocalImports } from "../models/local_import_resolver.ts";

const logger = getLogger(["swamp", "extensions", "bundle-freshness"]);

/**
* The extension-catalog kinds this helper can query. Declared
* domain-local so the freshness check does not import ExtensionKind
@@ -102,15 +99,18 @@ export interface StaleFile {
}

/**
* Sentinel emitted in place of a real sha-256 hex hash when a transitive
* dep cannot be read (broken symlink, deleted file, FilesystemLoop). The
* fingerprint then encodes "this dep is currently unreadable" as part of
* the source state, so a stable broken state produces a stable
* fingerprint instead of marking the entry permanently stale (#208).
* Cannot collide with a real hash — "MISSING" contains non-hex
* characters.
* Placeholder emitted in place of a real sha-256 hex hash when a
* transitive dep cannot be read (broken symlink, deleted file,
* FilesystemLoop). Encodes "this dep is currently unreadable" into the
* fingerprint so a stable broken state produces a stable fingerprint
* and repairing the dep correctly invalidates it (#208). Cannot collide
* with a real hash — contains non-hex characters.
*
* Internal to computeSourceFingerprint. No external code compares
* against this value — ReconcileFromDisk handles broken-dep behavior
* via the BundleBuildFailed RowState transition.
*/
const UNREADABLE_DEP_SENTINEL = "MISSING";
const UNREADABLE_PLACEHOLDER = "MISSING";

/**
* Computes a content-based fingerprint covering an entry point and every
@@ -123,10 +123,8 @@ const UNREADABLE_DEP_SENTINEL = "MISSING";
* resolveLocalImports stops at the boundary dir, matching the bundler's
* own dependency scope.
*
* Unreadable deps (broken symlinks, deleted files, FilesystemLoop)
* produce an UNREADABLE_DEP_SENTINEL entry instead of throwing — so a
* stable broken state yields a stable fingerprint, and repairing the
* dep correctly invalidates it (#208).
* Unreadable deps produce a stable placeholder entry instead of
* throwing, so a stable broken state yields a stable fingerprint (#208).
*/
export async function computeSourceFingerprint(
absolutePath: string,
@@ -142,7 +140,7 @@ export async function computeSourceFingerprint(
try {
fileHash = await hashFile(file, cache);
} catch {
fileHash = UNREADABLE_DEP_SENTINEL;
fileHash = UNREADABLE_PLACEHOLDER;
}
entries.push(`${relPath}:${fileHash}`);
}
@@ -194,15 +192,6 @@ async function hashFile(
return toHex(digest);
}

async function bundleExists(bundlePath: string): Promise<boolean> {
try {
await Deno.stat(bundlePath);
return true;
} catch {
return false;
}
}

function toHex(buffer: ArrayBuffer): string {
const view = new Uint8Array(buffer);
let out = "";
@@ -242,20 +231,23 @@ export interface FindStaleFilesParams {
}

/**
* Walks all source directories, compares each file's current fingerprint
* against the catalog-stored fingerprint, and returns the files that need
* rebundling. Also removes catalog entries whose source file has been
* deleted.
*
* A file is stale when —
* 1. It is new (no catalog entry), or
* 2. Its computed fingerprint differs from the catalog's, or
* 3. Fingerprint computation fails (e.g. dep disappeared mid-scan).
* W3 freshness query: a Source is fresh iff its RowState is `Indexed`.
* All other states are not visible to type resolution. An absent state
* (`undefined`) is NOT fresh — the source needs indexing.
*/
export function isFresh(state: string | undefined): boolean {
return state === "Indexed";
}

/**
* Warm-start incremental change detection. Walks source directories,
* compares each file's current fingerprint against the catalog, and
* returns files that need rebundling. Also removes catalog entries
* whose source file has been deleted.
*
* Previously this was mtime-based. mtime is fragile — atomic-rename
* saves, rsync --times, and sub-millisecond edits can all leave the
* source mtime <= catalog mtime while the content has changed. Content
* fingerprint is strictly stronger.
* Cold-start reconciliation is handled by ReconcileFromDisk (W3).
* This function handles the warm-start path: catalog is populated,
* a few files may have changed since the last run.
*/
export async function findStaleFiles(
params: FindStaleFilesParams,
@@ -303,35 +295,14 @@ export async function findStaleFiles(
continue;
}

// Source content is unchanged, but the cached bundle may have been
// deleted out from under us (manual rm, partial GC, failed previous
// bundle attempt). Without this check the catalog row stays "fresh"
// and a downstream importBundleByPath ENOENTs (swamp-club#212).
//
// ValidationFailed rows are skipped: rebundling them is a no-op
// cycle — bundle still fails schema validation, markCatalogValidationFailed
// re-pins the same fingerprint, every command spawns deno bundle.
// That is the inverse of the loop swamp-club#209 sealed. The W1a
// migration absorbed the legacy `validation_failed` boolean into
// the `state` column; this reader migrated together with the
// writer (markCatalogValidationFailed) so the W1a→W1b window
// never has a writer/reader schism on this guard.
if (
catalogEntry.bundle_path &&
catalogEntry.state !== "ValidationFailed" &&
!(await bundleExists(catalogEntry.bundle_path))
) {
logger
.warn`Rebundling ${relativePath}: cached bundle missing from disk`;
stale.push({ absolutePath, relativePath, baseDir: dir });
}
} catch {
// Defensive backstop only. computeSourceFingerprint is total
// since #208 — unreadable transitive deps produce a sentinel
// entry rather than throwing. Anything reaching this catch is
// an unforeseen failure (Deno API change, crypto.subtle panic,
// boundary-dir stat race). Force a rebundle so the error
// surfaces to the user.
stale.push({ absolutePath, relativePath, baseDir: dir });
}
}
@@ -346,6 +317,15 @@ export async function findStaleFiles(
return stale;
}

async function bundleExists(bundlePath: string): Promise<boolean> {
try {
await Deno.stat(bundlePath);
return true;
} catch {
return false;
}
}

/**
* Minimal write-side catalog view the validation-failure helper needs.
* Same one-way domain→infrastructure boundary as FreshnessCatalog.
@@ -397,7 +377,7 @@ export interface MarkCatalogValidationFailedParams {
* migrate together so the column is genuinely vestigial during the
* W1a → W1b release window. The column itself drops in W1b.
*
* Symmetric to the UNREADABLE_DEP_SENTINEL fix in #208: that one made
* Symmetric to the unreadable-dep fix in #208: that one made
* computeSourceFingerprint total for unreadable transitive deps; this
* one makes the catalog write total for schema-invalid sources. Both
* encode "stable broken state" into the freshness contract.
20 changes: 0 additions & 20 deletions src/domain/extensions/bundle_freshness_test.ts
@@ -310,8 +310,6 @@ Deno.test("findStaleFiles: catches mtime-preserving content change (#125)", asyn
source_fingerprint: origFp,
});

// Swap content, restore the old mtime — this is exactly what
// atomic-rename saves and rsync --times do in the wild.
await Deno.writeTextFile(file, "export const a = 2;");
await Deno.utime(file, origMtime, origMtime);

@@ -721,7 +719,6 @@ Deno.test("findStaleFiles: broken transitive dep — stale once, then stable (#2
);
await Deno.writeTextFile(dep, "export const x = 1;");

// Step 1: all readable. Compute fingerprint F1, store in catalog.
const f1 = await computeSourceFingerprint(entry, dir);
const catalog = new FakeCatalog();
catalog.add({
@@ -736,9 +733,6 @@ Deno.test("findStaleFiles: broken transitive dep — stale once, then stable (#2
source_fingerprint: f1,
});

// Step 2: break the transitive dep. findStaleFiles must mark the
// entry stale on this pass — the dep change is a real fingerprint
// change and the rebundle path needs to fire to refresh the row.
await Deno.remove(dep);
await Deno.symlink("/nonexistent/path/dep.ts", dep, { type: "file" });

@@ -755,8 +749,6 @@ Deno.test("findStaleFiles: broken transitive dep — stale once, then stable (#2
);
assertEquals(firstPass[0].relativePath, "entry.ts");

// Step 3: simulate the rebundle path updating the catalog row to
// the new sentinel-bearing fingerprint F2.
const f2 = await computeSourceFingerprint(entry, dir);
assertNotEquals(f1, f2);
catalog.removeBySourcePath(entry);
@@ -772,11 +764,6 @@ Deno.test("findStaleFiles: broken transitive dep — stale once, then stable (#2
source_fingerprint: f2,
});

// Step 4: subsequent passes — the regression's load-bearing claim.
// With the row reflecting the broken state, findStaleFiles must
// NOT mark the entry stale. Pre-fix, fingerprint computation threw
// and the file was marked stale on every invocation, triggering
// bundle spawns and the 8s wall time reported in #208.
const secondPass = await findStaleFiles({
modelsDir: dir,
catalog,
@@ -935,11 +922,6 @@ Deno.test("findStaleFiles + markCatalogValidationFailed: stable broken source co
});

Deno.test("findStaleFiles + markCatalogValidationFailed: editing a broken source produces a new fingerprint and re-stales", async () => {
// Recovery path. After the broken-state row is in place, editing
// the source to ANY different content (broken or valid) produces a
// new fingerprint that does not match the stored value, so
// findStaleFiles correctly marks the file stale and the loader's
// rebundle pass fires.
const dir = await Deno.makeTempDir({ prefix: "swamp_bf_209_recover_" });
try {
const file = join(dir, "model.ts");
@@ -956,7 +938,6 @@ Deno.test("findStaleFiles + markCatalogValidationFailed: editing a broken source
sourceFingerprint: brokenFp,
});

// Stable broken — not stale.
let stale = await findStaleFiles({
modelsDir: dir,
catalog,
@@ -965,7 +946,6 @@ Deno.test("findStaleFiles + markCatalogValidationFailed: editing a broken source
});
assertEquals(stale.length, 0);

// Edit to different content (the recovery path).
await Deno.writeTextFile(file, "export const recovered = 42;\n");
stale = await findStaleFiles({
modelsDir: dir,