49 changes: 48 additions & 1 deletion docs/cli-reference.md
@@ -343,6 +343,53 @@ Marked duplicates are hidden from all search paths by default; see
`recall search --include-duplicates`. Lineage is included in
`recall export`, so the audit trail is portable.

## Repair

Explicit data/index maintenance — deliberately separate from
`recall doctor --fix`, which only repairs install-layout symlinks and never
touches data.

```bash
recall repair # Dry-run report (default — writes nothing)
recall repair --execute # Apply the planned repairs
recall repair -t messages # Scope to one table
recall repair --no-embed # Skip the embedding pass
```

What repair covers:

- **FTS5 index rebuild.** Each search index (`messages`, `decisions`,
`learnings`, `breadcrumbs`, `loa_entries`, `telos`, `documents`) is checked
against its source table — indexed row count (via the docsize shadow
table), sync-trigger presence, and the FTS5 `integrity-check` command. A
drifted index is rebuilt from its source table; a missing index or missing
sync triggers (the classic symptom: search silently returns nothing on an
un-migrated database) are recreated from the canonical schema DDL and then
rebuilt.
- **Re-embedding.** Rows expected to carry embeddings (`loa_entries`,
`decisions`, `learnings`, assistant `messages`) that have none are
re-embedded when the Ollama embedding service is available and the row has
enough source text. If the service is unavailable, repair reports the
missing embeddings and still exits successfully — unless another requested
repair failed. Partial results are never hidden: embedded, skipped
(too short), and failed counts are all reported.
- **Orphan/invariant reporting.** Named, unambiguous integrity checks —
orphaned embeddings, dedup lineage pointing at missing rows, messages
without a session, broken LoA message ranges and parent links, pending
schema migrations — are **report-only**. Repair never attempts heuristic
data mutation; pending migrations are fixed by `recall init`.
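
The FTS5 checks in the first bullet can be read as a small decision table. The sketch below is illustrative only: the `FtsProbe` shape, the `classifyFts` helper, the `none` action, and every status name other than `ok` are assumptions, not the types in `src/lib/repair.ts`. The action names `rebuild`, `create-and-rebuild`, and `report-only` follow the labels that `recall repair` prints; treating a missing source table as report-only is also an assumption.

```typescript
// Hypothetical shapes for illustration; not the types in src/lib/repair.ts.
type FtsStatus = 'ok' | 'drifted' | 'missing-index' | 'missing-triggers' | 'source-missing';
type FtsAction = 'none' | 'rebuild' | 'create-and-rebuild' | 'report-only';

interface FtsProbe {
  sourceRows: number | null; // null when the source table itself is absent
  indexExists: boolean;
  indexedRows: number;       // row count read from the docsize shadow table
  triggersPresent: boolean;
  integrityOk: boolean;      // outcome of the FTS5 integrity-check command
}

function classifyFts(probe: FtsProbe): { status: FtsStatus; action: FtsAction } {
  // Nothing to rebuild from: report the problem and leave it alone.
  if (probe.sourceRows === null) {
    return { status: 'source-missing', action: 'report-only' };
  }
  // A missing index or missing sync triggers are recreated, then rebuilt.
  if (!probe.indexExists) {
    return { status: 'missing-index', action: 'create-and-rebuild' };
  }
  if (!probe.triggersPresent) {
    return { status: 'missing-triggers', action: 'create-and-rebuild' };
  }
  // The index exists but disagrees with its source table: rebuild in place.
  if (!probe.integrityOk || probe.indexedRows !== probe.sourceRows) {
    return { status: 'drifted', action: 'rebuild' };
  }
  return { status: 'ok', action: 'none' };
}

// An index that stopped receiving writes after 40 of 100 rows.
console.log(classifyFts({
  sourceRows: 100, indexExists: true, indexedRows: 40, triggersPresent: true, integrityOk: true,
}));
```

Ordering the checks from least to most recoverable keeps each index in exactly one bucket, which is what lets the dry-run report show a single planned action per index.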

Safety model:

- **Dry-run by default.** Mutations require `--execute`. Run
`recall export --backup` before applying repairs.
- **Repair never hard-deletes rows.**
- **Repair never changes [Record Provenance](#record-provenance).** FTS
rebuild regenerates index shadow tables; re-embedding only inserts into
the `embeddings` table. No source-table column is written.
- `recall doctor` may recommend `recall repair`, but `doctor --fix` never
runs data repair implicitly.
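
The dry-run-by-default rule can be sketched as a plan that is always printed and only applied when the caller opts in. This is a minimal illustration with hypothetical names (`PlannedStep`, `runPlan`), not the code behind `recall repair`.

```typescript
// Hypothetical names throughout; this is not the recall implementation.
interface PlannedStep {
  label: string;
  apply: () => void;
}

// Returns the report lines. Steps run only when `execute` is true.
function runPlan(plannedSteps: PlannedStep[], execute: boolean = false): string[] {
  const lines: string[] = [execute ? '[EXECUTE]' : '[DRY RUN]'];
  for (const step of plannedSteps) {
    if (execute) {
      step.apply();
      lines.push(`applied: ${step.label}`);
    } else {
      lines.push(`would apply: ${step.label}`);
    }
  }
  return lines;
}

const appliedLabels: string[] = [];
const demoSteps: PlannedStep[] = [
  { label: 'rebuild messages index', apply: () => { appliedLabels.push('messages'); } },
];

console.log(runPlan(demoSteps).join('\n'));       // reports the plan, applies nothing
console.log(runPlan(demoSteps, true).join('\n')); // applies the same plan
```

Because both modes walk the same plan, the dry-run report describes the same steps that `--execute` would later run.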

## Admin

```bash
@@ -359,7 +406,7 @@ recall onboard # Interactive L0 identity interview (see

`recall init` creates the database schema if it does not exist, and applies any pending migrations. It is safe to run on an existing database.

`recall doctor` checks the database connection, schema integrity, FTS5 index health, MCP server registration, Ollama availability, and the per-platform symlinks under `~/.agents/Recall/`. Run this first when troubleshooting. Pass `--fix` to repair drift: missing symlinks are re-created; user-modified files at symlink targets are backed up under `~/.agents/Recall/backups/<TIMESTAMP>/doctor-fix/` before being replaced.
`recall doctor` checks the database connection, schema integrity, FTS5 index health, MCP server registration, Ollama availability, and the per-platform symlinks under `~/.agents/Recall/`. Run this first when troubleshooting. Pass `--fix` to repair drift: missing symlinks are re-created; user-modified files at symlink targets are backed up under `~/.agents/Recall/backups/<TIMESTAMP>/doctor-fix/` before being replaced. `--fix` only ever repairs symlinks — data and index maintenance is the explicit job of [`recall repair`](#repair), which doctor recommends when an FTS index is out of sync.
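
One way to picture why `--fix` cannot reach data repair: a check is only eligible for fixing if it carries its own fix callback, and the FTS index check carries none. The sketch below assumes a hypothetical `DoctorCheck` shape with an optional `fix` field; the real result type and field names in `src/commands/doctor.ts` may differ.

```typescript
// Hypothetical shape: the optional `fix` callback name is an assumption.
interface DoctorCheck {
  label: string;
  status: 'PASS' | 'WARN' | 'FAIL';
  message: string;
  fix?: () => void;
}

// Runs fixes only for non-passing checks that ship their own fix callback.
function applyFixes(checks: DoctorCheck[]): string[] {
  const fixedLabels: string[] = [];
  for (const check of checks) {
    if (check.status !== 'PASS' && check.fix) {
      check.fix();
      fixedLabels.push(check.label);
    }
  }
  return fixedLabels;
}

const exampleChecks: DoctorCheck[] = [
  {
    label: 'Platform symlinks',
    status: 'WARN',
    message: 'missing symlink',
    fix: () => { /* a real fix would re-create the symlink here */ },
  },
  {
    // No fix callback: the check can only recommend `recall repair`.
    label: 'FTS5 search indexes in sync',
    status: 'WARN',
    message: "run 'recall repair' to inspect",
  },
];

console.log(applyFixes(exampleChecks)); // only the symlink check is eligible
```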

`recall stats` reports row counts per table and total database size.

20 changes: 20 additions & 0 deletions docs/troubleshooting.md
@@ -119,6 +119,26 @@ symlinks resolve to readable files after linking — so a silent

Extraction hooks fire on the `Stop` event. If the hook isn't registered in `settings.json`, re-run `./install.sh`.

### "Search returns nothing, but the data is there"

`recall show` / `recall recent` find records that `recall search` never
returns. The usual cause is an FTS5 index out of sync with its source table —
classically a database created by an older version where the index or its
sync triggers were never created, so every write since has been silently
unindexed.

```bash
recall doctor # the FTS index check reports which indexes drifted
recall repair # dry-run: shows the planned rebuilds, writes nothing
recall export --backup # recommended before any repair
recall repair --execute # rebuild the indexes from the source tables
```

See [Repair in the CLI reference](cli-reference.md#repair) for the full
safety model. `recall doctor --fix` does **not** fix this — doctor's `--fix`
only repairs symlinks; data and index maintenance always goes through
`recall repair`.
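
For manual inspection, the index checks correspond to standard SQLite FTS5 statements: the `integrity-check` and `rebuild` special commands, and a row count on the `_docsize` shadow table. The helper below only builds those statements as strings. The function name and the example table name `messages_fts` are assumptions, since the actual index table names are not shown here, and `recall repair` remains the supported way to apply a rebuild.

```typescript
// Hypothetical helper: builds FTS5 maintenance SQL for a given index table.
// It does not touch a database; it only returns the statements as strings.
function ftsMaintenanceSql(ftsTable: string): {
  indexedRowCount: string;
  integrityCheck: string;
  rebuild: string;
} {
  // Table names cannot be bound as SQL parameters, so restrict them to
  // plain identifiers before interpolating.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(ftsTable)) {
    throw new Error(`Unsafe table name: ${ftsTable}`);
  }
  return {
    // One docsize row exists per indexed document (default FTS5 options).
    indexedRowCount: `SELECT count(*) AS n FROM ${ftsTable}_docsize;`,
    // Raises an error if the index disagrees with its content.
    integrityCheck: `INSERT INTO ${ftsTable}(${ftsTable}) VALUES('integrity-check');`,
    // Discards the index and repopulates it from the content table.
    rebuild: `INSERT INTO ${ftsTable}(${ftsTable}) VALUES('rebuild');`,
  };
}

// 'messages_fts' is an assumed name, used only for illustration.
console.log(ftsMaintenanceSql('messages_fts'));
```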

### "Embedding service unavailable"

Embeddings are optional. Hybrid search falls back to FTS5-only automatically.
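
A rough sketch of that fallback, assuming hybrid search merges the two ranked lists with reciprocal rank fusion (the source imports a `reciprocalRankFusion` helper, but its signature is not shown, so `fuseByRank` and `hybridSearch` here are hypothetical). The constant `k = 60` is the value commonly used with this technique and is also an assumption.

```typescript
// Hypothetical sketch; not the recall implementation.
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item.
function fuseByRank(rankings: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// When no vector results exist (service down, no embeddings), the FTS5
// ranking is returned unchanged.
function hybridSearch(ftsHits: string[], vectorHits: string[] | null): string[] {
  return vectorHits === null ? ftsHits : fuseByRank([ftsHits, vectorHits]);
}

console.log(hybridSearch(['a', 'b', 'c'], null));            // FTS5-only ordering
console.log(hybridSearch(['a', 'b', 'c'], ['b', 'c', 'a'])); // fused ordering
```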
34 changes: 34 additions & 0 deletions src/commands/doctor.ts
@@ -6,6 +6,7 @@ import { execSync, spawnSync } from 'child_process';
import { createHash } from 'crypto';
import { homedir } from 'os';
import { getDb, getDbPath } from '../db/connection.js';
import { checkAllFts } from '../lib/repair.js';
import { VERSION } from '../version.js';

export interface DoctorOptions {
@@ -138,6 +139,38 @@ function checkStructuredData(): CheckResult {
}
}

// ─────────────────────────────────────────
// Check: FTS5 index health
// ─────────────────────────────────────────
//
// Read-only by design (issue #46): doctor only RECOMMENDS `recall repair`.
// This check is exported as a plain CheckResult with no repair fn, so the
// --fix loop (symlinks only) can never run data repair implicitly.
export function checkFtsIndexes(): CheckResult {
const label = 'FTS5 search indexes in sync';

try {
const reports = checkAllFts(getDb());
const problems = reports.filter(r => r.status !== 'ok');

if (problems.length === 0) {
return { label, status: 'PASS', message: `All ${reports.length} FTS indexes in sync with source tables` };
}

const summary = problems
.map(p => `${p.ftsTable}: ${p.status} (${p.detail})`)
.join('; ');
return {
label,
status: 'WARN',
message: `${summary} — run 'recall repair' to inspect, 'recall repair --execute' to fix`,
};
} catch (err: unknown) {
const msg = err instanceof Error ? err.message : String(err);
return { label, status: 'FAIL', message: `Check failed: ${msg}` };
}
}

// ─────────────────────────────────────────
// Check 4: Extraction output files
// ─────────────────────────────────────────
@@ -601,6 +634,7 @@ export async function runDoctor(opts: DoctorOptions = {}): Promise<void> {
results.push(checkDatabase());
results.push(checkMessages());
results.push(checkStructuredData());
results.push(checkFtsIndexes());
results.push(checkExtractionFiles());
results.push(checkExtractLog());
results.push(checkTrackerLockouts());
18 changes: 5 additions & 13 deletions src/commands/embed.ts
@@ -3,6 +3,7 @@
import { getDb } from '../db/connection.js';
import { embed, embeddingToBlob, blobToEmbedding, cosineSimilarity, checkEmbeddingService, reciprocalRankFusion, EMBEDDING_MODEL } from '../lib/embeddings.js';
import { notMarkedDuplicateSql } from '../lib/dedup.js';
import { embeddingTextFor } from '../lib/repair.js';
import { search as ftsSearch } from '../lib/memory.js';

// Marked duplicates (recall dedup, issue #45) keep their embeddings but are
@@ -20,21 +21,12 @@ interface EmbedOptions {
}

/**
* Get content to embed for a given table
* Get content to embed for a given table. The per-table composition rules
* live in src/lib/repair.ts (shared with recall repair); this only maps the
* CLI's short 'loa' alias to the real table name.
*/
function getContentForTable(table: string, row: any): string {
switch (table) {
case 'loa':
return `${row.title}\n\n${row.fabric_extract}`;
case 'decisions':
return `${row.decision}\n\nReasoning: ${row.reasoning || 'N/A'}`;
case 'learnings':
return `${row.problem}\n\nSolution: ${row.solution || 'N/A'}`;
case 'messages':
return row.content;
default:
return row.content || row.text || '';
}
return embeddingTextFor(table === 'loa' ? 'loa_entries' : table, row);
}

/**
183 changes: 183 additions & 0 deletions src/commands/repair.ts
@@ -0,0 +1,183 @@
// recall repair command (issue #46).
//
// Explicit data/index maintenance: rebuild FTS5 indexes from their source
// tables (recreating missing indexes from the canonical schema DDL) and
// re-embed rows missing embeddings when the Ollama service is available.
// Orphan/invariant problems that cannot be repaired safely are report-only.
//
// Dry-run by default; --execute applies. Deliberately separate from
// `recall doctor --fix`, which only repairs install-layout symlinks and
// never runs data repair. Core logic lives in src/lib/repair.ts.

import { getDb } from '../db/connection.js';
import { checkEmbeddingService, embed } from '../lib/embeddings.js';
import {
applyEmbedRepair,
applyFtsRepair,
FTS_SOURCES,
planRepair,
type EmbedFn,
type EmbedRepairResult,
type FtsRepairResult,
type RepairPlan,
} from '../lib/repair.js';

export interface RepairOptions {
execute?: boolean;
table?: string;
/** Commander maps --no-embed to embed: false. */
embed?: boolean;
}

/**
* Injectable service/embedding clients so tests run deterministically
* offline. The CLI always uses the real Ollama-backed defaults.
*/
export interface RepairDeps {
checkService: () => Promise<{ available: boolean; model: string; url: string }>;
embedFn: EmbedFn;
}

export interface RepairRunResult {
plan: RepairPlan;
fts: FtsRepairResult | null;
embeddings: EmbedRepairResult | null;
/** Reason the embedding pass was skipped (disabled via --no-embed, or service unavailable); null otherwise, including dry runs. */
embedSkipped: string | null;
}

const DEFAULT_DEPS: RepairDeps = { checkService: checkEmbeddingService, embedFn: embed };

export async function runRepair(
options: RepairOptions = {},
deps: RepairDeps = DEFAULT_DEPS
): Promise<RepairRunResult | undefined> {
const execute = options.execute ?? false;
const embedPass = options.embed ?? true;

const target = options.table ?? 'all';
if (target !== 'all' && !FTS_SOURCES.includes(target)) {
console.error(`Invalid --table "${target}". Valid tables: ${FTS_SOURCES.join(', ')}, all.`);
process.exitCode = 1;
return undefined;
}

const db = getDb();
const plan = planRepair(db, {
table: target === 'all' ? undefined : target,
embed: embedPass,
});

console.log(execute ? '[EXECUTE — applying repairs]\n' : '[DRY RUN — no changes written]\n');
if (execute) {
console.log("Recommended: run 'recall export --backup' before applying repairs.\n");
}

// ── FTS indexes ──────────────────────────────────────────────
console.log('FTS indexes:');
for (const report of plan.fts) {
const rows = report.sourceRows !== null ? `${report.sourceRows} source row(s)` : 'source missing';
const planned =
report.action === 'rebuild' ? (execute ? 'rebuilding' : 'would rebuild')
: report.action === 'create-and-rebuild' ? (execute ? 'creating + rebuilding' : 'would create + rebuild')
: report.action === 'report-only' ? 'unrepairable here'
: 'no action';
console.log(` ${report.ftsTable}: ${report.status} (${rows}) — ${report.detail} [${planned}]`);
}

let ftsResult: FtsRepairResult | null = null;
if (execute) {
ftsResult = applyFtsRepair(db, plan);
if (ftsResult.created.length > 0) {
console.log(` Created from canonical schema: ${ftsResult.created.join(', ')}`);
}
if (ftsResult.rebuilt.length > 0) {
console.log(` Rebuilt from source tables: ${ftsResult.rebuilt.join(', ')}`);
}
for (const failure of ftsResult.failed) {
console.error(` FAILED ${failure.ftsTable}: ${failure.error}`);
process.exitCode = 1;
}
}
console.log('');

// ── Embeddings ───────────────────────────────────────────────
console.log('Embeddings:');
let embedResult: EmbedRepairResult | null = null;
let embedSkipped: string | null = null;

if (!embedPass) {
embedSkipped = 'disabled (--no-embed)';
console.log(` Skipped — ${embedSkipped}`);
} else if (plan.embedGaps.length === 0) {
console.log(' No embeddable tables in scope.');
} else {
for (const gap of plan.embedGaps) {
const shortNote = gap.tooShort > 0 ? ` (${gap.tooShort} too short to embed)` : '';
console.log(` ${gap.table}: ${gap.missing} missing${shortNote}`);
}
const embeddable = plan.embedGaps.reduce((sum, g) => sum + (g.missing - g.tooShort), 0);

if (execute && embeddable > 0) {
const service = await deps.checkService();
if (!service.available) {
// Diagnostic path: report and stay successful — an unreachable
// embedding service is an environment state, not a repair failure.
embedSkipped = `embedding service unavailable at ${service.url} (model ${service.model})`;
console.log(` Skipped re-embedding — ${embedSkipped}.`);
console.log(` ${embeddable} row(s) still missing embeddings; re-run when Ollama is up.`);
} else {
embedResult = await applyEmbedRepair(db, plan, deps.embedFn);
console.log(` Embedded ${embedResult.embedded} row(s), skipped ${embedResult.skippedTooShort} too-short row(s), ${embedResult.failed.length} failure(s).`);
for (const failure of embedResult.failed.slice(0, 5)) {
console.error(` FAILED ${failure.table}#${failure.id}: ${failure.error}`);
}
if (embedResult.failed.length > 5) {
console.error(` ...and ${embedResult.failed.length - 5} more failures`);
}
}
} else if (execute && embeddable === 0) {
console.log(' Nothing to embed.');
}
}
console.log('');

// ── Orphans / invariants (report-only) ───────────────────────
console.log('Orphans / invariants (report-only, never repaired automatically):');
if (plan.orphans.length === 0) {
console.log(' None found.');
} else {
for (const orphan of plan.orphans) {
if (orphan.error) {
console.log(` ${orphan.check}: check failed — ${orphan.error}`);
} else {
const sample = orphan.sample.length > 0 ? ` — ${orphan.sample.join(', ')}` : '';
console.log(` ${orphan.check}: ${orphan.count} (${orphan.description})${sample}`);
}
}
}
console.log('');

// ── Schema state ─────────────────────────────────────────────
if (plan.migrations.pending > 0) {
console.log(
`Schema: ${plan.migrations.pending} migration(s) pending ` +
`(version ${plan.migrations.current}, target ${plan.migrations.target}) — run 'recall init' to apply.`
);
} else {
console.log(`Schema: migrations up to date (version ${plan.migrations.current}).`);
}
console.log('');

if (!execute) {
const ftsWork = plan.fts.filter(f => f.action === 'rebuild' || f.action === 'create-and-rebuild').length;
const embedWork = plan.embedGaps.reduce((sum, g) => sum + (g.missing - g.tooShort), 0);
if (ftsWork > 0 || embedWork > 0) {
console.log("Re-run with --execute to apply repairs. Recommended: 'recall export --backup' first.");
} else {
console.log('Nothing to repair.');
}
}

return { plan, fts: ftsResult, embeddings: embedResult, embedSkipped };
}