
feat: no-index MCP chirp + ADR-044 ephemeral port + dogfood/v1.1 cleanups (1.1.0-rc3)#56

Closed
tachyon-beep wants to merge 60 commits into main from rc3

Conversation


@tachyon-beep tachyon-beep commented Jun 6, 2026

Collaborator

Supersedes #55, which GitHub auto-closed when its head branch feat/serve-no-index-chirp was renamed to rc3. Same commits, plus one new feature (§4). No history was lost.

This PR bundles related loomweave serve improvements plus a batch of dogfood-friction fixes and deferred v1.1 engineering cleanups (see commit groups). Ships as 1.1.0-rc3.

1. No-index MCP chirp (cf5f397)

serve no longer exits 1 when there is no index. It serves a degraded MCP stdio session that answers initialize and chirps "run loomweave install + loomweave analyze" from every tool call, so the MCP client connects and is told how to recover instead of staring at a server that died at startup.

2. ADR-044 — read-API ephemeral port publication (a0731d4..c7f2530)

Fixes the cross-project 127.0.0.1:9111 collision: every project's loomweave serve bound the same hardcoded port, so two could not run concurrently and consumers mis-targeted each other's instances.

  • Deterministic per-project port (loomweave_port::deterministic_port, blake3 band 9400–10399, disjoint from Filigree's 8400–9399) with an OS-ephemeral :0 fallback when the deterministic port is taken (auto only — an explicit operator bind in use stays a hard error).
  • Publishes the live port to .loomweave/ephemeral.port — a normative cross-product file contract (port-only ASCII + optional \n, atomic temp+rename, loopback-only, RAII-removed on clean shutdown/error/panic). Stale-file/crash cases are tolerated; the ADR-034 instance-ID guard is the correctness backstop.
  • HttpReadConfig.bind is now Option<SocketAddr>: None (the new default) auto-selects; Some(addr) is honored verbatim.
  • Consume-time resolver resolve_loomweave_url (precedence: explicit > published file > config > none) is the reference reader that Wardline's Python twin mirrors. The resolved URL is reported by doctor and project_status_get.
  • Installer stops pinning 9111: the YAML stub omits bind; integration_bindings writes the per-project deterministic URL and self-heals the old bind: 127.0.0.1:9111 stamp on repair.
  • ADR-044 accepted (glossary managed-clash verdict vs Filigree's .filigree/ephemeral.port). Closes clarion-7f574bc34f.
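
The port selection in the bullets above can be sketched roughly as follows. This is a minimal std-only illustration, not the code in loomweave_port: the PR says the real mapping uses blake3, and FNV-1a is swapped in here purely to avoid an external crate. The function name `bind_auto` and the hashing of the project root path string are assumptions.

```rust
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Port band quoted in the PR description: 9400..=10399 (1000 ports).
const BAND_START: u16 = 9400;
const BAND_LEN: u64 = 1000;

/// Stand-in hash (64-bit FNV-1a). The PR states the real code uses blake3;
/// FNV-1a is used here only so the sketch needs no external crate.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Map a project root to a stable port inside the band.
fn deterministic_port(project_root: &str) -> u16 {
    BAND_START + (fnv1a(project_root.as_bytes()) % BAND_LEN) as u16
}

/// Auto mode (hypothetical helper): try the deterministic port first and
/// fall back to an OS-assigned port (`:0`) when that bind fails.
fn bind_auto(project_root: &str) -> std::io::Result<TcpListener> {
    let preferred = SocketAddr::from((Ipv4Addr::LOCALHOST, deterministic_port(project_root)));
    TcpListener::bind(preferred)
        .or_else(|_| TcpListener::bind(SocketAddr::from((Ipv4Addr::LOCALHOST, 0))))
}
```

An explicit operator bind would skip `bind_auto` entirely and surface the bind error, matching the "stays a hard error" rule above.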

Wardline-side consume-time resolution remains a separate follow-up (its own repo), captured in ADR-044's Related follow-up.

3. Dogfood-friction fixes + deferred v1.1 cleanups (7ff84b2..5675f4a)

Resolves the actionable loomweave-impacting tickets from the 2026-06-06 dogfood report and several deferred v1.1 gap-register items. Each is its own commit.

  • Worktree-aware staleness (ADR-045, 032425c). Snapshot/project_status_get/loomweave://context/session-start banner gain indexed_at_commit + worktree_dirty; a new Staleness::StaleWorktree verdict fires when an otherwise-fresh index has untracked source. Detection uses a hardened, hash-free git ls-files --others scoped to ingested source extensions (false-positive guard), proven filter-RCE-safe by the ls_files_others_does_not_run_clean_filter test. Closes clarion-26c7e52027 + clarion-d9cf8bcfa9.
  • .loomweave/.gitignore now ignores instance_id + *.lock (ADR-005, 7ff84b2). git add -A no longer stages the serve fingerprint or analyze lock; ADR-005 documents the live-index commit hazard → loomweave db backup. Closes clarion-7381e6382d.
  • WAL checkpoint(TRUNCATE) after each committed run (69ebadd). On-disk .db reflects committed state while serve is alive. Closes clarion-cdee445ed8.
  • Drop dead entity_fts.content_text (migration 0009, 8be269d). Never populated, never read (content search is ADR-040 embeddings); CURRENT_SCHEMA_VERSION → 9. Closes clarion-716449c371.
  • release.yml macOS aarch64 verify gate (3c8feae). Mirrors ci.yml, wired into build/publish needs. Closes clarion-47d395e03c.
  • macOS Gatekeeper doc (5da9ccd) + ADR-024 marker backfilled to v1.0.0 (b598ebf) activating the migration-retirement guard. Close clarion-03dfa1f94d + clarion-b20448b3ac.

Verification

Full CI floor green: fmt / clippy -D warnings / build / nextest 1227 passed / doc -D warnings / cargo deny; Python ruff / ruff format / mypy --strict / pytest 166 passed (85%); version-lockstep + ADR-024 migration-retirement guards pass at 1.1.0-rc3; wardline scan --fail-on ERROR clean on changed crates.

🤖 Generated with Claude Code

4. Agent-orientation block injection (0a93731)

loomweave install now pushes a managed Loomweave block into the always-loaded CLAUDE.md / AGENTS.md context (mirroring Filigree's instruction injection) so an agent learns to ask Loomweave's MCP tools before re-grepping the tree; loomweave doctor verifies it and --fix repairs it via the same idempotent installer.

  • New instructions.rs manages a <!-- loomweave:instructions --> … <!-- /loomweave:instructions --> span and edits only its own bytes: whole-line marker scanning, byte-offset splice with verbatim tail preservation, never truncates to EOF. A co-resident Filigree/Wardline block in the same file survives every create/append/replace/malformed operation (the headline coexistence guarantee, regression-tested incl. the two-dangling-marker data-loss case caught in review).
  • Drift signal is a body content hash, not the marker version (a version bump on identical content is not drift). Malformed-repair strips all orphan start markers → safe + converges in one pass.
  • install: --instructions flag + InstallPlan plumbing; bare install does it. doctor: Missing=warning (optional surface), Drifted/Malformed=problem; wired into text + JSON paths with a next-action.
  • Thin embedded asset (a pointer to the MCP tools + the loomweave-workflow skill).

Full CI floor re-verified green with this feature: fmt / clippy -D warnings / build / nextest 1227 passed / doc -D warnings / cargo deny.

tachyon-beep and others added 27 commits June 6, 2026 05:53
…sing DB

When `.loomweave/loomweave.db` is absent, `loomweave serve` did a hard
`ensure!(db_path.exists())` and exited 1 before the MCP protocol started.
An MCP stdio client (Claude Code) just saw the server die at startup with
the real reason buried in stderr — it read as "loomweave mcp failing"
with no actionable signal.

Now `serve` starts a degraded "no-index" stdio session instead:
- `initialize` succeeds so the client connects cleanly; the `instructions`
  field leads with the run-`install`+`analyze` chirp (mirrors the
  SessionStart hook wording).
- Every `tools/call` returns the same chirp as a tool result with
  `isError: true` — the load-bearing channel, since not every client
  surfaces `initialize.instructions`.
- `tools/list` and the static `loomweave-workflow` prompt still answer so
  the surface looks healthy.
- No HTTP read API bind, no LLM/embedding providers, no Filigree client,
  no ReaderPool — nothing to back them without a DB. One warn line to
  stderr (never stdout) at degraded startup.

loomweave-mcp gains `handle_json_rpc_no_index` + `serve_stdio_no_index`
plus chirp helpers; serve.rs swaps the exit for a `serve_no_index` branch.
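
A rough sketch of the degraded responses described above, assuming a simplified JSON-RPC shape. The function name `no_index_response`, the chirp wording, and the hand-built JSON are illustrative only; the real `handle_json_rpc_no_index` is not shown in this PR text, and a real `initialize` result carries more fields than `instructions`.

```rust
/// Recovery hint (wording paraphrased from the commit message, not the real string).
const CHIRP: &str = "No Loomweave index found: run loomweave install + loomweave analyze";

/// Build a JSON-RPC 2.0 response for a session that has no index.
/// - `initialize` succeeds and carries the chirp in `instructions`.
/// - `tools/call` always returns the chirp as a tool result with `isError: true`.
/// - Other methods (tools/list, prompts) are elided here and get an empty result.
fn no_index_response(id: u64, method: &str) -> String {
    let result = match method {
        "initialize" => format!(r#"{{"instructions":"{CHIRP}"}}"#),
        "tools/call" => format!(
            r#"{{"content":[{{"type":"text","text":"{CHIRP}"}}],"isError":true}}"#
        ),
        _ => "{}".to_string(),
    };
    format!(r#"{{"jsonrpc":"2.0","id":{id},"result":{result}}}"#)
}
```

Putting the chirp on every `tools/call` result is the part the commit calls load-bearing, since a client may never display the `initialize` instructions.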

Closes clarion-ac36f51c2b.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…oling

Read-API port deconfliction (clarion-7f574bc34f):
- ADR-044 (Proposed): serve publishes .loomweave/ephemeral.port with a
  per-project deterministic port + ephemeral fallback and a loomweave-side
  resolver (twin of filigree_url), so concurrent projects stop colliding on
  the hardcoded 9111; installer stops pinning the port. Indexed in the ADR
  README.
- Stopgap so this project coexists with others on 9111 until the ADR lands:
  loomweave.yaml serve.http.bind -> 127.0.0.1:9112 and wardline.yaml
  loomweave.url -> :9112.

Wardline tooling:
- .mcp.json: drop the hardcoded --loomweave-url/--filigree-url from the
  wardline MCP args (resolved from wardline.yaml instead); normalize server
  entries.
- .pre-commit-config.yaml: add a local wardline-scan hook.
- .agents/skills/wardline-gate: add the wardline-gate skill pack.

- .gitignore: ignore the raw wardline scan output (findings.jsonl).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… contract

Incorporates the Wardline (consumer-side) review. The interop surface is the
file, not loomweave's Rust resolver — Wardline implements its own Python reader
against it (SEI-style "consumers conform"). Pins, as normative:

- File contract: <project>/.loomweave/ephemeral.port, plain-ASCII port only,
  optional trailing newline, host/scheme implied, atomic temp+rename, created on
  loopback bind / removed on clean shutdown.
- Loopback-only publication: a non-loopback bind (allow_non_loopback, ADR-034)
  publishes no file, so the port-only format never under-specifies the host.
- Resolution precedence (consume-time, per read): explicit flag/env > published
  file > configured url > none. The file self-heals stale/default config but
  never overrides a deliberate explicit target.
- Fail-soft: validate 1..=65535; malformed or resolved-but-refused (stale file /
  crashed serve) degrades, never errors. Instance-ID guard (ADR-034) is the
  correctness backstop so the reader can be simple.
- Related follow-up: consume-time resolution applies to both sibling legs;
  Wardline's filigree leg (install-time today) should unify Wardline-side.
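
The file contract and precedence rules above could be read by a consumer along these lines. This is a sketch under stated assumptions, not loomweave's `resolve_loomweave_url`: the names `parse_port_file`, `resolve_url`, and `Source` are hypothetical, and the `http://127.0.0.1` prefix is assumed because the contract only says host and scheme are implied.

```rust
/// Where a resolved URL came from, in precedence order.
#[derive(Debug, PartialEq)]
enum Source {
    Explicit,
    PublishedFile,
    Config,
}

/// Parse the port file contents: ASCII digits plus an optional trailing
/// newline, valid range 1..=65535. Anything else yields None (fail-soft).
fn parse_port_file(contents: &str) -> Option<u16> {
    let digits = contents.strip_suffix('\n').unwrap_or(contents);
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse::<u16>().ok().filter(|port| *port != 0)
}

/// Resolve per read: explicit > published file > configured url > none.
/// A malformed port file is skipped rather than reported as an error.
fn resolve_url(
    explicit: Option<&str>,
    port_file: Option<&str>,
    config: Option<&str>,
) -> Option<(String, Source)> {
    if let Some(url) = explicit {
        return Some((url.to_string(), Source::Explicit));
    }
    if let Some(port) = port_file.and_then(parse_port_file) {
        // Assumed scheme/host: the contract leaves them implied (loopback only).
        return Some((format!("http://127.0.0.1:{port}"), Source::PublishedFile));
    }
    config.map(|url| (url.to_string(), Source::Config))
}
```

A resolved-but-refused connection (stale file after a crash) is outside this sketch; per the contract it degrades at connect time, with the instance-ID guard as the backstop.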

Tracking: clarion-7f574bc34f

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ic publish (ADR-044)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eans temp on rename failure

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s per-project port (ADR-044)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…; drop misleading # Panics docs (ADR-044)

Replace the two .expect() unwraps in validate_loopback_trust and
validate_auth_trust with compiler-enforced pattern matching, and delete
the # Panics doc sections (a # Panics heading documenting when a method
will *not* panic inverts the rustdoc convention). Behavior is identical.

Also add a field doc comment to HttpReadConfig.bind and two tests:
the auth-trust None path and explicit YAML-null bind parsing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…sh .loomweave/ephemeral.port (ADR-044)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t is taken (ADR-044)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…mweave URL, no fixed bind (ADR-044)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… on repair (ADR-044)

Older `install --all` runs unconditionally stamped a fixed bind. Task 5
stopped writing it but left existing stamps in place, so re-install kept
`bind: 9111`, serve honored it verbatim (no auto-port), and the collision
returned invisibly (loomweave_yaml_ok no longer inspected bind). Strip
exactly the old auto-default literal on repair and treat its presence as
not-ok so doctor/binding_state flags and fixes it. Operator-chosen binds
(any other value) are preserved.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ead-API port (ADR-044)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…weave_url (ADR-044)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…112 stopgap

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…read-API report, reconcile ADR precedence

- install.rs GITIGNORE_CONTENTS now ignores .loomweave/ephemeral.port so
  freshly-installed projects do not show the runtime port file as untracked
  while serving; install test asserts the new rule.
- project_status_get reports loomweave_read_api (resolved_url +
  resolution_source) via a query-time resolve_loomweave_url(None, project_root),
  the second in-repo consumer named by ADR-044 alongside doctor. Additive field;
  existing project_status tests unaffected. Two new tests cover published-port
  and no-file ("none") cases.
- ADR-044: clarify precedence level 1 is an operator's deliberately-supplied
  target (typed flag/env), while an installer-seeded --loomweave-url in .mcp.json
  is config-tier (precedence 3) so the published file self-heals it; added a
  Related follow-up bullet for Wardline (clarion-7f574bc34f).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Workspace + Python plugin in lockstep. Cross-ecosystem version normalization in
check-workspace-version-lockstep.py (SemVer prerelease 1.1.0-rc1 == PEP 440
1.1.0rc1). CHANGELOG: ADR-044 ephemeral-port deconfliction + no-index MCP chirp.
No package published for release candidates.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…mit hazard (ADR-005)

The shipped .loomweave/.gitignore (ADR-005) excluded WAL/shadow/logs but not
the per-project `instance_id` fingerprint or the analyze advisory lock
(`loomweave.lock`, fs2), so `git add -A` staged live runtime state into demo
repos. Add `instance_id` and `*.lock` to GITIGNORE_CONTENTS and refresh ADR-005's
verbatim block + Excluded list (also reconciling ephemeral.port/embeddings.db).
The install test now asserts both rules ship.

ADR-005 also gains a "Committing a live index" note: the on-disk loomweave.db
lags its pending WAL while serve runs, so commit a consistent copy via
`loomweave db backup` (or stop serve) rather than git-add-ing the live file.

Closes clarion-7381e6382d. Refs clarion-cdee445ed8.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ace db backup

After a successful CommitRun the writer-actor now runs
`PRAGMA wal_checkpoint(TRUNCATE)` so the on-disk loomweave.db reflects committed
state while the writer is still alive — previously the WAL only truncated on
last-connection close, leaving a multi-MB pending sidecar that made the .db an
unreliable point-in-time artifact for commit. The checkpoint is best-effort:
failure logs a warning and leaves committed frames durable.

`loomweave analyze --help` now points at `loomweave db backup` for committing the
index as a versioned artifact (the verb already exists; this is discoverability).

Closes clarion-cdee445ed8.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ee verdict (ADR-045)

project_status_get reported staleness:"fresh" while the working tree held
un-indexed source, so the session-start banner ("index is fresh, ask Loomweave")
lied about uncommitted code. Make staleness worktree-aware:

- Snapshot gains indexed_at_commit + worktree_dirty; a new Staleness::StaleWorktree
  verdict fires when an otherwise-fresh index has untracked source on disk.
- Detection uses loomweave_core::list_untracked_files — hardened, hash-free
  `git ls-files --others --exclude-standard`, scoped to ingested source extensions
  so a scratch notes.txt does not flag (false-positive guard). Fail-soft outside
  a git work tree.
- Surfaced on loomweave://context, project_status_get (worktree_dirty +
  staleness_note), and the session-start banner with a concrete re-analyze remedy;
  orientation treats StaleWorktree as stale.
- ADR-045 records the maintainer-authorized security boundary: `git status` is
  forbidden (filter.clean RCE on hashed content; clarion-4b5a8aff54), but
  ls-files --others is hash-free — proven by the new
  ls_files_others_does_not_run_clean_filter security test, not reasoning alone.
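
The detection rule above can be sketched as a plain `git ls-files --others --exclude-standard` call followed by an extension filter. This is illustrative only: it does not reproduce the hardening that `loomweave_core::list_untracked_files` applies, and the names `has_untracked_source` and `list_untracked` and the extension list are assumptions.

```rust
use std::path::Path;
use std::process::Command;

/// True when any untracked path has one of the ingested source extensions.
/// A scratch `notes.txt` does not count (the false-positive guard).
fn has_untracked_source(untracked: &[&str], ingested_exts: &[&str]) -> bool {
    untracked.iter().any(|path| {
        Path::new(path)
            .extension()
            .and_then(|ext| ext.to_str())
            .map_or(false, |ext| ingested_exts.contains(&ext))
    })
}

/// List untracked, non-ignored files. Fail-soft: a missing git binary, a
/// non-repo directory, or a non-zero exit all yield an empty list.
fn list_untracked(project_root: &Path) -> Vec<String> {
    Command::new("git")
        .args(["ls-files", "--others", "--exclude-standard"])
        .current_dir(project_root)
        .output()
        .ok()
        .filter(|out| out.status.success())
        .map(|out| {
            let text = String::from_utf8_lossy(&out.stdout);
            text.lines().map(str::to_owned).collect::<Vec<String>>()
        })
        .unwrap_or_default()
}
```

With ingested extensions `["py", "rs"]`, an untracked `src/new_module.py` would flip the verdict to StaleWorktree while an untracked `notes.txt` would not.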

Closes clarion-26c7e52027 and clarion-d9cf8bcfa9.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-04)

Release archives are unsigned (ADR-033), so macOS Gatekeeper blocks the
downloaded loomweave binary on first launch. Add a Troubleshooting entry with
the `xattr -d com.apple.quarantine` fix and the GUI "Open Anyway" alternative.

Closes clarion-03dfa1f94d.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v1.0.0 was the first externally-published build; 0001_initial_schema.sql is
byte-identical at v1.0.0 and HEAD, and all schema changes since are additive
0002+ migrations. Backfilling the marker activates
scripts/check-migration-retirement.py's guard (previously pre-trigger despite
shipped releases): in-place edits to 0001 now fail CI, enforcing additive-only.

Closes clarion-b20448b3ac.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…47d395e03c)

release.yml's `verify` job was Linux-only, so a macOS-only clippy/--all-targets
regression — caught on PRs by ci.yml's rust-macos job but not re-verified at
release — could pass `verify` and proceed to the build/publish jobs (the
aarch64 build leg only builds --bins, not tests/all-targets). Add a `verify-macos`
job mirroring ci.yml's rust-macos (clippy + bin build on macos-14) and add it to
the needs chain of build-rust, build-wheels, and build-plugin. No new runner
dependency — build-rust already uses macos-14.

Closes clarion-47d395e03c.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…9, V11-STO-06)

content_text shipped in 0001 reserved for an on-demand source-text projection
that was never built: the entities_ai trigger always wrote '', the entities_au
trigger never touched it, and no query reads it (search MATCHes the table, not
the column). It was permanently-empty schema drift; content search is served by
the ADR-040 embeddings sidecar. FTS5 has no ALTER DROP COLUMN, so migration 0009
recreates entity_fts and its triggers without it and rebuilds the index from
entities. Behaviour-preserving — only a never-populated, never-read column goes.

Bumps CURRENT_SCHEMA_VERSION to 9; updates the schema_migrations expectation
tests and the authoritative detailed-design.md FTS block; adds a regression test
asserting content_text is gone and MATCH search still works.

Closes clarion-716449c371.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Workspace + Python plugin in lockstep. rc2 rolls up the dogfood-friction fixes
and deferred v1.1 engineering items landed on this branch: worktree-aware
staleness (ADR-045), .gitignore instance_id/*.lock (ADR-005), WAL
checkpoint(TRUNCATE), entity_fts.content_text drop (migration 0009), the macOS
aarch64 release verify gate, the Gatekeeper doc, and the ADR-024 marker backfill.
No package published for release candidates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nstall + doctor)

`loomweave install` now pushes a managed Loomweave block into the always-loaded
CLAUDE.md / AGENTS.md context, mirroring Filigree's instruction injection, so an
agent learns to ask Loomweave's MCP tools before re-grepping the tree.
`loomweave doctor` verifies it and, with --fix, repairs it via the same
idempotent installer.

New `instructions.rs` manages a `<!-- loomweave:instructions -->`…
`<!-- /loomweave:instructions -->` span and edits ONLY its own bytes — it never
truncates to EOF, so a co-resident Filigree/Wardline block in the same file
survives every create/append/replace/malformed operation. Drift is a body
content hash (not the marker version, so a version bump on identical content is
not drift), and the malformed-repair strips all orphan start markers so it stays
safe and converges in a single pass.
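
The "edits only its own bytes" behavior could look roughly like this. It is a simplified sketch, not the contents of `instructions.rs`: it covers only the create, append, and replace paths (no malformed or dangling-marker repair, no drift hash), and `find_span` and `upsert_block` are hypothetical names.

```rust
const START: &str = "<!-- loomweave:instructions -->";
const END: &str = "<!-- /loomweave:instructions -->";

/// Byte range of the managed span: from the start-marker line through the
/// end-marker line (including its newline). Markers only match as whole lines.
fn find_span(doc: &str) -> Option<(usize, usize)> {
    let mut offset = 0;
    let mut start = None;
    for line in doc.split_inclusive('\n') {
        let trimmed = line.trim_end_matches(|c| c == '\r' || c == '\n');
        if trimmed == START && start.is_none() {
            start = Some(offset);
        } else if trimmed == END {
            if let Some(s) = start {
                return Some((s, offset + line.len()));
            }
        }
        offset += line.len();
    }
    None
}

/// Replace the managed span in place, or append one if none exists.
/// Everything outside the span is copied verbatim, so a co-resident block
/// owned by another tool is never touched and the tail is never truncated.
fn upsert_block(doc: &str, body: &str) -> String {
    let block = format!("{START}\n{body}\n{END}\n");
    match find_span(doc) {
        Some((s, e)) => format!("{}{}{}", &doc[..s], block, &doc[e..]),
        None if doc.is_empty() || doc.ends_with('\n') => format!("{doc}{block}"),
        None => format!("{doc}\n{block}"),
    }
}
```

Running `upsert_block` twice with the same body returns the same text, which is the idempotence that lets doctor --fix reuse the installer path.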

- install: --instructions flag + InstallPlan plumbing; bare `install` does it.
- doctor: Missing=warning (optional surface), Drifted/Malformed=problem; wired
  into both the text and JSON report paths with a next-action remediation.
- thin embedded asset: a pointer to the MCP tools + the loomweave-workflow skill.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0a93731f72


Comment on lines 269 to +270
"--loomweave-url",
LOOMWEAVE_HTTP_URL,
desired.loomweave_url,


P2 Badge Stop pinning Wardline to the deterministic port

When install_wardline_mcp emits --loomweave-url with the deterministic fallback URL, that CLI argument overrides the consumer-side config/published-port discovery path. In the scenario this commit adds support for—auto bind collides and http_read falls back to an OS-assigned port while publishing .loomweave/ephemeral.port—the installed Wardline MCP process will still call the stale deterministic port from this arg instead of the published live port, so Wardline scans cannot reach the running Loomweave API. The MCP entry should omit this override (or otherwise allow Wardline to resolve .loomweave/ephemeral.port) rather than passing the static URL as a flag.


Comment on lines +67 to +68
fn drop(&mut self) {
loomweave_federation::loomweave_port::remove_published_port(&self.project_root);


P2 Badge Preserve a newer published port on shutdown

If two loomweave serve processes are started for the same project, the second one now succeeds via the auto-port fallback and overwrites .loomweave/ephemeral.port with its actual ephemeral port. When the first process later exits, this unconditional removal deletes the second process's still-live discovery file, so consumers can no longer find the running read API. The guard needs to remove the file only if it still contains the port this server published, or same-project concurrent serves need to be rejected.
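
One way to express the first suggested fix (remove the file only if it still names this server's port) is sketched below. The helper name `remove_published_port_if_owned` is hypothetical and is not the existing `remove_published_port`; the sketch also leaves a small window between the read and the remove in which another serve could republish.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Remove the published port file only if it still names `our_port`.
/// Returns Ok(true) when the file was removed and Ok(false) when it was left
/// alone (missing, not parseable as a port, or rewritten by a newer serve).
fn remove_published_port_if_owned(port_file: &Path, our_port: u16) -> io::Result<bool> {
    let contents = match fs::read_to_string(port_file) {
        Ok(text) => text,
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(false),
        Err(err) => return Err(err),
    };
    // Someone else's port (or garbage): not ours to delete.
    if contents.trim_end().parse::<u16>() != Ok(our_port) {
        return Ok(false);
    }
    fs::remove_file(port_file)?;
    Ok(true)
}
```

A Drop guard that remembered the port it published could call this instead of removing the file unconditionally.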


Comment on lines +90 to +92
# Read-API live port discovery file (ADR-044): present only while serve runs,
# rewritten per bind, loopback-only — a runtime artifact, never committed.
ephemeral.port


P2 Badge Update existing .loomweave ignores during install

These new runtime-ignore rules are only written when .loomweave/ is first created; for an existing 1.0 checkout, initialise_project returns early on an existing .loomweave/ and never refreshes .loomweave/.gitignore. After upgrading and running serve, .loomweave/ephemeral.port and instance_id can therefore show up in git add -A, exactly the runtime artifacts this change says must never be committed. The installer/doctor should merge these rules into existing .loomweave/.gitignore files as part of the upgrade path.


… retire PYRIGHT_MAX_NPROC

RLIMIT_NPROC is enforced by the kernel against every process/thread owned by the
real UID system-wide, not against a plugin's descendant tree. Any fixed ceiling
low enough to stop a fork-bomb is also low enough to fail a legitimate fork(2)
with EAGAIN once the operator's unrelated processes (an interactive session,
other Weft daemons) push the per-UID count past it — which intermittently broke
pyright-langserver on busy workstations.

`effective_max_nproc` now returns `None` (no RLIMIT_NPROC cap) for plugins
declaring the `pyright` runtime capability and `Some(DEFAULT_MAX_NPROC)`
otherwise; `apply_prlimit_nofile_nproc` threads the `Option` through and skips
the RLIMIT_NPROC setrlimit when `None`. Language-server plugins rely on
RLIMIT_AS + crash-loop supervision instead; cgroup v2 `pids.max` is the
documented path for a true per-plugin process ceiling.
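
The `Option` threading described above might be shaped like this. The sketch only models the decision; it performs no setrlimit call, the `DEFAULT_MAX_NPROC` value is a placeholder (the commit does not state it), and `planned_limits` is a hypothetical stand-in for what `apply_prlimit_nofile_nproc` would apply.

```rust
/// Placeholder value: the real DEFAULT_MAX_NPROC is not given in the commit.
const DEFAULT_MAX_NPROC: u64 = 256;

/// None means "apply no RLIMIT_NPROC cap"; plugins declaring the `pyright`
/// runtime capability get None, everything else gets the default ceiling.
fn effective_max_nproc(capabilities: &[&str]) -> Option<u64> {
    if capabilities.contains(&"pyright") {
        None
    } else {
        Some(DEFAULT_MAX_NPROC)
    }
}

/// Hypothetical planner: lists the limits that would be applied.
/// RLIMIT_NPROC is skipped entirely when `max_nproc` is None.
fn planned_limits(max_nofile: u64, max_nproc: Option<u64>) -> Vec<(&'static str, u64)> {
    let mut limits = vec![("RLIMIT_NOFILE", max_nofile)];
    if let Some(n) = max_nproc {
        limits.push(("RLIMIT_NPROC", n));
    }
    limits
}
```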

- ADR-021: process-count control limitation under Alternative 4 + residual-risk note.
- ADR-035: retire PYRIGHT_MAX_NPROC from the tuning-constant inventory.
- plugin.toml: document why the pyright runtime sub-table disables the NPROC cap.
- pyright_session.py + test: language-server session adjustments.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector


You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

tachyon-beep and others added 2 commits June 8, 2026 00:16
…ft-d822a7de2d)

The index DB mutated on every analyze/scan while being git-tracked, leaving a
permanently dirty working tree that blocked legis from signing the project (C1,
dogfood-#2). loomweave.db is a regenerable orientation cache: `loomweave analyze`
rebuilds the structural graph with no LLM calls, so only the lazy summary cache
carries cost, and that is acceptably machine-local.

- install.rs GITIGNORE_CONTENTS now excludes loomweave.db (template = source of truth)
- ADR-005 reversed honestly: Status/Summary/Decision/Tracked/Excluded/Alternatives/
  Consequences/ADR-014-ref + the commit_db knob inverts to an opt-IN
- detailed-design §3 gets a C1 reversal note so it isn't read as current
- top-level .gitignore comment de-staled (it already ignored the db)

Lacuna's already-tracked db retrofit is the separate cross-repo weft-3c9bae6a40.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eft-a2f4cf95c7)

C-9's last hub-level deliverable: the cross-member weft.toml schema. Pins the
single well-known home for a shared fact ([<member>] table + cross-read allowlist,
no per-member duplication), the sibling-endpoint precedence ladder (flag > env >
weft.toml [X].url > on-disk discovery > default), the malformed=absent /
operator-sole-writer / no-duplication invariants for the shared layer, and the
reader extension to loomweave-core::store. Designed around wardline as the
multi-sibling consumer. Open questions deferred to the hub bless.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tachyon-beep and others added 2 commits June 8, 2026 00:29
…8e3d02f409)

Weft C-2 WAL hygiene: expose an on-demand verb that issues
`PRAGMA wal_checkpoint(TRUNCATE)` on the working store, flushing outstanding WAL
frames into loomweave.db and resetting the -wal sidecar to zero so the on-disk
file is a clean point-in-time artifact for backup/demo/snapshot.

The analyze path already TRUNCATE-checkpoints at each committed run boundary
(loomweave-storage writer.rs); this is the companion for the serve summary-write
path, where the WAL grows between PASSIVE wal_autocheckpoint hits. Best-effort:
busy (a live serve reader) reports a partial outcome rather than failing —
committed frames are already durable. Two integration tests (truncates-to-0 +
data-survives; missing-db rejected).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…(L1, ADR-047)

Findings accumulated on every re-analyze of an unchanged tree (dogfood-2
255→259→263) because the finding id embedded run_id (core:finding:{run_id}:…),
so ON CONFLICT(id) only de-duped a --resume re-walk; a fresh run got a new
run_id and minted a duplicate row — also orphaning the prior row's
filigree_issue_id / suppression status every run.

Drop run_id from the id (core:finding:<discriminator>); every discriminator was
already content-derived (entity/guidance/subsystem id, or a blake3 of
entity+rule+evidence). The upsert now de-dupes across fresh runs AND preserves
lifecycle; the run_id COLUMN still tracks last-seen, so findings_for_emit
(WHERE run_id = current) is unchanged. Loomweave's finding id is never on the
wire, so filigree dedup is untouched.
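
The effect of dropping run_id from the id can be illustrated with an in-memory map standing in for the findings table. This is a sketch only: the real store is SQLite with ON CONFLICT(id), and the `Finding` struct, `upsert` helper, and discriminator string are assumptions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Finding {
    run_id: u32,    // last run that reproduced the finding
    status: String, // lifecycle state, preserved across runs
}

/// Content-keyed id: no run_id component, so a fresh run maps to the same row.
fn finding_id(discriminator: &str) -> String {
    format!("core:finding:{discriminator}")
}

/// Upsert keyed on the id: a re-seen finding only refreshes its run_id,
/// leaving lifecycle fields such as `status` untouched.
fn upsert(store: &mut HashMap<String, Finding>, discriminator: &str, run_id: u32) {
    store
        .entry(finding_id(discriminator))
        .and_modify(|f| f.run_id = run_id)
        .or_insert(Finding { run_id, status: "open".to_string() });
}
```

With the old run-scoped id, each of three analyses of an unchanged tree would have produced a new key, which is the 255→259→263 growth the commit describes.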

- all 12 analyze.rs minting sites + secret_scan/findings.rs → content-keyed
- migration 0010 clears legacy run-scoped rows (regenerable; store is a cache
  per ADR-005/C1)
- ADR-047 documents the decision + accepted trade-off (findings are
  current-state, not a per-run append-log)
- regression test: analyze an unchanged tree 3× → finding count stable
- write_finding_row + minting-site comments de-staled; test fixtures updated

Filed as clarion-772ff358da (Part A); Part B (project-wide finding browser /
has_findings filter) remains. Gates: fmt + clippy + 1263 workspace tests +
rustdoc + migration-retirement all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tachyon-beep and others added 4 commits June 8, 2026 08:47
…t worktree_dirty note (L1, N5)

L1 Part B (clarion-772ff358da) — close the count-without-list gap:
- New `project_finding_list` MCP tool returns EVERY finding across the
  project with NO entity id required; each row carries its anchoring
  entity { id, sei, file, line } + tool/rule_id/kind/severity/status.
  page.total is counted off the bare `findings` table (byte-identical to
  the snapshot's finding_count query), so an unfiltered list reconciles
  with project_status_get's finding count by construction; the entity
  JOIN only enriches the page rows. Honest-empty (0 findings -> []).
- `entity_wardline_list` (find_by_wardline) accepts `has_findings: true`
  to page only the taint-fact entities that ALSO carry a finding, instead
  of every blob. Declared in a new `wardline_facet_schema()` so the
  additionalProperties:false schema actually permits the param.

N5 — make `worktree_dirty` honest at the consumer surface:
- project_status_get now emits `worktree_dirty_note` on EVERY path
  (true/false/null) disclosing the field measures un-indexed UNTRACKED
  source, not the git working-tree state, so a `false` is not "git clean".
- Decision: keep the field name (no rename -> no dangling legis signing
  gate) and scope detection to untracked-only. Broadening to modified
  tracked source via `git diff`/`status --porcelain` would hash
  working-tree content, reintroducing the corpus-controlled code-exec
  vector hardened_git avoids; modified *indexed* files already surface via
  `staleness` (-> stale). Documented in the field doc + the consumer note,
  which tells a freshness/signing gate to key on `staleness == fresh`.

Tools surface 39 -> 40; tool-count/positional tests and current-surface
docs (README, web concepts/reference) updated. SEI scheme untouched;
read-only surface (no new writes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…onger reproduces (clarion-87c1eba2bd, ADR-048)

After a clean full analyze, DELETE open, Filigree-unlinked findings whose
run_id != the current run — the prior-index-style finding diff ADR-047
deferred. Reuses ADR-047's run_id signal (a reproduced finding upserts to
the current run_id; a vanished one keeps its prior one). Closes the gap
where a finding whose code was fixed (or deleted — `entities` is cumulative,
so the findings→entities cascade never fires) lingered forever and the
whole-project finding count only ever grew.

Lifecycle preserved: findings carrying a filigree_issue_id or a non-`open`
status (acknowledged/suppressed/promoted_to_issue) are operator decisions
owned by the Filigree unseen/soft-archive path, never this local sweep —
the predicate (filigree_issue_id IS NULL) is disjoint from that set.

Gated to a clean full pass so `run_id <> current` unambiguously means
"the run walked this finding's file and stopped reproducing it":
  !resume && skipped_files == 0 && source_walk_skipped_entries == 0 && !no_sei
The walk-error clause closes a hole an adversarial review found: a single
source-walk error (IO/permission/path-jail) leaves files unread yet reaches
Completed with skipped_files == 0, which would otherwise retire a whole
unwalked subtree's still-reproducing findings.
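
A minimal sketch of the gate and the sweep predicate described above, using in-memory stand-ins. `AnalyzeOutcome`, `Finding`, and the function names here are hypothetical illustrations; the real sweep is a storage-level delete in `findings::sweep_stale_findings`.

```rust
/// Hypothetical summary of one analyze pass; field names mirror the gate clause.
struct AnalyzeOutcome {
    resume: bool,
    skipped_files: u64,
    source_walk_skipped_entries: u64,
    no_sei: bool,
}

/// Hypothetical in-memory view of a finding row.
struct Finding {
    run_id: u64,
    status: &'static str,
    filigree_issue_id: Option<String>,
}

/// True only for a clean full pass, where `run_id != current` is unambiguous.
fn sweep_allowed(o: &AnalyzeOutcome) -> bool {
    !o.resume && o.skipped_files == 0 && o.source_walk_skipped_entries == 0 && !o.no_sei
}

/// True when the finding is open, Filigree-unlinked, and was not reproduced
/// by the current run.
fn is_stale(f: &Finding, current_run: u64) -> bool {
    f.status == "open" && f.filigree_issue_id.is_none() && f.run_id != current_run
}

/// Drops stale findings when the gate allows it; returns how many were retired.
fn sweep(findings: &mut Vec<Finding>, o: &AnalyzeOutcome, current_run: u64) -> usize {
    if !sweep_allowed(o) {
        return 0; // incremental, resumed, or partially walked run: never sweep
    }
    let before = findings.len();
    findings.retain(|f| !is_stale(f, current_run));
    before - findings.len()
}
```

With a single walk error (`source_walk_skipped_entries == 1`) the sketch retires nothing, which is the behaviour the walk-error clause is meant to guarantee.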

- storage: `findings::sweep_stale_findings` + `WriterCmd::SweepStaleFindings`
  (query-time write, post-CommitRun, best-effort/enrich-only).
- cli: sweep call site last in the analyze `Completed` arm, after every
  during-run and post-commit finding pass.
- ADR-048 + README index row.
- Tests: storage unit matrix (lifecycle exemptions) + writer-actor round-trip;
  two CLI integration tests that fail if the sweep or the skipped_files gate
  clause is removed (positive retirement + incremental-skip no-op).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ix (C1, gap-analysis opp #4)

`loomweave doctor` verified the orientation surfaces but never checked whether
`.weft/loomweave/loomweave.db` was committed to git — a vacuous-green gap (same
class as wardline W2). A tracked db mutates on every analyze/scan, leaving a
permanently-dirty tree that blocks legis signing; ADR-005 was reversed in
b7a1b30 so fresh installs gitignore it, but a template change cannot untrack an
already-committed db.

- new `db_tracked_state` queries `git ls-files --error-unmatch`; non-success
  (untracked / ignored / absent / outside-repo / not-a-repo / git-missing) all
  fold to Untracked, so the check is fail-soft
- text + JSON report paths gain a `db.tracked` check: a warning with the
  `git rm --cached` remedy by default (advisory — does not fail the gate,
  matching the enrich-only severity model)
- `--fix` self-heals via `git rm --cached --ignore-unmatch` on the db + WAL/SHM
  sidecars, keeping the working-tree files
- 4 tests over a temp git repo: tracked/untracked/outside-repo detection +
  the --fix unstage-but-keep-file path

Closes the loomweave half of weft-d822a7de2d's doctor self-heal.
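
The fail-soft detection can be sketched as below. The function name comes from this commit, but the signature and the `TrackedState` enum are assumptions made for illustration; only the `git ls-files --error-unmatch` call is taken from the description.

```rust
use std::path::Path;
use std::process::{Command, Stdio};

#[derive(Debug, PartialEq)]
enum TrackedState {
    Tracked,
    Untracked,
}

/// Asks git whether `rel_path` is tracked, running from `project_root`.
/// Every failure mode (git missing, not a repo, path unmatched, bad cwd)
/// folds to `Untracked`, so the check can never fail the caller.
fn db_tracked_state(project_root: &Path, rel_path: &str) -> TrackedState {
    let status = Command::new("git")
        .arg("ls-files")
        .arg("--error-unmatch")
        .arg("--")
        .arg(rel_path)
        .current_dir(project_root)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status();
    match status {
        Ok(s) if s.success() => TrackedState::Tracked,
        _ => TrackedState::Untracked,
    }
}
```

Folding a spawn error and a non-zero exit into the same arm is what keeps the check advisory: a machine without git reports `Untracked` rather than an error.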

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two doctor surfaces landed together in this commit. NOTE: the index-DB-health
work was already present (uncommitted) in the working tree and is bundled here
deliberately rather than split, with its CHANGELOG entry.

1. Index-DB health check (`.weft/loomweave.schema`) — classifies four states
   instead of mere file-existence: absent (warning, install-before-analyze is a
   legitimate intermediate, gate passes); present-but-unreadable/corrupt/wrong-
   format (problem, fails the gate); opens but `PRAGMA user_version` exceeds this
   build's schema (problem, names the newer-build cause + version numbers);
   healthy (ok). JSON + text paths agree; path via `store::db_path` honours a
   `weft.toml` store_dir override; opened read-only. Tests:
   `doctor_index_health_*`.

2. Git-tracked runtime DB is now a gate-failing PROBLEM (was a warning).
   A tracked `loomweave.db` mutates on every analyze/scan, dirtying the tree and
   blocking legis signing (C1 / weft-d822a7de2d), so `doctor` exits non-zero
   instead of vacuously passing as a pre-commit gate; `--fix` self-heals via
   `git rm --cached`. Test: `doctor_flags_git_tracked_db_as_problem_*`.
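
A rough sketch of the four-state classification and how it feeds the gate, assuming the file probe has already happened. `Probe`, `IndexHealth`, `Severity`, and the function names are hypothetical; versions at or below the build's schema are treated as healthy here purely as a simplification.

```rust
/// What a read-only probe of the index file observed (hypothetical type).
enum Probe {
    Missing,
    OpenFailed,
    UserVersion(u32),
}

#[derive(Debug, PartialEq)]
enum IndexHealth {
    Absent,
    Unreadable,
    NewerSchema { found: u32, supported: u32 },
    Healthy,
}

#[derive(Debug, PartialEq)]
enum Severity {
    Ok,
    Warning,
    Problem,
}

/// Maps the probe result onto the four states; `supported` is this build's schema.
fn classify(probe: Probe, supported: u32) -> IndexHealth {
    match probe {
        Probe::Missing => IndexHealth::Absent,
        Probe::OpenFailed => IndexHealth::Unreadable,
        Probe::UserVersion(found) if found > supported => {
            IndexHealth::NewerSchema { found, supported }
        }
        Probe::UserVersion(_) => IndexHealth::Healthy,
    }
}

/// Absent is a legitimate intermediate (warning); unreadable or too-new fails the gate.
fn severity(health: &IndexHealth) -> Severity {
    match health {
        IndexHealth::Absent => Severity::Warning,
        IndexHealth::Unreadable | IndexHealth::NewerSchema { .. } => Severity::Problem,
        IndexHealth::Healthy => Severity::Ok,
    }
}

/// The gate passes when no check reported a problem; warnings are advisory.
fn gate_passes(checks: &[Severity]) -> bool {
    !checks.contains(&Severity::Problem)
}
```

Under this model the git-tracked db check from item 2 would contribute a `Severity::Problem` entry, which is what turns `doctor` into a failing pre-commit gate.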

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

#2)

`integration_bindings` hardcoded the unscoped `/api/weft/scan-results`, which
server-mode Filigree now fail-closes (N1) → every wardline scan 400s. Worse, the
staleness check tested for that exact unscoped URL, so it flagged a CORRECT
project-scoped config as "stale" and `doctor --fix` / `install` then OVERWROTE
the working URL with the broken one — a repair that converted working→broken.

- new `filigree_server_scope` reads `.weft/filigree/config.json`; when
  `mode == "server"` it returns the routing `prefix` (filigree mounts
  `/api/p/{prefix}/…` on it; `name` kept as fallback), else None (fail-soft)
- `desired_bindings` emits `/api/p/{prefix}/weft/scan-results` for server-mode
  Filigree, and keeps the unscoped `/api/…` path for single-project / no-Filigree
  layouts (which still serve the unscoped mount)
- so `binding_state` now reports a scoped config HEALTHY (no false "stale"), and
  `--fix` converges to the WORKING scoped URL instead of clobbering it
- 3 tests: server-mode→scoped, non-server→unscoped, absent-config→unscoped
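
The scoping rule above can be sketched as follows, assuming the Filigree config has already been parsed. `FiligreeConfig`, `scan_results_path`, and `is_stale` are hypothetical helpers for illustration; only the two URL shapes and the `mode`/`prefix`/`name` fields come from this commit.

```rust
/// Hypothetical, already-parsed view of `.weft/filigree/config.json`.
struct FiligreeConfig {
    mode: String,
    prefix: Option<String>,
    name: Option<String>,
}

/// Returns the routing scope for server-mode Filigree, else None (fail-soft).
fn filigree_server_scope(cfg: Option<&FiligreeConfig>) -> Option<String> {
    let cfg = cfg?; // absent or unreadable config: unscoped
    if cfg.mode != "server" {
        return None;
    }
    // `prefix` is the routing key; `name` is kept as a fallback.
    cfg.prefix.clone().or_else(|| cfg.name.clone())
}

/// Scoped path for server mode, unscoped path otherwise.
fn scan_results_path(scope: Option<&str>) -> String {
    match scope {
        Some(prefix) => format!("/api/p/{prefix}/weft/scan-results"),
        None => "/api/weft/scan-results".to_string(),
    }
}

/// A configured path is stale only when it differs from the desired one.
fn is_stale(configured: &str, scope: Option<&str>) -> bool {
    configured != scan_results_path(scope)
}
```

Comparing against the desired path, rather than against one fixed URL, is what stops a correct scoped config from being reported as stale and then overwritten.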

Closes gap-analysis opp #2 (loomweave's half of the member path-scope action).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tachyon-beep and others added 8 commits June 8, 2026 19:04
…-01)

Closes clarion-c9f62eec7d. The pre-merge gate (ci.yml) and the pre-release
gate (release.yml `verify`/`verify-macos`) were hand-duplicated — the exact
drift risk release.yml's own header flagged ("Duplicated rather than refactored
into a reusable workflow on the eve of 1.0; that refactor is post-release
scope"). They had already diverged (llvm-tools-preview component, job shape,
cache keys).

Factor the entire gate set into `.github/workflows/verify.yml`
(`on: workflow_call`): rust (fmt, the five lockstep/migration guards, clippy,
build bins, nextest, doc -D warnings, deny), rust-macos (aarch64 clippy+build),
python-plugin (ontology lockstep, uv sync, pip-audit, B4/B5 gate, ruff, ruff
format, mypy, pytest), and walking-skeleton (sprint_1/wp5/sprint_2/phase3).
Both entry points now `uses: ./.github/workflows/verify.yml`, so the CI floor
is defined exactly once and cannot drift.

The only release-only concern — "tag points at a commit on main" — stays in
release.yml as a dedicated `assert-on-main` job (gated on event_name==push, so
a dispatch dry-run still runs; a skipped needs does not block dependents). The
build/publish jobs now `needs: [verify, assert-on-main]`.

Line counts: ci.yml 227->17, release.yml 673->532, +234 in the canonical
verify.yml. YAML validated; the gate set is preserved by construction.
actionlint could not run in this sandbox (external download blocked), so the
live proof is the next Actions run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… serve, config passthrough (clarion-c326ee6857)

Implements the 4 still-valid deferred findings from the PR#21 review; the
other 8 were verified already-fixed/outdated against current code (rc3) and
need only thread resolution (recorded on the issue).

#4 writer.rs begin_run TOCTOU: if begin_write_tx exhausts its retries after the
runs row was auto-committed 'running', re-mark the row 'failed' under a fresh
implicit tx (mirrors the CommitRun failure-remark idiom) so it isn't stranded
phantom-running with current_run unset. Deliberately does NOT move the INSERT
inside the tx (the ticket's literal suggestion) — that would hide the running
row from cross-process analyze_status until the first batch COMMIT (the
regression review #15 warns about). Best-effort; mark_stale_running_runs_failed
remains the startup backstop.

#15 writer.rs resume_run: capture the row's prior (status, completed_at) before
flipping it to 'running' and restore them if begin_write_tx fails — a
pre-existing completed run must not be left flipped to running.

Both paths get deterministic, single-threaded coverage via two test seams
(grab a competing write lock after the insert/update; release it on the
failure path so the best-effort cleanup can re-acquire the lock) — the
contention harness the ticket said #4/#15 needed, without threads or
wall-clock races.
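
The #15 capture-and-restore idea, reduced to an in-memory sketch. `RunRow` and this `resume_run` signature are hypothetical, the closure stands in for `begin_write_tx`, and clearing `completed_at` on resume is an assumption.

```rust
/// Hypothetical in-memory stand-in for one `runs` row.
#[derive(Clone, Debug, PartialEq)]
struct RunRow {
    status: String,
    completed_at: Option<String>,
}

/// Flips the row to 'running', then restores the prior (status, completed_at)
/// if the write transaction cannot be opened.
fn resume_run<F>(row: &mut RunRow, begin_write_tx: F) -> Result<(), String>
where
    F: FnOnce() -> Result<(), String>,
{
    let prior = row.clone(); // capture before mutating
    row.status = "running".to_string();
    row.completed_at = None; // assumption: a resumed run is no longer completed
    if let Err(e) = begin_write_tx() {
        *row = prior; // best-effort restore so a completed run is not left flipped
        return Err(e);
    }
    Ok(())
}
```

Injecting the transaction opener as a closure plays the same role as the test seams mentioned above: the failure path can be exercised deterministically, without threads.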

#8 reader.rs open_validated: reject an unmigrated DB (new
schema::reject_unmigrated_for_read → StorageError::UnmigratedIndex) so a
header-valid but empty/externally-created file is refused at serve instead of
auto-materialised and answered with zero rows. Keyed on user_version (0 =
unmigrated), NEVER on row counts — an installed-but-unanalyzed index is
user_version=CURRENT with zero entities and stays a valid serve target.
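
A sketch of the #8 keying rule. The function and error names come from this commit, but the signature is an assumption; the unused entity count is a hypothetical parameter included only to show what the check must not depend on.

```rust
#[derive(Debug, PartialEq)]
enum StorageError {
    UnmigratedIndex,
}

/// Rejects a database whose `user_version` is 0 (never migrated).
/// `_entity_count` is deliberately ignored: an installed-but-unanalyzed index
/// has zero entities and is still a valid serve target.
fn reject_unmigrated_for_read(user_version: u32, _entity_count: u64) -> Result<(), StorageError> {
    if user_version == 0 {
        Err(StorageError::UnmigratedIndex)
    } else {
        Ok(())
    }
}
```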

#12 serve --config passthrough: serve forwards its resolved on-disk --config to
analyze_start-spawned analyze (ServerState::with_analyze_config →
spawn_analyze --config) so the child parses the same configuration the
operator launched serve with, instead of re-discovering config and silently
diverging.
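
The #12 forward-or-omit behaviour amounts to building the child's argument list conditionally. `analyze_args` is a hypothetical helper, not the real `spawn_analyze`.

```rust
use std::ffi::OsString;
use std::path::Path;

/// Builds the argument list for the spawned analyze child (hypothetical helper).
/// `--config` is forwarded only when serve was launched with a resolved config.
fn analyze_args(config: Option<&Path>) -> Vec<OsString> {
    let mut args = vec![OsString::from("analyze")];
    if let Some(path) = config {
        args.push(OsString::from("--config"));
        args.push(path.as_os_str().to_os_string());
    }
    args
}
```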

Tests: 2 writer failpoint tests, 2 reader schema-validation tests, 1
spawn_analyze --config forward/omit test. 972 storage+mcp+cli tests green;
clippy -D warnings clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-bf496d55d1, §4.2)

Operator confirmed mechanism §4.2, Loomweave half. Extends DEAD_CODE_ROOT_TAGS
with `wardline:external_boundary` and `wardline:trusted` so a developer-annotated
Wardline trust boundary is treated as a reachability root by find_dead_code
(external_boundary → entry point, trusted → exported API).

The input already arrives in-scope: the Python plugin reads the on-disk Wardline
vocabulary descriptor and emits `wardline:<canonical_name>` tags into entity_tags
(plugin_id=python) at analyze time, through the same host validation + writer
discipline as every other tag (extractor.py:1098). So this is a read-side alias —
no new ingestion, no taint-store parse, no migration, no Wardline-side work. The
opaque wardline_taint SP9 blob does NOT carry this classification (it would block
on sibling-repo emission), so the plugin-tag channel is the correct input.

Doctrine: enrich-only (no descriptor → no wardline:* tag → root set byte-identical;
the signal-unavailable empty-root guard still holds). SEI/freshness handled for
free — entity_tags rows cascade-delete with their entity (FK ON DELETE CASCADE)
and roots join only live entities, so a stale boundary fact cannot resurrect a
deleted entity as a root; the plugin re-emits per (entity_id, plugin_id) each
analyze, so the signal is as fresh as the index.

Test: find_dead_code_treats_wardline_trust_boundaries_as_roots — a wardline-tagged
unreached entity is spared, an untagged unreached entity is still flagged.
6 dead_code tests green; clippy -D warnings clean.
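
A small reachability sketch of the root-tag alias, over in-memory data rather than the index. The real `DEAD_CODE_ROOT_TAGS` contains other entries not named in this message, so only the two new tags are listed. The empty-root guard is modelled, as an assumption, by returning `None` when no entity carries a root tag.

```rust
use std::collections::{BTreeSet, HashMap, HashSet, VecDeque};

/// Only the two tags added by this commit; the real list has more entries.
const DEAD_CODE_ROOT_TAGS: &[&str] = &["wardline:external_boundary", "wardline:trusted"];

/// Returns the unreached entities, or None when no root exists at all
/// (signal unavailable: do not flag everything as dead).
fn find_dead_code(
    entities: &[&str],
    tags: &HashMap<&str, Vec<&str>>,
    calls: &[(&str, &str)],
) -> Option<BTreeSet<String>> {
    let mut reached: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<&str> = VecDeque::new();
    // Seed the walk with every entity carrying a root tag.
    for &e in entities {
        let is_root = tags
            .get(e)
            .map_or(false, |ts| ts.iter().any(|t| DEAD_CODE_ROOT_TAGS.contains(t)));
        if is_root && reached.insert(e) {
            queue.push_back(e);
        }
    }
    if reached.is_empty() {
        return None;
    }
    // Breadth-first walk along caller -> callee edges.
    while let Some(cur) = queue.pop_front() {
        for &(from, to) in calls {
            if from == cur && reached.insert(to) {
                queue.push_back(to);
            }
        }
    }
    Some(
        entities
            .iter()
            .copied()
            .filter(|e| !reached.contains(e))
            .map(String::from)
            .collect(),
    )
}
```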

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (clarion-2b8811da39)

Closes V11-ARCH-05. Carves the pure, self-free validation layer out of the
3358-line host.rs into a new same-directory submodule host_validate.rs
(mirroring the existing host_findings.rs precedent), separating stateless
*validation* from the PluginHost *transport*/orchestration that drives it.

Moved: the B.3 per-field cap consts (MAX_ENTITY_FIELD_BYTES,
MAX_ENTITY_EXTRA_BYTES, MAX_UNRESOLVED_CALLEE_EXPR_BYTES,
MAX_FINDING_SUBCODE_BYTES, MAX_FINDING_SEVERITY_BYTES,
MAX_PLUGIN_FINDINGS_PER_FILE) and the free-function validators oversize_field,
oversize_edge_field, invalid_unresolved_call_site_reason,
validate_plugin_finding (+ private stringify_finding_metadata_value).

host.rs re-exports the public caps (`pub use`) so every existing path keeps
resolving — crate::plugin::host::MAX_ENTITY_FIELD_BYTES (protocol.rs intra-doc
link), the mod.rs surface, and the host.rs test module — and brings the
validators into scope for analyze_file to call unqualified. No public API
change.

Deliberately scoped to the validator layer: analyze_file's four-stage pipeline,
the JSON-RPC transport, do_shutdown, read_response_matching, and especially the
unsafe pre_exec/setrlimit block in spawn() all stay in host.rs untouched
(unsafe_code stays denied except that one documented block). RawEntity/RawEdge/
RawSource stay in host.rs so loomweave-cli's plugin::host::RawSource path and
the mod.rs RawEntity/RawEdge re-exports are unaffected.

Added 5 direct unit tests in host_validate.rs (subcode-prefix, severity
allow-list, anchor_file_path injection, call-site range/caller rejection); the
host.rs T8/T9 pipeline integration tests stay in host.rs. 202 core tests green;
clippy -D warnings + rustdoc -D warnings clean (intra-doc links resolve).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…on-164f88c510)

Step 2 of C-9 (hub-blessed): implement the full cross-read reader against the
shared weft.toml [<member>].url schema (proposal §2.4). Step 1 (the schema
proposal) shipped in 234fe7f.

loomweave-core::store: extend WeftToml to read the allowlisted cross-read `url`
from sibling tables ([filigree]/[wardline]/[legis]) and Loomweave's own
[loomweave].url; add `pub fn sibling_url(project_root, member)` — fail-soft
(absent/malformed/wrong-type/blank → None), never writes weft.toml (Gate
weft-eb3dee402f / C-4). Refactored the parse into a shared parse_weft_toml
helper; [loomweave].store_dir reading is unchanged.

Both endpoint resolvers gain the §2.2 precedence ladder, reporting source:
  flag/env (WEFT_<X>_URL, verbatim) > weft.toml [X].url (verbatim) >
  on-disk ephemeral.port > configured/default.
The operator's durable weft.toml url deliberately outranks on-disk discovery
(§2.2: a remote sibling has no local ephemeral.port). New SOURCE_ENV /
SOURCE_WEFT_TOML labels surface where the URL came from (project_status/doctor).
The env getter is injected (closure) for testability; production passes
`|n| std::env::var(n).ok()`. loomweave.yaml stays authoritative for
member-private behavior; weft.toml is the operator overlay, never written.
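
The §2.2 ladder with an injected env getter might look roughly like this. `resolve_url` and its parameters are hypothetical, the string values of the source labels are assumed (only the names `SOURCE_ENV` and `SOURCE_WEFT_TOML` appear in this commit), the loopback URL format is an assumption, and the explicit flag rung is omitted for brevity.

```rust
const SOURCE_ENV: &str = "env"; // label value assumed
const SOURCE_WEFT_TOML: &str = "weft.toml"; // label value assumed
const SOURCE_EPHEMERAL_PORT: &str = "ephemeral.port"; // hypothetical label
const SOURCE_CONFIGURED: &str = "configured"; // hypothetical label

/// Resolves a sibling URL: env > weft.toml > on-disk ephemeral.port > configured.
/// Returns the URL together with a label naming where it came from.
fn resolve_url<F>(
    member: &str,
    env: F,
    weft_toml_url: Option<&str>,
    ephemeral_port: Option<u16>,
    configured: &str,
) -> (String, &'static str)
where
    F: Fn(&str) -> Option<String>,
{
    let var = format!("WEFT_{}_URL", member.to_uppercase());
    // A blank env value falls through instead of winning.
    if let Some(v) = env(&var).filter(|v| !v.trim().is_empty()) {
        return (v, SOURCE_ENV);
    }
    // The operator's durable weft.toml url outranks on-disk discovery.
    if let Some(u) = weft_toml_url.filter(|u| !u.trim().is_empty()) {
        return (u.to_string(), SOURCE_WEFT_TOML);
    }
    if let Some(port) = ephemeral_port {
        return (format!("http://127.0.0.1:{port}"), SOURCE_EPHEMERAL_PORT);
    }
    (configured.to_string(), SOURCE_CONFIGURED)
}
```

Production code would pass `|n| std::env::var(n).ok()` as the getter, while tests pass `|_| None` or a fixed map, which is the testability point made above.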

Call sites updated: serve.rs + analyze.rs (×2) pass the real env getter;
status.rs + doctor.rs likewise; resolver test call sites pass `|_| None`.

Tests: store sibling_url (per-member + fail-soft + store_dir-coexistence),
filigree_url (env-wins, weft.toml-over-port, blank-env-fallthrough,
disabled-not-revived), loomweave_url (env-wins, weft.toml-over-port,
blank-env-fallthrough). 102 federation + 73 dependent resolution tests green;
clippy -D warnings clean; no migration/version-lockstep impact.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The new reject_unmigrated_for_read doc (c326, commit 9987982) linked a
non-existent `set_user_version`; the user_version writer is the private
apply_user_version, reached via the public apply_migrations. Reference
apply_migrations so `cargo doc -D warnings` passes. Doc-only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ip-list constraint (clarion-6dd4b8bb85)

164f follow-up: the loomweave_url / filigree_url module docstrings still
described the pre-164f behaviour — loomweave_url.rs flatly said the explicit
flag/env precedence was "each consumer's own job, not this library function",
which the 164f change contradicted (the function now resolves WEFT_<X>_URL +
weft.toml [X].url itself). Rewrite both module headers to state the C-9 §2.2
ladder (env > weft.toml > ephemeral.port > configured) and note it supersedes
the ADR-044 consumer-only division for these resolvers. Also fix store.rs's
module doc, which my sibling_url addition made stale ("reads only its own
[loomweave] table" — it now also reads the allowlisted sibling url). Doc-only;
rustdoc -D warnings + fmt green.

clarion-6dd4b8bb85 (resolve via the ticket's sanctioned option b — document the
constraint): the source-walk / secret-scan / pyright skip-lists exclude the
whole .weft/ dotdir, so a [loomweave].store_dir override must stay within .weft/
or else be placed entirely outside the analyzed root; an override under the
analyzed tree but outside .weft/ would get loomweave.db walked/secret-scanned as
source. Documented in store.rs (ADR-046 Consequences). Auto-excluding an
arbitrary override location (option a) was considered and rejected as not worth
the coupling — the recommended override stays within .weft/.
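
The documented constraint can be expressed as a purely lexical path check. `store_dir_is_excluded` is a hypothetical helper (the commit documents the rule rather than enforcing it), and it assumes both paths are absolute and already normalized.

```rust
use std::path::Path;

/// True when a store_dir override cannot be walked or secret-scanned as source:
/// it sits inside `<root>/.weft/`, or entirely outside the analyzed root.
fn store_dir_is_excluded(analyzed_root: &Path, store_dir: &Path) -> bool {
    let weft = analyzed_root.join(".weft");
    // `Path::starts_with` compares whole components, so `.weftx` does not match `.weft`.
    store_dir.starts_with(&weft) || !store_dir.starts_with(analyzed_root)
}
```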

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t (clarion-3b87a7b174)

briefing_blocked is a VIRTUAL generated column over the mutable properties
JSON (migration 0002), re-derived from the secret scanner every run; the
writer-actor upsert replaces properties wholesale (writer.rs:687) with no
read-back. So a secret-bearing entity stays withheld only because every
producer re-asserts the block each run. That invariant held on all HEAD
paths (pre_ingest is unconditional incl. --resume; secret files are carved
out of incremental skip; all three producers fail-closed via
or_else(UnscannedSource)) but was UNTESTED — fragile to a future producer
that rewrites a secret file's properties without re-stamping the block.

Add still_secret_stays_blocked_across_reanalysis: a file that STILL contains
a secret must keep briefing_blocked across (1) a full re-analysis that
rewrites properties (changed body), (2) an incremental skip, and (3)
--resume. Verified to bite by mutating a producer to drop the block (RED),
then reverting (GREEN).

Investigation found the headline "silent un-block" does not reproduce on
HEAD; the only un-blocks are operator-initiated and tested-as-correct
(baseline / --allow-unredacted-secrets). Reframed to hardening; no migration.
A separate read-side gap (MCP orientation/find/neighborhood/semantic-search
leak blocked-entity identity with no gate) was filed as clarion-307668e2be.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@tachyon-beep tachyon-beep changed the title from "feat: no-index MCP chirp + ADR-044 ephemeral port + dogfood/v1.1 cleanups (1.1.0-rc2)" to "feat: no-index MCP chirp + ADR-044 ephemeral port + dogfood/v1.1 cleanups (1.1.0-rc3)" Jun 8, 2026
… grep (weft-b7ce301e92)

Dogfood-3 finding LW-1/LW-2: entity_find was effectively name-only. It ran
FTS over name/short_name/summary, and summaries are off by default (ADR-030),
so a concept word returned empty and nudged agents back to grep — the discovery
step that is supposed to replace grep didn't.

Two structural causes, confirmed against the live lacuna index:
- A concept word that lives only in docstring prose (e.g. `borrow` in
  LoanPolicy's docstring) was never indexed — FTS covers name/short_name/summary
  only.
- A concept word that is a substring of a compound identifier (e.g. `library`
  in the class `LibraryService`) cannot be reached by FTS at all: FTS matches
  whole stemmed tokens, and porter mangles the compound so neither `library`
  nor the prefix `library*` aligns with the stored token (`librar*` does, 8
  hits). Prefix-append cannot fix this; substring matching can.

Fix (no schema migration): find_entities now merges two recall paths —
  1. FTS (when the pattern is FTS-safe): stemmed, bm25-ranked.
  2. LIKE substring over id/name/short_name/summary AND a briefing_blocked-
     guarded docstring.
deduped by id (FTS rank first, then substring-only hits in id order) and paged
in Rust. This is the grep-equivalent, always-on keyword path the surface
promises, with no dependency on the opt-in embeddings sidecar (ADR-040). LW-2:
turning semantic search on by default is ruled out by ADR-040 + local-first
(needs a hosted embedding service + key); instead the `not_enabled` signal and
the entity_find description now point at entity_find as the always-on path.

Secret safety: the docstring clause is gated on `briefing_blocked IS NULL` so a
docstring withheld by the pre-ingest secret scanner (ADR-013) never becomes
matchable — searching for a leaked secret must not resurface the blocked entity.
This deliberately does NOT widen clarion-307668e2be (the separate, tracked
blocked-entity *identity* exposure on these read surfaces); id/name/short_name
matching is unchanged.
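
The merge order and the content guard, sketched over in-memory rows instead of SQL. `Entity` and both functions are hypothetical stand-ins: the FTS result is taken as an already-ranked id list, and lowercased `contains` only approximates LIKE.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory view of an indexed entity.
struct Entity {
    id: String,
    name: String,
    short_name: String,
    summary: Option<String>,
    docstring: Option<String>,
    briefing_blocked: Option<String>,
}

/// Substring recall over id/name/short_name/summary, plus the docstring
/// only when the entity is not briefing-blocked.
fn substring_match(e: &Entity, needle: &str) -> bool {
    let n = needle.to_lowercase();
    let has = |s: &str| s.to_lowercase().contains(&n);
    has(&e.id)
        || has(&e.name)
        || has(&e.short_name)
        || e.summary.as_deref().map_or(false, |s| has(s))
        || (e.briefing_blocked.is_none() && e.docstring.as_deref().map_or(false, |s| has(s)))
}

/// Merges FTS hits (rank order first) with substring-only hits (id order),
/// dedupes by id, then pages the merged list.
fn find_entities(
    entities: &[Entity],
    fts_ranked_ids: &[&str],
    needle: &str,
    offset: usize,
    limit: usize,
) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut merged: Vec<&str> = Vec::new();
    for &id in fts_ranked_ids {
        if seen.insert(id) {
            merged.push(id);
        }
    }
    let mut substring_ids: Vec<&str> = entities
        .iter()
        .filter(|e| substring_match(e, needle))
        .map(|e| e.id.as_str())
        .collect();
    substring_ids.sort_unstable();
    for id in substring_ids {
        if seen.insert(id) {
            merged.push(id);
        }
    }
    merged.into_iter().skip(offset).take(limit).map(String::from).collect()
}
```

Because the guard sits only on the docstring clause, a blocked entity can still match by id or name in this sketch, which mirrors the unchanged id/name/short_name behaviour noted above.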

Validated on the live lacuna index (no re-analyze; query-only change): both
`entity_find 'borrow'` (0->1, LoanPolicy) and `'library'` (0->10, LibraryService
ranked first) flip empty->non-empty, end-to-end through a rebuilt serve binary.

- 3 new storage tests: docstring concept word, identifier substring FTS cannot
  reach, and the briefing_blocked content guard. Existing find/kind/pagination
  tests unchanged and green.
- entity_find tool description, web reference, and the pinned description test
  updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…t in the agent pack (weft-b7ce301e92)

Bring the agent-facing surface in step with the shipped capabilities:

- loomweave-workflow SKILL.md (embedded via include_str! in loomweave-mcp):
  - New "How find_entity matches" section — it merges stemmed FTS ranking with
    grep-equivalent substring recall over name/short_name/summary AND docstring,
    so a concept word that is only a substring of a compound identifier
    (library -> LibraryService) or lives only in docstring prose (borrow in a
    LoanPolicy docstring) is discoverable. Names it the always-on
    keyword-discovery path (reach for it before grep); no embeddings required.
  - search_semantic paragraph reframed: not_enabled is not a dead end — points
    back at find_entity as the keyword path (the LW-2 honest-degrade pointer).
  - Added project_finding_list (cb49008) to the inspection catalogue (every
    finding project-wide, no entity id) and the has_findings filter on
    find_by_wardline.
  - find_entity table row + pagination gotcha updated for content matching.
- instructions/loomweave.md (embedded in loomweave-cli): one-line note that
  entity_find is the grep replacement (substring over name/summary/docstring,
  no embeddings); semantic ranking is the separate opt-in tool.

No code change; embedded-asset prose only. skills/install + instructions
drift-marker tests green (installed projects will read as "drifted" until the
instructions block is re-pushed, by design).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@tachyon-beep tachyon-beep deleted the rc3 branch June 9, 2026 17:39