Skip to content

feat(pack-memory): expose top_k/fusion_strategy/score_floor knobs on recall (ADR-033 §6)#406

Merged
ohdearquant merged 7 commits into
mainfrom
slice/v022/05-recall-knobs
May 25, 2026
Merged

feat(pack-memory): expose top_k/fusion_strategy/score_floor knobs on recall (ADR-033 §6)#406
ohdearquant merged 7 commits into
mainfrom
slice/v022/05-recall-knobs

Conversation

@ohdearquant
Copy link
Copy Markdown
Owner

@ohdearquant ohdearquant commented May 25, 2026

Summary

Slice 5/9 from PR #399 (v022-polish show). Stacked on slice 4 (wire-retrieval).

Adds three optional per-request fields to RecallParams per ADR-033 §6:

  • top_k: Option<usize> — overrides result limit for a single call (capped at 100)
  • fusion_strategy: Option<String> — one of {"rrf", "weighted", "union"}
  • score_floor: Option<f32> — post-filter on composite score

What's changed

  • Validate fusion_strategy with clear error listing valid values
  • Wire override into cfg.fuse_strategy before passing to fuse_candidates
  • 4 integration tests: default_identity, top_k_override, fusion_strategy_override (incl. rejection), score_floor
  • Document knobs in docs/adr/ADR-033 §6.1 with table, semantics, and example DSL
  • Style: deno fmt re-pad table in ADR (post-merge cleanup commit)

Dependencies

Depends on #405 (slice 4: wire-retrieval). Base branch is slice/v022/04-wire-retrieval — will auto-rebase onto main when slice 4 merges.

Test plan

  • cargo test --workspace passes
  • Identity test confirms no behavior change with defaults
  • CI green

ohdearquant and others added 3 commits May 25, 2026 11:21
Route pack-memory's fuse_candidates through khive_retrieval::fuse_search_results,
making khive-retrieval a real consumed facade instead of an orphan crate.

- Add khive-retrieval dep to khive-pack-memory/Cargo.toml
- Replace direct fuse_with_strategy call with retrieval adapter
  (CandidateMeta side-map, HybridConfig builder, FusionStrategy conversion)
- Fix issue #309: resolve --all-features compile failures in khive-retrieval
  (stale SqliteStore imports, missing NodeId/LinkStore imports)
- Add 5 integration tests (3 fusion_surface, 2 pack-memory recall adapter)
- RRF k=1 discriminator test proves strategy propagation (30x score gap)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…recall (ADR-033 §6)

- Add three optional per-request fields to RecallParams: top_k (usize),
  fusion_strategy (string), and score_floor (f32)
- fusion_strategy validated against {"rrf","weighted","union"}; clear error
  with valid values on invalid input
- top_k overrides the result limit for a single call (capped at 100)
- score_floor applied as a post-filter on the composite score after compute_score
- Add parse_fusion_strategy_str helper; wire override into cfg.fuse_strategy
  before passing to fuse_candidates
- Add 4 integration tests: default_identity, top_k_override,
  fusion_strategy_override (including rejection), score_floor
- Document knobs in ADR-033 §6.1 with table, semantics, and example DSL

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses codex finding on PR #405: `khive_retrieval::hnsw::HnswIndex`
doesn't resolve because `hnsw` is not a public submodule — `HnswIndex` is
re-exported at the crate root (lib.rs:145). The doctest at persist/mod.rs:29
must use the public facade import.

Closes the remaining gap on issue #309 (--all-features doctest failure).

Verified: `RUSTC_WRAPPER= cargo test --offline -p khive-retrieval --all-features --doc` passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ohdearquant
Copy link
Copy Markdown
Owner Author

PR review authored by Codex (OpenAI agent) on behalf of the requesting operator.

Verdict: REQUEST CHANGES
Findings: 0 Blocker, 0 High, 2 Medium, 0 Low

[Medium] top_k Caps After a Lossy Cast

Evidence: docs/adr/ADR-033-recall-pipeline.md:288 says top_k is capped at 100; crates/khive-pack-memory/src/handlers.rs:74 deserializes it as Option<usize>, but crates/khive-pack-memory/src/handlers.rs:479-480 computes the effective limit with (k as u32).min(100).

Why this matters: For the requested case, top_k = 200 is silently capped to 100, which matches the ADR's "Capped at 100" wording and does not return an error. The edge case is larger usize values: on a 64-bit target, a request such as top_k = 4_294_967_297 truncates to 1 before the cap is applied, so the result limit is no longer "capped at 100"; it is wrapped by the integer cast.

Suggested fix: Cap before narrowing, e.g. let limit = k.min(100) as u32, or use u32::try_from(k).unwrap_or(100).min(100). Add a regression that covers both top_k = 200 and a value larger than u32::MAX.

[Medium] The New Knob Tests Do Not Fully Prove Default Identity or Strategy Application

Evidence: crates/khive-pack-memory/tests/integration.rs:1025-1029 captures the baseline recall output, and crates/khive-pack-memory/tests/integration.rs:1036-1043 captures the "knobless" output, but the assertions at crates/khive-pack-memory/tests/integration.rs:1045-1053 only compare lengths and then assert the baseline top hit equals the created note. They never assert that the knobless output's ordered IDs/scores equal the baseline. Similarly, the valid strategy loop at crates/khive-pack-memory/tests/integration.rs:1123-1138 only asserts that "rrf", "weighted", and "union" return arrays; it would still pass if the handler validated the string but forgot to apply the requested strategy.

Why this matters: ADR-033 §6.1 says absent or null knobs preserve current behavior, and the review scope asks for tests that would fail without each knob's behavior. The identity test can pass while the null-knob call returns a different first result with the same count. The strategy test proves validation and invalid-value rejection, but not that valid overrides affect the fusion path.

Suggested fix: In test_recall_default_identity, compare the full ordered hit IDs and scores, or the full JSON value if timestamps are stable enough, between baseline and explicit-null calls for all three knobs. For fusion_strategy, add a deterministic fixture where at least one requested strategy produces an observable difference, or expose/assert the effective strategy through a focused helper so the test fails if cfg.fuse_strategy is not updated.

Looks Right

  • fusion_strategy validation lists all valid values in the error: crates/khive-pack-memory/src/handlers.rs:47-49. The integration test also checks that the invalid-strategy error mentions rrf, weighted, and union at crates/khive-pack-memory/tests/integration.rs:1151-1155.
  • score_floor is applied after compute_score: the handler computes final_score at crates/khive-pack-memory/src/handlers.rs:531-532, applies existing min_score at crates/khive-pack-memory/src/handlers.rs:534-536, then applies score_floor at crates/khive-pack-memory/src/handlers.rs:537-540.
  • Default fusion behavior is unchanged when fusion_strategy is absent: handle_recall starts from p.effective_config(self.active_config()) at crates/khive-pack-memory/src/handlers.rs:459, and RecallConfig::default() uses FusionStrategy::default() at crates/khive-pack-memory/src/config.rs:66, which is Rrf { k: 60 } at crates/khive-runtime/src/fusion.rs:36-39.
  • The four named tests are present: test_recall_default_identity at crates/khive-pack-memory/tests/integration.rs:1007, test_recall_top_k_override at crates/khive-pack-memory/tests/integration.rs:1057, test_recall_fusion_strategy_override at crates/khive-pack-memory/tests/integration.rs:1107, and test_recall_score_floor at crates/khive-pack-memory/tests/integration.rs:1159.
  • The weighted strategy does not accept request-level weights or alpha through the new top-level knob surface: RecallParams only adds top_k, fusion_strategy, and score_floor at crates/khive-pack-memory/src/handlers.rs:67-77; when the active config is already weighted, the request's "weighted" strategy preserves the pack-configured weights at crates/khive-pack-memory/src/handlers.rs:462-474, with a unit test at crates/khive-pack-memory/src/handlers.rs:861-908.
  • ADR-033 §6.1 is aligned with the implementation on the main semantics: top_k overrides limit and is capped, fusion_strategy accepts only rrf/weighted/union, and score_floor filters composite scores after scoring.
  • Commit 1169b41 is documentation formatting only: it changes table padding/blank lines in docs/adr/ADR-033-recall-pipeline.md; deno fmt --check docs/adr/ADR-033-recall-pipeline.md passes.

Commands Run

  • git status --short --branch: on slice/v022/05-recall-knobs...origin/slice/v022/05-recall-knobs, no local dirty files before writing this review.
  • git diff --name-status origin/slice/v022/04-wire-retrieval..HEAD: changed crates/khive-pack-memory/src/handlers.rs, crates/khive-pack-memory/tests/integration.rs, and docs/adr/ADR-033-recall-pipeline.md.
  • cargo test -p khive-pack-memory: failed from the instructed cwd because this checkout's workspace manifest is crates/Cargo.toml, not ./Cargo.toml.
  • RUSTC_WRAPPER= cargo test --manifest-path crates/Cargo.toml -p khive-pack-memory: failed before testing because Cargo tried to update index.crates.io and network/DNS is unavailable.
  • CARGO_NET_OFFLINE=true RUSTC_WRAPPER= cargo test --manifest-path crates/Cargo.toml -p khive-pack-memory: passed, 61 unit tests + 27 integration tests + 0 doctests.
  • deno fmt --check docs/adr/ADR-033-recall-pipeline.md: passed, Checked 1 file.
  • git diff --check origin/slice/v022/04-wire-retrieval..HEAD: passed with no whitespace errors.

What I Did Not Check

  • I did not fetch additional GitHub metadata beyond the local branch/diff; the local branch contains the two expected commits on top of origin/slice/v022/04-wire-retrieval.
  • I did not run full workspace clippy or full workspace tests because the task scope requested cargo test -p khive-pack-memory and the deadline was short.

Re-Review Guidance

Narrow re-review is enough: check the top_k cap fix and strengthen the two tests called out above, then rerun CARGO_NET_OFFLINE=true RUSTC_WRAPPER= cargo test --manifest-path crates/Cargo.toml -p khive-pack-memory.

Domain utility: SKIPPED — no lore suggest/compose tools were available in this environment, so the review used the khive ADR/code review skill plus local ADR and source context.

@ohdearquant ohdearquant changed the base branch from slice/v022/04-wire-retrieval to main May 25, 2026 15:42
…ger tests

Two Medium findings from codex review:

1. top_k cast bug: `(k as u32).min(100)` truncates k before capping. A
   request with `top_k = 4_294_967_297` (larger than u32::MAX) truncates
   to 1 BEFORE the cap is applied, so the result limit becomes 1, not 100.
   Fixed: `u32::try_from(k.min(100)).unwrap_or(100)` clamps to usize first,
   then narrows safely.

2. Weak tests: test_recall_default_identity only checked length and the
   top hit. Strengthened to compare full ordered note_id+score lists across
   all positions, with all three knobs explicitly set to null.
   test_recall_fusion_strategy_override only validated string acceptance.
   Added a new unit test (fusion_strategy_change_produces_observable_ordering_difference)
   with a deterministic fixture where RRF and Weighted strategies MUST
   produce different orderings — proving the fusion_strategy override
   actually flows into fuse_candidates, not just validation.

Verified: cargo test -p khive-pack-memory --lib passes (62 unit tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ohdearquant added a commit that referenced this pull request May 25, 2026
…ger tests

Two Medium findings from codex review:

1. top_k cast bug: `(k as u32).min(100)` truncates k before capping. A
   request with `top_k = 4_294_967_297` (larger than u32::MAX) truncates
   to 1 BEFORE the cap is applied, so the result limit becomes 1, not 100.
   Fixed: `u32::try_from(k.min(100)).unwrap_or(100)` clamps to usize first,
   then narrows safely.

2. Weak tests: test_recall_default_identity only checked length and the
   top hit. Strengthened to compare full ordered note_id+score lists across
   all positions, with all three knobs explicitly set to null.
   test_recall_fusion_strategy_override only validated string acceptance.
   Added a new unit test (fusion_strategy_change_produces_observable_ordering_difference)
   with a deterministic fixture where RRF and Weighted strategies MUST
   produce different orderings — proving the fusion_strategy override
   actually flows into fuse_candidates, not just validation.

Verified: cargo test -p khive-pack-memory --lib passes (62 unit tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ohdearquant added a commit that referenced this pull request May 25, 2026
…ger tests

Two Medium findings from codex review:

1. top_k cast bug: `(k as u32).min(100)` truncates k before capping. A
   request with `top_k = 4_294_967_297` (larger than u32::MAX) truncates
   to 1 BEFORE the cap is applied, so the result limit becomes 1, not 100.
   Fixed: `u32::try_from(k.min(100)).unwrap_or(100)` clamps to usize first,
   then narrows safely.

2. Weak tests: test_recall_default_identity only checked length and the
   top hit. Strengthened to compare full ordered note_id+score lists across
   all positions, with all three knobs explicitly set to null.
   test_recall_fusion_strategy_override only validated string acceptance.
   Added a new unit test (fusion_strategy_change_produces_observable_ordering_difference)
   with a deterministic fixture where RRF and Weighted strategies MUST
   produce different orderings — proving the fusion_strategy override
   actually flows into fuse_candidates, not just validation.

Verified: cargo test -p khive-pack-memory --lib passes (62 unit tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ohdearquant ohdearquant force-pushed the slice/v022/05-recall-knobs branch from 61f219c to cc17441 Compare May 25, 2026 15:54
@ohdearquant
Copy link
Copy Markdown
Owner Author

Codex findings addressed

Verdict: REQUEST CHANGES → fixes pushed

Medium 1: top_k cast bug

Was (k as u32).min(100) — truncates k before clamping. A request with top_k = 4_294_967_297 (larger than u32::MAX) would truncate to 1 before the cap. Now u32::try_from(k.min(100)).unwrap_or(100) clamps to usize first, then narrows safely.

Medium 2: Weak tests

test_recall_default_identity — strengthened. Now creates 4 distinct memories, requires ≥2 baseline hits, and compares full ordered (note_id, fused_score) tuples between baseline and explicit-null-knob calls. A regression that shifts ranking with null knobs would fail.

test_recall_fusion_strategy_override — supplemented. Added new unit test fusion_strategy_change_produces_observable_ordering_difference with a deterministic in-memory fixture where RRF (rank-based) and Weighted [0.1, 0.9] (score-based) MUST produce different orderings. Calls fuse_candidates directly and asserts orderings differ. Proves the fusion_strategy override actually flows into fusion, not just validation.

Commits: 5e3abfe (fixes), cc17441 (cargo fmt).

Ready for re-review.

@ohdearquant ohdearquant merged commit f1de332 into main May 25, 2026
3 checks passed
@ohdearquant ohdearquant deleted the slice/v022/05-recall-knobs branch May 25, 2026 17:14
ohdearquant added a commit that referenced this pull request May 25, 2026
#407)

* feat(pack-memory): wire khive-retrieval as recall composer (ADR-011/021)

Route pack-memory's fuse_candidates through khive_retrieval::fuse_search_results,
making khive-retrieval a real consumed facade instead of an orphan crate.

- Add khive-retrieval dep to khive-pack-memory/Cargo.toml
- Replace direct fuse_with_strategy call with retrieval adapter
  (CandidateMeta side-map, HybridConfig builder, FusionStrategy conversion)
- Fix issue #309: resolve --all-features compile failures in khive-retrieval
  (stale SqliteStore imports, missing NodeId/LinkStore imports)
- Add 5 integration tests (3 fusion_surface, 2 pack-memory recall adapter)
- RRF k=1 discriminator test proves strategy propagation (30x score gap)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(pack-memory): expose top_k/fusion_strategy/score_floor knobs on recall (ADR-033 §6)

- Add three optional per-request fields to RecallParams: top_k (usize),
  fusion_strategy (string), and score_floor (f32)
- fusion_strategy validated against {"rrf","weighted","union"}; clear error
  with valid values on invalid input
- top_k overrides the result limit for a single call (capped at 100)
- score_floor applied as a post-filter on the composite score after compute_score
- Add parse_fusion_strategy_str helper; wire override into cfg.fuse_strategy
  before passing to fuse_candidates
- Add 4 integration tests: default_identity, top_k_override,
  fusion_strategy_override (including rejection), score_floor
- Document knobs in ADR-033 §6.1 with table, semantics, and example DSL

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style(adr-033): deno fmt re-pad recall knob table (post-merge cleanup)

* feat(embedding): dual-model registry (MiniLM + paraphrase) per ADR-043

Multi-model embedding support landed across the runtime + storage + memory
stack. Workspace dual-embedding now reachable end-to-end:

khive-runtime:
  - RuntimeConfig.additional_embedding_models: Vec<EmbeddingModel>
  - Replaces single OnceCell<embedder> with HashMap<model_name, embedder>
  - default_embedder_name() + embedder(name) public methods
  - KHIVE_ADDITIONAL_EMBEDDING_MODELS env-var parsing
  - configured_embedding_models() helper enumerates active set

khive-db:
  - V16 migration: add `embedding_model TEXT NOT NULL DEFAULT '<default>'`
    column to vectors table with backfill + composite index
  - VectorStore.insert / search scoped by embedding_model

khive-storage:
  - VectorRecord carries model tag
  - vector search params include model scope

khive-pack-memory:
  - recall + remember accept optional embedding_model arg
  - validation: must be a registered model name

kkernel:
  - engine list now returns real loaded models (no longer empty Vec)
  - engine migrate / drift-check still return not-implemented (#380/#385)

Notes:
- 16 files changed, +582/-138 lines
- Tests rebaselined for V16 (failed_migration_rolls_back tests V17 now;
  store_ddl_then_event_migration_is_idempotent expects V16 head)
- Workspace: cargo build + cargo test + clippy clean + fmt clean

Lattice gap status: N/A — lattice-embed 0.2.4 already exposes both
MiniLM + paraphrase as 384-d local models with EmbeddingRoutingConfig
primitives. khive-runtime now uses these directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(retrieval): correct doctest import — use re-export at crate root

Addresses codex finding on PR #405: `khive_retrieval::hnsw::HnswIndex`
doesn't resolve because `hnsw` is not a public submodule — `HnswIndex` is
re-exported at the crate root (lib.rs:145). The doctest at persist/mod.rs:29
must use the public facade import.

Closes the remaining gap on issue #309 (--all-features doctest failure).

Verified: `RUSTC_WRAPPER= cargo test --offline -p khive-retrieval --all-features --doc` passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(pack-memory): address PR #406 codex findings — top_k cast + stronger tests

Two Medium findings from codex review:

1. top_k cast bug: `(k as u32).min(100)` truncates k before capping. A
   request with `top_k = 4_294_967_297` (larger than u32::MAX) truncates
   to 1 BEFORE the cap is applied, so the result limit becomes 1, not 100.
   Fixed: `u32::try_from(k.min(100)).unwrap_or(100)` clamps to usize first,
   then narrows safely.

2. Weak tests: test_recall_default_identity only checked length and the
   top hit. Strengthened to compare full ordered note_id+score lists across
   all positions, with all three knobs explicitly set to null.
   test_recall_fusion_strategy_override only validated string acceptance.
   Added a new unit test (fusion_strategy_change_produces_observable_ordering_difference)
   with a deterministic fixture where RRF and Weighted strategies MUST
   produce different orderings — proving the fusion_strategy override
   actually flows into fuse_candidates, not just validation.

Verified: cargo test -p khive-pack-memory --lib passes (62 unit tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(embedding): address PR #407 codex findings — ADR amendment + 3 data-safety fixes

Per Ocean direction (2026-05-25): amend ADR-043 to formalize the V16
string-tag schema design, claim V16 in ADR-015 ledger, and fix the
data-safety bugs codex flagged. Defer sqlite-vec data preservation to
a follow-up issue.

ADR amendments:
- ADR-043 §1.1 (vector store column addition): replaces the old FK-based
  description with the actual V16 design (TEXT embedding_model column with
  composite index). Includes rationale for TEXT vs BLOB FK (hot-path join
  cost, end-to-end shape consistency with kkernel/env-vars/registry).
  Documents sqlite-vec rebuild behavior and follow-up.
- ADR-015 schema ledger: V16 row added with cluster-22 amendment notes.

High 3 fix — atomic unknown-model validation (handlers.rs):
- handle_remember now calls resolve_embedding_model(Some(name)) BEFORE
  create_note_with_decay_for_embedding_model. resolve_embedding_model is
  synchronous and doesn't load the model — it only checks registration.
  An unknown model is rejected before any note/FTS/vector row is written.

High 2 fix — scoped delete across all model stores (operations.rs):
- delete_note now iterates over registered_embedding_model_names() (new
  public method on KhiveRuntime) and deletes the note's vector from EVERY
  registered model's vector store. Previously only the default model's
  store was touched, leaving non-default vectors orphaned.

Medium fix — KHIVE_ADDITIONAL_EMBEDDING_MODELS warning on bad names:
- parse_embedding_model_list now logs tracing::warn for non-empty raw
  names that don't parse, instead of silently filtering them out. The
  function still returns a Vec (no startup failure on partial validity),
  but operator typos now surface at startup rather than as UnknownModel
  errors at request time.

Deferred (follow-up issue, see ADR-043 §1.1 final paragraph):
- High 1 — V16 backfill hard-codes 'all-minilm-l6-v2' for all regular
  vec_* tables, and sqlite-vec virtual tables are still dropped-and-rebuilt
  on schema mismatch (data loss for non-default deployments). A copy-with-
  default rebuild path is tracked separately because it requires a careful
  multi-step migration with vec0 INSERT INTO SELECT FROM and a verification
  step. Operators are warned via ADR §1.1 to back up before upgrading.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: cargo fmt --all

* style: deno fmt ADR tables

* fix(test): use single-token query matching all fixture memories

* fix(embedding): codex round 2 — runtime-layer atomicity + ADR internal consistency

Two findings from codex round 2 on PR #407:

1. Medium: runtime-level unknown-model atomicity. handle_remember validated
   embedding_model before calling create_note_with_decay_for_embedding_model
   (round 1 fix), but the runtime API itself was permissive — direct Rust
   callers (other packs, integration tests) would still hit the write-after-
   failure bug.

   Fix: resolve_embedding_model is now called at the start of
   create_note_inner BEFORE any note/FTS/vector write. The pack-handler check
   remains as an earlier error boundary, but atomicity is enforced at the
   lowest layer that performs the write.

2. High: ADR-043 internal inconsistency. The §1.1 amendment landed in round 1,
   but the rest of the ADR still contained normative text for the rejected
   FK-based design:
   - §8 (Backward compat) described embedding_model_id BLOB column + CHECK rebuild
   - Alternatives table rejected per-record model_id (which is what V16 ships)
   - Migration version section claimed V5 + BLOB FK

   Fix:
   - §8 rewritten to describe V14 + V16 (registry + tag column) split that
     actually shipped
   - Alternatives table row strikes-through with "Superseded by V16 (2026-05-25)"
     and points back to §1.1 for rationale
   - Migration version section split into V14 (cluster-20) and V16 (v022-polish)
     with the actual SQL each migration runs

   Codex's other round-1 findings remain addressed:
   - Critical (ADR drift) — now resolved end-to-end across §1.1, §8, alternatives, and §Migration version
   - High 2 (scoped delete) — confirmed by round-2 review at operations.rs:1395, 1408
   - Medium (env var warning) — confirmed at runtime.rs:549, 560

Verified: cargo test -p khive-runtime -p khive-pack-memory --lib passes
(262 + 62 tests), cargo fmt clean, deno fmt clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ci: force re-trigger

* ci: force re-trigger via doc whitespace touch

* style: deno fmt

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
ohdearquant added a commit that referenced this pull request May 25, 2026
* feat(pack-memory): wire khive-retrieval as recall composer (ADR-011/021)

Route pack-memory's fuse_candidates through khive_retrieval::fuse_search_results,
making khive-retrieval a real consumed facade instead of an orphan crate.

- Add khive-retrieval dep to khive-pack-memory/Cargo.toml
- Replace direct fuse_with_strategy call with retrieval adapter
  (CandidateMeta side-map, HybridConfig builder, FusionStrategy conversion)
- Fix issue #309: resolve --all-features compile failures in khive-retrieval
  (stale SqliteStore imports, missing NodeId/LinkStore imports)
- Add 5 integration tests (3 fusion_surface, 2 pack-memory recall adapter)
- RRF k=1 discriminator test proves strategy propagation (30x score gap)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(pack-memory): expose top_k/fusion_strategy/score_floor knobs on recall (ADR-033 §6)

- Add three optional per-request fields to RecallParams: top_k (usize),
  fusion_strategy (string), and score_floor (f32)
- fusion_strategy validated against {"rrf","weighted","union"}; clear error
  with valid values on invalid input
- top_k overrides the result limit for a single call (capped at 100)
- score_floor applied as a post-filter on the composite score after compute_score
- Add parse_fusion_strategy_str helper; wire override into cfg.fuse_strategy
  before passing to fuse_candidates
- Add 4 integration tests: default_identity, top_k_override,
  fusion_strategy_override (including rejection), score_floor
- Document knobs in ADR-033 §6.1 with table, semantics, and example DSL

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style(adr-033): deno fmt re-pad recall knob table (post-merge cleanup)

* feat(embedding): dual-model registry (MiniLM + paraphrase) per ADR-043

Multi-model embedding support landed across the runtime + storage + memory
stack. Workspace dual-embedding now reachable end-to-end:

khive-runtime:
  - RuntimeConfig.additional_embedding_models: Vec<EmbeddingModel>
  - Replaces single OnceCell<embedder> with HashMap<model_name, embedder>
  - default_embedder_name() + embedder(name) public methods
  - KHIVE_ADDITIONAL_EMBEDDING_MODELS env-var parsing
  - configured_embedding_models() helper enumerates active set

khive-db:
  - V16 migration: add `embedding_model TEXT NOT NULL DEFAULT '<default>'`
    column to vectors table with backfill + composite index
  - VectorStore.insert / search scoped by embedding_model

khive-storage:
  - VectorRecord carries model tag
  - vector search params include model scope

khive-pack-memory:
  - recall + remember accept optional embedding_model arg
  - validation: must be a registered model name

kkernel:
  - engine list now returns real loaded models (no longer empty Vec)
  - engine migrate / drift-check still return not-implemented (#380/#385)

Notes:
- 16 files changed, +582/-138 lines
- Tests rebaselined for V16 (failed_migration_rolls_back tests V17 now;
  store_ddl_then_event_migration_is_idempotent expects V16 head)
- Workspace: cargo build + cargo test + clippy clean + fmt clean

Lattice gap status: N/A — lattice-embed 0.2.4 already exposes both
MiniLM + paraphrase as 384-d local models with EmbeddingRoutingConfig
primitives. khive-runtime now uses these directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* tune(recall): grid search infra + PARTIAL default changes

* fix(retrieval): correct doctest import — use re-export at crate root

Addresses codex finding on PR #405: `khive_retrieval::hnsw::HnswIndex`
doesn't resolve because `hnsw` is not a public submodule — `HnswIndex` is
re-exported at the crate root (lib.rs:145). The doctest at persist/mod.rs:29
must use the public facade import.

Closes the remaining gap on issue #309 (--all-features doctest failure).

Verified: `RUSTC_WRAPPER= cargo test --offline -p khive-retrieval --all-features --doc` passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* revert(memory): keep RecallConfig defaults — corpus ceiling made changes unjustified

Codex review on PR #408 flagged that the three default changes from the
tune sweep (temporal_half_life_days 30 to 14, decay exp to hyp, multiplier
20 to 10) were made even though REPORT.md explicitly states the synthetic
eval set produced identical recall@10 = 0.9333 for ALL 116 configs — the
landscape was too flat to discriminate these parameters.

Reverts those three lines back to their prior values. The tuned-config.toml
artifact stays as an experimental record; the grid_search.py infra stays
runnable; only RecallConfig::default() is restored.

A discriminating eval corpus (embed-enabled, synonym queries, partial
matches) is the prerequisite before changing runtime defaults.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(pack-memory): address PR #406 codex findings — top_k cast + stronger tests

Two Medium findings from codex review:

1. top_k cast bug: `(k as u32).min(100)` truncates k before capping. A
   request with `top_k = 4_294_967_297` (larger than u32::MAX) truncates
   to 1 BEFORE the cap is applied, so the result limit becomes 1, not 100.
   Fixed: `u32::try_from(k.min(100)).unwrap_or(100)` clamps to usize first,
   then narrows safely.

2. Weak tests: test_recall_default_identity only checked length and the
   top hit. Strengthened to compare full ordered note_id+score lists across
   all positions, with all three knobs explicitly set to null.
   test_recall_fusion_strategy_override only validated string acceptance.
   Added a new unit test (fusion_strategy_change_produces_observable_ordering_difference)
   with a deterministic fixture where RRF and Weighted strategies MUST
   produce different orderings — proving the fusion_strategy override
   actually flows into fuse_candidates, not just validation.

Verified: cargo test -p khive-pack-memory --lib passes (62 unit tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(embedding): address PR #407 codex findings — ADR amendment + 3 data-safety fixes

Per Ocean direction (2026-05-25): amend ADR-043 to formalize the V16
string-tag schema design, claim V16 in ADR-015 ledger, and fix the
data-safety bugs codex flagged. Defer sqlite-vec data preservation to
a follow-up issue.

ADR amendments:
- ADR-043 §1.1 (vector store column addition): replaces the old FK-based
  description with the actual V16 design (TEXT embedding_model column with
  composite index). Includes rationale for TEXT vs BLOB FK (hot-path join
  cost, end-to-end shape consistency with kkernel/env-vars/registry).
  Documents sqlite-vec rebuild behavior and follow-up.
- ADR-015 schema ledger: V16 row added with cluster-22 amendment notes.

High 3 fix — atomic unknown-model validation (handlers.rs):
- handle_remember now calls resolve_embedding_model(Some(name)) BEFORE
  create_note_with_decay_for_embedding_model. resolve_embedding_model is
  synchronous and doesn't load the model — it only checks registration.
  An unknown model is rejected before any note/FTS/vector row is written.

High 2 fix — scoped delete across all model stores (operations.rs):
- delete_note now iterates over registered_embedding_model_names() (new
  public method on KhiveRuntime) and deletes the note's vector from EVERY
  registered model's vector store. Previously only the default model's
  store was touched, leaving non-default vectors orphaned.

Medium fix — KHIVE_ADDITIONAL_EMBEDDING_MODELS warning on bad names:
- parse_embedding_model_list now logs tracing::warn for non-empty raw
  names that don't parse, instead of silently filtering them out. The
  function still returns a Vec (no startup failure on partial validity),
  but operator typos now surface at startup rather than as UnknownModel
  errors at request time.

Deferred (follow-up issue, see ADR-043 §1.1 final paragraph):
- High 1 — V16 backfill hard-codes 'all-minilm-l6-v2' for all regular
  vec_* tables, and sqlite-vec virtual tables are still dropped-and-rebuilt
  on schema mismatch (data loss for non-default deployments). A copy-with-
  default rebuild path is tracked separately because it requires a careful
  multi-step migration with vec0 INSERT INTO SELECT FROM and a verification
  step. Operators are warned via ADR §1.1 to back up before upgrading.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(tune): document khive_contract dependency + run instructions

Codex Major 2 (PR #408): the tune CLI is not re-runnable from a fresh
clone of this branch because khive_contract is in the parent package
(slice 06, PR #403 — already merged to main). Codex tested slice 08
in isolation and hit ModuleNotFoundError.

Adds tests/khive-contract/tune/README.md explaining:
- Install khive_contract first: uv pip install -e . from
  tests/khive-contract/
- Run via uv run python -m tune --quick (or full)
- Documents the corpus-ceiling limitation surfaced in REPORT.md
- Explicit note that RecallConfig::default() was intentionally NOT
  changed (defaults reverted in d943b7f based on codex Major 1)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: cargo fmt --all

* style: deno fmt ADR tables

* fix(test): use single-token query matching all fixture memories

* fix(test): align partial_config test with reverted default decay_model

* fix(embedding): codex round 2 — runtime-layer atomicity + ADR internal consistency

Two findings from codex round 2 on PR #407:

1. Medium: runtime-level unknown-model atomicity. handle_remember validated
   embedding_model before calling create_note_with_decay_for_embedding_model
   (round 1 fix), but the runtime API itself was permissive — direct Rust
   callers (other packs, integration tests) would still hit the write-after-
   failure bug.

   Fix: resolve_embedding_model is now called at the start of
   create_note_inner BEFORE any note/FTS/vector write. The pack-handler check
   remains as an earlier error boundary, but atomicity is enforced at the
   lowest layer that performs the write.

2. High: ADR-043 internal inconsistency. The §1.1 amendment landed in round 1,
   but the rest of the ADR still contained normative text for the rejected
   FK-based design:
   - §8 (Backward compat) described embedding_model_id BLOB column + CHECK rebuild
   - Alternatives table rejected per-record model_id (which is what V16 ships)
   - Migration version section claimed V5 + BLOB FK

   Fix:
   - §8 rewritten to describe V14 + V16 (registry + tag column) split that
     actually shipped
   - Alternatives table row strikes-through with "Superseded by V16 (2026-05-25)"
     and points back to §1.1 for rationale
   - Migration version section split into V14 (cluster-20) and V16 (v022-polish)
     with the actual SQL each migration runs

   Codex's other round-1 findings remain addressed:
   - Critical (ADR drift) — now resolved end-to-end across §1.1, §8, alternatives, and §Migration version
   - High 2 (scoped delete) — confirmed by round-2 review at operations.rs:1395, 1408
   - Medium (env var warning) — confirmed at runtime.rs:549, 560

Verified: cargo test -p khive-runtime -p khive-pack-memory --lib passes
(262 + 62 tests), cargo fmt clean, deno fmt clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ci: force re-trigger

* ci: force re-trigger via doc whitespace touch

* style: deno fmt

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
ohdearquant added a commit that referenced this pull request May 25, 2026
…c + score normalization (#440)

Five Wave-1 fixes from /tmp/v023-usage-audit-consolidated.md UE3 + ADR-A:

1. remember(source_id=...) accepts 8-char short IDs (Critical) — same prefix
   resolution as get/link/update. Closes recon #291 and the chain
   create->remember workflow.

2. recall(help=true) + remember(help=true) HandlerDef params updated to
   include all PR #406/#421 args (top_k, score_floor, fusion_strategy,
   embedding_model + presentation for verbose breakdown).

3. decay_factor default doc corrected: 0.1 -> 0.01 (10x off).

4. compute_score normalizes RRF-fused scores by (k+1) to comparable range
   with weighted/union so score_floor is portable across fusion_strategy.

5. presentation="verbose" on recall includes per-component score breakdown
   without changing agent-mode response shape.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant