feat(salience): per-session entity-salience retrieval reflex #239
Merged
Add an in-memory per-session ring buffer of caller queries and a substring/FTS entity-salience pass that attaches _meta.vouch_salience to read responses. Config-gated via retrieval.reflex; zero LLM calls; per-session, never persisted.
feat(salience): per-session entity-salience reflex (#223)
What
Adds a server-side entity-salience retrieval reflex. Each session keeps an
in-memory ring buffer of the caller's recent query strings. On every read, a
zero-LLM entity pass (substring match on entity name/aliases + FTS via the
existing index) runs over the buffered queries and attaches the top-K matched
entities — with the claims that reference them — as
_meta.vouch_salience on the read response.
New module
src/vouch/salience.py:
- record_query(session_id, query, *, window=8): appends to the session's bounded deque ring buffer (in-memory only, never written to disk).
- compute_salience(store, session_id, *, top_k=3): ranks entities by how many buffered queries match (substring + FTS), returning {"entity_id", "claim_count", "top_claim_id"} per top-K entity.
- attach_salience(result, store, session_id, cfg): sets result["_meta"]["vouch_salience"] when the reflex is enabled, a session_id is present, and the buffer is non-empty; otherwise a no-op.
- reset_session(session_id): clears a session's buffer.
Wiring:
- _h_context (JSONL) and kb_context (MCP) now accept session_id; when given, they call record_query(...) then attach_salience(...) before returning.
- sessions.session_end calls reset_session(...) so per-session state is dropped on session close.
Config (read defensively from .vouch/config.yaml, no pydantic model):
- retrieval.reflex.enabled (default True)
- retrieval.reflex.window (default 8)
- retrieval.reflex.top_k (default 3)
Why
Agents repeatedly re-derive the same context across a session. A cheap,
substring/FTS-only reflex surfaces the claim candidates the caller is most
likely about to need — no extra LLM calls, no disk writes, scoped strictly
per session. It rides the existing read path so callers get it for free by
passing session_id.
Test plan
New tests/test_salience.py covers:
- record-then-compute highlights the entity;
- attach_salience adds _meta.vouch_salience when enabled;
- the JSONL handler attaches salience on a subsequent context call;
- retrieval.reflex.enabled: false omits the field;
- a stateless call (no session_id) omits the field;
- reset_session and session_end clear the buffer;
- the window bounds the buffer.
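The window-bound case depends on how a bounded deque evicts old entries. The stand-alone check below illustrates that property. It is not taken from tests/test_salience.py, and the test name and query strings are made up for illustration.

```python
from collections import deque


def test_window_bounds_buffer():
    # A deque with maxlen silently discards the oldest item once it is full,
    # which is what keeps a per-session query buffer bounded.
    window = 3
    buf = deque(maxlen=window)
    for i in range(5):
        buf.append(f"query {i}")
    assert len(buf) == window
    assert list(buf) == ["query 2", "query 3", "query 4"]
```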
Constraints verified: zero LLM calls; per-session only; buffer never persisted.
Closes #223