feat(kg): on-demand topic synthesis (close the loop, #44) #557
Draft
cirwel wants to merge 2 commits into
Adds knowledge(action="synthesize"): a periodic/on-demand pass that
compounds discrete discovery rows into rolled-up topic summaries, so a
cross-referenced, compounded narrative exists before query time instead
of only on read (search synthesize=true). This closes the loop the
knowledge-graph skill admits is open ("does not close loops
automatically"), using the GraphRAG community-summary pattern.
Deliberately NOT a per-write hook. Running an LLM synthesis pass on every
store/note across a multi-agent fleet is the auto-checkin-on-every-write
anti-pattern (latency, cost, stale-rollup noise). Synthesis runs like
lint/cleanup and reuses the existing store + LLM-delegation machinery.
Rollups are persisted as ordinary discovery rows (type=topic_rollup,
deterministic id rollup::<topic>), so they upsert in place, are queryable
through normal search, and are lifecycle-managed - with NO schema change.
Falls back to a deterministic narrative when no LLM is reachable.
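As a rough illustration of the in-place upsert (not the code in this PR), the sketch below uses an in-memory SQLite table as a stand-in for the real store. The table name `discoveries`, its columns, and the helper `upsert_rollup` are hypothetical; only the `topic_rollup` type, the `rollup::<topic>` id scheme, and the ON CONFLICT (id) DO UPDATE write path are taken from this PR.

```python
import sqlite3


def rollup_id(topic: str) -> str:
    # Deterministic id: the same topic always maps to the same row.
    return f"rollup::{topic}"


def upsert_rollup(conn: sqlite3.Connection, topic: str, summary: str) -> None:
    # Hypothetical helper: insert the rollup, or overwrite it if the id exists.
    conn.execute(
        """
        INSERT INTO discoveries (id, type, summary)
        VALUES (?, 'topic_rollup', ?)
        ON CONFLICT (id) DO UPDATE SET summary = excluded.summary
        """,
        (rollup_id(topic), summary),
    )


# SQLite stands in for the real database here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE discoveries (id TEXT PRIMARY KEY, type TEXT, summary TEXT)"
)

# Two synthesis runs over the same topic leave exactly one row behind.
upsert_rollup(conn, "auth", "first pass: 3 discoveries")
upsert_rollup(conn, "auth", "second pass: 5 discoveries")

rows = conn.execute("SELECT id, type, summary FROM discoveries").fetchall()
print(rows)
```

Because the id depends only on the topic, a second pass replaces the earlier rollup rather than adding a sibling row, which is what lets rollups reuse the ordinary discovery table.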
- src/mcp_handlers/knowledge/synthesis.py: rollup construction + pass
- handlers.py: handle_synthesize_knowledge_graph + topic_rollup type
- consolidated.py / schemas/knowledge.py: wire the "synthesize" action
- db mixin: kg_topic_candidates read-only aggregate
- skills/knowledge-graph: document synthesize; record temporal-edges
(bi-temporal validity) as a deliberately deferred idea (YAGNI vs the
superseded+created_at 80% substitute), per #44 scope review
- tests/test_kg_synthesis.py: 14 tests (pure logic + orchestration)
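For readers unfamiliar with the candidate step, here is a minimal pure-Python sketch of what a "densest tags" aggregate could look like. It is an assumption-laden stand-in, not the kg_topic_candidates query itself: the function name `topic_candidates`, the dict shape of a discovery, the default threshold of 3, and the reading of `min_members` as a minimum tag count are all hypothetical.

```python
from collections import Counter


def topic_candidates(discoveries, min_members=3):
    """Return (tag, count) pairs for tags carried by at least min_members discoveries."""
    # set() guards against a discovery listing the same tag twice.
    counts = Counter(tag for d in discoveries for tag in set(d["tags"]))
    dense = [(tag, n) for tag, n in counts.items() if n >= min_members]
    # Densest first; ties broken alphabetically so the order is deterministic.
    return sorted(dense, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical discovery rows, reduced to the fields the sketch needs.
discoveries = [
    {"id": "d1", "tags": ["auth", "latency"]},
    {"id": "d2", "tags": ["auth"]},
    {"id": "d3", "tags": ["auth", "latency"]},
    {"id": "d4", "tags": ["cache"]},
]

print(topic_candidates(discoveries, min_members=2))
# [('auth', 3), ('latency', 2)]
```

The aggregate only reads and counts; nothing is written until the synthesis pass turns a candidate into a rollup.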
https://claude.ai/code/session_01Dz2VWo4AxPyE7eoNzHRo9U
✅ Documentation Validation Passed. Tool count: 7 tools. All documentation is synchronized with the codebase.
The consolidated knowledge router now exposes the 'synthesize' action; update the canonical action-list test to match. https://claude.ai/code/session_01Dz2VWo4AxPyE7eoNzHRo9U
Summary
Addresses the two KG gaps surfaced by the LLM-wiki comparison (#44 / PR #45), but scoped down after review rather than building both as proposed:
knowledge(action="synthesize"), a periodic/on-demand pass that compounds discrete discovery rows into rolled-up topic summaries, so a cross-referenced, compounded narrative exists before query time, not only on read via search(..., synthesize=true). This closes the loop the knowledge-graph skill admits is open ("does not close loops automatically"), using the GraphRAG community-summary pattern.
Why synthesis is on-demand, not on-write
The original issue framed this as a "post-write synthesis pass". For a multi-agent fleet that writes constantly, running an LLM pass on every store/note is the auto-checkin-on-every-trivial-write anti-pattern UNITARES lists as a non-goal: per-write latency, LLM cost, and a fresh noise source (stale auto-generated rollups racing live writes). So synthesis runs like cleanup/lint: on demand, or wired to a periodic trigger, reusing the existing store + LLM-delegation machinery. No per-write cost.
Why no schema change
Rollups are persisted as ordinary discovery rows: type="topic_rollup", deterministic id rollup::<topic>, tagged rollup. They upsert in place (compounding across runs, never duplicating) via the existing ON CONFLICT (id) DO UPDATE write path, are queryable through normal search, and are lifecycle-managed like any other discovery. Zero migration. Falls back to a deterministic narrative when no LLM is reachable, so it works headless.
Usage
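The snippet below is a hedged sketch of how argument payloads for the new action might be assembled; it does not reproduce the PR's own usage example. The parameter names (topic, min_members, use_llm) come from the Changes list in this PR, while the helper `synthesize_call`, the example values, and the choice to omit unset parameters are assumptions.

```python
import json


def synthesize_call(topic=None, min_members=None, use_llm=None):
    """Build arguments for knowledge(action="synthesize"), omitting unset params."""
    args = {"action": "synthesize"}
    optional = {"topic": topic, "min_members": min_members, "use_llm": use_llm}
    # Leave out anything the caller did not set so handler defaults apply.
    args.update({k: v for k, v in optional.items() if v is not None})
    return args


# No parameters: whatever defaults the handler applies.
print(json.dumps(synthesize_call()))

# One topic, with LLM use switched off (hypothetical values).
print(json.dumps(synthesize_call(topic="auth", min_members=3, use_llm=False)))
```

Note that use_llm=False is kept in the payload because the filter tests for None rather than truthiness.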
Why Gap 2 (bi-temporal validity) was deferred
A first-class notion of when a fact became true/false (valid-from / valid-to + observation time, per the Graphiti/Zep model) is a substrate-level change: migration, AGE query rewrites, and every read path having to reason about time, all for a payoff (point-in-time reconstruction, automatic invalidation) that is speculative for current usage. The existing superseded status + created_at is an ~80% substitute. The signal to build it is a concrete failure (an agent acting on a stale fact in a way that actually bites). It is recorded as a deliberately deferred idea in skills/knowledge-graph/SKILL.md, not a roadmap item.
Changes
- src/mcp_handlers/knowledge/synthesis.py: rollup construction + the synthesis pass (pure helpers + orchestration)
- src/mcp_handlers/knowledge/handlers.py: handle_synthesize_knowledge_graph; register topic_rollup type
- src/mcp_handlers/consolidated.py, schemas/knowledge.py, __init__.py: wire the synthesize action + params (topic, min_members, use_llm)
- src/db/mixins/knowledge_graph.py: kg_topic_candidates read-only aggregate (densest tags)
- skills/knowledge-graph/SKILL.md: document synthesize; record the temporal-edges deferral
- tests/test_kg_synthesis.py: 14 tests (pure logic + orchestration, no live DB/LLM)
Testing
test_knowledge_graph_*, test_migration_registry_versions, test_kg_*, test_decorators, test_tool_stability, test_handler_registry, test_pydantic_schemas, test_extracted_handlers, test_admin_handlers, test_remaining_modules, test_tool_modes: all green).
https://claude.ai/code/session_01Dz2VWo4AxPyE7eoNzHRo9U
Generated by Claude Code