[Feature] Semantic compression — rolling summaries and project digests #9

@rahilp

Why

Per-query synthesis already ships in recall (v1.4.0, issue #59), which solved the "dumb retrieval" problem for single queries. What's still missing is the write-back loop: when a tag accumulates dozens of raw entries, nothing compresses them. Users returning to a heavily used tag get 15 raw memories instead of one clean synthesis. Retrieval quality is addressed, but only ephemerally at query time; storage bloat is not addressed at all.

What this covers

  • Rolling summaries: a nightly cron job identifies tags with >20 entries and compresses related clusters into a single synthesized entry, preserving source IDs in metadata as provenance links. Originals are not deleted.
  • /digest?tag=X endpoint: on-demand synthesis of everything stored under a given tag, returning a structured "state of the world" paragraph rather than raw entries. Useful for project handoffs, onboarding a new AI session, or sharing project context.
  • synthesized tag convention: summary entries are tagged synthesized and linked back to source IDs so callers can distinguish generated summaries from raw memories.
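
A rough sketch of how the three pieces above could fit together, to make the proposal concrete. Everything named here is an assumption for illustration, not the project's actual API: the `Memory` record, the in-memory `store` list, the `summarize` placeholder (standing in for whatever LLM call does the synthesis), and the `source_ids` metadata key. Clustering is skipped, so each tag is treated as a single cluster.

```python
from collections import defaultdict
from dataclasses import dataclass, field

SYNTHESIZED_TAG = "synthesized"  # tag convention proposed in this issue
ENTRY_THRESHOLD = 20             # compress tags with more than 20 raw entries


@dataclass
class Memory:
    """Hypothetical stored entry; the real schema may differ."""
    id: str
    text: str
    tags: set[str]
    metadata: dict = field(default_factory=dict)


def summarize(texts: list[str]) -> str:
    """Placeholder for the synthesis step (an LLM call in practice)."""
    return " / ".join(t[:40] for t in texts)


def raw_entries(store: list[Memory], tag: str) -> list[Memory]:
    """Entries under `tag` that are not themselves generated summaries."""
    return [m for m in store if tag in m.tags and SYNTHESIZED_TAG not in m.tags]


def tags_needing_compression(store: list[Memory],
                             threshold: int = ENTRY_THRESHOLD) -> list[str]:
    """Tags whose raw (non-synthesized) entry count exceeds the threshold."""
    counts: dict[str, int] = defaultdict(int)
    for m in store:
        if SYNTHESIZED_TAG in m.tags:
            continue
        for t in m.tags:
            counts[t] += 1
    return sorted(t for t, n in counts.items() if n > threshold)


def compress_tag(store: list[Memory], tag: str) -> Memory:
    """Write one synthesized entry back to storage; originals are kept."""
    sources = raw_entries(store, tag)
    summary = Memory(
        id=f"summary-{tag}-{len(store)}",
        text=summarize([m.text for m in sources]),
        tags={tag, SYNTHESIZED_TAG},
        metadata={"source_ids": [m.id for m in sources]},  # provenance links
    )
    store.append(summary)
    return summary


def digest(store: list[Memory], tag: str) -> dict:
    """On-demand synthesis for a /digest?tag=X style request (no write-back)."""
    sources = raw_entries(store, tag)
    return {
        "tag": tag,
        "digest": summarize([m.text for m in sources]),
        "source_ids": [m.id for m in sources],
    }
```

The nightly job would then amount to calling `compress_tag` for each tag returned by `tags_needing_compression`. Because originals are not deleted, a tag stays over the threshold after compression, so a real job would also need some way to track which source IDs an existing summary already covers.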

Out of scope (already shipped)

Per-query synthesis on recall responses (issue #59, v1.4.0). This issue is about writing synthesized entries back to storage, not enriching retrieval responses.

Implementation notes
