feat: RAG long-term memory with vector embeddings #71

@nickmeinhold

Long-term Memory via RAG

Inspired by Dreamfinder's memory system.

Currently Gremlin has a 10-message sliding window and a 30-minute conversation TTL. Important context just... evaporates. This adds persistent long-term memory with semantic retrieval.

How it works

  1. Embedding pipeline — As conversations happen (or during the dream cycle), important messages/decisions/context get embedded via Voyage AI and stored in the DB with their vector representations.
  2. Memory retriever — On each new message, retrieve the top-N semantically similar memories using cosine similarity and inject them into the system prompt context window.
  3. Memory consolidator — Periodically merge/deduplicate similar memories to avoid bloat.
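
Steps 2 and 3 could be sketched roughly as follows, assuming embeddings have already been computed by whichever provider is in use. The function names, the in-memory `(content, embedding)` tuples, and the 0.95 deduplication threshold are illustrative assumptions, not taken from Gremlin or Dreamfinder.

```python
import math
from typing import List, Sequence, Tuple

Memory = Tuple[str, Sequence[float]]  # (content, embedding); hypothetical shape


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n_memories(
    query: Sequence[float], memories: List[Memory], n: int = 5
) -> List[Tuple[float, str]]:
    """Step 2: score every memory against the query and return the best n."""
    scored = [(cosine_similarity(query, emb), content) for content, emb in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


def consolidate(memories: List[Memory], threshold: float = 0.95) -> List[Memory]:
    """Step 3 (greedy dedupe): keep a memory only if it is not a
    near-duplicate of one already kept. The threshold is an assumed value."""
    kept: List[Memory] = []
    for content, emb in memories:
        if all(cosine_similarity(emb, k_emb) < threshold for _, k_emb in kept):
            kept.append((content, emb))
    return kept
```

This version drops near-duplicates rather than merging their text; a real consolidator might instead summarise each cluster of similar memories into one entry.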

What gets remembered

  • Decisions made in chat
  • Task assignments and outcomes
  • Team preferences and conventions
  • Recurring topics and project context
  • Anything the dream cycle explicitly files as worth remembering

DB additions needed

  • memories table: id, chat_id, content, embedding (blob/json), created_at, source (conversation/dream), similarity score metadata
  • Index on chat_id for fast retrieval
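
One possible shape for the table, sketched with Python's built-in `sqlite3` module. The column types, the `CHECK` constraint on `source`, the index name, and the `init_memory_db` helper are assumptions; the similarity-score metadata field is left out because its intended shape isn't specified here.

```python
import sqlite3

# Assumed column types; the issue only names the columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    chat_id    TEXT NOT NULL,
    content    TEXT NOT NULL,
    embedding  BLOB NOT NULL,
    source     TEXT NOT NULL CHECK (source IN ('conversation', 'dream')),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS idx_memories_chat_id ON memories (chat_id);
"""


def init_memory_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and ensure the memories table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```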

System prompt injection

Retrieved memories are injected as a ## Long-term Memory section near the top of the system prompt, formatted as a concise bullet list. Each message gets roughly the top 5 most relevant memories.
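
A minimal sketch of the injection step. The `inject_memories` name is hypothetical, and putting the section at the very top of the prompt is an assumption ("near the top" could equally mean after an identity preamble).

```python
from typing import Sequence


def inject_memories(
    system_prompt: str, memories: Sequence[str], limit: int = 5
) -> str:
    """Prepend a '## Long-term Memory' bullet list to the system prompt.

    `memories` is expected to be ordered most relevant first; only the
    first `limit` entries are used. With no memories the prompt is unchanged.
    """
    if not memories:
        return system_prompt
    bullets = "\n".join(f"- {m.strip()}" for m in list(memories)[:limit])
    section = f"## Long-term Memory\n{bullets}"
    return f"{section}\n\n{system_prompt}"
```

Returning the prompt untouched when nothing is retrieved avoids an empty heading in cold-start chats.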

Dependencies

  • Voyage AI API key (VOYAGE_API_KEY env var) — or swap for any embeddings provider (OpenAI, local, etc.)
  • Vector similarity search (cosine, done in-process over SQLite blob storage — no need for a vector DB at this scale)
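
The in-process search over SQLite blobs might look something like this. It assumes embeddings are stored as little-endian float32 arrays and that a `memories` table with `chat_id`, `content`, and `embedding` columns exists; the helper names are illustrative.

```python
import math
import sqlite3
import struct
from typing import List, Sequence, Tuple


def pack_embedding(vec: Sequence[float]) -> bytes:
    """Serialise a vector as little-endian float32 for a BLOB column."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_embedding(blob: bytes) -> List[float]:
    """Inverse of pack_embedding (4 bytes per float32 component)."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def search(
    conn: sqlite3.Connection, chat_id: str, query: Sequence[float], n: int = 5
) -> List[Tuple[float, str]]:
    """Brute-force cosine search over one chat's memories, best match first."""
    q_norm = math.sqrt(sum(x * x for x in query))
    scored: List[Tuple[float, str]] = []
    rows = conn.execute(
        "SELECT content, embedding FROM memories WHERE chat_id = ?", (chat_id,)
    )
    for content, blob in rows:
        vec = unpack_embedding(blob)
        v_norm = math.sqrt(sum(x * x for x in vec))
        if q_norm == 0.0 or v_norm == 0.0:
            continue  # skip degenerate vectors rather than divide by zero
        dot = sum(x * y for x, y in zip(query, vec))
        scored.append((dot / (q_norm * v_norm), content))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]
```

A linear scan like this reads every row for the chat on each message, so it fits the small-scale assumption above and would need revisiting if the table grows large.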

Why this is good

Right now every conversation starts cold. With RAG memory, Gremlin actually knows the team over time — remembers that @Thinkerer prefers tasks in a specific format, that the auth system has a known quirk, that sprint planning always happens Tuesday. It gets smarter the longer it runs.


Reference implementation: lib/src/memory/ in Dreamfinder (Dart). Uses Voyage AI embeddings + cosine similarity retrieval.

Labels: enhancement
