diff --git a/demo/gist-memory/README.md b/demo/gist-memory/README.md new file mode 100644 index 0000000..f218426 --- /dev/null +++ b/demo/gist-memory/README.md @@ -0,0 +1,69 @@ +# Gist Memory Demo — Buried Preference Recall + +This demo shows how purpose-directed gisting enables an AI agent to recall user preferences that were briefly mentioned during an unrelated conversation — a task that generic summarization and standard RAG both fail at. + +## The Problem + +When a user buries a preference in a long, topically unrelated conversation, generic approaches fail: +- **Topic-preserving summarization** discards the preference as low-salience noise +- **Standard RAG** dilutes the preference signal in full-passage embeddings dominated by the conversation's main topic + +Purpose-directed gisting solves this by compressing conversations specifically to foreground user attributes. + +## Setup + +### Option A: Kaizen Lite (Claude Code Plugin) + +```bash +# Install the plugin +claude --plugin-dir /path/to/kaizen/platform-integrations/claude/plugins/kaizen-lite +``` + +### Option B: Full Kaizen (MCP Server) + +```bash +# Start the MCP server +uv run fastmcp run kaizen/frontend/mcp/mcp_server.py --transport sse --port 8201 +``` + +## Demo Script + +### Session 1: Preference Embedding + +Have a multi-turn conversation about an unrelated technical topic. Bury a preference in one of the messages. + +See [session1_script.md](session1_script.md) for the full conversation script. + +**Key message (message 5 of 12):** +> "That makes sense about the CNI plugin architecture. By the way, I strongly prefer Python over R for all my data analysis work — I find pandas much more intuitive than tidyverse. Anyway, back to the networking question — how does Cilium handle network policy enforcement?" + +The preference ("Python over R", "pandas over tidyverse") is <5% of the total conversation content. 
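The "<5%" figure above can be sanity-checked with a rough character count — a sketch, assuming an average of ~800 characters per message across the 12-message script:

```python
# Rough sanity check of the "<5%" figure: length of the buried
# preference relative to the whole conversation (character-based,
# assuming ~800 characters per message across 12 messages).
preference = (
    "By the way, I strongly prefer Python over R for all my data "
    "analysis work — I find pandas much more intuitive than tidyverse."
)
total_chars = 12 * 800  # assumed average message length
fraction = len(preference) / total_chars
print(f"preference is {fraction:.1%} of the conversation")
```

Even with much shorter messages, the preference stays well under the 5% mark.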
+ +**At end of session:** +- **Lite path:** Run `/kaizen:gist` +- **MCP path:** Call `store_gist` with the conversation JSON + +**Expected gist output:** +``` +user prefers Python over R for data analysis; finds pandas more intuitive than tidyverse; works with Kubernetes networking (Cilium, CNI plugins) +``` + +Note how the gist foregrounds the Python/pandas preference despite it being a tiny fraction of the conversation. + +### Session 2: Preference Recall + +Start a new session and ask: + +> "I need to start a new data analysis project working with network telemetry data. What language and tools would you recommend I use?" + +**With gist memory:** Claude recommends Python and pandas, citing your stated preference. + +**Without gist memory:** Claude gives a generic recommendation (likely mentioning both Python and R, or asking about your preference). + +See [session2_script.md](session2_script.md) for the verification prompts. + +## What to Look For + +1. **Gist content:** Does the gist capture the Python/pandas preference despite it being buried? +2. **Recall accuracy:** In Session 2, does the agent correctly apply the preference? +3. **A/B contrast:** Run Session 2 without gist memory to see the failure mode. diff --git a/demo/gist-memory/session1_script.md b/demo/gist-memory/session1_script.md new file mode 100644 index 0000000..966bb2a --- /dev/null +++ b/demo/gist-memory/session1_script.md @@ -0,0 +1,62 @@ +# Session 1: Preference Embedding + +Use these messages in order. The buried preference is in **Message 5**. + +--- + +**Message 1 (User):** +> I'm trying to understand how Kubernetes handles pod-to-pod networking across nodes. Can you explain how the CNI plugin architecture works? + +**Message 2 (Assistant):** +> [Detailed explanation of CNI plugin architecture, pod networking, veth pairs, bridge interfaces, etc.] + +**Message 3 (User):** +> That's helpful. How does this differ between Calico and Cilium? I've heard Cilium uses eBPF instead of iptables. 
+ +**Message 4 (Assistant):** +> [Explanation comparing Calico's iptables-based approach vs Cilium's eBPF dataplane, performance characteristics, etc.] + +**Message 5 (User) — THE BURIED PREFERENCE:** +> That makes sense about the CNI plugin architecture. By the way, I strongly prefer Python over R for all my data analysis work — I find pandas much more intuitive than tidyverse. Anyway, back to the networking question — how does Cilium handle network policy enforcement at the kernel level? + +**Message 6 (Assistant):** +> [Explanation of Cilium's eBPF-based network policy enforcement, kernel-level packet filtering, etc.] + +**Message 7 (User):** +> What about service mesh integration? Does Cilium replace the need for something like Istio? + +**Message 8 (Assistant):** +> [Discussion of Cilium service mesh capabilities vs Istio, sidecar-free model, etc.] + +**Message 9 (User):** +> I'm also curious about network observability. What tools do you recommend for monitoring pod-to-pod traffic patterns in a large cluster? + +**Message 10 (Assistant):** +> [Recommendations for Hubble, Pixie, Grafana with Cilium metrics, etc.] + +**Message 11 (User):** +> Great, this has been really helpful. One last question — how do I troubleshoot DNS resolution failures in pods? I've been seeing intermittent CoreDNS timeouts. + +**Message 12 (Assistant):** +> [DNS troubleshooting guidance for CoreDNS, ndots settings, etc.] 
+ +--- + +## After the conversation + +**Kaizen Lite:** Run `/kaizen:gist` + +**Full Kaizen (MCP):** +```bash +# Store the conversation as a gist +curl -X POST http://localhost:8201/tools/store_gist \ + -H "Content-Type: application/json" \ + -d '{"conversation_data": "", "conversation_id": "demo-session-1"}' +``` + +## Expected Gist Output + +The gist should surface the buried preference: +``` +user prefers Python over R for data analysis; finds pandas more intuitive than tidyverse; works with Kubernetes networking; troubleshooting CoreDNS; large cluster environment +``` diff --git a/demo/gist-memory/session2_script.md b/demo/gist-memory/session2_script.md new file mode 100644 index 0000000..89fe7e3 --- /dev/null +++ b/demo/gist-memory/session2_script.md @@ -0,0 +1,45 @@ +# Session 2: Preference Recall Verification + +Start a **new session** (no conversation history from Session 1). The gist from Session 1 should be automatically injected via the recall hook. + +--- + +## Primary Verification Prompt + +> I need to start a new data analysis project working with network telemetry data. What language and tools would you recommend I use? + +### Expected Response WITH Gist Memory + +The agent should recommend **Python and pandas**, referencing your known preference. Example: + +> "Based on your preference for Python and pandas, I'd recommend using Python with pandas for the data analysis..." + +### Expected Response WITHOUT Gist Memory + +The agent gives a **generic recommendation** — likely mentioning both Python and R as options, or asking about your preference: + +> "For network telemetry data analysis, popular options include Python (with pandas/numpy) or R (with tidyverse). Which do you prefer?" + +--- + +## Additional Verification Prompts + +These test whether the gist captured other signals: + +**Prompt 2:** +> What's my background — do you know what kind of infrastructure I work with? 
+ +Expected (with gist): Mentions Kubernetes, container networking, cluster operations. + +**Prompt 3:** +> If I need to do some quick data wrangling, which library should I reach for? + +Expected (with gist): Recommends pandas specifically (not tidyverse or dplyr). + +--- + +## Running the A/B Comparison + +1. **With gist memory:** Ensure the gist entity from Session 1 exists in `.kaizen/entities/gist/` (Lite) or in the MCP backend +2. **Without gist memory:** Temporarily rename/remove the gist entity, or use a clean project directory +3. Run each verification prompt in both conditions and compare responses diff --git a/kaizen/config/kaizen.py b/kaizen/config/kaizen.py index a52c513..ab14d88 100644 --- a/kaizen/config/kaizen.py +++ b/kaizen/config/kaizen.py @@ -8,6 +8,8 @@ class KaizenConfig(BaseSettings): namespace_id: str = "kaizen" settings: BaseSettings | None = None clustering_threshold: float = 0.80 + gist_context_budget: int = 64000 + gist_trigger_interval: int = 5 # to reload settings call kaizen_config.__init__() diff --git a/kaizen/config/llm.py b/kaizen/config/llm.py index c30c3e1..0396d9c 100644 --- a/kaizen/config/llm.py +++ b/kaizen/config/llm.py @@ -25,6 +25,7 @@ class LLMSettings(BaseSettings): tips_model: str = Field(default_factory=_default_model_name) conflict_resolution_model: str = Field(default_factory=_default_model_name) fact_extraction_model: str = Field(default_factory=_default_model_name) + gist_model: str = Field(default_factory=_default_model_name) categorization_mode: Literal["predefined", "dynamic", "hybrid"] = "predefined" allow_dynamic_categories: bool = False confirm_new_categories: bool = False diff --git a/kaizen/frontend/client/kaizen_client.py b/kaizen/frontend/client/kaizen_client.py index 4302536..43e1704 100644 --- a/kaizen/frontend/client/kaizen_client.py +++ b/kaizen/frontend/client/kaizen_client.py @@ -1,9 +1,11 @@ import logging +import uuid from typing import Any from kaizen.backend.base import BaseEntityBackend from 
kaizen.config.kaizen import KaizenConfig from kaizen.llm.fact_extraction.fact_extraction import ExtractedFact, extract_facts_from_messages +from kaizen.llm.gist.gist import generate_gist from kaizen.schema.conflict_resolution import EntityUpdate from kaizen.schema.core import Entity, Namespace, RecordedEntity from kaizen.schema.exceptions import NamespaceAlreadyExistsException, NamespaceNotFoundException @@ -295,3 +297,135 @@ def retrieve_user_facts( ) return categorized_preferences + + # ── Gist memory ────────────────────────────────────────────────── + + def store_gists( + self, + namespace_id: str, + messages: list[dict], + conversation_id: str | None = None, + metadata: dict[str, Any] | None = None, + ) -> list[EntityUpdate]: + """Generate purpose-directed gists from conversation messages and store them. + + Implements rolling consolidation: deletes any existing gists for the same + conversation_id before storing new ones, so the latest gist always reflects + the full session. + """ + if not messages: + return [] + + conversation_id = conversation_id or str(uuid.uuid4()) + self.ensure_namespace(namespace_id) + + # Delete existing gists for this conversation (rolling replacement) + existing = self.search_entities( + namespace_id=namespace_id, + query=None, + filters={"type": "gist", "metadata.conversation_id": conversation_id}, + limit=100, + ) + for entity in existing: + try: + self.delete_entity_by_id(namespace_id, entity.id) + except Exception: + logger.warning("Failed to delete old gist %s during rolling replacement", entity.id, exc_info=True) + + # Generate gists + result = generate_gist(messages, conversation_id=conversation_id) + + if not result.gists: + return [] + + # Store gist entities + base_metadata: dict[str, Any] = dict(metadata or {}) + base_metadata["conversation_id"] = conversation_id + base_metadata["message_count"] = result.message_count + + gist_entities = [] + for i, gist_text in enumerate(result.gists): + gist_metadata = 
dict(base_metadata) + gist_metadata["chunk_index"] = i + gist_metadata["chunk_count"] = result.chunk_count + gist_entities.append(Entity(type="gist", content=gist_text, metadata=gist_metadata)) + + updates = self.update_entities(namespace_id, gist_entities, enable_conflict_resolution=False) + + # Store original messages as gist_source for durable retrieval + source_entities = [] + for i, msg in enumerate(messages): + content = msg.get("content", "") + if isinstance(content, list): + content = str(content) + source_entities.append( + Entity( + type="gist_source", + content=content, + metadata={ + "conversation_id": conversation_id, + "message_index": i, + "role": msg.get("role", "unknown"), + }, + ) + ) + + if source_entities: + # Delete existing sources for this conversation first + existing_sources = self.search_entities( + namespace_id=namespace_id, + query=None, + filters={"type": "gist_source", "metadata.conversation_id": conversation_id}, + limit=1000, + ) + for entity in existing_sources: + try: + self.delete_entity_by_id(namespace_id, entity.id) + except Exception: + logger.warning("Failed to delete old gist_source %s", entity.id, exc_info=True) + + self.update_entities(namespace_id, source_entities, enable_conflict_resolution=False) + + return updates + + def retrieve_gists( + self, + namespace_id: str, + query: str, + limit: int = 10, + ) -> list[RecordedEntity]: + """Retrieve gists relevant to a query via semantic search.""" + if not self.namespace_exists(namespace_id): + return [] + return self.search_entities( + namespace_id=namespace_id, + query=query, + filters={"type": "gist"}, + limit=limit, + ) + + def retrieve_gist_with_source( + self, + namespace_id: str, + query: str, + limit: int = 3, + ) -> list[dict[str, Any]]: + """Retrieve gists with their original source messages. + + Returns a list of dicts, each with 'gist' (RecordedEntity) and + 'source_messages' (list[RecordedEntity]) keys. 
+ """ + gists = self.retrieve_gists(namespace_id, query=query, limit=limit) + results = [] + for gist in gists: + conversation_id = (gist.metadata or {}).get("conversation_id") + source_messages: list[RecordedEntity] = [] + if conversation_id: + source_messages = self.search_entities( + namespace_id=namespace_id, + query=None, + filters={"type": "gist_source", "metadata.conversation_id": conversation_id}, + limit=100, + ) + results.append({"gist": gist, "source_messages": source_messages}) + return results diff --git a/kaizen/frontend/mcp/mcp_server.py b/kaizen/frontend/mcp/mcp_server.py index fdbd65f..4c5b2ab 100644 --- a/kaizen/frontend/mcp/mcp_server.py +++ b/kaizen/frontend/mcp/mcp_server.py @@ -242,3 +242,71 @@ def delete_entity(entity_id: str) -> str: except KaizenException as e: logger.exception(f"Error deleting entity {entity_id}: {str(e)}") return json.dumps({"success": False, "error": str(e)}) + + +@mcp.tool() +def store_gist(conversation_data: str, conversation_id: str | None = None) -> str: + """ + Generate purpose-directed gists from a conversation and store them. + Gists are compressed representations optimized for answering questions about the user. + Uses rolling consolidation: re-calling with the same conversation_id replaces previous gists. + + Args: + conversation_data: A JSON formatted list of conversation messages (each with 'role' and 'content'). + conversation_id: Optional identifier for the conversation. If not provided, a UUID is generated. + + Returns: + JSON string with stored gist details. 
+ """ + logger.info("Storing gist for conversation") + try: + messages = json.loads(conversation_data) + conversation_id = conversation_id or str(uuid.uuid4()) + + updates = get_client().store_gists( + namespace_id=kaizen_config.namespace_id, + messages=messages, + conversation_id=conversation_id, + ) + + return json.dumps({ + "success": True, + "conversation_id": conversation_id, + "gists_stored": len(updates), + "gists": [{"id": u.id, "content": u.content} for u in updates], + }) + except Exception as e: + logger.exception(f"Error storing gist: {e}") + return json.dumps({"success": False, "error": str(e)}) + + +@mcp.tool() +def get_gists(query: str, limit: int = 10) -> str: + """ + Retrieve stored conversation gists relevant to a query. + Gists are purpose-directed compressed representations of past conversations, + optimized for answering questions about the user. + + Args: + query: A description or question to search for relevant gists. + limit: Maximum number of gists to return. Defaults to 10. + + Returns: + Formatted string with relevant gists. + """ + logger.info(f"Getting gists for query: {query}") + results = get_client().retrieve_gists( + namespace_id=kaizen_config.namespace_id, + query=query, + limit=limit, + ) + + if not results: + return "No relevant gists found." + + response_lines = [f"# Conversation Gists for: {query}\n"] + for i, entity in enumerate(results, 1): + conversation_id = (entity.metadata or {}).get("conversation_id", "unknown") + response_lines.append(f"{i}. 
[conversation:{conversation_id}] {entity.content}") + + return "\n".join(response_lines) diff --git a/kaizen/llm/gist/__init__.py b/kaizen/llm/gist/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/kaizen/llm/gist/gist.py b/kaizen/llm/gist/gist.py new file mode 100644 index 0000000..e1f4d7e --- /dev/null +++ b/kaizen/llm/gist/gist.py @@ -0,0 +1,138 @@ +import json +import logging +from pathlib import Path + +import litellm +from jinja2 import Template +from litellm import completion, get_supported_openai_params, supports_response_schema + +from kaizen.config.kaizen import kaizen_config +from kaizen.config.llm import llm_settings +from kaizen.schema.gist import GistResponse, GistResult +from kaizen.utils.utils import clean_llm_response + +logger = logging.getLogger(__name__) + +_PROMPT_TEMPLATE = Template((Path(__file__).parent / "prompts/generate_gist.jinja2").read_text()) + + +def _estimate_tokens(text: str) -> int: + """Rough token estimate: ~4 chars per token.""" + return len(text) // 4 + + +def _chunk_messages(messages: list[dict], context_budget: int) -> list[list[dict]]: + """Split messages into chunks that fit within the context budget. + + Returns a list of message chunks. Most sessions will produce a single chunk. 
+ """ + chunks: list[list[dict]] = [] + current_chunk: list[dict] = [] + current_tokens = 0 + + # Reserve tokens for prompt template + response + available_tokens = context_budget - 2000 + + for message in messages: + content = message.get("content", "") + if isinstance(content, list): + content = str(content) + msg_tokens = _estimate_tokens(str(content)) + + if current_chunk and (current_tokens + msg_tokens) > available_tokens: + chunks.append(current_chunk) + current_chunk = [] + current_tokens = 0 + + current_chunk.append(message) + current_tokens += msg_tokens + + if current_chunk: + chunks.append(current_chunk) + + return chunks + + +def _generate_single_gist(messages: list[dict], constrained_decoding_supported: bool) -> str | None: + """Generate a gist for a single chunk of messages. Returns the gist string or None on failure.""" + prompt = _PROMPT_TEMPLATE.render( + messages=messages, + constrained_decoding_supported=constrained_decoding_supported, + ) + + last_error = None + for _ in range(3): + try: + if constrained_decoding_supported: + litellm.enable_json_schema_validation = True + raw = ( + completion( + model=llm_settings.gist_model, + messages=[{"role": "user", "content": prompt}], + response_format=GistResponse, + custom_llm_provider=llm_settings.custom_llm_provider, + ) + .choices[0] + .message.content + ) + else: + litellm.enable_json_schema_validation = False + raw = ( + completion( + model=llm_settings.gist_model, + messages=[{"role": "user", "content": prompt}], + custom_llm_provider=llm_settings.custom_llm_provider, + ) + .choices[0] + .message.content + ) + raw = clean_llm_response(raw) + + if not raw: + logger.warning("LLM returned empty response for gist generation.") + return None + + parsed = GistResponse.model_validate(json.loads(raw)) + return parsed.gist + except Exception as exc: + last_error = exc + continue + + logger.warning(f"Failed to generate gist after 3 attempts: {last_error}") + return None + + +def generate_gist(messages: 
list[dict], conversation_id: str | None = None) -> GistResult: + """Generate purpose-directed gists from conversation messages. + + Messages are chunked based on the context budget. Each chunk produces one gist. + Most sessions fit in a single chunk, producing one consolidated gist. + """ + if not messages: + return GistResult(gists=[], conversation_id=conversation_id, message_count=0, chunk_count=0) + + supported_params = get_supported_openai_params( + model=llm_settings.gist_model, + custom_llm_provider=llm_settings.custom_llm_provider, + ) + supports_response_format = supported_params and "response_format" in supported_params + response_schema_enabled = supports_response_schema( + model=llm_settings.gist_model, + custom_llm_provider=llm_settings.custom_llm_provider, + ) + constrained_decoding_supported = supports_response_format and response_schema_enabled + + chunks = _chunk_messages(messages, kaizen_config.gist_context_budget) + gists: list[str] = [] + + for chunk in chunks: + gist = _generate_single_gist(chunk, constrained_decoding_supported) + if gist and gist != "no user signal": + gists.append(gist) + + return GistResult( + gists=gists, + conversation_id=conversation_id, + message_count=len(messages), + chunk_count=len(chunks), + ) diff --git a/kaizen/llm/gist/prompts/generate_gist.jinja2 b/kaizen/llm/gist/prompts/generate_gist.jinja2 new file mode 100644 index 0000000..fd5a708 --- /dev/null +++ b/kaizen/llm/gist/prompts/generate_gist.jinja2 @@ -0,0 +1,18 @@ +You are given messages from a conversation between a user and an AI agent. +Please create a gist of the conversation in a way that can be used to answer questions about the user. +The gist will be stored in a vector database and used to answer questions about the user. +Therefore, it can contain phrases and keywords and does not have to have complete sentences. +The gist is not intended to be read by humans. +You do not have to explain your reasoning. Just give me the gist. 
+If there is nothing notable about the user in the conversation, return "no user signal". +If you are not able to shorten the conversation, just give me the original messages. + +{% if not constrained_decoding_supported %} +Respond with a JSON object with a single key "gist" containing the gist string. +Do not include any other text or explanation outside the JSON. +{% endif %} + +Conversation: +{% for message in messages %} +{{ message.role }}: {{ message.content }} +{% endfor %} diff --git a/kaizen/schema/gist.py b/kaizen/schema/gist.py new file mode 100644 index 0000000..a4bd15f --- /dev/null +++ b/kaizen/schema/gist.py @@ -0,0 +1,19 @@ +from dataclasses import dataclass + +from pydantic import BaseModel, Field + + +class GistResponse(BaseModel): + """LLM response schema for gist generation.""" + + gist: str = Field(description="Purpose-directed gist of the conversation") + + +@dataclass(frozen=True) +class GistResult: + """Result from generate_gist(), containing one gist per chunk.""" + + gists: list[str] + conversation_id: str | None = None + message_count: int = 0 + chunk_count: int = 0 diff --git a/platform-integrations/claude/plugins/kaizen-lite/skills/gist/SKILL.md b/platform-integrations/claude/plugins/kaizen-lite/skills/gist/SKILL.md new file mode 100644 index 0000000..e28686f --- /dev/null +++ b/platform-integrations/claude/plugins/kaizen-lite/skills/gist/SKILL.md @@ -0,0 +1,89 @@ +--- +name: gist +description: Generate a purpose-directed gist of the current conversation optimized for remembering user preferences and attributes across sessions. +context: fork +--- + +# Gist Memory + +## Overview + +This skill generates a **purpose-directed gist** of the current conversation — a compressed representation optimized for answering questions about the user (preferences, behaviors, habits, attributes). 
Unlike generic summarization, purpose-directed gisting foregrounds user-relevant signal and discards topical noise, making it dramatically more effective for personalization in future sessions. + +The gist will be stored as an entity and automatically injected into future sessions via the recall hook. + +## Workflow + +### Step 1: Walk Through Conversation Messages + +Review all messages in the current conversation from start to finish. Collect all user and assistant messages as a list. + +### Step 2: Generate the Gist + +Create a gist of the conversation following these specific instructions: + +**You are creating a gist that will be stored in a vector database and used to answer questions about the user.** + +Therefore: +- **Focus on what the conversation reveals about the user** — their preferences, behaviors, habits, expertise, opinions, constraints, and attributes +- **It can contain phrases and keywords** — does not need complete sentences +- **It is not intended to be read by humans** — optimize for machine retrieval +- **Discard topical noise** — if the user discussed Kubernetes for 20 messages but mentioned preferring Python in one sentence, the Python preference is higher signal for the gist than the Kubernetes discussion +- If there is nothing notable about the user in the conversation, output "no user signal" and stop + +**Example:** A conversation about network routing where the user mentions "By the way, I strongly prefer Python over R for data analysis" should produce a gist like: +``` +user prefers Python over R for data analysis; mentioned during networking discussion +``` +NOT a summary of the networking discussion. 
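The gist produced in this step is ultimately wrapped in the entity JSON that Step 3 pipes into `save_entities.py`. A sketch of building that payload — the field names come from this skill's format, but the helper itself is hypothetical:

```python
import json

def build_gist_payload(gist: str) -> str:
    # Hypothetical helper: wraps a gist string in the entity JSON
    # format that save_entities.py consumes (see Step 3).
    return json.dumps({
        "entities": [
            {
                "content": gist,
                "type": "gist",
                "rationale": "Purpose-directed gist for personalization",
                "trigger": "When answering questions about the user's preferences or attributes",
            }
        ]
    })

payload = build_gist_payload(
    "user prefers Python over R for data analysis; mentioned during networking discussion"
)
print(payload)
```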
+ +### Step 3: Save the Gist + +Output the gist as a JSON entity and save it using the save_entities.py script: + +```bash +echo '' | python3 ${CLAUDE_PLUGIN_ROOT}/skills/learn/scripts/save_entities.py +``` + +The JSON format: +```json +{ + "entities": [ + { + "content": "", + "type": "gist", + "rationale": "Purpose-directed gist for personalization", + "trigger": "When answering questions about the user's preferences or attributes" + } + ] +} +``` + +### Step 4: Confirm + +Tell the user what was captured in the gist. Be brief — just list the user-relevant signals that were preserved. + +## Examples + +### Good Gist (purpose-directed) +Conversation: 20 messages about Kubernetes pod networking, one mention of preferring dark mode in IDEs +``` +user prefers dark mode in IDEs; works with Kubernetes networking; container orchestration context +``` + +### Bad Gist (topic-preserving summary) +``` +Discussion covered Kubernetes pod networking including CNI plugins, service mesh patterns, and ingress configuration. The user asked about Calico vs Cilium performance benchmarks. +``` +This is a topic summary, not a user-attribute gist. It would fail to surface the dark mode preference in future sessions. + +### Good Gist (multiple signals) +``` +user: senior backend engineer; prefers Go over Rust for systems work; uses Neovim; dislikes ORMs; team of 5; shipping deadline March 30 +``` + +### No-Signal Case +Conversation: User asks "What time is it?" and gets an answer. 
+``` +no user signal +``` diff --git a/platform-integrations/claude/plugins/kaizen-lite/skills/recall/scripts/retrieve_entities.py b/platform-integrations/claude/plugins/kaizen-lite/skills/recall/scripts/retrieve_entities.py index 9e2c2f6..e63b57d 100644 --- a/platform-integrations/claude/plugins/kaizen-lite/skills/recall/scripts/retrieve_entities.py +++ b/platform-integrations/claude/plugins/kaizen-lite/skills/recall/scripts/retrieve_entities.py @@ -37,24 +37,45 @@ def log(message): def format_entities(entities): """Format all entities for Claude to review.""" - header = """## Entities for this task + # Separate gists from other entities + gists = [e for e in entities if e.get("type") == "gist"] + other = [e for e in entities if e.get("type") != "gist"] + + sections = [] + + if other: + header = """## Entities for this task Review these entities and apply any relevant ones: """ - items = [] - for e in entities: - content = e.get("content") - if not content: - continue - item = f"- **[{e.get('type', 'general')}]** {content}" - if e.get("rationale"): - item += f"\n - _Rationale: {e['rationale']}_" - if e.get("trigger"): - item += f"\n - _When: {e['trigger']}_" - items.append(item) - - return header + "\n".join(items) + items = [] + for e in other: + content = e.get("content") + if not content: + continue + item = f"- **[{e.get('type', 'general')}]** {content}" + if e.get("rationale"): + item += f"\n - _Rationale: {e['rationale']}_" + if e.get("trigger"): + item += f"\n - _When: {e['trigger']}_" + items.append(item) + sections.append(header + "\n".join(items)) + + if gists: + gist_header = """## Conversation Gists + +These are gists from prior conversations, optimized for recalling user preferences and attributes: + +""" + gist_items = [] + for g in gists: + content = g.get("content") + if content: + gist_items.append(f"- {content}") + sections.append(gist_header + "\n".join(gist_items)) + + return "\n\n".join(sections) def main(): diff --git 
a/tests/unit/test_gist.py b/tests/unit/test_gist.py new file mode 100644 index 0000000..7ba000a --- /dev/null +++ b/tests/unit/test_gist.py @@ -0,0 +1,144 @@ +"""Tests for gist generation.""" + +import json +from unittest.mock import MagicMock, patch + +import pytest + +from kaizen.llm.gist.gist import _chunk_messages, _estimate_tokens, generate_gist +from kaizen.schema.gist import GistResult + + +@pytest.mark.unit +class TestEstimateTokens: + def test_empty_string(self): + assert _estimate_tokens("") == 0 + + def test_short_string(self): + assert _estimate_tokens("hello world") == 2 # 11 chars // 4 + + def test_long_string(self): + text = "a" * 1000 + assert _estimate_tokens(text) == 250 + + +@pytest.mark.unit +class TestChunkMessages: + def test_single_chunk_when_within_budget(self): + messages = [ + {"role": "user", "content": "Hello"}, + {"role": "assistant", "content": "Hi there"}, + ] + chunks = _chunk_messages(messages, context_budget=64000) + assert len(chunks) == 1 + assert chunks[0] == messages + + def test_splits_when_exceeds_budget(self): + # Each message ~250 tokens (1000 chars), budget 3000 tokens + # Available = 3000 - 2000 (reserved) = 1000 tokens, so ~4 messages per chunk + messages = [{"role": "user", "content": "x" * 1000} for _ in range(10)] + chunks = _chunk_messages(messages, context_budget=3000) + assert len(chunks) > 1 + # All messages accounted for + total = sum(len(chunk) for chunk in chunks) + assert total == 10 + + def test_empty_messages(self): + chunks = _chunk_messages([], context_budget=64000) + assert chunks == [] + + def test_single_large_message_gets_own_chunk(self): + # One huge message that exceeds budget on its own + messages = [ + {"role": "user", "content": "x" * 300000}, # ~75k tokens + {"role": "user", "content": "small"}, + ] + chunks = _chunk_messages(messages, context_budget=64000) + # The large message gets its own chunk, the small one gets another + assert len(chunks) == 2 + + +@pytest.mark.unit +class 
TestGenerateGist: + @patch("kaizen.llm.gist.gist.get_supported_openai_params") + @patch("kaizen.llm.gist.gist.supports_response_schema") + @patch("kaizen.llm.gist.gist.completion") + def test_generates_gist_from_messages(self, mock_completion, mock_supports, mock_params): + mock_params.return_value = ["response_format"] + mock_supports.return_value = True + + mock_response = MagicMock() + mock_response.choices[0].message.content = json.dumps({"gist": "user prefers Python for data analysis"}) + mock_completion.return_value = mock_response + + messages = [ + {"role": "user", "content": "I really prefer Python over R for data work."}, + {"role": "assistant", "content": "Got it, Python it is."}, + ] + result = generate_gist(messages, conversation_id="test-123") + + assert isinstance(result, GistResult) + assert len(result.gists) == 1 + assert "Python" in result.gists[0] + assert result.conversation_id == "test-123" + assert result.message_count == 2 + assert result.chunk_count == 1 + + @patch("kaizen.llm.gist.gist.get_supported_openai_params") + @patch("kaizen.llm.gist.gist.supports_response_schema") + @patch("kaizen.llm.gist.gist.completion") + def test_returns_empty_on_no_user_signal(self, mock_completion, mock_supports, mock_params): + mock_params.return_value = ["response_format"] + mock_supports.return_value = True + + mock_response = MagicMock() + mock_response.choices[0].message.content = json.dumps({"gist": "no user signal"}) + mock_completion.return_value = mock_response + + messages = [ + {"role": "user", "content": "What time is it?"}, + {"role": "assistant", "content": "It's 3pm."}, + ] + result = generate_gist(messages) + + assert result.gists == [] + + def test_empty_messages_returns_empty(self): + result = generate_gist([]) + assert result.gists == [] + assert result.message_count == 0 + + @patch("kaizen.llm.gist.gist.get_supported_openai_params") + @patch("kaizen.llm.gist.gist.supports_response_schema") + @patch("kaizen.llm.gist.gist.completion") + def 
test_retries_on_parse_failure(self, mock_completion, mock_supports, mock_params): + mock_params.return_value = ["response_format"] + mock_supports.return_value = True + + # First two calls fail, third succeeds + bad_response = MagicMock() + bad_response.choices[0].message.content = "not json" + + good_response = MagicMock() + good_response.choices[0].message.content = json.dumps({"gist": "user likes cats"}) + + mock_completion.side_effect = [bad_response, bad_response, good_response] + + result = generate_gist([{"role": "user", "content": "I love cats"}]) + assert len(result.gists) == 1 + assert "cats" in result.gists[0] + + @patch("kaizen.llm.gist.gist.get_supported_openai_params") + @patch("kaizen.llm.gist.gist.supports_response_schema") + @patch("kaizen.llm.gist.gist.completion") + def test_fallback_without_constrained_decoding(self, mock_completion, mock_supports, mock_params): + mock_params.return_value = [] # No response_format support + mock_supports.return_value = False + + mock_response = MagicMock() + mock_response.choices[0].message.content = json.dumps({"gist": "user is a backend engineer"}) + mock_completion.return_value = mock_response + + result = generate_gist([{"role": "user", "content": "I work on backend systems"}]) + assert len(result.gists) == 1 + assert "backend" in result.gists[0]