feat: add purpose-directed gist memory for user personalization #102

jayaramkr wants to merge 2 commits into AgentToolkit:main
Conversation
Implement Innovation 1 & 2 from the gist memory disclosure: storage-optimized and use-optimized gisting for conversation memory.

Core module (kaizen/llm/gist/):
- generate_gist() with rolling cumulative chunking (context budget: 64k)
- Purpose-directed Jinja2 prompt optimized for extracting user attributes
- 3-retry pattern with constrained decoding support
- "no user signal" filtering for low-signal conversations

Schema & config:
- GistResponse (Pydantic) + GistResult (frozen dataclass)
- KAIZEN_GIST_MODEL, KAIZEN_GIST_CONTEXT_BUDGET, KAIZEN_GIST_TRIGGER_INTERVAL

KaizenClient methods:
- store_gists(): rolling consolidation (delete-and-replace per conversation_id)
- retrieve_gists(): semantic search over gist entities
- retrieve_gist_with_source(): gists paired with original messages

MCP server tools:
- store_gist: generate and store gists from conversation JSON
- get_gists: retrieve relevant gists by semantic query

Claude Code plugin (Kaizen Lite):
- /kaizen:gist skill for inline gist generation
- Recall hook extended to inject gists in a separate section

Demo scenario (demo/gist-memory/):
- Buried preference recall test (Section 9.1 probe from disclosure)
- Session 1: preference embedded in unrelated K8s conversation
- Session 2: preference recall verification prompts
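The "rolling cumulative chunking" under a context budget described above can be sketched roughly as follows. This is an illustrative assumption, not the actual `generate_gist()` internals: the ~4-characters-per-token estimate and the function shape are made up for the example.

```python
# Illustrative sketch of chunking messages under a context budget.
# Token counting here (len // 4) is a crude assumption for the example.
def chunk_messages(messages: list[dict], context_budget: int = 64_000) -> list[list[dict]]:
    chunks: list[list[dict]] = []
    current: list[dict] = []
    used = 0
    for msg in messages:
        # Crude token estimate: ~4 characters per token (an assumption).
        tokens = max(1, len(str(msg.get("content", ""))) // 4)
        if current and used + tokens > context_budget:
            chunks.append(current)  # close the current chunk at the budget
            current, used = [], 0
        current.append(msg)
        used += tokens
    if current:
        chunks.append(current)
    return chunks

msgs = [{"role": "user", "content": "x" * 400} for _ in range(10)]
print([len(c) for c in chunk_messages(msgs, context_budget=300)])  # [3, 3, 3, 1]
```

Each resulting chunk would then be gisted independently, which is why the review below worries about per-chunk fallback behavior.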
📝 Walkthrough

This pull request introduces a "gist memory" feature that extracts and stores user preferences buried in multi-turn conversations, enabling later recall via semantic search. The implementation spans configuration, client APIs, LLM-based generation, MCP tools, plugin integration, documentation, and tests.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~28 minutes
🚥 Pre-merge checks: 2 passed, 1 failed (1 warning)
Actionable comments posted: 6
🧹 Nitpick comments (1)
kaizen/llm/gist/prompts/generate_gist.jinja2 (1)
7-8: Consider the fallback behavior for unshortened conversations. Line 8 instructs the LLM to return the original messages if it cannot shorten the conversation. For chunked conversations near the context-budget limit, this could produce gists nearly as large as the input, defeating the purpose of gisting and causing storage bloat.
Consider whether a different fallback (e.g., a minimal metadata-only response, or explicitly filtering out such chunks) would better serve the storage-optimization goal.
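One possible downstream guard, independent of the prompt wording, would reject gists that are not meaningfully shorter than their source chunk. The helper name and the 0.8 ratio below are hypothetical, not part of the PR:

```python
def keep_gist(gist_text: str, chunk_text: str, min_ratio: float = 0.8) -> bool:
    """Reject gists whose length is at least min_ratio of the original chunk."""
    return len(gist_text) < min_ratio * len(chunk_text)

# A genuinely condensed gist passes; an echo of the original chunk does not.
assert keep_gist("user prefers Python over R", "a long kubernetes conversation " * 30)
assert not keep_gist("identical text", "identical text")
```

A filter like this would catch the "return the original messages" fallback regardless of how the template is ultimately worded.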
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@kaizen/llm/gist/prompts/generate_gist.jinja2` around lines 7 - 8, The current fallback in the generate_gist.jinja2 prompt ("If you are not able to shorten the conversation, just give me the original messages.") can produce gists as large as the input; change the fallback to return a minimal metadata-only response or explicitly mark/drop such chunks to avoid storage bloat. Update the template (generate_gist.jinja2) to replace the "just give me the original messages" instruction with a clear alternative such as "if you cannot shorten the conversation, return only a minimal metadata placeholder (e.g., 'unshortened_chunk' plus participant IDs and timestamps) or mark the chunk to be skipped" so downstream code that consumes the prompt can either store the small metadata placeholder or drop the chunk rather than storing the full original messages.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@demo/gist-memory/README.md`:
- Around line 47-49: The fenced code block in the README.md currently has no
language tag which triggers MD040; update the block that contains "user prefers
Python over R for data analysis; finds pandas more intuitive than tidyverse;
works with Kubernetes networking (Cilium, CNI plugins)" by adding a language
specifier (e.g., change the opening ``` to ```text or ```plaintext) so the
markdown linter treats it as plain text and the MD040 warning is resolved.
In `@demo/gist-memory/session1_script.md`:
- Around line 1-5: The header text incorrectly references "Message 4" as
containing the buried preference; update that reference to the correct message
number ("Message 5" or "Message 5 (User)") in the "Session 1: Preference
Embedding" block so the line that currently reads the buried preference is in
Message 4 matches the actual buried preference location (Message 5 (User) at
line 19).
- Around line 57-62: The fenced code block under "Expected Gist Output" lacks a
language specifier; update the block fence that wraps the expected gist (the
triple-backtick block containing "user prefers Python over R...") to include a
language tag (e.g., change ``` to ```text or ```txt) so the Markdown linter no
longer flags it.
In `@kaizen/frontend/client/kaizen_client.py`:
- Around line 303-389: The rolling replacement in store_gists is not atomic: the
search/delete/insert sequence (search_entities -> delete_entity_by_id ->
update_entities) can interleave across concurrent calls using the same
conversation_id and produce duplicate gist entities; update the store_gists
docstring (and add an inline comment above the delete+insert block) to
explicitly state this non-atomic behavior, give the concurrency
example/acceptable degradation, and note the assumption that the current
single-user MCP context accepts possible duplicate gists rather than
implementing locking or transactional semantics.
In `@kaizen/frontend/mcp/mcp_server.py`:
- Around line 272-277: The return JSON block containing conversation_id and
gists uses formatting that fails Ruff; fix by reformatting the function
containing that return (the block referencing conversation_id, updates, and the
list comprehension {"id": u.id, "content": u.content} for u in updates) to
comply with Ruff style—either run `ruff format
kaizen/frontend/mcp/mcp_server.py` or apply the equivalent formatting changes so
the return dict and list comprehension are properly spaced and wrapped.
In `@kaizen/llm/gist/gist.py`:
- Around line 114-123: The variable constrained_decoding_supported can be
non-boolean because supported_params may be a list; update the computation to
yield a strict bool before passing to _generate_single_gist: explicitly compute
supports_response_format by checking that supported_params is truthy and that
"response_format" is in it, compute response_schema_enabled via
supports_response_schema, then set constrained_decoding_supported =
bool(supports_response_format and response_schema_enabled) (or use an explicit
isinstance/boolean check) so the value is always a bool when used by
_generate_single_gist; refer to get_supported_openai_params, supported_params,
supports_response_format, supports_response_schema, response_schema_enabled, and
constrained_decoding_supported to locate the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8b964908-1475-4200-ab42-57238d49f8af
📒 Files selected for processing (14)
- demo/gist-memory/README.md
- demo/gist-memory/session1_script.md
- demo/gist-memory/session2_script.md
- kaizen/config/kaizen.py
- kaizen/config/llm.py
- kaizen/frontend/client/kaizen_client.py
- kaizen/frontend/mcp/mcp_server.py
- kaizen/llm/gist/__init__.py
- kaizen/llm/gist/gist.py
- kaizen/llm/gist/prompts/generate_gist.jinja2
- kaizen/schema/gist.py
- platform-integrations/claude/plugins/kaizen-lite/skills/gist/SKILL.md
- platform-integrations/claude/plugins/kaizen-lite/skills/recall/scripts/retrieve_entities.py
- tests/unit/test_gist.py
```
user prefers Python over R for data analysis; finds pandas more intuitive than tidyverse; works with Kubernetes networking (Cilium, CNI plugins)
```
Add a language specifier to the fenced code block.
The code block is missing a language identifier, which triggers a markdownlint warning (MD040). Since this shows plain text output, use text or plaintext.
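For context, MD040 only fires on an opening fence with no info string; a fence such as ```` ```text ```` passes. A toy checker (not markdownlint itself, just an illustration of the rule) looks like this:

```python
# Tiny checker in the spirit of markdownlint's MD040: flag opening fences
# that carry no language info string.
def bare_fences(markdown: str) -> list[int]:
    flagged: list[int] = []
    in_fence = False
    for i, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("```"):
            # Only an *opening* fence that is exactly "```" is a violation.
            if not in_fence and stripped == "```":
                flagged.append(i)
            in_fence = not in_fence
    return flagged

print(bare_fences("intro\n```\nplain text\n```\n"))  # [2]
print(bare_fences("```text\nplain text\n```\n"))     # []
```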
📝 Suggested fix

````diff
-```
+```text
 user prefers Python over R for data analysis; finds pandas more intuitive than tidyverse; works with Kubernetes networking (Cilium, CNI plugins)
````
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 47-47: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
```
# Session 1: Preference Embedding

Use these messages in order. The buried preference is in **Message 4**.

---
```
Fix inconsistent message number reference.
Line 3 states the buried preference is in "Message 4", but the actual buried preference appears in "Message 5 (User)" at line 19.
📝 Proposed fix

```diff
 # Session 1: Preference Embedding
-Use these messages in order. The buried preference is in **Message 4**.
+Use these messages in order. The buried preference is in **Message 5**.
 ---
```
````
## Expected Gist Output

The gist should surface the buried preference:
```
user prefers Python over R for data analysis; finds pandas more intuitive than tidyverse; works with Kubernetes networking; troubleshooting CoreDNS; large cluster environment
```
````
Add language specifier to the expected output code block.
The fenced code block is missing a language specifier, which triggers a Markdown lint warning.
📝 Proposed fix
````diff
 ## Expected Gist Output
 The gist should surface the buried preference:
-```
+```text
 user prefers Python over R for data analysis; finds pandas more intuitive than tidyverse; works with Kubernetes networking; troubleshooting CoreDNS; large cluster environment
````
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 60-60: Fenced code blocks should have a language specified (MD040, fenced-code-language)
```python
def store_gists(
    self,
    namespace_id: str,
    messages: list[dict],
    conversation_id: str | None = None,
    metadata: dict[str, Any] | None = None,
) -> list[EntityUpdate]:
    """Generate purpose-directed gists from conversation messages and store them.

    Implements rolling consolidation: deletes any existing gists for the same
    conversation_id before storing new ones, so the latest gist always reflects
    the full session.
    """
    if not messages:
        return []

    conversation_id = conversation_id or str(uuid.uuid4())
    self.ensure_namespace(namespace_id)

    # Delete existing gists for this conversation (rolling replacement)
    existing = self.search_entities(
        namespace_id=namespace_id,
        query=None,
        filters={"type": "gist", "metadata.conversation_id": conversation_id},
        limit=100,
    )
    for entity in existing:
        try:
            self.delete_entity_by_id(namespace_id, entity.id)
        except Exception:
            logger.warning("Failed to delete old gist %s during rolling replacement", entity.id, exc_info=True)

    # Generate gists
    result = generate_gist(messages, conversation_id=conversation_id)

    if not result.gists:
        return []

    # Store gist entities
    base_metadata: dict[str, Any] = dict(metadata or {})
    base_metadata["conversation_id"] = conversation_id
    base_metadata["message_count"] = result.message_count

    gist_entities = []
    for i, gist_text in enumerate(result.gists):
        gist_metadata = dict(base_metadata)
        gist_metadata["chunk_index"] = i
        gist_metadata["chunk_count"] = result.chunk_count
        gist_entities.append(Entity(type="gist", content=gist_text, metadata=gist_metadata))

    updates = self.update_entities(namespace_id, gist_entities, enable_conflict_resolution=False)

    # Store original messages as gist_source for durable retrieval
    source_entities = []
    for i, msg in enumerate(messages):
        content = msg.get("content", "")
        if isinstance(content, list):
            content = str(content)
        source_entities.append(
            Entity(
                type="gist_source",
                content=content,
                metadata={
                    "conversation_id": conversation_id,
                    "message_index": i,
                    "role": msg.get("role", "unknown"),
                },
            )
        )

    if source_entities:
        # Delete existing sources for this conversation first
        existing_sources = self.search_entities(
            namespace_id=namespace_id,
            query=None,
            filters={"type": "gist_source", "metadata.conversation_id": conversation_id},
            limit=1000,
        )
        for entity in existing_sources:
            try:
                self.delete_entity_by_id(namespace_id, entity.id)
            except Exception:
                logger.warning("Failed to delete old gist_source %s", entity.id, exc_info=True)

        self.update_entities(namespace_id, source_entities, enable_conflict_resolution=False)

    return updates
```
🧩 Analysis chain

🏁 Scripts executed:

```shell
# Check if there are any other callers of store_gists that might invoke it concurrently
rg -n "store_gists" --type=py -C2

# Inspect the MCP server call site
cat -n kaizen/frontend/mcp/mcp_server.py | sed -n '260,280p'

# Check for any async/concurrent patterns in the MCP server
rg -n "async|concurrent|thread|lock|pool" kaizen/frontend/mcp/mcp_server.py

# Check the broader context of how the MCP server handles requests
head -50 kaizen/frontend/mcp/mcp_server.py

# Check the backend implementation for any locking mechanisms
rg -n "def delete_entity_by_id|def search_entities|def update_entities" kaizen/frontend/client/kaizen_client.py -A 5

# Check if there's a storage backend with its own locking
find . -name "*.py" -type f | xargs grep -l "delete_entity_by_id" | head -5

# Check the filesystem backend for delete and search implementations
rg -n "def delete_entity_by_id|def search_entities|def update_entities|lock|Lock" kaizen/backend/filesystem.py -B 2 -A 8

# Get size of filesystem backend to decide how much to read
wc -l kaizen/backend/filesystem.py
```
Document the non-atomic nature of the rolling replacement logic in store_gists.
The search-delete-insert sequence is not atomic. While the backend uses per-operation locking, concurrent calls with the same conversation_id can interleave, potentially resulting in duplicate gists. For example:
- Thread A searches and finds gist X
- Thread B searches and finds gist X
- Thread A deletes X, inserts Y
- Thread B deletes nothing (X already gone), inserts Z
Result: both Y and Z coexist for the same conversation.
This is likely acceptable for the current single-user MCP tool context, but should be documented in the docstring or with a code comment explaining the assumption and acceptable degradation mode.
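The interleaving above can be reproduced deterministically with a toy in-memory store. All names here are illustrative, not the real backend API; the barrier forces both threads past the search step before either deletes, which is exactly the window per-operation locking leaves open:

```python
import threading

# Toy model of the non-atomic search -> delete -> insert sequence.
store: dict[str, str] = {}             # entity_id -> conversation_id
store_lock = threading.Lock()          # per-operation lock, as in the backend
counter = [0]

def search(conversation_id: str) -> list[str]:
    with store_lock:
        return [eid for eid, cid in store.items() if cid == conversation_id]

def delete(entity_id: str) -> None:
    with store_lock:
        store.pop(entity_id, None)

def insert(conversation_id: str) -> None:
    with store_lock:
        counter[0] += 1
        store[f"gist-{counter[0]}"] = conversation_id

def rolling_replace(conversation_id: str, barrier: threading.Barrier) -> None:
    existing = search(conversation_id)  # both threads see the same old gist X
    barrier.wait()                      # force both past the search step
    for eid in existing:
        delete(eid)                     # second delete is a no-op
    insert(conversation_id)             # each thread inserts its own gist

insert("conv-1")  # pre-existing gist X
barrier = threading.Barrier(2)
threads = [threading.Thread(target=rolling_replace, args=("conv-1", barrier)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(search("conv-1")))  # 2 -- an atomic replacement would leave 1
```

Each individual operation is locked, yet the composite sequence still leaves two gists for the conversation, matching the degradation mode the comment asks to document.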
```python
return json.dumps({
    "success": True,
    "conversation_id": conversation_id,
    "gists_stored": len(updates),
    "gists": [{"id": u.id, "content": u.content} for u in updates],
})
```
Ruff formatting check failed.
The pipeline indicates this file needs reformatting. Run `ruff format kaizen/frontend/mcp/mcp_server.py` to fix.
```python
supported_params = get_supported_openai_params(
    model=llm_settings.gist_model,
    custom_llm_provider=llm_settings.custom_llm_provider,
)
supports_response_format = supported_params and "response_format" in supported_params
response_schema_enabled = supports_response_schema(
    model=llm_settings.gist_model,
    custom_llm_provider=llm_settings.custom_llm_provider,
)
constrained_decoding_supported = supports_response_format and response_schema_enabled
```
Fix type narrowing for `constrained_decoding_supported`.
The pipeline reports a mypy error: `constrained_decoding_supported` has type `list[Any] | bool | None` but `_generate_single_gist` expects `bool`. The `and` expression doesn't guarantee a boolean result.
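The underlying Python semantics: `and` returns one of its operands rather than a strict bool, which is why mypy widens the type. A minimal illustration:

```python
# When the left operand is falsy, `and` returns it unchanged.
supported_params = None
result = supported_params and "response_format" in supported_params
print(result)  # None, not False

# When the left operand is truthy, the right operand's value is returned.
supported_params = ["response_format", "temperature"]
result = supported_params and "response_format" in supported_params
print(result)  # True

# Wrapping in bool() yields a strict bool in every case.
assert bool(None and True) is False
assert bool(["response_format"] and True) is True
```

So when `get_supported_openai_params` returns `None` or a list, the unwrapped expression can be `None` or `list`, never narrowed to `bool`.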
🐛 Proposed fix

```diff
 supported_params = get_supported_openai_params(
     model=llm_settings.gist_model,
     custom_llm_provider=llm_settings.custom_llm_provider,
 )
-supports_response_format = supported_params and "response_format" in supported_params
+supports_response_format = bool(supported_params and "response_format" in supported_params)
 response_schema_enabled = supports_response_schema(
     model=llm_settings.gist_model,
     custom_llm_provider=llm_settings.custom_llm_provider,
 )
-constrained_decoding_supported = supports_response_format and response_schema_enabled
+constrained_decoding_supported = bool(supports_response_format and response_schema_enabled)
```
---

can we hold off merging this for now?