📝 Walkthrough

This PR migrates the agentic memory backend to Kaizen.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant ChatNode
    participant sync_fn as sync_user_memory()
    participant KaizenClient
    participant ActivityTracker
    participant FinalAnswer
    participant Guidelines

    ChatNode->>sync_fn: call with user_id, query
    sync_fn->>KaizenClient: store_user_facts(query)
    sync_fn->>KaizenClient: retrieve_user_facts(user_id)
    KaizenClient-->>sync_fn: preferences dict
    sync_fn-->>ChatNode: (user_id, preferences)
    ChatNode->>ChatNode: update state.user_preferences

    Note over ActivityTracker: Per step execution
    ChatNode->>ActivityTracker: track step with data
    ActivityTracker->>ActivityTracker: build trajectory message
    ActivityTracker->>KaizenClient: update_entities(trajectory)
    KaizenClient-->>ActivityTracker: ack

    Note over FinalAnswer: On task completion
    ChatNode->>FinalAnswer: reached
    FinalAnswer->>Guidelines: _persist_guidelines_from_trajectory()
    Guidelines->>Guidelines: group trajectory by agent
    Guidelines->>KaizenClient: generate_tips per agent
    KaizenClient-->>Guidelines: tips
    Guidelines->>Guidelines: deduplicate by content
    Guidelines->>KaizenClient: update_entities(guideline)
    KaizenClient-->>Guidelines: ack
    Guidelines-->>FinalAnswer: done
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Suggested reviewers
Actionable comments posted: 18
Note
Due to the large number of review comments, comments of Critical and Major severity were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/frontend_workspaces/agentic_chat/src/LeftSidebar.tsx (1)
2-2: ⚠️ Potential issue | 🔴 Critical

Import `Brain` from `lucide-react` to fix the undefined reference error. Line 245 uses `<Brain size={16} />`, but `Brain` is not imported on Line 2.

🔧 Suggested fix

```diff
-import { MessageSquare, Database, ChevronLeft, ChevronRight, Plus, Trash2, Workflow, Info, HelpCircle } from "lucide-react";
+import { MessageSquare, Database, Brain, ChevronLeft, ChevronRight, Plus, Trash2, Workflow, Info, HelpCircle } from "lucide-react";
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/frontend_workspaces/agentic_chat/src/LeftSidebar.tsx` at line 2, The file imports several icons from lucide-react but misses the Brain icon used as `<Brain size={16} />` in LeftSidebar.tsx; update the import statement that currently lists MessageSquare, Database, ChevronLeft, ChevronRight, Plus, Trash2, Workflow, Info, HelpCircle to also include Brain so the Brain symbol is defined and the component renders correctly.

src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py (1)
947-977: ⚠️ Potential issue | 🟠 Major

Keep the injected memory block out of `state.chat_messages`.

`call_model()` recomputes `memory_query` from `state.chat_messages` on every sandbox→model cycle. Once these branches write `## Memory Decision Context` back into the stored human message, the next pass ranks facts against the injected memory text instead of the user's original request. Only augment `messages_for_model`, or keep the raw user utterance separately.

Also applies to: 995-1023
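A minimal sketch of the suggested separation, using plain role/content dicts as a stand-in for the real message types (`build_messages_for_model` is an illustrative helper, not the CUGA API):

```python
# Illustrative only: plain dicts stand in for the real chat message types,
# and build_messages_for_model is a hypothetical helper, not CUGA code.
def build_messages_for_model(chat_messages: list[dict], memory_block: str) -> list[dict]:
    """Return an augmented copy for the model call; stored history stays untouched."""
    messages = [dict(m) for m in chat_messages]  # shallow copies of each message
    if messages and messages[-1]["role"] == "user" and memory_block:
        messages[-1]["content"] += "\n\n## Memory Decision Context\n" + memory_block
    return messages

state_chat_messages = [{"role": "user", "content": "Find my open invoices"}]
augmented = build_messages_for_model(state_chat_messages, "User prefers CSV exports")

# The stored utterance is unchanged, so memory_query is recomputed from clean input.
assert state_chat_messages[0]["content"] == "Find my open invoices"
assert "## Memory Decision Context" in augmented[-1]["content"]
```

Because the augmentation happens on a copy, the injected block never compounds across sandbox→model cycles.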
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py` around lines 947 - 977, The code is mutating state.chat_messages by writing injected blocks (memory_decision_context, playbook_guidance, variables_addendum) back into HumanMessage objects, which pollutes the canonical user utterance used later (memory_query); instead, stop updating state.chat_messages in cuga_lite_graph and apply all augmentations only to the messages_for_model list (or a separate augmented_messages list) passed to call_model(); preserve the original content in state.chat_messages and only construct modified HumanMessage instances for messages_for_model using memory_section_header, memory_decision_context, playbook_guidance, and variables_addendum (also make the same change for the similar block around lines 995-1023).

src/cuga/backend/server/main.py (1)
1104-1110: ⚠️ Potential issue | 🟠 Major

Don't persist `memories` snapshots inside answer stream history.

`final_answer_text` now embeds the full memory snapshot, and that same string is appended to `stream_events_buffer` and saved to the stream-events store. That duplicates potentially personal facts into every stored answer event and keeps them around even if the underlying memory entry is later deleted. Stream the enriched payload to the live client if needed, but persist a redacted answer body instead.

🔒 Redact stored answer events
```diff
 final_answer_text = (
     event.answer
     if settings.advanced_features.wxo_integration
     else json.dumps(
         {
             "data": event.answer,
             "variables": variables_metadata,
             "memories": memories,
             "memories_count": memories_count,
             "active_policies": active_policies,
         }
     )
     if event.answer
     else "Done."
 )
+persisted_answer_text = event.answer if event.answer else "Done."

 ...

 stream_events_buffer.append(
     {
         "event_name": "Answer",
-        "event_data": final_answer_text,
+        "event_data": persisted_answer_text,
         "timestamp": datetime.utcnow().isoformat(),
         "sequence": event_sequence,
     }
 )
```

Also applies to: 1123-1143
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/server/main.py` around lines 1104 - 1110, The code is persisting full memory snapshots into the stored answer events; update the logic that builds and appends the stored payload (the object currently constructed with keys like "data" / event.answer, "variables" / variables_metadata, "memories", "memories_count", "active_policies") so that only a redacted version is persisted to stream_events_buffer and the stream-events store—e.g., include memories_count and safe identifiers or omit the "memories" field entirely—instead of embedding full memories or final_answer_text that contains them; continue to send the enriched payload (with full memories) to the live client but ensure the persisted object (where stream_events_buffer, final_answer_text, event.answer, variables_metadata, memories_count, active_policies are used) does not contain personal memory content; apply the same redaction change to the other similar block referenced around the 1123–1143 area.
🟡 Minor comments (7)
docs/examples/demo_apps/file_system/build/lib/main.py-231-240 (1)
231-240: ⚠️ Potential issue | 🟡 Minor

Validate `head`/`tail` explicitly instead of relying on truthiness.

`0` currently falls through to a full-file read, and negative values produce surprising slices. Check `is not None` and reject non-positive values before dispatching.

🛠️ Suggested fix
```diff
-    if tail and head:
+    if tail is not None and head is not None:
         raise ValueError("Cannot specify both head and tail parameters simultaneously")
+    if tail is not None and tail <= 0:
+        raise ValueError("tail must be greater than 0")
+    if head is not None and head <= 0:
+        raise ValueError("head must be greater than 0")
@@
-    if tail:
+    if tail is not None:
         return await tail_file(valid_path, tail)
-    if head:
+    if head is not None:
         return await head_file(valid_path, head)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 231 - 240, The current dispatch uses truthiness for head/tail so values like 0 or negatives behave incorrectly; update the logic in the block around validate_path/head_file/tail_file to check "is not None" for both head and tail, validate that any provided head or tail is a positive integer (raise ValueError for non-positive or non-integer inputs), keep the mutual-exclusion check (raise if both are not None), and then call tail_file(valid_path, tail) or head_file(valid_path, head) only when the corresponding parameter is explicitly provided and validated.

src/cuga/backend/memory/__init__.py (1)
23-40: ⚠️ Potential issue | 🟡 Minor

Guard ModuleNotFoundError to only handle Kaizen-related failures.

The exception handler catches all `ModuleNotFoundError` cases, including missing transitive dependencies of `kaizen.schema.*` modules. When such dependencies (e.g., `numpy`, `pydantic`) are missing, the code incorrectly reports it as a Kaizen version incompatibility instead of surfacing the actual missing dependency.

Add a guard checking `exc.name` to distinguish between Kaizen's own modules and unrelated missing dependencies:

Suggested fix
```diff
 try:
     module = import_module(module_name)
 except ModuleNotFoundError as exc:
+    missing_name = exc.name or ""
+    if missing_name != "kaizen" and not missing_name.startswith("kaizen."):
+        raise
+
     if util.find_spec("kaizen") is None:
         raise RuntimeError(
             "Kaizen is required for memory features but is not installed. "
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/memory/__init__.py` around lines 23 - 40, The ModuleNotFoundError handler around import_module(module_name) should only treat errors that are about Kaizen itself; inspect exc.name and if it does not refer to a kaizen module (e.g., not starting with "kaizen" or not equal to the top-level module_name root) re-raise the original ModuleNotFoundError so missing transitive deps surface correctly. Only when exc.name indicates a kaizen-related module proceed with the util.find_spec("kaizen") / version("kaizen") checks and raise the specialized RuntimeError about Kaizen integration; otherwise immediately reraise exc.

src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2 (1)
6-6: ⚠️ Potential issue | 🟡 Minor

Typo: "RetryClaude" should be corrected.
The text "RetryClaude can make mistakes" appears to be a typo or artifact. Consider changing to "Claude can make mistakes" or "The assistant can make mistakes."
Proposed fix
```diff
-Use these memory facts when they are relevant to the request before concluding information is unavailable. RetryClaude can make mistakes. Please double-check responses.
+Use these memory facts when they are relevant to the request before concluding information is unavailable. The assistant can make mistakes. Please double-check responses.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2` at line 6, In the pmt.jinja2 prompt template update the sentence "RetryClaude can make mistakes." to a correct phrase such as "Claude can make mistakes" or "The assistant can make mistakes"; locate the string inside the prompt template (the memory-facts instruction block) and replace the token "RetryClaude" with the chosen correct term so the prompt reads e.g. "Claude can make mistakes."

README.md (1)
810-812: ⚠️ Potential issue | 🟡 Minor

The memory demo commands disagree.

This section first says to run `cuga start demo`, but the actual walkthrough below requires `cuga start demo_crm --sample-memory-data`. One of these is stale.

📝 Suggested fix
```diff
-3. Run `cuga start demo`
+3. Run `cuga start demo_crm --sample-memory-data`
```

Also applies to: 821-824
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 810 - 812, The README contains inconsistent demo start commands: some places use "cuga start demo" while the walkthrough requires "cuga start demo_crm --sample-memory-data"; update the README so both occurrences (and any other mentions) consistently instruct users to run "cuga start demo_crm --sample-memory-data" (or alternatively change the walkthrough to use "cuga start demo" if that is the intended command), ensuring the example and the subsequent walkthrough match and updating the step list and any references at the other noted location to the chosen command.

src/cuga/backend/memory/utils/memory_tips_formatted.py (1)
40-44: ⚠️ Potential issue | 🟡 Minor

Normalize `search_entities` output before `len(...)`.

Line 40 assumes `tips` is list-like. If the backend returns `None` or an unexpected type, this path raises and interrupts the request.

🛡️ Suggested fix
```diff
-    raw_count = len(tips)
+    tips = tips or []
+    if not isinstance(tips, list):
+        tips = list(tips)
+    raw_count = len(tips)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/memory/utils/memory_tips_formatted.py` around lines 40 - 44, The code assumes variable tips is a list-like before calling len(...) and list comprehensions (raw_count, raw_types, raw_ids and subsequent filtering); guard and normalize the backend output first by checking tips and converting it to a safe iterable (e.g., if tips is None set tips = [], if it’s a single object wrap it in [tips], or if it’s not list-like try to cast to list) before computing raw_count/raw_types/raw_ids and before filtering for type == "guideline" so the rest of the logic (raw_count, raw_types, raw_ids, and the tips = [ ... ] filter) never crashes on unexpected types.

docs/memory/README.md (1)
7-7: ⚠️ Potential issue | 🟡 Minor

Typo: "levearge" should be "leverage".
📝 Proposed fix
```diff
-CUGA execution can be enhanced by enabling episodic memory, which introduces the ability to levearge previously identified insights and relevant experiences when generating the final answer.
+CUGA execution can be enhanced by enabling episodic memory, which introduces the ability to leverage previously identified insights and relevant experiences when generating the final answer.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/memory/README.md` at line 7, Fix the typo in the docs line containing "CUGA execution can be enhanced..." by replacing the misspelled word "levearge" with "leverage" so the sentence reads "...introduces the ability to leverage previously identified insights..."; locate the phrase "CUGA execution can be enhanced by enabling episodic memory" in docs/memory/README.md and update the single word.

docs/memory/README.md (1)
61-61: ⚠️ Potential issue | 🟡 Minor

Add a language specifier to the fenced code block.

Per static analysis (MD040), fenced code blocks should have a language specified for proper syntax highlighting. Since this shows a directory structure, use `text` or `plaintext`.

📝 Proposed fix
````diff
-```
+```text
 cuga/
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/memory/README.md` at line 61, Update the fenced code block that contains the directory listing line "cuga/" to include a language specifier (e.g., change the opening ``` to ```text or ```plaintext) so the block complies with MD040 and enables proper syntax highlighting; locate the fenced block in the README that currently begins with ``` and replace it with ```text.
🧹 Nitpick comments (8)
pyproject.toml (1)
32-37: Remove the duplicate `litellm` requirement.

`litellm` is now declared twice here. Keeping only `litellm>=1.0.0` makes the dependency set unambiguous and avoids redundant metadata during resolution.

♻️ Suggested cleanup

```diff
-    "litellm",
     "jinja2",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pyproject.toml` around lines 32 - 37, Remove the duplicate unversioned dependency "litellm" in pyproject.toml and keep only the explicit versioned entry "litellm>=1.0.0"; edit the dependency list to delete the bare "litellm" line so the file contains a single clear declaration ("litellm>=1.0.0") and re-run dependency resolution to verify no other duplicates remain.

src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2 (1)
3-4: Consider handling empty memory context gracefully.

When `memory_context` is empty or undefined, the prompt will display an awkward section header with no content. Consider wrapping this in a conditional block.

Proposed fix
```diff
-Persistent memory facts about the user:
-{{memory_context}}
+{% if memory_context %}
+Persistent memory facts about the user:
+{{memory_context}}
+{% endif %}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2` around lines 3 - 4, The template currently always renders the "Persistent memory facts about the user:" header even when memory_context is empty; update the pmt.jinja2 template to wrap the header and the {{memory_context}} output in a Jinja2 conditional that checks if memory_context is defined and non-empty (e.g., if memory_context), so the whole block is omitted when there are no memory facts; ensure you reference the existing memory_context variable and keep the rest of the prompt intact.

src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_node.py (1)
283-285: Consider reducing debug log verbosity.

The `[FACT DEBUG]` logging is useful during development but may add noise in production. Consider using `logger.debug` consistently (already being used) or adding a feature flag for verbose memory tracing.
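A small sketch of the flag-gated tracing suggested here; the `CUGA_MEMORY_TRACE` variable and `trace()` helper are hypothetical names, not existing CUGA code:

```python
# Hypothetical sketch: gate verbose memory tracing behind an env flag.
# CUGA_MEMORY_TRACE and trace() are illustrative names, not the real API.
import logging
import os

logger = logging.getLogger("cuga.memory")
MEMORY_TRACE = os.getenv("CUGA_MEMORY_TRACE", "0") == "1"

def trace(msg: str) -> None:
    """Emit memory tracing at DEBUG unless the verbose flag is set."""
    if MEMORY_TRACE:
        logger.info(msg)  # promoted only when tracing is explicitly enabled
    else:
        logger.debug(msg)
```

With the flag off, `[FACT DEBUG]` lines disappear at the usual production INFO level without touching call sites.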
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_node.py` around lines 283 - 285, The info-level log in CugaLiteNode that mentions fact storage is too verbose for production; update the call to use logger.debug instead of logger.info (referencing the logger.info call in CugaLiteNode that mentions state.user_preferences), or gate the message behind an explicit feature flag/setting (e.g., a verbose_memory_tracing or DEBUG_MEMORY flag) so the message only emits when that flag is enabled; ensure the check or log level change is applied where the current logger.info is invoked so other info logs remain unaffected.

src/cuga/backend/cuga_graph/state/agent_state.py (1)
1099-1102: Broad exception catch is acceptable here, but consider logging the exception type.

The static analyzer flags the bare `Exception` catch. Since preference loading is non-critical and shouldn't fail the entire operation, this is acceptable. However, logging the exception type could help with debugging.

📝 Optional improvement
```diff
 except Exception as e:
-    logger.error(f"Error loading user preferences: {e}")
+    logger.error(f"Error loading user preferences ({type(e).__name__}): {e}")
     # Don't fail the entire operation if preference loading fails
     state.user_preferences = {}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/state/agent_state.py` around lines 1099 - 1102, The catch block currently logs only the exception message when loading preferences ("except Exception as e: logger.error(f\"Error loading user preferences: {e}\")") which hides the exception type; update the logging to include the exception class/type (e.g., include type(e) or e.__class__.__name__) so the logger.error call for the preference-loading failure (the block that sets state.user_preferences = {}) records both the message and the exception type to aid debugging while keeping the broad Exception catch.

src/frontend_workspaces/agentic_chat/src/CardManager.tsx (1)
369-373: Consider memoizing or caching stringified comparisons for large objects.

Using `JSON.stringify` for deep-equality checks on `variables` and `memories` works correctly but may become expensive if these objects grow large. For now this is acceptable given the typical size of preference data, but consider a dedicated deep-equal utility if performance becomes a concern.
Verify each finding against the current code and only fix it if needed. In `@src/frontend_workspaces/agentic_chat/src/CardManager.tsx` around lines 369 - 373, The current entryChanged calculation in CardManager (the entryChanged constant comparing existing.title, existing.timestamp, and JSON.stringify(existing.variables)/JSON.stringify(existing.memories) to entry.*) uses JSON.stringify for deep equality which can be expensive for large objects; change it to avoid repeated stringification by either (a) memoizing the stringified representation on the objects when they are created/updated and compare those cached strings (e.g., add cachedVariablesString/cachedMemoriesString fields or compute once and reuse), or (b) replace the JSON.stringify comparisons with a dedicated deep-equal utility (e.g., lodash.isEqual or a small deepEqual function) and use isEqual(existing.variables, entry.variables) and isEqual(existing.memories, entry.memories) inside the entryChanged calculation. Ensure you update/maintain the cache whenever variables/memories are modified and reference the entryChanged constant to locate where to apply the change.

src/cuga/backend/cuga_graph/nodes/api/api_planner_agent/api_planner_agent.py (1)
220-257: Trim the duplicated memory block in `_format_preferences_for_prompt()`.

The formatter emits a structured-facts section and then immediately re-emits the same facts under category headings. That makes every planner turn pay double tokens for the same context. Reusing the selected structured facts, or capping the category dump, would keep this prompt cheaper and easier to tune.
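One way to realize the deduplication, sketched with hypothetical names (`format_preferences` and plain string facts; the real method works on richer fact objects):

```python
# Hypothetical sketch: emit categorized facts only when they are not already
# covered by the structured section, so the prompt pays for each fact once.
def format_preferences(structured_facts: list[str], preferences: dict[str, list[str]]) -> str:
    lines: list[str] = []
    if structured_facts:
        lines.append("Structured Facts For Decision Making:")
        lines.extend(f"- {fact}" for fact in structured_facts)
    seen = set(structured_facts)
    for category, facts in preferences.items():
        remaining = [f for f in facts if f not in seen]
        if remaining:  # skip categories fully covered by the structured section
            lines.append(f"{category}:")
            lines.extend(f"- {f}" for f in remaining)
    return "\n".join(lines)

prompt = format_preferences(
    ["prefers CSV exports"],
    {"formatting": ["prefers CSV exports"], "locale": ["uses en-GB dates"]},
)
assert prompt.count("prefers CSV exports") == 1  # no duplicated tokens
assert "uses en-GB dates" in prompt
```

The same `seen` set also gives a natural place to cap the category dump if a hard token budget is preferred.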
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/api/api_planner_agent/api_planner_agent.py` around lines 220 - 257, The method _format_preferences_for_prompt currently appends structured_facts into lines and then iterates preferences to re-emit the same facts under category headings, duplicating context; modify the logic so that when structured_facts is non-empty you skip the categorized preferences dump (or alternatively only include categories for facts not present in structured_facts). Concretely, inside _format_preferences_for_prompt, after building the "Structured Facts For Decision Making" block (using APIPlannerAgent._format_fact_pointer, _escape_inline_code and _safe_text), add a guard around the subsequent preferences handling (the branch that checks first_value and iterates preferences) to return or skip that block if structured_facts exists, or filter each category's facts by excluding items whose pointer/id matches any pointer built from structured_facts before appending to lines. Ensure you reference structured_facts, preferences, and the lines list used to join the final prompt.

src/cuga/backend/activity_tracker/tracker.py (2)
652-696: Isolate guideline failures per agent.

The outer `try` aborts the whole loop on the first `generate_tips()` or `update_entities()` failure, so one bad agent drops the remaining guideline writes for the same trajectory. Catching and logging inside the loop would make this path much more resilient.
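The per-agent isolation can be sketched as follows; `generate_tips` and `update_entities` here are stand-ins for the real Kaizen calls, injected as plain callables:

```python
# Sketch of per-agent error isolation: one failing agent is logged and skipped
# instead of aborting every remaining guideline write for the trajectory.
import logging

logger = logging.getLogger(__name__)

def persist_guidelines(agent_trajectories: dict, generate_tips, update_entities) -> int:
    persisted = 0
    for agent_id, messages in agent_trajectories.items():
        try:  # try/except moved inside the loop
            tips = generate_tips(messages)
            update_entities(agent_id, tips)
            persisted += 1
        except Exception:
            logger.exception("guideline persistence failed for agent %s", agent_id)
    return persisted

def flaky_tips(messages):
    if messages == "bad":
        raise RuntimeError("backend down")
    return ["tip"]

count = persist_guidelines(
    {"a": "ok", "b": "bad", "c": "ok"},
    flaky_tips,
    lambda agent_id, tips: None,
)
assert count == 2  # agent "b" failed, but "a" and "c" were still persisted
```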
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/activity_tracker/tracker.py` around lines 652 - 696, The try/except currently wraps the whole loop so a failure in generate_tips or self.memory.update_entities for one agent aborts processing for all agents; move the try/except inside the for agent_id, agent_trajectory_messages in agent_scoped_trajectories loop so each agent's work is isolated, i.e., call generate_tips(agent_trajectory_messages) and the subsequent unique_tips -> entities -> self.memory.update_entities within a per-agent try block, catch exceptions there (using logger.exception) including agent_id and user_id in the log, and continue to the next agent so one bad agent doesn't drop remaining guideline writes; keep the outer early-return checks and use the same functions/methods: _build_agent_scoped_trajectory_messages, generate_tips, and self.memory.update_entities.
807-826: Consider deferring trajectory writes to avoid blocking step execution.

`collect_step()` persists a trajectory entity inline on every step (line 18). If `KaizenClient.update_entities()` performs I/O synchronously, agent step latency is directly coupled to memory write latency and availability. The code already uses `asyncio.create_task()` for deferred work elsewhere (lines 42–50); applying the same pattern here would decouple memory operations from the critical path.
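A minimal sketch of the fire-and-forget pattern, with `update_entities` standing in for `KaizenClient.update_entities`:

```python
# Sketch: defer the memory write with asyncio.create_task so collect_step
# returns immediately. update_entities simulates the Kaizen I/O call.
import asyncio

writes: list[dict] = []

async def update_entities(entity: dict) -> None:
    await asyncio.sleep(0.01)  # simulated backend write latency
    writes.append(entity)

async def collect_step(step_index: int, step_name: str) -> None:
    entity = {"step_index": step_index, "step_name": step_name}
    # Fire-and-forget: the step does not await the write.
    asyncio.create_task(update_entities(entity))

async def main() -> None:
    await collect_step(0, "plan")
    assert not writes  # the background write has not completed yet
    await asyncio.sleep(0.05)  # yield so the deferred task can finish
    assert writes[0]["step_name"] == "plan"

asyncio.run(main())
```

In production code the task reference should be retained (e.g., in a set discarded on completion) so the write cannot be garbage-collected mid-flight.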
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/activity_tracker/tracker.py` around lines 807 - 826, collect_step currently calls self.memory.update_entities inline which can block step execution; instead, create a deferred task to perform the memory write asynchronously. Wrap the update_entities call (including the Entity payload and enable_conflict_resolution=False) in an async helper or use asyncio.to_thread if update_entities is synchronous, then schedule it with asyncio.create_task(...) so the call returns immediately; reference the collect_step function and the memory.update_entities invocation along with the Entity metadata (timestamp, step_index, step_name, intent, user_id, task_id, session_id) to ensure the same payload is passed to the background task.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/examples/demo_apps/file_system/build/lib/main.py`:
- Around line 146-164: The current diff generation in the block that compares
original_content and content (using original_lines/new_lines and a zip-based
loop) misses insertions/deletions because zip stops at the shorter list; replace
this manual loop with difflib.unified_diff to produce a correct unified diff
(import difflib, call difflib.unified_diff(original_lines, new_lines,
fromfile=f"--- {file_path}", tofile=f"+++ {file_path}", lineterm="") ), preserve
the existing status message logic around dry_run, and return the joined
unified_diff lines instead of the zip-generated diff so insertions, deletions
and EOF changes are captured correctly.
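The `difflib.unified_diff` approach described in this item can be sketched as follows (file labels are illustrative):

```python
# difflib.unified_diff captures insertions past the end of the shorter file,
# which a zip-based pairwise loop silently drops.
import difflib

original_lines = ["a\n", "b\n", "c\n"]
new_lines = ["a\n", "b changed\n", "c\n", "d appended\n"]

diff = "".join(
    difflib.unified_diff(original_lines, new_lines, fromfile="before", tofile="after")
)
# The zip-based loop stops at the shorter file and would miss the appended line.
assert "+d appended\n" in diff
assert "-b\n" in diff and "+b changed\n" in diff
```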
- Around line 48-63: validate_path's string-prefix check is unsafe (fails for
root allowlists and is vulnerable to TOCTOU via symlink swaps); update
validate_path to canonicalize both the candidate (from
expand_home/normalize_path) and each allowed_directories entry as
pathlib.Path.resolve(...) and then perform a true ancestor check using
Path.relative_to or os.path.commonpath to confirm the candidate is inside an
allowed_dir (handle root like "/" or "C:/" correctly by canonicalizing first),
and in addition revalidate the resolved path at the syscall boundary for any
mutating operation (open/rename/makedirs) by resolving the path again
immediately before the syscall and/or using safe syscalls (e.g., os.open with
O_NOFOLLOW or compare parent directory inodes) to prevent symlink races;
reference validate_path, expand_home, normalize_path, and allowed_directories
when making these changes.
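The ancestor check portion of this item can be sketched with `Path.resolve` and `relative_to` (the directory value is a placeholder, and the TOCTOU revalidation at the syscall boundary is not shown):

```python
# Sketch of a true ancestor check: canonicalize both sides, then use
# relative_to instead of a string prefix. allowed_directories is illustrative.
from pathlib import Path

allowed_directories = ["/tmp/sandbox"]

def validate_path(candidate: str) -> Path:
    resolved = Path(candidate).expanduser().resolve()
    for allowed in allowed_directories:
        root = Path(allowed).resolve()
        try:
            resolved.relative_to(root)  # raises ValueError if not an ancestor
            return resolved
        except ValueError:
            continue
    raise PermissionError(f"{candidate} is outside the allowed directories")

assert str(validate_path("/tmp/sandbox/notes.txt")).endswith("notes.txt")
# A naive string-prefix check would wrongly accept this sibling directory:
try:
    validate_path("/tmp/sandbox_evil/notes.txt")
    raise AssertionError("sibling directory should be rejected")
except PermissionError:
    pass
```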
- Around line 176-205: The non-recursive branch only matches basenames from
os.listdir(root_path) so patterns containing path segments (e.g., "src/*.py")
never match and exclude_patterns aren't applied; change the else branch in
main.py to glob against os.path.join(root_path, pattern) (using glob.glob with
recursive=False or without "**") to get full paths, then apply the same
exclusion logic used in the '**' branch (checking rel_path against
exclude_patterns with fnmatch.fnmatch and the ** variants) before appending to
results so path-segment patterns and exclusions behave consistently (update
references: pattern, root_path, results, exclude_patterns, rel_path).
- Around line 443-470: The build_tree function can recurse into symlinked
directories causing infinite loops; modify build_tree (and its
caller/initialization) to detect and avoid symlink loops by tracking visited
real paths or inodes: before descending from entry_path (where
os.path.isdir(entry_path) is checked), resolve real =
os.path.realpath(entry_path) (or use os.stat to get st_ino/st_dev) and maintain
a visited set passed through recursive calls; if real (or inode tuple) is
already in visited or if os.path.islink(entry_path) and you prefer to skip
symlinks, skip adding children and do not recurse; ensure validate_path usage
remains and update build_tree signature to accept the visited set if needed.
In `@src/cuga/backend/activity_tracker/tracker.py`:
- Around line 641-647: _snapshot self.task_id and self.session_id into local
variables at the start of _persist_guidelines_from_trajectory and use those
local variables everywhere you build guideline entity metadata or enqueue the
guideline job instead of reading self.task_id/self.session_id later; do the same
change for the other coroutines that build/enqueue guideline jobs (the blocks
referenced at lines 671-679 and 832-836) so the queued job uses the captured
task_id/session_id values rather than potentially changed singleton attributes.
In `@src/cuga/backend/cuga_graph/nodes/api/api_code_agent.py`:
- Around line 54-56: The hardcoded "if False and (" guard prevents the Kaizen
write path from ever executing; remove the "False and" so the condition uses the
real feature gates (e.g., enable_fact and enable_memory) instead. Locate the
conditional in api_code_agent.py (the block starting with "if False and (") and
change it to evaluate the actual flags (for example: "if enable_fact and
enable_memory and ( ... )") so the fact extraction/write logic inside the
existing function/method (the Kaizen write path) can run when configured.
In
`@src/cuga/backend/cuga_graph/nodes/api/api_planner_agent/prompts/user_hitl.jinja2`:
- Around line 3-8: The prompt fragment that injects remembered preferences
(user_preferences_context) should be reworded to treat those memories as
non-mandatory hints: change the line "must influence planning decisions" to a
precedence rule stating that current user instructions and session context take
priority and that any conflict between user_preferences_context and the current
request should not be auto-applied but must be flagged for confirmation;
reference the template variable user_preferences_context and ensure the prompt
explicitly instructs the planner to use memories only as suggestions and to ask
the user to confirm before applying conflicting remembered preferences.
In `@src/cuga/backend/cuga_graph/nodes/api/api_planner.py`:
- Around line 197-199: The conditional that gates planner memory recall has been
hard-disabled by "if False and settings.advanced_features.enable_fact and
settings.advanced_features.enable_memory:"; restore the real feature flag check
by removing the hardcoded False so the block is executed when both
settings.advanced_features.enable_fact and
settings.advanced_features.enable_memory are true, ensuring the Kaizen/fact
retrieval path is re-enabled for API planning (update the if condition in the
same block and keep the existing retrieval code untouched).
In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt_chat.jinja2`:
- Around line 15-16: The prompt currently injects the raw memory blob
(memory_context) as authoritative; change it so memory_context is fenced with
explicit delimiters (e.g., "BEGIN MEMORY" / "END MEMORY") and prefaced as
reference-only data, and add a clear rule that "if any memory item conflicts
with the latest user message, the latest user message takes precedence"; apply
this change where memory_context is inserted (around the insert using the
memory_context template variable at the locations corresponding to lines 15-16
and 34-35) so the agent treats memories as non-authoritative reference material.
In `@src/cuga/backend/cuga_graph/nodes/chat/chat.py`:
- Around line 145-152: The code currently calls sync_user_memory whenever
settings.advanced_features.enable_fact is true; change the condition to require
persistent memory to be enabled as well by also checking
settings.advanced_features.enable_memory before importing/calling
sync_user_memory. Update the branch that contains sync_user_memory (the block
referencing memory_user_id, state.user_preferences and await
sync_user_memory(...)) to run only when both enable_fact and enable_memory are
true, so that sync_user_memory is not attempted if memory is disabled (leave the
try/except intact).
- Around line 155-163: Avoid logging raw PII: replace direct uses of
state.user_id in the logger.debug and logger.info calls with a nondeterministic
or hashed/anonymized identifier (e.g., call a helper like
anonymize_user_id(state.user_id) or mask_user_id(state.user_id) before logging).
Update the two occurrences that reference state.user_id (the
logger.debug(f"Stored categorized fact(s) for user {state.user_id}") and the
logger.info(...) that embeds state.user_id) to log the anonymized ID instead,
and add or reuse a small helper function (anonymize_user_id / mask_user_id) to
perform the masking or hashing consistently across the module.
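A sketch of the kind of masking helper the prompt asks for (the name `mask_user_id` and the 12-character truncation are illustrative choices, not existing code):

```python
import hashlib


def mask_user_id(user_id: str) -> str:
    """Derive a stable, non-reversible token for log lines.

    The same input always yields the same token, so log entries stay
    correlatable without exposing the raw identifier.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return f"user-{digest[:12]}"


# Usage at the two call sites:
# logger.debug(f"Stored categorized fact(s) for user {mask_user_id(state.user_id)}")
```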
In `@src/cuga/backend/memory/cli.py`:
- Around line 41-49: The helper _load_kaizen_exception_types currently only
catches ModuleNotFoundError when importing via
import_module("kaizen.schema.exceptions"), but getattr() calls for
KaizenException and NamespaceNotFoundException can raise AttributeError; update
the try/except to catch both ModuleNotFoundError and AttributeError around the
import+getattr sequence, and call _fail_missing_kaizen(exc) for either exception
so missing or renamed symbols produce the same friendly failure; reference
_load_kaizen_exception_types, import_module, getattr for KaizenException and
NamespaceNotFoundException, and _fail_missing_kaizen when applying the change.
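Sketched below under the assumption that `_fail_missing_kaizen` turns any load failure into a friendly exit; widening the clause to include `AttributeError` covers the renamed-symbol case:

```python
from importlib import import_module


def load_kaizen_exception_types(module_name: str = "kaizen.schema.exceptions"):
    """Return (KaizenException, NamespaceNotFoundException), or None on any load failure."""
    try:
        module = import_module(module_name)
        kaizen_exc = getattr(module, "KaizenException")
        namespace_exc = getattr(module, "NamespaceNotFoundException")
    except (ModuleNotFoundError, AttributeError):
        # Stand-in for _fail_missing_kaizen(exc): a missing package and a
        # renamed/removed symbol now take the same friendly failure path.
        return None
    return kaizen_exc, namespace_exc
```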
In `@src/cuga/backend/memory/memory.py`:
- Around line 245-256: The Kaizen write/read calls in sync_user_memory (calls to
kaizen_client.store_user_facts and kaizen_client.retrieve_user_facts) must be
fault-tolerant: wrap the store_user_facts call and the retrieve_user_facts call
in try/except blocks, log the exception (including context like
resolved_namespace, normalized_user_id, and query) and ensure
retrieve_user_facts falls back to an empty dict {} on error so a transient
Kaizen outage doesn't raise and abort the agent turn; keep the existing behavior
when calls succeed.
- Around line 33-57: The compatibility guard in memory.py only catches
ModuleNotFoundError for the block importing FilesystemSettings, KaizenConfig,
llm_settings, MilvusDBSettings, and KaizenClient, but failed symbol renames
raise ImportError, so change the except to catch both ModuleNotFoundError and
ImportError (e.g., except (ModuleNotFoundError, ImportError) as exc) so the
curated RuntimeError path is used when any import fails; keep the existing
util.find_spec/kaizen_version logic and re-raise the same RuntimeError from exc.
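Worth noting: `ModuleNotFoundError` subclasses `ImportError`, so the widened clause is equivalent to catching `ImportError` alone. A sketch of the guard shape (the function name and message are illustrative):

```python
from importlib import import_module


def import_memory_backend(module_name: str):
    try:
        return import_module(module_name)
    except (ModuleNotFoundError, ImportError) as exc:  # equivalent to: except ImportError
        # Failed symbol renames raise plain ImportError, so both cases now
        # reach the curated RuntimeError path instead of leaking upward.
        raise RuntimeError("Kaizen memory backend unavailable") from exc
```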
In `@src/cuga/backend/memory/utils/memory_tips_formatted.py`:
- Around line 31-37: The logger.warning calls (e.g., the one invoking
logger.warning with namespace_id, normalized_agent_id, (query or "")[:120], e)
are logging raw memory content and user queries; change these log messages to
never include the raw query or memory text and instead log only safe metadata
such as namespace_id, normalized_agent_id, and counts/lengths (e.g.,
query_length or result_count) plus the error object (e). Update every occurrence
of this pattern (the logger.warning calls around the shown diff and the similar
occurrences at the other two locations) to replace raw text with sanitized
fields like query_length and result_count or a masked placeholder.
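A sketch of the sanitized warning shape, logging only safe metadata (the field names `query_length`/`err` follow the prompt's suggestion and are not existing code):

```python
import logging

logger = logging.getLogger("memory_tips_formatted")


def log_tip_failure(namespace_id, agent_id, query, error):
    # Lengths and identifiers instead of raw text: the query and memory
    # content never reach the log stream.
    logger.warning(
        "tip retrieval failed namespace=%s agent=%s query_length=%d err=%r",
        namespace_id,
        agent_id,
        len(query or ""),
        error,
    )
```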
In `@src/cuga/backend/server/main.py`:
- Around line 212-216: The helper _get_memories_for_ui should normalize legacy
memory payloads in local_state.user_preferences before returning them: keep the
settings.advanced_features.enable_fact guard, but when user_preferences is a
dict and its values are not already lists (the old {fact_id: content} shape),
transform it into the categorized Record<string, any[]> shape expected by the UI
by mapping those entries into a category (e.g., "Facts") as an array of items
(include the original fact id and its content as object fields), and otherwise
return user_preferences unchanged when it already matches the categorized lists
shape.
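A sketch of the normalization the prompt describes (the category name "Facts" and the item fields are illustrative):

```python
from typing import Any


def normalize_memories_for_ui(user_preferences: Any) -> Any:
    """Lift the legacy {fact_id: content} shape into categorized lists.

    Already-categorized payloads (a dict whose values are lists) pass
    through untouched.
    """
    if not isinstance(user_preferences, dict) or not user_preferences:
        return user_preferences
    if all(isinstance(value, list) for value in user_preferences.values()):
        return user_preferences  # new shape: Record<string, any[]>
    return {
        "Facts": [
            {"id": fact_id, "content": content}
            for fact_id, content in user_preferences.items()
        ]
    }
```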
In `@src/cuga/configurations/models/settings.openai.toml`:
- Around line 51-64: Rename the four legacy TOML keys in settings.litellm.toml
so they match the new memory.kaizen.* keys that the runtime expects: change
memory.mem0.model -> memory.kaizen.conflict_resolution.model,
memory.milvus.step_processing.model -> memory.kaizen.conflict_resolution.model
(or merge if intended), memory.milvus.fact_extraction.model ->
memory.kaizen.fact_extraction.model, and memory.tips_extractor.model ->
memory.kaizen.tips.model; ensure the platform, temperature, and max_tokens
values are preserved under the new keys so the code in memory.py that reads
memory.kaizen.conflict_resolution.model, memory.kaizen.fact_extraction.model,
and memory.kaizen.tips.model will pick them up.
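The mapping reads more clearly side by side. The fragment below is a hypothetical "after" state; the `model` values are placeholders that should be carried over (along with `platform`, `temperature`, and `max_tokens`) from the existing keys:

```toml
# Old key                              -> New key
# memory.mem0.model                    -> memory.kaizen.conflict_resolution.model
# memory.milvus.step_processing.model  -> memory.kaizen.conflict_resolution.model (merged)
# memory.milvus.fact_extraction.model  -> memory.kaizen.fact_extraction.model
# memory.tips_extractor.model          -> memory.kaizen.tips.model

[memory.kaizen.conflict_resolution]
model = "<carry over existing model>"

[memory.kaizen.fact_extraction]
model = "<carry over existing model>"

[memory.kaizen.tips]
model = "<carry over existing model>"
```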
In `@src/frontend_workspaces/agentic_chat/src/VariablesSidebar.tsx`:
- Around line 60-72: The current rawFacts.forEach callback ignores facts that
lack a string content, dropping valid structured facts; change the extraction
inside the loop used by VariablesSidebar (the rawFacts.forEach and facts.push
logic) to compute content as: if fact.content is a non-empty string use it,
otherwise if fact.key or fact.value exist build a display string from those
(e.g. "key: value" or the non-empty key/value alone) and treat that as content
so you don't return early; keep id/key/value assignments unchanged (id from
fact.id, key from fact.key, value from fact.value) and only skip pushing when
neither content nor key/value exist.
---
Outside diff comments:
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py`:
- Around line 947-977: The code is mutating state.chat_messages by writing
injected blocks (memory_decision_context, playbook_guidance, variables_addendum)
back into HumanMessage objects, which pollutes the canonical user utterance used
later (memory_query); instead, stop updating state.chat_messages in
cuga_lite_graph and apply all augmentations only to the messages_for_model list
(or a separate augmented_messages list) passed to call_model(); preserve the
original content in state.chat_messages and only construct modified HumanMessage
instances for messages_for_model using memory_section_header,
memory_decision_context, playbook_guidance, and variables_addendum (also make
the same change for the similar block around lines 995-1023).
In `@src/cuga/backend/server/main.py`:
- Around line 1104-1110: The code is persisting full memory snapshots into the
stored answer events; update the logic that builds and appends the stored
payload (the object currently constructed with keys like "data" / event.answer,
"variables" / variables_metadata, "memories", "memories_count",
"active_policies") so that only a redacted version is persisted to
stream_events_buffer and the stream-events store—e.g., include memories_count
and safe identifiers or omit the "memories" field entirely—instead of embedding
full memories or final_answer_text that contains them; continue to send the
enriched payload (with full memories) to the live client but ensure the
persisted object (where stream_events_buffer, final_answer_text, event.answer,
variables_metadata, memories_count, active_policies are used) does not contain
personal memory content; apply the same redaction change to the other similar
block referenced around the 1123–1143 area.
In `@src/frontend_workspaces/agentic_chat/src/LeftSidebar.tsx`:
- Line 2: The file imports several icons from lucide-react but misses the Brain
icon used as <Brain size={16} /> in LeftSidebar.tsx; update the import statement
that currently lists MessageSquare, Database, ChevronLeft, ChevronRight, Plus,
Trash2, Workflow, Info, HelpCircle to also include Brain so the Brain symbol is
defined and the component renders correctly.
---
Minor comments:
In `@docs/examples/demo_apps/file_system/build/lib/main.py`:
- Around line 231-240: The current dispatch uses truthiness for head/tail so
values like 0 or negatives behave incorrectly; update the logic in the block
around validate_path/head_file/tail_file to check "is not None" for both head
and tail, validate that any provided head or tail is a positive integer (raise
ValueError for non-positive or non-integer inputs), keep the mutual-exclusion
check (raise if both are not None), and then call tail_file(valid_path, tail) or
head_file(valid_path, head) only when the corresponding parameter is explicitly
provided and validated.
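A minimal sketch of the corrected dispatch, using explicit `is not None` checks so `head=0` or a negative `tail` is rejected rather than silently ignored (the helper name and slicing behavior are illustrative):

```python
def read_slice(lines, head=None, tail=None):
    """Return the head/tail of `lines`, validating the parameters explicitly."""
    if head is not None and tail is not None:
        raise ValueError("head and tail are mutually exclusive")
    for name, value in (("head", head), ("tail", tail)):
        # bool is a subclass of int, so exclude it explicitly.
        if value is not None and (
            not isinstance(value, int) or isinstance(value, bool) or value <= 0
        ):
            raise ValueError(f"{name} must be a positive integer")
    if tail is not None:
        return lines[-tail:]
    if head is not None:
        return lines[:head]
    return lines
```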
In `@docs/memory/README.md`:
- Line 7: Fix the typo in the docs line containing "CUGA execution can be
enhanced..." by replacing the misspelled word "levearge" with "leverage" so the
sentence reads "...introduces the ability to leverage previously identified
insights..."; locate the phrase "CUGA execution can be enhanced by enabling
episodic memory" in docs/memory/README.md and update the single word.
- Line 61: Update the fenced code block that contains the directory listing line
"cuga/" to include a language specifier (e.g., change the opening ``` to ```text
or ```plaintext) so the block complies with MD040 and enables proper syntax
highlighting; locate the fenced block in the README that currently begins with
``` and replace it with ```text.
In `@README.md`:
- Around line 810-812: The README contains inconsistent demo start commands:
some places use "cuga start demo" while the walkthrough requires "cuga start
demo_crm --sample-memory-data"; update the README so both occurrences (and any
other mentions) consistently instruct users to run "cuga start demo_crm
--sample-memory-data" (or alternatively change the walkthrough to use "cuga
start demo" if that is the intended command), ensuring the example and the
subsequent walkthrough match and updating the step list and any references at
the other noted location to the chosen command.
In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2`:
- Line 6: In the pmt.jinja2 prompt template update the sentence "RetryClaude can
make mistakes." to a correct phrase such as "Claude can make mistakes" or "The
assistant can make mistakes"; locate the string inside the prompt template (the
memory-facts instruction block) and replace the token "RetryClaude" with the
chosen correct term so the prompt reads e.g. "Claude can make mistakes."
In `@src/cuga/backend/memory/__init__.py`:
- Around line 23-40: The ModuleNotFoundError handler around
import_module(module_name) should only treat errors that are about Kaizen
itself; inspect exc.name and if it does not refer to a kaizen module (e.g., not
starting with "kaizen" or not equal to the top-level module_name root) re-raise
the original ModuleNotFoundError so missing transitive deps surface correctly.
Only when exc.name indicates a kaizen-related module proceed with the
util.find_spec("kaizen") / version("kaizen") checks and raise the specialized
RuntimeError about Kaizen integration; otherwise immediately reraise exc.
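A sketch of the narrowed handler (the prefix check on `exc.name` is one reasonable way to detect kaizen-related failures; the message is illustrative):

```python
from importlib import import_module


def import_kaizen_module(module_name: str):
    """Convert only kaizen-related import failures into the curated RuntimeError."""
    try:
        return import_module(module_name)
    except ModuleNotFoundError as exc:
        if not (exc.name or "").startswith("kaizen"):
            raise  # a missing transitive dependency should surface as-is
        raise RuntimeError(
            "Kaizen integration is not installed or is incompatible"
        ) from exc
```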
In `@src/cuga/backend/memory/utils/memory_tips_formatted.py`:
- Around line 40-44: The code assumes the tips variable is list-like before
calling len(...) and list comprehensions (raw_count, raw_types, raw_ids and
subsequent filtering); guard and normalize the backend output first by checking
tips and converting it to a safe iterable (e.g., if tips is None set tips = [],
if it’s a single object wrap it in [tips], or if it’s not list-like try to cast
to list) before computing raw_count/raw_types/raw_ids and before filtering for
type == "guideline" so the rest of the logic (raw_count, raw_types, raw_ids, and
the tips = [ ... ] filter) never crashes on unexpected types.
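One way to sketch the normalization step (the single-object wrapping rules are a judgment call, not existing code):

```python
def normalize_tips(tips):
    """Coerce arbitrary backend output into a plain list before len()/filtering."""
    if tips is None:
        return []
    if isinstance(tips, list):
        return tips
    if isinstance(tips, (str, bytes, dict)):
        return [tips]  # a single object, not a collection of tips
    try:
        return list(tips)  # tuples, sets, generators
    except TypeError:
        return [tips]  # anything non-iterable
```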
---
Nitpick comments:
In `@pyproject.toml`:
- Around line 32-37: Remove the duplicate unversioned dependency "litellm" in
pyproject.toml and keep only the explicit versioned entry "litellm>=1.0.0"; edit
the dependency list to delete the bare "litellm" line so the file contains a
single clear declaration ("litellm>=1.0.0") and re-run dependency resolution to
verify no other duplicates remain.
In `@src/cuga/backend/activity_tracker/tracker.py`:
- Around line 652-696: The try/except currently wraps the whole loop so a
failure in generate_tips or self.memory.update_entities for one agent aborts
processing for all agents; move the try/except inside the for agent_id,
agent_trajectory_messages in agent_scoped_trajectories loop so each agent's work
is isolated, i.e., call generate_tips(agent_trajectory_messages) and the
subsequent unique_tips -> entities -> self.memory.update_entities within a
per-agent try block, catch exceptions there (using logger.exception) including
agent_id and user_id in the log, and continue to the next agent so one bad agent
doesn't drop remaining guideline writes; keep the outer early-return checks and
use the same functions/methods: _build_agent_scoped_trajectory_messages,
generate_tips, and self.memory.update_entities.
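The per-agent isolation can be sketched as follows; `generate_tips` and `update_entities` are passed in as callables here purely for illustration:

```python
import logging

logger = logging.getLogger("tracker")


def persist_guidelines(agent_scoped_trajectories, generate_tips, update_entities, user_id):
    """Write guidelines per agent; one failing agent doesn't abort the rest."""
    written = []
    for agent_id, messages in agent_scoped_trajectories:
        try:
            tips = generate_tips(messages)
            update_entities(tips)
            written.append(agent_id)
        except Exception:
            logger.exception(
                "guideline write failed agent=%s user=%s", agent_id, user_id
            )
            continue  # move on to the next agent
    return written
```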
- Around line 807-826: collect_step currently calls self.memory.update_entities
inline which can block step execution; instead, create a deferred task to
perform the memory write asynchronously. Wrap the update_entities call
(including the Entity payload and enable_conflict_resolution=False) in an async
helper or use asyncio.to_thread if update_entities is synchronous, then schedule
it with asyncio.create_task(...) so the call returns immediately; reference the
collect_step function and the memory.update_entities invocation along with the
Entity metadata (timestamp, step_index, step_name, intent, user_id, task_id,
session_id) to ensure the same payload is passed to the background task.
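The deferral pattern can be sketched like this, assuming `update_entities` is synchronous (hence `asyncio.to_thread`); keeping a reference to the task prevents it from being garbage-collected mid-flight:

```python
import asyncio


async def collect_step(update_entities_sync, payload, background_tasks):
    """Record a step and schedule the memory write off the hot path."""
    task = asyncio.create_task(asyncio.to_thread(update_entities_sync, payload))
    background_tasks.add(task)  # hold a reference so the task isn't GC'd
    task.add_done_callback(background_tasks.discard)
    return "step recorded"  # returns immediately; the write completes later
```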
In
`@src/cuga/backend/cuga_graph/nodes/api/api_planner_agent/api_planner_agent.py`:
- Around line 220-257: The method _format_preferences_for_prompt currently
appends structured_facts into lines and then iterates preferences to re-emit the
same facts under category headings, duplicating context; modify the logic so
that when structured_facts is non-empty you skip the categorized preferences
dump (or alternatively only include categories for facts not present in
structured_facts). Concretely, inside _format_preferences_for_prompt, after
building the "Structured Facts For Decision Making" block (using
APIPlannerAgent._format_fact_pointer, _escape_inline_code and _safe_text), add a
guard around the subsequent preferences handling (the branch that checks
first_value and iterates preferences) to return or skip that block if
structured_facts exists, or filter each category's facts by excluding items
whose pointer/id matches any pointer built from structured_facts before
appending to lines. Ensure you reference structured_facts, preferences, and the
lines list used to join the final prompt.
In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2`:
- Around line 3-4: The template currently always renders the "Persistent memory
facts about the user:" header even when memory_context is empty; update the
pmt.jinja2 template to wrap the header and the {{memory_context}} output in a
Jinja2 conditional that checks if memory_context is defined and non-empty (e.g.,
if memory_context), so the whole block is omitted when there are no memory
facts; ensure you reference the existing memory_context variable and keep the
rest of the prompt intact.
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_node.py`:
- Around line 283-285: The info-level log in CugaLiteNode that mentions fact
storage is too verbose for production; update the call to use logger.debug
instead of logger.info (referencing the logger.info call in CugaLiteNode that
mentions state.user_preferences), or gate the message behind an explicit feature
flag/setting (e.g., a verbose_memory_tracing or DEBUG_MEMORY flag) so the
message only emits when that flag is enabled; ensure the check or log level
change is applied where the current logger.info is invoked so other info logs
remain unaffected.
In `@src/cuga/backend/cuga_graph/state/agent_state.py`:
- Around line 1099-1102: The catch block currently logs only the exception
message when loading preferences ("except Exception as e: logger.error(f\"Error
loading user preferences: {e}\")") which hides the exception type; update the
logging to include the exception class/type (e.g., include type(e) or
e.__class__.__name__) so the logger.error call for the preference-loading
failure (the block that sets state.user_preferences = {}) records both the
message and the exception type to aid debugging while keeping the broad
Exception catch.
In `@src/frontend_workspaces/agentic_chat/src/CardManager.tsx`:
- Around line 369-373: The current entryChanged calculation in CardManager (the
entryChanged constant comparing existing.title, existing.timestamp, and
JSON.stringify(existing.variables)/JSON.stringify(existing.memories) to entry.*)
uses JSON.stringify for deep equality which can be expensive for large objects;
change it to avoid repeated stringification by either (a) memoizing the
stringified representation on the objects when they are created/updated and
compare those cached strings (e.g., add
cachedVariablesString/cachedMemoriesString fields or compute once and reuse), or
(b) replace the JSON.stringify comparisons with a dedicated deep-equal utility
(e.g., lodash.isEqual or a small deepEqual function) and use
isEqual(existing.variables, entry.variables) and isEqual(existing.memories,
entry.memories) inside the entryChanged calculation. Ensure you update/maintain
the cache whenever variables/memories are modified and reference the
entryChanged constant to locate where to apply the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
```python
def validate_path(file_path: str) -> str:
    """
    Validate that a path is within allowed directories.
    Returns the normalized, resolved path if valid.
    Raises ValueError if path is outside allowed directories.
    """
    # Expand and resolve the path
    expanded = expand_home(file_path)
    resolved = normalize_path(expanded)

    # Check if path is within any allowed directory
    for allowed_dir in allowed_directories:
        if resolved.startswith(allowed_dir + '/') or resolved == allowed_dir:
            return resolved

    raise ValueError(f"Access denied: {file_path} is outside allowed directories")
```
Harden validate_path before relying on it as the security boundary.
The current string-prefix check breaks for filesystem-root allowlists (/, C:/), and every mutating tool still performs the actual open/rename/makedirs on the returned pathname later. That leaves a TOCTOU window where a symlink swap can escape the allowlist. Prefer a real ancestor check (Path.relative_to / os.path.commonpath) and revalidate at the syscall boundary for write/move operations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 48 - 63,
validate_path's string-prefix check is unsafe (fails for root allowlists and is
vulnerable to TOCTOU via symlink swaps); update validate_path to canonicalize
both the candidate (from expand_home/normalize_path) and each
allowed_directories entry as pathlib.Path.resolve(...) and then perform a true
ancestor check using Path.relative_to or os.path.commonpath to confirm the
candidate is inside an allowed_dir (handle root like "/" or "C:/" correctly by
canonicalizing first), and in addition revalidate the resolved path at the
syscall boundary for any mutating operation (open/rename/makedirs) by resolving
the path again immediately before the syscall and/or using safe syscalls (e.g.,
os.open with O_NOFOLLOW or compare parent directory inodes) to prevent symlink
races; reference validate_path, expand_home, normalize_path, and
allowed_directories when making these changes.
```python
# Generate diff
if original_content == content:
    return "No changes made"

diff_lines = []
diff_lines.append(f"--- {file_path}")
diff_lines.append(f"+++ {file_path}")

original_lines = original_content.splitlines(keepends=True)
new_lines = content.splitlines(keepends=True)

# Simple diff generation
for i, (old_line, new_line) in enumerate(zip(original_lines, new_lines)):
    if old_line != new_line:
        diff_lines.append(f"- {old_line.rstrip()}")
        diff_lines.append(f"+ {new_line.rstrip()}")

status = "Dry run - no changes made" if dry_run else "Changes applied successfully"
return f"{status}\n\n" + '\n'.join(diff_lines)
```
The generated diff is incomplete for insertions and deletions.
This zip-based comparison only walks the shorter side, so appending/removing lines—especially at EOF—can return an empty or truncated diff even when the file changed. difflib.unified_diff is a better fit here.
🧰 Tools
🪛 Ruff (0.15.7)
[warning] 158-158: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
[warning] 158-158: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 146 -
164, The current diff generation in the block that compares original_content and
content (using original_lines/new_lines and a zip-based loop) misses
insertions/deletions because zip stops at the shorter list; replace this manual
loop with difflib.unified_diff to produce a correct unified diff (import
difflib, call difflib.unified_diff(original_lines, new_lines, fromfile=f"---
{file_path}", tofile=f"+++ {file_path}", lineterm="") ), preserve the existing
status message logic around dry_run, and return the joined unified_diff lines
instead of the zip-generated diff so insertions, deletions and EOF changes are
captured correctly.
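The replacement suggested by the prompt can be sketched as a small helper around `difflib.unified_diff` (the wrapper name is illustrative; the stdlib call is real):

```python
import difflib


def unified_file_diff(original: str, updated: str, file_path: str) -> str:
    """Full-fidelity diff that covers insertions/deletions past the shorter file."""
    lines = difflib.unified_diff(
        original.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=file_path,
        tofile=file_path,
    )
    return "".join(lines)
```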
```python
# Handle glob patterns
if '**' in pattern:
    # Use glob for recursive patterns
    glob_pattern = os.path.join(root_path, pattern)
    matches = glob.glob(glob_pattern, recursive=True)

    for match in matches:
        rel_path = os.path.relpath(match, root_path)

        # Check exclusions
        excluded = False
        for exclude in exclude_patterns:
            if (
                fnmatch.fnmatch(rel_path, exclude)
                or fnmatch.fnmatch(rel_path, f"**/{exclude}")
                or fnmatch.fnmatch(rel_path, f"**/{exclude}/**")
            ):
                excluded = True
                break

        if not excluded:
            results.append(match)
else:
    # Simple pattern in current directory
    for item in os.listdir(root_path):
        item_path = os.path.join(root_path, item)
        if fnmatch.fnmatch(item, pattern):
            results.append(item_path)

return results
```
Non-recursive glob patterns with path segments never match.
When pattern does not contain **, the code only compares it against the basenames returned by os.listdir(root_path). Patterns like src/*.py therefore always miss, and exclude_patterns are ignored in this branch.
🧭 Suggested fix
```diff
-        else:
-            # Simple pattern in current directory
-            for item in os.listdir(root_path):
-                item_path = os.path.join(root_path, item)
-                if fnmatch.fnmatch(item, pattern):
-                    results.append(item_path)
+        else:
+            matches = glob.glob(os.path.join(root_path, pattern), recursive=False)
+            for match in matches:
+                rel_path = os.path.relpath(match, root_path)
+                excluded = any(
+                    fnmatch.fnmatch(rel_path, exclude)
+                    or fnmatch.fnmatch(rel_path, f"**/{exclude}")
+                    or fnmatch.fnmatch(rel_path, f"**/{exclude}/**")
+                    for exclude in exclude_patterns
+                )
+                if not excluded:
+                    results.append(match)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 176 -
205, The non-recursive branch only matches basenames from os.listdir(root_path)
so patterns containing path segments (e.g., "src/*.py") never match and
exclude_patterns aren't applied; change the else branch in main.py to glob
against os.path.join(root_path, pattern) (using glob.glob with recursive=False
or without "**") to get full paths, then apply the same exclusion logic used in
the '**' branch (checking rel_path against exclude_patterns with fnmatch.fnmatch
and the ** variants) before appending to results so path-segment patterns and
exclusions behave consistently (update references: pattern, root_path, results,
exclude_patterns, rel_path).
```python
def build_tree(current_path: str, root_path: str) -> List[Dict[str, Any]]:
    valid_path = validate_path(current_path)
    entries = os.listdir(valid_path)
    result = []

    for entry in sorted(entries):
        entry_path = os.path.join(current_path, entry)
        rel_path = os.path.relpath(entry_path, root_path)

        # Check exclusions
        excluded = False
        for pattern in excludePatterns:
            if (
                fnmatch.fnmatch(rel_path, pattern)
                or fnmatch.fnmatch(rel_path, f"**/{pattern}")
                or fnmatch.fnmatch(rel_path, f"**/{pattern}/**")
            ):
                excluded = True
                break

        if excluded:
            continue

        is_dir = os.path.isdir(entry_path)
        entry_data = {'name': entry, 'type': 'directory' if is_dir else 'file'}

        if is_dir:
            entry_data['children'] = build_tree(entry_path, root_path)
```
Guard directory_tree against symlink loops.
os.path.isdir(entry_path) follows symlinks, so a loop such as loop -> .. will recurse until RecursionError or hang the request. Either skip symlinked directories or track visited realpaths/inodes before descending.
🔁 Suggested fix
```diff
-    def build_tree(current_path: str, root_path: str) -> List[Dict[str, Any]]:
+    def build_tree(current_path: str, root_path: str, seen: set[str]) -> List[Dict[str, Any]]:
         valid_path = validate_path(current_path)
+        real_current = os.path.realpath(valid_path)
+        if real_current in seen:
+            return []
+        seen.add(real_current)
         entries = os.listdir(valid_path)
         result = []
@@
-        is_dir = os.path.isdir(entry_path)
+        is_dir = os.path.isdir(entry_path) and not os.path.islink(entry_path)
         entry_data = {'name': entry, 'type': 'directory' if is_dir else 'file'}
         if is_dir:
-            entry_data['children'] = build_tree(entry_path, root_path)
+            entry_data['children'] = build_tree(entry_path, root_path, seen)
@@
-    tree_data = build_tree(root_path, root_path)
+    tree_data = build_tree(root_path, root_path, set())
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 443 -
470, The build_tree function can recurse into symlinked directories causing
infinite loops; modify build_tree (and its caller/initialization) to detect and
avoid symlink loops by tracking visited real paths or inodes: before descending
from entry_path (where os.path.isdir(entry_path) is checked), resolve real =
os.path.realpath(entry_path) (or use os.stat to get st_ino/st_dev) and maintain
a visited set passed through recursive calls; if real (or inode tuple) is
already in visited or if os.path.islink(entry_path) and you prefer to skip
symlinks, skip adding children and do not recurse; ensure validate_path usage
remains and update build_tree signature to accept the visited set if needed.
```python
async def _persist_guidelines_from_trajectory(
    self,
    *,
    namespace_id: str,
    trajectory_messages: list[dict[str, str]],
    user_id: str,
) -> None:
```
Snapshot task_id and session_id before queuing the guideline job.
This coroutine runs after collect_step() returns, but it still reads self.task_id and self.session_id from the singleton when building entity metadata. If the tracker is reset for the next task first, guidelines from one trajectory can be written under the next task/session.
🔧 Suggested fix
```diff
 async def _persist_guidelines_from_trajectory(
     self,
     *,
     namespace_id: str,
     trajectory_messages: list[dict[str, str]],
     user_id: str,
+    task_id: str,
+    session_id: str,
 ) -> None:
@@
 metadata={
     "category": tip.category,
     "rationale": tip.rationale,
     "trigger": tip.trigger,
     "agent_id": agent_id,
     "user_id": user_id,
-    "task_id": self.task_id,
-    "session_id": self.session_id,
+    "task_id": task_id,
+    "session_id": session_id,
     "task_description": result.task_description,
 },
@@
 task = asyncio.create_task(
     self._persist_guidelines_from_trajectory(
         namespace_id=namespace_id,
         trajectory_messages=copy.deepcopy(self.trajectory_messages),
         user_id=normalized_user_id,
+        task_id=self.task_id,
+        session_id=self.session_id,
     )
 )
```
Also applies to: 671-679, 832-836
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/backend/activity_tracker/tracker.py` around lines 641 - 647,
_snapshot self.task_id and self.session_id into local variables at the start of
_persist_guidelines_from_trajectory and use those local variables everywhere you
build guideline entity metadata or enqueue the guideline job instead of reading
self.task_id/self.session_id later; do the same change for the other coroutines
that build/enqueue guideline jobs (the blocks referenced at lines 671-679 and
832-836) so the queued job uses the captured task_id/session_id values rather
than potentially changed singleton attributes.
```python
if query:
    kaizen_client.store_user_facts(
        namespace_id=resolved_namespace,
        message=query,
        user_id=normalized_user_id,
    )

# Extract facts from the response (assuming similar structure to old implementation)
facts = [fact.content for fact in recorded_facts]

# Print debug info (maintaining original behavior)
print(query)
print("------ICLs--------")
for f in facts:
    print(f)

return facts

def create_run(self, namespace_id: str, run_id: str | None = None) -> Run:
    """Create a new run to track Agent steps."""
    return self.memory_client.create_run(namespace_id, run_id)

def get_run(self, namespace_id: str, run_id: str) -> Run:
    """Get an existing run."""
    return self.memory_client.get_run(namespace_id, run_id)

def delete_run(self, namespace_id: str, run_id: str):
    """Delete an existing run."""
    return self.memory_client.delete_run(namespace_id, run_id)

def search_runs(
    self, namespace_id: str, query: str | None = None, filters: dict[str, str] | None = None
) -> Run | None:
    """Search a namespace for a run based on its step which best matches a query."""
    return self.memory_client.search_runs(namespace_id, query, filters)

async def end_run(self, namespace_id: str, run_id: str):
    """End an existing run."""
    return await self.memory_client.end_run(namespace_id, run_id)

def add_step(self, namespace_id: str, run_id: str, step: dict, prompt: str) -> MemoryEvent:
    """Add a new step into a run."""
    return self.memory_client.add_step(namespace_id, run_id, step, prompt)

def list_runs(self, namespace_id: str, limit: int = 10) -> list[Run]:
    """Retrieve the list of runs in a namespace."""
    return self.memory_client.list_runs(namespace_id, limit)

def _get_user_id(self, state: "AgentState") -> str:
    """Extract or generate user ID for memory scoping"""
    # Use the pi field from AgentState
    if hasattr(state, 'pi') and state.pi:
        pi_dict = json.loads(state.pi)
        state.user_id = str(f"{pi_dict['first_name']}_{pi_dict['last_name']}_{pi_dict['phone_number']}")
    else:
        state.user_id = "default_user"
    self.user_id = state.user_id
    return state.user_id

# ...

preferences = kaizen_client.retrieve_user_facts(
    namespace_id=resolved_namespace,
    user_id=normalized_user_id,
    query=query,
)
```
Guard the Kaizen write/read path so memory outages don't fail the whole turn.
store_user_facts() and retrieve_user_facts() are request-path external calls, but any transient backend error currently bubbles out of sync_user_memory() and aborts the agent response. This path should log and fall back to {} so memory degradation doesn't become a hard outage.
🛟 Graceful fallback
resolved_namespace = namespace_id or get_kaizen_namespace_id()
- if query:
- kaizen_client.store_user_facts(
- namespace_id=resolved_namespace,
- message=query,
- user_id=normalized_user_id,
- )
-
- preferences = kaizen_client.retrieve_user_facts(
- namespace_id=resolved_namespace,
- user_id=normalized_user_id,
- query=query,
- )
- return normalized_user_id, preferences
+ try:
+ if query:
+ kaizen_client.store_user_facts(
+ namespace_id=resolved_namespace,
+ message=query,
+ user_id=normalized_user_id,
+ )
+
+ preferences = kaizen_client.retrieve_user_facts(
+ namespace_id=resolved_namespace,
+ user_id=normalized_user_id,
+ query=query,
+ )
+ except Exception:
+ return normalized_user_id, {}
+
+ return normalized_user_id, preferences or {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/backend/memory/memory.py` around lines 245 - 256, The Kaizen
write/read calls in sync_user_memory (calls to kaizen_client.store_user_facts
and kaizen_client.retrieve_user_facts) must be fault-tolerant: wrap the
store_user_facts call and the retrieve_user_facts call in try/except blocks, log
the exception (including context like resolved_namespace, normalized_user_id,
and query) and ensure retrieve_user_facts falls back to an empty dict {} on
error so a transient Kaizen outage doesn't raise and abort the agent turn; keep
the existing behavior when calls succeed.
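A runnable sketch of the degrade-gracefully pattern the comment asks for. `FlakyClient` below is a hypothetical stand-in for the real Kaizen client, there only to simulate an outage; the shape of interest is the try/except that logs and falls back to `{}` instead of letting the exception abort the turn.

```python
import logging

logger = logging.getLogger(__name__)


class FlakyClient:
    """Hypothetical client whose calls raise during a backend outage."""

    def store_user_facts(self, namespace_id, message, user_id):
        raise ConnectionError("memory backend unavailable")

    def retrieve_user_facts(self, namespace_id, user_id, query):
        raise ConnectionError("memory backend unavailable")


def sync_user_memory(client, namespace_id, user_id, query=None):
    """Write and read user facts, degrading to {} instead of raising."""
    try:
        if query:
            client.store_user_facts(
                namespace_id=namespace_id, message=query, user_id=user_id
            )
        preferences = client.retrieve_user_facts(
            namespace_id=namespace_id, user_id=user_id, query=query
        )
    except Exception as exc:
        # Log metadata only; the agent turn continues without memory.
        logger.warning("memory sync failed user=%s error=%s", user_id, exc)
        return user_id, {}
    return user_id, preferences or {}


user_id, prefs = sync_user_memory(FlakyClient(), "ns-1", "alice", query="likes tea")
print(user_id, prefs)  # alice {}
```

When the backend is healthy the same function returns the retrieved preferences unchanged, so callers never need to distinguish the two cases.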
logger.warning(
    "Memory guideline retrieval failed namespace_id='{}' agent_id='{}' query='{}' error={}",
    namespace_id,
    normalized_agent_id,
    (query or "")[:120],
    e,
)
Stop logging raw memory content and user query text.
Lines 31-37, 55-59, and 71-74 can expose sensitive user facts/PII in logs. Keep only counts, namespace, and agent metadata.
🔒 Suggested fix
- logger.warning(
- "Memory guideline retrieval failed namespace_id='{}' agent_id='{}' query='{}' error={}",
- namespace_id,
- normalized_agent_id,
- (query or "")[:120],
- e,
- )
+ logger.warning(
+ "Memory guideline retrieval failed namespace_id='{}' agent_id='{}' error={}",
+ namespace_id,
+ normalized_agent_id,
+ e,
+ )
@@
- logger.debug(
- "Memory guideline retrieval: no results namespace_id='{}' agent_id='{}' query='{}' limit={}",
- namespace_id,
- normalized_agent_id,
- (query or "")[:120],
- limit,
- )
+ logger.debug(
+ "Memory guideline retrieval: no results namespace_id='{}' agent_id='{}' limit={}",
+ namespace_id,
+ normalized_agent_id,
+ limit,
+ )
@@
- logger.debug(
- "Memory guideline retrieval: {} result(s) namespace_id='{}' agent_id='{}' query='{}' limit={}",
- len(tips),
- namespace_id,
- normalized_agent_id,
- (query or "")[:120],
- limit,
- )
+ logger.debug(
+ "Memory guideline retrieval: {} result(s) namespace_id='{}' agent_id='{}' limit={}",
+ len(tips),
+ namespace_id,
+ normalized_agent_id,
+ limit,
+ )
@@
- logger.debug(
- "Memory guideline retrieval tips:\n{}",
- "\n".join(lines),
- )
+ logger.debug("Memory guideline retrieval produced {} formatted tip(s)", len(lines))Also applies to: 55-59, 71-74
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/backend/memory/utils/memory_tips_formatted.py` around lines 31 - 37,
The logger.warning calls (e.g., the one invoking logger.warning with
namespace_id, normalized_agent_id, (query or "")[:120], e) are logging raw
memory content and user queries; change these log messages to never include the
raw query or memory text and instead log only safe metadata such as
namespace_id, normalized_agent_id, and counts/lengths (e.g., query_length or
result_count) plus the error object (e). Update every occurrence of this pattern
(the logger.warning calls around the shown diff and the similar occurrences at
the other two locations) to replace raw text with sanitized fields like
query_length and result_count or a masked placeholder.
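One way to make the sanitized pattern hard to get wrong is to build the log fields in a single helper so no call site can slip raw text in. The field names below (`query_length`, `result_count`) are illustrative, not the project's schema.

```python
def safe_log_fields(namespace_id, agent_id, query, tips):
    """Build log fields that carry only metadata, never user content."""
    return {
        "namespace_id": namespace_id,
        "agent_id": agent_id,
        "query_length": len(query or ""),
        "result_count": len(tips),
    }


fields = safe_log_fields("ns-1", "plan_agent", "what is my billing plan?", ["tip"])
print(fields["query_length"], fields["result_count"])  # 24 1
```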
def _get_memories_for_ui(local_state: AgentState) -> Dict[str, List[Dict[str, Any]]]:
    """Return memories directly from state.user_preferences for UI responses."""
    if not settings.advanced_features.enable_fact:
        return {}
    return local_state.user_preferences if isinstance(local_state.user_preferences, dict) else {}
Normalize legacy memory payloads before returning them to the UI.
The rest of this PR still supports the old {fact_id: content} shape, but this helper now forwards state.user_preferences verbatim as Record<string, any[]>. Historical or migrated threads will therefore show an empty Memories panel and memories_count = 0 even though the agent can still use those facts. Normalize to the categorized list shape here before serializing.
🧩 Possible normalization
def _get_memories_for_ui(local_state: AgentState) -> Dict[str, List[Dict[str, Any]]]:
"""Return memories directly from state.user_preferences for UI responses."""
if not settings.advanced_features.enable_fact:
return {}
- return local_state.user_preferences if isinstance(local_state.user_preferences, dict) else {}
+ raw = local_state.user_preferences
+ if not isinstance(raw, dict):
+ return {}
+
+ from collections import defaultdict
+ from cuga.backend.cuga_graph.state.user_preferences_context import extract_structured_facts
+
+ grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
+ for fact in extract_structured_facts(raw):
+ grouped[fact["category"]].append(fact)
+ return dict(grouped)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/backend/server/main.py` around lines 212 - 216, The helper
_get_memories_for_ui should normalize legacy memory payloads in
local_state.user_preferences before returning them: keep the
settings.advanced_features.enable_fact guard, but when user_preferences is a
dict and its values are not already lists (the old {fact_id: content} shape),
transform it into the categorized Record<string, any[]> shape expected by the UI
by mapping those entries into a category (e.g., "Facts") as an array of items
(include the original fact id and its content as object fields), and otherwise
return user_preferences unchanged when it already matches the categorized lists
shape.
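A self-contained sketch of the normalization described above. The "Facts" bucket name and the per-fact field layout are assumptions for illustration, not the project's actual schema; already-categorized payloads pass through untouched.

```python
from collections import defaultdict


def normalize_memories(raw):
    """Coerce a user_preferences payload into the {category: [fact, ...]} UI shape."""
    if not isinstance(raw, dict):
        return {}
    if all(isinstance(v, list) for v in raw.values()):
        return raw  # already the categorized Record<string, any[]> shape
    # Legacy flat {fact_id: content} shape: group under a single bucket.
    grouped = defaultdict(list)
    for fact_id, content in raw.items():
        grouped["Facts"].append(
            {"id": str(fact_id), "category": "Facts", "content": str(content)}
        )
    return dict(grouped)


legacy = {"f1": "prefers dark mode", "f2": "timezone UTC+2"}
print(len(normalize_memories(legacy)["Facts"]))  # 2
```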
[memory.kaizen.conflict_resolution.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.milvus.step_processing.model]
[memory.kaizen.fact_extraction.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.milvus.fact_extraction.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.tips_extractor.model]
[memory.kaizen.tips.model]
platform = "openai"
temperature = 0.1
max_tokens = 5000
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -n -C2 'memory\.(mem0\.model|milvus\.step_processing\.model|milvus\.fact_extraction\.model|tips_extractor\.model|kaizen\.(conflict_resolution|fact_extraction|tips)\.model)' || true

Repository: cuga-project/cuga-agent
Length of output: 10932
Update settings.litellm.toml to use the new memory.kaizen.* configuration keys.
The runtime code (src/cuga/backend/memory/memory.py lines 135, 139, 143) reads only from memory.kaizen.conflict_resolution.model, memory.kaizen.fact_extraction.model, and memory.kaizen.tips.model. However, settings.litellm.toml still defines the retired keys: memory.mem0.model, memory.milvus.step_processing.model, memory.milvus.fact_extraction.model, and memory.tips_extractor.model. These legacy entries will be silently ignored at runtime, and the litellm provider will fall back to defaults. Rename all four keys in settings.litellm.toml to match the new naming scheme used in all other provider configurations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/configurations/models/settings.openai.toml` around lines 51 - 64,
Rename the four legacy TOML keys in settings.litellm.toml so they match the new
memory.kaizen.* keys that the runtime expects: change memory.mem0.model ->
memory.kaizen.conflict_resolution.model, memory.milvus.step_processing.model ->
memory.kaizen.conflict_resolution.model (or merge if intended),
memory.milvus.fact_extraction.model -> memory.kaizen.fact_extraction.model, and
memory.tips_extractor.model -> memory.kaizen.tips.model; ensure the platform,
temperature, and max_tokens values are preserved under the new keys so the code
in memory.py that reads memory.kaizen.conflict_resolution.model,
memory.kaizen.fact_extraction.model, and memory.kaizen.tips.model will pick them
up.
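For reference, a sketch of what the three stanzas the runtime actually reads would look like in settings.litellm.toml, carrying over the platform/temperature/max_tokens values already present in this file (how the retired milvus/tips_extractor entries map onto them is the author's call):

```toml
[memory.kaizen.conflict_resolution.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.kaizen.fact_extraction.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.kaizen.tips.model]
platform = "openai"
temperature = 0.1
max_tokens = 5000
```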
rawFacts.forEach((fact: any) => {
  const content = typeof fact?.content === "string" ? fact.content : "";
  if (!content) {
    return;
  }

  facts.push({
    id: typeof fact?.id === "string" ? fact.id : undefined,
    category,
    content,
    key: typeof fact?.key === "string" ? fact.key : null,
    value: typeof fact?.value === "string" ? fact.value : null,
  });
Don't drop structured facts that only have key/value.
The shared memory formatter treats a fact as valid when any of key, value, or content is present, but this extractor returns early unless content is non-empty. Kaizen facts that come back as structured pairs will disappear from the sidebar even though the backend still counts them. Build a display string from key/value when content is missing.
🧩 Suggested fallback
rawFacts.forEach((fact: any) => {
- const content = typeof fact?.content === "string" ? fact.content : "";
+ const key = typeof fact?.key === "string" ? fact.key : "";
+ const value = typeof fact?.value === "string" ? fact.value : "";
+ const content =
+ typeof fact?.content === "string" && fact.content
+ ? fact.content
+ : [key, value].filter(Boolean).join(": ");
if (!content) {
return;
}
facts.push({
id: typeof fact?.id === "string" ? fact.id : undefined,
category,
content,
- key: typeof fact?.key === "string" ? fact.key : null,
- value: typeof fact?.value === "string" ? fact.value : null,
+ key: key || null,
+ value: value || null,
});
});🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/frontend_workspaces/agentic_chat/src/VariablesSidebar.tsx` around lines
60 - 72, The current rawFacts.forEach callback ignores facts that lack a string
content, dropping valid structured facts; change the extraction inside the loop
used by VariablesSidebar (the rawFacts.forEach and facts.push logic) to compute
content as: if fact.content is a non-empty string use it, otherwise if fact.key
or fact.value exist build a display string from those (e.g. "key: value" or the
non-empty key/value alone) and treat that as content so you don't return early;
keep id/key/value assignments unchanged (id from fact.id, key from fact.key,
value from fact.value) and only skip pushing when neither content nor key/value
exist.