📝 Walkthrough

This PR migrates the agentic memory backend to Kaizen.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant ChatNode
    participant sync_fn as sync_user_memory()
    participant KaizenClient
    participant ActivityTracker
    participant FinalAnswer
    participant Guidelines

    ChatNode->>sync_fn: call with user_id, query
    sync_fn->>KaizenClient: store_user_facts(query)
    sync_fn->>KaizenClient: retrieve_user_facts(user_id)
    KaizenClient-->>sync_fn: preferences dict
    sync_fn-->>ChatNode: (user_id, preferences)
    ChatNode->>ChatNode: update state.user_preferences

    Note over ActivityTracker: Per step execution
    ChatNode->>ActivityTracker: track step with data
    ActivityTracker->>ActivityTracker: build trajectory message
    ActivityTracker->>KaizenClient: update_entities(trajectory)
    KaizenClient-->>ActivityTracker: ack

    Note over FinalAnswer: On task completion
    ChatNode->>FinalAnswer: reached
    FinalAnswer->>Guidelines: _persist_guidelines_from_trajectory()
    Guidelines->>Guidelines: group trajectory by agent
    Guidelines->>KaizenClient: generate_tips per agent
    KaizenClient-->>Guidelines: tips
    Guidelines->>Guidelines: deduplicate by content
    Guidelines->>KaizenClient: update_entities(guideline)
    KaizenClient-->>Guidelines: ack
    Guidelines-->>FinalAnswer: done
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Suggested reviewers
Actionable comments posted: 18
Note
Due to the large number of review comments, comments of Critical and Major severity were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/frontend_workspaces/agentic_chat/src/LeftSidebar.tsx (1)
2-2: ⚠️ Potential issue | 🔴 Critical

Import `Brain` from `lucide-react` to fix the undefined reference error. Line 245 uses `<Brain size={16} />`, but `Brain` is not imported on Line 2.

🔧 Suggested fix

```diff
-import { MessageSquare, Database, ChevronLeft, ChevronRight, Plus, Trash2, Workflow, Info, HelpCircle } from "lucide-react";
+import { MessageSquare, Database, Brain, ChevronLeft, ChevronRight, Plus, Trash2, Workflow, Info, HelpCircle } from "lucide-react";
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/frontend_workspaces/agentic_chat/src/LeftSidebar.tsx` at line 2, The file imports several icons from lucide-react but misses the Brain icon used as `<Brain size={16} />` in LeftSidebar.tsx; update the import statement that currently lists MessageSquare, Database, ChevronLeft, ChevronRight, Plus, Trash2, Workflow, Info, HelpCircle to also include Brain so the Brain symbol is defined and the component renders correctly.

src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py (1)
947-977: ⚠️ Potential issue | 🟠 Major

Keep the injected memory block out of `state.chat_messages`.

`call_model()` recomputes `memory_query` from `state.chat_messages` on every sandbox→model cycle. Once these branches write `## Memory Decision Context` back into the stored human message, the next pass ranks facts against the injected memory text instead of the user's original request. Only augment `messages_for_model`, or keep the raw user utterance separately.

Also applies to: 995-1023
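A minimal sketch of the suggested separation, using plain role/content dicts as a stand-in for the real message types (`build_messages_for_model` is an illustrative helper, not the CUGA API):

```python
# Illustrative only: plain dicts stand in for the real chat message types,
# and build_messages_for_model is a hypothetical helper, not CUGA code.
def build_messages_for_model(chat_messages: list[dict], memory_block: str) -> list[dict]:
    """Return an augmented copy for the model call; stored history stays untouched."""
    messages = [dict(m) for m in chat_messages]  # shallow copies of each message
    if messages and messages[-1]["role"] == "user" and memory_block:
        messages[-1]["content"] += "\n\n## Memory Decision Context\n" + memory_block
    return messages

state_chat_messages = [{"role": "user", "content": "Find my open invoices"}]
augmented = build_messages_for_model(state_chat_messages, "User prefers CSV exports")

# The stored utterance is unchanged, so memory_query is recomputed from clean input.
assert state_chat_messages[0]["content"] == "Find my open invoices"
assert "## Memory Decision Context" in augmented[-1]["content"]
```

Because the augmentation happens on a copy, the injected block never compounds across sandbox→model cycles.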
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py` around lines 947 - 977, The code is mutating state.chat_messages by writing injected blocks (memory_decision_context, playbook_guidance, variables_addendum) back into HumanMessage objects, which pollutes the canonical user utterance used later (memory_query); instead, stop updating state.chat_messages in cuga_lite_graph and apply all augmentations only to the messages_for_model list (or a separate augmented_messages list) passed to call_model(); preserve the original content in state.chat_messages and only construct modified HumanMessage instances for messages_for_model using memory_section_header, memory_decision_context, playbook_guidance, and variables_addendum (also make the same change for the similar block around lines 995-1023).

src/cuga/backend/server/main.py (1)
1104-1110: ⚠️ Potential issue | 🟠 Major

Don't persist `memories` snapshots inside answer stream history.

`final_answer_text` now embeds the full memory snapshot, and that same string is appended to `stream_events_buffer` and saved to the stream-events store. That duplicates potentially personal facts into every stored answer event and keeps them around even if the underlying memory entry is later deleted. Stream the enriched payload to the live client if needed, but persist a redacted answer body instead.

🔒 Redact stored answer events
```diff
 final_answer_text = (
     event.answer
     if settings.advanced_features.wxo_integration
     else json.dumps(
         {
             "data": event.answer,
             "variables": variables_metadata,
             "memories": memories,
             "memories_count": memories_count,
             "active_policies": active_policies,
         }
     )
     if event.answer
     else "Done."
 )
+persisted_answer_text = event.answer if event.answer else "Done."

 ...

 stream_events_buffer.append(
     {
         "event_name": "Answer",
-        "event_data": final_answer_text,
+        "event_data": persisted_answer_text,
         "timestamp": datetime.utcnow().isoformat(),
         "sequence": event_sequence,
     }
 )
```

Also applies to: 1123-1143
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/server/main.py` around lines 1104 - 1110, The code is persisting full memory snapshots into the stored answer events; update the logic that builds and appends the stored payload (the object currently constructed with keys like "data" / event.answer, "variables" / variables_metadata, "memories", "memories_count", "active_policies") so that only a redacted version is persisted to stream_events_buffer and the stream-events store—e.g., include memories_count and safe identifiers or omit the "memories" field entirely—instead of embedding full memories or final_answer_text that contains them; continue to send the enriched payload (with full memories) to the live client but ensure the persisted object (where stream_events_buffer, final_answer_text, event.answer, variables_metadata, memories_count, active_policies are used) does not contain personal memory content; apply the same redaction change to the other similar block referenced around the 1123–1143 area.
🟡 Minor comments (7)
docs/examples/demo_apps/file_system/build/lib/main.py-231-240 (1)
231-240: ⚠️ Potential issue | 🟡 Minor

Validate `head`/`tail` explicitly instead of relying on truthiness.

`0` currently falls through to a full-file read, and negative values produce surprising slices. Check `is not None` and reject non-positive values before dispatching.

🛠️ Suggested fix
```diff
-    if tail and head:
+    if tail is not None and head is not None:
         raise ValueError("Cannot specify both head and tail parameters simultaneously")
+    if tail is not None and tail <= 0:
+        raise ValueError("tail must be greater than 0")
+    if head is not None and head <= 0:
+        raise ValueError("head must be greater than 0")
@@
-    if tail:
+    if tail is not None:
         return await tail_file(valid_path, tail)
-    if head:
+    if head is not None:
         return await head_file(valid_path, head)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 231 - 240, The current dispatch uses truthiness for head/tail so values like 0 or negatives behave incorrectly; update the logic in the block around validate_path/head_file/tail_file to check "is not None" for both head and tail, validate that any provided head or tail is a positive integer (raise ValueError for non-positive or non-integer inputs), keep the mutual-exclusion check (raise if both are not None), and then call tail_file(valid_path, tail) or head_file(valid_path, head) only when the corresponding parameter is explicitly provided and validated.

src/cuga/backend/memory/__init__.py (1)
23-40: ⚠️ Potential issue | 🟡 Minor

Guard ModuleNotFoundError to only handle Kaizen-related failures.

The exception handler catches all `ModuleNotFoundError` cases, including missing transitive dependencies of `kaizen.schema.*` modules. When such dependencies (e.g., `numpy`, `pydantic`) are missing, the code incorrectly reports it as a Kaizen version incompatibility instead of surfacing the actual missing dependency.

Add a guard checking `exc.name` to distinguish between Kaizen's own modules and unrelated missing dependencies:

Suggested fix
```diff
 try:
     module = import_module(module_name)
 except ModuleNotFoundError as exc:
+    missing_name = exc.name or ""
+    if missing_name != "kaizen" and not missing_name.startswith("kaizen."):
+        raise
+
     if util.find_spec("kaizen") is None:
         raise RuntimeError(
             "Kaizen is required for memory features but is not installed. "
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/memory/__init__.py` around lines 23 - 40, The ModuleNotFoundError handler around import_module(module_name) should only treat errors that are about Kaizen itself; inspect exc.name and if it does not refer to a kaizen module (e.g., not starting with "kaizen" or not equal to the top-level module_name root) re-raise the original ModuleNotFoundError so missing transitive deps surface correctly. Only when exc.name indicates a kaizen-related module proceed with the util.find_spec("kaizen") / version("kaizen") checks and raise the specialized RuntimeError about Kaizen integration; otherwise immediately reraise exc.

src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2 (1)
6-6: ⚠️ Potential issue | 🟡 Minor

Typo: "RetryClaude" should be corrected.
The text "RetryClaude can make mistakes" appears to be a typo or artifact. Consider changing to "Claude can make mistakes" or "The assistant can make mistakes."
Proposed fix
```diff
-Use these memory facts when they are relevant to the request before concluding information is unavailable. RetryClaude can make mistakes. Please double-check responses.
+Use these memory facts when they are relevant to the request before concluding information is unavailable. The assistant can make mistakes. Please double-check responses.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2` at line 6, In the pmt.jinja2 prompt template update the sentence "RetryClaude can make mistakes." to a correct phrase such as "Claude can make mistakes" or "The assistant can make mistakes"; locate the string inside the prompt template (the memory-facts instruction block) and replace the token "RetryClaude" with the chosen correct term so the prompt reads e.g. "Claude can make mistakes."

README.md (1)
810-812: ⚠️ Potential issue | 🟡 Minor

The memory demo commands disagree.

This section first says to run `cuga start demo`, but the actual walkthrough below requires `cuga start demo_crm --sample-memory-data`. One of these is stale.

📝 Suggested fix
```diff
-3. Run `cuga start demo`
+3. Run `cuga start demo_crm --sample-memory-data`
```

Also applies to: 821-824
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 810 - 812, The README contains inconsistent demo start commands: some places use "cuga start demo" while the walkthrough requires "cuga start demo_crm --sample-memory-data"; update the README so both occurrences (and any other mentions) consistently instruct users to run "cuga start demo_crm --sample-memory-data" (or alternatively change the walkthrough to use "cuga start demo" if that is the intended command), ensuring the example and the subsequent walkthrough match and updating the step list and any references at the other noted location to the chosen command.

src/cuga/backend/memory/utils/memory_tips_formatted.py (1)
40-44: ⚠️ Potential issue | 🟡 Minor

Normalize `search_entities` output before `len(...)`.

Line 40 assumes `tips` is list-like. If the backend returns `None` or an unexpected type, this path raises and interrupts the request.

🛡️ Suggested fix
```diff
-    raw_count = len(tips)
+    tips = tips or []
+    if not isinstance(tips, list):
+        tips = list(tips)
+    raw_count = len(tips)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/memory/utils/memory_tips_formatted.py` around lines 40 - 44, The code assumes variable tips is a list-like before calling len(...) and list comprehensions (raw_count, raw_types, raw_ids and subsequent filtering); guard and normalize the backend output first by checking tips and converting it to a safe iterable (e.g., if tips is None set tips = [], if it’s a single object wrap it in [tips], or if it’s not list-like try to cast to list) before computing raw_count/raw_types/raw_ids and before filtering for type == "guideline" so the rest of the logic (raw_count, raw_types, raw_ids, and the tips = [ ... ] filter) never crashes on unexpected types.

docs/memory/README.md (1)
7-7: ⚠️ Potential issue | 🟡 Minor

Typo: "levearge" should be "leverage".
📝 Proposed fix
```diff
-CUGA execution can be enhanced by enabling episodic memory, which introduces the ability to levearge previously identified insights and relevant experiences when generating the final answer.
+CUGA execution can be enhanced by enabling episodic memory, which introduces the ability to leverage previously identified insights and relevant experiences when generating the final answer.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/memory/README.md` at line 7, Fix the typo in the docs line containing "CUGA execution can be enhanced..." by replacing the misspelled word "levearge" with "leverage" so the sentence reads "...introduces the ability to leverage previously identified insights..."; locate the phrase "CUGA execution can be enhanced by enabling episodic memory" in docs/memory/README.md and update the single word.

docs/memory/README.md (1)
61-61: ⚠️ Potential issue | 🟡 Minor

Add a language specifier to the fenced code block.

Per static analysis (MD040), fenced code blocks should have a language specified for proper syntax highlighting. Since this shows a directory structure, use `text` or `plaintext`.

📝 Proposed fix
````diff
-```
+```text
 cuga/
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/memory/README.md` at line 61, Update the fenced code block that contains the directory listing line "cuga/" to include a language specifier (e.g., change the opening ``` to ```text or ```plaintext) so the block complies with MD040 and enables proper syntax highlighting; locate the fenced block in the README that currently begins with ``` and replace it with ```text.
🧹 Nitpick comments (8)
pyproject.toml (1)
32-37: Remove the duplicate `litellm` requirement.

`litellm` is now declared twice here. Keeping only `litellm>=1.0.0` makes the dependency set unambiguous and avoids redundant metadata during resolution.

♻️ Suggested cleanup

```diff
-    "litellm",
     "jinja2",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pyproject.toml` around lines 32 - 37, Remove the duplicate unversioned dependency "litellm" in pyproject.toml and keep only the explicit versioned entry "litellm>=1.0.0"; edit the dependency list to delete the bare "litellm" line so the file contains a single clear declaration ("litellm>=1.0.0") and re-run dependency resolution to verify no other duplicates remain.

src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2 (1)
3-4: Consider handling empty memory context gracefully.

When `memory_context` is empty or undefined, the prompt will display an awkward section header with no content. Consider wrapping this in a conditional block.

Proposed fix
```diff
-Persistent memory facts about the user:
-{{memory_context}}
+{% if memory_context %}
+Persistent memory facts about the user:
+{{memory_context}}
+{% endif %}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2` around lines 3 - 4, The template currently always renders the "Persistent memory facts about the user:" header even when memory_context is empty; update the pmt.jinja2 template to wrap the header and the {{memory_context}} output in a Jinja2 conditional that checks if memory_context is defined and non-empty (e.g., if memory_context), so the whole block is omitted when there are no memory facts; ensure you reference the existing memory_context variable and keep the rest of the prompt intact.

src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_node.py (1)
283-285: Consider reducing debug log verbosity.

The `[FACT DEBUG]` logging is useful during development but may add noise in production. Consider using `logger.debug` consistently (already being used) or adding a feature flag for verbose memory tracing.
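A small sketch of the flag-gated tracing suggested here; the `CUGA_MEMORY_TRACE` variable and `trace()` helper are hypothetical names, not existing CUGA code:

```python
# Hypothetical sketch: gate verbose memory tracing behind an env flag.
# CUGA_MEMORY_TRACE and trace() are illustrative names, not the real API.
import logging
import os

logger = logging.getLogger("cuga.memory")
MEMORY_TRACE = os.getenv("CUGA_MEMORY_TRACE", "0") == "1"

def trace(msg: str) -> None:
    """Emit memory tracing at DEBUG unless the verbose flag is set."""
    if MEMORY_TRACE:
        logger.info(msg)  # promoted only when tracing is explicitly enabled
    else:
        logger.debug(msg)
```

With the flag off, `[FACT DEBUG]` lines disappear at the usual production INFO level without touching call sites.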
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_node.py` around lines 283 - 285, The info-level log in CugaLiteNode that mentions fact storage is too verbose for production; update the call to use logger.debug instead of logger.info (referencing the logger.info call in CugaLiteNode that mentions state.user_preferences), or gate the message behind an explicit feature flag/setting (e.g., a verbose_memory_tracing or DEBUG_MEMORY flag) so the message only emits when that flag is enabled; ensure the check or log level change is applied where the current logger.info is invoked so other info logs remain unaffected.

src/cuga/backend/cuga_graph/state/agent_state.py (1)
1099-1102: Broad exception catch is acceptable here, but consider logging the exception type.

The static analyzer flags the bare `Exception` catch. Since preference loading is non-critical and shouldn't fail the entire operation, this is acceptable. However, logging the exception type could help with debugging.

📝 Optional improvement
```diff
 except Exception as e:
-    logger.error(f"Error loading user preferences: {e}")
+    logger.error(f"Error loading user preferences ({type(e).__name__}): {e}")
     # Don't fail the entire operation if preference loading fails
     state.user_preferences = {}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/state/agent_state.py` around lines 1099 - 1102, The catch block currently logs only the exception message when loading preferences ("except Exception as e: logger.error(f\"Error loading user preferences: {e}\")") which hides the exception type; update the logging to include the exception class/type (e.g., include type(e) or e.__class__.__name__) so the logger.error call for the preference-loading failure (the block that sets state.user_preferences = {}) records both the message and the exception type to aid debugging while keeping the broad Exception catch.

src/frontend_workspaces/agentic_chat/src/CardManager.tsx (1)
369-373: Consider memoizing or caching stringified comparisons for large objects.

Using `JSON.stringify` for deep-equality checks on `variables` and `memories` works correctly but may become expensive if these objects grow large. For now this is acceptable given the typical size of preference data, but consider a dedicated deep-equal utility if performance becomes a concern.
Verify each finding against the current code and only fix it if needed. In `@src/frontend_workspaces/agentic_chat/src/CardManager.tsx` around lines 369 - 373, The current entryChanged calculation in CardManager (the entryChanged constant comparing existing.title, existing.timestamp, and JSON.stringify(existing.variables)/JSON.stringify(existing.memories) to entry.*) uses JSON.stringify for deep equality which can be expensive for large objects; change it to avoid repeated stringification by either (a) memoizing the stringified representation on the objects when they are created/updated and compare those cached strings (e.g., add cachedVariablesString/cachedMemoriesString fields or compute once and reuse), or (b) replace the JSON.stringify comparisons with a dedicated deep-equal utility (e.g., lodash.isEqual or a small deepEqual function) and use isEqual(existing.variables, entry.variables) and isEqual(existing.memories, entry.memories) inside the entryChanged calculation. Ensure you update/maintain the cache whenever variables/memories are modified and reference the entryChanged constant to locate where to apply the change.

src/cuga/backend/cuga_graph/nodes/api/api_planner_agent/api_planner_agent.py (1)
220-257: Trim the duplicated memory block in `_format_preferences_for_prompt()`.

The formatter emits a structured-facts section and then immediately re-emits the same facts under category headings. That makes every planner turn pay double tokens for the same context. Reusing the selected structured facts, or capping the category dump, would keep this prompt cheaper and easier to tune.
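One way to realize the deduplication, sketched with hypothetical names (`format_preferences` and plain string facts; the real method works on richer fact objects):

```python
# Hypothetical sketch: emit categorized facts only when they are not already
# covered by the structured section, so the prompt pays for each fact once.
def format_preferences(structured_facts: list[str], preferences: dict[str, list[str]]) -> str:
    lines: list[str] = []
    if structured_facts:
        lines.append("Structured Facts For Decision Making:")
        lines.extend(f"- {fact}" for fact in structured_facts)
    seen = set(structured_facts)
    for category, facts in preferences.items():
        remaining = [f for f in facts if f not in seen]
        if remaining:  # skip categories fully covered by the structured section
            lines.append(f"{category}:")
            lines.extend(f"- {f}" for f in remaining)
    return "\n".join(lines)

prompt = format_preferences(
    ["prefers CSV exports"],
    {"formatting": ["prefers CSV exports"], "locale": ["uses en-GB dates"]},
)
assert prompt.count("prefers CSV exports") == 1  # no duplicated tokens
assert "uses en-GB dates" in prompt
```

The same `seen` set also gives a natural place to cap the category dump if a hard token budget is preferred.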
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/cuga_graph/nodes/api/api_planner_agent/api_planner_agent.py` around lines 220 - 257, The method _format_preferences_for_prompt currently appends structured_facts into lines and then iterates preferences to re-emit the same facts under category headings, duplicating context; modify the logic so that when structured_facts is non-empty you skip the categorized preferences dump (or alternatively only include categories for facts not present in structured_facts). Concretely, inside _format_preferences_for_prompt, after building the "Structured Facts For Decision Making" block (using APIPlannerAgent._format_fact_pointer, _escape_inline_code and _safe_text), add a guard around the subsequent preferences handling (the branch that checks first_value and iterates preferences) to return or skip that block if structured_facts exists, or filter each category's facts by excluding items whose pointer/id matches any pointer built from structured_facts before appending to lines. Ensure you reference structured_facts, preferences, and the lines list used to join the final prompt.

src/cuga/backend/activity_tracker/tracker.py (2)
652-696: Isolate guideline failures per agent.

The outer `try` aborts the whole loop on the first `generate_tips()` or `update_entities()` failure, so one bad agent drops the remaining guideline writes for the same trajectory. Catching and logging inside the loop would make this path much more resilient.
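The per-agent isolation can be sketched as follows; `generate_tips` and `update_entities` here are stand-ins for the real Kaizen calls, injected as plain callables:

```python
# Sketch of per-agent error isolation: one failing agent is logged and skipped
# instead of aborting every remaining guideline write for the trajectory.
import logging

logger = logging.getLogger(__name__)

def persist_guidelines(agent_trajectories: dict, generate_tips, update_entities) -> int:
    persisted = 0
    for agent_id, messages in agent_trajectories.items():
        try:  # try/except moved inside the loop
            tips = generate_tips(messages)
            update_entities(agent_id, tips)
            persisted += 1
        except Exception:
            logger.exception("guideline persistence failed for agent %s", agent_id)
    return persisted

def flaky_tips(messages):
    if messages == "bad":
        raise RuntimeError("backend down")
    return ["tip"]

count = persist_guidelines(
    {"a": "ok", "b": "bad", "c": "ok"},
    flaky_tips,
    lambda agent_id, tips: None,
)
assert count == 2  # agent "b" failed, but "a" and "c" were still persisted
```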
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/activity_tracker/tracker.py` around lines 652 - 696, The try/except currently wraps the whole loop so a failure in generate_tips or self.memory.update_entities for one agent aborts processing for all agents; move the try/except inside the for agent_id, agent_trajectory_messages in agent_scoped_trajectories loop so each agent's work is isolated, i.e., call generate_tips(agent_trajectory_messages) and the subsequent unique_tips -> entities -> self.memory.update_entities within a per-agent try block, catch exceptions there (using logger.exception) including agent_id and user_id in the log, and continue to the next agent so one bad agent doesn't drop remaining guideline writes; keep the outer early-return checks and use the same functions/methods: _build_agent_scoped_trajectory_messages, generate_tips, and self.memory.update_entities.
807-826: Consider deferring trajectory writes to avoid blocking step execution.

`collect_step()` persists a trajectory entity inline on every step (line 18). If `KaizenClient.update_entities()` performs I/O synchronously, agent step latency is directly coupled to memory write latency and availability. The code already uses `asyncio.create_task()` for deferred work elsewhere (lines 42–50); applying the same pattern here would decouple memory operations from the critical path.
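A minimal sketch of the fire-and-forget pattern, with `update_entities` standing in for `KaizenClient.update_entities`:

```python
# Sketch: defer the memory write with asyncio.create_task so collect_step
# returns immediately. update_entities simulates the Kaizen I/O call.
import asyncio

writes: list[dict] = []

async def update_entities(entity: dict) -> None:
    await asyncio.sleep(0.01)  # simulated backend write latency
    writes.append(entity)

async def collect_step(step_index: int, step_name: str) -> None:
    entity = {"step_index": step_index, "step_name": step_name}
    # Fire-and-forget: the step does not await the write.
    asyncio.create_task(update_entities(entity))

async def main() -> None:
    await collect_step(0, "plan")
    assert not writes  # the background write has not completed yet
    await asyncio.sleep(0.05)  # yield so the deferred task can finish
    assert writes[0]["step_name"] == "plan"

asyncio.run(main())
```

In production code the task reference should be retained (e.g., in a set discarded on completion) so the write cannot be garbage-collected mid-flight.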
Verify each finding against the current code and only fix it if needed. In `@src/cuga/backend/activity_tracker/tracker.py` around lines 807 - 826, collect_step currently calls self.memory.update_entities inline which can block step execution; instead, create a deferred task to perform the memory write asynchronously. Wrap the update_entities call (including the Entity payload and enable_conflict_resolution=False) in an async helper or use asyncio.to_thread if update_entities is synchronous, then schedule it with asyncio.create_task(...) so the call returns immediately; reference the collect_step function and the memory.update_entities invocation along with the Entity metadata (timestamp, step_index, step_name, intent, user_id, task_id, session_id) to ensure the same payload is passed to the background task.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/examples/demo_apps/file_system/build/lib/main.py`:
- Around line 146-164: The current diff generation in the block that compares
original_content and content (using original_lines/new_lines and a zip-based
loop) misses insertions/deletions because zip stops at the shorter list; replace
this manual loop with difflib.unified_diff to produce a correct unified diff
(import difflib, call difflib.unified_diff(original_lines, new_lines,
fromfile=f"--- {file_path}", tofile=f"+++ {file_path}", lineterm="") ), preserve
the existing status message logic around dry_run, and return the joined
unified_diff lines instead of the zip-generated diff so insertions, deletions
and EOF changes are captured correctly.
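The `difflib.unified_diff` approach described in this item can be sketched as follows (file labels are illustrative):

```python
# difflib.unified_diff captures insertions past the end of the shorter file,
# which a zip-based pairwise loop silently drops.
import difflib

original_lines = ["a\n", "b\n", "c\n"]
new_lines = ["a\n", "b changed\n", "c\n", "d appended\n"]

diff = "".join(
    difflib.unified_diff(original_lines, new_lines, fromfile="before", tofile="after")
)
# The zip-based loop stops at the shorter file and would miss the appended line.
assert "+d appended\n" in diff
assert "-b\n" in diff and "+b changed\n" in diff
```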
- Around line 48-63: validate_path's string-prefix check is unsafe (fails for
root allowlists and is vulnerable to TOCTOU via symlink swaps); update
validate_path to canonicalize both the candidate (from
expand_home/normalize_path) and each allowed_directories entry as
pathlib.Path.resolve(...) and then perform a true ancestor check using
Path.relative_to or os.path.commonpath to confirm the candidate is inside an
allowed_dir (handle root like "/" or "C:/" correctly by canonicalizing first),
and in addition revalidate the resolved path at the syscall boundary for any
mutating operation (open/rename/makedirs) by resolving the path again
immediately before the syscall and/or using safe syscalls (e.g., os.open with
O_NOFOLLOW or compare parent directory inodes) to prevent symlink races;
reference validate_path, expand_home, normalize_path, and allowed_directories
when making these changes.
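The ancestor check portion of this item can be sketched with `Path.resolve` and `relative_to` (the directory value is a placeholder, and the TOCTOU revalidation at the syscall boundary is not shown):

```python
# Sketch of a true ancestor check: canonicalize both sides, then use
# relative_to instead of a string prefix. allowed_directories is illustrative.
from pathlib import Path

allowed_directories = ["/tmp/sandbox"]

def validate_path(candidate: str) -> Path:
    resolved = Path(candidate).expanduser().resolve()
    for allowed in allowed_directories:
        root = Path(allowed).resolve()
        try:
            resolved.relative_to(root)  # raises ValueError if not an ancestor
            return resolved
        except ValueError:
            continue
    raise PermissionError(f"{candidate} is outside the allowed directories")

assert str(validate_path("/tmp/sandbox/notes.txt")).endswith("notes.txt")
# A naive string-prefix check would wrongly accept this sibling directory:
try:
    validate_path("/tmp/sandbox_evil/notes.txt")
    raise AssertionError("sibling directory should be rejected")
except PermissionError:
    pass
```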
- Around line 176-205: The non-recursive branch only matches basenames from
os.listdir(root_path) so patterns containing path segments (e.g., "src/*.py")
never match and exclude_patterns aren't applied; change the else branch in
main.py to glob against os.path.join(root_path, pattern) (using glob.glob with
recursive=False or without "**") to get full paths, then apply the same
exclusion logic used in the '**' branch (checking rel_path against
exclude_patterns with fnmatch.fnmatch and the ** variants) before appending to
results so path-segment patterns and exclusions behave consistently (update
references: pattern, root_path, results, exclude_patterns, rel_path).
- Around line 443-470: The build_tree function can recurse into symlinked
directories causing infinite loops; modify build_tree (and its
caller/initialization) to detect and avoid symlink loops by tracking visited
real paths or inodes: before descending from entry_path (where
os.path.isdir(entry_path) is checked), resolve real =
os.path.realpath(entry_path) (or use os.stat to get st_ino/st_dev) and maintain
a visited set passed through recursive calls; if real (or inode tuple) is
already in visited or if os.path.islink(entry_path) and you prefer to skip
symlinks, skip adding children and do not recurse; ensure validate_path usage
remains and update build_tree signature to accept the visited set if needed.
In `@src/cuga/backend/activity_tracker/tracker.py`:
- Around line 641-647: _snapshot self.task_id and self.session_id into local
variables at the start of _persist_guidelines_from_trajectory and use those
local variables everywhere you build guideline entity metadata or enqueue the
guideline job instead of reading self.task_id/self.session_id later; do the same
change for the other coroutines that build/enqueue guideline jobs (the blocks
referenced at lines 671-679 and 832-836) so the queued job uses the captured
task_id/session_id values rather than potentially changed singleton attributes.
In `@src/cuga/backend/cuga_graph/nodes/api/api_code_agent.py`:
- Around line 54-56: The hardcoded "if False and (" guard prevents the Kaizen
write path from ever executing; remove the "False and" so the condition uses the
real feature gates (e.g., enable_fact and enable_memory) instead. Locate the
conditional in api_code_agent.py (the block starting with "if False and (") and
change it to evaluate the actual flags (for example: "if enable_fact and
enable_memory and ( ... )") so the fact extraction/write logic inside the
existing function/method (the Kaizen write path) can run when configured.
In
`@src/cuga/backend/cuga_graph/nodes/api/api_planner_agent/prompts/user_hitl.jinja2`:
- Around line 3-8: The prompt fragment that injects remembered preferences
(user_preferences_context) should be reworded to treat those memories as
non-mandatory hints: change the line "must influence planning decisions" to a
precedence rule stating that current user instructions and session context take
priority and that any conflict between user_preferences_context and the current
request should not be auto-applied but must be flagged for confirmation;
reference the template variable user_preferences_context and ensure the prompt
explicitly instructs the planner to use memories only as suggestions and to ask
the user to confirm before applying conflicting remembered preferences.
In `@src/cuga/backend/cuga_graph/nodes/api/api_planner.py`:
- Around line 197-199: The conditional that gates planner memory recall has been
hard-disabled by "if False and settings.advanced_features.enable_fact and
settings.advanced_features.enable_memory:"; restore the real feature flag check
by removing the hardcoded False so the block is executed when both
settings.advanced_features.enable_fact and
settings.advanced_features.enable_memory are true, ensuring the Kaizen/fact
retrieval path is re-enabled for API planning (update the if condition in the
same block and keep the existing retrieval code untouched).
In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt_chat.jinja2`:
- Around line 15-16: The prompt currently injects the raw memory blob
(memory_context) as authoritative; change it so memory_context is fenced with
explicit delimiters (e.g., "BEGIN MEMORY" / "END MEMORY") and prefaced as
reference-only data, and add a clear rule that "if any memory item conflicts
with the latest user message, the latest user message takes precedence"; apply
this change where memory_context is inserted (around the insert using the
memory_context template variable at the locations corresponding to lines 15-16
and 34-35) so the agent treats memories as non-authoritative reference material.
In `@src/cuga/backend/cuga_graph/nodes/chat/chat.py`:
- Around line 145-152: The code currently calls sync_user_memory whenever
settings.advanced_features.enable_fact is true; change the condition to require
persistent memory to be enabled as well by also checking
settings.advanced_features.enable_memory before importing/calling
sync_user_memory. Update the branch that contains sync_user_memory (the block
referencing memory_user_id, state.user_preferences and await
sync_user_memory(...)) to run only when both enable_fact and enable_memory are
true, so that sync_user_memory is not attempted if memory is disabled (leave the
try/except intact).
- Around line 155-163: Avoid logging raw PII: replace direct uses of
state.user_id in the logger.debug and logger.info calls with a nondeterministic
or hashed/anonymized identifier (e.g., call a helper like
anonymize_user_id(state.user_id) or mask_user_id(state.user_id) before logging).
Update the two occurrences that reference state.user_id (the
logger.debug(f"Stored categorized fact(s) for user {state.user_id}") and the
logger.info(...) that embeds state.user_id) to log the anonymized ID instead,
and add or reuse a small helper function (anonymize_user_id / mask_user_id) to
perform the masking or hashing consistently across the module.
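A sketch of the kind of masking helper the prompt asks for (the name `mask_user_id` and the 12-character truncation are illustrative choices, not existing code):

```python
import hashlib


def mask_user_id(user_id: str) -> str:
    """Derive a stable, non-reversible token for log lines.

    The same input always yields the same token, so log entries stay
    correlatable without exposing the raw identifier.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return f"user-{digest[:12]}"


# Usage at the two call sites:
# logger.debug(f"Stored categorized fact(s) for user {mask_user_id(state.user_id)}")
```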
In `@src/cuga/backend/memory/cli.py`:
- Around line 41-49: The helper _load_kaizen_exception_types currently only
catches ModuleNotFoundError when importing via
import_module("kaizen.schema.exceptions"), but getattr() calls for
KaizenException and NamespaceNotFoundException can raise AttributeError; update
the try/except to catch both ModuleNotFoundError and AttributeError around the
import+getattr sequence, and call _fail_missing_kaizen(exc) for either exception
so missing or renamed symbols produce the same friendly failure; reference
_load_kaizen_exception_types, import_module, getattr for KaizenException and
NamespaceNotFoundException, and _fail_missing_kaizen when applying the change.
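Sketched below under the assumption that `_fail_missing_kaizen` turns any load failure into a friendly exit; widening the clause to include `AttributeError` covers the renamed-symbol case:

```python
from importlib import import_module


def load_kaizen_exception_types(module_name: str = "kaizen.schema.exceptions"):
    """Return (KaizenException, NamespaceNotFoundException), or None on any load failure."""
    try:
        module = import_module(module_name)
        kaizen_exc = getattr(module, "KaizenException")
        namespace_exc = getattr(module, "NamespaceNotFoundException")
    except (ModuleNotFoundError, AttributeError):
        # Stand-in for _fail_missing_kaizen(exc): a missing package and a
        # renamed/removed symbol now take the same friendly failure path.
        return None
    return kaizen_exc, namespace_exc
```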
In `@src/cuga/backend/memory/memory.py`:
- Around line 245-256: The Kaizen write/read calls in sync_user_memory (calls to
kaizen_client.store_user_facts and kaizen_client.retrieve_user_facts) must be
fault-tolerant: wrap the store_user_facts call and the retrieve_user_facts call
in try/except blocks, log the exception (including context like
resolved_namespace, normalized_user_id, and query) and ensure
retrieve_user_facts falls back to an empty dict {} on error so a transient
Kaizen outage doesn't raise and abort the agent turn; keep the existing behavior
when calls succeed.
- Around line 33-57: The compatibility guard in memory.py only catches
ModuleNotFoundError for the block importing FilesystemSettings, KaizenConfig,
llm_settings, MilvusDBSettings, and KaizenClient, but failed symbol renames
raise ImportError, so change the except to catch both ModuleNotFoundError and
ImportError (e.g., except (ModuleNotFoundError, ImportError) as exc) so the
curated RuntimeError path is used when any import fails; keep the existing
util.find_spec/kaizen_version logic and re-raise the same RuntimeError from exc.
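Worth noting: `ModuleNotFoundError` subclasses `ImportError`, so the widened clause is equivalent to catching `ImportError` alone. A sketch of the guard shape (the function name and message are illustrative):

```python
from importlib import import_module


def import_memory_backend(module_name: str):
    try:
        return import_module(module_name)
    except (ModuleNotFoundError, ImportError) as exc:  # equivalent to: except ImportError
        # Failed symbol renames raise plain ImportError, so both cases now
        # reach the curated RuntimeError path instead of leaking upward.
        raise RuntimeError("Kaizen memory backend unavailable") from exc
```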
In `@src/cuga/backend/memory/utils/memory_tips_formatted.py`:
- Around line 31-37: The logger.warning calls (e.g., the one invoking
logger.warning with namespace_id, normalized_agent_id, (query or "")[:120], e)
are logging raw memory content and user queries; change these log messages to
never include the raw query or memory text and instead log only safe metadata
such as namespace_id, normalized_agent_id, and counts/lengths (e.g.,
query_length or result_count) plus the error object (e). Update every occurrence
of this pattern (the logger.warning calls around the shown diff and the similar
occurrences at the other two locations) to replace raw text with sanitized
fields like query_length and result_count or a masked placeholder.
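A sketch of the sanitized warning shape, logging only safe metadata (the field names `query_length`/`err` follow the prompt's suggestion and are not existing code):

```python
import logging

logger = logging.getLogger("memory_tips_formatted")


def log_tip_failure(namespace_id, agent_id, query, error):
    # Lengths and identifiers instead of raw text: the query and memory
    # content never reach the log stream.
    logger.warning(
        "tip retrieval failed namespace=%s agent=%s query_length=%d err=%r",
        namespace_id,
        agent_id,
        len(query or ""),
        error,
    )
```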
In `@src/cuga/backend/server/main.py`:
- Around line 212-216: The helper _get_memories_for_ui should normalize legacy
memory payloads in local_state.user_preferences before returning them: keep the
settings.advanced_features.enable_fact guard, but when user_preferences is a
dict and its values are not already lists (the old {fact_id: content} shape),
transform it into the categorized Record<string, any[]> shape expected by the UI
by mapping those entries into a category (e.g., "Facts") as an array of items
(include the original fact id and its content as object fields), and otherwise
return user_preferences unchanged when it already matches the categorized lists
shape.
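A sketch of the normalization the prompt describes (the category name "Facts" and the item fields are illustrative):

```python
from typing import Any


def normalize_memories_for_ui(user_preferences: Any) -> Any:
    """Lift the legacy {fact_id: content} shape into categorized lists.

    Already-categorized payloads (a dict whose values are lists) pass
    through untouched.
    """
    if not isinstance(user_preferences, dict) or not user_preferences:
        return user_preferences
    if all(isinstance(value, list) for value in user_preferences.values()):
        return user_preferences  # new shape: Record<string, any[]>
    return {
        "Facts": [
            {"id": fact_id, "content": content}
            for fact_id, content in user_preferences.items()
        ]
    }
```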
In `@src/cuga/configurations/models/settings.openai.toml`:
- Around line 51-64: Rename the four legacy TOML keys in settings.litellm.toml
so they match the new memory.kaizen.* keys that the runtime expects: change
memory.mem0.model -> memory.kaizen.conflict_resolution.model,
memory.milvus.step_processing.model -> memory.kaizen.conflict_resolution.model
(or merge if intended), memory.milvus.fact_extraction.model ->
memory.kaizen.fact_extraction.model, and memory.tips_extractor.model ->
memory.kaizen.tips.model; ensure the platform, temperature, and max_tokens
values are preserved under the new keys so the code in memory.py that reads
memory.kaizen.conflict_resolution.model, memory.kaizen.fact_extraction.model,
and memory.kaizen.tips.model will pick them up.
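The mapping reads more clearly side by side. The fragment below is a hypothetical "after" state; the `model` values are placeholders that should be carried over (along with `platform`, `temperature`, and `max_tokens`) from the existing keys:

```toml
# Old key                              -> New key
# memory.mem0.model                    -> memory.kaizen.conflict_resolution.model
# memory.milvus.step_processing.model  -> memory.kaizen.conflict_resolution.model (merged)
# memory.milvus.fact_extraction.model  -> memory.kaizen.fact_extraction.model
# memory.tips_extractor.model          -> memory.kaizen.tips.model

[memory.kaizen.conflict_resolution]
model = "<carry over existing model>"

[memory.kaizen.fact_extraction]
model = "<carry over existing model>"

[memory.kaizen.tips]
model = "<carry over existing model>"
```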
In `@src/frontend_workspaces/agentic_chat/src/VariablesSidebar.tsx`:
- Around line 60-72: The current rawFacts.forEach callback ignores facts that
lack a string content, dropping valid structured facts; change the extraction
inside the loop used by VariablesSidebar (the rawFacts.forEach and facts.push
logic) to compute content as: if fact.content is a non-empty string use it,
otherwise if fact.key or fact.value exist build a display string from those
(e.g. "key: value" or the non-empty key/value alone) and treat that as content
so you don't return early; keep id/key/value assignments unchanged (id from
fact.id, key from fact.key, value from fact.value) and only skip pushing when
neither content nor key/value exist.
---
Outside diff comments:
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py`:
- Around line 947-977: The code is mutating state.chat_messages by writing
injected blocks (memory_decision_context, playbook_guidance, variables_addendum)
back into HumanMessage objects, which pollutes the canonical user utterance used
later (memory_query); instead, stop updating state.chat_messages in
cuga_lite_graph and apply all augmentations only to the messages_for_model list
(or a separate augmented_messages list) passed to call_model(); preserve the
original content in state.chat_messages and only construct modified HumanMessage
instances for messages_for_model using memory_section_header,
memory_decision_context, playbook_guidance, and variables_addendum (also make
the same change for the similar block around lines 995-1023).
In `@src/cuga/backend/server/main.py`:
- Around line 1104-1110: The code is persisting full memory snapshots into the
stored answer events; update the logic that builds and appends the stored
payload (the object currently constructed with keys like "data" / event.answer,
"variables" / variables_metadata, "memories", "memories_count",
"active_policies") so that only a redacted version is persisted to
stream_events_buffer and the stream-events store—e.g., include memories_count
and safe identifiers or omit the "memories" field entirely—instead of embedding
full memories or final_answer_text that contains them; continue to send the
enriched payload (with full memories) to the live client but ensure the
persisted object (where stream_events_buffer, final_answer_text, event.answer,
variables_metadata, memories_count, active_policies are used) does not contain
personal memory content; apply the same redaction change to the other similar
block referenced around the 1123–1143 area.
In `@src/frontend_workspaces/agentic_chat/src/LeftSidebar.tsx`:
- Line 2: The file imports several icons from lucide-react but misses the Brain
icon used as <Brain size={16} /> in LeftSidebar.tsx; update the import statement
that currently lists MessageSquare, Database, ChevronLeft, ChevronRight, Plus,
Trash2, Workflow, Info, HelpCircle to also include Brain so the Brain symbol is
defined and the component renders correctly.
---
Minor comments:
In `@docs/examples/demo_apps/file_system/build/lib/main.py`:
- Around line 231-240: The current dispatch uses truthiness for head/tail so
values like 0 or negatives behave incorrectly; update the logic in the block
around validate_path/head_file/tail_file to check "is not None" for both head
and tail, validate that any provided head or tail is a positive integer (raise
ValueError for non-positive or non-integer inputs), keep the mutual-exclusion
check (raise if both are not None), and then call tail_file(valid_path, tail) or
head_file(valid_path, head) only when the corresponding parameter is explicitly
provided and validated.
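A minimal sketch of the corrected dispatch, using explicit `is not None` checks so `head=0` or a negative `tail` is rejected rather than silently ignored (the helper name and slicing behavior are illustrative):

```python
def read_slice(lines, head=None, tail=None):
    """Return the head/tail of `lines`, validating the parameters explicitly."""
    if head is not None and tail is not None:
        raise ValueError("head and tail are mutually exclusive")
    for name, value in (("head", head), ("tail", tail)):
        # bool is a subclass of int, so exclude it explicitly.
        if value is not None and (
            not isinstance(value, int) or isinstance(value, bool) or value <= 0
        ):
            raise ValueError(f"{name} must be a positive integer")
    if tail is not None:
        return lines[-tail:]
    if head is not None:
        return lines[:head]
    return lines
```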
In `@docs/memory/README.md`:
- Line 7: Fix the typo in the docs line containing "CUGA execution can be
enhanced..." by replacing the misspelled word "levearge" with "leverage" so the
sentence reads "...introduces the ability to leverage previously identified
insights..."; locate the phrase "CUGA execution can be enhanced by enabling
episodic memory" in docs/memory/README.md and update the single word.
- Line 61: Update the fenced code block that contains the directory listing line
"cuga/" to include a language specifier (e.g., change the opening ``` to ```text
or ```plaintext) so the block complies with MD040 and enables proper syntax
highlighting; locate the fenced block in the README that currently begins with
``` and replace it with ```text.
In `@README.md`:
- Around line 810-812: The README contains inconsistent demo start commands:
some places use "cuga start demo" while the walkthrough requires "cuga start
demo_crm --sample-memory-data"; update the README so both occurrences (and any
other mentions) consistently instruct users to run "cuga start demo_crm
--sample-memory-data" (or alternatively change the walkthrough to use "cuga
start demo" if that is the intended command), ensuring the example and the
subsequent walkthrough match and updating the step list and any references at
the other noted location to the chosen command.
In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2`:
- Line 6: In the pmt.jinja2 prompt template update the sentence "RetryClaude can
make mistakes." to a correct phrase such as "Claude can make mistakes" or "The
assistant can make mistakes"; locate the string inside the prompt template (the
memory-facts instruction block) and replace the token "RetryClaude" with the
chosen correct term so the prompt reads e.g. "Claude can make mistakes."
In `@src/cuga/backend/memory/__init__.py`:
- Around line 23-40: The ModuleNotFoundError handler around
import_module(module_name) should only treat errors that are about Kaizen
itself; inspect exc.name and if it does not refer to a kaizen module (e.g., not
starting with "kaizen" or not equal to the top-level module_name root) re-raise
the original ModuleNotFoundError so missing transitive deps surface correctly.
Only when exc.name indicates a kaizen-related module proceed with the
util.find_spec("kaizen") / version("kaizen") checks and raise the specialized
RuntimeError about Kaizen integration; otherwise immediately reraise exc.
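A sketch of the narrowed handler (the prefix check on `exc.name` is one reasonable way to detect kaizen-related failures; the message is illustrative):

```python
from importlib import import_module


def import_kaizen_module(module_name: str):
    """Convert only kaizen-related import failures into the curated RuntimeError."""
    try:
        return import_module(module_name)
    except ModuleNotFoundError as exc:
        if not (exc.name or "").startswith("kaizen"):
            raise  # a missing transitive dependency should surface as-is
        raise RuntimeError(
            "Kaizen integration is not installed or is incompatible"
        ) from exc
```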
In `@src/cuga/backend/memory/utils/memory_tips_formatted.py`:
- Around line 40-44: The code assumes the tips variable is list-like before
calling len(...) and list comprehensions (raw_count, raw_types, raw_ids and
subsequent filtering); guard and normalize the backend output first by checking
tips and converting it to a safe iterable (e.g., if tips is None set tips = [],
if it’s a single object wrap it in [tips], or if it’s not list-like try to cast
to list) before computing raw_count/raw_types/raw_ids and before filtering for
type == "guideline" so the rest of the logic (raw_count, raw_types, raw_ids, and
the tips = [ ... ] filter) never crashes on unexpected types.
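One way to sketch the normalization step (the single-object wrapping rules are a judgment call, not existing code):

```python
def normalize_tips(tips):
    """Coerce arbitrary backend output into a plain list before len()/filtering."""
    if tips is None:
        return []
    if isinstance(tips, list):
        return tips
    if isinstance(tips, (str, bytes, dict)):
        return [tips]  # a single object, not a collection of tips
    try:
        return list(tips)  # tuples, sets, generators
    except TypeError:
        return [tips]  # anything non-iterable
```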
---
Nitpick comments:
In `@pyproject.toml`:
- Around line 32-37: Remove the duplicate unversioned dependency "litellm" in
pyproject.toml and keep only the explicit versioned entry "litellm>=1.0.0"; edit
the dependency list to delete the bare "litellm" line so the file contains a
single clear declaration ("litellm>=1.0.0") and re-run dependency resolution to
verify no other duplicates remain.
In `@src/cuga/backend/activity_tracker/tracker.py`:
- Around line 652-696: The try/except currently wraps the whole loop so a
failure in generate_tips or self.memory.update_entities for one agent aborts
processing for all agents; move the try/except inside the for agent_id,
agent_trajectory_messages in agent_scoped_trajectories loop so each agent's work
is isolated, i.e., call generate_tips(agent_trajectory_messages) and the
subsequent unique_tips -> entities -> self.memory.update_entities within a
per-agent try block, catch exceptions there (using logger.exception) including
agent_id and user_id in the log, and continue to the next agent so one bad agent
doesn't drop remaining guideline writes; keep the outer early-return checks and
use the same functions/methods: _build_agent_scoped_trajectory_messages,
generate_tips, and self.memory.update_entities.
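The per-agent isolation can be sketched as follows; `generate_tips` and `update_entities` are passed in as callables here purely for illustration:

```python
import logging

logger = logging.getLogger("tracker")


def persist_guidelines(agent_scoped_trajectories, generate_tips, update_entities, user_id):
    """Write guidelines per agent; one failing agent doesn't abort the rest."""
    written = []
    for agent_id, messages in agent_scoped_trajectories:
        try:
            tips = generate_tips(messages)
            update_entities(tips)
            written.append(agent_id)
        except Exception:
            logger.exception(
                "guideline write failed agent=%s user=%s", agent_id, user_id
            )
            continue  # move on to the next agent
    return written
```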
- Around line 807-826: collect_step currently calls self.memory.update_entities
inline which can block step execution; instead, create a deferred task to
perform the memory write asynchronously. Wrap the update_entities call
(including the Entity payload and enable_conflict_resolution=False) in an async
helper or use asyncio.to_thread if update_entities is synchronous, then schedule
it with asyncio.create_task(...) so the call returns immediately; reference the
collect_step function and the memory.update_entities invocation along with the
Entity metadata (timestamp, step_index, step_name, intent, user_id, task_id,
session_id) to ensure the same payload is passed to the background task.
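The deferral pattern can be sketched like this, assuming `update_entities` is synchronous (hence `asyncio.to_thread`); keeping a reference to the task prevents it from being garbage-collected mid-flight:

```python
import asyncio


async def collect_step(update_entities_sync, payload, background_tasks):
    """Record a step and schedule the memory write off the hot path."""
    task = asyncio.create_task(asyncio.to_thread(update_entities_sync, payload))
    background_tasks.add(task)  # hold a reference so the task isn't GC'd
    task.add_done_callback(background_tasks.discard)
    return "step recorded"  # returns immediately; the write completes later
```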
In
`@src/cuga/backend/cuga_graph/nodes/api/api_planner_agent/api_planner_agent.py`:
- Around line 220-257: The method _format_preferences_for_prompt currently
appends structured_facts into lines and then iterates preferences to re-emit the
same facts under category headings, duplicating context; modify the logic so
that when structured_facts is non-empty you skip the categorized preferences
dump (or alternatively only include categories for facts not present in
structured_facts). Concretely, inside _format_preferences_for_prompt, after
building the "Structured Facts For Decision Making" block (using
APIPlannerAgent._format_fact_pointer, _escape_inline_code and _safe_text), add a
guard around the subsequent preferences handling (the branch that checks
first_value and iterates preferences) to return or skip that block if
structured_facts exists, or filter each category's facts by excluding items
whose pointer/id matches any pointer built from structured_facts before
appending to lines. Ensure you reference structured_facts, preferences, and the
lines list used to join the final prompt.
In `@src/cuga/backend/cuga_graph/nodes/chat/chat_agent/prompts/pmt.jinja2`:
- Around line 3-4: The template currently always renders the "Persistent memory
facts about the user:" header even when memory_context is empty; update the
pmt.jinja2 template to wrap the header and the {{memory_context}} output in a
Jinja2 conditional that checks if memory_context is defined and non-empty (e.g.,
if memory_context), so the whole block is omitted when there are no memory
facts; ensure you reference the existing memory_context variable and keep the
rest of the prompt intact.
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_node.py`:
- Around line 283-285: The info-level log in CugaLiteNode that mentions fact
storage is too verbose for production; update the call to use logger.debug
instead of logger.info (referencing the logger.info call in CugaLiteNode that
mentions state.user_preferences), or gate the message behind an explicit feature
flag/setting (e.g., a verbose_memory_tracing or DEBUG_MEMORY flag) so the
message only emits when that flag is enabled; ensure the check or log level
change is applied where the current logger.info is invoked so other info logs
remain unaffected.
In `@src/cuga/backend/cuga_graph/state/agent_state.py`:
- Around line 1099-1102: The catch block currently logs only the exception
message when loading preferences ("except Exception as e: logger.error(f\"Error
loading user preferences: {e}\")") which hides the exception type; update the
logging to include the exception class/type (e.g., include type(e) or
e.__class__.__name__) so the logger.error call for the preference-loading
failure (the block that sets state.user_preferences = {}) records both the
message and the exception type to aid debugging while keeping the broad
Exception catch.
In `@src/frontend_workspaces/agentic_chat/src/CardManager.tsx`:
- Around line 369-373: The current entryChanged calculation in CardManager (the
entryChanged constant comparing existing.title, existing.timestamp, and
JSON.stringify(existing.variables)/JSON.stringify(existing.memories) to entry.*)
uses JSON.stringify for deep equality which can be expensive for large objects;
change it to avoid repeated stringification by either (a) memoizing the
stringified representation on the objects when they are created/updated and
compare those cached strings (e.g., add
cachedVariablesString/cachedMemoriesString fields or compute once and reuse), or
(b) replace the JSON.stringify comparisons with a dedicated deep-equal utility
(e.g., lodash.isEqual or a small deepEqual function) and use
isEqual(existing.variables, entry.variables) and isEqual(existing.memories,
entry.memories) inside the entryChanged calculation. Ensure you update/maintain
the cache whenever variables/memories are modified and reference the
entryChanged constant to locate where to apply the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
```python
def validate_path(file_path: str) -> str:
    """
    Validate that a path is within allowed directories.
    Returns the normalized, resolved path if valid.
    Raises ValueError if path is outside allowed directories.
    """
    # Expand and resolve the path
    expanded = expand_home(file_path)
    resolved = normalize_path(expanded)

    # Check if path is within any allowed directory
    for allowed_dir in allowed_directories:
        if resolved.startswith(allowed_dir + '/') or resolved == allowed_dir:
            return resolved

    raise ValueError(f"Access denied: {file_path} is outside allowed directories")
```
Harden validate_path before relying on it as the security boundary.
The current string-prefix check breaks for filesystem-root allowlists (/, C:/), and every mutating tool still performs the actual open/rename/makedirs on the returned pathname later. That leaves a TOCTOU window where a symlink swap can escape the allowlist. Prefer a real ancestor check (Path.relative_to / os.path.commonpath) and revalidate at the syscall boundary for write/move operations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 48 - 63,
validate_path's string-prefix check is unsafe (fails for root allowlists and is
vulnerable to TOCTOU via symlink swaps); update validate_path to canonicalize
both the candidate (from expand_home/normalize_path) and each
allowed_directories entry as pathlib.Path.resolve(...) and then perform a true
ancestor check using Path.relative_to or os.path.commonpath to confirm the
candidate is inside an allowed_dir (handle root like "/" or "C:/" correctly by
canonicalizing first), and in addition revalidate the resolved path at the
syscall boundary for any mutating operation (open/rename/makedirs) by resolving
the path again immediately before the syscall and/or using safe syscalls (e.g.,
os.open with O_NOFOLLOW or compare parent directory inodes) to prevent symlink
races; reference validate_path, expand_home, normalize_path, and
allowed_directories when making these changes.
```python
# Generate diff
if original_content == content:
    return "No changes made"

diff_lines = []
diff_lines.append(f"--- {file_path}")
diff_lines.append(f"+++ {file_path}")

original_lines = original_content.splitlines(keepends=True)
new_lines = content.splitlines(keepends=True)

# Simple diff generation
for i, (old_line, new_line) in enumerate(zip(original_lines, new_lines)):
    if old_line != new_line:
        diff_lines.append(f"- {old_line.rstrip()}")
        diff_lines.append(f"+ {new_line.rstrip()}")

status = "Dry run - no changes made" if dry_run else "Changes applied successfully"
return f"{status}\n\n" + '\n'.join(diff_lines)
```
The generated diff is incomplete for insertions and deletions.
This zip-based comparison only walks the shorter side, so appending/removing lines—especially at EOF—can return an empty or truncated diff even when the file changed. difflib.unified_diff is a better fit here.
🧰 Tools
🪛 Ruff (0.15.7)
[warning] 158-158: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
[warning] 158-158: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 146 -
164, The current diff generation in the block that compares original_content and
content (using original_lines/new_lines and a zip-based loop) misses
insertions/deletions because zip stops at the shorter list; replace this manual
loop with difflib.unified_diff to produce a correct unified diff (import
difflib, call difflib.unified_diff(original_lines, new_lines, fromfile=f"---
{file_path}", tofile=f"+++ {file_path}", lineterm="") ), preserve the existing
status message logic around dry_run, and return the joined unified_diff lines
instead of the zip-generated diff so insertions, deletions and EOF changes are
captured correctly.
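The replacement suggested by the prompt can be sketched as a small helper around `difflib.unified_diff` (the wrapper name is illustrative; the stdlib call is real):

```python
import difflib


def unified_file_diff(original: str, updated: str, file_path: str) -> str:
    """Full-fidelity diff that covers insertions/deletions past the shorter file."""
    lines = difflib.unified_diff(
        original.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=file_path,
        tofile=file_path,
    )
    return "".join(lines)
```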
```python
# Handle glob patterns
if '**' in pattern:
    # Use glob for recursive patterns
    glob_pattern = os.path.join(root_path, pattern)
    matches = glob.glob(glob_pattern, recursive=True)

    for match in matches:
        rel_path = os.path.relpath(match, root_path)

        # Check exclusions
        excluded = False
        for exclude in exclude_patterns:
            if (
                fnmatch.fnmatch(rel_path, exclude)
                or fnmatch.fnmatch(rel_path, f"**/{exclude}")
                or fnmatch.fnmatch(rel_path, f"**/{exclude}/**")
            ):
                excluded = True
                break

        if not excluded:
            results.append(match)
else:
    # Simple pattern in current directory
    for item in os.listdir(root_path):
        item_path = os.path.join(root_path, item)
        if fnmatch.fnmatch(item, pattern):
            results.append(item_path)

return results
```
Non-recursive glob patterns with path segments never match.
When pattern does not contain **, the code only compares it against the basenames returned by os.listdir(root_path). Patterns like src/*.py therefore always miss, and exclude_patterns are ignored in this branch.
🧭 Suggested fix
```diff
-        else:
-            # Simple pattern in current directory
-            for item in os.listdir(root_path):
-                item_path = os.path.join(root_path, item)
-                if fnmatch.fnmatch(item, pattern):
-                    results.append(item_path)
+        else:
+            matches = glob.glob(os.path.join(root_path, pattern), recursive=False)
+            for match in matches:
+                rel_path = os.path.relpath(match, root_path)
+                excluded = any(
+                    fnmatch.fnmatch(rel_path, exclude)
+                    or fnmatch.fnmatch(rel_path, f"**/{exclude}")
+                    or fnmatch.fnmatch(rel_path, f"**/{exclude}/**")
+                    for exclude in exclude_patterns
+                )
+                if not excluded:
+                    results.append(match)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 176 -
205, The non-recursive branch only matches basenames from os.listdir(root_path)
so patterns containing path segments (e.g., "src/*.py") never match and
exclude_patterns aren't applied; change the else branch in main.py to glob
against os.path.join(root_path, pattern) (using glob.glob with recursive=False
or without "**") to get full paths, then apply the same exclusion logic used in
the '**' branch (checking rel_path against exclude_patterns with fnmatch.fnmatch
and the ** variants) before appending to results so path-segment patterns and
exclusions behave consistently (update references: pattern, root_path, results,
exclude_patterns, rel_path).
```python
def build_tree(current_path: str, root_path: str) -> List[Dict[str, Any]]:
    valid_path = validate_path(current_path)
    entries = os.listdir(valid_path)
    result = []

    for entry in sorted(entries):
        entry_path = os.path.join(current_path, entry)
        rel_path = os.path.relpath(entry_path, root_path)

        # Check exclusions
        excluded = False
        for pattern in excludePatterns:
            if (
                fnmatch.fnmatch(rel_path, pattern)
                or fnmatch.fnmatch(rel_path, f"**/{pattern}")
                or fnmatch.fnmatch(rel_path, f"**/{pattern}/**")
            ):
                excluded = True
                break

        if excluded:
            continue

        is_dir = os.path.isdir(entry_path)
        entry_data = {'name': entry, 'type': 'directory' if is_dir else 'file'}

        if is_dir:
            entry_data['children'] = build_tree(entry_path, root_path)
```
Guard directory_tree against symlink loops.
os.path.isdir(entry_path) follows symlinks, so a loop such as loop -> .. will recurse until RecursionError or hang the request. Either skip symlinked directories or track visited realpaths/inodes before descending.
🔁 Suggested fix
```diff
-    def build_tree(current_path: str, root_path: str) -> List[Dict[str, Any]]:
+    def build_tree(current_path: str, root_path: str, seen: set[str]) -> List[Dict[str, Any]]:
         valid_path = validate_path(current_path)
+        real_current = os.path.realpath(valid_path)
+        if real_current in seen:
+            return []
+        seen.add(real_current)
         entries = os.listdir(valid_path)
         result = []
@@
-        is_dir = os.path.isdir(entry_path)
+        is_dir = os.path.isdir(entry_path) and not os.path.islink(entry_path)
         entry_data = {'name': entry, 'type': 'directory' if is_dir else 'file'}
         if is_dir:
-            entry_data['children'] = build_tree(entry_path, root_path)
+            entry_data['children'] = build_tree(entry_path, root_path, seen)
@@
-    tree_data = build_tree(root_path, root_path)
+    tree_data = build_tree(root_path, root_path, set())
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/examples/demo_apps/file_system/build/lib/main.py` around lines 443 -
470, The build_tree function can recurse into symlinked directories causing
infinite loops; modify build_tree (and its caller/initialization) to detect and
avoid symlink loops by tracking visited real paths or inodes: before descending
from entry_path (where os.path.isdir(entry_path) is checked), resolve real =
os.path.realpath(entry_path) (or use os.stat to get st_ino/st_dev) and maintain
a visited set passed through recursive calls; if real (or inode tuple) is
already in visited or if os.path.islink(entry_path) and you prefer to skip
symlinks, skip adding children and do not recurse; ensure validate_path usage
remains and update build_tree signature to accept the visited set if needed.
```python
async def _persist_guidelines_from_trajectory(
    self,
    *,
    namespace_id: str,
    trajectory_messages: list[dict[str, str]],
    user_id: str,
) -> None:
```
Snapshot task_id and session_id before queuing the guideline job.
This coroutine runs after collect_step() returns, but it still reads self.task_id and self.session_id from the singleton when building entity metadata. If the tracker is reset for the next task first, guidelines from one trajectory can be written under the next task/session.
🔧 Suggested fix
```diff
 async def _persist_guidelines_from_trajectory(
     self,
     *,
     namespace_id: str,
     trajectory_messages: list[dict[str, str]],
     user_id: str,
+    task_id: str,
+    session_id: str,
 ) -> None:
@@
 metadata={
     "category": tip.category,
     "rationale": tip.rationale,
     "trigger": tip.trigger,
     "agent_id": agent_id,
     "user_id": user_id,
-    "task_id": self.task_id,
-    "session_id": self.session_id,
+    "task_id": task_id,
+    "session_id": session_id,
     "task_description": result.task_description,
 },
@@
 task = asyncio.create_task(
     self._persist_guidelines_from_trajectory(
         namespace_id=namespace_id,
         trajectory_messages=copy.deepcopy(self.trajectory_messages),
         user_id=normalized_user_id,
+        task_id=self.task_id,
+        session_id=self.session_id,
     )
 )
```
Also applies to: 671-679, 832-836
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/backend/activity_tracker/tracker.py` around lines 641 - 647,
_snapshot self.task_id and self.session_id into local variables at the start of
_persist_guidelines_from_trajectory and use those local variables everywhere you
build guideline entity metadata or enqueue the guideline job instead of reading
self.task_id/self.session_id later; do the same change for the other coroutines
that build/enqueue guideline jobs (the blocks referenced at lines 671-679 and
832-836) so the queued job uses the captured task_id/session_id values rather
than potentially changed singleton attributes.
```python
if query:
    kaizen_client.store_user_facts(
        namespace_id=resolved_namespace,
        message=query,
        user_id=normalized_user_id,
    )

# Extract facts from the response (assuming similar structure to old implementation)
facts = [fact.content for fact in recorded_facts]

# Print debug info (maintaining original behavior)
print(query)
print("------ICLs--------")
for f in facts:
    print(f)

return facts

def create_run(self, namespace_id: str, run_id: str | None = None) -> Run:
    """Create a new run to track Agent steps."""
    return self.memory_client.create_run(namespace_id, run_id)

def get_run(self, namespace_id: str, run_id: str) -> Run:
    """Get an existing run."""
    return self.memory_client.get_run(namespace_id, run_id)

def delete_run(self, namespace_id: str, run_id: str):
    """Delete an existing run."""
    return self.memory_client.delete_run(namespace_id, run_id)

def search_runs(
    self, namespace_id: str, query: str | None = None, filters: dict[str, str] | None = None
) -> Run | None:
    """Search a namespace for a run based on its step which best matches a query."""
    return self.memory_client.search_runs(namespace_id, query, filters)

async def end_run(self, namespace_id: str, run_id: str):
    """End an existing run."""
    return await self.memory_client.end_run(namespace_id, run_id)

def add_step(self, namespace_id: str, run_id: str, step: dict, prompt: str) -> MemoryEvent:
    """Add a new step into a run."""
    return self.memory_client.add_step(namespace_id, run_id, step, prompt)

def list_runs(self, namespace_id: str, limit: int = 10) -> list[Run]:
    """Retrieve the list of runs in a namespace."""
    return self.memory_client.list_runs(namespace_id, limit)

def _get_user_id(self, state: "AgentState") -> str:
    """Extract or generate user ID for memory scoping"""
    # Use the pi field from AgentState
    if hasattr(state, 'pi') and state.pi:
        pi_dict = json.loads(state.pi)
        state.user_id = str(f"{pi_dict['first_name']}_{pi_dict['last_name']}_{pi_dict['phone_number']}")
    else:
        state.user_id = "default_user"
    self.user_id = state.user_id
    return state.user_id

# ...

preferences = kaizen_client.retrieve_user_facts(
    namespace_id=resolved_namespace,
    user_id=normalized_user_id,
    query=query,
)
```
Guard the Kaizen write/read path so memory outages don't fail the whole turn.
store_user_facts() and retrieve_user_facts() are request-path external calls, but any transient backend error currently bubbles out of sync_user_memory() and aborts the agent response. This path should log and fall back to {} so memory degradation doesn't become a hard outage.
🛟 Graceful fallback
resolved_namespace = namespace_id or get_kaizen_namespace_id()
- if query:
- kaizen_client.store_user_facts(
- namespace_id=resolved_namespace,
- message=query,
- user_id=normalized_user_id,
- )
-
- preferences = kaizen_client.retrieve_user_facts(
- namespace_id=resolved_namespace,
- user_id=normalized_user_id,
- query=query,
- )
- return normalized_user_id, preferences
+ try:
+ if query:
+ kaizen_client.store_user_facts(
+ namespace_id=resolved_namespace,
+ message=query,
+ user_id=normalized_user_id,
+ )
+
+ preferences = kaizen_client.retrieve_user_facts(
+ namespace_id=resolved_namespace,
+ user_id=normalized_user_id,
+ query=query,
+ )
+ except Exception:
+ return normalized_user_id, {}
+
+ return normalized_user_id, preferences or {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/backend/memory/memory.py` around lines 245 - 256, The Kaizen
write/read calls in sync_user_memory (calls to kaizen_client.store_user_facts
and kaizen_client.retrieve_user_facts) must be fault-tolerant: wrap the
store_user_facts call and the retrieve_user_facts call in try/except blocks, log
the exception (including context like resolved_namespace, normalized_user_id,
and query) and ensure retrieve_user_facts falls back to an empty dict {} on
error so a transient Kaizen outage doesn't raise and abort the agent turn; keep
the existing behavior when calls succeed.
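A runnable sketch of the degrade-gracefully pattern the comment asks for. `FlakyClient` below is a hypothetical stand-in for the real Kaizen client, there only to simulate an outage; the shape of interest is the try/except that logs and falls back to `{}` instead of letting the exception abort the turn.

```python
import logging

logger = logging.getLogger(__name__)


class FlakyClient:
    """Hypothetical client whose calls raise during a backend outage."""

    def store_user_facts(self, namespace_id, message, user_id):
        raise ConnectionError("memory backend unavailable")

    def retrieve_user_facts(self, namespace_id, user_id, query):
        raise ConnectionError("memory backend unavailable")


def sync_user_memory(client, namespace_id, user_id, query=None):
    """Write and read user facts, degrading to {} instead of raising."""
    try:
        if query:
            client.store_user_facts(
                namespace_id=namespace_id, message=query, user_id=user_id
            )
        preferences = client.retrieve_user_facts(
            namespace_id=namespace_id, user_id=user_id, query=query
        )
    except Exception as exc:
        # Log metadata only; the agent turn continues without memory.
        logger.warning("memory sync failed user=%s error=%s", user_id, exc)
        return user_id, {}
    return user_id, preferences or {}


user_id, prefs = sync_user_memory(FlakyClient(), "ns-1", "alice", query="likes tea")
print(user_id, prefs)  # alice {}
```

When the backend is healthy the same function returns the retrieved preferences unchanged, so callers never need to distinguish the two cases.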
logger.warning(
    "Memory guideline retrieval failed namespace_id='{}' agent_id='{}' query='{}' error={}",
    namespace_id,
    normalized_agent_id,
    (query or "")[:120],
    e,
)
Stop logging raw memory content and user query text.
Lines 31-37, 55-59, and 71-74 can expose sensitive user facts/PII in logs. Keep only counts, namespace, and agent metadata.
🔒 Suggested fix
- logger.warning(
- "Memory guideline retrieval failed namespace_id='{}' agent_id='{}' query='{}' error={}",
- namespace_id,
- normalized_agent_id,
- (query or "")[:120],
- e,
- )
+ logger.warning(
+ "Memory guideline retrieval failed namespace_id='{}' agent_id='{}' error={}",
+ namespace_id,
+ normalized_agent_id,
+ e,
+ )
@@
- logger.debug(
- "Memory guideline retrieval: no results namespace_id='{}' agent_id='{}' query='{}' limit={}",
- namespace_id,
- normalized_agent_id,
- (query or "")[:120],
- limit,
- )
+ logger.debug(
+ "Memory guideline retrieval: no results namespace_id='{}' agent_id='{}' limit={}",
+ namespace_id,
+ normalized_agent_id,
+ limit,
+ )
@@
- logger.debug(
- "Memory guideline retrieval: {} result(s) namespace_id='{}' agent_id='{}' query='{}' limit={}",
- len(tips),
- namespace_id,
- normalized_agent_id,
- (query or "")[:120],
- limit,
- )
+ logger.debug(
+ "Memory guideline retrieval: {} result(s) namespace_id='{}' agent_id='{}' limit={}",
+ len(tips),
+ namespace_id,
+ normalized_agent_id,
+ limit,
+ )
@@
- logger.debug(
- "Memory guideline retrieval tips:\n{}",
- "\n".join(lines),
- )
+ logger.debug("Memory guideline retrieval produced {} formatted tip(s)", len(lines))Also applies to: 55-59, 71-74
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/backend/memory/utils/memory_tips_formatted.py` around lines 31 - 37,
The logger.warning calls (e.g., the one invoking logger.warning with
namespace_id, normalized_agent_id, (query or "")[:120], e) are logging raw
memory content and user queries; change these log messages to never include the
raw query or memory text and instead log only safe metadata such as
namespace_id, normalized_agent_id, and counts/lengths (e.g., query_length or
result_count) plus the error object (e). Update every occurrence of this pattern
(the logger.warning calls around the shown diff and the similar occurrences at
the other two locations) to replace raw text with sanitized fields like
query_length and result_count or a masked placeholder.
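One way to make the sanitized pattern hard to get wrong is to build the log fields in a single helper so no call site can slip raw text in. The field names below (`query_length`, `result_count`) are illustrative, not the project's schema.

```python
def safe_log_fields(namespace_id, agent_id, query, tips):
    """Build log fields that carry only metadata, never user content."""
    return {
        "namespace_id": namespace_id,
        "agent_id": agent_id,
        "query_length": len(query or ""),
        "result_count": len(tips),
    }


fields = safe_log_fields("ns-1", "plan_agent", "what is my billing plan?", ["tip"])
print(fields["query_length"], fields["result_count"])  # 24 1
```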
def _get_memories_for_ui(local_state: AgentState) -> Dict[str, List[Dict[str, Any]]]:
    """Return memories directly from state.user_preferences for UI responses."""
    if not settings.advanced_features.enable_fact:
        return {}
    return local_state.user_preferences if isinstance(local_state.user_preferences, dict) else {}
Normalize legacy memory payloads before returning them to the UI.
The rest of this PR still supports the old {fact_id: content} shape, but this helper now forwards state.user_preferences verbatim as Record<string, any[]>. Historical or migrated threads will therefore show an empty Memories panel and memories_count = 0 even though the agent can still use those facts. Normalize to the categorized list shape here before serializing.
🧩 Possible normalization
def _get_memories_for_ui(local_state: AgentState) -> Dict[str, List[Dict[str, Any]]]:
"""Return memories directly from state.user_preferences for UI responses."""
if not settings.advanced_features.enable_fact:
return {}
- return local_state.user_preferences if isinstance(local_state.user_preferences, dict) else {}
+ raw = local_state.user_preferences
+ if not isinstance(raw, dict):
+ return {}
+
+ from collections import defaultdict
+ from cuga.backend.cuga_graph.state.user_preferences_context import extract_structured_facts
+
+ grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
+ for fact in extract_structured_facts(raw):
+ grouped[fact["category"]].append(fact)
+ return dict(grouped)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/backend/server/main.py` around lines 212 - 216, The helper
_get_memories_for_ui should normalize legacy memory payloads in
local_state.user_preferences before returning them: keep the
settings.advanced_features.enable_fact guard, but when user_preferences is a
dict and its values are not already lists (the old {fact_id: content} shape),
transform it into the categorized Record<string, any[]> shape expected by the UI
by mapping those entries into a category (e.g., "Facts") as an array of items
(include the original fact id and its content as object fields), and otherwise
return user_preferences unchanged when it already matches the categorized lists
shape.
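A self-contained sketch of the normalization described above. The "Facts" bucket name and the per-fact field layout are assumptions for illustration, not the project's actual schema; already-categorized payloads pass through untouched.

```python
from collections import defaultdict


def normalize_memories(raw):
    """Coerce a user_preferences payload into the {category: [fact, ...]} UI shape."""
    if not isinstance(raw, dict):
        return {}
    if all(isinstance(v, list) for v in raw.values()):
        return raw  # already the categorized Record<string, any[]> shape
    # Legacy flat {fact_id: content} shape: group under a single bucket.
    grouped = defaultdict(list)
    for fact_id, content in raw.items():
        grouped["Facts"].append(
            {"id": str(fact_id), "category": "Facts", "content": str(content)}
        )
    return dict(grouped)


legacy = {"f1": "prefers dark mode", "f2": "timezone UTC+2"}
print(len(normalize_memories(legacy)["Facts"]))  # 2
```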
[memory.kaizen.conflict_resolution.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.milvus.step_processing.model]
[memory.kaizen.fact_extraction.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.milvus.fact_extraction.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.tips_extractor.model]
[memory.kaizen.tips.model]
platform = "openai"
temperature = 0.1
max_tokens = 5000
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -n -C2 'memory\.(mem0\.model|milvus\.step_processing\.model|milvus\.fact_extraction\.model|tips_extractor\.model|kaizen\.(conflict_resolution|fact_extraction|tips)\.model)' || true

Repository: cuga-project/cuga-agent
Length of output: 10932
Update settings.litellm.toml to use the new memory.kaizen.* configuration keys.
The runtime code (src/cuga/backend/memory/memory.py lines 135, 139, 143) reads only from memory.kaizen.conflict_resolution.model, memory.kaizen.fact_extraction.model, and memory.kaizen.tips.model. However, settings.litellm.toml still defines the retired keys: memory.mem0.model, memory.milvus.step_processing.model, memory.milvus.fact_extraction.model, and memory.tips_extractor.model. These legacy entries will be silently ignored at runtime, and the litellm provider will fall back to defaults. Rename all four keys in settings.litellm.toml to match the new naming scheme used in all other provider configurations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cuga/configurations/models/settings.openai.toml` around lines 51 - 64,
Rename the four legacy TOML keys in settings.litellm.toml so they match the new
memory.kaizen.* keys that the runtime expects: change memory.mem0.model ->
memory.kaizen.conflict_resolution.model, memory.milvus.step_processing.model ->
memory.kaizen.conflict_resolution.model (or merge if intended),
memory.milvus.fact_extraction.model -> memory.kaizen.fact_extraction.model, and
memory.tips_extractor.model -> memory.kaizen.tips.model; ensure the platform,
temperature, and max_tokens values are preserved under the new keys so the code
in memory.py that reads memory.kaizen.conflict_resolution.model,
memory.kaizen.fact_extraction.model, and memory.kaizen.tips.model will pick them
up.
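For reference, a sketch of what the three stanzas the runtime actually reads would look like in settings.litellm.toml, carrying over the platform/temperature/max_tokens values already present in this file (how the retired milvus/tips_extractor entries map onto them is the author's call):

```toml
[memory.kaizen.conflict_resolution.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.kaizen.fact_extraction.model]
platform = "openai"
temperature = 0.1
max_tokens = 1000

[memory.kaizen.tips.model]
platform = "openai"
temperature = 0.1
max_tokens = 5000
```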
rawFacts.forEach((fact: any) => {
  const content = typeof fact?.content === "string" ? fact.content : "";
  if (!content) {
    return;
  }

  facts.push({
    id: typeof fact?.id === "string" ? fact.id : undefined,
    category,
    content,
    key: typeof fact?.key === "string" ? fact.key : null,
    value: typeof fact?.value === "string" ? fact.value : null,
  });
Don't drop structured facts that only have key/value.
The shared memory formatter treats a fact as valid when any of key, value, or content is present, but this extractor returns early unless content is non-empty. Kaizen facts that come back as structured pairs will disappear from the sidebar even though the backend still counts them. Build a display string from key/value when content is missing.
🧩 Suggested fallback
rawFacts.forEach((fact: any) => {
- const content = typeof fact?.content === "string" ? fact.content : "";
+ const key = typeof fact?.key === "string" ? fact.key : "";
+ const value = typeof fact?.value === "string" ? fact.value : "";
+ const content =
+ typeof fact?.content === "string" && fact.content
+ ? fact.content
+ : [key, value].filter(Boolean).join(": ");
if (!content) {
return;
}
facts.push({
id: typeof fact?.id === "string" ? fact.id : undefined,
category,
content,
- key: typeof fact?.key === "string" ? fact.key : null,
- value: typeof fact?.value === "string" ? fact.value : null,
+ key: key || null,
+ value: value || null,
});
});🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/frontend_workspaces/agentic_chat/src/VariablesSidebar.tsx` around lines
60 - 72, The current rawFacts.forEach callback ignores facts that lack a string
content, dropping valid structured facts; change the extraction inside the loop
used by VariablesSidebar (the rawFacts.forEach and facts.push logic) to compute
content as: if fact.content is a non-empty string use it, otherwise if fact.key
or fact.value exist build a display string from those (e.g. "key: value" or the
non-empty key/value alone) and treat that as content so you don't return early;
keep id/key/value assignments unchanged (id from fact.id, key from fact.key,
value from fact.value) and only skip pushing when neither content nor key/value
exist.