refactor: enhance encryption handling and error reporting in agent re… #143

Merged
pikann merged 1 commit into master from fix/fix-production-agent-error
Jun 5, 2026
@pikann pikann commented Jun 5, 2026

Summary

This PR fixes several production-environment issues in the AI agent service: encryption key misconfiguration, repo tools unavailability in production containers, and silent swallowing of agent errors.

Changes

Fix encryption key propagation (deploy/docker-compose.prod.yml, scripts/install.sh)

  • Added ENCRYPTION_KEY to the ai-agent service in the production docker-compose file. Without it, the service could not decrypt the LLM API keys that the API service stores in encrypted form.
  • Fixed the install script to generate a proper 64-character hex key (32 bytes for AES-256), up from 32 characters (16 bytes).
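For reference, the key-size arithmetic can be checked with a minimal Python sketch. It is not the install script's actual command (which this description does not show), and the function name `generate_encryption_key` is hypothetical; it only illustrates that 32 random bytes encode to 64 hex characters.

```python
import secrets


def generate_encryption_key() -> str:
    """Return 32 random bytes encoded as a 64-character hex string.

    Hypothetical helper: AES-256 needs a 32-byte key, and each byte
    takes two hex characters, hence 64 characters in total.
    """
    return secrets.token_hex(32)


key = generate_encryption_key()
print(len(key), len(bytes.fromhex(key)))
```

A 32-character hex string decodes to only 16 bytes, which is an AES-128 key size, not AES-256.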

Fix repo tools injection in production (docker_workspace.py)

  • In development, /app is bind-mounted from the host into the agent container, so sibling sandbox containers could share it via a volume mount. In production, /app is baked into the image and the host path detection fails.
  • Added _copy_repo_tools_to_container() which uses Docker's put_archive API to inject repo_tools.py and the required __init__.py files directly into the running sandbox container at /tmp/paca_tools when no bind-mount is detected.
  • OH_EXTRA_PYTHON_PATH is now always set — to /app (dev) or /tmp/paca_tools (prod).
  • Removed the MCP build directory (/mcp) volume sharing, which is no longer needed.
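The injection path described above might look roughly like the following sketch. It is not the code from docker_workspace.py: the helper names (`build_tools_archive`, `copy_tools_to_container`, `extra_python_path`) and the package layout passed in `files` are assumptions. The only Docker SDK calls used are `Container.exec_run` and `Container.put_archive`, and the container object is passed in by the caller.

```python
import io
import tarfile
import time


def build_tools_archive(files: dict[str, bytes]) -> bytes:
    """Pack {relative_path: content} into an in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            info.mtime = int(time.time())
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def copy_tools_to_container(container, dest: str, files: dict[str, bytes]) -> bool:
    """Inject files into a running container (hypothetical helper).

    `container` is expected to behave like a docker SDK Container.
    put_archive extracts into an existing directory, so create it first.
    """
    container.exec_run(["mkdir", "-p", dest])
    return container.put_archive(dest, build_tools_archive(files))


def extra_python_path(bind_mounted: bool) -> str:
    """Pick the value for OH_EXTRA_PYTHON_PATH: /app in dev, /tmp/paca_tools in prod."""
    return "/app" if bind_mounted else "/tmp/paca_tools"
```

Building the archive in memory avoids touching the host filesystem, which matters in production where the agent container has no bind-mounted /app to share.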

Improve error reporting (executor.py)

  • _wait_for_done_or_stop now returns (stopped, errored) instead of a single bool.
  • Conversations ending with ERROR or STUCK status now correctly set the conversation to "failed" and emit agent.conversation.failed, instead of being reported as finished.
  • Added _get_conversation_error_detail() to extract and log the ConversationErrorEvent detail for observability.
  • Fixed visualizer=_QuietVisualizer → visualizer=_QuietVisualizer() (was passing the class, not an instance).
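The two-flag outcome can be illustrated with a small sketch. The helper names, the precedence of a stop request over an error status, and the `stopped` and `finished` event names are assumptions; only the ERROR and STUCK statuses and the agent.conversation.failed event come from the description above.

```python
FAILURE_STATUSES = {"ERROR", "STUCK"}


def classify_outcome(final_status: str, stop_requested: bool) -> tuple[bool, bool]:
    """Return (stopped, errored) for a conversation that has ended.

    Assumption: an explicit stop request takes precedence over an
    error status, so the two flags are never both True.
    """
    stopped = stop_requested
    errored = (not stopped) and final_status.upper() in FAILURE_STATUSES
    return stopped, errored


def event_for(stopped: bool, errored: bool) -> str:
    """Map the outcome flags to an event name."""
    if stopped:
        return "agent.conversation.stopped"  # hypothetical event name
    if errored:
        return "agent.conversation.failed"  # named in the PR description
    return "agent.conversation.finished"  # hypothetical event name


stopped, errored = classify_outcome("STUCK", stop_requested=False)
print(event_for(stopped, errored))
```

With a single bool, a run that ended in ERROR was indistinguishable from a normal completion; the second flag is what lets the caller mark the conversation as failed.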

Improve decryption failure handling (agent_repository.py)

  • On decryption failure, return "" instead of the raw ciphertext. Forwarding ciphertext to the LLM provider produced misleading "token expired / incorrect key" errors; an empty key surfaces a clear "missing API key" error instead.
  • Added an explicit warning when ENCRYPTION_KEY is unset, pointing to the fix.
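A minimal sketch of this fallback behaviour follows. It is not the code in agent_repository.py: `safe_decrypt` and the injected `decrypt` callable are hypothetical, and returning "" when ENCRYPTION_KEY is unset is an assumption (the description only says a warning is logged in that case).

```python
import logging
import os
from typing import Callable

logger = logging.getLogger(__name__)


def safe_decrypt(ciphertext: str, decrypt: Callable[[str, str], str]) -> str:
    """Decrypt an API key, returning "" rather than ciphertext on failure.

    `decrypt(ciphertext, key)` stands in for the real cipher routine.
    """
    key = os.environ.get("ENCRYPTION_KEY", "")
    if not key:
        # Assumption: treat a missing key like a failed decryption.
        logger.warning("ENCRYPTION_KEY is not set; set it on the ai-agent service.")
        return ""
    try:
        return decrypt(ciphertext, key)
    except Exception:
        # Never forward ciphertext: the provider would reject it with a
        # misleading "incorrect key" style error.
        logger.warning("Failed to decrypt API key; returning an empty key.")
        return ""
```

Returning an empty string makes the downstream failure mode a "missing API key" error, which points at configuration rather than at the provider credentials.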

@pikann pikann merged commit bc36ef6 into master Jun 5, 2026
3 checks passed
@pikann pikann deleted the fix/fix-production-agent-error branch June 5, 2026 14:22