Port all Bash to Python — complete migration#77
Conversation
_merge_media_group() called _enqueue_single() without the required platform parameter at 3 call sites, causing a TypeError whenever Telegram media groups (multiple photos) were merged or fell back. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port Bash webhook handlers (telegram.sh, whatsapp.sh, claude.sh, tts.sh, stt.sh) to Python modules, eliminating 400-800ms subprocess overhead per message on Raspberry Pi. The new in-process handlers use stdlib only (urllib, json, sqlite3) matching the codebase's minimal-dependency philosophy. New modules: - lib/util.py: shared utilities (sanitization, validation, multipart encoder) - lib/config.py: BotConfig class with env file loading - lib/telegram_api.py: TelegramClient (send, download, typing, reactions) - lib/whatsapp_api.py: WhatsAppClient (send, download, mark_read) - lib/elevenlabs.py: TTS/STT integration - lib/claude_runner.py: Claude CLI invocation with JSON parsing - lib/handlers.py: webhook orchestrator with unified pipeline Feature flag: set CLAUDIO_PYTHON_HANDLERS=1 to opt in. Default remains Bash handlers for safe rollout. 341 tests across 7 new test files, all passing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of ChangesHello @claudio-pi, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant architectural shift by porting the application's webhook processing from legacy Bash scripts to a new set of Python modules. This transition is primarily aimed at enhancing performance by removing the overhead associated with spawning subprocesses for each message. The new Python implementation is designed to be lightweight, relying solely on standard library features to minimize external dependencies. To facilitate a controlled deployment, a feature flag has been implemented, allowing users to opt-in to the new Python handlers and ensure stability during the rollout. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request is a major and well-executed refactoring, porting several Bash-based webhook handlers to new Python modules. The new Python code is well-structured, follows good practices by using only the standard library, and includes a comprehensive test suite which is excellent. The introduction of a feature flag for a safe rollout is also a great strategy.
I've identified a few areas for improvement:
- Configuration File Handling: The
save_modelfunction inlib/config.pydestructively rewrites thebot.envfile, which could lead to loss of comments or other configuration variables. I've suggested a safer read-modify-write pattern. - JSON Parsing: In
lib/claude_runner.py, a manual method is used to strip quotes from JSON strings. This is fragile and can fail with escaped characters. I've recommended usingjson.loads()for robust parsing. - File Permissions: There's an inconsistency in file permissions. Downloaded files in
lib/telegram_api.pyare created with restrictive permissions (0o600), but inlib/whatsapp_api.pythey are not. I've suggested aligning the WhatsApp handler for better security.
Overall, this is a high-quality submission. Addressing these points will enhance the robustness and security of the new Python handlers.
There was a problem hiding this comment.
5 issues found across 15 files
Confidence score: 2/5
- The path construction in
lib/config.pydoesn’t validatebot_id, which can enable path traversal and arbitrary file read/write—this is a high-impact security risk. lib/server.pyis likely to fail importinglib.handlerswhen run aspython3 lib/server.py, so the Python handlers flag could break at runtime.- Overall risk is elevated due to multiple high-severity issues (security + runtime failure), making this a cautious merge.
- Pay close attention to
lib/config.py,lib/server.py,lib/whatsapp_api.py,lib/claude_runner.py,lib/telegram_api.py- security validation, import pathing, and file I/O correctness.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="lib/whatsapp_api.py">
<violation number="1" location="lib/whatsapp_api.py:291">
P2: The `resp.read()` call has no size cap, allowing unbounded memory consumption. Additionally, the file is written to disk *before* the size validation. Read with a size limit and validate before writing.
Consider reading at most `_MAX_MEDIA_SIZE + 1` bytes, then checking the length before writing to disk. This prevents both memory exhaustion and unnecessary disk writes for oversized files.</violation>
</file>
<file name="lib/config.py">
<violation number="1" location="lib/config.py:144">
P1: Validate bot_id (e.g., `^[a-zA-Z0-9_-]+$`) before using it to build filesystem paths; otherwise a crafted bot_id can escape `~/.claudio/bots/` and read/write arbitrary files.
(Based on your team's feedback about validating bot_id values before constructing filesystem paths.) [FEEDBACK_USED]</violation>
</file>
<file name="lib/claude_runner.py">
<violation number="1" location="lib/claude_runner.py:360">
P2: Manual quote stripping doesn't handle JSON escape sequences. If a notification message contains characters that are escaped in JSON (`\"`, `\\`, `\n`, etc.), they will appear as literal escape sequences instead of being decoded. Use `json.loads(line)` for correct parsing.</violation>
</file>
<file name="lib/server.py">
<violation number="1" location="lib/server.py:304">
P1: When the Python handlers feature flag is enabled, `from lib.handlers import process_webhook` will fail because server.py is executed as `python3 lib/server.py`, so `lib` isn’t on sys.path as a package. This makes the new handler path crash at runtime. Add the repo root to sys.path before importing or adjust imports/module execution so `lib.*` resolves.</violation>
</file>
<file name="lib/telegram_api.py">
<violation number="1" location="lib/telegram_api.py:276">
P2: Use a loop (or a file object) to ensure all bytes are written; os.write may write only part of the buffer, which can truncate downloads.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Security fixes: - Add db_file path validation (must be named history.db, no traversal) - Sanitize MultipartEncoder header values against CRLF injection - Validate env file keys against [A-Za-z_][A-Za-z0-9_]* pattern - Case-insensitive HTTPS scheme check for WhatsApp media URLs Bug fixes: - WhatsApp document history caption: use msg.caption only (not msg.text) - WhatsApp document enrichment condition: check msg.caption only - Add traceback logging on unhandled exceptions in _process_message - Reduce typing thread join timeout from 5s to 1s - Add explanatory comment for missing WhatsApp typing indicator Documentation: - CLAUDE.md: document Python handler modules and feature flag - CONTRIBUTING.md: add Python modules, pytest test suites Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Telegram/WhatsApp API error logs now extract error descriptions instead of dumping full JSON responses (prevents leaking tokens or internal IDs to log files) - README.md: add pytest instructions, CLAUDIO_PYTHON_HANDLERS env var documentation, and Python handlers roadmap entry Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
1 issue found across 3 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="README.md">
<violation number="1" location="README.md:395">
P3: Remove the duplicate “Health Check” heading added after the Webhook Handlers section to avoid redundant documentation headers.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Security: - Sanitize ElevenLabs error logs (extract detail.message only, truncate to 100 chars — matches Telegram/WhatsApp treatment) - Add length limits to voice_id and model regex (max 64 chars) - Add defense-in-depth model validation in run_claude() Docs: - Remove broken link to deleted DUAL_PLATFORM_SETUP.md in README - Replace internal phase references in server.py comment Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- claude_runner.py: Use json.loads() for notifier log parsing with manual fallback, fixing fragile quote-stripping - config.py: Add bot_id regex validation in from_env_files() to prevent path traversal; refactor save_model() to do targeted MODEL= line update preserving comments and extra variables - server.py: Add repo root to sys.path before importing lib.handlers so the import resolves when server.py is launched as a standalone script - telegram_api.py: Use os.fdopen() instead of raw os.write() to prevent partial writes on large files - whatsapp_api.py: Cap resp.read() at 16MB+1 to prevent OOM; use os.open() with 0o600 permissions matching telegram_api.py - README.md: Remove duplicate "Health Check" heading Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This is a substantial and well-executed refactoring that ports the core webhook handling logic from Bash to Python, which should yield significant performance improvements. The new Python modules are well-structured, with good separation of concerns (API clients, configuration, orchestration, etc.). The code demonstrates strong attention to security, with input validation, restrictive file permissions, and measures against path traversal and header injection. The retry logic in the API clients and the graceful shutdown handling in the server are also robust.
I've included a few suggestions for improvement, mainly around logging, history recording, and code structure. The most important one is a bug in how conversation history is recorded for messages with multiple media types. Several suggestions align with established repository rules for robustness and consistency.
Overall, this is a high-quality pull request that significantly improves the project's architecture and performance.
- Remove _USE_PYTHON_HANDLERS flag and CLAUDIO_BIN constant - Remove _process_webhook_bash() — Bash subprocess handler path - Inline _process_webhook_python() as _process_webhook() - Simplify notifier log fallback: use raw line on json.loads failure - Move memory daemon import to module level in handlers.py - Update CLAUDE.md, README.md, CONTRIBUTING.md to reflect Python handlers as the sole webhook processing path Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
save_model() wrote to bot.env but the server's in-memory bot registry still had the old model value. Send SIGHUP after saving to trigger load_bots() and refresh the cache. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="lib/handlers.py">
<violation number="1" location="lib/handlers.py:244">
P2: Sending `SIGHUP` unconditionally can terminate the process when no SIGHUP handler is installed (e.g., tests or non-server usage). Guard this so the signal is only sent when a handler is registered, or use a reload callback instead.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Python handlers are now the sole webhook path. Delete claude.sh, tts.sh, stt.sh, and their tests. Strip telegram.sh and whatsapp.sh down to setup wizard + send helpers. Remove the _webhook CLI subcommand and its sourced dependencies (history.sh, memory.sh). Fix 6 stale test_server.py tests by updating enqueue_webhook signatures and patching _process_webhook. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request is a major and impressive refactoring that ports the Bash webhook handlers to new, in-process Python modules. This should provide a significant performance improvement by eliminating subprocess overhead. The new Python modules for the Telegram, WhatsApp, ElevenLabs, and Claude CLI integrations are well-structured and robust, with good error handling and security considerations like path validation. The addition of 341 new tests across 7 new test files is excellent and provides great confidence in the new implementation.
I have found two issues:
- A critical discrepancy where the described feature flag for a safe rollout (
CLAUDIO_PYTHON_HANDLERS) appears to be missing from the implementation. The old Bash handler path has been completely removed, preventing a fallback. This issue is highlighted by rules concerning configuration consistency and the removal of obsolete code. - A medium-severity bug in
lib/handlers.pywhere the text accompanying a document is not correctly saved to the conversation history.
Once these issues are addressed, this will be a fantastic improvement to the project.
Replace the entire Bash layer (11 .sh files, 4562 lines) with Python equivalents (5 new modules, ~2400 lines). The codebase is now pure Python (stdlib only, no external deps except optional fastembed). New modules: - lib/cli.py — CLI dispatch with lazy imports for fast startup - lib/setup.py — Telegram/WhatsApp interactive setup wizards - lib/service.py — systemd/launchd management, cloudflared, webhooks - lib/backup.py — rsync-based rotating backups with cron scheduling - lib/health_check.py — standalone cron health checker with alerts Also: - Expanded lib/config.py with ClaudioConfig class, save_bot_env() - Added CLI output helpers to lib/util.py - Rewrote claudio entry point from 207-line Bash to 7-line Python - Deleted all 8 .sh library files and 8 .bats test files - Updated CLAUDE.md and CONTRIBUTING.md for pure-Python architecture 640 Python tests passing (223 new + 417 existing). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use `msg.caption or msg.text` for document history entries, matching the existing image handling logic for consistency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The tool_summary (from post-tool-use hook) and notifier_messages were being prepended to the assistant's conversation history entries, producing lines like `Bash "curl -s ..."`. When this history was injected into Claude's prompt on subsequent requests, Claude mimicked the pattern and output tool descriptions as text instead of actually invoking tools. Fix: store only the clean response text in conversation history. Tool summaries are still logged separately but no longer pollute the prompt context. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The H:/A: prefix format in conversation history was being mimicked by Claude, causing it to generate fake multi-turn conversations with tool descriptions in its responses instead of actually executing tools. Switch to <user>/<assistant> XML tags which Claude is trained to respect as structural markup rather than content to reproduce. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Helps diagnose tool-use issues by logging num_turns and tools_used alongside response_len. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace ShellCheck + BATS with ruff + pytest in CI workflow. Fix 47 ruff errors: unused imports, ambiguous variable names, unused assignments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace os.getlogin() fallback with env var chain that works in headless environments like GitHub Actions runners. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
fastembed)Migration phases
handlers.py,telegram_api.py,whatsapp_api.py,elevenlabs.py,claude_runner.py,config.py,util.pyconfig.py—ClaudioConfig, multi-bot migration, env file I/Osetup.py— interactive Telegram + WhatsApp bot configurationservice.py— systemd/launchd, cloudflared, cron, hooksbackup.py— rsync-based rotating backups with cronhealth_check.py— cron health monitoring, restart, alertscli.py— Python CLI dispatch (was 207-line Bash script).shfiles and 5.batstest filesBug fixes (post-migration)
Tool: Bash "curl ...") instead of executing tools — removed tool_summary from stored historyH:/A:→<user>/<assistant>XML tags) to prevent Claude from mimicking the formatmsg.caption or msg.textfallbackArchitecture
Claude CLI remains a subprocess (cannot be imported as a library).
Test plan
python3 -m pytest tests/ -v)ruff check lib/ tests/)/opus,/sonnet,/haikumodel switching🤖 Generated with Claude Code