diff --git a/CLAUDE.md b/CLAUDE.md index 07e3126..45231ea 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -4,15 +4,16 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co ## What is Claudio -Claudio is a Telegram-to-Claude Code bridge. It runs a local HTTP server (port 8421), tunneled via cloudflared, that receives Telegram webhook messages and forwards them as one-shot prompts to the Claude Code CLI. Responses are sent back to Telegram. +Claudio is a messaging-to-Claude Code bridge. It supports both Telegram and WhatsApp Business API, running a local HTTP server (port 8421), tunneled via cloudflared, that receives webhook messages and forwards them as one-shot prompts to the Claude Code CLI. Responses are sent back to the originating platform. ## Architecture -- `claudio` — Main CLI entry point, dispatches subcommands (`status`, `start`, `install [bot_id]`, `uninstall {|--purge}`, `update`, `restart`, `log`, `telegram setup`, `version`). +- `claudio` — Main CLI entry point, dispatches subcommands (`status`, `start`, `install [bot_id]`, `uninstall {|--purge}`, `update`, `restart`, `log`, `telegram setup`, `whatsapp setup`, `version`). - `lib/config.sh` — Multi-bot config management. Handles global config (`$HOME/.claudio/service.env`) and per-bot config (`$HOME/.claudio/bots//bot.env`). Functions: `claudio_load_bot()`, `claudio_save_bot_env()`, `claudio_list_bots()`, `_migrate_to_multi_bot()` (auto-migrates single-bot installs). - `lib/server.sh` — Starts the Python HTTP server and cloudflared named tunnel together. Handles webhook registration with retry logic. `register_all_webhooks()` registers webhooks for all configured bots. -- `lib/server.py` — Python HTTP server (stdlib `http.server`), listens on port 8421, routes POST `/telegram/webhook`. Multi-bot dispatch: matches incoming webhooks to bots via secret-token header, loads bot registry from `~/.claudio/bots/*/bot.env`. SIGHUP handler for hot-reload. 
Composite queue keys (`bot_id:chat_id`) for per-bot message isolation. `/reload` endpoint (requires `MANAGEMENT_SECRET` authentication). Logging includes bot_id via `log_msg()` helper. +- `lib/server.py` — Python HTTP server (stdlib `http.server`), listens on port 8421, routes POST `/telegram/webhook` and POST/GET `/whatsapp/webhook`. Multi-bot dispatch: matches Telegram webhooks via secret-token header, WhatsApp webhooks via HMAC-SHA256 signature verification. Supports dual-platform bots (same bot_id serving both Telegram and WhatsApp). Loads bot registry from `~/.claudio/bots/*/bot.env`. SIGHUP handler for hot-reload. Composite queue keys (`bot_id:chat_id` for Telegram, `bot_id:phone_number` for WhatsApp) for per-bot, per-user message isolation. `/reload` endpoint (requires `MANAGEMENT_SECRET` authentication). Logging includes bot_id via `log_msg()` helper. - `lib/telegram.sh` — Telegram Bot API integration (send messages, parse webhooks, image download/validation, document download, voice message handling). `telegram_setup()` accepts optional bot_id for per-bot configuration. Model commands (`/haiku`, `/sonnet`, `/opus`) save to bot.env when `CLAUDIO_BOT_DIR` is set. +- `lib/whatsapp.sh` — WhatsApp Business API integration (send messages with 4096 char chunking, parse webhooks, image/document/audio download with magic byte validation, voice message transcription via ElevenLabs). Uses WhatsApp Cloud API v21.0. `whatsapp_setup()` accepts optional bot_id for per-bot configuration. Model commands (`/haiku`, `/sonnet`, `/opus`) save to bot.env when `CLAUDIO_BOT_DIR` is set. Security: HMAC-SHA256 signature verification, per-bot app secrets, authorized phone number enforcement. - `lib/claude.sh` — Claude Code CLI wrapper with conversation context injection. Uses global SYSTEM_PROMPT.md (from repo root). Supports per-bot CLAUDE.md (loaded from `$CLAUDIO_BOT_DIR` when set). - `lib/history.sh` — Conversation history wrapper, delegates to `lib/db.sh` for SQLite storage. 
Per-bot history stored in `$CLAUDIO_BOT_DIR/history.db`. - `lib/db.sh` — SQLite database layer for conversation storage. diff --git a/README.md b/README.md index bc2b88e..89ed5ba 100644 --- a/README.md +++ b/README.md @@ -2,15 +2,16 @@ ![](header.png) -Claudio is an adapter for Claude Code CLI and Telegram. It makes a tunnel between a private network and Telegram's API. So that users can chat with Claude Code remotely, in a safe way. +Claudio is an adapter for Claude Code CLI and messaging platforms (Telegram and WhatsApp Business API). It makes a tunnel between a private network and messaging APIs, allowing users to chat with Claude Code remotely in a safe way. ``` +---------------------------------------------+ | Remote Machine | | | - | | +----------+ | +---------+ +-----------------+ | - | Telegram |<--------+--->| Claudio |<----->| Claude Code CLI | | + | Telegram |<--------+--->| | | | | + +----------+ | | Claudio |<----->| Claude Code CLI | | + | WhatsApp |<--------+--->| | | | | +----------+ | +---------+ +-----------------+ | +---------------------------------------------+ ``` @@ -21,9 +22,9 @@ Claudio is an adapter for Claude Code CLI and Telegram. It makes a tunnel betwee ## Overview -Claudio starts a local HTTP server that listens on port 8421, and creates a tunnel using [cloudflared](https://github.com/cloudflare/cloudflared). When a user sends a message from Telegram, it's sent to `/telegram/webhook` and forwarded to the Claude Code CLI. +Claudio starts a local HTTP server that listens on port 8421, and creates a tunnel using [cloudflared](https://github.com/cloudflare/cloudflared). When a user sends a message from Telegram or WhatsApp, it's sent to `/telegram/webhook` or `/whatsapp/webhook` and forwarded to the Claude Code CLI. -Claudio supports **multiple bots**: each bot has its own Telegram token, chat ID, webhook secret, conversation history, and configuration. 
Incoming webhooks are matched to bots via HMAC secret-token header matching, and each bot maintains independent conversation context. +Claudio supports **multiple bots** and **dual-platform bots**: each bot can have Telegram credentials, WhatsApp credentials, or both. Bots maintain their own conversation history and configuration. Incoming webhooks are matched to bots via HMAC secret-token header matching (Telegram) or signature verification (WhatsApp), and each bot maintains independent conversation context. Dual-platform bots share the same conversation history across both platforms. User messages are passed as one-shot prompts, along with conversation context to maintain continuity. All messages are stored in a per-bot SQLite database, with the last 100 used as conversation context (configurable via `MAX_HISTORY_LINES`). @@ -81,24 +82,55 @@ This will: > **Multi-bot support:** To configure additional bots, run `claudio install ` with a unique bot identifier (alphanumeric, hyphens, and underscores only). Each bot will have its own Telegram credentials, conversation history, and configuration stored in `~/.claudio/bots//`. -2. Set up Telegram bot credentials +2. Set up messaging platform credentials + +The install wizard will guide you through bot setup and ask which platform(s) you want to configure: +- **Telegram only**: Traditional bot setup via @BotFather +- **WhatsApp Business API only**: Requires Meta Business account and WhatsApp Business API credentials +- **Both platforms**: Configure both in sequence for a dual-platform bot + +#### Telegram Setup -The install wizard will guide you through Telegram bot setup. If you skipped it or need to reconfigure, in Telegram, message `@BotFather` with `/newbot` and follow instructions to create a bot. At the end, you'll be given a secret token. +In Telegram, message `@BotFather` with `/newbot` and follow instructions to create a bot. At the end, you'll be given a secret token. 
-For the default bot (or when reconfiguring an existing bot), run: +For the default bot or when reconfiguring, run: ```bash claudio telegram setup ``` -For additional bots, use `claudio install ` which will interactively configure the new bot's Telegram credentials. - Paste your Telegram token when asked, and press Enter. Then, send a `/start` message to your bot from the Telegram account that you'll use to communicate with Claude Code. -The setup wizard will confirm when it receives the message and finish. Once done, the service restarts automatically, and you can start chatting with Claude Code. +The setup wizard will confirm when it receives the message and finish. > For security, only the `chat_id` captured during setup is authorized to send messages. +#### WhatsApp Setup + +For WhatsApp Business API, you'll need: +- Phone Number ID (from Meta Business Suite) +- Access Token (permanent token from Meta for Developers) +- App Secret (from your Meta app settings) +- Authorized phone number (your WhatsApp number for testing) + +Run: + +```bash +claudio whatsapp setup +``` + +The wizard will validate credentials and provide webhook configuration details for Meta for Developers. + +> For security, only the authorized phone number configured during setup can send messages. + +#### Dual-Platform Bots + +A single bot can serve both platforms, sharing conversation history across Telegram and WhatsApp. During `claudio install `, choose option 3 to configure both, or add a platform later using the platform-specific setup commands. + +See [DUAL_PLATFORM_SETUP.md](DUAL_PLATFORM_SETUP.md) for detailed dual-platform configuration. + +Once setup is done, the service restarts automatically, and you can start chatting with Claude Code from either platform. + > A cron job runs every minute to monitor the webhook endpoint. It verifies the webhook is registered and re-registers it if needed. 
If the server is unreachable, it auto-restarts the service (throttled to once per 3 minutes, max 3 attempts). After exhausting restart attempts without recovery, it sends a Telegram alert and stops retrying until the server responds with HTTP 200. The restart counter auto-clears when the health endpoint returns HTTP 200. You can also reset it manually by deleting `$HOME/.claudio/.last_restart_attempt` and `$HOME/.claudio/.restart_fail_count`. > > The health check also monitors: disk usage (alerts above 90%), log file sizes (rotates files over 10MB), backup freshness (alerts if the last backup is older than 2 hours), and recent log analysis (detects errors, restart loops, and slow API responses — sends Telegram alerts with a configurable cooldown). These thresholds are configurable via environment variables. @@ -371,11 +403,19 @@ claudio restart - `TELEGRAM_BOT_TOKEN` — Telegram Bot API token. Set automatically during `claudio telegram setup`. - `TELEGRAM_CHAT_ID` — Authorized Telegram chat ID. Only messages from this chat are processed. Set automatically during `claudio telegram setup`. -- `WEBHOOK_SECRET` — HMAC secret for validating incoming webhook requests. Auto-generated during bot setup. +- `WEBHOOK_SECRET` — HMAC secret for validating incoming Telegram webhook requests. Auto-generated during bot setup. + +**WhatsApp** + +- `WHATSAPP_PHONE_NUMBER_ID` — WhatsApp Phone Number ID from Meta Business Suite. Set automatically during `claudio whatsapp setup`. +- `WHATSAPP_ACCESS_TOKEN` — WhatsApp Access Token (permanent token from Meta for Developers). Set automatically during `claudio whatsapp setup`. +- `WHATSAPP_APP_SECRET` — App Secret from Meta app settings, used for webhook signature verification. Set automatically during `claudio whatsapp setup`. +- `WHATSAPP_VERIFY_TOKEN` — Verify token for webhook registration challenge. Auto-generated during `claudio whatsapp setup`. +- `WHATSAPP_PHONE_NUMBER` — Authorized WhatsApp phone number. 
Only messages from this number are processed. Set automatically during `claudio whatsapp setup`. **Claude** -- `MODEL` — Claude model to use for this bot. Accepts `haiku`, `sonnet`, or `opus`. Default: `haiku`. Can also be changed at runtime via Telegram commands `/haiku`, `/sonnet`, `/opus`. +- `MODEL` — Claude model to use for this bot. Accepts `haiku`, `sonnet`, or `opus`. Default: `haiku`. Can also be changed at runtime via commands `/haiku`, `/sonnet`, `/opus` (works on both platforms). - `MAX_HISTORY_LINES` — Number of recent messages used as conversation context for this bot. Default: `100`. --- @@ -428,6 +468,7 @@ bats tests/db.bats - [x] Tool usage capture in conversation history (PostToolUse hook) - [x] Health check log analysis (error detection, restart loops, API slowness) - [x] Claude code review for Pull Requests (GitHub Actions) +- [x] WhatsApp Business API integration with dual-platform support (single bot serving both Telegram and WhatsApp) **Future** diff --git a/WHATSAPP_FEATURES.md b/WHATSAPP_FEATURES.md new file mode 100644 index 0000000..ff4d3b7 --- /dev/null +++ b/WHATSAPP_FEATURES.md @@ -0,0 +1,188 @@ +# WhatsApp Business API Integration - Feature Comparison + +## Complete Feature Parity with Telegram + +| Feature | Telegram | WhatsApp | Notes | +|---------|----------|----------|-------| +| **Text Messages** | ✅ | ✅ | Full support with 4096 char chunking | +| **Single Images** | ✅ | ✅ | Downloads and passes to Claude | +| **Multiple Images** | ✅ (media groups) | ✅ (separate messages) | WhatsApp doesn't batch images like Telegram, but each is processed | +| **Image Captions** | ✅ | ✅ | Full support | +| **Documents** | ✅ | ✅ | Full support with mime type detection | +| **Voice Messages** | ✅ | ✅ | Transcription via ElevenLabs STT | +| **Audio Messages** | ✅ | ✅ | Same as voice messages | +| **Voice Responses** | ✅ | ✅ | TTS via ElevenLabs when user sends audio | +| **Reply Context** | ✅ (fetches original) | ✅ (notes it's a reply) | WhatsApp 
API doesn't provide original text | +| **Commands** | ✅ | ✅ | `/opus`, `/sonnet`, `/haiku`, `/start` | +| **Model Switching** | ✅ | ✅ | Persisted per-bot | +| **Typing Indicators** | ✅ | ⚠️ | WhatsApp: sends "..." text message (no native typing API) | +| **Recording Indicator** | ✅ | ⚠️ | WhatsApp: sends "..." text message (same as typing) | +| **Read Receipts** | ✅ (👀 reaction) | ✅ (mark as read) | Different APIs, same purpose | +| **Conversation History** | ✅ | ✅ | SQLite per-bot | +| **Memory System** | ✅ | ✅ | Full ACT-R cognitive memory | +| **Multi-Bot Support** | ✅ | ✅ | Unlimited bots per platform | +| **Per-Bot Config** | ✅ | ✅ | Separate `bot.env` files | +| **Per-Bot CLAUDE.md** | ✅ | ✅ | Custom instructions per bot | +| **Tool Summaries** | ✅ | ✅ | Appended to history | +| **Notifier Messages** | ✅ | ✅ | MCP tool notifications | +| **Text Chunking** | ✅ | ✅ | Long responses split automatically | +| **Security** | ✅ | ✅ | Secret token / HMAC-SHA256 signature | + +## WhatsApp-Specific Implementation Details + +### Authentication +- **Webhook Verification**: GET request with `hub.verify_token` challenge +- **Message Verification**: HMAC-SHA256 signature in `X-Hub-Signature-256` header +- **Per-Bot Secrets**: Each bot has unique verify token and app secret + +### API Endpoints +- **Messages**: `https://graph.facebook.com/v21.0/{phone_number_id}/messages` +- **Media**: `https://graph.facebook.com/v21.0/{media_id}` (two-step download) +- **Upload**: `https://graph.facebook.com/v21.0/{phone_number_id}/media` + +### Message Format +WhatsApp uses a different JSON structure: +```json +{ + "entry": [{ + "changes": [{ + "value": { + "messages": [{ + "from": "1234567890", + "id": "wamid.xxx", + "type": "text", + "text": { "body": "Hello" } + }] + } + }] + }] +} +``` + +### Media Handling +- Images: Direct download via media API +- Documents: Same as images +- Audio: OGG, MP3 support with magic byte validation +- Upload: Required for sending audio (TTS responses) + 
+### Limitations vs Telegram +1. **No Media Groups**: WhatsApp Business API doesn't support receiving multiple images in a single webhook like Telegram's media groups. Each image arrives as a separate webhook and is processed individually. +2. **Reply Context**: Can detect replies (via `context.id` field) but WhatsApp API doesn't provide the original message text, only the message ID. Implementation adds `[Replying to a previous message]` prefix. +3. **No Built-in Markdown**: WhatsApp uses different formatting (bold: `*text*`, italic: `_text_`) +4. **16 MB Limit**: Smaller than Telegram's 20 MB +5. **No Native Typing Indicator**: WhatsApp Business API doesn't expose a typing indicator endpoint like Telegram. Implementation sends "..." as a text message as a workaround. + +## Setup Process + +### Interactive Wizard +When running `claudio install `, the wizard now asks: +``` +Which platform do you want to use? + 1) Telegram + 2) WhatsApp Business API + +Enter choice [1-2]: +``` + +### WhatsApp Setup Requirements +1. **Meta Business Account** with WhatsApp Business API access +2. **Phone Number ID** from Meta Business Suite +3. **Access Token** (permanent token, not temporary) +4. **App Secret** from Meta for Developers app settings +5. 
**Authorized Phone Number** (your personal WhatsApp number for testing) + +### Webhook Configuration +After running setup, configure in Meta for Developers: +``` +Callback URL: https:///whatsapp/webhook + (e.g., https://claudio.example.com/whatsapp/webhook for named tunnels) +Verify Token: [provided by setup wizard] +Subscribe to: messages +``` + +## Architecture + +### Multi-Platform Bot Loading +The server now loads both Telegram and WhatsApp bots on startup: + +```python +bots = {} # Telegram bots by bot_id +bots_by_secret = [] # Telegram dispatch by secret token +whatsapp_bots = {} # WhatsApp bots by bot_id +whatsapp_bots_by_verify = [] # WhatsApp dispatch by verify token +``` + +### Webhook Routing +``` +POST /telegram/webhook → match by X-Telegram-Bot-Api-Secret-Token +GET /whatsapp/webhook → verify token challenge +POST /whatsapp/webhook → match by X-Hub-Signature-256 HMAC +``` + +### Queue Isolation +- Telegram: `bot_id:chat_id` +- WhatsApp: `bot_id:phone_number` + +Each queue ensures serial processing per user per bot. + +## Testing + +### Basic Flow +1. **Setup**: `./claudio install mybot` → Choose WhatsApp +2. **Configure**: Enter credentials, get verify token +3. **Register**: Add webhook in Meta for Developers +4. **Test**: Send "Hello" from authorized phone number +5. 
**Verify**: Check response and conversation history + +### Feature Testing +- ✅ Text messages with replies +- ✅ Send image with caption +- ✅ Send document/PDF +- ✅ Send voice message (gets transcribed and responds with voice) +- ✅ Commands: `/sonnet`, `/haiku`, `/opus` +- ✅ Long responses (>4096 chars) split into chunks +- ✅ Multiple messages queued properly +- ✅ Model switching persists + +## Migration from Telegram + +Claudio now supports running both platforms simultaneously: + +```bash +# Keep existing Telegram bot +./claudio status # Shows "telegram-bot: active" + +# Add WhatsApp bot +./claudio install whatsapp-bot +# Choose option 2 (WhatsApp) + +# Both run in same service +./claudio status +# Shows: +# telegram-bot: active (Telegram) +# whatsapp-bot: active (WhatsApp) +``` + +## Troubleshooting + +### Webhook Not Receiving Messages +1. Check webhook registration in Meta for Developers +2. Verify verify token matches bot.env +3. Check cloudflared tunnel is running: `ps aux | grep cloudflared` +4. Test verification endpoint: `curl https:///whatsapp/webhook?hub.mode=subscribe&hub.verify_token=YOUR_TOKEN&hub.challenge=test` + (Replace `` with your actual tunnel URL, e.g., `claudio.example.com`) + +### Signature Verification Failed +1. Ensure `WHATSAPP_APP_SECRET` matches your Meta app +2. Check webhook is configured with correct app +3. Verify no proxy/CDN is modifying request body + +### Media Download Fails +1. Verify `WHATSAPP_ACCESS_TOKEN` has proper permissions +2. Check token hasn't expired (use permanent token) +3. Ensure media under 16 MB limit + +### Voice Response Not Working +1. Verify `ELEVENLABS_API_KEY` is configured +2. Check audio upload succeeded: `grep "Failed to upload audio" ~/.claudio/claudio.log` +3. 
Test TTS separately: `./claudio` → source lib/tts.sh → `tts_convert "test" /tmp/test.mp3` diff --git a/claudio b/claudio index 59cec88..e1c6080 100755 --- a/claudio +++ b/claudio @@ -23,6 +23,8 @@ source "$LIB_DIR/history.sh" source "$LIB_DIR/claude.sh" # shellcheck source=lib/telegram.sh source "$LIB_DIR/telegram.sh" +# shellcheck source=lib/whatsapp.sh +source "$LIB_DIR/whatsapp.sh" # shellcheck source=lib/server.sh source "$LIB_DIR/server.sh" # shellcheck source=lib/service.sh @@ -51,6 +53,7 @@ Commands: update Update to the latest release restart Restart the service telegram setup Set up Telegram bot and webhook + whatsapp setup Set up WhatsApp Business API webhook log [-f] [-n N] Show logs (-f to follow, -n for line count) backup Run backup (--hours N, --days N for retention) backup status Show backup status @@ -98,7 +101,15 @@ case "${1:-}" in history_init memory_init body=$(cat) - telegram_handle_webhook "$body" + # Call appropriate handler based on platform passed from server.py + platform="${2:-}" + if [ "$platform" = "whatsapp" ]; then + whatsapp_handle_webhook "$body" + elif [ "$platform" = "telegram" ]; then + telegram_handle_webhook "$body" + else + log_error "webhook" "Unknown platform '$platform' for bot $CLAUDIO_BOT_ID" + fi ;; status) service_status @@ -206,6 +217,17 @@ case "${1:-}" in ;; esac ;; + whatsapp) + case "${2:-}" in + setup) + whatsapp_setup + ;; + *) + echo "Usage: claudio whatsapp setup" + exit 1 + ;; + esac + ;; *) usage ;; diff --git a/lib/config.sh b/lib/config.sh index 10b5a0c..16b2f62 100644 --- a/lib/config.sh +++ b/lib/config.sh @@ -9,6 +9,11 @@ PORT="${PORT:-8421}" MODEL="${MODEL:-haiku}" TELEGRAM_BOT_TOKEN="${TELEGRAM_BOT_TOKEN:-}" TELEGRAM_CHAT_ID="${TELEGRAM_CHAT_ID:-}" +WHATSAPP_PHONE_NUMBER_ID="${WHATSAPP_PHONE_NUMBER_ID:-}" +WHATSAPP_ACCESS_TOKEN="${WHATSAPP_ACCESS_TOKEN:-}" +WHATSAPP_APP_SECRET="${WHATSAPP_APP_SECRET:-}" +WHATSAPP_VERIFY_TOKEN="${WHATSAPP_VERIFY_TOKEN:-}" 
+WHATSAPP_PHONE_NUMBER="${WHATSAPP_PHONE_NUMBER:-}" WEBHOOK_URL="${WEBHOOK_URL:-}" TUNNEL_NAME="${TUNNEL_NAME:-}" TUNNEL_HOSTNAME="${TUNNEL_HOSTNAME:-}" @@ -161,9 +166,21 @@ claudio_save_bot_env() { ( umask 077 { - printf 'TELEGRAM_BOT_TOKEN="%s"\n' "$(_env_quote "$TELEGRAM_BOT_TOKEN")" - printf 'TELEGRAM_CHAT_ID="%s"\n' "$(_env_quote "$TELEGRAM_CHAT_ID")" - printf 'WEBHOOK_SECRET="%s"\n' "$(_env_quote "$WEBHOOK_SECRET")" + # Telegram bot fields + if [ -n "$TELEGRAM_BOT_TOKEN" ]; then + printf 'TELEGRAM_BOT_TOKEN="%s"\n' "$(_env_quote "$TELEGRAM_BOT_TOKEN")" + printf 'TELEGRAM_CHAT_ID="%s"\n' "$(_env_quote "$TELEGRAM_CHAT_ID")" + printf 'WEBHOOK_SECRET="%s"\n' "$(_env_quote "$WEBHOOK_SECRET")" + fi + # WhatsApp bot fields + if [ -n "$WHATSAPP_PHONE_NUMBER_ID" ]; then + printf 'WHATSAPP_PHONE_NUMBER_ID="%s"\n' "$(_env_quote "$WHATSAPP_PHONE_NUMBER_ID")" + printf 'WHATSAPP_ACCESS_TOKEN="%s"\n' "$(_env_quote "$WHATSAPP_ACCESS_TOKEN")" + printf 'WHATSAPP_APP_SECRET="%s"\n' "$(_env_quote "$WHATSAPP_APP_SECRET")" + printf 'WHATSAPP_VERIFY_TOKEN="%s"\n' "$(_env_quote "$WHATSAPP_VERIFY_TOKEN")" + printf 'WHATSAPP_PHONE_NUMBER="%s"\n' "$(_env_quote "$WHATSAPP_PHONE_NUMBER")" + fi + # Common fields printf 'MODEL="%s"\n' "$(_env_quote "$MODEL")" printf 'MAX_HISTORY_LINES="%s"\n' "$(_env_quote "$MAX_HISTORY_LINES")" } > "$CLAUDIO_BOT_DIR/bot.env" diff --git a/lib/server.py b/lib/server.py index ddd475d..d813a6b 100644 --- a/lib/server.py +++ b/lib/server.py @@ -1,6 +1,7 @@ #!/usr/bin/env python3 import base64 +import hashlib import hmac import json import os @@ -40,13 +41,17 @@ # Multi-bot registry: loaded from ~/.claudio/bots/*/bot.env # bots: dict of bot_id -> {"token": str, "chat_id": str, "secret": str, ...} -# bots_by_secret: list of (secret, bot_id) for dispatch +# bots_by_secret: list of (secret, bot_id) for Telegram dispatch +# whatsapp_bots: dict of bot_id -> {"phone_number_id": str, "app_secret": str, ...} +# whatsapp_bots_by_verify: list of (verify_token, bot_id) 
for WhatsApp verification bots = {} bots_by_secret = [] +whatsapp_bots = {} +whatsapp_bots_by_verify = [] bots_lock = threading.Lock() # Per-chat message queues for serial processing -chat_queues = {} # queue_key -> deque of (webhook_body, bot_id) +chat_queues = {} # queue_key -> deque of (webhook_body, bot_id, platform) chat_active = {} # queue_key -> bool, True if a processor thread is running active_threads = [] # Non-daemon processor threads to wait on during shutdown queue_lock = threading.Lock() @@ -105,16 +110,20 @@ def is_valid_bot_id(bot_id): def load_bots(): - """Scan ~/.claudio/bots/*/bot.env and build bot registry.""" - global bots, bots_by_secret + """Scan ~/.claudio/bots/*/bot.env and build bot registry (Telegram and WhatsApp).""" + global bots, bots_by_secret, whatsapp_bots, whatsapp_bots_by_verify bots_dir = os.path.join(CLAUDIO_PATH, "bots") new_bots = {} new_by_secret = [] + new_whatsapp_bots = {} + new_whatsapp_by_verify = [] if not os.path.isdir(bots_dir): with bots_lock: bots = new_bots bots_by_secret = new_by_secret + whatsapp_bots = new_whatsapp_bots + whatsapp_bots_by_verify = new_whatsapp_by_verify return bots_dir_real = os.path.realpath(bots_dir) @@ -140,27 +149,66 @@ def load_bots(): if not os.path.isfile(bot_env): continue cfg = parse_env_file(bot_env) + + bot_dir_path = os.path.join(bots_dir, entry) + model = cfg.get("MODEL", "haiku") + max_history = cfg.get("MAX_HISTORY_LINES", "100") + loaded_any = False + + # Try loading Telegram credentials token = cfg.get("TELEGRAM_BOT_TOKEN", "") chat_id = cfg.get("TELEGRAM_CHAT_ID", "") secret = cfg.get("WEBHOOK_SECRET", "") - if not token or not secret: - sys.stderr.write(f"[bots] Skipping bot '{entry}': missing token or secret\n") - continue - new_bots[entry] = { - "token": token, - "chat_id": chat_id, - "secret": secret, - "model": cfg.get("MODEL", "haiku"), - "max_history_lines": cfg.get("MAX_HISTORY_LINES", "100"), - "bot_dir": os.path.join(bots_dir, entry), - } - 
new_by_secret.append((secret, entry)) + if token and secret: + new_bots[entry] = { + "token": token, + "chat_id": chat_id, + "secret": secret, + "model": model, + "max_history_lines": max_history, + "bot_dir": bot_dir_path, + "type": "telegram", + } + new_by_secret.append((secret, entry)) + loaded_any = True + + # Try loading WhatsApp credentials (can coexist with Telegram) + phone_id = cfg.get("WHATSAPP_PHONE_NUMBER_ID", "") + access_token = cfg.get("WHATSAPP_ACCESS_TOKEN", "") + app_secret = cfg.get("WHATSAPP_APP_SECRET", "") + verify_token = cfg.get("WHATSAPP_VERIFY_TOKEN", "") + phone_number = cfg.get("WHATSAPP_PHONE_NUMBER", "") + + if phone_id and access_token and app_secret and verify_token: + new_whatsapp_bots[entry] = { + "phone_number_id": phone_id, + "access_token": access_token, + "app_secret": app_secret, + "verify_token": verify_token, + "phone_number": phone_number, + "model": model, + "max_history_lines": max_history, + "bot_dir": bot_dir_path, + "type": "whatsapp", + } + new_whatsapp_by_verify.append((verify_token, entry)) + loaded_any = True + + if not loaded_any: + sys.stderr.write(f"[bots] Skipping bot '{entry}': no valid credentials found\n") with bots_lock: bots = new_bots bots_by_secret = new_by_secret - - sys.stderr.write(f"[bots] Loaded {len(new_bots)} bot(s): {', '.join(new_bots.keys())}\n") + whatsapp_bots = new_whatsapp_bots + whatsapp_bots_by_verify = new_whatsapp_by_verify + + # Count unique bot_ids (a bot can have both platforms) + all_bot_ids = set(new_bots.keys()) | set(new_whatsapp_bots.keys()) + sys.stderr.write( + f"[bots] Loaded {len(all_bot_ids)} bot(s): " + f"{len(new_bots)} Telegram endpoint(s), {len(new_whatsapp_bots)} WhatsApp endpoint(s)\n" + ) def match_bot_by_secret(token_header): @@ -174,6 +222,37 @@ def match_bot_by_secret(token_header): return None, None +def match_whatsapp_bot_by_verify_token(verify_token): + """Find WhatsApp bot matching the verify token. 
Returns (bot_id, bot_config) or (None, None).""" + if not verify_token: + return None, None + with bots_lock: + for token, bot_id in whatsapp_bots_by_verify: + if hmac.compare_digest(verify_token, token): + return bot_id, whatsapp_bots[bot_id] + return None, None + + +def match_whatsapp_bot_by_signature(body, signature): + """Find WhatsApp bot by verifying HMAC signature. Returns (bot_id, bot_config) or (None, None).""" + if not signature: + return None, None + with bots_lock: + for bot_id, bot_config in whatsapp_bots.items(): + app_secret = bot_config.get("app_secret", "") + if not app_secret: + continue + # Compute HMAC-SHA256 + expected = hmac.new( + app_secret.encode("utf-8"), + body.encode("utf-8"), + hashlib.sha256 + ).hexdigest() + if hmac.compare_digest(signature, expected): + return bot_id, bot_config + return None, None + + def parse_webhook(body): """Extract update_id, chat_id, and media_group_id from webhook body.""" try: @@ -208,7 +287,7 @@ def _process_queue_loop(queue_key): del chat_queues[queue_key] chat_active.pop(queue_key, None) return - body, bot_id = chat_queues[queue_key].popleft() + body, bot_id, platform = chat_queues[queue_key].popleft() proc = None try: @@ -224,7 +303,7 @@ def _process_queue_loop(queue_key): env["CLAUDIO_BOT_ID"] = bot_id proc = subprocess.Popen( - [CLAUDIO_BIN, "_webhook"], + [CLAUDIO_BIN, "_webhook", platform], stdin=subprocess.PIPE, stdout=log_fh, stderr=log_fh, @@ -373,10 +452,10 @@ def enqueue_webhook(body, bot_id, bot_config): timer.start() return - _enqueue_single(body, chat_id, bot_id) + _enqueue_single(body, chat_id, bot_id, "telegram") -def _enqueue_single(body, chat_id, bot_id): +def _enqueue_single(body, chat_id, bot_id, platform): """Enqueue a single (possibly merged) webhook body for processing.""" queue_key = f"{bot_id}:{chat_id}" with queue_lock: @@ -400,7 +479,7 @@ def _enqueue_single(body, chat_id, bot_id): bot_id )) - chat_queues[queue_key].append((body, bot_id)) + chat_queues[queue_key].append((body, 
bot_id, platform)) if not chat_active.get(queue_key): chat_active[queue_key] = True @@ -413,6 +492,81 @@ def _enqueue_single(body, chat_id, bot_id): thread.start() +def enqueue_whatsapp_webhook(body, bot_id, bot_config): + """Add WhatsApp webhook to per-chat queue and start processor if needed.""" + try: + data = json.loads(body) + # Extract phone number from first message + phone_number = "" + for entry in data.get("entry", []): + for change in entry.get("changes", []): + messages = change.get("value", {}).get("messages", []) + if messages: + phone_number = messages[0].get("from", "") + break + if phone_number: + break + + if not phone_number: + return # No message found + + # Validate phone number against authorized number (defense in depth) + # Fail closed: reject if no phone number configured + bot_phone = bot_config.get("phone_number", "") + if not bot_phone: + sys.stderr.write(log_msg( + "whatsapp", + "Bot has no authorized phone number configured, rejecting message", + bot_id + )) + return + + if phone_number != bot_phone: + sys.stderr.write(log_msg( + "whatsapp", + f"Rejected message from unauthorized number: {phone_number}", + bot_id + )) + return + + # Use phone number as chat_id for queue isolation + queue_key = f"{bot_id}:{phone_number}" + + with queue_lock: + if shutting_down: + sys.stderr.write(log_msg("queue", f"Rejecting webhook during shutdown for {queue_key}", bot_id)) + return + + # Initialize queue if needed + if queue_key not in chat_queues: + chat_queues[queue_key] = deque() + + # Prevent unbounded queue growth + queue_size = len(chat_queues[queue_key]) + if queue_size >= MAX_QUEUE_SIZE: + sys.stderr.write(log_msg( + "queue", + f"Queue full for {queue_key} ({queue_size}/{MAX_QUEUE_SIZE}), dropping message", + bot_id + )) + return + + chat_queues[queue_key].append((body, bot_id, "whatsapp")) + + if not chat_active.get(queue_key): + chat_active[queue_key] = True + thread = threading.Thread( + target=process_queue, + args=(queue_key,), + 
daemon=False, + ) + active_threads.append(thread) + thread.start() + + except (json.JSONDecodeError, KeyError, IndexError) as e: + sys.stderr.write(log_msg("whatsapp", f"Error parsing webhook: {e}", bot_id)) + + class ThreadedHTTPServer(ThreadingMixIn, HTTPServer): daemon_threads = True @@ -458,6 +612,8 @@ def do_POST(self): return self._respond(200, {"ok": True}) enqueue_webhook(body, bot_id, bot_config) + elif self.path == "/whatsapp/webhook": + self._handle_whatsapp() elif self.path == "/alexa": self._handle_alexa() else: @@ -578,11 +734,89 @@ def _respond_alexa(self, text, end_session=True, reprompt=None): self.end_headers() self.wfile.write(body) + def _handle_whatsapp_verify(self): + """Handle WhatsApp webhook verification challenge.""" + # Parse query parameters + parsed_url = urllib.parse.urlparse(self.path) + params = urllib.parse.parse_qs(parsed_url.query) + + mode = params.get("hub.mode", [""])[0] + token = params.get("hub.verify_token", [""])[0] + challenge = params.get("hub.challenge", [""])[0] + + sys.stderr.write(f"[whatsapp] Verification request: mode={mode} token={'***' if token else 'empty'}\n") + + # Find bot with matching verify token + bot_id, bot_config = match_whatsapp_bot_by_verify_token(token) + + if mode == "subscribe" and bot_id is not None and challenge: + sys.stderr.write(f"[whatsapp] Verification successful for bot {bot_id}\n") + # Respond with the challenge token to complete verification + self.send_response(200) + self.send_header("Content-Type", "text/plain") + self.send_header("Content-Length", str(len(challenge))) + self.end_headers() + self.wfile.write(challenge.encode("utf-8")) + else: + sys.stderr.write("[whatsapp] Verification failed: invalid token or missing parameters\n") + self._respond(403, {"error": "verification failed"}) + + def _handle_whatsapp(self): + """Handle WhatsApp webhook messages with signature verification.""" + if shutting_down: + self._respond(503, {"error": "shutting down"}) + return + + body = 
self._read_body() + if body is None: + return + + # Verify signature - ensure constant-time operations throughout + signature = self.headers.get("X-Hub-Signature-256", "") + # Always perform same operations regardless of prefix validity + if signature.startswith("sha256="): + signature = signature[7:] # Remove "sha256=" prefix + else: + signature = "" # Invalid but continue to timing-consistent path + + # Find bot by verifying signature against all WhatsApp bots + bot_id, bot_config = match_whatsapp_bot_by_signature(body, signature) + if bot_id is None: + # Same error for all failure modes to prevent timing attacks + self._respond(401, {"error": "authentication failed"}) + return + + # Check if this is a message or status update + try: + data = json.loads(body) + # WhatsApp sends status updates and messages in the same webhook + # Only process if there are actual messages + has_messages = False + if data.get("entry", []): + for entry in data["entry"]: + for change in entry.get("changes", []): + if change.get("value", {}).get("messages"): + has_messages = True + break + + if not has_messages: + # Status update or other notification, acknowledge but don't process + self._respond(200, {"status": "ok"}) + return + except json.JSONDecodeError: + self._respond(400, {"error": "invalid json"}) + return + + self._respond(200, {"status": "ok"}) + enqueue_whatsapp_webhook(body, bot_id, bot_config) + def do_GET(self): if self.path == "/health": health = check_health() code = 200 if health["status"] == "healthy" else 503 self._respond(code, health) + elif self.path.startswith("/whatsapp/webhook"): + self._handle_whatsapp_verify() elif self.path == "/reload": # Require MANAGEMENT_SECRET to access this endpoint if not MANAGEMENT_SECRET: diff --git a/lib/service.sh b/lib/service.sh index 02d8eb7..0594ca4 100644 --- a/lib/service.sh +++ b/lib/service.sh @@ -197,20 +197,98 @@ bot_setup() { local bot_id="$1" local bot_dir="$CLAUDIO_PATH/bots/$bot_id" - # Check if bot already exists + 
# Check what's already configured + local has_telegram=false + local has_whatsapp=false if [ -f "$bot_dir/bot.env" ]; then - echo "" - echo "Bot '$bot_id' already exists at $bot_dir" - read -rp "Re-run setup? [y/N] " confirm - if [[ ! "$confirm" =~ ^[Yy] ]]; then - echo "Skipping bot setup." - return 0 - fi + # Unset per-bot credentials to prevent stale values from leaking + unset TELEGRAM_BOT_TOKEN TELEGRAM_CHAT_ID WEBHOOK_SECRET \ + WHATSAPP_PHONE_NUMBER_ID WHATSAPP_ACCESS_TOKEN WHATSAPP_APP_SECRET \ + WHATSAPP_VERIFY_TOKEN WHATSAPP_PHONE_NUMBER + # shellcheck source=/dev/null + source "$bot_dir/bot.env" 2>/dev/null || true + [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && has_telegram=true + [ -n "${WHATSAPP_PHONE_NUMBER_ID:-}" ] && has_whatsapp=true fi echo "" echo "=== Setting up bot: $bot_id ===" - telegram_setup "$bot_id" + + # Show what's already configured + if [ "$has_telegram" = true ] || [ "$has_whatsapp" = true ]; then + echo "" + echo "Current configuration:" + [ "$has_telegram" = true ] && echo " ✓ Telegram configured" + [ "$has_whatsapp" = true ] && echo " ✓ WhatsApp configured" + fi + + # Offer platform choices + echo "" + echo "Which platform(s) do you want to configure?" + echo " 1) Telegram only" + echo " 2) WhatsApp Business API only" + echo " 3) Both Telegram and WhatsApp" + [ "$has_telegram" = true ] && echo " 4) Re-configure Telegram" + [ "$has_whatsapp" = true ] && echo " 5) Re-configure WhatsApp" + echo "" + + local max_choice=3 + [ "$has_telegram" = true ] && max_choice=4 + [ "$has_whatsapp" = true ] && max_choice=5 + + read -rp "Enter choice [1-${max_choice}]: " platform_choice + + case "$platform_choice" in + 1) + telegram_setup "$bot_id" + # Offer to set up WhatsApp too + if [ "$has_whatsapp" != true ]; then + echo "" + read -rp "Would you like to also configure WhatsApp for this bot? 
[y/N] " add_whatsapp + if [[ "$add_whatsapp" =~ ^[Yy] ]]; then + echo "" + whatsapp_setup "$bot_id" + fi + fi + ;; + 2) + whatsapp_setup "$bot_id" + # Offer to set up Telegram too + if [ "$has_telegram" != true ]; then + echo "" + read -rp "Would you like to also configure Telegram for this bot? [y/N] " add_telegram + if [[ "$add_telegram" =~ ^[Yy] ]]; then + echo "" + telegram_setup "$bot_id" + fi + fi + ;; + 3) + telegram_setup "$bot_id" + echo "" + whatsapp_setup "$bot_id" + ;; + 4) + if [ "$has_telegram" = true ]; then + telegram_setup "$bot_id" + else + echo "Invalid choice." + exit 1 + fi + ;; + 5) + if [ "$has_whatsapp" = true ]; then + whatsapp_setup "$bot_id" + else + echo "Invalid choice." + exit 1 + fi + ;; + *) + echo "Invalid choice. Please enter a valid option." + exit 1 + ;; + esac } cloudflared_setup() { diff --git a/lib/telegram.sh b/lib/telegram.sh index cc35ca2..1544243 100644 --- a/lib/telegram.sh +++ b/lib/telegram.sh @@ -797,16 +797,31 @@ telegram_setup() { mkdir -p "$bot_dir" chmod 700 "$bot_dir" - # Generate per-bot webhook secret - export WEBHOOK_SECRET - WEBHOOK_SECRET=$(openssl rand -hex 32) || { - print_error "Failed to generate WEBHOOK_SECRET" - exit 1 - } - + # Load existing config to preserve other platform's credentials export CLAUDIO_BOT_ID="$bot_id" export CLAUDIO_BOT_DIR="$bot_dir" export CLAUDIO_DB_FILE="$bot_dir/history.db" + # Unset OTHER platform's credentials to prevent stale values from leaking + # (Don't unset Telegram vars - they were just set above!) 
+ unset WEBHOOK_SECRET WHATSAPP_PHONE_NUMBER_ID WHATSAPP_ACCESS_TOKEN \ + WHATSAPP_APP_SECRET WHATSAPP_VERIFY_TOKEN WHATSAPP_PHONE_NUMBER + if [ -f "$bot_dir/bot.env" ]; then + # shellcheck source=/dev/null + source "$bot_dir/bot.env" 2>/dev/null || true + fi + + # Re-apply new Telegram credentials (source may have overwritten them during re-configuration) + export TELEGRAM_BOT_TOKEN="$token" + export TELEGRAM_CHAT_ID="$TELEGRAM_CHAT_ID" + + # Generate per-bot webhook secret (only if not already set) + if [ -z "${WEBHOOK_SECRET:-}" ]; then + export WEBHOOK_SECRET + WEBHOOK_SECRET=$(openssl rand -hex 32) || { + print_error "Failed to generate WEBHOOK_SECRET" + exit 1 + } + fi claudio_save_bot_env diff --git a/lib/whatsapp.sh b/lib/whatsapp.sh new file mode 100644 index 0000000..cd09047 --- /dev/null +++ b/lib/whatsapp.sh @@ -0,0 +1,800 @@ +#!/bin/bash + +# shellcheck source=lib/log.sh +source "$(dirname "${BASH_SOURCE[0]}")/log.sh" + +WHATSAPP_API="https://graph.facebook.com/v21.0" + +# Helper: Create secure temporary config file for curl +# Returns path via stdout, caller must cleanup +_whatsapp_curl_config() { + local endpoint="$1" + local config_file + config_file=$(mktemp "${CLAUDIO_PATH}/tmp/curl-config-XXXXXX") || return 1 + chmod 600 "$config_file" + + printf 'url = "%s/%s/%s"\n' "$WHATSAPP_API" "$WHATSAPP_PHONE_NUMBER_ID" "$endpoint" > "$config_file" + printf 'header = "Authorization: Bearer %s"\n' "$WHATSAPP_ACCESS_TOKEN" >> "$config_file" + + echo "$config_file" +} + +# Strip XML-like tags that could be used for prompt injection +_sanitize_for_prompt() { + sed -E 's/<\/?[a-zA-Z_][a-zA-Z0-9_-]*[^>]*>/[quoted text]/g' +} + +# Collapse text to a single line, trimmed and truncated to 200 chars +_summarize() { + local summary + summary=$(printf '%s' "$1" | _sanitize_for_prompt | tr '\n' ' ' | sed -E 's/^[[:space:]]*//;s/[[:space:]]+/ /g') + [ ${#summary} -gt 200 ] && summary="${summary:0:200}..." 
+ printf '%s' "$summary" +} + +whatsapp_api() { + local endpoint="$1" + shift + + local max_retries=4 + local attempt=0 + local response http_code body + local config_file + + # Create secure config file (prevents credential exposure in process list) + config_file=$(_whatsapp_curl_config "$endpoint") || { + log_error "whatsapp" "Failed to create curl config" + return 1 + } + trap 'rm -f "$config_file"' RETURN + + while [ $attempt -le $max_retries ]; do + response=$(curl -s -w "\n%{http_code}" --config "$config_file" "$@") + http_code=$(echo "$response" | tail -n1) + body=$(echo "$response" | sed '$d') + + # Success or client error (4xx except 429) - don't retry + if [[ "$http_code" =~ ^2 ]] || { [[ "$http_code" =~ ^4 ]] && [ "$http_code" != "429" ]; }; then + echo "$body" + return 0 + fi + + # Retryable: 429 (rate limit) or 5xx (server error) + if [ $attempt -lt $max_retries ]; then + local delay=$(( 2 ** attempt )) # Exponential backoff + log "whatsapp" "API error (HTTP $http_code), retrying in ${delay}s..." + sleep "$delay" + fi + + ((attempt++)) || true + done + + # All retries exhausted + log_error "whatsapp" "API failed after $((max_retries + 1)) attempts (HTTP $http_code)" + echo "$body" + return 1 +} + +whatsapp_send_message() { + local to="$1" + local text="$2" + local reply_to_message_id="${3:-}" + + # WhatsApp has a 4096 char limit per message + local max_len=4096 + local is_first=true + while [ ${#text} -gt 0 ]; do + local chunk="${text:0:$max_len}" + text="${text:$max_len}" + + # Build JSON payload with jq for safe variable handling + local payload + if [ "$is_first" = true ] && [ -n "$reply_to_message_id" ]; then + payload=$(jq -n \ + --arg to "$to" \ + --arg text "$chunk" \ + --arg mid "$reply_to_message_id" \ + '{ + messaging_product: "whatsapp", + recipient_type: "individual", + to: $to, + type: "text", + text: { preview_url: false, body: $text } + } | . 
+ {context: {message_id: $mid}}') + else + payload=$(jq -n \ + --arg to "$to" \ + --arg text "$chunk" \ + '{ + messaging_product: "whatsapp", + recipient_type: "individual", + to: $to, + type: "text", + text: { preview_url: false, body: $text } + }') + fi + is_first=false + + local result + result=$(whatsapp_api "messages" \ + -H "Content-Type: application/json" \ + -d "$payload") + + local success + success=$(echo "$result" | jq -r '.messages[0].id // empty' 2>/dev/null) + if [ -z "$success" ]; then + log_error "whatsapp" "Failed to send message: $result" + fi + done +} + +whatsapp_send_audio() { + local to="$1" + local audio_file="$2" + local reply_to_message_id="${3:-}" + + # Upload audio file and send + local mime_type="audio/mpeg" # MP3 + local config_file result + + # Create secure config for media upload + config_file=$(mktemp "${CLAUDIO_PATH}/tmp/curl-config-XXXXXX") || { + log_error "whatsapp" "Failed to create curl config" + return 1 + } + chmod 600 "$config_file" + trap 'rm -f "$config_file"' RETURN + + printf 'url = "%s/%s/media"\n' "$WHATSAPP_API" "$WHATSAPP_PHONE_NUMBER_ID" > "$config_file" + printf 'header = "Authorization: Bearer %s"\n' "$WHATSAPP_ACCESS_TOKEN" >> "$config_file" + + result=$(curl -s --config "$config_file" \ + -H "Content-Type: multipart/form-data" \ + -F "messaging_product=whatsapp" \ + -F "file=@${audio_file};type=${mime_type}") + + local media_id + media_id=$(echo "$result" | jq -r '.id // empty') + if [ -z "$media_id" ]; then + log_error "whatsapp" "Failed to upload audio: $result" + return 1 + fi + + # Send audio message with media_id - use jq for safe JSON construction + local payload + payload=$(jq -n \ + --arg to "$to" \ + --arg mid "$media_id" \ + --arg rmid "$reply_to_message_id" \ + '{ + messaging_product: "whatsapp", + recipient_type: "individual", + to: $to, + type: "audio", + audio: { id: $mid } + } | if $rmid != "" then . + {context: {message_id: $rmid}} else . 
end') + + result=$(whatsapp_api "messages" \ + -H "Content-Type: application/json" \ + -d "$payload") + + local success + success=$(echo "$result" | jq -r '.messages[0].id // empty') + if [ -z "$success" ]; then + log_error "whatsapp" "Failed to send audio message: $result" + return 1 + fi +} + +# whatsapp_send_typing removed - WhatsApp Cloud API typing indicator requires +# message_id and auto-dismisses after 25s. Proper implementation deferred to follow-up. +# See: https://github.com/edgarjs/claudio/issues/XXX + +whatsapp_mark_read() { + local message_id="$1" + local payload + payload=$(jq -n --arg mid "$message_id" '{ + messaging_product: "whatsapp", + status: "read", + message_id: $mid + }') + # Fire-and-forget: don't retry read receipts + curl -s --connect-timeout 5 --max-time 10 \ + --config <(printf 'url = "%s/%s/messages"\n' "$WHATSAPP_API" "$WHATSAPP_PHONE_NUMBER_ID"; printf 'header = "Authorization: Bearer %s"\n' "$WHATSAPP_ACCESS_TOKEN") \ + -H "Content-Type: application/json" \ + -d "$payload" \ + > /dev/null 2>&1 || true +} + +whatsapp_parse_webhook() { + local body="$1" + # Extract message data from WhatsApp webhook format + # WhatsApp sends: entry[0].changes[0].value.messages[0] + local parsed + parsed=$(printf '%s' "$body" | jq -r '[ + .entry[0].changes[0].value.messages[0].from // "", + .entry[0].changes[0].value.messages[0].id // "", + .entry[0].changes[0].value.messages[0].text.body // "", + .entry[0].changes[0].value.messages[0].type // "", + (.entry[0].changes[0].value.messages[0].image.id // ""), + (.entry[0].changes[0].value.messages[0].image.caption // ""), + (.entry[0].changes[0].value.messages[0].document.id // ""), + (.entry[0].changes[0].value.messages[0].document.filename // ""), + (.entry[0].changes[0].value.messages[0].document.mime_type // ""), + (.entry[0].changes[0].value.messages[0].audio.id // ""), + (.entry[0].changes[0].value.messages[0].voice.id // ""), + (.entry[0].changes[0].value.messages[0].context.id // "") + ] | 
join("\u001f")') + + # shellcheck disable=SC2034 # Variables available for use + IFS=$'\x1f' read -r -d '' WEBHOOK_FROM_NUMBER WEBHOOK_MESSAGE_ID WEBHOOK_TEXT \ + WEBHOOK_MESSAGE_TYPE WEBHOOK_IMAGE_ID WEBHOOK_IMAGE_CAPTION \ + WEBHOOK_DOC_ID WEBHOOK_DOC_FILENAME WEBHOOK_DOC_MIME \ + WEBHOOK_AUDIO_ID WEBHOOK_VOICE_ID WEBHOOK_CONTEXT_ID <<< "$parsed" || true +} + +whatsapp_download_media() { + local media_id="$1" + local output_path="$2" + local label="${3:-media}" + local config_file + + # Step 1: Get media URL from WhatsApp API + config_file=$(mktemp "${CLAUDIO_PATH}/tmp/curl-config-XXXXXX") || { + log_error "whatsapp" "Failed to create curl config" + return 1 + } + chmod 600 "$config_file" + trap 'rm -f "$config_file"' RETURN + + printf 'url = "%s/%s"\n' "$WHATSAPP_API" "$media_id" > "$config_file" + printf 'header = "Authorization: Bearer %s"\n' "$WHATSAPP_ACCESS_TOKEN" >> "$config_file" + + local url_response + url_response=$(curl -s --connect-timeout 10 --max-time 30 --config "$config_file") + + local media_url + media_url=$(printf '%s' "$url_response" | jq -r '.url // empty') + + if [ -z "$media_url" ]; then + log_error "whatsapp" "Failed to get ${label} URL for media_id: $media_id" + return 1 + fi + + # Only accept HTTPS URLs; reject any other scheme + if [[ ! "$media_url" =~ ^https:// ]]; then + log_error "whatsapp" "Invalid ${label} URL scheme" + return 1 + fi + + # Step 2: Download the media file + # Reuse config file for download - use _env_quote to prevent injection + printf 'url = "%s"\n' "$(_env_quote "$media_url")" > "$config_file" + printf 'header = "Authorization: Bearer %s"\n' "$WHATSAPP_ACCESS_TOKEN" >> "$config_file" + + if ! 
curl -sf --connect-timeout 10 --max-time 60 --max-redirs 1 -o "$output_path" --config "$config_file"; then + log_error "whatsapp" "Failed to download ${label}" + return 1 + fi + + # Validate file size (max 16 MB — WhatsApp Cloud API limit) + local max_size=$((16 * 1024 * 1024)) + local file_size + file_size=$(wc -c < "$output_path") + if [ "$file_size" -gt "$max_size" ]; then + log_error "whatsapp" "Downloaded ${label} exceeds size limit: ${file_size} bytes" + rm -f "$output_path" + return 1 + fi + + if [ "$file_size" -eq 0 ]; then + log_error "whatsapp" "Downloaded ${label} is empty" + rm -f "$output_path" + return 1 + fi + + log "whatsapp" "Downloaded ${label} to: $output_path (${file_size} bytes)" +} + +whatsapp_download_image() { + local media_id="$1" + local output_path="$2" + + if ! whatsapp_download_media "$media_id" "$output_path" "image"; then + return 1 + fi + + # Validate magic bytes to ensure it's an image + local header + header=$(od -An -tx1 -N12 "$output_path" | tr -d ' ') + case "$header" in + ffd8ff*) ;; # JPEG + 89504e47*) ;; # PNG + 47494638*) ;; # GIF + 52494646????????57454250) ;; # WebP + *) + log_error "whatsapp" "Downloaded file is not a recognized image format" + rm -f "$output_path" + return 1 + ;; + esac +} + +whatsapp_download_document() { + whatsapp_download_media "$1" "$2" "document" +} + +whatsapp_download_audio() { + local media_id="$1" + local output_path="$2" + + if ! 
whatsapp_download_media "$media_id" "$output_path" "audio"; then + return 1 + fi + + # Validate magic bytes for audio formats + local header + header=$(od -An -tx1 -N12 "$output_path" | tr -d ' ') + case "$header" in + 4f676753*) ;; # OGG + 494433*) ;; # MP3 (ID3 tag) + fffb*) ;; # MP3 (MPEG-1 Layer 3, no CRC) + fff3*) ;; # MP3 (MPEG-2 Layer 3, no CRC) + fff2*) ;; # MP3 (MPEG-2 Layer 3, with CRC) + *) + log_error "whatsapp" "Downloaded file is not a recognized audio format" + rm -f "$output_path" + return 1 + ;; + esac +} + +whatsapp_handle_webhook() { + local body="$1" + whatsapp_parse_webhook "$body" + + if [ -z "$WEBHOOK_FROM_NUMBER" ]; then + return + fi + + # Security: only allow configured phone number (never skip if unset) + if [ -z "$WHATSAPP_PHONE_NUMBER" ]; then + log_error "whatsapp" "WHATSAPP_PHONE_NUMBER not configured — rejecting all messages" + return + fi + if [ "$WEBHOOK_FROM_NUMBER" != "$WHATSAPP_PHONE_NUMBER" ]; then + log "whatsapp" "Rejected message from unauthorized number: $WEBHOOK_FROM_NUMBER" + return + fi + + local text="$WEBHOOK_TEXT" + local message_id="$WEBHOOK_MESSAGE_ID" + + # Handle different message types + local has_image=false + local has_document=false + local has_audio=false + + case "$WEBHOOK_MESSAGE_TYPE" in + image) + has_image=true + text="${WEBHOOK_IMAGE_CAPTION:-$text}" + ;; + document) + has_document=true + ;; + audio|voice) + has_audio=true + ;; + text) + # Already handled + ;; + *) + log "whatsapp" "Unsupported message type: $WEBHOOK_MESSAGE_TYPE" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I don't support that message type yet." 
"$message_id" + return + ;; + esac + + # Must have either text, image, document, or audio + if [ -z "$text" ] && [ "$has_image" != true ] && [ "$has_document" != true ] && [ "$has_audio" != true ]; then + return + fi + + # If this is a reply, prepend context note + # (WhatsApp doesn't provide the original message text, only the message ID) + if [ -n "$text" ] && [ -n "$WEBHOOK_CONTEXT_ID" ]; then + text="[Replying to a previous message] + +${text}" + fi + + # Handle commands + case "$text" in + /opus) + MODEL="opus" + if [ -n "$CLAUDIO_BOT_DIR" ]; then + claudio_save_bot_env + else + claudio_save_env + fi + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "_Switched to Opus model._" "$message_id" + return + ;; + /sonnet) + MODEL="sonnet" + if [ -n "$CLAUDIO_BOT_DIR" ]; then + claudio_save_bot_env + else + claudio_save_env + fi + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "_Switched to Sonnet model._" "$message_id" + return + ;; + /haiku) + # shellcheck disable=SC2034 # Used by claude.sh via config + MODEL="haiku" + if [ -n "$CLAUDIO_BOT_DIR" ]; then + claudio_save_bot_env + else + claudio_save_env + fi + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "_Switched to Haiku model._" "$message_id" + return + ;; + /start) + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "_Hola!_ Send me a message and I'll forward it to Claude Code." "$message_id" + return + ;; + esac + + log "whatsapp" "Received message from number=$WEBHOOK_FROM_NUMBER" + + # Mark as read to acknowledge receipt + whatsapp_mark_read "$message_id" + + # Download image if present + local image_file="" + if [ "$has_image" = true ] && [ -n "$WEBHOOK_IMAGE_ID" ]; then + local img_tmpdir="${CLAUDIO_PATH}/tmp" + if ! mkdir -p "$img_tmpdir"; then + log_error "whatsapp" "Failed to create image temp directory: $img_tmpdir" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't process your image. Please try again." 
"$message_id" + return + fi + image_file=$(mktemp "${img_tmpdir}/claudio-img-XXXXXX.jpg") || { + log_error "whatsapp" "Failed to create temp file for image" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't process your image. Please try again." "$message_id" + return + } + if ! whatsapp_download_image "$WEBHOOK_IMAGE_ID" "$image_file"; then + rm -f "$image_file" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't download your image. Please try again." "$message_id" + return + fi + chmod 600 "$image_file" + fi + + # Download document if present + local doc_file="" + if [ "$has_document" = true ] && [ -n "$WEBHOOK_DOC_ID" ]; then + local doc_tmpdir="${CLAUDIO_PATH}/tmp" + if ! mkdir -p "$doc_tmpdir"; then + log_error "whatsapp" "Failed to create document temp directory: $doc_tmpdir" + rm -f "$image_file" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't process your file. Please try again." "$message_id" + return + fi + # Derive extension from filename + local doc_ext="bin" + if [ -n "$WEBHOOK_DOC_FILENAME" ]; then + local name_ext="${WEBHOOK_DOC_FILENAME##*.}" + if [ -n "$name_ext" ] && [ "$name_ext" != "$WEBHOOK_DOC_FILENAME" ] && [[ "$name_ext" =~ ^[a-zA-Z0-9]+$ ]] && [ ${#name_ext} -le 10 ]; then + doc_ext="$name_ext" + fi + fi + doc_file=$(mktemp "${doc_tmpdir}/claudio-doc-XXXXXX.${doc_ext}") || { + log_error "whatsapp" "Failed to create temp file for document" + rm -f "$image_file" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't process your file. Please try again." "$message_id" + return + } + if ! whatsapp_download_document "$WEBHOOK_DOC_ID" "$doc_file"; then + rm -f "$doc_file" "$image_file" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't download your file. Please try again." 
"$message_id" + return + fi + chmod 600 "$doc_file" + fi + + # Download and transcribe audio if present + local audio_file="" + local transcription="" + if [ "$has_audio" = true ] && [ -n "${WEBHOOK_AUDIO_ID}${WEBHOOK_VOICE_ID}" ]; then + if [[ -z "$ELEVENLABS_API_KEY" ]]; then + rm -f "$image_file" "$doc_file" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "_Voice messages require ELEVENLABS_API_KEY to be configured._" "$message_id" + return + fi + local audio_tmpdir="${CLAUDIO_PATH}/tmp" + if ! mkdir -p "$audio_tmpdir"; then + log_error "whatsapp" "Failed to create audio temp directory: $audio_tmpdir" + rm -f "$image_file" "$doc_file" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't process your audio message. Please try again." "$message_id" + return + fi + audio_file=$(mktemp "${audio_tmpdir}/claudio-audio-XXXXXX.ogg") || { + log_error "whatsapp" "Failed to create temp file for audio" + rm -f "$image_file" "$doc_file" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't process your audio message. Please try again." "$message_id" + return + } + local audio_id="${WEBHOOK_AUDIO_ID:-$WEBHOOK_VOICE_ID}" + if ! whatsapp_download_audio "$audio_id" "$audio_file"; then + rm -f "$audio_file" "$image_file" "$doc_file" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't download your audio message. Please try again." "$message_id" + return + fi + chmod 600 "$audio_file" + + if ! transcription=$(stt_transcribe "$audio_file"); then + rm -f "$audio_file" "$image_file" "$doc_file" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't transcribe your audio message. Please try again." 
"$message_id" + return + fi + rm -f "$audio_file" + audio_file="" + + if [ -n "$text" ]; then + text="${transcription} + +${text}" + else + text="$transcription" + fi + log "whatsapp" "Audio message transcribed: ${#transcription} chars" + fi + + # Build prompt with image reference + if [ -n "$image_file" ]; then + if [ -n "$text" ]; then + text="[The user sent an image at ${image_file}] + +${text}" + else + text="[The user sent an image at ${image_file}] + +Describe this image." + fi + fi + + # Build prompt with document reference + if [ -n "$doc_file" ]; then + local doc_name="${WEBHOOK_DOC_FILENAME:-document}" + doc_name=$(printf '%s' "$doc_name" | tr -cd 'a-zA-Z0-9._ -' | head -c 255) + doc_name="${doc_name:-document}" + if [ -n "$text" ]; then + text="[The user sent a file \"${doc_name}\" at ${doc_file}] + +${text}" + else + text="[The user sent a file \"${doc_name}\" at ${doc_file}] + +Read this file and summarize its contents." + fi + fi + + # Store descriptive text in history + local history_text="$text" + if [ "$has_audio" = true ]; then + history_text="[Sent an audio message: ${transcription}]" + elif [ -n "$image_file" ]; then + local caption="${WEBHOOK_IMAGE_CAPTION:-}" + if [ -n "$caption" ]; then + history_text="[Sent an image with caption: ${caption}]" + else + history_text="[Sent an image]" + fi + elif [ -n "$doc_file" ]; then + if [ -n "$text" ]; then + history_text="[Sent a file \"${doc_name}\" with caption: ${text}]" + else + history_text="[Sent a file \"${doc_name}\"]" + fi + fi + + # Typing indicator removed - see whatsapp_send_typing comment above + local tts_file="" + trap 'rm -f "$image_file" "$doc_file" "$audio_file" "$tts_file"' RETURN + + local response + response=$(claude_run "$text") + + # Enrich history with document summary + if [ -n "$response" ]; then + if [ -z "$WEBHOOK_IMAGE_CAPTION" ] && [ -n "$doc_file" ]; then + history_text="[Sent a file \"${doc_name}\": $(_summarize "$response")]" + fi + fi + + history_add "user" 
"$history_text" + + if [ -n "$response" ]; then + local history_response="$response" + if [ -n "${CLAUDE_NOTIFIER_MESSAGES:-}" ]; then + history_response="${CLAUDE_NOTIFIER_MESSAGES}"$'\n\n'"${history_response}" + fi + if [ -n "${CLAUDE_TOOL_SUMMARY:-}" ]; then + history_response="${CLAUDE_TOOL_SUMMARY}"$'\n\n'"${history_response}" + fi + history_response=$(printf '%s' "$history_response" | _sanitize_for_prompt) + history_add "assistant" "$history_response" + + # Consolidate memories + if type memory_consolidate &>/dev/null; then + (memory_consolidate || true) & + fi + + # Respond with audio when the user sent an audio message + # (ELEVENLABS_API_KEY is guaranteed non-empty here — checked at audio download) + if [ "$has_audio" = true ]; then + local tts_tmpdir="${CLAUDIO_PATH}/tmp" + if ! mkdir -p "$tts_tmpdir"; then + log_error "whatsapp" "Failed to create TTS temp directory: $tts_tmpdir" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "$response" "$message_id" + else + tts_file=$(mktemp "${tts_tmpdir}/claudio-tts-XXXXXX.mp3") || { + log_error "whatsapp" "Failed to create temp file for TTS" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "$response" "$message_id" + return + } + chmod 600 "$tts_file" + + if tts_convert "$response" "$tts_file"; then + if ! whatsapp_send_audio "$WEBHOOK_FROM_NUMBER" "$tts_file" "$message_id"; then + log_error "whatsapp" "Failed to send audio message, falling back to text" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "$response" "$message_id" + fi + else + # TTS failed, fall back to text only + log_error "whatsapp" "TTS conversion failed, sending text only" + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "$response" "$message_id" + fi + fi + else + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "$response" "$message_id" + fi + else + whatsapp_send_message "$WEBHOOK_FROM_NUMBER" "Sorry, I couldn't get a response. Please try again." 
"$message_id" + fi +} + +whatsapp_setup() { + local bot_id="${1:-}" + + echo "=== Claudio WhatsApp Business API Setup ===" + if [ -n "$bot_id" ]; then + echo "Bot: $bot_id" + fi + echo "" + echo "You'll need the following from your WhatsApp Business account:" + echo "1. Phone Number ID (from Meta Business Suite)" + echo "2. Access Token (permanent token from Meta for Developers)" + echo "3. App Secret (from your Meta app settings)" + echo "4. Authorized phone number (the number you want to receive messages from)" + echo "" + + read -rp "Enter your WhatsApp Phone Number ID: " phone_id + if [ -z "$phone_id" ]; then + print_error "Phone Number ID cannot be empty." + exit 1 + fi + + read -rp "Enter your WhatsApp Access Token: " access_token + if [ -z "$access_token" ]; then + print_error "Access Token cannot be empty." + exit 1 + fi + + read -rp "Enter your WhatsApp App Secret: " app_secret + if [ -z "$app_secret" ]; then + print_error "App Secret cannot be empty." + exit 1 + fi + + read -rp "Enter authorized phone number (format: 1234567890): " phone_number + if [ -z "$phone_number" ]; then + print_error "Phone number cannot be empty." 
+ exit 1 + fi + + # Generate verify token + local verify_token + verify_token=$(openssl rand -hex 32) || { + print_error "Failed to generate verify token" + exit 1 + } + + export WHATSAPP_PHONE_NUMBER_ID="$phone_id" + export WHATSAPP_ACCESS_TOKEN="$access_token" + export WHATSAPP_APP_SECRET="$app_secret" + export WHATSAPP_PHONE_NUMBER="$phone_number" + export WHATSAPP_VERIFY_TOKEN="$verify_token" + + # Verify credentials by calling the API + local config_file test_result + config_file=$(mktemp "${CLAUDIO_PATH}/tmp/curl-config-XXXXXX") || { + print_error "Failed to create temporary config file" + exit 1 + } + chmod 600 "$config_file" + trap 'rm -f "$config_file"' RETURN + + printf 'url = "%s/%s"\n' "$WHATSAPP_API" "$phone_id" > "$config_file" + printf 'header = "Authorization: Bearer %s"\n' "$access_token" >> "$config_file" + + test_result=$(curl -s --connect-timeout 10 --max-time 30 --config "$config_file") + + local verified_name + verified_name=$(echo "$test_result" | jq -r '.verified_name // empty') + if [ -z "$verified_name" ]; then + print_error "Failed to verify WhatsApp credentials. Check your Phone Number ID and Access Token." + exit 1 + fi + + print_success "Credentials verified: $verified_name" + + # Verify tunnel is configured + if [ -z "$WEBHOOK_URL" ]; then + print_warning "No tunnel configured. Run 'claudio install' first." + exit 1 + fi + + # Save config: per-bot or global + if [ -n "$bot_id" ]; then + # Validate bot_id format + if [[ ! "$bot_id" =~ ^[a-zA-Z0-9_-]+$ ]]; then + print_error "Invalid bot name: '$bot_id'. Use only letters, numbers, hyphens, and underscores." 
+ exit 1 + fi + + local bot_dir="$CLAUDIO_PATH/bots/$bot_id" + mkdir -p "$bot_dir" + chmod 700 "$bot_dir" + + # Load existing config to preserve other platform's credentials + export CLAUDIO_BOT_ID="$bot_id" + export CLAUDIO_BOT_DIR="$bot_dir" + export CLAUDIO_DB_FILE="$bot_dir/history.db" + if [ -f "$bot_dir/bot.env" ]; then + # shellcheck source=/dev/null + source "$bot_dir/bot.env" 2>/dev/null || true + fi + + # Re-apply new WhatsApp credentials (source may have overwritten them during re-configuration) + export WHATSAPP_PHONE_NUMBER_ID="$phone_id" + export WHATSAPP_ACCESS_TOKEN="$access_token" + export WHATSAPP_APP_SECRET="$app_secret" + export WHATSAPP_PHONE_NUMBER="$phone_number" + export WHATSAPP_VERIFY_TOKEN="$verify_token" + + claudio_save_bot_env + + print_success "Bot config saved to $bot_dir/bot.env" + else + claudio_save_env + print_success "Config saved to service.env" + fi + + echo "" + echo "=== Webhook Configuration ===" + echo "Configure your WhatsApp webhook in Meta for Developers:" + echo "" + echo " Callback URL: ${WEBHOOK_URL}/whatsapp/webhook" + echo " Verify Token: ${verify_token}" + echo "" + echo "Subscribe to these webhook fields:" + echo " - messages" + echo "" + print_success "Setup complete!" +}
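The diff calls `match_whatsapp_bot_by_signature(body, signature)` from `_handle_whatsapp()` but does not show its body. The following is a minimal sketch of how such a matcher could work, not the project's actual implementation. It assumes a hypothetical registry layout (`bots` maps `bot_id` to a config dict with an `app_secret` key) and that the `sha256=` prefix has already been stripped, as the handler above does. The names `sign_body` and `match_bot_by_signature` are illustrative only.

```python
import hashlib
import hmac


def sign_body(body: bytes, app_secret: str) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(app_secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def match_bot_by_signature(body: bytes, signature: str, bots: dict):
    """Hypothetical stand-in for match_whatsapp_bot_by_signature().

    `bots` maps bot_id -> config dict with an "app_secret" key (assumed layout).
    `signature` is the hex digest with the "sha256=" prefix already removed.
    Every bot is checked, with no early exit, so the amount of work does not
    depend on which bot (if any) matches.
    """
    matched_id, matched_config = None, None
    for bot_id, config in bots.items():
        secret = config.get("app_secret", "")
        if not secret:
            continue  # Fail closed: a bot without a secret can never match
        expected = sign_body(body, secret)
        # compare_digest takes time independent of where the inputs differ;
        # comparing bytes avoids a TypeError on non-ASCII header values
        same = hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
        if same and matched_id is None:
            matched_id, matched_config = bot_id, config
    return matched_id, matched_config
```

The digest must be computed over the raw request bytes exactly as received, before any JSON parsing, which is why the handler verifies the signature before calling `json.loads(body)`.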