A 150-line Python solution to extend Claude Code usage through automatic API fallbacks with commit tracking for code review.
- LiteLLM Proxy (config file)
  - Handles API routing and automatic fallbacks
  - Switches: Claude → Gemini → DeepSeek
  - Built-in retry logic and rate limit handling
- Monitor (50 lines)
  - Flask webhook listener
  - Creates git commits/tags on model switches
  - Sends Discord notifications
- Review Script (100 lines)
  - Analyzes commits by model
  - Shows diffs for each model's work
  - Enables accept/rollback of model work
Before: 2,500+ lines of custom Python orchestrator with:
- Custom provider adapters
- ClaudeMonitor subprocess wrapper
- Complex session management
- HTTP server with SSE
- Nested git branching logic
After: Use battle-tested LiteLLM proxy
- Already handles 100+ LLM providers
- Built-in fallback logic
- Well-documented and maintained
- 95% less code to maintain
The Problem: Claude Code is a web app making HTTP requests, not a CLI tool you can wrap as a subprocess.
The Solution: Intercept at the HTTP level using a proxy, then use webhooks to trigger git operations.
Alternative Considered: mitmproxy with custom addon
- More complex setup
- Requires HTTPS certificate management
- Harder to debug

Why We Chose LiteLLM: Official support for proxying, easier configuration, better documentation.
Before: Nested branches like dev → dev-openrouter → dev-openrouter-gemini
- Hard to visualize
- Complex merge/rollback logic
- Over-engineered for the use case
After: Linear tags on commits
- Clear which model did what
- Simple git revert for rollback
- Easy to analyze with `git log --oneline --decorate`
Example:
abc123f (tag: model-gemini-20260216-1430) Auto-commit before switch to gemini
def456a (tag: model-deepseek-20260216-1500) Auto-commit before switch to deepseek
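The tag names in the example follow a `model-<name>-<YYYYMMDD>-<HHMM>` pattern. As a minimal sketch, assuming that pattern holds for every tag the monitor creates, the helpers below build and parse such names. `make_tag`, `parse_tag`, and `TAG_PATTERN` are hypothetical names for illustration, not functions taken from monitor.py or review.py.

```python
import re
from datetime import datetime

# Assumed tag layout, inferred from the example tags above:
# model-<name>-<YYYYMMDD>-<HHMM>
TAG_PATTERN = re.compile(r"^model-(?P<model>.+)-(?P<date>\d{8})-(?P<time>\d{4})$")


def make_tag(model: str, when: datetime) -> str:
    """Build a tag name such as model-gemini-20260216-1430."""
    return f"model-{model}-{when:%Y%m%d-%H%M}"


def parse_tag(tag: str):
    """Return (model, datetime) for a model tag, or None if it does not match."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    when = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M")
    return match["model"], when
```

Because the model name is matched greedily up to the trailing date and time, names that themselves contain hyphens (for example `gemini-flash`) should still parse.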
    ┌─────────────┐
    │ Claude Code │
    │  (Web App)  │
    └──────┬──────┘
           │ HTTP POST
           │ to api.anthropic.com
           ↓
    ┌─────────────────┐
    │  LiteLLM Proxy  │ ← listens on localhost:8000
    │   (Port 8000)   │
    └────────┬────────┘
             │
             ├─→ Try T0: Claude API
             │     ↓ (429 rate limit)
             ├─→ Try T1: Gemini API
             │     ↓ (429 rate limit)
             └─→ Try T2: DeepSeek API
             │
             │ Webhook on fallback
             ↓
    ┌──────────────────┐
    │ Monitor (Flask)  │ ← listens on localhost:5000
    │   (Port 5000)    │
    └────────┬─────────┘
             │
             ├─→ git commit -am "..."
             ├─→ git tag model-X-timestamp
             └─→ Discord notification
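The tiered fallback in the diagram can be illustrated with a few lines of plain Python. This is only a sketch of the idea, not how LiteLLM implements it: `RateLimited`, `call_with_fallback`, and the `on_fallback` callback are hypothetical names, and each provider is modelled as a simple callable.

```python
class RateLimited(Exception):
    """Stands in for an HTTP 429 response from a provider (assumption)."""


def call_with_fallback(providers, prompt, on_fallback=None):
    """Try each (name, call) pair in order, moving on when one is rate limited.

    on_fallback(name) is invoked when a provider other than the first one
    answers, which is where a webhook to the monitor would be sent.
    """
    fell_back = False
    for name, call in providers:
        try:
            result = call(prompt)
        except RateLimited:
            fell_back = True  # remember that an earlier tier was skipped
            continue
        if fell_back and on_fallback is not None:
            on_fallback(name)
        return name, result
    raise RuntimeError("all providers are rate limited")
```

Passing the providers in T0, T1, T2 order reproduces the chain shown above: the first tier that does not raise `RateLimited` answers the request, and the callback fires only if an earlier tier was skipped.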
    ┌──────────────┐
    │ Claude Usage │
    │   Returns    │
    └──────┬───────┘
           │
           ↓
    ┌─────────────────────────┐
    │ python review.py -a     │  Show what each model did
    └──────┬──────────────────┘
           │
           ├─→ Good work? → python review.py --accept gemini
           │                  ↓
           │                Tag renamed: model-* → accepted-*
           │
           └─→ Bad work?  → python review.py --rollback deepseek
                              ↓
                            git revert <commits>
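Git has no tag rename command, so the `model-*` → `accepted-*` step in the diagram presumably creates a new tag at the same commit and deletes the old one. The sketch below shows one way to derive the commands under that assumption. `accepted_name` and `rename_commands` are hypothetical helpers, not the actual contents of review.py.

```python
def accepted_name(tag: str) -> str:
    """Map model-<rest> to accepted-<rest>, leaving the suffix untouched."""
    prefix = "model-"
    if not tag.startswith(prefix):
        raise ValueError(f"not a model tag: {tag!r}")
    return "accepted-" + tag[len(prefix):]


def rename_commands(tag: str):
    """Return the git commands that 'rename' a tag: create the new one, delete the old."""
    new_tag = accepted_name(tag)
    return [
        # ^{commit} peels the old tag to its commit, so this also works
        # if the original tag happens to be annotated.
        ["git", "tag", new_tag, f"{tag}^{{commit}}"],
        ["git", "tag", "-d", tag],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` inside the repository. Returning the commands instead of running them keeps the logic easy to inspect before anything is changed.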
Edit litellm_config.yaml:

    model_list:
      - model_name: my-custom-model
        litellm_params:
          model: provider/model-name
          api_key: os.environ/MY_API_KEY

    router_settings:
      fallbacks: [
        {"claude-sonnet-4": ["gemini-flash"]},
        {"gemini-flash": ["my-custom-model"]},  # Add here
      ]

To disable automatic commits, set GIT_AUTO_COMMIT to false before starting the monitor:

    export GIT_AUTO_COMMIT=false
    python monitor.py

The monitor will still receive webhooks but won't create commits.
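For readers wondering how such a switch is typically read, here is a minimal sketch of an environment flag check. The default of enabled and the accepted "off" spellings are assumptions for illustration; monitor.py may parse `GIT_AUTO_COMMIT` differently.

```python
import os


def auto_commit_enabled(environ=None) -> bool:
    """Return False only when GIT_AUTO_COMMIT is set to a recognised 'off' value.

    Assumption: the flag defaults to enabled when the variable is unset.
    """
    if environ is None:
        environ = os.environ
    value = environ.get("GIT_AUTO_COMMIT", "true")
    return value.strip().lower() not in {"false", "0", "no", "off"}
```

A webhook handler could call `auto_commit_enabled()` before running any git command and simply return early when it is false, which matches the behaviour described above.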
Edit the webhook handler in monitor.py to add custom logic:

    # Requires `import subprocess` at the top of monitor.py
    @app.route("/webhook", methods=["POST"])
    def webhook():
        # Fall back to an empty dict if the body is missing or not valid JSON
        data = request.get_json(silent=True) or {}
        model = data.get("model")
        # Your custom logic here
        if model == "gemini-flash":
            # Run tests before accepting fallback
            subprocess.run(["pytest"])
        # ... existing commit logic

    # Start monitor
    python monitor.py

    # In another terminal, send test webhook
    curl -X POST http://localhost:5000/webhook \
      -H "Content-Type: application/json" \
      -d '{"model":"test-model","event_type":"fallback"}'

    # Check git tags
    git tag -l "model-*"

- Set very low rate limit on Claude API (via Anthropic console)
- Start proxy and monitor
- Use Claude Code intensively
- Verify automatic fallback to Gemini
- Check commits were created: `git log --oneline`
- Review work: `python review.py -a`

- Check `review.py --analyze` for model activity
- Review and accept/rollback model work
- Clean up old tags: `git tag -l "accepted-*" | xargs git tag -d`
- Check LiteLLM proxy logs for errors
- Update dependencies: `pip install -U litellm flask gitpython requests`
- Review fallback model list, adjust as providers change offerings
Cause: Every webhook triggers a commit, even for non-fallback events
Fix: Edit monitor.py to filter events more strictly:

    if event_type == "fallback_success" and "rate_limit" in str(data):
        git_commit_and_tag(model, event_type)

Cause: No files changed when the model switched
Fix: This is expected if Claude was just thinking/planning. The commit preserves the moment of switch even if no code changed.

Cause: Monitor not creating tags, or tags were deleted
Fix:
- Check monitor is running: `curl http://localhost:5000/health`
- Check git tag creation: `git tag -l "model-*"`
- Manually test webhook: see "Testing" section above
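The stricter event filter from the first fix above can be pulled into a small function so it is easy to check in isolation. This is a sketch that mirrors the condition shown in that fix. `should_commit` is a hypothetical name, and the payload fields (`event_type`, plus some mention of `rate_limit` in the body) are taken from the examples in this document, not from a documented webhook schema.

```python
def should_commit(data: dict) -> bool:
    """Return True only for successful fallbacks caused by a rate limit.

    Mirrors the condition from the fix above: the event type must be
    "fallback_success" and the payload must mention "rate_limit" somewhere.
    """
    event_type = data.get("event_type")
    return event_type == "fallback_success" and "rate_limit" in str(data)
```

Note that the manual test payload used earlier (`"event_type":"fallback"`) would be rejected by this filter, so the test request needs adjusting if the stricter check is adopted.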
If this proves useful, consider:
- Automated Testing on Fallback
  - Run test suite before accepting fallback
  - Auto-rollback if tests fail
- Cost Tracking
  - Log token usage by model
  - Generate cost reports
- Context Preservation
  - Save conversation state before switch
  - Replay context to new model
- Smart Routing
  - Route by task type (coding vs docs vs planning)
  - Use cheaper models for simple tasks
- Web Dashboard
  - Real-time monitoring UI
  - Visual git history by model
  - One-click accept/rollback
But: Keep it simple. The current solution solves the core problem without over-engineering.
- Existing Tools: LiteLLM handles the hard parts (API compatibility, rate limit detection)
- Simple State: Git commits are the only state, no databases
- Manual Review: Human judgment for code quality, not automated merges
- Minimal Code: Easy to understand, debug, and modify
Total maintenance burden: ~2 hours/month for a solo developer.
| Aspect | Original | New |
|---|---|---|
| Lines of Code | ~2,500 | ~150 |
| External Dependencies | Custom orchestrator | LiteLLM (maintained) |
| Complexity | High (nested branches, sessions) | Low (tags, commits) |
| API Interception | ❌ Subprocess wrapper | ✅ HTTP proxy |
| Setup Time | Hours | Minutes |
| Maintenance | High | Minimal |
| Works with Claude Code | ❌ No | ✅ Yes |
MIT - Do whatever you want with this code.