Problem
The circuit breaker, rate limiter, and retry logic are unit tested but have never been validated under realistic failure scenarios. Open questions:
- What happens when a backend dies mid-request?
- Does the circuit breaker actually trip under sustained failures?
- Do retries work correctly with real network errors?
Current State
- ✅ Unit tests for circuit breaker state machine
- ✅ Unit tests for rate limiter token bucket
- ❌ No integration tests with actual failing backends
- ❌ No chaos engineering (pod kills, network partitions)
Suggested Fix
Integration Tests
- Deploy a test backend that returns 500s on demand
- Verify circuit breaker opens after the failure threshold is reached
- Verify traffic fails fast when circuit is open
- Verify half-open state allows probe requests
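As a rough sketch of what such a scenario could assert, the test below drives a stand-in breaker against a fake backend that can be switched to return 500s. Everything here is hypothetical and for illustration only: `Breaker`, `FakeBackend`, `CallError`, the threshold of 3, and the 30 second cooldown are assumptions, not the API or settings in control/src/proxy/circuit_breaker.rs. A real scenario would run the same assertions through the proxy against a deployed backend.

```rust
use std::cell::Cell;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
enum State { Closed, Open, HalfOpen }

#[derive(Debug, PartialEq)]
enum CallError {
    Rejected,     // breaker was open, backend was never contacted
    Backend(u16), // backend answered with a 5xx status
}

/// Hypothetical stand-in for the real breaker; time is passed in so tests need no sleeps.
struct Breaker { threshold: u32, cooldown: Duration, failures: u32, opened_at: Option<Instant> }

impl Breaker {
    fn new(threshold: u32, cooldown: Duration) -> Self {
        Breaker { threshold, cooldown, failures: 0, opened_at: None }
    }

    fn state(&self, now: Instant) -> State {
        match self.opened_at {
            None => State::Closed,
            Some(t) if now.saturating_duration_since(t) >= self.cooldown => State::HalfOpen,
            Some(_) => State::Open,
        }
    }

    fn call<F: FnOnce() -> u16>(&mut self, now: Instant, backend: F) -> Result<u16, CallError> {
        if self.state(now) == State::Open {
            return Err(CallError::Rejected); // fail fast
        }
        let status = backend();
        if status >= 500 {
            self.failures += 1;
            // A failed half-open probe, or reaching the threshold while closed, (re)opens the circuit.
            if self.opened_at.is_some() || self.failures >= self.threshold {
                self.opened_at = Some(now);
            }
            Err(CallError::Backend(status))
        } else {
            self.failures = 0;
            self.opened_at = None;
            Ok(status)
        }
    }
}

/// Hypothetical backend that returns 500s on demand and counts the requests it receives.
struct FakeBackend { failing: Cell<bool>, hits: Cell<u32> }

impl FakeBackend {
    fn new() -> Self { FakeBackend { failing: Cell::new(false), hits: Cell::new(0) } }
    fn handle(&self) -> u16 {
        self.hits.set(self.hits.get() + 1);
        if self.failing.get() { 500 } else { 200 }
    }
}

#[test]
fn breaker_opens_fails_fast_and_probes() {
    let backend = FakeBackend::new();
    let mut breaker = Breaker::new(3, Duration::from_secs(30));
    let t0 = Instant::now();

    // Sustained failures trip the breaker once the threshold is reached.
    backend.failing.set(true);
    for _ in 0..3 {
        assert_eq!(breaker.call(t0, || backend.handle()), Err(CallError::Backend(500)));
    }
    assert_eq!(breaker.state(t0), State::Open);

    // While open, requests are rejected without reaching the backend.
    assert_eq!(breaker.call(t0, || backend.handle()), Err(CallError::Rejected));
    assert_eq!(backend.hits.get(), 3);

    // After the cooldown, a probe is let through; success closes the circuit.
    backend.failing.set(false);
    let t1 = t0 + Duration::from_secs(30);
    assert_eq!(breaker.state(t1), State::HalfOpen);
    assert_eq!(breaker.call(t1, || backend.handle()), Ok(200));
    assert_eq!(breaker.state(t1), State::Closed);
}
```

Counting backend hits is what separates "fails fast" from "fails": the rejected request must leave the hit counter unchanged.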
Chaos Testing (optional)
- Use Chaos Mesh or LitmusChaos in CI
- Kill backend pods during load test
- Inject network latency
- Verify graceful degradation
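Before wiring Chaos Mesh or LitmusChaos into CI, the latency case can be approximated locally. The sketch below is an in-process stand-in, not a chaos tool: it starts a throwaway TCP backend on a loopback port that delays its reply, then checks that a client with a read timeout gives up within its deadline instead of hanging. The helper names (`spawn_slow_backend`, `request_with_timeout`) and the 500 ms / 50 ms values are assumptions chosen for illustration; only the Rust standard library is used.

```rust
use std::io::{self, ErrorKind, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Starts a backend on an ephemeral loopback port that waits `delay` before replying.
fn spawn_slow_backend(delay: Duration) -> SocketAddr {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind loopback");
    let addr = listener.local_addr().expect("local addr");
    thread::spawn(move || {
        for mut stream in listener.incoming().flatten() {
            // Drain the request headers first so closing the socket does not reset the connection.
            let mut buf = [0u8; 1024];
            let mut seen = Vec::new();
            while !seen.windows(4).any(|w| w == b"\r\n\r\n") {
                match stream.read(&mut buf) {
                    Ok(0) | Err(_) => break,
                    Ok(n) => seen.extend_from_slice(&buf[..n]),
                }
            }
            thread::sleep(delay); // injected latency
            let _ = stream.write_all(
                b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok",
            );
        }
    });
    addr
}

/// Sends one GET and returns the raw response, or an error if `timeout` elapses first.
fn request_with_timeout(addr: SocketAddr, timeout: Duration) -> io::Result<String> {
    let mut stream = TcpStream::connect_timeout(&addr, timeout)?;
    stream.set_read_timeout(Some(timeout))?;
    stream.write_all(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}

#[test]
fn slow_backend_times_out_instead_of_hanging() {
    let delay = Duration::from_millis(500);
    let addr = spawn_slow_backend(delay);

    let started = Instant::now();
    let err = request_with_timeout(addr, Duration::from_millis(50)).unwrap_err();

    // Read timeouts surface as WouldBlock on Unix and TimedOut on Windows.
    assert!(matches!(err.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut));
    // The caller got its answer well before the backend would have replied.
    assert!(started.elapsed() < delay);
}
```

The same two assertions (a bounded wait and a well-defined error) are what a cluster-level latency experiment would need to check from the client side of the proxy.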
Files
control/src/proxy/circuit_breaker.rs
control/src/proxy/rate_limiter.rs
tests/integration/scenarios/ (add new scenarios)