feat(net): CircuitBreaker resilience layer (Slice 1, PR 4)#85
Conversation
Design for the net-http CircuitBreaker resilience layer (ADR-0031 §5): a pure, clock-injected Breaker state machine (Closed/Open/Half-Open) behind a thin Arc<Mutex<Breaker>> + Timer Service shell; trips on consecutive Connection/Timeout/5xx failures (or immediately on Throttled/429 with the long throttle_cooldown), fast-rejects with a new non-retryable HttpError::CircuitOpen / ErrorKind::CircuitOpen, lazy Half-Open probing with now()-only timing (no sleep, no futures-util, no new dependency). Adds a 4-class outcome partition (Failure / TripNow / Ignored / Success) so 4xx/Auth and unclassified errors neither trip nor mask a building outage; Unknown -> Ignored for v1 with the resilience4j fail-safe recorded as a future improvement. To be recorded as ADR-0034 amendment #9. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds BreakerState/Admit/Breaker (pub(crate), not re-exported) to circuit_breaker.rs per ADR-0031 §5: new/admit/record, clock-injected via now: Instant so the whole transition table is testable with zero async. Half-Open treats Ignored and Success identically (a reached-host probe resolves it); Open ignores stale record() calls (single global v1 breaker). Task 4's Service will consume these types, removing the transitional #[allow(dead_code)] added here to keep just lint --all-targets green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add CircuitBreakerLayer<T> (Layer<S> factory) and CircuitBreaker<S, T> (the wrapping Service) over the pure Breaker state machine from Tasks 2-3: locks briefly to admit (via timer.now()), releases the lock, runs inner.call().await (or fast-rejects CircuitOpen), then locks briefly to record the classified outcome. The Mutex guard never crosses the await. Removes the transitional #[allow(dead_code)] attributes from Tasks 2-3 now that Class/classify/Breaker/Admit are wired into non-test code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
If the CircuitBreaker call future is dropped before recording the outcome (caller cancellation, or an inner panic) while a Half-Open probe is in flight, the breaker was stranded at probes_left:0 forever. Add a state-aware Breaker::on_abandoned_probe (reopen from Half-Open, no-op in Closed/Open) and an RAII ProbeGuard that fires it on drop, so a cancelled probe self-heals after a fresh cooldown instead of wedging. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR adds a CircuitBreaker resilience layer to net-http-api: a new ChangesCircuitBreaker resilience layer
Estimated code review effort: 4 (Complex) | ~60 minutes Possibly related PRs
Suggested labels: 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/adr/0034-http-construction-surface-auth-guarded-boot-coverage.md`:
- Around line 234-237: Tighten the circuit-breaker state description so
`Class::Ignored` is only described as not affecting the Closed-state failure
streak, since `circuit_breaker.rs::classify` and the Half-Open probe handling
currently close probes on `Ignored`. Update the wording around the `partition`
rules to make clear that `4xx`/`Auth`/`Unknown` never trip and do not reset the
Closed-state counter, but they do resolve Half-Open probes rather than “never
resets” in every state.
In `@docs/superpowers/specs/2026-07-04-net-http-circuitbreaker-layer-design.md`:
- Around line 415-417: Remove the stale `half_open_probes > 1` question from the
testing notes in the circuit breaker design spec, since the multi-probe table
test is already committed elsewhere. Update that item in the testing section to
either a confirmed decision or delete it entirely so it no longer reads like
open planning; use the `half_open_probes`, `probes_left`, and `successes_needed`
terminology to keep the wording aligned with the rest of the spec.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 92cdf6ef-92da-4214-aa66-37e0f748ecbd
📒 Files selected for processing (8)
CHANGELOG.mdcrates/adapter/net/api/src/error_kind.rscrates/adapter/net/http/api/src/circuit_breaker.rscrates/adapter/net/http/api/src/error.rscrates/adapter/net/http/api/src/lib.rsdocs/adr/0034-http-construction-surface-auth-guarded-boot-coverage.mddocs/superpowers/plans/2026-07-04-net-http-circuitbreaker-layer.mddocs/superpowers/specs/2026-07-04-net-http-circuitbreaker-layer-design.md
Address CodeRabbit review on #85: - ADR-0034 #9: "never resets" was unqualified but only holds for the Closed-state failure streak; in Half-Open a reached-host Ignored resolves the probe like a Success (circuit_breaker.rs record()). Scope the wording. - CB design spec open-question #4: convert the stale "table-test >1 now?" question to Resolved — the Testing section commits to the two-probe case and multi_probe_half_open_requires_all_to_close ships it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes #84.
Slice 1, PR 4 of the net-http resilience layers (after RateLimit #76, Timeout #78, Retry #82; before Tracing #83). Spec:
docs/superpowers/specs/2026-07-04-net-http-circuitbreaker-layer-design.md; plan:docs/superpowers/plans/2026-07-04-net-http-circuitbreaker-layer.md; ADR-0031 §5, ADR-0034 #9.What
CircuitOpenerror surface — newErrorKind::CircuitOpen+HttpError::CircuitOpen, non-retryable fast-reject.CircuitBreakerConfig+ outcome classifier — 4-class partition (Failure / TripNow / Ignored / Success): Connection/Timeout/5xx count as failures; Throttled/429 trips immediately with the longthrottle_cooldown; 4xx/Auth/unclassified neither trip nor mask a building outage; Unknown → Ignored for v1.Breakerstate machine (Closed/Open/Half-Open) — table-tested with zero async; lazy Half-Open probing vianow()only (no sleep, no new dependency).CircuitBreaker<S, T>+CircuitBreakerLayer<T>— thinArc<Mutex<Breaker>>asyncServiceshell over the breaker, single per-host breaker; body-transparent; sits outside Retry so it counts logical post-retry outcomes.Verification
just cigreen locally (fmt, lint, test + doctests, doc, deny, machete, gitleaks, actionlint, shellcheck).CHANGELOG.md[Unreleased]updated.🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
HttpError::CircuitOpen/ErrorKind::CircuitOpen).Bug Fixes
Documentation
Tests