Fix: Pin IBAC user intent against FIFO eviction; add listener.skip_hosts#500
Conversation
…ffic A chatty outbound destination (e.g., an OpenTelemetry collector sidecar that exports dozens of times per agent turn) fills the per-session FIFO event buffer and evicts the inbound A2A user intent that IBAC needs to align tool calls. Once evicted, IBAC's LastIntent() returns nil and the plugin falls through to the no_user_context skip path with sub_reason no_intent -- every tool call after the first is allowed without LLM- judged alignment. Add a listener-level skip list. Hosts matching listener.skip_hosts are forwarded as a transparent proxy: no pipeline run, no session event recorded. The matcher uses the same dot-delimited gobwas/glob semantics as authproxy-routes; ports are stripped before matching. The skip applies symmetrically to both proxy-sidecar (forward + reverse proxy) and envoy-sidecar (ext_proc) deployments, and to both HTTP and CONNECT- tunnel paths in proxy-sidecar mode. Wired through the runtime config; required a switch from struct == to reflect.DeepEqual in the reloader's listener-comparison guard since ListenerConfig now contains a slice. Includes per-listener tests covering: skipped host bypasses pipeline plus recording while still forwarding upstream, non-matching host runs the pipeline normally, and a nil matcher preserves today's behavior. Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com> Signed-off-by: Kelly Abuelsaad <kaymar@gmail.com>
The session store evicts the oldest events when a bucket exceeds maxEvents (default 100). This is the right policy in general but catastrophic for one specific event: the inbound A2A request that carries the user's intent, which IBAC's LastIntent() relies on to align outbound tool calls. A chatty agent can fill the buffer with inference + MCP + observability events in seconds, evicting the intent and breaking IBAC for the rest of the turn. listener.skip_hosts is the right primary fix when the offending traffic is identifiable infrastructure. This commit adds a defense-in-depth backstop in the store itself: the eviction policy now pins the most-recent inbound A2A request event so it survives FIFO trimming. Other events evict in chronological order; the protected intent stays at its original timestamp, leaving a visible time gap in the timeline (consumers like abctl see events in the same monotonic order they were appended, just with a gap). Only the most-recent intent is pinned. Older intents from earlier turns evict normally so a multi-turn conversation with huge fan-out cannot starve the buffer with stale intents. The intent predicate matches SessionView.LastIntent exactly, so the eviction policy and the consumer view cannot drift apart. Tests cover: GSM8K reproducer (intent survives 20 chatty events), no-intent regression (FIFO unchanged for non-intent buckets), multi-turn with only most-recent pinned, fast-path when intent is already in the keep window, chronological-order preservation, and the all-intents pathological bucket (maxEvents still respected). Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com> Signed-off-by: Kelly Abuelsaad <kaymar@gmail.com>
|
Warning Review limit reached
More reviews will be available in 3 minutes and 6 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (8)
📝 WalkthroughWalkthroughThe PR adds configurable outbound host-skipping via ChangesSkip-Host Bypass and Intent Pinning
🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
authbridge/authlib/listener/skiphost/skiphost.go (1)
36-56: Rejectskip_hostspatterns containing:portinskiphost.New
- The matcher strips ports from incoming hosts before matching, but
New()currently accepts patterns with colons; such misconfigs can compile yet never match.- Repo docs/tests/examples for
skip_hostsuse hostname patterns without:port, so adding a construction-time check (and importingstrings) would align enforcement with the existing contract and fail fast at startup.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/authlib/listener/skiphost/skiphost.go` around lines 36 - 56, New currently allows skip_hosts patterns that include a port (a colon), but incoming hosts have ports stripped before matching so such patterns will never match; update New (in skiphost.New) to reject any pattern containing ':' by returning an error (e.g., "skiphost: pattern contains port: %q") when iterating the patterns slice, and add the strings import to check strings.Contains(p, ":"). Ensure the check runs before glob.Compile so invalid patterns fail fast and the returned Matcher/patterns and compiled entries remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@authbridge/authlib/listener/forwardproxy/skiphost_test.go`:
- Line 200: Update the misleading comment in skiphost_test.go so it correctly
describes why the no-op exists: the line "var _ = strings.TrimSpace" is present
to silence an unused-import warning for the strings package, not to suppress
mustParseURL; either replace the comment with one that explicitly references
strings.TrimSpace and the strings import, or remove the no-op and import if
strings is no longer needed; locate the symbol mustParseURL (used from
server_test.go) to verify it's not the cause before editing.
In `@authbridge/CLAUDE.md`:
- Around line 593-594: The paragraph overstates LastIntent()'s guarantee; update
the wording to say the pin only protects the most-recent inbound A2A event from
FIFO eviction while the session exists, and does not prevent LastIntent() from
becoming nil due to session expiration, explicit deletion, or non-FIFO removals.
Mention that only the latest inbound intent is pinned (older intents are not),
recommend preferring skip_hosts when the source is identifiable, and change the
sentence asserting "will return non-nil as long as ANY inbound A2A request
landed in the bucket since session start" to a narrower form like "will return
non-nil while the session remains active and the pinned event has not been
removed; it is not a lifetime guarantee."
---
Nitpick comments:
In `@authbridge/authlib/listener/skiphost/skiphost.go`:
- Around line 36-56: New currently allows skip_hosts patterns that include a
port (a colon), but incoming hosts have ports stripped before matching so such
patterns will never match; update New (in skiphost.New) to reject any pattern
containing ':' by returning an error (e.g., "skiphost: pattern contains port:
%q") when iterating the patterns slice, and add the strings import to check
strings.Contains(p, ":"). Ensure the check runs before glob.Compile so invalid
patterns fail fast and the returned Matcher/patterns and compiled entries remain
unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 09c6b6c8-b7b2-4dca-920f-a56e18b588e9
📒 Files selected for processing (14)
authbridge/CLAUDE.mdauthbridge/authlib/config/config.goauthbridge/authlib/listener/extproc/server.goauthbridge/authlib/listener/extproc/skiphost_test.goauthbridge/authlib/listener/forwardproxy/server.goauthbridge/authlib/listener/forwardproxy/skiphost_test.goauthbridge/authlib/listener/skiphost/skiphost.goauthbridge/authlib/listener/skiphost/skiphost_test.goauthbridge/authlib/reloader/reloader.goauthbridge/authlib/session/intent_pin_test.goauthbridge/authlib/session/store.goauthbridge/cmd/authbridge-envoy/main.goauthbridge/cmd/authbridge-lite/main.goauthbridge/cmd/authbridge-proxy/main.go
huang195
left a comment
There was a problem hiding this comment.
Review Summary
Strong PR — the IBAC intent-pin (change B) is correct and thoroughly tested: it pins only the most-recent A2A intent (older intents FIFO-evict), preserves chronological order, respects maxEvents even in a pathological all-intents bucket, runs under the same lock as reads, and the isIntentEvent predicate is character-for-character identical to SessionView.LastIntent (co-located with a comment) so the eviction policy and consumer view can't drift. The reflect.DeepEqual reloader change is correct (slices aren't ==-comparable) and no laxer than before. Nil/empty skip_hosts preserves today's behavior, inbound is never skipped (matcher wired only to the forward proxy, not the reverse proxy), and skip is a true transparent passthrough.
The blocker is the new bypass's safety envelope: skip_hosts has no match-all guard (see inline), and in the ext_proc / HTTP-forward paths it keys on an agent-controllable hostname (:authority / Host) with no audit trail when a host is skipped. The design intent — "trusted infrastructure surface, not a per-route knob" (CLAUDE.md) — is reasonable, but without the match-all guard a single skip_hosts: ["*"] silently disables the whole outbound enforcement pipeline + recording for matched traffic, and #496 already established that bypass matchers must reject match-all.
One doc nit that isn't inline-able (transparent.go isn't in this diff): the hard enforce-redirect listener deliberately ignores skip_hosts (good — it shouldn't be self-exemptable). Worth a one-line comment there stating the omission is intentional, so a future maintainer doesn't "fix the inconsistency" by adding a skip keyed on the agent-controlled SNI/Host and open a real bypass of the hard guard.
Areas reviewed: Go (skiphost, extproc, forwardproxy, session store, reloader, 3 mains) + tests, config, CLAUDE.md · Commits: 2, signed-off ✅ · CI: all green
Add the match-all guard and this is good to go.
Assisted-By: Claude Code
| // error identifying the first invalid pattern so misconfigurations | ||
| // surface at startup rather than at first request. An empty input is | ||
| // valid and yields a Matcher that matches nothing. | ||
| func New(patterns []string) (*Matcher, error) { |
There was a problem hiding this comment.
must-fix — New() rejects only empty / syntactically-invalid patterns; a bare * (or *.*, *.svc.cluster.local) compiles and is accepted (the test even constructs New([]string{"*"})). Because skip_hosts bypasses the entire pipeline (OBO token injection, policy, IBAC) and session recording, a skip_hosts: ["*"] silently turns the outbound egress guard into a no-op for matched traffic — and the matched value is agent-influenceable in the ext_proc / HTTP-forward paths (see the comment on extproc/server.go).
#496 added exactly this guard to the bypass matcher (reject * / /* / empty). Mirror it here: reject match-all / unbounded globs at construction so an operator cannot disable enforcement with a single wildcard. This is the one knob that turns skip_hosts from an ops convenience into an enforcement kill-switch.
There was a problem hiding this comment.
Done in e8e632d. skiphost.New now rejects *, **, and whitespace-only at construction with an error pointing operators at "remove the plugin from the pipeline if you mean to disable enforcement." Mirrors the ibac bypass guard from #496.
One scoping note: *.* and *.svc.cluster.local are NOT match-all under .-delimited gobwas/glob — empirically * matches one label only, so *.* only matches 2-label hosts and *.svc.cluster.local only matches the exact 4-label suffix. They're accepted; only * and ** are rejected. Added TestNew_AcceptsLeadingStar to lock that down so a future "stricter" rewrite doesn't break operator-typical FQDN patterns.
| // pctx=nil signals the response handlers (handleResponseHeaders, | ||
| // handleResponseBody) and the deferred RunFinish to no-op as well — | ||
| // all four phases are skipped consistently. See ListenerConfig.SkipHosts. | ||
| if s.SkipHosts.Match(pctx.Host) { |
There was a problem hiding this comment.
suggestion (security) — The skip match keys on pctx.Host, which on this path is the agent-supplied :authority (fallback host header) — i.e. the agent proposes the value that decides whether it's exempted from OBO/policy/IBAC. The CONNECT path (forwardproxy/server.go:710) is safe-by-construction (it dials the same r.Host it matched), but ext_proc trusts Envoy's authority handling.
Two asks: (a) document in config.go that ext_proc-mode skip trusts :authority so operators know the match value is only as trustworthy as Envoy's authority rewriting; and (b) emit a counter / structured log on each skip — a skipped host currently leaves no trace in /v1/sessions or abctl, so a successful self-exemption is invisible.
There was a problem hiding this comment.
Done in e8e632d.
(a) ListenerConfig.SkipHosts godoc now spells out the trust model per deployment shape — ext_proc keys on Envoy's :authority (agent-controlled), HTTP-forward keys on r.Host (agent-controlled), CONNECT keys on r.Host which IS the dial target so it's safe-by-construction. Explicit "do NOT list a destination here you'd want IBAC / token-exchange to deny on."
(b) Each skip now emits a slog.Info line carrying host, matched pattern, and method/path on every skip path (extproc handleOutbound + handleOutboundBody, forwardproxy handleRequest + handleConnect). Skipped operator config: MatchPattern(host) (string, bool) returns the raw operator-supplied pattern that matched, so the log entry attributes back to the specific skip_hosts line. No counter yet — wanted to keep the change scoped to "leave a trace" rather than build out a stats surface; happy to follow up if you want the counter too.
| // response-phase work. RunFinish is also skipped (no defer | ||
| // registered) because the pipeline never ran and has nothing to | ||
| // finalize. See ListenerConfig.SkipHosts for motivation. | ||
| skipped := s.SkipHosts.Match(pctx.Host) |
There was a problem hiding this comment.
suggestion — HTTP-forward skip matches r.Host, but the request is then forwarded to r.URL (s.Client.Do(r)), so a forged Host that diverges from the real URL host would skip-match yet dial elsewhere. Worth a test that feeds a spoofed Host / :authority diverging from the dial target (the exact self-exemption scenario) to lock the contract — currently only the matching-host happy path is covered, and neither the CONNECT skip (:710) nor the transparent path's non-skip is exercised.
There was a problem hiding this comment.
Done in e8e632d. New TestForwardProxy_SkipHosts_MatchesAgainstHostHeaderNotURL feeds a req.Host = 'infrastructure-only.example' while req.URL points at the real backend, asserts the skip fires (pipeline doesn't run, no session event recorded), and locks the contract that the match keys on the agent-supplied Host header.
Test scope is deliberately narrow: it asserts the skip-decision behavior, not what Go's http.Transport does to a request whose Host diverges from URL when going through a proxy (which turns out to be implementation-detail and not part of the trust contract).
Also added TestForwardProxy_SkipHosts_CONNECT_BypassesPipeline — drives a CONNECT manually, observes the 200 + tunnel echo round-trip, asserts no pipeline run and no session event. Closes the previously-uncovered CONNECT skip path.
For the transparent-path comment you flagged out-of-diff: added a paragraph to HandleTransparentConn's godoc in transparent.go (in-PR now) stating the SkipHosts omission is deliberate and explicitly warning future maintainers not to 'fix the inconsistency' by adding a skip keyed on agent-controlled SNI/Host bytes.
Adds construction-time guards, audit logging, and tests asked for in the PR kagenti#500 review. Skiphost matcher (authlib/listener/skiphost): - Reject match-all patterns ("*", "**", whitespace-only) at New() so a single misconfigured entry cannot silently disable the entire outbound enforcement pipeline. Mirrors the bypass-pattern guard added to ibac in kagenti#496. Empirically verified: under .-delimited glob "*" matches every single-label hostname (which is how every short Kubernetes service name reaches the listener), and "**" is the unambiguous match-all. Patterns like "*.svc.cluster.local" or "*.*" are NOT match-all and remain accepted. - Reject patterns containing ":" so an operator typo like "otel-collector:8335" fails fast at startup rather than silently never matching (Match strips the port from the incoming host before comparing, so colon-bearing patterns would never fire). - Add MatchPattern(host) (string, bool) so listeners can attribute audit logs / counters to the specific raw operator-supplied pattern that fired. Match() is now a thin wrapper around it for callers that only want the boolean. Listener skip-path audit logging: - Each successful skip in extproc (handleOutbound, handleOutboundBody) and forwardproxy (handleRequest, handleConnect) now emits a structured slog INFO line carrying host, matched pattern, and method/path. Without this trail, a successful self-exemption left no trace anywhere -- no session event, no plugin invocation -- because that is exactly what skip means. Trust-model documentation (authlib/config/config.go): - ListenerConfig.SkipHosts godoc now spells out that the matched value is agent-influenceable on the ext_proc and HTTP-forward paths (Envoy :authority / r.Host header) and only safe-by- construction on the CONNECT-tunnel path. Operators should not list a host they would want IBAC / token-exchange to deny on. Hard-guard reminder (authlib/listener/forwardproxy/transparent.go): - HandleTransparentConn (proxy-sidecar enforce-redirect mode) intentionally does NOT consult SkipHosts. Add a paragraph to its godoc stating the omission is deliberate so a future maintainer does not "fix the inconsistency" by adding a skip keyed on the agent-controlled SNI/Host bytes and open a real bypass of the hard egress guard. Tests: - New TestForwardProxy_SkipHosts_MatchesAgainstHostHeaderNotURL locks the trust-model contract for the HTTP forward path: the skip keys on the agent-supplied Host header, not on r.URL.Host. - New TestForwardProxy_SkipHosts_CONNECT_BypassesPipeline exercises the previously-uncovered CONNECT skip path end-to-end. - New TestNew_RejectsMatchAllStar / RejectsMatchAllDoubleStar / RejectsWhitespaceOnly / RejectsPortInPattern lock the construction- time guards, plus TestNew_AcceptsLeadingStar guards against a future rewrite that over-rejects and breaks operator-typical FQDN patterns. CLAUDE.md gotcha kagenti#11: - Narrow the LastIntent() wording: the pin protects against FIFO eviction only; expired or deleted sessions can still return nil. Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com> Signed-off-by: Kelly Abuelsaad <kaymar@gmail.com>
huang195
left a comment
There was a problem hiding this comment.
Re-Review Summary
All prior review items are addressed, and well:
- ✅ Must-fix — match-all guard added.
skiphost.Newnow rejects*,**, empty/whitespace, and port-bearing (:) patterns at construction with clear startup errors and a rationale comment mirroring #496. It correctly does not over-reject —*.*,*.svc.cluster.local,service-*still compile (not match-all under.-delimited glob). A single wildcard can no longer silently disable the outbound egress guard. - ✅ Audit trail on skip (prior suggestion). Both skip sites now emit
slog.Info(forward-proxy "bypassing pipeline + session recording"; CONNECT "opening tunnel without pipeline + recording"), andMatchPatternsurfaces which pattern matched — a self-exemption is no longer invisible. - ✅ Trust model documented.
config.gonow states the matched value is agent-influenceable on the cooperative paths, and that the transparent guard must not consultSkipHosts. - ✅ Transparent hard-guard protected.
transparent.gogained a comment that documents the deliberate omission and explicitly warns the next maintainer not to add aSkipHostscheck there — and no skip was actually added to the hard path (verified). - ✅ Substantial new tests on
skiphostand the forward-proxy skip paths.
The IBAC intent-pin (change B) was already verified correct in the first pass (latest-intent-only, maxEvents-bounded, predicate identical to LastIntent). Thorough turnaround on a security-sensitive bypass feature.
Author: kellyaa (MEMBER — maintainer) · Agent/IDE config (.claude/.vscode): none · Commits: 3, signed-off ✅ · CI: green (Go authlib still running at review time)
No blockers — approving.
Assisted-By: Claude Code
Summary
Fixes IBAC intent eviction in chatty agents. When an agent generates many outbound events per turn (e.g., dozens of OpenTelemetry exports), the per-session FIFO event buffer overflows and silently evicts the inbound A2A user intent that IBAC needs to align tool calls. After eviction,
LastIntent()returns nil, IBAC falls through to theno_user_contextskip path withsub_reason: no_intent, and every tool call after the first is allowed without LLM-judged alignment.Two layered defenses, one per commit:
Add listener.skip_hosts to bypass infrastructure traffic— primary fix. Newlistener.skip_hostsconfig field accepts dot-delimited globs (same matcher asauthproxy-routes); matched outbound destinations are forwarded as a transparent proxy with no pipeline run and no session recording. Wired into both proxy-sidecar (HTTP forward + reverse proxy and CONNECT-tunnel paths) and envoy-sidecar (ext_proc) deployments. Required switching the reloader'slistener.*change-detection from struct==toreflect.DeepEqualsinceListenerConfignow contains a slice.Pin most-recent A2A intent against FIFO eviction— defense-in-depth backstop. The session store eviction policy now pins the most-recent inbound A2A request event so it survives FIFO trimming even withskip_hostsempty. Only the latest intent is pinned (older intents from prior turns evict normally), so a multi-turn conversation with huge fan-out cannot starve the buffer with stale intents. The intent predicate matchesSessionView.LastIntentexactly so the eviction policy and the consumer view cannot drift apart.Reproducer (GSM8K agent in team1): pre-fix,
/v1/sessions/{id}showed 100 events with 0 inbound A2A request events — 85 of the 100 were OTel exports that had pushed the user intent out of the FIFO. Post-fix (verified locally on the kind cluster), the user intent persists across the full session.Test plan
go test ./...passes acrossauthlib/(extproc, forwardproxy, session, reloader, skiphost — no failures)authbridge-proxy,authbridge-envoy,authbridge-lite) build cleanauthbridge:latest, retagged to the operator's expected ref (ghcr.io/kagenti/kagenti-extensions/authbridge:v0.6.0-alpha.7), loaded into kind, and verified bothteam1/exgentic-a2a-tool-calling-{gsm8k,tau2}pods picked up the new image and started cleanlymaxEventsLastIntent()resolves to the user message on every tool call (no moreno_intentskip rows in/v1/sessions)Assisted-By: Claude (Anthropic AI) noreply@anthropic.com
Summary by CodeRabbit
Release Notes
New Features
listener.skip_hostsconfiguration to bypass the pipeline and session recording for specified outbound hosts using glob pattern matching.Tests