feat: Add CPEX plugin#493
Conversation
Integrate the CPEX policy framework as an authbridge plugin (authlib/plugins/cpex). The plugin is a generic bridge: it maps AuthBridge requests to CMF and routes inbound/outbound hooks through CPEX policy chains expressed in the APL DSL plus named CPEX plugins, over the CPEX FFI. Adds the authbridge-cpex proxy-sidecar build that links the pinned CPEX release. Includes an end-to-end hr-cpex demo (HR MCP server, Keycloak realm, agent, and allow/deny/redact scenarios) whose policy exercises the named CPEX plugins: Cedar authorization, PII scanning/redaction, JWT identity, and OAuth token exchange (RFC 8693). Documents the AuthBridge hook system and the CPEX integration. Co-Authored-By: Teryl Taylor <terylt@ibm.com> Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com>
Addresses two findings from the security review of the CPEX plugin (commit a150c84). Finding 1 (host-bypass): the cpex plugin skipped the entire policy chain when the HTTP Host matched bypass_hosts. On the inbound reverse-proxy phase the Host is attacker-controlled and identity is resolved inside the very chain being bypassed, so a spoofed Host defeated all enforcement. Honor bypass_hosts on the outbound phase only; ignore it inbound (path bypass still applies) and warn at Configure time since direction is unknown there. Finding 2 (spoofable session taint): session-scoped taint labels were keyed off the client-supplied X-Session-Id / A2A contextId, letting one principal read or pollute another principal's session state. Fixed in the CPEX framework (session_resolver subject-binds Tier 0/1 as sha256(subject_id : value)); bump the FFI to v0.2.0-ffi.test.8 and the cpex Go bindings to match. Add demos/hr-cpex scenario 09 proving cross-principal isolation (eve taints a session; bob reusing her session id is not blocked) and extend the scenarios glob to include it. Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com>
…e agent; addded readme to cpex plugin Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds the CPEX plugin surface, manager implementations (cpex and stub), CMF body mapping and request/response rewrite helpers with tests, identity/delegation enrichments, CI/Docker build wiring for an authbridge-cpex image, and a full HR demo (agent, MCP server, Kubernetes manifests, scenarios, and docs). ChangesCPEX Plugin and Demo
Sequence Diagram(s)sequenceDiagram
participant Client
participant AuthBridge
participant CPEXManager
participant rcpexPluginManager
Client->>AuthBridge: Send request (MCP / Inference / A2A)
AuthBridge->>CPEXManager: build CMF payload + Invoke(hook, pctx)
CPEXManager->>rcpexPluginManager: InvokeResolved or InvokeByName
rcpexPluginManager-->>CPEXManager: PipelineResult (+mods)
CPEXManager->>AuthBridge: mapped Result + extension/body mods
AuthBridge->>Client: Continue / Deny / Mutate response
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 13
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
authbridge/demos/hr-cpex/agent/Dockerfile (1)
11-27:⚠️ Potential issue | 🟠 Major | ⚡ Quick winContainer runs as root and does not enforce UID/GID 1000.
This image currently defaults to root. Set explicit UID/GID 1000 and switch runtime user.
Suggested hardening patch
FROM python:3.12-slim WORKDIR /app @@ COPY requirements.txt /app/requirements.txt RUN pip install --no-cache-dir -r requirements.txt COPY agent.py /app/agent.py +RUN groupadd -g 1000 app && useradd -u 1000 -g 1000 -m app && chown -R 1000:1000 /app +USER 1000:1000 @@ CMD ["python", "agent.py"]As per coding guidelines, "Python Dockerfile must use UID/GID 1000 and this MUST match the runAsUser/runAsGroup values set by the operator's webhook."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/demos/hr-cpex/agent/Dockerfile` around lines 11 - 27, The Dockerfile currently runs as root; add steps to create a user and group with UID/GID 1000 (e.g., addgroup --gid 1000 and adduser --uid 1000 --gid 1000 or equivalent), chown /app to that user, and switch to it using the USER instruction before CMD so the container runs as UID/GID 1000; keep existing WORKDIR, COPY, EXPOSE and CMD intact and ensure file ownership is updated after COPY (refer to the Dockerfile's RUN, COPY, and CMD instructions to locate where to insert these changes).Source: Coding guidelines
authbridge/demos/hr-cpex/k8s/20-hr-mcp.yaml (1)
44-63:⚠️ Potential issue | 🟠 Major | ⚡ Quick winContainer securityContext is missing baseline hardening.
The pod currently relies on permissive defaults. Add container-level hardening at least for privilege escalation/capabilities/seccomp.
Suggested hardening patch
containers: - name: hr-mcp image: hr-mcp:dev imagePullPolicy: Never + securityContext: + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + seccompProfile: + type: RuntimeDefault ports: - name: http containerPort: 9100🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/demos/hr-cpex/k8s/20-hr-mcp.yaml` around lines 44 - 63, Add a container-level securityContext for the hr-mcp container to enforce baseline hardening: set allowPrivilegeEscalation: false, privileged: false, runAsNonRoot: true (and optionally runAsUser: non-root UID), readOnlyRootFilesystem: true, drop all capabilities via capabilities.drop: ["ALL"], and set seccompProfile.type: "RuntimeDefault" (or "Localhost" with a provided profile) under the container named "hr-mcp" so the pod no longer relies on permissive defaults.Source: Linters/SAST tools
authbridge/demos/hr-cpex/hr-mcp-server/Dockerfile (1)
1-13:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd USER directive to comply with UID/GID 1000 requirement.
The Dockerfile runs as root (default UID 0), violating the coding guideline that "Python Dockerfile must use UID/GID 1000 and this MUST match the runAsUser/runAsGroup values set by the operator's webhook." Add a USER directive before CMD.
🔒 Proposed fix
COPY server.py /app/server.py +USER 1000:1000 + EXPOSE 9100 CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "9100"]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/demos/hr-cpex/hr-mcp-server/Dockerfile` around lines 1 - 13, Add a non-root USER with UID and GID 1000 to the Dockerfile so the container runs as the required UID/GID and matches the operator webhook's runAsUser/runAsGroup; update the Dockerfile after the COPY steps and before CMD (the block containing CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "9100"]) to create or switch to a user/group with numeric ID 1000 (ensure ownership of /app as needed) and add a USER directive referencing that UID:GID.Source: Coding guidelines
🟡 Minor comments (8)
authbridge/cmd/authbridge-cpex/Dockerfile-107-107 (1)
107-107:⚠️ Potential issue | 🟡 MinorClarify UID expectations for authbridge-cpex vs existing combined images
authbridge-cmd/authbridge-proxyandauthbridge-cmd/authbridge-liteboth useUSER 1001, while onlyauthbridge-cmd/authbridge-envoyusesUSER 1337. Given that, the claim that the “combined authbridge container and Envoy proxy” should use UID1337doesn’t match the current Dockerfile pattern. Keepauthbridge/cmd/authbridge-cpex/DockerfileonUSER 1001(like the proxy/lite combined images) and update the documentation/learned note to specify that1337applies toauthbridge-envoyonly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/cmd/authbridge-cpex/Dockerfile` at line 107, Update the documentation/note about UID usage to reflect the actual Dockerfile pattern: keep authbridge/cmd/authbridge-cpex/Dockerfile using USER 1001 (matching authbridge-cmd/authbridge-proxy and authbridge-cmd/authbridge-lite) and change the learned-note/README text to state that UID 1337 is specific to authbridge-cmd/authbridge-envoy only; adjust any wording that currently claims the "combined authbridge container and Envoy proxy" should use 1337 so it accurately distinguishes authbridge-envoy (1337) from the other combined images (1001).Source: Learnings
authbridge/cmd/authbridge-cpex/entrypoint.sh-2-2 (1)
2-2:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd
pipefailto shell strict mode.The coding guideline requires shell scripts to use
set -euo pipefailfor strict error handling. This script usesset -euwhich is missing thepipefailflag. Withoutpipefail, errors in the middle of a pipeline (e.g.,cmd1 | cmd2) won't cause the script to exit.🛡️ Proposed fix
-set -eu +set -euo pipefail🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/cmd/authbridge-cpex/entrypoint.sh` at line 2, The script's strict mode uses "set -eu" which omits pipefail; update the entrypoint shell strict mode by changing the options to include "pipefail" so the invocation uses "set -euo pipefail" (i.e., modify the existing set invocation in entrypoint.sh that currently reads set -eu to include pipefail) to ensure failures in pipelines cause the script to exit.Source: Coding guidelines
authbridge/authlib/plugins/jwtvalidation/identity.go-68-72 (1)
68-72:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winHandle nil claims consistently in
AuthMethod().
AuthMethod()reports"jwt"even when the adapter has no claims, while the other accessors treat nil as absent identity. That can incorrectly signal authenticated JWT context to downstream consumers.Proposed fix
func (i claimsIdentity) AuthMethod() string { - return "jwt" + if i.c == nil { + return "" + } + return "jwt" }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/authlib/plugins/jwtvalidation/identity.go` around lines 68 - 72, claimsIdentity.AuthMethod() always returns "jwt" even when the adapter has no claims; change it to mirror the other accessors by returning an empty string when i.claims is nil and only return "jwt" when i.claims != nil. Locate the claimsIdentity type and the AuthMethod() method and add a nil-check on i.claims so downstream callers don't receive a false-authenticated signal.authbridge/authlib/plugins/cpex/headers.go-35-42 (1)
35-42:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix contradictory
Authorizationheader documentation.Line 80 says
Authorizationis dropped, but Line 35-42 (and the code path) preserve it. This contradiction in a security-boundary comment can cause incorrect future edits.Proposed comment-only fix
-// Sensitive headers (Authorization, Cookie, X-Api-Key, …) are +// Sensitive headers (Cookie, X-Api-Key, …) are // dropped via secretHeaderPrefixes / secretHeaderExact, NOT silently // truncated.Also applies to: 80-82
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/authlib/plugins/cpex/headers.go` around lines 35 - 42, The file has contradictory comments about the Authorization header: one comment block (the "NOTE on `Authorization`" near the top of headers.go) explains the header is deliberately preserved, but another comment later says Authorization is dropped; update the later comment to match the actual behavior (preserve the Authorization header) so documentation aligns with the code path that retains the bearer token used by CPEX identity/jwt plugins; search for the string "Authorization" in headers.go and replace the inaccurate "dropped" wording with a clear statement that the header is preserved (or vice-versa only if you also change the code path) and ensure both comment blocks consistently describe the same security behavior.authbridge/demos/hr-cpex/README.md-103-103 (1)
103-103:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd language tags to fenced code blocks.
These fences are missing language identifiers; adding
text/bash/jsonwhere appropriate will satisfy markdown lint and keep rendering consistent.Also applies to: 225-225, 233-233, 266-266, 330-330
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/demos/hr-cpex/README.md` at line 103, Multiple fenced code blocks in the README are missing language identifiers; update each triple-backtick fence (the ones currently just ``` ) to include the correct language tag (e.g., ```text, ```bash, or ```json) so Markdown linting and rendering are correct—find the plain ``` blocks in the HR CPEX README and replace them with the appropriate tagged fences, ensuring consistency for examples, CLI commands, and JSON snippets.Source: Linters/SAST tools
authbridge/demos/hr-cpex/scenarios/04-alice-internal-allow.sh-29-41 (1)
29-41:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAssert the expected 200 instead of swallowing failures.
curl -s ... | jq ... || trueexits 0 for proxy/transport failures and unexpected responses, so this "allow" scenario can look green even when the request never succeeds. Capture the status code and fail unless it is 200 before pretty-printing the body.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/demos/hr-cpex/scenarios/04-alice-internal-allow.sh` around lines 29 - 41, Update the curl invocation in the 04-alice-internal-allow.sh scenario so it captures and checks the HTTP status code instead of swallowing failures; call curl (the existing line that uses PROXY, MCP_TARGET, CLIENT, ALICE and pipes to jq) with options to write the response body to a temporary file and output the status code (e.g., -s -o <tempfile> -w "%{http_code}"), then test that the captured code equals 200 and exit non‑zero if not, only then pretty‑print the saved body with jq; ensure the script fails on non‑200 responses rather than using the current `| jq . 2>/dev/null || true` pattern.authbridge/demos/hr-cpex/verify-token-exchange.sh-96-103 (1)
96-103:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winCheck whether
audcontainsworkday-api, not whether it is first.JWT
audcan be an array. The current jq reads index0, so a valid exchanged token will be reported as broken wheneverworkday-apiis present but not first.Suggested fix
-decoded=$(printf "%s" "$payload" | tr '_-' '/+' | base64 -d 2>/dev/null || true) -aud=$(echo "$decoded" | jq -r 'if .aud | type == "array" then .aud[0] else .aud end' 2>/dev/null || echo "?") +decoded=$(printf "%s" "$payload" | tr '_-' '/+' | base64 -d 2>/dev/null || true) +aud=$(echo "$decoded" | jq -cr '.aud // "?"' 2>/dev/null || echo "?") sub=$(echo "$decoded" | jq -r '.sub // "?"' 2>/dev/null || echo "?") -if [ "$aud" = "$AUDIENCE" ]; then +if echo "$decoded" | jq -e --arg aud "$AUDIENCE" ' + (.aud | if type == "array" then . else [.] end) | index($aud) != null +' >/dev/null 2>&1; then ok "cpex-gateway → workday-api" "OK (aud=$aud)" else fail "cpex-gateway → workday-api" "aud mismatch: expected '$AUDIENCE' got '$aud'" exit 1 fi🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/demos/hr-cpex/verify-token-exchange.sh` around lines 96 - 103, The check currently reads aud as the first array element (variable aud) which fails when the JWT aud is an array containing "workday-api" not at index 0; update the extraction/validation to detect membership instead of only index 0 by changing the jq expression that sets aud/does the comparison: parse "decoded" with jq to test whether .aud equals the string or contains the expected value (AUDIENCE) when .aud is an array, then set a boolean or normalized value and compare that to AUDIENCE (referencing the variables aud, AUDIENCE and the decoded JWT handling in verify-token-exchange.sh) so the script passes when "workday-api" appears anywhere in the aud array.authbridge/demos/hr-cpex/scenarios/06-bob-apl-deny.sh-29-41 (1)
29-41:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAvoid
curl | headunderpipefailin scenario output.This can intermittently fail the script when
headexits early andcurlreceives SIGPIPE, even if the request itself succeeded.Suggested fix
-curl -s -x "$PROXY" -X POST "$MCP_TARGET" \ +resp="$(curl -s -x "$PROXY" -X POST "$MCP_TARGET" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer $CLIENT" \ -H "X-User-Token: $BOB" \ -d '{ "jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": { "name": "search_repos", "arguments": { "visibility": "internal" } } - }' -i 2>&1 | head -20 + }' -i 2>&1)" +printf '%s\n' "$resp" | sed -n '1,20p'As per coding guidelines,
**/*.shscripts run withset -euo pipefail, so SIGPIPE-sensitive truncation pipelines are brittle.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/demos/hr-cpex/scenarios/06-bob-apl-deny.sh` around lines 29 - 41, The pipeline in 06-bob-apl-deny.sh currently ends with "| head -20" which can send SIGPIPE to curl under set -euo pipefail; change the pattern so curl writes its response to a file (use mktemp or curl --output) and then run head -20 on that file (or cat the file and remove it) instead of piping directly to head; update the curl invocation in the script to save to a temporary file and then print the first 20 lines from that file to avoid terminating curl early.Source: Coding guidelines
🧹 Nitpick comments (8)
authbridge/authlib/plugins/cpex/README.md (2)
33-42: 💤 Low valueAdd language identifier to fenced code block.
The fenced code block should specify a language identifier for proper syntax highlighting and rendering.
📝 Suggested fix
-``` +```text request ─► token-exchange ─► parsers ─► cpex ─────────► upstream ─► response (MCP, │ inference, │ context🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/authlib/plugins/cpex/README.md` around lines 33 - 42, The fenced diagram in authbridge/authlib/plugins/cpex/README.md is missing a language identifier which prevents proper syntax highlighting; update the opening and closing fences for that code block from triple backticks to include a language tag (for example use "text" like ```text) so the diagram renders correctly and ensure the closing fence remains ```; locate the block by the diagram content starting with "request ─► token-exchange ─► parsers ─► cpex" to make the change.Source: Linters/SAST tools
188-206: 💤 Low valueAdd language identifier to fenced code block.
The fenced code block should specify a language identifier for proper syntax highlighting.
📝 Suggested fix
-``` +```text outbound: mcp-parser ──► cpex ──► (token exchange happens as part of the policy via delegate)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/authlib/plugins/cpex/README.md` around lines 188 - 206, The README's fenced code blocks lack language identifiers which prevents proper syntax highlighting; update each triple-backtick block (the flow line and the YAML block showing pipeline configuration) to include appropriate language tags (e.g., ```text for the flow diagram and ```yaml for the pipeline block) so the outbound pipeline snippet (showing mcp-parser and cpex with config_file and hooks like on_request/on_response and fail_open) renders with correct highlighting.Source: Linters/SAST tools
docs/proposals/authbridge-hooks.md (1)
8-12: 💤 Low valueFix heading level increment.
Heading levels should increment by one level at a time. Line 8 jumps from h1 (line 1) to h3, skipping h2.
📝 Suggested fix
-### How it works +## How it works🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/proposals/authbridge-hooks.md` around lines 8 - 12, Update the heading "How it works" to the correct level so heading levels increment by one (change the "### How it works" header to "## How it works"); locate the header text "How it works" in the document and adjust its Markdown prefix to a single additional pound sign so the document goes from h1 to h2 sequentially.Source: Linters/SAST tools
authbridge/docs/cpex-plugin.md (1)
14-43: 💤 Low valueAdd language identifier to fenced code block.
The ASCII art diagram should have a language identifier for proper rendering.
📝 Suggested fix
-``` +```text ┌──────────────────────────────────────────────────────────────────┐ │ authbridge-cpex (binary built with -tags cpex, links libcpex_ffi)│🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/docs/cpex-plugin.md` around lines 14 - 43, In the fenced code block that contains the ASCII art diagram in cpex-plugin.md (the block starting with ``` and showing the authbridge-cpex pipeline and CPEX runtime diagram), add a language identifier (e.g., "text") immediately after the opening backticks so the block becomes ```text; this will ensure proper rendering of the ASCII art.Source: Linters/SAST tools
authbridge/authlib/plugins/cpex/manager_cpex.go (1)
71-82: Initialize cancellation behavior is intentional but may leak resources.The
Initializemethod returnsctx.Err()on cancellation without callingShutdown, which is documented as intentional to avoid deadlock. However, this means thePluginManagerresources remain allocated until a laterShutdowncall or GC finalization.The TODO comment (line 68-70) acknowledges this should be improved when the rcpex bindings gain context-aware initialization. This is an acceptable tradeoff for now, but operators should be aware that cancelled initialization may leave resources allocated temporarily.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/authlib/plugins/cpex/manager_cpex.go` around lines 71 - 82, The Initialize method can block trying to send the result to done after ctx cancellation; change the goroutine that calls c.mgr.Initialize() so it performs a non-blocking send into the done channel (use select { case done <- err: default: }) instead of an unconditional send, so the goroutine cannot block if the caller already returned on ctx.Done(); reference the cpexManager.Initialize function, the done channel and the c.mgr.Initialize() call and keep the channel buffered as-is.authbridge/authlib/plugins/cpex/cmf_inference_test.go (1)
127-156: ⚡ Quick winAdd a multi-choice response rewrite test case.
Please add a case with two
choices[].message.contententries to lock in safe behavior (fail-closed or positional mapping) and prevent silent choice-collapsing regressions.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/authlib/plugins/cpex/cmf_inference_test.go` around lines 127 - 156, Add a new unit test (e.g., TestInferenceResponseBodyMod_RewritesMultipleChoices) that constructs a pipeline.Context with ResponseBody containing two choices entries (choices[0].message.content and choices[1].message.content) with distinct values, call applyInferenceResponseBodyMod(pctx, "redacted"), assert the call returns mutated==true and err==nil, json.Unmarshal pctx.ResponseBody into a map and assert the decoded "choices" slice length remains 2 and that both choices' message.content values were replaced with "redacted" (preserving order), to guard against silent collapsing or positional mapping regressions; reference applyInferenceResponseBodyMod, pipeline.Context and ResponseBody when locating where to add the test.authbridge/demos/hr-cpex/k8s/10-keycloak.yaml (1)
54-110: ⚖️ Poor tradeoffConsider adding securityContext for production-like hardening (optional for demo).
The Keycloak container lacks a
securityContextwithrunAsNonRoot,allowPrivilegeEscalation: false, and explicit UID/GID. For a demo environment this is acceptable, but if the demo is used as a reference for production deployments, adding these fields would demonstrate security best practices.🛡️ Example securityContext (if desired)
spec: containers: - name: keycloak image: quay.io/keycloak/keycloak:26.6.3 + securityContext: + runAsNonRoot: true + runAsUser: 1000 + runAsGroup: 1000 + allowPrivilegeEscalation: false + capabilities: + drop: + - ALL args:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/demos/hr-cpex/k8s/10-keycloak.yaml` around lines 54 - 110, Add a container-level securityContext to the Keycloak container (container name "keycloak") to harden the demo: set runAsNonRoot: true, specify runAsUser and runAsGroup to a non-root UID/GID, set allowPrivilegeEscalation: false, and enable readOnlyRootFilesystem where compatible; place this securityContext alongside the existing ports/volumeMounts/readinessProbe in the pod spec so the container runs unprivileged while preserving the existing KC_* env vars and volume mount for /opt/keycloak/data/import.authbridge/demos/hr-cpex/scenarios/_lib.sh (1)
70-113: ⚡ Quick winBuild request JSON with
jq -ninstead of raw heredoc interpolation.Interpolating unescaped shell strings into JSON will break requests when values contain quotes/newlines (e.g., subject/body overrides).
Suggested refactor (example for
call_send_email)- local body - body=$(cat <<EOF -{ - "jsonrpc": "2.0", - "id": 1, - "method": "tools/call", - "params": { - "name": "send_email", - "arguments": { - "to": "$to", - "subject": "$subject", - "body": "$body_text" - } - } -} -EOF - ) + local body + body="$(jq -cn \ + --arg to "$to" \ + --arg subject "$subject" \ + --arg body "$body_text" \ + '{ + jsonrpc: "2.0", + id: 1, + method: "tools/call", + params: { + name: "send_email", + arguments: {to: $to, subject: $subject, body: $body} + } + }')"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/demos/hr-cpex/scenarios/_lib.sh` around lines 70 - 113, Replace the raw heredoc JSON interpolation with constructing the request body via jq -n to safely escape values; for call_send_email (and the preceding get_compensation caller) use jq -n with --arg for strings (to, subject, body, ssn, employee_id) and --argjson for booleans/numbers (include_ssn) to build the top-level JSON object, assign that output to the local body variable, then pass it to _post_tool and _print_response as before; keep using _post_tool "$user_token" "$client_token" "$body" to send the request.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@authbridge/authlib/plugins/cpex/cmf_a2a.go`:
- Around line 97-114: The write-back selection currently includes all parts with
po["kind"] == "text" which can include empty text parts, causing count-drift and
wrong-part rewrites; update the selection logic in cmf_a2a.go (the block
building targets and the similar block at the second occurrence) to only append
parts where kind == "text" AND the text payload is non-empty (e.g., check
po["text"] as a string and ensure it is not empty) so it matches the
a2aToCMFParts emit contract; ensure both places use the same non-empty-text
predicate before appending to targets and before comparing len(targets) to
len(newTexts).
In `@authbridge/authlib/plugins/cpex/cmf_inference.go`:
- Around line 25-27: The write path currently overwrites multiple choices into a
single newCompletion, collapsing distinct completions; change the code that
writes cmf.llm_output/newCompletion to detect when the read path produced
multiple choices (n > 1) and fail-closed (return an error) or else preserve all
completions by accepting and storing a slice of completions rather than a single
newCompletion; update any code paths that construct or persist newCompletion
(and the analogous logic around the region that mirrors 162-190) to either
return an error for multi-choice responses or to accept []newCompletion and
persist each entry without collapsing them.
In `@authbridge/authlib/plugins/cpex/plugin.go`:
- Around line 333-349: The handler currently forwards internal error text to
clients by passing err.Error() into denyWithStatus from CPEX.handleInvokeError
(and the other deny path around the similarly named call), which can leak
internals; instead, log the full error locally (use slog.Error or pctx.Observe
with the detailed err) and pass a generic client-facing reason (e.g. "internal
cpex error" or "cpex error") into denyWithStatus while leaving the detailed
message out of the returned violation; update both uses (the call in
handleInvokeError and the other denyWithStatus call in the later deny path) to
follow this pattern.
In `@authbridge/cmd/authbridge-cpex/Dockerfile`:
- Line 98: Replace the unpinned runtime base image reference FROM
debian:trixie-slim with a digest-pinned form (e.g., FROM debian@sha256:...) so
the runtime stage is reproducible and secure; obtain the current digest by
running docker pull debian:trixie-slim and docker inspect ... | grep -A1
RepoDigests, then update the Dockerfile's FROM debian:trixie-slim line to use
the reported sha256 digest.
In `@authbridge/cmd/authbridge-cpex/main.go`:
- Around line 300-306: The Shutdown calls on httpServers, statSrv, and
sessionAPISrv currently ignore returned errors; update the shutdown sequence to
capture each error from srv.Shutdown(shutdownCtx), statSrv.Shutdown(shutdownCtx)
and sessionAPISrv.Shutdown(shutdownCtx), then handle or report them (e.g., log
via your logger or aggregate into a combined error and return/non-zero exit) so
failures during graceful termination aren't discarded; ensure you still attempt
to shut down remaining servers even if one Shutdown returns an error and include
shutdownCtx in the log for context.
- Line 104: The code in authbridge-cpex's main (replace usages in
authbridge/cmd/authbridge-cpex/main.go) currently calls log.Fatal/log.Fatalf and
ignores errors from srv.Shutdown, statSrv.Shutdown, and sessionAPISrv.Shutdown;
change all log.Fatal* sites to use slog (structured logging) with descriptive
fields and non-abrupt exit flow, remove direct use/import of log, and implement
a coordinated shutdown sequence in main that cancels the shutdown context, calls
srv.Shutdown(shutdownCtx), statSrv.Shutdown(shutdownCtx), and
sessionAPISrv.Shutdown(shutdownCtx), captures and logs each returned error via
slog (e.g., slog.Error/Info with error field), and ensure the program exits only
after shutdown completion (using os.Exit from main if necessary) rather than
from goroutines. Ensure all failing initialization paths (previously using
log.Fatal) return an error up to main or call slog and trigger the same
coordinated shutdown/exit path.
In `@authbridge/demos/hr-cpex/agent/agent.py`:
- Around line 316-320: The agent keeps mutable shared state (HRAgent._histories
and HRAgent._request_id) without synchronization while dispatching turns
concurrently (the dispatch turns code around lines 442-448), causing races; make
these accesses thread-safe by adding a lock (e.g., self._state_lock =
threading.Lock() in HRAgent.__init__) and surround every read/modify/write of
_histories and _request_id with the lock (including where request IDs are
incremented and where histories are appended/modified in the dispatch/concurrent
worker code). Ensure all methods that touch these symbols acquire/release the
lock (or use context manager) so increments and history updates are atomic.
- Around line 299-305: The MCP HTTP call can raise httpx transport exceptions
which currently escape; wrap the call to client.post(...) inside a try/except
that catches httpx.RequestError (and optionally
httpx.TimeoutException/Exception) and return a structured tool error instead of
letting the exception bubble: on exception, set resp-like values such as
status_code=502 (or another appropriate code) and data={"error": str(e)} (or
{"text": resp.text} when successful), ensuring you only call resp.json() if resp
was obtained; update the block around MCP_PROXY/MCP_URL/payload/headers to
return (status_code, data) for both success and transport failures.
In `@authbridge/demos/hr-cpex/agent/requirements.txt`:
- Line 22: The requirements pin litellm==1.83.0 which is in known vulnerable
ranges; update the dependency to at least 1.83.7 (preferably a newer stable
release) in requirements.txt and regenerate your lock/install to ensure the
patched package is used. Verify code paths using litellm (notably calls to
litellm.completion(...) in agent.py) continue to work after the upgrade and
confirm any proxied calls (httpx.Client(proxy=MCP_PROXY) and MCP tool usage)
remain functional with the newer litellm version.
In `@authbridge/demos/hr-cpex/hr-mcp-server/requirements.txt`:
- Around line 1-2: The requirements pin currently uses fastapi==0.115.4 which
pulls a vulnerable starlette version; update the FastAPI pin in requirements.txt
to a newer release (for example fastapi==0.136.3 or later) so that dependency
resolution pulls a non-vulnerable starlette (e.g., 1.x), leaving or keeping
uvicorn[standard]==0.32.0 unchanged; modify the fastapi version token in the
file (the line containing "fastapi==0.115.4") to the chosen secure version and
run dependency resolution (pip-compile/poetry lock or pip install) to verify
starlette is upgraded.
In `@authbridge/demos/hr-cpex/k8s/30-agent.yaml`:
- Around line 100-218: Add a pod- and container-level security context to harden
the Deployment: under spec.template.spec set a podSecurityContext (e.g.,
runAsNonRoot: true, runAsUser: 1000, fsGroup: 1000) and for each container
(names "agent" and "authbridge-cpex") add securityContext entries that disable
privilege escalation (allowPrivilegeEscalation: false), drop capabilities
(capabilities: drop: ["ALL"]), enforce non-root (runAsNonRoot: true) and make
the root filesystem read-only where possible (readOnlyRootFilesystem: true);
ensure these keys are added alongside existing env/ports/volumeMounts so the
containers run with minimal privileges.
In `@authbridge/demos/hr-cpex/k8s/realm-export.json`:
- Around line 35-37: The realm export currently contains hardcoded reusable
secrets (user password entries in "credentials" arrays with "type":"password"
and "value":..., and client secret values), which must not be committed; remove
the plaintext "value" fields for demo users and client secrets (or replace them
with non-sensitive placeholders like "<REPLACE_WITH_SECRET>" or
environment-derived tokens) and, if needed, set user credentials as temporary or
omit the credentials block so installers generate credentials on import. Update
all occurrences referenced (user credential objects and confidential-client
secret entries) instead of committing real passwords/secrets.
In `@authbridge/demos/hr-cpex/Makefile`:
- Line 22: The Makefile exposes NAMESPACE but several kubectl apply/delete
commands still reference the hardcoded "cpex-demo", which breaks overrides;
update the Makefile so all manifest application and deletion commands use the
NAMESPACE variable (replace literal "cpex-demo" occurrences in the targets that
run kubectl apply/delete and kubectl -n ... with $(NAMESPACE)), or alternatively
remove the variable override and hard-lock NAMESPACE to cpex-demo consistently;
ensure every reference in the apply/delete targets, including the sections
currently using cpex-demo (around the manifest application and deletion blocks),
is changed so the namespace is consistent.
---
Outside diff comments:
In `@authbridge/demos/hr-cpex/agent/Dockerfile`:
- Around line 11-27: The Dockerfile currently runs as root; add steps to create
a user and group with UID/GID 1000 (e.g., addgroup --gid 1000 and adduser --uid
1000 --gid 1000 or equivalent), chown /app to that user, and switch to it using
the USER instruction before CMD so the container runs as UID/GID 1000; keep
existing WORKDIR, COPY, EXPOSE and CMD intact and ensure file ownership is
updated after COPY (refer to the Dockerfile's RUN, COPY, and CMD instructions to
locate where to insert these changes).
In `@authbridge/demos/hr-cpex/hr-mcp-server/Dockerfile`:
- Around line 1-13: Add a non-root USER with UID and GID 1000 to the Dockerfile
so the container runs as the required UID/GID and matches the operator webhook's
runAsUser/runAsGroup; update the Dockerfile after the COPY steps and before CMD
(the block containing CMD ["uvicorn", "server:app", "--host", "0.0.0.0",
"--port", "9100"]) to create or switch to a user/group with numeric ID 1000
(ensure ownership of /app as needed) and add a USER directive referencing that
UID:GID.
In `@authbridge/demos/hr-cpex/k8s/20-hr-mcp.yaml`:
- Around line 44-63: Add a container-level securityContext for the hr-mcp
container to enforce baseline hardening: set allowPrivilegeEscalation: false,
privileged: false, runAsNonRoot: true (and optionally runAsUser: non-root UID),
readOnlyRootFilesystem: true, drop all capabilities via capabilities.drop:
["ALL"], and set seccompProfile.type: "RuntimeDefault" (or "Localhost" with a
provided profile) under the container named "hr-mcp" so the pod no longer relies
on permissive defaults.
---
Minor comments:
In `@authbridge/authlib/plugins/cpex/headers.go`:
- Around line 35-42: The file has contradictory comments about the Authorization
header: one comment block (the "NOTE on `Authorization`" near the top of
headers.go) explains the header is deliberately preserved, but another comment
later says Authorization is dropped; update the later comment to match the
actual behavior (preserve the Authorization header) so documentation aligns with
the code path that retains the bearer token used by CPEX identity/jwt plugins;
search for the string "Authorization" in headers.go and replace the inaccurate
"dropped" wording with a clear statement that the header is preserved (or
vice-versa only if you also change the code path) and ensure both comment blocks
consistently describe the same security behavior.
In `@authbridge/authlib/plugins/jwtvalidation/identity.go`:
- Around line 68-72: claimsIdentity.AuthMethod() always returns "jwt" even when
the adapter has no claims; change it to mirror the other accessors by returning
an empty string when i.claims is nil and only return "jwt" when i.claims != nil.
Locate the claimsIdentity type and the AuthMethod() method and add a nil-check
on i.claims so downstream callers don't receive a false-authenticated signal.
In `@authbridge/cmd/authbridge-cpex/Dockerfile`:
- Line 107: Update the documentation/note about UID usage to reflect the actual
Dockerfile pattern: keep authbridge/cmd/authbridge-cpex/Dockerfile using USER
1001 (matching authbridge-cmd/authbridge-proxy and
authbridge-cmd/authbridge-lite) and change the learned-note/README text to state
that UID 1337 is specific to authbridge-cmd/authbridge-envoy only; adjust any
wording that currently claims the "combined authbridge container and Envoy
proxy" should use 1337 so it accurately distinguishes authbridge-envoy (1337)
from the other combined images (1001).
In `@authbridge/cmd/authbridge-cpex/entrypoint.sh`:
- Line 2: The script's strict mode uses "set -eu" which omits pipefail; update
the entrypoint shell strict mode by changing the options to include "pipefail"
so the invocation uses "set -euo pipefail" (i.e., modify the existing set
invocation in entrypoint.sh that currently reads set -eu to include pipefail) to
ensure failures in pipelines cause the script to exit.
In `@authbridge/demos/hr-cpex/README.md`:
- Line 103: Multiple fenced code blocks in the README are missing language
identifiers; update each triple-backtick fence (the ones currently just ``` ) to
include the correct language tag (e.g., ```text, ```bash, or ```json) so
Markdown linting and rendering are correct—find the plain ``` blocks in the HR
CPEX README and replace them with the appropriate tagged fences, ensuring
consistency for examples, CLI commands, and JSON snippets.
In `@authbridge/demos/hr-cpex/scenarios/04-alice-internal-allow.sh`:
- Around line 29-41: Update the curl invocation in the
04-alice-internal-allow.sh scenario so it captures and checks the HTTP status
code instead of swallowing failures; call curl (the existing line that uses
PROXY, MCP_TARGET, CLIENT, ALICE and pipes to jq) with options to write the
response body to a temporary file and output the status code (e.g., -s -o
<tempfile> -w "%{http_code}"), then test that the captured code equals 200 and
exit non‑zero if not, only then pretty‑print the saved body with jq; ensure the
script fails on non‑200 responses rather than using the current `| jq .
2>/dev/null || true` pattern.
In `@authbridge/demos/hr-cpex/scenarios/06-bob-apl-deny.sh`:
- Around line 29-41: The pipeline in 06-bob-apl-deny.sh currently ends with "|
head -20" which can send SIGPIPE to curl under set -euo pipefail; change the
pattern so curl writes its response to a file (use mktemp or curl --output) and
then run head -20 on that file (or cat the file and remove it) instead of piping
directly to head; update the curl invocation in the script to save to a
temporary file and then print the first 20 lines from that file to avoid
terminating curl early.
In `@authbridge/demos/hr-cpex/verify-token-exchange.sh`:
- Around line 96-103: The check currently reads aud as the first array element
(variable aud) which fails when the JWT aud is an array containing "workday-api"
not at index 0; update the extraction/validation to detect membership instead of
only index 0 by changing the jq expression that sets aud/does the comparison:
parse "decoded" with jq to test whether .aud equals the string or contains the
expected value (AUDIENCE) when .aud is an array, then set a boolean or
normalized value and compare that to AUDIENCE (referencing the variables aud,
AUDIENCE and the decoded JWT handling in verify-token-exchange.sh) so the script
passes when "workday-api" appears anywhere in the aud array.
---
Nitpick comments:
In `@authbridge/authlib/plugins/cpex/cmf_inference_test.go`:
- Around line 127-156: Add a new unit test (e.g.,
TestInferenceResponseBodyMod_RewritesMultipleChoices) that constructs a
pipeline.Context with ResponseBody containing two choices entries
(choices[0].message.content and choices[1].message.content) with distinct
values, call applyInferenceResponseBodyMod(pctx, "redacted"), assert the call
returns mutated==true and err==nil, json.Unmarshal pctx.ResponseBody into a map
and assert the decoded "choices" slice length remains 2 and that both choices'
message.content values were replaced with "redacted" (preserving order), to
guard against silent collapsing or positional mapping regressions; reference
applyInferenceResponseBodyMod, pipeline.Context and ResponseBody when locating
where to add the test.
In `@authbridge/authlib/plugins/cpex/manager_cpex.go`:
- Around line 71-82: The Initialize method can block trying to send the result
to done after ctx cancellation; change the goroutine that calls
c.mgr.Initialize() so it performs a non-blocking send into the done channel (use
select { case done <- err: default: }) instead of an unconditional send, so the
goroutine cannot block if the caller already returned on ctx.Done(); reference
the cpexManager.Initialize function, the done channel and the c.mgr.Initialize()
call and keep the channel buffered as-is.
In `@authbridge/authlib/plugins/cpex/README.md`:
- Around line 33-42: The fenced diagram in
authbridge/authlib/plugins/cpex/README.md is missing a language identifier which
prevents proper syntax highlighting; update the opening and closing fences for
that code block from triple backticks to include a language tag (for example use
"text" like ```text) so the diagram renders correctly and ensure the closing
fence remains ```; locate the block by the diagram content starting with
"request ─► token-exchange ─► parsers ─► cpex" to make the change.
- Around line 188-206: The README's fenced code blocks lack language identifiers
which prevents proper syntax highlighting; update each triple-backtick block
(the flow line and the YAML block showing pipeline configuration) to include
appropriate language tags (e.g., ```text for the flow diagram and ```yaml for
the pipeline block) so the outbound pipeline snippet (showing mcp-parser and
cpex with config_file and hooks like on_request/on_response and fail_open)
renders with correct highlighting.
In `@authbridge/demos/hr-cpex/k8s/10-keycloak.yaml`:
- Around line 54-110: Add a container-level securityContext to the Keycloak
container (container name "keycloak") to harden the demo: set runAsNonRoot:
true, specify runAsUser and runAsGroup to a non-root UID/GID, set
allowPrivilegeEscalation: false, and enable readOnlyRootFilesystem where
compatible; place this securityContext alongside the existing
ports/volumeMounts/readinessProbe in the pod spec so the container runs
unprivileged while preserving the existing KC_* env vars and volume mount for
/opt/keycloak/data/import.
In `@authbridge/demos/hr-cpex/scenarios/_lib.sh`:
- Around line 70-113: Replace the raw heredoc JSON interpolation with
constructing the request body via jq -n to safely escape values; for
call_send_email (and the preceding get_compensation caller) use jq -n with --arg
for strings (to, subject, body, ssn, employee_id) and --argjson for
booleans/numbers (include_ssn) to build the top-level JSON object, assign that
output to the local body variable, then pass it to _post_tool and
_print_response as before; keep using _post_tool "$user_token" "$client_token"
"$body" to send the request.
In `@authbridge/docs/cpex-plugin.md`:
- Around line 14-43: In the fenced code block that contains the ASCII art
diagram in cpex-plugin.md (the block starting with ``` and showing the
authbridge-cpex pipeline and CPEX runtime diagram), add a language identifier
(e.g., "text") immediately after the opening backticks so the block becomes
```text; this will ensure proper rendering of the ASCII art.
In `@docs/proposals/authbridge-hooks.md`:
- Around line 8-12: Update the heading "How it works" to the correct level so
heading levels increment by one (change the "### How it works" header to "## How
it works"); locate the header text "How it works" in the document and adjust its
Markdown prefix to a single additional pound sign so the document goes from h1
to h2 sequentially.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: f624c440-2597-465a-afb4-0cfa35d0d739
⛔ Files ignored due to path filters (3)
authbridge/authlib/go.sumis excluded by!**/*.sumauthbridge/cmd/authbridge-cpex/go.sumis excluded by!**/*.sumauthbridge/go.workis excluded by!**/*.work
📒 Files selected for processing (66)
.github/workflows/build.yaml.gitignoreCLAUDE.mdauthbridge/CLAUDE.mdauthbridge/authlib/contracts/identity.goauthbridge/authlib/go.modauthbridge/authlib/pipeline/context.goauthbridge/authlib/pipeline/extensions.goauthbridge/authlib/plugins/cpex/README.mdauthbridge/authlib/plugins/cpex/cmf_a2a.goauthbridge/authlib/plugins/cpex/cmf_a2a_test.goauthbridge/authlib/plugins/cpex/cmf_body.goauthbridge/authlib/plugins/cpex/cmf_body_test.goauthbridge/authlib/plugins/cpex/cmf_inference.goauthbridge/authlib/plugins/cpex/cmf_inference_test.goauthbridge/authlib/plugins/cpex/config.goauthbridge/authlib/plugins/cpex/headers.goauthbridge/authlib/plugins/cpex/headers_test.goauthbridge/authlib/plugins/cpex/hooks.goauthbridge/authlib/plugins/cpex/manager.goauthbridge/authlib/plugins/cpex/manager_cpex.goauthbridge/authlib/plugins/cpex/manager_stub.goauthbridge/authlib/plugins/cpex/manager_test.goauthbridge/authlib/plugins/cpex/plugin.goauthbridge/authlib/plugins/cpex/plugin_test.goauthbridge/authlib/plugins/jwtvalidation/identity.goauthbridge/authlib/plugins/jwtvalidation/identity_test.goauthbridge/authlib/plugins/tokenexchange/delegation_test.goauthbridge/authlib/plugins/tokenexchange/plugin.goauthbridge/cmd/authbridge-cpex/CPEX_FFI_VERSIONauthbridge/cmd/authbridge-cpex/Dockerfileauthbridge/cmd/authbridge-cpex/entrypoint.shauthbridge/cmd/authbridge-cpex/go.modauthbridge/cmd/authbridge-cpex/main.goauthbridge/demos/README.mdauthbridge/demos/hr-cpex/Makefileauthbridge/demos/hr-cpex/README.mdauthbridge/demos/hr-cpex/agent/CHAT-WALKTHROUGH.mdauthbridge/demos/hr-cpex/agent/Dockerfileauthbridge/demos/hr-cpex/agent/agent.pyauthbridge/demos/hr-cpex/agent/chat.pyauthbridge/demos/hr-cpex/agent/requirements-client.txtauthbridge/demos/hr-cpex/agent/requirements.txtauthbridge/demos/hr-cpex/hr-mcp-server/Dockerfileauthbridge/demos/hr-cpex/hr-mcp-server/requirements.txtauthbridge/demos/hr-cpex/hr-mcp-server/server.pyauthbridge/demos/hr-cpex/k8s/00-namespace.yamlauthbridge/demos/hr-cpex/k8s/10-keycloak.yamlauthbridge/demos/hr-cpex/k8s/20-hr-mcp.yamlauthbridge/demos/hr-cpex/k8s/30-agent.yamlauthbridge/demos/hr-cpex/k8s/cpex-policy.yamlauthbridge/demos/hr-cpex/k8s/realm-export.jsonauthbridge/demos/hr-cpex/mint-token.shauthbridge/demos/hr-cpex/scenarios/01-bob-allow.shauthbridge/demos/hr-cpex/scenarios/02-alice-deny.shauthbridge/demos/hr-cpex/scenarios/03-eve-redact.shauthbridge/demos/hr-cpex/scenarios/04-alice-internal-allow.shauthbridge/demos/hr-cpex/scenarios/05-alice-external-cedar-deny.shauthbridge/demos/hr-cpex/scenarios/06-bob-apl-deny.shauthbridge/demos/hr-cpex/scenarios/07-bob-pii-deny.shauthbridge/demos/hr-cpex/scenarios/08-bob-taint-deny.shauthbridge/demos/hr-cpex/scenarios/09-cross-principal-taint-isolation.shauthbridge/demos/hr-cpex/scenarios/_lib.shauthbridge/demos/hr-cpex/verify-token-exchange.shauthbridge/docs/cpex-plugin.mddocs/proposals/authbridge-hooks.md
CI: - Bump proxy/envoy/lite go.mod to go 1.25.4 to match authlib (which the CPEX dependency raised), so `setup-go` installs a toolchain that can build the shared library under GOTOOLCHAIN=local. - Add `# shellcheck shell=bash` to hr-cpex scenarios/_lib.sh (SC2148). - Bump litellm 1.83.0 -> 1.83.10 in the hr-cpex agent (SQLi/SSTI/RCE/ sandbox-escape cluster; GHSA-r75f-5x8p-qvmc, -xqmj-j6mv-4862, -v4p8-mg3p-g94g, -wxxx-gvqv-xp7p). CPEX plugin (correctness/security): - cmf_a2a: align write-back target selection with the read-side non-empty-text contract so empty text parts don't trip false count-drift or rewrite the wrong part. - cmf_inference: fail closed on multi-choice responses instead of stamping one redacted completion over distinct choices. - plugin: stop leaking raw internal CPEX/FFI error text to clients on deny paths; log detail, return a generic message. - Add regression tests for both fail-closed/alignment behaviors. hr-cpex demo: - Pin the cpex runtime base image (debian:trixie-slim) by digest. - Bump fastapi -> 0.136.3 to pull a patched starlette (transitive multipart/Range/Host-header advisories). - Sanitize user-controlled values in hr-mcp-server logs (CWE-117) and return a generic tool-error message instead of exception detail. - Wrap the agent's MCP POST so transport errors return a structured tool error rather than aborting the turn. - Add baseline container securityContext (privesc off, drop ALL caps, RuntimeDefault seccomp) to the demo pods; runAsNonRoot for the cpex sidecar. - Make the demo Makefile NAMESPACE fixed (manifests hardcode cpex-demo). Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com>
Update the pinned CPEX FFI ABI and Go bindings from v0.2.0-ffi.test.8 to v0.2.0-alpha.1: - cmd/authbridge-cpex/CPEX_FFI_VERSION (drives the FFI archive download) - authlib + cmd/authbridge-cpex go.mod/go.sum (cpex Go bindings) The published FFI archive is sha256-verified (FFI_ABI=2); the -tags cpex cgo build links cleanly against the new bindings and 'make scenarios' passes all 9 hr-cpex demo cases on a kind cluster. Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
authbridge/authlib/go.mod (1)
6-6: Consider production readiness of alpha version.The cpex dependency is pinned to
v0.2.0-alpha.1, which is an alpha/pre-release version. While this may be intentional for early integration testing, consider the following for production deployments:
- Alpha versions typically don't carry the same stability or support guarantees as stable releases
- API changes may occur between alpha versions without deprecation notices
- Consider pinning to a stable release (e.g., v0.2.0 when available) before production deployment
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@authbridge/authlib/go.mod` at line 6, The go.mod currently pins the cpex module to a pre-release version "github.com/contextforge-org/cpex/go/cpex v0.2.0-alpha.1"; update this dependency to a stable release before production by replacing the alpha tag with a non-pre-release version (e.g., v0.2.0 or the latest semantically stable tag) or add a comment and CI gating if you must keep the alpha for now; locate the module entry "github.com/contextforge-org/cpex/go/cpex v0.2.0-alpha.1" and change the version string to the chosen stable tag (or document and pin a specific commit if stability is required).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@authbridge/authlib/go.mod`:
- Line 6: The go.mod currently pins the cpex module to a pre-release version
"github.com/contextforge-org/cpex/go/cpex v0.2.0-alpha.1"; update this
dependency to a stable release before production by replacing the alpha tag with
a non-pre-release version (e.g., v0.2.0 or the latest semantically stable tag)
or add a comment and CI gating if you must keep the alpha for now; locate the
module entry "github.com/contextforge-org/cpex/go/cpex v0.2.0-alpha.1" and
change the version string to the chosen stable tag (or document and pin a
specific commit if stability is required).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 689f4775-def5-431c-8ca2-6fb6a9e87a22
⛔ Files ignored due to path filters (2)
authbridge/authlib/go.sumis excluded by!**/*.sumauthbridge/cmd/authbridge-cpex/go.sumis excluded by!**/*.sum
📒 Files selected for processing (3)
authbridge/authlib/go.modauthbridge/cmd/authbridge-cpex/CPEX_FFI_VERSIONauthbridge/cmd/authbridge-cpex/go.mod
✅ Files skipped from review due to trivial changes (1)
- authbridge/cmd/authbridge-cpex/CPEX_FFI_VERSION
🚧 Files skipped from review as they are similar to previous changes (1)
- authbridge/cmd/authbridge-cpex/go.mod
Resolve the still-open review comments from PR kagenti#493 (the rest were handled in cd2e38b): - agent.py: guard HRAgent shared state (_histories, _request_id) with a threading.Lock — run_turn() runs on asyncio.to_thread workers and the A2A SDK can dispatch turns concurrently, which could corrupt the history map or lose a request-id increment. - verify-token-exchange.sh: check aud membership instead of the first array element, so a valid multi-audience token isn't a false failure. - scenarios/04, scenarios/06: capture-then-render so a transport failure surfaces (no '|| true' mask) and 'curl -i | head' can't SIGPIPE under 'set -o pipefail'. - cmd/authbridge-cpex/entrypoint.sh: set -eu -> set -euo pipefail. - k8s/10-keycloak.yaml: baseline container securityContext (allowPrivilegeEscalation:false, drop ALL, seccomp RuntimeDefault) to match 20-hr-mcp.yaml / 30-agent.yaml. - docs: add language tags to fenced code blocks and fix a heading-level increment in the cpex READMEs, cpex-plugin.md, and authbridge-hooks.md. Verified on a kind cluster: make scenarios 9/9, verify-token-exchange passes, A2A chat (bob allow / eve redact / alice deny) works. Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com>
make deploy rebuilds + kind-loads the local :dev images, but a Deployment whose pod template is unchanged won't roll on its own (imagePullPolicy: Never, no spec diff), so a code-only rebuild kept running the previous image until a manual 'kubectl rollout restart'. deploy now: waits for Keycloak to settle first, then 'rollout restart's the two locally-built workloads (hr-mcp; hr-cpex-agent, which carries the agent + authbridge-cpex sidecar) so a rebuild always lands. Keycloak is left alone (upstream pinned image; restarting it regenerates realm keys and re-triggers the stale-JWKS / cpex.auth_unknown_kid race the settle-wait guards against). Verified: 'make deploy' then 'make scenarios' (no manual rollout) -> 9/9, 0 auth_unknown_kid. Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com>
Resolved all items. Most related to moke code used in the demo. |
| logger.info(" %-25s = %s", name, _redact_token(v)) | ||
| try: | ||
| rpc = json.loads(body_bytes) | ||
| logger.info(" body.method = %s", _log_safe(rpc.get("method"))) |
There was a problem hiding this comment.
This is mock server for a demo, but it uses _log_safe function.
| rpc = json.loads(body_bytes) | ||
| logger.info(" body.method = %s", _log_safe(rpc.get("method"))) | ||
| params = rpc.get("params", {}) | ||
| logger.info(" body.params.name = %s", _log_safe(params.get("name"))) |
There was a problem hiding this comment.
This is mock server for a demo, but it uses _log_safe function.
| logger.info(" body.params.name = %s", _log_safe(params.get("name"))) | ||
| logger.info( | ||
| " body.params.arguments = %s", | ||
| _log_safe(json.dumps(params.get("arguments", {}))), |
There was a problem hiding this comment.
This is mock server for a demo, but it uses _log_safe function.
| except Exception: | ||
| # Full detail (including the stack trace) goes to the server log; | ||
| # the client gets a generic message so internal state isn't exposed. | ||
| logger.exception("tool '%s' failed", _log_safe(tool_name)) |
There was a problem hiding this comment.
This is mock server for a demo, but it uses _log_safe function.
| ) | ||
|
|
||
| logger.info("OUTBOUND RESPONSE") | ||
| logger.info(" tool = %s", _log_safe(tool_name)) |
There was a problem hiding this comment.
This is mock server for a demo, but it uses _log_safe function.
…ls, ClientFactory
Iterate the A2A chat demo into a reliable, observable walkthrough:
agent.py:
- temperature=0 + hardened 'relay tool data verbatim' prompt so the model
stops fabricating [REDACTED] for an allowed SSN (bob).
- Drop the optional 'ssn' echo-back tool arg: models populated it
inconsistently ('' / null / omitted) and a null tripped cpex's APL
redact(args.ssn) type check -> 403. The SSN demo is response-side
(include_ssn=true -> redact(result.ssn)); request-side args.ssn redaction
stays covered by scenarios/01-bob-allow.sh.
- include_ssn contract (tool-arg description + prompt): set true ONLY when the
user explicitly asks for the SSN, so a plain 'look up compensation' no
longer leaks it.
- Attach a per-turn tool trace (name/args/status/result) to the A2A response
message metadata for optional client display.
chat.py:
- Migrate off the deprecated A2AClient to ClientFactory; per-call identity
headers are injected via a ClientCallInterceptor + ClientCallContext (the
new client has no http_kwargs). Removes the deprecation warning.
- Add --show-tools (default off): render the agent's cpex-governed tool calls
and results, indented, tool name in color, result green (allow) / red (deny).
Default model -> ollama/llama3:latest (tool-calls on plain ollama/, renders
faithfully at temp=0, no 'thinking' latency). Makefile documents ollama/qwen3
(richer prose, slower) and llama3.2:3b (fastest, occasionally noisy) as
alternatives. README + CHAT-WALKTHROUGH updated.
Verified on a kind cluster: bob real SSN, eve [REDACTED], alice cpex 403;
include_ssn only set when explicitly asked; no client deprecation warning.
Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@authbridge/demos/hr-cpex/agent/chat.py`:
- Around line 197-213: Add full type annotations to the
_HeaderInterceptor.intercept signature to match the a2a-sdk
ClientCallInterceptor: annotate method_name as str, request_payload and
http_kwargs as dict[str, Any], agent_card as AgentCard | None and context as
ClientCallContext | None, and the return type as tuple[dict[str, Any], dict[str,
Any]]; also ensure Any is imported from typing and import (or forward-reference)
AgentCard and ClientCallContext from the a2a-sdk module so the annotated
signature for _HeaderInterceptor.intercept matches the expected interface.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: c504b31c-8d06-4c02-9d81-fe39a51228bf
📒 Files selected for processing (6)
authbridge/demos/hr-cpex/Makefileauthbridge/demos/hr-cpex/README.mdauthbridge/demos/hr-cpex/agent/CHAT-WALKTHROUGH.mdauthbridge/demos/hr-cpex/agent/agent.pyauthbridge/demos/hr-cpex/agent/chat.pyauthbridge/demos/hr-cpex/k8s/30-agent.yaml
✅ Files skipped from review due to trivial changes (2)
- authbridge/demos/hr-cpex/agent/CHAT-WALKTHROUGH.md
- authbridge/demos/hr-cpex/README.md
🚧 Files skipped from review as they are similar to previous changes (3)
- authbridge/demos/hr-cpex/agent/agent.py
- authbridge/demos/hr-cpex/Makefile
- authbridge/demos/hr-cpex/k8s/30-agent.yaml
| class _HeaderInterceptor(ClientCallInterceptor): | ||
| """Inject the per-call identity headers into the outgoing HTTP request. | ||
|
|
||
| The modern a2a-sdk client (ClientFactory) has no per-call `http_kwargs`, | ||
| so identity that changes per turn (persona token, client token, session | ||
| id) is passed via the ClientCallContext `state` and merged into the HTTP | ||
| headers here. The agent reads them off the inbound request and re-attaches | ||
| them onto its cpex-governed tool calls; contextId carries the same session | ||
| id inside the A2A envelope as a belt-and-suspenders for taint scoping.""" | ||
|
|
||
| async def intercept(self, method_name, request_payload, http_kwargs, agent_card, context): | ||
| headers = context.state.get("headers") if context else None | ||
| if headers: | ||
| merged = dict(http_kwargs.get("headers") or {}) | ||
| merged.update(headers) | ||
| http_kwargs = {**http_kwargs, "headers": merged} | ||
| return request_payload, http_kwargs |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
a2a-sdk python ClientCallInterceptor intercept method signature
💡 Result:
In the a2a-sdk for Python, the ClientCallInterceptor abstract base class requires the implementation of an async method named intercept [1][2][3]. The signature of the intercept method is as follows [1]: async def intercept( self, method_name: str, request_payload: dict[str, Any], http_kwargs: dict[str, Any], agent_card: AgentCard | None, context: ClientCallContext | None,) -> tuple[dict[str, Any], dict[str, Any]]: Parameters: - method_name: The name of the RPC method being invoked (e.g., 'message/send') [1][2][3]. - request_payload: The JSON-RPC request payload as a dictionary [1][2][3]. - http_kwargs: Keyword arguments for the HTTP client (e.g., headers, timeout) [1][2][3]. - agent_card: The AgentCard associated with the client, if available [1][2][3]. - context: The ClientCallContext, which carries call-specific configuration and state [1][2][3]. The method must return a tuple containing the (potentially modified) request_payload and http_kwargs [1][2][3]. This structure allows interceptors to inspect and modify outgoing requests before they are transmitted [1][4][5].
Citations:
- 1: https://github.com/google/a2a-python/blob/fa14dbf4/src/a2a/client/middleware.py
- 2: https://deepwiki.com/google/a2a-python/6.5-client-middleware-and-interceptors
- 3: https://deepwiki.com/a2aproject/a2a-python/5.6-client-interceptors-and-middleware
- 4: https://a2a-protocol.org/dev/sdk/python/api/a2a.client.html
- 5: https://a2a-protocol.org/latest/sdk/python/api/a2a.client.html
Add full type annotations to _HeaderInterceptor.intercept parameters/return (per a2a-sdk signature)
In authbridge/demos/hr-cpex/agent/chat.py (lines ~207-213), _HeaderInterceptor.intercept is missing type hints. The a2a-python ClientCallInterceptor interface expects:
method_name: strrequest_payload: dict[str, Any]http_kwargs: dict[str, Any]agent_card: AgentCard | Nonecontext: ClientCallContext | None- return
tuple[dict[str, Any], dict[str, Any]]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@authbridge/demos/hr-cpex/agent/chat.py` around lines 197 - 213, Add full type
annotations to the _HeaderInterceptor.intercept signature to match the a2a-sdk
ClientCallInterceptor: annotate method_name as str, request_payload and
http_kwargs as dict[str, Any], agent_card as AgentCard | None and context as
ClientCallContext | None, and the return type as tuple[dict[str, Any], dict[str,
Any]]; also ensure Any is imported from typing and import (or forward-reference)
AgentCard and ClientCallContext from the a2a-sdk module so the annotated
signature for _HeaderInterceptor.intercept matches the expected interface.
Source: Coding guidelines
huang195
left a comment
There was a problem hiding this comment.
Review Summary
Thorough pass across the plugin, the cross-cutting/pipeline changes, build isolation, the CGO/FFI surface, the demo, and prior bot reviews. This is strong, well-tested work on the chassis — but the response-body redaction path has a data-leak bug that blocks merge.
Verified sound (not re-raising)
- Build isolation is airtight. Every file in
plugins/cpex/was tag-audited: onlymanager_cpex.goimports the rcpex/FFI dep (//go:build cpex); the only importer of the package iscmd/authbridge-cpex/main.go(alsocpex-tagged); the three pure-Go mains don't import it and theirgo.modchange is only thego 1.25.0→1.25.4bump.CGO_ENABLED=0builds compile neither the package nor rcpex. Themanager_stub.gopath fails loud ifcpexis configured without the tag. - Decision→action mapping has no authz bypass. Deny short-circuits before
fail_openis consulted; thedefault/unknown-decision arm fails closed;fail_opendefaults false and governs only CPEX-internal errors; first-deny-wins across multiple hooks; inbound host-bypass is correctly disabled (spoofed inbound Host still hits the deny). - Prior coverage: CodeRabbit's items were resolved by the author; CodeQL's 11 findings are all in the demo MCP server (log-injection / exception-exposure), none in the plugin or pipeline.
🔴 Must-fix
Multi-part MCP/A2A response redaction silently leaks and still allows. The CMF read side emits every text part, but the MCP/A2A response re-serializers rewrite only the first and return mutated=true, err=nil — parts[1..] forward the original secret and the request is allowed. The inference response path already has the correct guard (len(targets) > 1 → fail-closed, cmf_inference.go:204); MCP/A2A lack it, and firstText() (manager_cpex.go:375) collapses the redacted list to texts[0] before write-back. Uncovered by tests (all response tests use a single part). See inline on cmf_a2a.go and cmf_body.go. This is the worst failure mode for a redaction feature, hence blocking even for an alpha plugin.
Suggestions (security / correctness)
- Streamed responses bypass redaction entirely — the SSE read side yields zero parts, so the policy never sees the content and the unredacted stream is allowed. Needs an explicit allow/deny decision for streamed bodies rather than "no parts → allow."
- Content-Length isn't updated after body rewrite in this layer (
SetResponseBody); confirm the ext_proc/wire adapter recomputes it fromlen(Body)whenBodyMutated()— otherwise it's a proxy-desync risk. Decisionzero-value isAllow(DecisionAllow = iota) — not currently exploitable, but flip the zero value to a fail-closed sentinel. AndFakeManagerdefaults to Allow, weakening the suite's fidelity to production fail-closed — default it to deny/error and add an explicit unknown-decision dispatch test.- Supply chain / FFI (inline): cosign-verify the
.a(the.sig/.crtexist) and add a runtime FFI ABI assertion before non-demo use. - Architecture / footprint: the FFI imports CPEX's full PluginManager runtime + a Tokio threadpool, not just CMF/APL. CMF is separable, but APL evaluation is bound to
PluginManagerin the bindings — there is no thin-eval surface. Adopting this embeds CPEX's full runtime (a second in-process pipeline) in the sidecar; if only authz eval is wanted, that's worth a narrow-evaluator request upstream. Relatedly, the image is ~146 MB (vs ~30–45 MB for the pure-Go siblings) — ~110 MB of that is thedebian:trixie-slimbase, not CPEX (the FFI adds only ~12–15 MB; the 100 MB.ais mostly stripped by the linker). It's now in the CI build matrix, so every build pulls the 23 MB FFI asset — consider adistroless/ccbase. - Pipeline coupling:
Context.CurrentPhase()is avoidable convenience (the phase is known atOnRequest/OnResponse; pass it down rather than widening sharedContextAPI). The newDelegationHopfields are shaped to mirror CPEX's model; the one default-pipeline behavior change (tokenexchange.recordDelegationHop) is additive and tested.
Nits
- Stale doc comments:
manager_cpex.gosays "inference/A2A body rewriting not yet implemented" (it is); an import-path comment referencesIBM/...vs the actualcontextforge-org/.... - Demo:
kind: literalsecrets want a "demo-only" banner;:latesttags + missingsecurityContexton the demo deployments; CodeQL log-injection/exception-exposure inhr-cpex/hr-mcp-server/server.py;# Made with Bobtrailers in ~12 files. - Docs: clarify "authbridge-cpex" is a build variant of the AuthBridge sidecar (deployed in place of authbridge-proxy), not a second sidecar; and that OPA now appears in two places (native
opaplugin vs CPEX's externalopa(...)step) — when to use which. Note CPEX-OPA/AuthZEN are doc-level BYOP examples, not exercised here (only embedded Cedar is).
Areas reviewed: Go (plugin, cross-cutting pipeline, jwtvalidation, tokenexchange, contracts), CGO/FFI build + image size (built it), k8s, shell, Python demo, CI, prior bot reviews · Commits: 8, all signed-off ✅ · CI: green
The must-fix is the only blocker; everything else is a suggestion/nit. Fix the response-side redaction guard and this is in good shape.
Assisted-By: Claude Code
| return false, fmt.Errorf("re-serialize A2A response body: %w", err) | ||
| } | ||
| pctx.SetResponseBody(newBody) | ||
| return true, nil |
There was a problem hiding this comment.
must-fix (data leak) — This returns after rewriting only the first text part, but a2aResponseParts (cmf_a2a.go:61) emits one part per artifact text part, and the caller passes firstText(texts) (manager_cpex.go:341), which already discarded texts[1:]. So an A2A response with ≥2 artifact text parts (routine) gets parts[1..] forwarded with the original unredacted secret, while this returns mutated=true, err=nil → the request is ALLOWED.
The inference response path already does this correctly — it fail-closes on len(targets) > 1 (cmf_inference.go:204). Port that guard here (or rewrite all parts with the same count-drift check the request side enforces). No test covers a multi-part A2A response, and TestA2AResponseBodyMod_RewritesArtifact only asserts the redacted string is present, not that the secret is absent — worth a multi-part test that asserts the original value is gone.
| if t, ok := blockObj["type"].(string); ok && t == "text" { | ||
| blockObj["text"] = stringifyForTextBlock(newContent) | ||
| replaced = true | ||
| break |
There was a problem hiding this comment.
must-fix (data leak) — Same class as the A2A response bug: this breaks after rewriting the first text block, but the MCP read side emits every text block. A tools/call result with multiple text blocks gets block[0] redacted and the rest forwarded verbatim, with mutated=true → ALLOWED. Unlike the request side (which enforces len(targets) != len(newContents)), the MCP response side has no count-drift guard. Mirror cmf_inference.go:204 (fail-closed on >1 redactable block) or rewrite all blocks.
|
|
||
| switch method { | ||
| case "tools/call", "prompts/get": | ||
| if len(mod.NewArguments) == 0 { |
There was a problem hiding this comment.
suggestion — len(mod.NewArguments) == 0 is treated as a no-op that forwards the original body. A policy whose intended redaction is "strip all arguments" (→ empty args) is then a silent no-redaction-but-allow: the original arguments (with the secret) go through unchanged. Consider an explicit "args were modified" signal from the CMF layer rather than inferring intent from emptiness.
| // ProvenanceExtension.MessageID | ||
| // delegation → DelegationExtension (chain/origin/actor/depth) | ||
| // request → RequestExtension (request id + trace/span ids) | ||
| // headers → HttpExtension.RequestHeaders (Authorization and Cookie |
There was a problem hiding this comment.
suggestion — This buildCMF comment says "Authorization and Cookie stripped," but per headers.go only Cookie (and the other secret headers) are stripped — Authorization is deliberately kept so CPEX's identity plugins can read the bearer. The live token therefore reaches HttpExtension.RequestHeaders and the unauthenticated session API. Defensible for identity resolution, but fix the misleading doc and make the audit-log redaction guidance non-optional rather than operator-discretion.
| # ./LICENSE | ||
| # | ||
| # This stage fetches the tarball + its .sha256 sidecar, verifies the | ||
| # checksum, extracts the .a. Cosign verification is a follow-up (we |
There was a problem hiding this comment.
suggestion (supply chain) — The FFI .a is verified by a .sha256 fetched from the same release URL as the tarball — transport integrity, not authenticity. Since this links into a policy-enforcement sidecar, wire up cosign verify-blob (the .sig/.crt already exist on the release) before this image is used beyond demos. Also nothing asserts the linked lib's FFI ABI matches the Go binding's expectation at runtime — a future drift would mis-marshal structs across CGO; consider a boot-time ABI assertion (the CPEX_FFI_VERSION is pinned but only checked by a TODO pre-commit hook).
…+ hardening Resolve Hai Huang's review findings on the CPEX plugin PR: Must-fix (blocker): - applyA2AResponseBodyMod: fail closed when >1 text-kind artifact parts exist, mirroring the inference response guard (cmf_inference.go:204). Previously only the first part was rewritten; parts[1..] forwarded the original unredacted content. Security/correctness: - Decision zero-value flipped from Allow to a fail-closed Unknown sentinel (DecisionUnknown = iota). An uninitialised Result never silently allows traffic. - FakeManager default changed from Allow to Deny for test fidelity. - Streaming (SSE) response gap detection: when the response body is non-empty but yields zero CMF parts (unparseable SSE), return DecisionError so fail_open governs rather than silently allowing. - Dockerfile: cosign verify-blob of the FFI .a asset at build time (sha256 + signature verification). Runtime ABI assertion already exists in the Go bindings. Nits: - Stale "not yet implemented" comment removed (manager_cpex.go:269). - IBM/... → contextforge-org/... import path comment fixed. - realm-export.json: _WARNING field flagging demo-only credentials. - 30-agent.yaml: model pinned from :latest to ollama/llama3.2:8b. - README + cpex-plugin.md: clarify authbridge-cpex is a build variant (not a second sidecar); document OPA dual presence. Verified: go test ./plugins/cpex/ passes (33 tests); go vet clean; full hr-cpex demo 9/9 scenarios pass on kind cluster. Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com>
|
Thanks for the thorough review, @huang195 . Addressed everything in Must-fix (blocker) — fixedMulti-part A2A response redaction leak: Suggestions addressed
Nits addressed
Verification
|
huang195
left a comment
There was a problem hiding this comment.
Re-Review Summary
Good iteration — most of the prior review is resolved:
- ✅ A2A response multi-part leak — FIXED.
applyA2AResponseBodyModnow collects all text parts and fails closed on>1("single-value redaction rewrite is ambiguous"), with two new fail-closed tests. - ✅ SSE / streaming bypass — FIXED. New
isStreamingResponseGapguard fails closed (routes throughfail_open, default deny) when a response body yields zero inspectable parts, with 6 unit tests. - ✅ FFI supply chain — improved.
cosign verify-blob(keyless/Fulcio, pinned to the cpex release workflow identity + GitHub OIDC issuer) added on top of the sha256 gate — a genuine authenticity upgrade.
The one remaining blocker: the original must-fix named both the MCP and A2A response paths; only A2A got the guard. The MCP re-serializer (cmf_body.go) still rewrites only the first type:"text" block and breaks, with no count guard and no fail-closed — the identical leak, just on the MCP side. A tools/call result with ≥2 text blocks (valid per spec) still forwards the original secret in the later blocks while returning mutated=true, err=nil. See inline. Mirror the A2A fix (collect all blocks → fail closed on >1, or rewrite all) and add a multi-block test that asserts the planted secret is absent from every block — the current TestMCPResponseBodyMod_TextBlockReplaced only covers the single-block case.
Smaller, non-blocking (carried over): the FFI ABI assertion is still a TODO comment — cosign is in, but nothing asserts the linked lib's FFI_ABI matches the Go binding's expectation at build/runtime; worth wiring the planned pre-commit/build check.
Merge gate: verify-pr-title is currently failing — the title prefix needs the org's exact case (e.g. Feat: not feat:).
Author: araujof (FIRST_TIME_CONTRIBUTOR) · Agent/IDE config (.claude/.vscode): none · Commits: 9, signed-off ✅ · CI: failing (pr-title); rest green
Nice responsiveness to the prior findings — fixing the MCP path symmetrically gets this essentially there.
Assisted-By: Claude Code
| // Primary path: rewrite the first text block in result.content[]. | ||
| // This is what every MCP client we see today reads from. | ||
| if contentArr, ok := result["content"].([]any); ok { | ||
| for _, block := range contentArr { |
There was a problem hiding this comment.
must-fix (data leak — carried over) — The A2A response path was correctly hardened in this revision (now fails closed on >1 text parts), but the symmetric MCP path here was not. This loop still rewrites only the first type:"text" block and breaks (line 538), leaving any subsequent text blocks in result.content[] unredacted while returning mutated=true, err=nil → the request is ALLOWED. A tools/call result with ≥2 text blocks (valid per the MCP spec) therefore still leaks the original secret in blocks[1..n].
Apply the same fix you gave applyA2AResponseBodyMod: collect all type:"text" blocks and fail closed when more than one is present (the len(targets) > 1 → error pattern), or rewrite all of them. And add an MCP multi-text-block test that plants a secret and asserts it is absent from every block — TestMCPResponseBodyMod_TextBlockReplaced only exercises the single-block case, so nothing currently catches this.
Summary
Adds a
cpexpipeline plugin that routes AuthBridge hooks through the CPEX policy framework. CPEX lets operators define authorization flows declaratively and turn decisions into ordered effects: redact a field, delegate a token, taint a session, call an external PDP, write an audit record.The motivation is agentic authorization patterns that are difficult to express with stateless decision engines:
on_allow/on_denydrive concrete actions in a defined order, not just a boolean.CPEX leverages existing PDPs (OPA/Cedar/AuthZEN) at the level of agentic entities (tools, prompts, models, A2A methods) rather than replacing them. CPEX also provides a robust execution pipeline with several dispatch modes for plugins.
Changes
New plugin (
authlib/plugins/cpex/)cmf.tool_pre_invoke,cmf.tool_post_invoke,cmf.llm_input,cmf.llm_output), applies returned modifications (redacted bodies, mutated headers, session labels), and maps the CPEX outcome back to AuthBridge's 5-value action vocabulary (allow / deny / modify / observe).bypass_hostshonored outbound-only (inbound Host is attacker-controlled);bypass_pathson both phases.fail_open(default false) governs CPEX-internal failures only — a policydenyis always honored.New binary (
cmd/authbridge-cpex/)-tags cpexandCGO_ENABLED=1, linkslibcpex_ffi.afrom a pinned CPEX release. The other three binaries stay pure-Go and do not import the package.cmd/authbridge-cpex/CPEX_FFI_VERSION.cpexin any non-cpex binary fails at boot with a clear error instead of a silent no-op.Supporting changes
authlib/contracts/identity.go,pipeline/extensions.go, and jwt-validation / token-exchange identity plumbing to surface delegation chain, agent identity, and curated claims into CMF.authbridge-cpeximage added to thebuild.yamlCI matrix.demos/hr-cpex/): containerized scenario with Bob/Alice/Eve personas covering the Workday delegation flow, GitHub flow with Cedar, PII scanning, redaction, and cross-call session-taint isolation. 9 runnable scenario scripts.docs/cpex-plugin.md, and thedocs/proposals/authbridge-hooks.mddesign proposal.Testing Instructions
Unit tests compile tag-free against a fake CPEX manager (no cgo), covering every CMF mapping and re-serializer:
End-to-end demo:
See
demos/hr-cpex/README.mdfor the full scenario walkthrough.Notes
authbridge-proxy,authbridge-envoy, orauthbridge-litebinaries — they do not import the cpex package and remain CGO-free.cpexplugin is opt-in per pipeline; an installed plugin with no hooks is inert (both phases return continue), so it can ship ahead of a policy update.Summary by CodeRabbit
New Features
Enhancements
Documentation