Conversation

> **Note: Reviews paused.** It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough

Adds a Guardrails subsystem: new guard types/config, evaluator registry, Traceloop provider client, guard execution runner (pre/post phases) with OpenTelemetry spans, middleware and pipeline integration, extensive validation, tests, cassettes, and documentation. Also updates Cargo deps and .gitignore.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant Pipeline as Pipeline
    participant Pre as Pre-Call Guards
    participant LLM as LLM Model
    participant Post as Post-Call Guards
    participant Traceloop as Traceloop
    Client->>Pipeline: HTTP request + headers
    Pipeline->>Pre: extract_prompt(input)
    Pre->>Traceloop: POST /v2/guardrails/execute/{slug}
    Traceloop-->>Pre: EvaluatorResponse
    alt Pre blocked
        Pre->>Client: 403 Forbidden (blocked_response)
    else Pre passed
        Pipeline->>LLM: call model
        LLM-->>Pipeline: completion
        Pipeline->>Post: extract_completion(output)
        Post->>Traceloop: POST /v2/guardrails/execute/{slug}
        Traceloop-->>Post: EvaluatorResponse
        Post->>Pipeline: GuardrailsOutcome (warnings/blocked)
        Pipeline->>Client: Response (+ warning headers)
    end
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
🔒 Container Vulnerability Scan (hub-migrations - amd64)
🔒 Container Vulnerability Scan (hub-migrations - arm64)
🔒 Container Vulnerability Scan (hub - amd64)
🔒 Container Vulnerability Scan (hub - arm64)
🧹 Nitpick comments (2)
config-example.yaml (1)
132-163: Make `required` explicit for the post-call guard to avoid ambiguity. This is an example config; adding `required` clarifies expected behavior and avoids relying on defaults.

✅ Suggested tweak

```diff
   - name: toxicity-filter
     provider: traceloop
     evaluator_slug: toxicity-detector
     params:
       threshold: 0.8
     mode: post_call  # Runs after the model call
     on_failure: block
+    required: true  # set to false for best-effort behavior
```

src/guardrails/GUARDRAILS.md (1)
10-11: Improve grammar and sentence structure. Line 11 is a sentence fragment lacking a proper subject-verb structure. Consider revising for clarity:

✏️ Suggested revision

```diff
-Guardrails can be implemented in **Config Mode (Hub v1)** :
-Guardrails fully defined in YAML configuration, applied automatically to gateway requests
+**Config Mode (Hub v1)**: Guardrails are fully defined in YAML configuration and applied automatically to gateway requests.
```
The introduction of guardrails at the gateway level is a vital step for LLM observability and safety. The variety of evaluators you are adding (PII, secrets, etc.) is quite comprehensive. I am currently working with OpenClaw (https://clawoncloud.com), which is a personal AI assistant framework designed for self-hosting. We approach safety through a similar mindset of intercepting and validating tool calls and message flows. It is great to see more tooling emerging in this space to make agentic systems more predictable and secure!
Additional review notes

**Architecture: Guardrails should be a Tower layer, not inline handler code**

The original pipeline had a clean plugin architecture. This is the root cause of issues #1 and #2 from the previous comment — when guardrails are manually stitched into the handler, it's easy to miss a code path. The streaming path was missed because the post-call and finalize logic only lives in the non-streaming path. The idiomatic axum approach would be a Tower middleware layer:

```rust
// Guardrails as a wrapping layer — applies to ALL routes uniformly
if let Some(gr) = guardrails {
    router = router.layer(GuardrailsLayer::new(gr));
}
```

This would:
Lines 120 to 185 in 254bacd Lower-confidence issues (not blocking, but worth noting)
hub/src/guardrails/providers/traceloop.rs Lines 54 to 68 in 254bacd
hub/src/guardrails/providers/traceloop.rs Lines 28 to 35 in 254bacd
Lines 149 to 154 in 254bacd
Lines 156 to 178 in 254bacd 🤖 Generated with Claude Code |
doronkopit5
left a comment
Code Simplification Suggestions
Here are 7 simplification opportunities I identified in this PR. Together they'd reduce ~70 lines with zero behavioral changes. See inline comments for details.
| Item | Reason |
|---|---|
| `types.rs` Hash impls | Only 2 call sites with different key/value types — a generic helper would add complexity for marginal gain |
| `runner.rs` early-exit | Only 4 shared lines, methods diverge after that — extraction adds indirection for no clarity gain |
| `setup.rs` `resolve_guard_defaults` | 14 lines of clear imperative mutation; a `.map()` chain would be same length but harder to follow |
🤖 Generated with Claude Code
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/pipelines/pipeline.rs (1)
133-138: ⚠️ Potential issue | 🟡 Minor: OTEL span not marked as error when provider call fails.

When `chat_completions` fails (line 138), the error is propagated via `?` but the tracer span is not explicitly marked with an error status. Per reviewer feedback, this can cause OTEL tracing issues. Consider calling `tracer.log_error()` before returning the error.

📊 Proposed fix to mark span error

```diff
 let response = model
     .chat_completions(payload.clone())
     .await
     .inspect_err(|e| {
         eprintln!("Chat completion error for model {model_key}: {e:?}");
+        tracer.log_error(format!("Chat completion error: {e:?}"));
-    })?;
+    })
+    .map_err(|e| {
+        e
+    })?;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pipelines/pipeline.rs` around lines 133 - 138, The chat_completions error path doesn't mark the OTEL span as errored; update the inspect_err closure on the model.chat_completions(payload.clone()).await call to call tracer.log_error(...) (or the equivalent method on the tracer/span object in this module) with the error details before the error is propagated, so the span is explicitly marked as failed when the call returns Err; ensure you reference the same tracer/span instance used for this pipeline span when calling tracer.log_error().
🧹 Nitpick comments (4)
src/guardrails/middleware.rs (2)
30-37: Path matching order matters with `contains()`. The current order is correct: `/chat/completions` is checked before `/completions`, since both patterns would match a `/chat/completions` request if the order were reversed. Consider using `ends_with()` or exact path matching for more precise routing, or add a comment explaining the ordering dependency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/guardrails/middleware.rs` around lines 30 - 37, The path-matching in from_path uses contains() with an ordering dependency (e.g., "/chat/completions" must be checked before "/completions"); update from_path to use more precise checks (prefer using ends_with() or exact path matching for each route) or, if you keep contains(), add a clarifying comment above from_path documenting the ordering dependency and why "/chat/completions" must be matched first; reference the from_path function and the "/chat/completions", "/completions", and "/embeddings" match arms when making the change.
199-203: Inconsistent error handling: body buffer failure returns 400, but parse failures pass through. When buffering fails (line 201), the middleware returns `BAD_REQUEST`. However, when JSON parsing fails (lines 212-214, 221-224, 231-234), the request is passed through to the inner service. This inconsistency could be confusing; consider either returning 400 for both cases or passing through for both.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/guardrails/middleware.rs` around lines 199 - 203, The middleware currently returns BAD_REQUEST on buffer errors but lets JSON parse failures pass through; make this consistent by changing the JSON parsing error branches (where serde_json::from_slice is used) to return axum::http::StatusCode::BAD_REQUEST.into_response() instead of forwarding to the inner service, and log the parse error (similar to the existing debug message "Guardrails middleware: failed to buffer request body, passing through") so parse failures produce a 400 with error context instead of calling inner.call(req).

src/pipelines/pipeline.rs (1)
140-149: Prefer `match` over consecutive `if let` for exhaustiveness. The current pattern uses two separate `if let` blocks for the `NonStream` and `Stream` variants. As noted in reviewer feedback, using `match` ensures exhaustiveness and future-proofs against new variants being added.

♻️ Proposed refactor to match

```diff
-    if let ChatCompletionResponse::NonStream(completion) = response {
-        tracer.log_success(&completion);
-        return Ok(Json(completion).into_response());
-    }
-
-    if let ChatCompletionResponse::Stream(stream) = response {
-        return Ok(Sse::new(trace_and_stream(tracer, stream))
-            .keep_alive(KeepAlive::default())
-            .into_response());
-    }
+    return match response {
+        ChatCompletionResponse::NonStream(completion) => {
+            tracer.log_success(&completion);
+            Ok(Json(completion).into_response())
+        }
+        ChatCompletionResponse::Stream(stream) => {
+            Ok(Sse::new(trace_and_stream(tracer, stream))
+                .keep_alive(KeepAlive::default())
+                .into_response())
+        }
+    };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pipelines/pipeline.rs` around lines 140 - 149, Replace the two consecutive if-let blocks matching on response with a single match on ChatCompletionResponse to ensure exhaustiveness: match on response { ChatCompletionResponse::NonStream(completion) => { tracer.log_success(&completion); return Ok(Json(completion).into_response()); }, ChatCompletionResponse::Stream(stream) => { return Ok(Sse::new(trace_and_stream(tracer, stream)).keep_alive(KeepAlive::default()).into_response()); }, _ => { return Err(/* return an appropriate error or conversion for unknown variants */) } } — keep the existing calls to tracer.log_success, trace_and_stream, Json and Sse/KeepAlive exactly as used now and add a fallback arm to handle any future variants.src/guardrails/parsing.rs (1)
24-31: Consider extracting a shared `content_to_string` helper to reduce duplication. The `ChatMessageContent` match logic is duplicated between `PromptExtractor` for `ChatCompletionRequest` (lines 24-31) and `CompletionExtractor` for `ChatCompletion` (lines 44-51). This aligns with reviewer feedback suggesting a shared helper.

♻️ Proposed helper extraction

```diff
+fn content_to_string(content: &ChatMessageContent) -> String {
+    match content {
+        ChatMessageContent::String(s) => s.clone(),
+        ChatMessageContent::Array(parts) => parts
+            .iter()
+            .filter(|p| p.r#type == "text")
+            .map(|p| p.text.as_str())
+            .collect::<Vec<_>>()
+            .join(" "),
+    }
+}
+
 impl PromptExtractor for ChatCompletionRequest {
     fn extract_prompt(&self) -> String {
         self.messages
             .iter()
             .filter_map(|m| {
-                m.content.as_ref().map(|content| match content {
-                    ChatMessageContent::String(s) => s.clone(),
-                    ChatMessageContent::Array(parts) => parts
-                        .iter()
-                        .filter(|p| p.r#type == "text")
-                        .map(|p| p.text.as_str())
-                        .collect::<Vec<_>>()
-                        .join(" "),
-                })
+                m.content.as_ref().map(content_to_string)
             })
             .collect::<Vec<_>>()
             .join("\n")
     }
 }
```

Also applies to: 44-51
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/guardrails/parsing.rs` around lines 24 - 31, Extract the duplicated ChatMessageContent matching logic into a shared helper function (e.g., content_to_string) and use it from both PromptExtractor::... (where ChatCompletionRequest messages are processed) and CompletionExtractor::... (where ChatCompletion messages are processed); implement content_to_string to accept &ChatMessageContent (or Option<&ChatMessageContent> if convenient) and return a String by handling ChatMessageContent::String(s) -> s.clone() and ChatMessageContent::Array(parts) -> join the text of parts with type == "text", then replace the inline match blocks in both PromptExtractor and CompletionExtractor with calls to this helper.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/guardrails/middleware.rs`:
- Line 84: The code is unboundedly buffering request and response bodies by
calling axum::body::to_bytes(..., usize::MAX) (seen in middleware.rs for
resp_body and at the request handling around line 197); replace usize::MAX with
a configurable max byte limit (e.g., a constant DEFAULT_MAX_BODY_BYTES = 10 *
1024 * 1024 or read from existing app config) and use that value in both places
(response handling where resp_body is converted and the request body conversion
at the other location) so large payloads are rejected instead of exhausting
memory; ensure you return a clear error (or early response) when to_bytes
returns an error or indicates the body exceeded the limit.
---
Outside diff comments:
In `@src/pipelines/pipeline.rs`:
- Around line 133-138: The chat_completions error path doesn't mark the OTEL
span as errored; update the inspect_err closure on the
model.chat_completions(payload.clone()).await call to call tracer.log_error(...)
(or the equivalent method on the tracer/span object in this module) with the
error details before the error is propagated, so the span is explicitly marked
as failed when the call returns Err; ensure you reference the same tracer/span
instance used for this pipeline span when calling tracer.log_error().
---
Nitpick comments:
In `@src/guardrails/middleware.rs`:
- Around line 30-37: The path-matching in from_path uses contains() with an
ordering dependency (e.g., "/chat/completions" must be checked before
"/completions"); update from_path to use more precise checks (prefer using
ends_with() or exact path matching for each route) or, if you keep contains(),
add a clarifying comment above from_path documenting the ordering dependency and
why "/chat/completions" must be matched first; reference the from_path function
and the "/chat/completions", "/completions", and "/embeddings" match arms when
making the change.
- Around line 199-203: The middleware currently returns BAD_REQUEST on buffer
errors but lets JSON parse failures pass through; make this consistent by
changing the JSON parsing error branches (where serde_json::from_slice is used)
to return axum::http::StatusCode::BAD_REQUEST.into_response() instead of
forwarding to the inner service, and log the parse error (similar to the
existing debug message "Guardrails middleware: failed to buffer request body,
passing through") so parse failures produce a 400 with error context instead of
calling inner.call(req).
In `@src/guardrails/parsing.rs`:
- Around line 24-31: Extract the duplicated ChatMessageContent matching logic
into a shared helper function (e.g., content_to_string) and use it from both
PromptExtractor::... (where ChatCompletionRequest messages are processed) and
CompletionExtractor::... (where ChatCompletion messages are processed);
implement content_to_string to accept &ChatMessageContent (or
Option<&ChatMessageContent> if convenient) and return a String by handling
ChatMessageContent::String(s) -> s.clone() and ChatMessageContent::Array(parts)
-> join the text of parts with type == "text", then replace the inline match
blocks in both PromptExtractor and CompletionExtractor with calls to this
helper.
In `@src/pipelines/pipeline.rs`:
- Around line 140-149: Replace the two consecutive if-let blocks matching on
response with a single match on ChatCompletionResponse to ensure exhaustiveness:
match on response { ChatCompletionResponse::NonStream(completion) => {
tracer.log_success(&completion); return Ok(Json(completion).into_response()); },
ChatCompletionResponse::Stream(stream) => { return
Ok(Sse::new(trace_and_stream(tracer,
stream)).keep_alive(KeepAlive::default()).into_response()); }, _ => { return
Err(/* return an appropriate error or conversion for unknown variants */) } } —
keep the existing calls to tracer.log_success, trace_and_stream, Json and
Sse/KeepAlive exactly as used now and add a fallback arm to handle any future
variants.
doronkopit5
left a comment
Code Simplification Review (Items 1-13)
Inline comments below highlight 13 opportunities to make this code more idiomatic, readable, and performant. Ordered by impact — items 1-6 are highest priority, 7-13 are medium.
src/guardrails/types.rs
Outdated
```rust
/// Shared guardrail resources: resolved guards + client.
/// Built once per router build and shared across all pipelines.
pub type GuardrailResources = (Arc<Vec<Guard>>, Arc<dyn GuardrailClient>);
```
SIMPLIFY #1 (HIGH): Replace tuple alias with a named struct
GuardrailResources is a type alias for (Arc<Vec<Guard>>, Arc<dyn GuardrailClient>). Accessing .0/.1 throughout setup.rs (lines 83-86, 96-98) is opaque and error-prone (field order can be silently swapped).
Suggested change:
```rust
pub struct GuardrailResources {
    pub guards: Arc<Vec<Guard>>,
    pub client: Arc<dyn GuardrailClient>,
}
```

Then `shared.guards` / `shared.client` instead of `shared.0` / `shared.1` everywhere.
src/guardrails/providers/mod.rs
Outdated
```rust
        TRACELOOP_PROVIDER => Some(Box::new(TraceloopClient::new())),
        _ => None,
    }
}
```
SIMPLIFY #2 (HIGH): Dead code — remove or integrate
create_guardrail_client is never called in production. build_guardrail_resources in setup.rs directly constructs TraceloopClient::new(), bypassing this function entirely.
Either:

- Remove this function (and the test at `test_traceloop_client.rs:157-161`), or
- Wire it into `build_guardrail_resources` so provider selection is actually dynamic.
src/pipelines/pipeline.rs
Outdated
SIMPLIFY #3 (HIGH): Collapse 3 identical mock providers into 1
TestProviderOpenAI, TestProviderAnthropic, and TestProviderAzure are identical except for r#type() return value. This is ~150 lines of boilerplate.
Suggested refactor:
```rust
#[derive(Clone)]
struct TestProvider(ProviderType);

#[async_trait]
impl Provider for TestProvider {
    fn new(_config: &ProviderConfig) -> Self { Self(ProviderType::OpenAI) }
    fn key(&self) -> String { format!("{:?}-key", self.0).to_lowercase() }
    fn r#type(&self) -> ProviderType { self.0.clone() }
    // ... shared impls
}
```

The existing `MockProviderForSpanTests` already follows this parameterized pattern.
```rust
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq, Hash)]
#[serde(rename_all = "snake_case")]
pub enum GuardMode {
```
SIMPLIFY #5 (HIGH): Add Copy to fieldless enums
Both `GuardMode` and `OnFailure` (line 29) are fieldless enums — they are `Copy`-sized. Adding `Copy` to their derives eliminates unnecessary `.clone()` calls throughout `runner.rs` (e.g. line 151: `on_failure: guard.on_failure.clone()` becomes a simple copy).
src/guardrails/types.rs
Outdated
```rust
        self.evaluator_slug.hash(state);
        // Hash params by sorting keys and hashing serialized values
        let mut params_vec: Vec<_> = self.params.iter().collect();
        params_vec.sort_by_key(|(k, _)| (*k).clone());
```
SIMPLIFY #6 (HIGH): Zero-copy hash sorting
`sort_by_key(|(k, _)| (*k).clone())` allocates a new `String` per key on every hash call. Same issue on line 117 for `GuardrailsConfig::hash`.

Fix: Use `sort_by(|a, b| a.0.cmp(&b.0))` — zero allocations, same ordering.
src/pipelines/otel.rs
Outdated
SIMPLIFY #10 (MEDIUM): Extract shared OTel span attribute helpers
The frequency_penalty/presence_penalty/top_p/temperature pattern (lines 231-248) is duplicated verbatim in CompletionRequest::record_span (lines 311-328). Also, the ChatMessageContent match block (lines 259-264) is duplicated in ChatCompletion::record_span (lines 288-294).
Suggested helpers:
```rust
fn set_optional_f64(span: &mut BoxedSpan, key: &'static str, value: Option<f32>) {
    if let Some(v) = value {
        span.set_attribute(KeyValue::new(key, v as f64));
    }
}

fn content_to_string(content: &ChatMessageContent) -> String {
    match content {
        ChatMessageContent::String(s) => s.clone(),
        ChatMessageContent::Array(parts) => serde_json::to_string(parts).unwrap_or_default(),
    }
}
```
src/state.rs
Outdated
SIMPLIFY #11 (MEDIUM): Unnecessary full router clone on every request
Arc::try_unwrap(router).unwrap_or_else(|arc| (*arc).clone()) will almost always take the unwrap_or_else branch (the HashMap still holds a reference), cloning the entire Router on every single request.
Axum's Router is designed to be cheaply cloneable (inner state is already Arc-wrapped). Simply do:
```rust
let router: Router = (*router).clone(); // cheap Arc-bump clone
```

Or even better, call `.oneshot()` directly on the `Arc<Router>` via deref.
src/guardrails/runner.rs
Outdated
```rust
    }
}

pub struct GuardPhaseResult {
```
SIMPLIFY #12 (MEDIUM): Consider Result instead of GuardPhaseResult
This struct is essentially `Result<Vec<GuardWarning>, Response>` — either you have a blocked response (error case) or you have warnings (success case). Using `Result` would let the middleware use the `?` operator instead of `if let Some(blocked)` checks:

```rust
type GuardPhaseResult = Result<Vec<GuardWarning>, Response>;
```

Then `outcome_to_phase_result` becomes more natural and callers can use `?` for early-return on block.
src/pipelines/pipeline.rs
Outdated
```rust
}

// Re-export builder and orchestrator functions for backward compatibility with tests
pub use crate::guardrails::runner::{blocked_response, warning_header_value};
```
SIMPLIFY #13 (MEDIUM): Remove unnecessary re-exports
These re-exports exist "for backward compatibility with tests" — but all the tests are new in this PR, so there is no backward compatibility concern. Import directly from the guardrails modules in test code and remove these re-exports to avoid a confusing indirection layer.
Code review: Found 13 issues.

- hub/src/guardrails/middleware.rs, lines 243 to 248 in b973e9e
- hub/src/guardrails/middleware.rs, lines 86 to 88 in b973e9e
- hub/src/guardrails/middleware.rs, lines 217 to 219 in b973e9e
- Lines 214 to 217 in b973e9e
- Lines 82 to 86 in b973e9e
- hub/src/guardrails/providers/mod.rs, lines 8 to 14 in b973e9e
- hub/tests/guardrails/test_e2e.rs, lines 258 to 261 in b973e9e
- hub/tests/guardrails/test_types.rs, lines 129 to 132 in b973e9e
- Lines 169 to 181 in b973e9e
- Lines 368 to 378 in b973e9e
- Lines 54 to 83 in b973e9e
- Lines 94 to 100 in b973e9e
- Lines 120 to 123 in b973e9e
- hub/src/guardrails/middleware.rs, lines 220 to 223 in b973e9e
- hub/src/guardrails/providers/traceloop.rs, lines 11 to 13 in b973e9e
- hub/src/guardrails/providers/traceloop.rs, lines 50 to 52 in b973e9e
- hub/src/guardrails/providers/traceloop.rs, lines 76 to 78 in b973e9e

🤖 Generated with Claude Code. If this code review was useful, please react with 👍. Otherwise, react with 👎.
Important
This pull request adds a comprehensive guardrails feature to the codebase, including configuration, validation, and extensive testing for safety, validation, and quality checks in LLM gateway traffic.
- Adds `guardrails` config with provider-level defaults.
- Updates `GatewayConfig` in `types.rs` to include `guardrails`.
- Adds validation in `config/validation.rs`.
- Adds tests in `tests/guardrails/` for guardrails functionality.
- Uses `wiremock` for mocking evaluator API responses.
- Updates `Cargo.toml` and `Cargo.lock` for new dependencies.
- Adds `GUARDRAILS.md` for documentation of the guardrails feature.

This description was created by
for 99dff3d. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit
New Features
Documentation
Tests
Chores