1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -24,6 +24,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Changed
- **plugin**: `cel` `on_match` now rejects unknown fields at config-load time (`deny_unknown_fields`). Previously a typo like `on_match: { set_contxt: {...} }` was silently dropped; crucially, the same behavior meant the `on_match.deny` block proposed in early ADR-0030 drafts was silently ignored on any pre-existing `cel` deployment, which is the bug that motivated this PR. Operators with typos in `on_match` now see a clear deserialization error instead of a silently broken policy.
+- **plugin (BREAKING)**: `ai-proxy` no longer accepts a `model` field on flat config, on `targets.<name>`, or on `fallback[]` entries (ADR-0030 §0 — caller-owned model). The model identifier is now always taken from the client's `model` field on the request body and passed to the upstream provider verbatim; the gateway never picks a default. Migration: delete `model:` from every `ai-proxy` config block. Operators upgrading get a clear error from both layers — vacuum lints `Unknown config field "model" for dispatcher "ai-proxy"` against the JSON schema, and the runtime `serde(deny_unknown_fields)` rejects leftover nested `model:` at WASM instance load. Requests that omit `model` (or send an empty string / non-string value) now get `400 problem+json` with `code: "model_required"` and `type: "urn:barbacane:error:model_required"`.
- **plugin**: `ai-token-limit` config now uses `quota` + `window` (seconds) — aligned with the `rate-limit` plugin — instead of `max_tokens_per_minute` / `max_tokens_per_hour`. For multiple concurrent windows (e.g. per-minute and per-hour caps), stack two instances of the middleware with different `policy_name`s.
- **plugin**: AI guard/limit plugins (`ai-prompt-guard`, `ai-token-limit`, `ai-response-guard`) **fail-closed** on misconfiguration — a missing `default_profile` or invalid regex in a profile returns `500 problem+json` instead of silently letting traffic through. A silently disabled PII rule is precisely the class of bug operators only catch from an incident.
- **plugin**: `ai-token-limit` now persists the resolved partition key into context between `on_request` and `on_response` (scoped by `policy_name`) so `client_ip` and `header:*` partition sources charge the same bucket the request was admitted against. Previously token consumption leaked into a shared `"unknown"` bucket, effectively disabling per-consumer budgeting for those partition sources.
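To illustrate the `ai-proxy` breaking change above: a request body that omits `model` (or carries an empty string or a non-string value), such as

```json
{
  "messages": [{ "role": "user", "content": "Hello" }]
}
```

is now rejected with `400 problem+json`. The `code` and `type` values below are the ones named in the changelog entry; the `title` and `detail` wording is illustrative only:

```json
{
  "type": "urn:barbacane:error:model_required",
  "code": "model_required",
  "status": 400,
  "title": "Model required",
  "detail": "The request body must carry a non-empty string `model` field; the gateway never picks a default."
}
```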
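Likewise, a sketch of the new `ai-token-limit` shape: per-minute and per-hour caps become two stacked instances. The `quota`, `window`, and `policy_name` fields come from the entry above; the enclosing `middlewares` list key and the quota values are illustrative:

```yaml
middlewares:
  - name: ai-token-limit
    config:
      policy_name: tokens-per-minute   # each instance charges its own bucket
      quota: 10000                     # tokens admitted per window
      window: 60                       # window length in seconds
  - name: ai-token-limit
    config:
      policy_name: tokens-per-hour
      quota: 200000
      window: 3600
```

With the partition-key fix above, `client_ip` and `header:*` partition sources now debit the same bucket the request was admitted against, so each stacked instance keeps accurate per-consumer budgets.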
2 changes: 1 addition & 1 deletion README.md
@@ -10,7 +10,7 @@
<a href="https://github.com/barbacane-dev/barbacane/actions/workflows/ci.yml"><img src="https://github.com/barbacane-dev/barbacane/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
<a href="https://docs.barbacane.dev"><img src="https://img.shields.io/badge/docs-docs.barbacane.dev-blue" alt="Documentation"></a>
<img src="https://img.shields.io/badge/unit%20tests-517%20passing-brightgreen" alt="Unit Tests">
<img src="https://img.shields.io/badge/plugin%20tests-795%20passing-brightgreen" alt="Plugin Tests">
<img src="https://img.shields.io/badge/plugin%20tests-801%20passing-brightgreen" alt="Plugin Tests">
<img src="https://img.shields.io/badge/integration%20tests-275%20passing-brightgreen" alt="Integration Tests">
<img src="https://img.shields.io/badge/cli%20tests-23%20passing-brightgreen" alt="CLI Tests">
<img src="https://img.shields.io/badge/ui%20tests-44%20passing-brightgreen" alt="UI Tests">
3 changes: 0 additions & 3 deletions crates/barbacane-test/tests/ai_gateway.rs
@@ -118,7 +118,6 @@ paths:
name: ai-proxy
config:
provider: ollama
-model: llama3
base_url: "{base_url}"
timeout: 10
max_tokens: 512
@@ -206,7 +205,6 @@ paths:
name: ai-proxy
config:
provider: ollama
-model: llama3
base_url: "{base_url}"
timeout: 10
max_tokens: 512
@@ -343,7 +341,6 @@ paths:
name: ai-proxy
config:
provider: ollama
-model: llama3
base_url: "{base_url}"
timeout: 10
max_tokens: 512
4 changes: 0 additions & 4 deletions crates/barbacane-test/tests/ai_proxy.rs
@@ -71,7 +71,6 @@ paths:
name: ai-proxy
config:
provider: ollama
-model: llama3
base_url: "{base_url}"
timeout: 10
max_tokens: 512
@@ -96,7 +95,6 @@ paths:
targets:
local:
provider: ollama
-model: llama3
base_url: "{base_url}"
timeout: 10
max_tokens: 512
@@ -118,12 +116,10 @@ paths:
name: ai-proxy
config:
provider: openai
-model: gpt-4o
api_key: "sk-test"
base_url: "{base_url}/primary-fail"
fallback:
- provider: ollama
-model: llama3
base_url: "{base_url}"
timeout: 10
max_tokens: 512
@@ -6,7 +6,6 @@ const schemas = {
required: [],
properties: {
provider: { type: "string" },
-model: { type: "string" },
api_key: { type: "string" },
base_url: { type: "string" },
timeout: { type: "integer", minimum: 1 },
1 change: 0 additions & 1 deletion docs/rulesets/tests/valid-complete.yaml
@@ -85,7 +85,6 @@ paths:
name: ai-proxy
config:
provider: "openai"
model: "gpt-4o"
api_key: "env://OPENAI_API_KEY"
timeout: 120
max_tokens: 4096
20 changes: 6 additions & 14 deletions plugins/ai-proxy/config-schema.json
@@ -2,27 +2,23 @@
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "urn:barbacane:plugin:ai-proxy:config",
"title": "AI Proxy Dispatcher Config",
"description": "Configuration for the AI proxy dispatcher plugin. Exposes a unified OpenAI-compatible API and routes to LLM providers (OpenAI, Anthropic, Ollama). Supports named targets for policy-driven routing (via the `cel` middleware), provider fallback on 5xx, and token count propagation into context for downstream middlewares.\n\nTwo configuration modes:\n- **Flat**: set `provider` + `model` (+ optional `api_key`, `base_url`) for a single provider.\n- **Targets map**: define named `targets` and optionally a `default_target`; the `cel` middleware selects the active target by writing `ai.target` into context.\n\nThe `fallback` list is tried in order when the resolved target returns a 5xx or a connection error.",
"description": "Configuration for the AI proxy dispatcher plugin. Exposes a unified OpenAI-compatible API and routes to LLM providers (OpenAI, Anthropic, Ollama). Supports named targets for policy-driven routing (via the `cel` middleware), provider fallback on 5xx, and token count propagation into context for downstream middlewares.\n\nThe model identifier is **always** taken from the client's `model` field on the request body (ADR-0030 §0 — caller-owned model). Gateway config declares **providers** (where to go, with what credentials), never an authoritative model list. A request that omits `model` is rejected with 400 problem+json (`urn:barbacane:error:model_required`).\n\nTwo configuration modes:\n- **Flat**: set `provider` (+ optional `api_key`, `base_url`) for a single provider.\n- **Targets map**: define named `targets` and optionally a `default_target`; the `cel` middleware selects the active target by writing `ai.target` into context.\n\nThe `fallback` list is tried in order when the resolved target returns a 5xx or a connection error.",
"type": "object",
"$defs": {
"TargetConfig": {
"type": "object",
"description": "A named provider target: provider type, model, credentials, and optional custom endpoint.",
"required": ["provider", "model"],
"description": "A named provider target: provider type, credentials, and optional custom endpoint. The model identifier comes from the client request body, never from this config (ADR-0030 §0).",
"required": ["provider"],
"additionalProperties": false,
"properties": {
"provider": {
"type": "string",
"enum": ["openai", "anthropic", "ollama"],
"description": "LLM provider. `openai` and `ollama` are OpenAI-compatible (passthrough). `anthropic` uses the Messages API with automatic request/response translation."
},
"model": {
"type": "string",
"description": "Model identifier sent to the provider (e.g. `gpt-4o`, `claude-opus-4-6`, `mistral`)."
},
"api_key": {
"type": "string",
"description": "Provider API key. Supports `${ENV_VAR}` substitution. Omit for Ollama (unauthenticated local endpoint)."
"description": "Provider API key. Supports `env://VAR` substitution. Omit for Ollama (unauthenticated local endpoint)."
},
"base_url": {
"type": "string",
@@ -39,13 +35,9 @@
"enum": ["openai", "anthropic", "ollama"],
"description": "Provider for the flat single-provider config. Required when `targets` is not defined."
},
"model": {
"type": "string",
"description": "Model identifier for the flat config (e.g. `gpt-4o`, `claude-opus-4-6`, `llama3`)."
},
"api_key": {
"type": "string",
"description": "API key for the flat config. Supports `${ENV_VAR}` substitution."
"description": "API key for the flat config. Supports `env://VAR` substitution."
},
"base_url": {
"type": "string",
@@ -76,7 +68,7 @@
},
"default_target": {
"type": "string",
"description": "Name of the target to use when no `ai.target` context key is present. Must match a key in `targets`. When omitted and no context target is set, falls back to the flat `provider`/`model` config."
"description": "Name of the target to use when no `ai.target` context key is present. Must match a key in `targets`. When omitted and no context target is set, falls back to the flat `provider` config."
}
}
}
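To make the two configuration modes concrete, a sketch of a targets-map config under the new schema. The fields follow this diff and the PR's test fixtures; the target names, the URL, and the numeric limits are invented:

```yaml
name: ai-proxy
config:
  targets:
    primary:
      provider: openai
      api_key: "env://OPENAI_API_KEY"
    local:
      provider: ollama
      base_url: "http://127.0.0.1:11434"
      timeout: 10
      max_tokens: 512
  default_target: primary
  fallback:
    - provider: ollama
      base_url: "http://127.0.0.1:11434"
      timeout: 10
      max_tokens: 512
```

No level carries a `model` key: the client's request body supplies it, and a `cel` middleware can switch targets per request by writing `ai.target` into context.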