From cba8e74d7b8af2393b51729adb0c3a62ea6e0223 Mon Sep 17 00:00:00 2001 From: Nicolas Dreno Date: Thu, 23 Apr 2026 13:15:41 +0200 Subject: [PATCH 1/3] feat(blog): add "Barbacane vs Portkey and LiteLLM" comparison (draft) Second piece of the audit's Phase 1 comparison-content strategy. Targets the "Barbacane vs Portkey" / "Barbacane vs LiteLLM" / "inbound vs outbound AI gateway" search terms. Thesis: Portkey and LiteLLM (outbound AI gateways, application -> LLM) and Barbacane (inbound AI gateway, agent -> your APIs) solve different problems and most teams end up running both. The category distinction is the high-value takeaway; specific feature compare is the usual procurement grind. Structure: - Lede: the vocabulary-collision problem - Two directions of traffic - What Portkey and LiteLLM do, concretely - What Barbacane does, concretely - How the two categories compose in a production stack - Where the overlap actually is (AI governance middleware) - Decision table by situation - Feature comparison table - Procurement advice - Closing, links back to the canonical MCP gateway post and /mcp/ Saved as draft: true with publishDate 2026-04-29 (six days out). Flip the draft flag in a follow-up commit when ready to publish. --- .../blog/barbacane-vs-portkey-litellm.md | 162 ++++++++++++++++++ 1 file changed, 162 insertions(+) create mode 100644 src/content/blog/barbacane-vs-portkey-litellm.md diff --git a/src/content/blog/barbacane-vs-portkey-litellm.md b/src/content/blog/barbacane-vs-portkey-litellm.md new file mode 100644 index 0000000..539b888 --- /dev/null +++ b/src/content/blog/barbacane-vs-portkey-litellm.md @@ -0,0 +1,162 @@ +--- +title: "Barbacane vs Portkey and LiteLLM: inbound vs outbound AI gateways" +description: "You need to stop your application from exploding the OpenAI bill. You also need to stop agents from exploding your production APIs. These are different problems with different gateways. Here is how to tell them apart, and why most teams end up running both." +publishDate: 2026-04-29 +author: "Nicolas Dreno" +tags: ["mcp-gateway", "ai-gateway", "portkey", "litellm", "comparison", "model-context-protocol", "ai-governance"] +draft: true +--- + +*If your team is running agents in production, someone has already asked about Portkey or LiteLLM. Someone else has started asking about MCP. Both groups are right, and they are not asking about the same thing.* + +The AI gateway market in early 2026 is confused by a vocabulary problem. "AI gateway" describes two categories that share exactly one letter in common, and asking a buyer to pick one over the other is like asking them to pick between a load balancer and a CDN. You probably want both. + +This post is a map. What Portkey and LiteLLM are for. What Barbacane is for. Where they overlap (almost nowhere). How they compose (they do, cleanly). When you need which. + +--- + +### Two different directions of traffic + +Start with the bottom of the stack. Every AI gateway sits in the path of one of two kinds of traffic: + +**Outbound AI traffic.** Your application calls an LLM provider. You send prompts and tokens out, you get completions back. The outbound gateway is a proxy in front of OpenAI, Anthropic, Bedrock, Gemini, or your own hosted model. Portkey and LiteLLM are outbound gateways. + +**Inbound AI traffic.** An AI agent calls your APIs. The agent discovers tools, picks one, and invokes it. The inbound gateway is a proxy in front of your own services, speaking the Model Context Protocol on the agent-facing side. 
[Barbacane](/mcp/) is an inbound gateway. + +The two travel opposite directions. That is why "AI gateway" collides badly as a term: the same word is doing double duty for two products that touch different parts of your infrastructure, solve different problems, and typically do not compete at procurement. + +--- + +### What Portkey and LiteLLM do, concretely + +Portkey and LiteLLM are not identical, but they solve the same category of problem. Both let your application call LLMs through one interface, with the operational controls an application team needs. + +Both typically provide: + +- **Provider abstraction.** Swap OpenAI for Anthropic with a config change, not a code change. +- **Fallbacks and retries.** If one provider errors, try another. If the first attempt fails, back off and try again. +- **Caching.** Identical prompts can return cached completions, saving latency and tokens. +- **Observability.** Per-call traces, token usage, latency histograms, cost dashboards. +- **Budget guardrails.** Spend limits per key, tenant, or user. +- **Prompt safety.** Varies by product, but most offer some form of scrubbing and policy enforcement on the prompt leaving your infrastructure. + +Portkey and LiteLLM do these jobs well. If your application sends prompts to an LLM, you should be running one of them, or something similar, between the application and the provider. Calling OpenAI directly from production code without this layer is a common way to turn a weekend experiment into a Monday incident. + +--- + +### What Barbacane does, concretely + +Barbacane is an [MCP gateway](/blog/what-is-an-mcp-gateway/). It does not sit between your application and OpenAI. It sits between an AI agent and your internal APIs, turning operations in your OpenAPI spec into MCP tools the agent can discover and call. + +Specifically: + +- **Tool exposure.** Your existing OpenAPI spec compiles into an MCP tool server. Every operation with an `operationId` and a `summary` becomes a typed, agent-callable tool. +- **Opt-out per operation.** Hide admin endpoints, destructive actions, or anything you do not want agents touching, with a spec annotation. +- **Middleware pass-through.** Tool calls are HTTP requests under the hood. Your existing auth, rate limits, validation, transformations, and observability apply without re-plumbing. +- **AI governance.** Prompt guarding, token limits, cost tracking, response guarding, layered on top of the API-gateway basics. +- **Spec-first.** The schema agents see is derived from the spec your API team already maintains. No parallel source of truth. + +None of this is a replacement for what Portkey or LiteLLM do. Your agent still needs an outbound LLM gateway to talk to its model. Barbacane does not talk to LLMs at all. + +--- + +### How they compose + +Most production systems running agents in 2026 look roughly like this: + +``` + [your application] + | + v + [outbound AI gateway] <- Portkey / LiteLLM + | + v + [LLM provider] <- OpenAI / Anthropic / self-hosted + | + v (the model decides to call a tool) + | + [MCP-compatible agent] + | + v + [inbound AI gateway / MCP gateway] <- Barbacane + | + v + [your APIs and services] +``` + +The outbound gateway is everything between your application and the model. The inbound gateway is everything between the agent and your infrastructure. The agent itself is the hinge. + +If you are building an agent product end-to-end, you probably want both. 
If you are only publishing APIs to be used by agents built elsewhere (which is the more common case for platform teams), you only need the inbound gateway. If you are only calling LLMs from your application without running agents against your APIs, you only need the outbound one. + +--- + +### Where the overlap actually is + +There is one real overlap worth naming, so nobody buys the same thing twice: **AI governance middleware.** + +Both categories have to think about prompt safety, cost attribution, and rate limiting. The middleware surface looks similar on paper. The differences: + +- Outbound gateways govern traffic *your application sends to the LLM*. The prompt being guarded is written by your code or your user, on the way out. +- Inbound gateways govern traffic *an agent sends to your APIs*. The prompt, if there is one, is written by the agent and has already passed through the model, on the way in. + +Both layers are useful. They catch different things. A prompt-injection pattern that slips past your outbound PII scrubber can still be caught by your inbound response guard before it reaches the user. Budget limits at the outbound layer control what you pay OpenAI. Budget limits at the inbound layer control what the agent costs *you* to serve. + +Treating them as redundant is a mistake. Running both is defense in depth. + +--- + +### When you need which: a decision table + +| Situation | You need | +|--- |--- | +| Your application calls OpenAI or Anthropic directly | Outbound AI gateway (Portkey, LiteLLM) | +| You want one interface across multiple LLM providers | Outbound AI gateway | +| You want caching, retries, fallbacks, and cost dashboards for LLM calls | Outbound AI gateway | +| Your APIs are going to be called by AI agents | Inbound AI gateway / MCP gateway | +| You want your existing OpenAPI operations to become MCP tools | MCP gateway | +| You want auth, rate limits, and audit applied to agent tool calls | MCP gateway | +| You want to cap how much agents can cost you to *serve* | MCP gateway | +| You are shipping an end-to-end agent product | Both | +| You are a platform team enabling other teams' agents | MCP gateway; outbound is their problem | + +--- + +### Feature comparison + +A compact comparison. All three products evolve quickly, so treat this as direction, not specification. Check current docs before committing. + +| Concern | Portkey | LiteLLM | Barbacane | +|--- |--- |--- |--- | +| Direction of traffic | Outbound | Outbound | Inbound | +| Primary surface | LLM provider API | LLM provider API | MCP tool server | +| Source of truth for tools | N/A | N/A | Your OpenAPI spec | +| Protocol | HTTP JSON | HTTP JSON | JSON-RPC 2.0 (MCP) | +| Runtime | SaaS or self-host | Python proxy, self-host | Rust binary, self-host | +| License | Commercial + OSS core | Open source (MIT) | AGPLv3 + commercial | +| Typical buyer | Application team | Application team | Platform or AI team | +| Governs agent calls to your APIs | No | No | Yes | +| Governs calls to LLM provider | Yes | Yes | No | + +Where a row says "No", the product was not designed for that concern. Forcing a tool into the wrong role is how shadow stacks start. + +--- + +### What to watch for during procurement + +If you are being pitched an "AI gateway" and the direction of traffic is not the first slide, ask. If you are being told a product does both inbound and outbound, dig into which side has actual engineering depth. 
Most products excel on one side and offer thin coverage on the other, usually because the sides solve structurally different problems and require different primitives. + +The healthy procurement pattern: + +1. **Decide which direction matters.** If you are a platform team, start with inbound. If you are an application team, start with outbound. If you are both, buy both. +2. **Prefer specialists over generalists.** A sharp inbound gateway and a sharp outbound gateway compose cleanly. A vague do-everything gateway usually means you rebuild the missing side yourself later. +3. **Check the seam.** Verify that the outbound gateway's audit logs can correlate with the inbound gateway's audit logs, ideally on the agent identity. When an incident happens, you will want to follow the request from application through LLM through agent through your API without stitching together three different telemetry stacks. + +--- + +### Closing thoughts + +Portkey and LiteLLM are good at what they do. Barbacane is good at what it does. They are not in the same evaluation. + +If you are running agents in production and your infrastructure includes only one of these two gateway categories, you have a blind spot. If you are still deciding which to adopt first, ask which direction of traffic is currently unguarded. That is the gateway you need next. + +For the inbound side, [Barbacane's /mcp page](/mcp/) is the five-minute version. For the outbound side, Portkey's and LiteLLM's own docs are the right place to start. The category distinction is doing most of the work; from there, evaluation is the usual procurement grind. From 71f14de68d72612f146a76c93e3e0c73e4d7fed3 Mon Sep 17 00:00:00 2001 From: Nicolas Dreno Date: Thu, 23 Apr 2026 13:39:43 +0200 Subject: [PATCH 2/3] fix(blog): rewrite Portkey/LiteLLM comparison against actual Barbacane AI capability Previous draft was factually wrong: it claimed Barbacane is inbound-only and does not talk to LLMs. In fact: - `ai-proxy` dispatcher (shipped, docs.barbacane.dev/guide/dispatchers.html#ai-proxy) is an outbound AI proxy that routes to OpenAI, Anthropic, Ollama, and any OpenAI-compat endpoint; supports provider fallback, named targets for policy-driven routing, pinned provider API versions, SSE streaming for OpenAI-compatible providers. - ADR-0024 names Kong AI Gateway, LiteLLM, Portkey, KrakenD as direct competitors and positions Barbacane's spec-driven + WASM-composable approach as the differentiator. - PR #67 (feat: AI gateway middleware suite) ships the four AI governance middlewares that compose around ai-proxy. - PR #69 (ADR-0030) extends ai-proxy with OpenAI Responses API support and dynamic model routing. Rewrite: - Title changed from "inbound vs outbound AI gateways" to "picking an AI gateway in 2026". Inbound/outbound is still useful taxonomy, but the dichotomy is wrong for Barbacane. - New thesis: all three ship outbound LLM proxy competently; the axis that differentiates them is how AI gateway relates to the rest of the gateway (monolithic AI proxy vs dispatcher + middlewares, config file vs OpenAPI spec, AI-only vs also-MCP). - Decision table and feature table rebuilt honestly; Barbacane shows Yes on outbound-related concerns now. - Kept MCP as an additional capability that Portkey/LiteLLM do not have, not as the entire reason to pick Barbacane. 
Claims verified against:
- docs.barbacane.dev/guide/dispatchers.html (ai-proxy)
- barbacane-dev/barbacane adr/0024-ai-gateway-plugin.md
- barbacane-dev/barbacane#67 (AI gateway middleware suite)
- barbacane-dev/barbacane#69 (ADR-0030 Responses API)
---
 .../blog/barbacane-vs-portkey-litellm.md | 205 ++++++++++---------
 1 file changed, 110 insertions(+), 95 deletions(-)

diff --git a/src/content/blog/barbacane-vs-portkey-litellm.md b/src/content/blog/barbacane-vs-portkey-litellm.md
index 539b888..e22ea1e 100644
--- a/src/content/blog/barbacane-vs-portkey-litellm.md
+++ b/src/content/blog/barbacane-vs-portkey-litellm.md
@@ -1,162 +1,177 @@
 ---
-title: "Barbacane vs Portkey and LiteLLM: inbound vs outbound AI gateways"
-description: "You need to stop your application from exploding the OpenAI bill. You also need to stop agents from exploding your production APIs. These are different problems with different gateways. Here is how to tell them apart, and why most teams end up running both."
+title: "Barbacane vs Portkey and LiteLLM: picking an AI gateway in 2026"
+description: "Portkey, LiteLLM, and Barbacane all ship an outbound AI gateway. Where they diverge is what else the gateway does: spec-first routing, MCP for the inbound direction, and composition with the rest of your API governance. An honest comparison for teams picking one."
 publishDate: 2026-04-29
 author: "Nicolas Dreno"
-tags: ["mcp-gateway", "ai-gateway", "portkey", "litellm", "comparison", "model-context-protocol", "ai-governance"]
+tags: ["ai-gateway", "mcp-gateway", "portkey", "litellm", "comparison", "model-context-protocol", "ai-governance"]
 draft: true
 ---

-*If your team is running agents in production, someone has already asked about Portkey or LiteLLM. Someone else has started asking about MCP. Both groups are right, and they are not asking about the same thing.*
+*If you are picking an AI gateway in 2026, Portkey, LiteLLM, and Barbacane are all real options. They overlap enough to make the choice real, and they differ enough that the right answer depends on what else you want your gateway to do.*

-The AI gateway market in early 2026 is confused by a vocabulary problem. "AI gateway" describes two categories that share exactly one letter in common, and asking a buyer to pick one over the other is like asking them to pick between a load balancer and a CDN. You probably want both.
+Every AI-gateway evaluation runs into the same question after the first demo: once your OpenAI calls go through a gateway, what about everything else? The rate limits your platform team owns, the auth your security team owns, the audit trail your compliance team owns, the spec-first workflow your API team relies on, the agents calling back the other way. The more of that lives next to the AI traffic, the more the choice of AI gateway becomes an architecture decision and not a feature-matching exercise.

-This post is a map. What Portkey and LiteLLM are for. What Barbacane is for. Where they overlap (almost nowhere). How they compose (they do, cleanly). When you need which.
+This post compares the three products on that axis. What they share. What separates them. How to pick.

 ---

-### Two different directions of traffic
+### The overlap: outbound LLM proxying

-Start with the bottom of the stack. Every AI gateway sits in the path of one of two kinds of traffic:
+All three products sit between your application and one or more LLM providers. All three give you:

-**Outbound AI traffic.** Your application calls an LLM provider.
You send prompts and tokens out, you get completions back. The outbound gateway is a proxy in front of OpenAI, Anthropic, Bedrock, Gemini, or your own hosted model. Portkey and LiteLLM are outbound gateways. +- **Provider abstraction** with an OpenAI-compatible API surface +- **Fallback chains** when a provider errors, times out, or is unreachable +- **Token usage and latency metrics** per call and per provider +- **Budget and rate-limit guardrails** at the gateway layer +- **Prompt and response guardrails** (scope varies by product) -**Inbound AI traffic.** An AI agent calls your APIs. The agent discovers tools, picks one, and invokes it. The inbound gateway is a proxy in front of your own services, speaking the Model Context Protocol on the agent-facing side. [Barbacane](/mcp/) is an inbound gateway. - -The two travel opposite directions. That is why "AI gateway" collides badly as a term: the same word is doing double duty for two products that touch different parts of your infrastructure, solve different problems, and typically do not compete at procurement. +If outbound LLM proxy is all you need, all three will work. The differences show up in what else the gateway does, how it is configured, and what happens when your requirements grow beyond the LLM path. --- -### What Portkey and LiteLLM do, concretely +### What Portkey is + +Portkey is a commercial AI gateway, available as managed SaaS or self-hosted. It focuses specifically on the LLM path and invests heavily in the operator experience: a configuration UI, a playground, a prompt library, an observability dashboard purpose-built for LLM traffic. It tends to be the right pick if you want an AI gateway as a product (vendor support, managed upgrades, fancy UI) and AI is the thing your team cares about most. + +### What LiteLLM is -Portkey and LiteLLM are not identical, but they solve the same category of problem. Both let your application call LLMs through one interface, with the operational controls an application team needs. +LiteLLM is an open-source Python proxy that exposes a very broad set of LLM providers behind one unified OpenAI-compatible API. Actively developed, wide provider coverage, can run as a Python library or as a proxy server. Good pick if you want broad provider support, an MIT-licensed OSS foundation, and a Python-native runtime that plays well with your ML tooling. -Both typically provide: +### What Barbacane is -- **Provider abstraction.** Swap OpenAI for Anthropic with a config change, not a code change. -- **Fallbacks and retries.** If one provider errors, try another. If the first attempt fails, back off and try again. -- **Caching.** Identical prompts can return cached completions, saving latency and tokens. -- **Observability.** Per-call traces, token usage, latency histograms, cost dashboards. -- **Budget guardrails.** Spend limits per key, tenant, or user. -- **Prompt safety.** Varies by product, but most offer some form of scrubbing and policy enforcement on the prompt leaving your infrastructure. +Barbacane is an open-source, Rust-native API gateway. AI capability is built from composable plugins rather than a monolithic feature: -Portkey and LiteLLM do these jobs well. If your application sends prompts to an LLM, you should be running one of them, or something similar, between the application and the provider. Calling OpenAI directly from production code without this layer is a common way to turn a weekend experiment into a Monday incident. 
+- **`ai-proxy` dispatcher** routes requests to OpenAI, Anthropic, and Ollama (plus any OpenAI-compatible endpoint: vLLM, TGI, LocalAI, Azure). The client always sends OpenAI format; the dispatcher translates per provider, pins the provider API version, and handles SSE streaming where the provider supports it.
+- **Named targets + `cel` middleware** express policy-driven routing. A target like `premium` is a full provider profile (provider, model, credentials); the `cel` middleware writes `ai.target` into the request context when a rule matches, and the dispatcher picks the target from there. Credentials never leave dispatcher config.
+- **`ai-prompt-guard`, `ai-token-limit`, `ai-cost-tracker`, `ai-response-guard`** middlewares compose around the dispatcher. Each is a separate, skippable concern with named profiles, CEL expressions, and fail-closed defaults on misconfiguration.
+
+And one more capability Portkey and LiteLLM do not offer: Barbacane is also an [MCP gateway](/mcp/). The same artifact that proxies your LLM traffic outbound also exposes your existing APIs to AI agents as tools inbound. One gateway covers both directions of AI traffic.

 ---

-### What Barbacane does, concretely
+### The architectural difference: monolithic AI proxy vs dispatcher plus middlewares

-Barbacane is an [MCP gateway](/blog/what-is-an-mcp-gateway/). It does not sit between your application and OpenAI. It sits between an AI agent and your internal APIs, turning operations in your OpenAPI spec into MCP tools the agent can discover and call.
+This is where the three products diverge.

-Specifically:
+Portkey and LiteLLM treat the AI gateway as a unified product: one binary, one config, one API surface. Every operational concern (rate limits, caching, observability, guardrails) is a feature baked into the proxy. This is the right shape when AI is the only traffic the gateway handles.

-- **Tool exposure.** Your existing OpenAPI spec compiles into an MCP tool server. Every operation with an `operationId` and a `summary` becomes a typed, agent-callable tool.
-- **Opt-out per operation.** Hide admin endpoints, destructive actions, or anything you do not want agents touching, with a spec annotation.
-- **Middleware pass-through.** Tool calls are HTTP requests under the hood. Your existing auth, rate limits, validation, transformations, and observability apply without re-plumbing.
-- **AI governance.** Prompt guarding, token limits, cost tracking, response guarding, layered on top of the API-gateway basics.
-- **Spec-first.** The schema agents see is derived from the spec your API team already maintains. No parallel source of truth.
+Barbacane treats the AI gateway as a set of primitives you compose:

-None of this is a replacement for what Portkey or LiteLLM do. Your agent still needs an outbound LLM gateway to talk to its model. Barbacane does not talk to LLMs at all.
+- The `ai-proxy` dispatcher handles translation and routing.
+- Each concern is a separate middleware, ordered explicitly in the spec.
+- You stack the middlewares you need, skip the ones you do not, and compose multiple instances of the same plugin (stack two `ai-token-limit` instances for separate per-minute and per-hour windows, or stack multiple `cel` rules for routing).
+- The exact same primitives govern non-AI traffic on the same gateway.

---
+The trade-off is sharp. If you want the shortest path from zero to "OpenAI call via a gateway", Portkey and LiteLLM win on time-to-live.
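+A sketch of what the composed side looks like on a single route. The `x-barbacane-middlewares` key, the profile names, and the config fields below are illustrative assumptions rather than the documented schema: the shape is the point, one entry per concern, in explicit order, with the same plugin stacked twice for two limit windows.
+
+```yaml
+# Illustrative sketch only; consult the plugin docs for the real config keys.
+x-barbacane-middlewares:
+  - name: cel
+    config:
+      # When the rule matches, write `ai.target` into the request context;
+      # the ai-proxy dispatcher resolves it to the `premium` named target.
+      rule: 'request.headers["x-tier"] == "premium"'
+      set:
+        ai.target: premium
+  - name: ai-token-limit            # first instance: short window
+    config: { window: 1m, max_tokens: 20000 }
+  - name: ai-token-limit            # second instance of the same plugin: long window
+    config: { window: 1h, max_tokens: 400000 }
+  - name: ai-cost-tracker
+    config:
+      attribute_by: 'request.headers["x-tenant-id"]'
+```
+
+Swapping a guard, tightening a window, or adding a second routing rule is an edit to this list, not a new product.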
If you want AI traffic governed the same way your team already governs every other HTTP request, Barbacane's composition model gets you there without a second product to run, a second config source to reconcile, or a second telemetry stack to watch. -### How they compose +The architectural bet is the same one the service-mesh community made five years ago: specialized proxies for specialized traffic, or one data plane that handles every protocol your platform cares about. Both are valid; they produce different operational footprints. -Most production systems running agents in 2026 look roughly like this: +--- -``` - [your application] - | - v - [outbound AI gateway] <- Portkey / LiteLLM - | - v - [LLM provider] <- OpenAI / Anthropic / self-hosted - | - v (the model decides to call a tool) - | - [MCP-compatible agent] - | - v - [inbound AI gateway / MCP gateway] <- Barbacane - | - v - [your APIs and services] +### Spec-first: OpenAPI as source of truth + +Portkey and LiteLLM configure AI routes in their own config files (YAML for LiteLLM, config UI or SDK for Portkey). Barbacane configures AI routes in your OpenAPI spec: + +```yaml +paths: + /v1/chat/completions: + post: + operationId: chatCompletion + summary: Route LLM chat completion requests + x-barbacane-dispatch: + name: ai-proxy + config: + provider: openai + model: gpt-4o + api_key: "${OPENAI_API_KEY}" + fallback: + - provider: anthropic + model: claude-sonnet-4-20250514 + api_key: "${ANTHROPIC_API_KEY}" + - provider: ollama + model: llama3 + base_url: http://ollama:11434 ``` -The outbound gateway is everything between your application and the model. The inbound gateway is everything between the agent and your infrastructure. The agent itself is the hinge. +The documentation your frontend team reads, the client SDKs they generate, the contracts your platform team enforces, and the gateway config your SRE team operates all derive from the same file. Adding an LLM route adds an entry in the spec. Renaming a parameter renames it everywhere. Vacuum-based lint runs shift-left in your editor, in a pre-commit hook, or in CI, so provider typos and invalid regex patterns fail at lint time, not at call time. -If you are building an agent product end-to-end, you probably want both. If you are only publishing APIs to be used by agents built elsewhere (which is the more common case for platform teams), you only need the inbound gateway. If you are only calling LLMs from your application without running agents against your APIs, you only need the outbound one. +If your organization is already spec-first for non-AI APIs, extending that discipline to AI routes is the cheapest integration path. If you do not run spec-first APIs, Portkey and LiteLLM feel more familiar because they do not ask you to change your workflow. --- -### Where the overlap actually is - -There is one real overlap worth naming, so nobody buys the same thing twice: **AI governance middleware.** +### The inbound direction: MCP -Both categories have to think about prompt safety, cost attribution, and rate limiting. The middleware surface looks similar on paper. The differences: +One axis Portkey and LiteLLM do not compete on. -- Outbound gateways govern traffic *your application sends to the LLM*. The prompt being guarded is written by your code or your user, on the way out. -- Inbound gateways govern traffic *an agent sends to your APIs*. The prompt, if there is one, is written by the agent and has already passed through the model, on the way in. 
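+To make the inbound direction concrete: a spec-driven MCP gateway derives agent-callable tools from ordinary API operations. The snippet below is plain OpenAPI; the `x-barbacane-mcp-exclude` annotation is a hypothetical name for the per-operation opt-out (the capability is described in the docs, the exact extension key is not quoted here).
+
+```yaml
+paths:
+  /orders/{orderId}:
+    get:
+      # operationId + summary are enough for a spec-driven gateway to
+      # surface this operation as a typed, agent-callable MCP tool.
+      operationId: getOrder
+      summary: Fetch a single order by id
+      parameters:
+        - name: orderId
+          in: path
+          required: true
+          schema:
+            type: string
+  /orders/{orderId}/refund:
+    post:
+      operationId: refundOrder
+      summary: Refund an order
+      # Hypothetical opt-out: keep destructive operations out of the
+      # agent-facing tool list without touching the HTTP route.
+      x-barbacane-mcp-exclude: true
+```
+
+A tool call is an HTTP request underneath, so the auth, rate limits, and audit middleware already on the route apply to agents as well.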
+Portkey and LiteLLM sit between your application and the LLM. They do not stand between an AI agent and your APIs. That inbound direction is a different gateway category; we covered it at length in the [canonical MCP gateway post](/blog/what-is-an-mcp-gateway/). -Both layers are useful. They catch different things. A prompt-injection pattern that slips past your outbound PII scrubber can still be caught by your inbound response guard before it reaches the user. Budget limits at the outbound layer control what you pay OpenAI. Budget limits at the inbound layer control what the agent costs *you* to serve. +Barbacane is a full MCP gateway in addition to its outbound AI capability. One artifact handles both directions. Whether that matters depends on whether agents calling your APIs is in scope: -Treating them as redundant is a mistake. Running both is defense in depth. +- If you are building an agent product and your agents only hit public tools and third-party services, the inbound direction does not apply and the MCP capability is not doing work for you. +- If your agents call your internal APIs, or if you are a platform team preparing to expose internal APIs to agents built elsewhere, the inbound direction is real work. Barbacane treats it as a first-class concern. Portkey and LiteLLM leave it outside the gateway entirely, which means a separate MCP server per service and all the sprawl the canonical post describes. --- -### When you need which: a decision table +### When to pick which -| Situation | You need | -|--- |--- | -| Your application calls OpenAI or Anthropic directly | Outbound AI gateway (Portkey, LiteLLM) | -| You want one interface across multiple LLM providers | Outbound AI gateway | -| You want caching, retries, fallbacks, and cost dashboards for LLM calls | Outbound AI gateway | -| Your APIs are going to be called by AI agents | Inbound AI gateway / MCP gateway | -| You want your existing OpenAPI operations to become MCP tools | MCP gateway | -| You want auth, rate limits, and audit applied to agent tool calls | MCP gateway | -| You want to cap how much agents can cost you to *serve* | MCP gateway | -| You are shipping an end-to-end agent product | Both | -| You are a platform team enabling other teams' agents | MCP gateway; outbound is their problem | +| Situation | Pick | +|--- |--- | +| Fastest path from zero to an OpenAI call via a gateway, with an operator UI | Portkey | +| Very broad LLM provider coverage, Python-native, OSS-first | LiteLLM | +| Managed SaaS with vendor support and a polished dashboard | Portkey | +| AI gateway as part of a broader API gateway, not a second box | Barbacane | +| AI routes defined in your OpenAPI spec alongside the rest of your API | Barbacane | +| Same gateway also exposes your APIs to AI agents via MCP | Barbacane | +| OSS, self-hostable, Rust-native, FIPS-ready for regulated-industry posture | Barbacane | +| Platform team; AI is one of many gateway concerns (auth, routing, observability) | Barbacane | +| AI-first product team; LLM calls are the only traffic the gateway proxies | Portkey or LiteLLM | --- ### Feature comparison -A compact comparison. All three products evolve quickly, so treat this as direction, not specification. Check current docs before committing. 
- -| Concern | Portkey | LiteLLM | Barbacane | -|--- |--- |--- |--- | -| Direction of traffic | Outbound | Outbound | Inbound | -| Primary surface | LLM provider API | LLM provider API | MCP tool server | -| Source of truth for tools | N/A | N/A | Your OpenAPI spec | -| Protocol | HTTP JSON | HTTP JSON | JSON-RPC 2.0 (MCP) | -| Runtime | SaaS or self-host | Python proxy, self-host | Rust binary, self-host | -| License | Commercial + OSS core | Open source (MIT) | AGPLv3 + commercial | -| Typical buyer | Application team | Application team | Platform or AI team | -| Governs agent calls to your APIs | No | No | Yes | -| Governs calls to LLM provider | Yes | Yes | No | +A compact, direction-setting comparison. All three products evolve; check current docs before committing. + +| Concern | Portkey | LiteLLM | Barbacane | +|--- |--- |--- |--- | +| Outbound LLM proxy | Yes | Yes | Yes (`ai-proxy` dispatcher) | +| Inbound MCP gateway | No | No | Yes | +| Provider coverage | Broad | Very broad (100+ models) | OpenAI, Anthropic, Ollama, plus any OpenAI-compat API | +| Provider fallback | Yes | Yes | Yes | +| Policy-driven routing | Yes | Yes | Yes (via `cel` middleware + named targets) | +| Prompt and response guardrails | Built in | Built in | `ai-prompt-guard` + `ai-response-guard` middlewares | +| Token rate limits | Built in | Built in | `ai-token-limit` middleware | +| Cost tracking | Built-in dashboard | Built-in metrics | `ai-cost-tracker` middleware | +| Source of truth for config | Config UI or SDK | YAML config | OpenAPI spec | +| Runtime | SaaS and self-host | Python proxy | Rust binary | +| License | Commercial | MIT | AGPLv3 + commercial | +| Governs non-AI HTTP traffic | No | No | Yes (full API gateway) | Where a row says "No", the product was not designed for that concern. Forcing a tool into the wrong role is how shadow stacks start. @@ -143,20 +153,25 @@ Where a row says "No", the product was not designed for that concern. Forcing a ### What to watch for during procurement -If you are being pitched an "AI gateway" and the direction of traffic is not the first slide, ask. If you are being told a product does both inbound and outbound, dig into which side has actual engineering depth. Most products excel on one side and offer thin coverage on the other, usually because the sides solve structurally different problems and require different primitives. +If you are being pitched an AI gateway and the first question is "do you already run an API gateway?", you are in the right conversation. If it is not asked, ask it yourself. The answer changes what you need from the new product. -The healthy procurement pattern: +A short procurement checklist: -1. **Decide which direction matters.** If you are a platform team, start with inbound. If you are an application team, start with outbound. If you are both, buy both. -2. **Prefer specialists over generalists.** A sharp inbound gateway and a sharp outbound gateway compose cleanly. A vague do-everything gateway usually means you rebuild the missing side yourself later. -3. **Check the seam.** Verify that the outbound gateway's audit logs can correlate with the inbound gateway's audit logs, ideally on the agent identity. When an incident happens, you will want to follow the request from application through LLM through agent through your API without stitching together three different telemetry stacks. +1. **Where does AI gateway config live?** If the answer is "a second config file", you are creating a drift source. 
Prefer products that integrate with the spec or config surface your team already uses.
+2. **Is the feature set monolithic or composable?** Monolithic is simpler on day one and harder to extend. Composable is more to learn and easier to shape to your operational model.
+3. **Does it govern agent traffic too?** If agents calling your APIs is on your roadmap, ask about MCP. If not, skip.
+4. **How does it integrate with your observability stack?** Prometheus, OpenTelemetry, structured logs. Avoid products that ship their own telemetry stack that you then have to consume separately.
+5. **Self-hosting path and license.** SaaS is fine for many teams; regulated, on-prem, or air-gapped environments will need an OSS, self-hostable option.

---

### Closing thoughts

-Portkey and LiteLLM are good at what they do. Barbacane is good at what it does. They are not in the same evaluation.
+All three products handle the core outbound LLM path competently. The axis that differentiates them is how the AI gateway relates to the rest of your infrastructure:
+
+- If AI is the primary problem and the AI gateway stands alone, **Portkey or LiteLLM** will get you live faster. Pick Portkey if you want SaaS with a UI. Pick LiteLLM if you want OSS breadth and a Python runtime.
+- If AI is one of several gateway concerns and you want one spec-first artifact covering auth, rate limits, routing, AI, and MCP, **Barbacane** is the architecture fit.

-If you are running agents in production and your infrastructure includes only one of these two gateway categories, you have a blind spot. If you are still deciding which to adopt first, ask which direction of traffic is currently unguarded. That is the gateway you need next.
+Pick by architecture, not feature count. The feature sets will converge; the architectural assumptions will not.

-For the inbound side, [Barbacane's /mcp page](/mcp/) is the five-minute version. For the outbound side, Portkey's and LiteLLM's own docs are the right place to start. The category distinction is doing most of the work; from there, evaluation is the usual procurement grind.
+For the Barbacane side of the comparison, [the /mcp page](/mcp/) is the five-minute version, and the [canonical MCP gateway post](/blog/what-is-an-mcp-gateway/) is the longer read. For Portkey and LiteLLM, their own docs are the right place to start; their positioning is consistent enough that a fair comparison is easier now than it was a year ago.

From a1242e1a08d12c019821cae61916a9a14489ca99 Mon Sep 17 00:00:00 2001
From: Nicolas Dreno
Date: Wed, 29 Apr 2026 19:19:28 +0200
Subject: [PATCH 3/3] fix(blog): publish Barbacane vs Portkey/LiteLLM article
 (remove draft flag)

---
 src/content/blog/barbacane-vs-portkey-litellm.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/content/blog/barbacane-vs-portkey-litellm.md b/src/content/blog/barbacane-vs-portkey-litellm.md
index e22ea1e..e83946a 100644
--- a/src/content/blog/barbacane-vs-portkey-litellm.md
+++ b/src/content/blog/barbacane-vs-portkey-litellm.md
@@ -4,7 +4,7 @@ description: "Portkey, LiteLLM, and Barbacane all ship an outbound AI gateway. W
 publishDate: 2026-04-29
 author: "Nicolas Dreno"
 tags: ["ai-gateway", "mcp-gateway", "portkey", "litellm", "comparison", "model-context-protocol", "ai-governance"]
-draft: true
+draft: false
 ---

 *If you are picking an AI gateway in 2026, Portkey, LiteLLM, and Barbacane are all real options. They overlap enough to make the choice real, and they differ enough that the right answer depends on what else you want your gateway to do.*