Docs: Propose AI Gateway CRDs and Controller by usize · Pull Request #419 · kagenti/kagenti-operator

usize · 2026-06-09T22:57:57Z

Summary

This proposal is derived from the upstream wg-ai-gateway project -- part of Kubernetes SIG-Network.

It targets Envoy AI Gateway as a proof of concept, but is meant to be extended to support additional proxies.

A novel integration with SPIFFE is proposed, as a value add unique to our platform.

Related issue(s)

This proposal is derived from the upstream wg-ai-gateway project -- part of Kubernetes SIG-Network. It targets Envoy AI Gateway as a proof of concept, but is meant to be extended to support additional proxies. A novel integration with SPIFFE is proposed, as a value add unique to our platform. Signed-off-by: usize <mofoster@redhat.com>

Signed-off-by: usize <mofoster@redhat.com>

pdettori · 2026-06-10T02:19:23Z

@usize I like the direction where this is going !

pdettori

Great direction — this is essentially the "service mesh for LLM calls" layer that completes Kagenti's infrastructure story. Kagenti already provides framework-neutral auth, identity (SPIFFE), and scaling for agents; adding platform-level control over the inference egress path is the natural next step. The renderer abstraction and alignment with WG AI Gateway proposals are solid forward-looking choices.

A few comments inline on areas that would strengthen the proposal.

pdettori · 2026-06-10T03:05:10Z

+  ├── read SPIRE trust bundle ConfigMap
+  ├── parse SPIFFE JSON → extract x509-svid certs → PEM
+  ├── create/update CA Secret
+  ├── create/update self-signed server cert (if no serverCertRef)


Status conditions need to be defined. The reconciler mentions "update status conditions" but the proposal never specifies what conditions are set. For production readiness, operators need clear signals: Is the gateway bound? Are providers valid? Is the SPIFFE bundle healthy?

Suggest adding a concrete .status.conditions schema for both CRDs, e.g.:

GatewayBound — target Gateway found and accepted

ProvidersConfigured — all referenced providers validated (credentials exist, endpoints reachable)

RoutingApplied — downstream resources generated successfully

BundleReady (AIAccessPolicy) — SPIFFE bundle parsed with ≥1 valid cert

Without these, debugging "why isn't my model routing?" becomes guesswork.

Err, @pdettori I gave Claude access to the gh tool and it disobeyed my global CLAUDE.md and posted a comment without my permission. I have a separate account and token (https://github.com/usize-agent) which is configured in my sandboxes. Unfortunately this instance was a quick session that I spun up without using my sandbox command. Lesson learned. Apologies to post AI garble. I'm updating the proposal. :]

pdettori · 2026-06-10T03:05:10Z

+
+### Gateway API 
+
+[Gateway API] is the Kubernetes-native standard for configuring network


Agent integration example missing. This is the key value prop — "tokenless inference access to agent workloads" — but the proposal doesn't show how an agent actually consumes the gateway. What URL does it call? How does it present its SPIFFE cert? Does the webhook sidecar handle this automatically?

A short end-to-end example connecting AIAccessPolicy → Gateway → Agent (showing the env var or mount the agent uses) would make this much more tangible and help reviewers validate the design against real usage patterns.

Also worth clarifying: how does this coexist with AuthBridge/AuthProxy if both are on the same namespace? And is this orthogonal to the existing MCP Gateway (MCP = protocol routing, AI Gateway = model routing)?

I've addressed Agent integration with AIAccessPolicy via the Why: tokenless inference access section.

I've touched on support for additional protocols e.g., MCP and integration with projects like AuthBridge as well.

The tl;dr is, yes, we can support MCP and A2A policies and I gave an example of what it would look like. But it's out of scope for our initial implementation, which is focused on governing inference.

pdettori · 2026-06-10T03:05:10Z

+routing. This maps directly to Envoy AI Gateway's rate limiting
+mechanism — one rule, one model header selector, one Redis counter —
+with no ambiguity in descriptor grouping.
+


Multi-tenancy / namespace scoping for rate limits. The rate limit design uses x-ai-eg-model header matching with one Redis counter per model. But in a multi-team deployment:

If two teams' AIRoutingPolicies define a model named gpt-4o against the same Gateway, do they share a Redis counter? That would silently pool their quotas.

If they use separate Gateways, is that documented as the isolation boundary?

Suggest clarifying whether model names are namespace-scoped in the Redis descriptor key (e.g., <namespace>/<model-name>) or whether the expectation is one Gateway per team.

More broadly: a short section on the namespace model (can teams share a Gateway? must each own one?) would help operators plan their deployment topology.

If two teams' AIRoutingPolicies define a model named gpt-4o against the same Gateway, do they share a Redis counter? That would silently pool their quotas.

In the current design. Yes. I punted that down below via:

Per-user or per-tenant rate limiting (multi-tenant cost allocation) is
a separate concern. If needed, a future policy CRD could attach to the
Gateway to layer per-client quotas on top of per-model limits.

Where per-tenant policy should live is a difficult question. If we create a different AIRoutingPolicy per tenant, we will end up duplicating a lot of boilerplate.

The model I had considered moving toward was to treat AIRoutingPolicy as a global configuration. With token rate limits etc... existing as defaults.

Then we could add another CRD for per-tenant configurations that overrides it. Let me think this through and update the proposal with something tidier here.

davidhadas · 2026-06-10T15:30:16Z

Hi, the naming of this GW is very confusing in the context of an agent platform.
From an agent platform POV it is an LLM GW, not an AI GW.
The agent platform offer ai services to users by deploying agents. These agents access an LLM resource, other agents, MCP resource and maybe other resources. ANy access to an LLM resource goes via the LLM GW - this makes sense :)

usize · 2026-06-11T23:08:18Z

Re: naming

The idea is to support the class of proxies broadly being called AI Gateways. See e.g., https://github.com/kubernetes-sigs/wg-ai-gateway

It's a term of art being used more broadly. For example, https://github.com/BerriAI/litellm is one of the most popular "AI Gateway" projects in use in agentic systems today.

What I will say, is that while governing inference is the initial scope. Support for programming a proxy with body based policies opens up the possibility of governing A2A traffic, MCP traffic etc....

Calling it an inference gateway operator would contradict the positioning of the proxies it's meant to program like Envoy AI Gateway, AgentGateway and others.

Signed-off-by: usize <mofoster@redhat.com>

…Pipeline Signed-off-by: usize <mofoster@redhat.com>

Signed-off-by: usize <mofoster@redhat.com>

usize · 2026-06-12T00:32:37Z

@pdettori I've added a number of new sections here.

fleshed out status conditions.
described both the why and the mechanism behind tokenless inference access via AIAccessPolicy
deferred client access to a followup -- I want to keep this proposal focused on programming the Gateway.
deferred multi-tenancy but elaborated on a possible solution via BackendTrafficPolicy-style precedence along with some open questions we'll need to answer.
elaborated on future support paths for Providers that speak protocols like MCP or A2A, along with Inference request protocols.
laid the foundation for inclusion of a PayloadProcessingPipeline to configure arbitrary body based policies in the Gateway. This is out of scope for this proposal, however, the discussion here should give us a good jumping off point when we area ready to tackle it.

…stream. Signed-off-by: usize <mofoster@redhat.com>

usize requested a review from a team as a code owner June 9, 2026 22:57

The title doesn't need to mention policy attachment patterns.

0132f91

Signed-off-by: usize <mofoster@redhat.com>

pdettori reviewed Jun 10, 2026

View reviewed changes

usize added 7 commits June 11, 2026 17:16

Enumerate status conditions for resources.

b8df5ff

Signed-off-by: usize <mofoster@redhat.com>

Describe client integration as out of scope.

c519bdb

Signed-off-by: usize <mofoster@redhat.com>

Defer multi-tenancy and multi-protocol support.

6e8e71c

Signed-off-by: usize <mofoster@redhat.com>

Lay the foundation for a future PayloadProcessingPipeline resource.

49ebb83

Signed-off-by: usize <mofoster@redhat.com>

Remove superfluous guardrails section | replaced by PayloadProcessing…

46f6905

…Pipeline Signed-off-by: usize <mofoster@redhat.com>

Explain the value of mTLS based inference access.

371dd51

Signed-off-by: usize <mofoster@redhat.com>

Fix inconsistencies leftover from new section additions.

8e3778f

Signed-off-by: usize <mofoster@redhat.com>

usize requested a review from pdettori June 12, 2026 00:32

Separate protocol from schema | align terminology with Gateway API up…

aebb753

…stream. Signed-off-by: usize <mofoster@redhat.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Docs: Propose AI Gateway CRDs and Controller#419

Docs: Propose AI Gateway CRDs and Controller#419
usize wants to merge 10 commits into
kagenti:mainfrom
usize:plan/ai-gateway

usize commented Jun 9, 2026

Uh oh!

pdettori commented Jun 10, 2026

Uh oh!

pdettori left a comment

Uh oh!

pdettori Jun 10, 2026

Uh oh!

usize Jun 11, 2026 •

edited

Loading

Uh oh!

pdettori Jun 10, 2026

Uh oh!

usize Jun 12, 2026

Uh oh!

pdettori Jun 10, 2026

Uh oh!

usize Jun 10, 2026

Uh oh!

davidhadas commented Jun 10, 2026

Uh oh!

usize commented Jun 11, 2026 •

edited

Loading

Uh oh!

usize commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants


		### Gateway API

		[Gateway API] is the Kubernetes-native standard for configuring network

Conversation

usize commented Jun 9, 2026

Summary

Related issue(s)

Uh oh!

pdettori commented Jun 10, 2026

Uh oh!

pdettori left a comment

Choose a reason for hiding this comment

Uh oh!

pdettori Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

usize Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pdettori Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

usize Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

pdettori Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

usize Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

davidhadas commented Jun 10, 2026

Uh oh!

usize commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

usize commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

usize Jun 11, 2026 •

edited

Loading

usize commented Jun 11, 2026 •

edited

Loading