diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc
index d941010..31bb54e 100644
--- a/modules/ROOT/nav.adoc
+++ b/modules/ROOT/nav.adoc
@@ -11,7 +11,7 @@
 ** xref:connect:architecture-patterns.adoc[Choose an Agent Architecture]
 ** xref:connect:system-prompts.adoc[Write Effective System Prompts]
 ** xref:connect:create-agent.adoc[Create an Agent]
-** xref:connect:byoa-register.adoc[Register Your Own Agent (BYOA)]
+** xref:connect:self-managed-agents.adoc[Set Up a Self-Managed Agent]
 ** xref:connect:triggers/overview.adoc[Trigger Agents from External Channels]
 *** xref:connect:triggers/microsoft-teams.adoc[Connect an Agent to Microsoft Teams]
@@ -55,8 +55,6 @@
 ** xref:monitor:monitor-agents.adoc[Monitor Agent Activity]
 ** xref:monitor:transcripts.adoc[See What Your Agent Did]
 ** xref:monitor:troubleshoot-ai-agents.adoc[Troubleshoot Agents]
-** xref:monitor:byoa-telemetry.adoc[Send BYOA Telemetry]
-*** xref:monitor:ingest-custom-traces.adoc[Ingest OpenTelemetry Traces from Custom Agents]
 * xref:control:index.adoc[Control & Govern]
 ** xref:control:guardrails/index.adoc[Set Up Guardrails]
diff --git a/modules/connect/images/self-managed-agent-credentials.png b/modules/connect/images/self-managed-agent-credentials.png
new file mode 100644
index 0000000..e2d979b
Binary files /dev/null and b/modules/connect/images/self-managed-agent-credentials.png differ
diff --git a/modules/connect/images/self-managed-agent-secret-created.png b/modules/connect/images/self-managed-agent-secret-created.png
new file mode 100644
index 0000000..ae563d1
Binary files /dev/null and b/modules/connect/images/self-managed-agent-secret-created.png differ
diff --git a/modules/connect/images/self-managed-agent-setup-env.png b/modules/connect/images/self-managed-agent-setup-env.png
new file mode 100644
index 0000000..29eb801
Binary files /dev/null and b/modules/connect/images/self-managed-agent-setup-env.png differ
diff --git a/modules/connect/images/self-managed-agent-setup-steps.png
b/modules/connect/images/self-managed-agent-setup-steps.png
new file mode 100644
index 0000000..e3b1549
Binary files /dev/null and b/modules/connect/images/self-managed-agent-setup-steps.png differ
diff --git a/modules/connect/images/self-managed-agent-transcripts.png b/modules/connect/images/self-managed-agent-transcripts.png
new file mode 100644
index 0000000..3fb3572
Binary files /dev/null and b/modules/connect/images/self-managed-agent-transcripts.png differ
diff --git a/modules/connect/pages/agents.adoc b/modules/connect/pages/agents.adoc
index f3223f2..e0af0df 100644
--- a/modules/connect/pages/agents.adoc
+++ b/modules/connect/pages/agents.adoc
@@ -2,4 +2,4 @@
 :description: Understand how AI agents work in the Agentic Data Plane, then create, register, and design them.
 :page-layout: index
-Agents are the workloads that call LLMs and tools through the Agentic Data Plane. Start with how agents work, then create a declarative agent, register one you run yourself, and apply architecture and system-prompt best practices.
+Agents are the workloads that call LLMs and tools through the Agentic Data Plane. Start with how agents work, then create a declarative agent, set up an agent you host yourself, and apply architecture and system-prompt best practices.
diff --git a/modules/connect/pages/byoa-register.adoc b/modules/connect/pages/byoa-register.adoc
deleted file mode 100644
index 50a0375..0000000
--- a/modules/connect/pages/byoa-register.adoc
+++ /dev/null
@@ -1,127 +0,0 @@
-= Register Your Own Agent (BYOA)
-:description: Register a self-managed agent so it appears in the ADP agent registry, with its telemetry and cost attributed alongside managed agents.
-:page-topic-type: how-to
-:personas: agent_builder, platform_engineer
-:learning-objective-1: Choose between a self-managed (BYOA) and a Redpanda-managed agent for your use case
-:learning-objective-2: Register a self-managed agent so it appears in the agent registry
-:learning-objective-3: Identify the telemetry and service-account model that attributes a self-managed agent's activity in ADP
-
-Register a self-managed agent with Redpanda ADP to bring it under the same observability and governance as managed agents. Your agent gains registry visibility, transcript capture, and cost attribution without moving off your own infrastructure. Registration creates a metadata record: ADP does not host, run, or proxy calls to your agent.
-
-After completing this guide, you will be able to:
-
-* [ ] {learning-objective-1}
-* [ ] {learning-objective-2}
-* [ ] {learning-objective-3}
-
-== When to use BYOA versus a managed agent
-
-The two models differ in who runs the agent, who owns scaling, and how the agent is defined.
-
-[cols="1,2,2"]
-|===
-|Question |Choose BYOA when… |Choose a managed agent when…
-
-|Where does your agent run?
-|You have an existing runtime (LangGraph, custom Go, and so on) you want to keep, and you run it yourself.
-|You want Redpanda to host and operate the agent runtime for you.
-
-|How is the agent defined?
-|It's already coded; you don't want to translate it into the declarative agent format.
-|You want a declarative agent you configure through the ADP wizard, with no runtime code to maintain.
-
-|Who scales and operates it?
-|Your team owns scaling, deploys, and the failure model.
-|Redpanda owns the runtime; you reason about the agent definition only.
-
-|What are you optimizing for?
-|Maximum control over runtime, libraries, networking, with governance and observability layered on.
-|Time-to-first-running-agent and built-in observability without integration work.
-|===
-
-If you have a Redpanda-managed agent today and you're considering BYOA, you don't have to migrate; the two coexist in the same registry and the same dashboard.
-
-== What registration gives you
-
-When a self-managed agent is registered:
-
-* It appears on the *Agents* page alongside managed agents, and in cost-attribution queries.
-* A service-account identity is created for the agent at registration. Mint credentials for it to authenticate the agent's calls to AI Gateway and its telemetry ingestion. Credential issuance follows the same pattern as managed agents. See xref:connect:concepts.adoc#service-account-authorization[Service account authorization].
-
-Registration does not make the Agentic Data Plane run your agent or route requests to it. The agent runs in your infrastructure, and any clients (including other agents) call it directly.
-
-== Prerequisites
-
-Before you register a self-managed agent, make sure you have:
-
-* An agent running in your own infrastructure.
-* The agent instrumented with OpenTelemetry, emitting the minimum required spans contract with its registered name as the `service.name`. See xref:monitor:byoa-telemetry.adoc[BYOA telemetry].
-* The `dataplane_adp_agent_create` permission, granted by the Writer built-in role. See xref:control:permissions-reference.adoc#agent-management-permissions[Agent management permissions].
-* A name for the agent that follows DNS-1123 conventions (1 to 63 characters, lowercase letters, numbers, and hyphens, starting with a letter). The name is immutable after registration.
-
-== Register the agent
-
-You can register a self-managed agent from the *Agents* page in the console, with the `rpk` CLI, or through `AgentRegistryService.CreateAgent`. The registration carries the agent's metadata, and you leave the agent-type configuration unset:
-
-[cols="1,3"]
-|===
-|Field |What it carries
-
-|`name`
-|DNS-1123 identifier. Immutable.
Used in the registry, in cost-attribution queries, and as the agent's resource identifier.
-
-|`display_name`
-|Human-readable label shown in the UI. Editable later.
-
-|`description`
-|Free-text description of what the agent does. Editable later.
-
-|`tags`
-|Optional key/value labels (up to 50 pairs). Useful for filtering agents in cost-attribution queries.
-
-|`agent_type`
-|Leave unset. Populating the `managed` arm creates a Redpanda-managed agent instead; leaving the whole field unset registers a self-managed (BYOA) metadata record.
-|===
-
-Once registered, your agent appears on the *Agents* page. The first time it serves a request and emits telemetry, transcripts begin populating.
-
-== Make your agent callable by other agents (optional)
-
-If you want other agents or clients to call your agent, expose a standard https://a2aproject.org/[A2A protocol] endpoint on your own infrastructure (an agent-card document at `/.well-known/agent-card.json`, plus the A2A message endpoints). Callers reach your agent directly at its own address. The Agentic Data Plane does not proxy A2A traffic to self-managed agents; its A2A reverse proxy serves Redpanda-managed agents only.
-
-For the agent-card schema and the A2A message-endpoint shapes, see xref:connect:a2a-concepts.adoc[Agent-to-agent concepts].
-
-== Verify the registration
-
-After registering, confirm the following end-to-end:
-
-. *Registry*: The agent appears on the *Agents* page and in cost-attribution queries. At the API level, confirm `AgentRegistryService.ListAgents` returns it.
-. *Telemetry*: Open the transcripts list, filter by your agent's `service.name`, and confirm a recent execution shows up with non-zero token counts and a non-empty conversation ID. If it doesn't, see xref:monitor:byoa-telemetry.adoc[BYOA telemetry] troubleshooting.
-
-== Troubleshooting
-
-[cols="1,2"]
-|===
-|Symptom |What to check
-
-|Agent registered but doesn't appear on the *Agents* page
-|Confirm `AgentRegistryService.ListAgents` returns the agent. If `ListAgents` is empty, the registration didn't persist; retry.
-
-|Transcripts list shows the agent column blank for your agent's runs
-|Your agent's OpenTelemetry `service.name` resource attribute doesn't match the registered name, or isn't being emitted at all. See xref:monitor:byoa-telemetry.adoc[BYOA telemetry].
-|===
-
-== Limitations
-
-This page does not cover:
-
-* *Building the agent itself.* Bring whatever runtime, framework, and language you want. The integration points (telemetry attributes plus the registry record) are what make it visible in ADP.
-* *Tool use through MCP.* If your agent calls MCP servers hosted in AI Gateway, see xref:connect:mcp-overview.adoc[MCP Servers] for the consumer-side flow. Tool calls appear in your agent's transcript when MCP servers emit their own spans.
-* *Gateway-proxied A2A routing.* The Agentic Data Plane does not route A2A calls to self-managed agents; that is a managed-agent capability. Clients call your agent directly.
-* *Migration from a managed declarative agent to BYOA.* The two coexist; BYOA is for agents that already exist outside the managed runtime, not for re-platforming existing managed agents.
-
-== Next steps
-
-* xref:connect:a2a-concepts.adoc[Agent-to-agent concepts]
-* xref:monitor:byoa-telemetry.adoc[BYOA telemetry]
-* xref:connect:create-agent.adoc[Create a declarative agent]
diff --git a/modules/connect/pages/create-agent.adoc b/modules/connect/pages/create-agent.adoc
index 1461deb..db1d92e 100644
--- a/modules/connect/pages/create-agent.adoc
+++ b/modules/connect/pages/create-agent.adoc
@@ -27,7 +27,7 @@ After reading this page, you will be able to:
 . Open *Agents* in the sidebar.
 . Click *Create agent*.
-. Choose how the agent runs.
Click *Redpanda manages it*, so Redpanda deploys, runs, and observes the agent for you. (To register an agent you build and run yourself, see xref:connect:byoa-register.adoc[].)
+. Choose how the agent runs. Click *Redpanda manages it*, so Redpanda deploys, runs, and observes the agent for you. (To run an agent you host yourself, see xref:connect:self-managed-agents.adoc[].)
 +
 image::shared:create-agent-runtime-choice.png[The runtime choice in the create-agent flow, with a Redpanda manages it card for the managed runtime and an I host it myself card for self-managed agents]
diff --git a/modules/connect/pages/self-managed-agents.adoc b/modules/connect/pages/self-managed-agents.adoc
new file mode 100644
index 0000000..bbca36f
--- /dev/null
+++ b/modules/connect/pages/self-managed-agents.adoc
@@ -0,0 +1,1088 @@
+= Set Up a Self-Managed Agent
+:description: Register a self-managed agent, issue it a client credential, and route its LLM and tool calls through the AI Gateway so spend, traces, and transcripts attribute back to the agent.
+:page-topic-type: how-to
+:personas: agent_builder, platform_engineer
+:page-aliases: connect:byoa-register.adoc
+:learning-objective-1: Choose a self-managed agent over a managed agent for your use case
+:learning-objective-2: Register a self-managed agent and issue it an OAuth client credential
+:learning-objective-3: Route an agent's LLM and MCP calls through the AI Gateway and group them into transcripts
+
+// Source: cloudv2 `proto/public/cloud/redpanda/api/adp/v1alpha1/agent.proto` (Agent.agent_type oneof, OauthClient) cross-referenced against `apps/adp-ui/src/components/agents/agent-type-picker.tsx`, `self-managed-create-form.tsx`, `credentials-tab.tsx`, and `setup-tab.tsx` on origin/main, and verified on adp-production 2026-06-16.
+
+A self-managed agent is an agent you build and run yourself, registered with Redpanda ADP as an identity.
You keep your runtime, framework, and hosting; ADP gives the agent a service account and a client credential, and the glossterm:AI Gateway[] becomes the agent's LLM and MCP endpoint. Because every model call and tool call flows through the gateway, ADP attributes spend, tokens, latency, and traces back to the agent and reconstructs each session as a transcript. ADP does not host or run your agent.
+
+After reading this page, you will be able to:
+
+* [ ] {learning-objective-1}
+* [ ] {learning-objective-2}
+* [ ] {learning-objective-3}
+
+== Self-managed compared to managed agents
+
+The two agent types differ in who runs the agent and how it is defined. They coexist in the same registry, the same governance views, and the same cost-attribution queries.
+
+[cols="1,2,2"]
+|===
+|Question |Self-managed |Managed
+
+|Who runs the agent?
+|You do. ADP registers the agent and proxies its LLM and tool calls, but the runtime is yours.
+|Redpanda deploys, runs, and observes the agent for you.
+
+|How is the agent defined?
+|It is already coded in your own framework, for example, LangChain, CrewAI, or a custom runtime.
+|You configure it declaratively through the create form, with no runtime code to maintain.
+
+|What connects it to ADP?
+|A client credential the agent exchanges for a gateway token. Your code points its LLM and MCP clients at the gateway.
+|The managed runtime wires the gateway for you.
+|===
+
+For the declarative path, see xref:connect:create-agent.adoc[Create an agent].
+
+== Prerequisites
+
+* The `dataplane_adp_agent_create` permission, granted by the Writer built-in role. See xref:control:permissions-reference.adoc#agent-management-permissions[Agent management permissions].
+* At least one xref:gateway:configure-provider.adoc[LLM provider configured] in ADP. The agent calls the model through this provider.
+* If the agent calls tools: One or more xref:connect:mcp-overview.adoc[MCP servers] registered in ADP.
+* An agent built in your own framework.
The *Setup* tab generates ready-to-paste samples for ai-sdk-go, LangChain, CrewAI, ADK Java, and ADK Go.
+
+== Register the agent
+
+. Open *Agents* in the sidebar.
+. Click *Create agent*.
+. Click *I host it myself*, so ADP registers the agent as an identity and leaves the runtime to you.
++
+image::shared:create-agent-runtime-choice.png[The runtime choice in the create-agent flow, with a Redpanda manages it card for the managed runtime and an I host it myself card for self-managed agents]
+
+. Fill in the identity fields, then click *Create agent*:
++
+* *Agent ID*: Required. Lowercase letters, numbers, and hyphens, up to 63 characters. Used in URLs, in cost-attribution queries, and as the agent's resource identifier. Immutable after creation.
+* *Display name*: Human-readable name shown in the agent list and detail header.
+* *Description*: Optional. Up to 1024 characters.
+* *Tags*: Optional key/value pairs to organize and filter agents.
+
+The agent opens on its detail page with a *Self-managed* type badge. A self-managed agent carries no provider, model, or tool configuration of its own: it is an identity that calls the organization's shared gateway resources.
+
+== Issue a client secret
+
+ADP provisions a service account for the agent at registration. To authenticate the agent's calls, issue an OAuth 2.0 client secret on the agent's *Credentials* tab.
+
+. Open the agent's *Credentials* tab.
++
+image::self-managed-agent-credentials.png[The Credentials tab for a self-managed agent, showing the service account Client ID, the authorized scope, and an empty client-secrets list with a Create secret button]
+
+. Note the *Client ID*. It has the form `serviceaccounts/`, where `` is the agent's identifier. The Client ID is public and stable: every secret on the agent shares it.
+. Click *Create secret*.
+. Optionally enter a *Label* to identify the secret in the list and in audit logs, for example, `production`.
+. Click *Generate secret*.
++
+ADP shows the plaintext *Client secret* one time.
++
+image::self-managed-agent-secret-created.png[The Secret created dialog showing a masked client secret, a Copy secret button, and a warning that the plaintext is shown only at creation]
+
+. Copy the secret into your secret manager or container environment variables, then click *I've saved it*.
+
+[IMPORTANT]
+====
+The client secret is shown only at creation. ADP stores a hash and cannot show the plaintext again. Each secret expires 90 days after creation. To rotate without downtime, create a new secret, deploy it, and then revoke the old one with *Revoke* on its row in the secrets list.
+====
+
+== Connect your agent to the AI Gateway
+
+The agent's *Setup* tab generates everything your code needs: the gateway endpoints, an environment-variable block, and a copy-paste SDK sample for your framework. Open the *Setup* tab, then wire three things.
+
+image::self-managed-agent-setup-steps.png[The Setup tab listing the three steps to integrate a self-managed agent: route LLM calls through the gateway, add MCP servers, and stamp the conversation header]
+
+On the *Setup* tab, select your *LLM provider*. Optionally, select the *MCP servers* the sample connects through. Your selection fills in the *Environment variables* block without changing the agent.
+
+image::self-managed-agent-setup-env.png[The Setup tab Environment variables block, with exported values for the client ID, token URL, LLM provider URL, and MCP base URL]
+
+In the endpoint URLs below, `` is your cluster's identifier and `` is `\https://aigw..clusters.rdpa.co`. Copy the exact values from the *Setup* tab.
+
+=== Authenticate with the client credential
+
+The gateway runs its own OAuth 2.0 identity provider.
Exchange the Client ID and client secret for a short-lived access token with the `client_credentials` grant against the token endpoint:
+
+[source,text]
+----
+/oauth/idp/token
+----
+
+Send the resulting token as an `Authorization: Bearer` header on every LLM and MCP request. The gateway authenticates on this token and injects the real upstream provider key itself, so your SDK's own API-key field is a placeholder. Your client is responsible for refreshing the token before it expires.
+
+=== Route LLM calls through the gateway
+
+Point your SDK's base URL at the provider's gateway endpoint instead of the upstream API:
+
+[source,text]
+----
+/llm/v1/providers/
+----
+
+In this URL, `` is the name of an LLM provider you configured in ADP. The gateway forwards each provider's native API to the upstream, so you keep using the provider's own SDK. The provider enforces a model allow-list: pick a model the provider serves, or the gateway rejects the call. For the full proxy contract and per-SDK setup, see xref:gateway:connect-agent.adoc[Connect your app to AI Gateway].
+
+=== Route MCP tool calls through the gateway
+
+Point your MCP client at each server's gateway URL, with the same bearer token:
+
+[source,text]
+----
+/mcp/v1/
+----
+
+In this URL, `` is the name of an MCP server registered in ADP. Routing tool calls through the gateway keeps them under the same identity, governance, and observability as the model calls.
+
+=== Group calls into transcripts
+
+Stamp every request, both the model call and each tool call, with the `X-Redpanda-Genai-Conversation` header set to your framework's own session identifier, for example, a chat-thread ID or a request ID. The gateway groups that session's calls into one transcript on the agent's *Transcripts* tab.
+
+This header is required for transcripts. Without it, the gateway drops the spans and the *Transcripts* tab stays empty. The header does not affect authentication or whether calls succeed.
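The authentication and header rules described so far can be sketched with the Python standard library alone. This is a minimal illustration, not the sample the *Setup* tab generates: the helper names `build_token_request` and `gateway_headers` are hypothetical, and the sketch assumes the token endpoint accepts the client ID and secret as form fields, as the framework samples on this page send them. It builds the request objects without sending anything.

```python
import urllib.parse
import urllib.request


def build_token_request(token_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) the client_credentials token request.

    Hypothetical helper: form fields mirror the grant the samples perform.
    """
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def gateway_headers(access_token: str, conversation_id: str) -> dict[str, str]:
    """Headers to stamp on every LLM and MCP request sent through the gateway.

    Hypothetical helper: the bearer token authenticates the call, and the
    conversation header groups the call into a transcript.
    """
    return {
        "Authorization": f"Bearer {access_token}",
        "X-Redpanda-Genai-Conversation": conversation_id,
    }
```

To use the sketch, pass the request to `urllib.request.urlopen`, read `access_token` from the JSON response, and merge `gateway_headers(...)` into each outgoing call. Reuse one conversation ID for the whole session so the model call and its tool calls share a transcript.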
+
+Each distinct header value becomes one conversation on the *Transcripts* tab, grouping every model call and tool call that carries it into one row. The *Turns* column counts the model calls in that conversation, so an agent that loops over several tool calls shows more than one.
+
+image::self-managed-agent-transcripts.png[The Transcripts tab listing conversations, each grouped under one conversation ID, with columns for when it started, its duration, the number of turns, the status, and the token count]
+
+== Framework samples
+
+The *Setup* tab generates a ready-to-paste sample for your framework, prefilled with your selected provider and MCP servers. Each sample performs the three steps the framework's way: it runs the `client_credentials` grant, routes the LLM client and every MCP client through one HTTP client that carries the bearer token, and stamps the framework's own session identifier as the `X-Redpanda-Genai-Conversation` header.
+
+The samples read their configuration from these environment variables:
+
+[cols="1,2"]
+|===
+|Variable |Where to get it
+
+|`REDPANDA_CLIENT_ID`
+|The Client ID from the *Credentials* tab.
+
+|`REDPANDA_CLIENT_SECRET`
+|A client secret you minted on the *Credentials* tab.
+
+|`REDPANDA_TOKEN_URL`
+|The token endpoint from the *Setup* tab, ending in `/oauth/idp/token`.
+
+|`REDPANDA_LLM_PROVIDER_URL`
+|The provider-scoped LLM endpoint from the *Setup* tab, ending in `/llm/v1/providers/`.
+
+|`REDPANDA_LLM_PROVIDER_TYPE`
+|The upstream provider family the SDK builds against: `openai`, `anthropic`, or `google`.
+
+|`REDPANDA_LLM_MODEL`
+|A model the provider serves.
+
+|`REDPANDA_MCP_BASE_URL`
+|The MCP base endpoint from the *Setup* tab, ending in `/mcp/v1`.
+
+|`REDPANDA_MCP_SERVERS`
+|A comma-separated list of MCP server names to connect, or empty for none.
+
+|===
+
+Set these variables, then run the sample for your framework.
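Before running a sample, it can help to check the environment once and see which MCP URLs the sample will connect to. The following is a rough sketch under stated assumptions: `load_gateway_config` is a hypothetical helper, not part of any SDK, and treating `REDPANDA_MCP_BASE_URL` as required and defaulting the provider type to `openai` simply mirrors how the samples on this page read those variables.

```python
import os
from typing import Mapping

# Variables the samples read with a hard failure when unset.
REQUIRED_VARS = (
    "REDPANDA_CLIENT_ID",
    "REDPANDA_CLIENT_SECRET",
    "REDPANDA_TOKEN_URL",
    "REDPANDA_LLM_PROVIDER_URL",
    "REDPANDA_LLM_MODEL",
    "REDPANDA_MCP_BASE_URL",
)


def load_gateway_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Validate the environment and derive one gateway URL per MCP server.

    Hypothetical helper; raises ValueError naming every missing variable.
    """
    missing = [k for k in REQUIRED_VARS if not environ.get(k)]
    if missing:
        raise ValueError(
            "missing env vars: " + ", ".join(missing) + " - export them from the Setup tab"
        )
    base = environ["REDPANDA_MCP_BASE_URL"].rstrip("/")
    # REDPANDA_MCP_SERVERS is optional: empty means the agent runs with no tools.
    names = [n.strip() for n in environ.get("REDPANDA_MCP_SERVERS", "").split(",") if n.strip()]
    return {
        "provider_type": environ.get("REDPANDA_LLM_PROVIDER_TYPE", "openai").lower(),
        "mcp_urls": {name: f"{base}/{name}" for name in names},
    }
```

Calling `load_gateway_config()` with no arguments reads the real process environment. Passing a plain dictionary instead makes the check easy to exercise in a unit test.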
+
+[tabs]
+======
+ai-sdk-go::
++
+--
+[source,go]
+----
+package main
+
+import (
+	"context"
+	"crypto/rand"
+	"encoding/hex"
+	"fmt"
+	"log"
+	"net/http"
+	"os"
+	"strings"
+	"time"
+
+	"golang.org/x/oauth2"
+	"golang.org/x/oauth2/clientcredentials"
+
+	"github.com/redpanda-data/ai-sdk-go/agent"
+	"github.com/redpanda-data/ai-sdk-go/agent/llmagent"
+	"github.com/redpanda-data/ai-sdk-go/llm"
+	"github.com/redpanda-data/ai-sdk-go/providers/anthropic"
+	"github.com/redpanda-data/ai-sdk-go/providers/google"
+	"github.com/redpanda-data/ai-sdk-go/providers/openai"
+	"github.com/redpanda-data/ai-sdk-go/runner"
+	"github.com/redpanda-data/ai-sdk-go/store/session"
+	"github.com/redpanda-data/ai-sdk-go/tool"
+	"github.com/redpanda-data/ai-sdk-go/tool/mcp"
+)
+
+// convoKey carries the conversation id on the context.
+type convoKey struct{}
+
+// convoTransport stamps the session id (read from the context) as the
+// conversation header. It sits beneath the oauth2 transport, so one http.Client
+// carries the bearer AND the conversation id on the LLM call and every MCP tool
+// call.
+type convoTransport struct{ base http.RoundTripper }
+
+func (t *convoTransport) RoundTrip(r *http.Request) (*http.Response, error) {
+	if id, ok := r.Context().Value(convoKey{}).(string); ok && id != "" {
+		r = r.Clone(r.Context())
+		r.Header.Set("X-Redpanda-Genai-Conversation", id)
+	}
+	return t.base.RoundTrip(r)
+}
+
+func main() {
+	ctx := context.Background()
+
+	// OAuth2 client_credentials: x/oauth2 fetches and refreshes the bearer and
+	// its Transport sets it on every request; convoTransport underneath adds the
+	// conversation header. One client instruments the LLM call and every MCP call.
+	cc := clientcredentials.Config{
+		ClientID:     mustEnv("REDPANDA_CLIENT_ID"),
+		ClientSecret: mustEnv("REDPANDA_CLIENT_SECRET"),
+		TokenURL:     mustEnv("REDPANDA_TOKEN_URL"),
+	}
+	hc := &http.Client{Transport: &oauth2.Transport{
+		Source: cc.TokenSource(ctx),
+		Base:   &convoTransport{base: http.DefaultTransport},
+	}}
+
+	model, err := buildModel(ctx, hc)
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	// MCP tools ride the SAME client. Each client syncs its server's tools into a
+	// shared registry; Start connects and performs that initial sync. The agent
+	// is then built from the registry, so the model can actually call the tools.
+	registry := tool.NewRegistry(tool.RegistryConfig{})
+	mcpBase := mustEnv("REDPANDA_MCP_BASE_URL")
+	for _, name := range mcpServers() {
+		factory := mcp.NewStreamableTransport(mcpBase+"/"+name, mcp.WithHTTPClient(hc))
+		client, err := mcp.NewClient(name, factory,
+			mcp.WithRegistry(registry), // sync this server's tools into the registry
+			mcp.WithToolTimeout(time.Minute))
+		if err != nil {
+			log.Fatal(err)
+		}
+		if err := client.Start(ctx); err != nil {
+			log.Fatal(err)
+		}
+		defer client.Close()
+	}
+
+	// WithTools(registry) is what hands the synced MCP tools to the model.
+	ag, err := llmagent.New("assistant", "You are a helpful agent.", model,
+		llmagent.WithTools(registry))
+	if err != nil {
+		log.Fatal(err)
+	}
+	run, err := runner.New(ag, session.NewInMemoryStore())
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	// The runner takes a CALLER-owned conversation id (run.Run keys the session on
+	// it; ai-sdk-go does not mint one). Use your app's own id - a chat thread id,
+	// request id, A2A contextId - reused across the turn; we mint one here. It is
+	// the value convoTransport stamps as X-Redpanda-Genai-Conversation on the model
+	// call and every MCP tool call.
+	const prompt = "What tools can you call?"
+	conversationID := newConversationID()
+	ctx = context.WithValue(ctx, convoKey{}, conversationID)
+
+	fmt.Printf("> %s\n\n", prompt)
+	msg := llm.NewMessage(llm.RoleUser, llm.NewTextPart(prompt))
+	for ev, err := range run.Run(ctx, "user-123", conversationID, msg) {
+		if err != nil {
+			log.Fatal(err)
+		}
+		// MessageEvent carries a finished assistant turn (an agentic run may have
+		// several). Print its text so you can see the model actually replied.
+		if m, ok := ev.(agent.MessageEvent); ok {
+			fmt.Println(m.Response.TextContent())
+		}
+	}
+}
+
+// buildModel constructs the native ai-sdk-go model for the configured provider.
+// REDPANDA_LLM_PROVIDER_TYPE selects the SDK: "anthropic" and "google" use their
+// native wire (the gateway forwards /v1/messages and /v1beta/...:generateContent
+// to the upstream), everything else uses OpenAI chat-completions. All three point
+// at the same provider-scoped REDPANDA_LLM_PROVIDER_URL. The bearer (set by the
+// oauth2 transport) is the real auth; the key arg only satisfies the constructor
+// (the gateway ignores the native x-api-key/x-goog-api-key).
+func buildModel(ctx context.Context, hc *http.Client) (llm.Model, error) {
+	base := mustEnv("REDPANDA_LLM_PROVIDER_URL")
+	model := mustEnv("REDPANDA_LLM_MODEL")
+	const key = "redpanda-gateway"
+	switch strings.ToLower(os.Getenv("REDPANDA_LLM_PROVIDER_TYPE")) {
+	case "anthropic":
+		p, err := anthropic.NewProvider(key, anthropic.WithBaseURL(base), anthropic.WithHTTPClient(hc))
+		if err != nil {
+			return nil, err
+		}
+		return p.NewModel(model)
+	case "google", "gemini":
+		p, err := google.NewProvider(ctx, key, google.WithBaseURL(base), google.WithHTTPClient(hc))
+		if err != nil {
+			return nil, err
+		}
+		return p.NewModel(model)
+	default: // openai (and openai-compatible)
+		p, err := openai.NewProvider(key, openai.WithBaseURL(base), openai.WithHTTPClient(hc))
+		if err != nil {
+			return nil, err
+		}
+		return p.NewModel(model)
+	}
+}
+
+// mustEnv reads a required env var, exiting with a clear message (not an opaque
+// downstream panic) when it is unset. Export the values from the Setup tab.
+func mustEnv(k string) string {
+	v := os.Getenv(k)
+	if v == "" {
+		log.Fatalf("missing env var %s - export it from the Setup tab", k)
+	}
+	return v
+}
+
+// newConversationID mints a fresh conversation id. In a real app, use your own
+// per-conversation id (chat thread id, request id, A2A contextId) reused across
+// the turn, not a value generated per call.
+func newConversationID() string {
+	b := make([]byte, 8)
+	_, _ = rand.Read(b)
+	return "conv-" + hex.EncodeToString(b)
+}
+
+// mcpServers reads the comma-separated REDPANDA_MCP_SERVERS list. Empty is fine
+// - the agent then runs with no MCP tools.
+func mcpServers() []string {
+	var out []string
+	for _, p := range strings.Split(os.Getenv("REDPANDA_MCP_SERVERS"), ",") {
+		if p = strings.TrimSpace(p); p != "" {
+			out = append(out, p)
+		}
+	}
+	return out
+}
+----
+--
+
+LangChain::
++
+--
+[source,python]
+----
+import asyncio
+import contextvars
+import os
+
+import httpx
+from langchain_mcp_adapters.client import MultiServerMCPClient
+from langgraph.prebuilt import create_react_agent
+
+
+def env(k: str) -> str:
+    """Read a required env var, failing with a clear message (not an opaque
+    KeyError) when it is unset. Export the values from the Setup tab."""
+    v = os.environ.get(k)
+    if not v:
+        raise SystemExit(f"missing env var {k} - export it from the Setup tab")
+    return v
+
+
+def get_access_token() -> str:
+    """OAuth2 client_credentials grant against the gateway IDP (httpx is already a dep)."""
+    resp = httpx.post(
+        env("REDPANDA_TOKEN_URL"),
+        data={
+            "grant_type": "client_credentials",
+            "client_id": env("REDPANDA_CLIENT_ID"),
+            "client_secret": env("REDPANDA_CLIENT_SECRET"),
+        },
+    )
+    resp.raise_for_status()
+    return resp.json()["access_token"]
+
+
+token = get_access_token()
+mcp_base = env("REDPANDA_MCP_BASE_URL")
+
+# The LangGraph thread_id IS the conversation. Carry it in a contextvar so the
+# MCP transport reads the current one per request.
+thread_var: contextvars.ContextVar[str] = contextvars.ContextVar("thread")
+
+
+class GatewayAuth(httpx.Auth):
+    """MCP headers are fixed per connection, so inject per request via httpx.Auth.
+
+    The default tool mode opens a fresh session per call, so auth_flow re-reads
+    the contextvar and always carries the current thread id.
+    """
+
+    def auth_flow(self, request):
+        request.headers["Authorization"] = f"Bearer {token}"
+        request.headers["X-Redpanda-Genai-Conversation"] = thread_var.get()
+        yield request
+
+
+def mcp_servers() -> dict:
+    """Build the MultiServerMCPClient connection map from REDPANDA_MCP_SERVERS."""
+    servers = {}
+    for name in os.environ.get("REDPANDA_MCP_SERVERS", "").split(","):
+        name = name.strip()
+        if name:
+            servers[name] = {
+                "transport": "streamable_http",
+                "url": f"{mcp_base}/{name}",
+                "auth": GatewayAuth(),
+            }
+    return servers
+
+
+mcp_client = MultiServerMCPClient(mcp_servers())
+
+
+def build_model(thread_id: str):
+    """Construct the native LangChain chat model for the configured provider.
+
+    REDPANDA_LLM_PROVIDER_TYPE picks the SDK. Auth is the gateway bearer token,
+    injected on the Authorization header; the SDK's own api_key field is just a
+    non-empty placeholder (the gateway ignores the native x-api-key /
+    x-goog-api-key). The thread id rides along as the conversation header.
+    """
+    model = env("REDPANDA_LLM_MODEL")
+    base_url = env("REDPANDA_LLM_PROVIDER_URL")
+    headers = {
+        "Authorization": f"Bearer {token}",
+        "X-Redpanda-Genai-Conversation": thread_id,
+    }
+    provider = os.environ.get("REDPANDA_LLM_PROVIDER_TYPE", "openai").lower()
+
+    if provider == "anthropic":
+        from langchain_anthropic import ChatAnthropic
+
+        return ChatAnthropic(
+            model=model,
+            base_url=base_url,
+            default_headers=headers,
+            api_key="unused",  # gateway authenticates on the bearer header
+        )
+
+    if provider in ("google", "gemini"):
+        from langchain_google_genai import ChatGoogleGenerativeAI
+
+        return ChatGoogleGenerativeAI(
+            model=model,
+            base_url=base_url,
+            api_version="v1beta",  # native Gemini wire under the provider URL
+            additional_headers=headers,
+            api_key="unused",  # gateway authenticates on the bearer header
+        )
+
+    # openai (and openai-compatible)
+    from langchain_openai import ChatOpenAI
+
+    return ChatOpenAI(
+        model=model,
+        base_url=base_url,
+        api_key=token,
+    ).bind(extra_headers={"X-Redpanda-Genai-Conversation": thread_id})
+
+
+async def chat(thread_id: str, text: str):
+    thread_var.set(thread_id)  # one id per conversation
+    tools = await mcp_client.get_tools()
+
+    llm = build_model(thread_id)
+    agent = create_react_agent(llm, tools)
+    return await agent.ainvoke(
+        {"messages": [("user", text)]},
+        config={"configurable": {"thread_id": thread_id}},
+    )
+
+
+async def main():
+    result = await chat("user-123-thread-1", "What tools can you call?")
+    for message in result["messages"]:
+        message.pretty_print()
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
+----
+--
+
+CrewAI::
++
+--
+[source,python]
+----
+import os
+
+import httpx
+from crewai import LLM, Agent, Crew, Task
+from crewai.llms.hooks import BaseInterceptor
+from crewai_tools import MCPServerAdapter
+
+
+def env(k: str) -> str:
+    """Read a required env var, failing with a clear message (not an opaque
+    KeyError) when it is unset.
Export the values from the Setup tab.""" + v = os.environ.get(k) + if not v: + raise SystemExit(f"missing env var {k} - export it from the Setup tab") + return v + + +def get_access_token() -> str: + """OAuth2 client_credentials grant against the gateway IDP (httpx is already a dep).""" + resp = httpx.post( + env("REDPANDA_TOKEN_URL"), + data={ + "grant_type": "client_credentials", + "client_id": env("REDPANDA_CLIENT_ID"), + "client_secret": env("REDPANDA_CLIENT_SECRET"), + }, + ) + resp.raise_for_status() + return resp.json()["access_token"] + + +token = get_access_token() +provider_url = env("REDPANDA_LLM_PROVIDER_URL") +provider_type = os.environ.get("REDPANDA_LLM_PROVIDER_TYPE", "openai").lower() +model = env("REDPANDA_LLM_MODEL") +mcp_base = env("REDPANDA_MCP_BASE_URL") + +# The gateway authenticates on this Bearer token; the native x-api-key / +# x-goog-api-key are ignored, so the SDK's api_key is just a placeholder. +GATEWAY_API_KEY = "redpanda-gateway" + + +class GatewayInterceptor(BaseInterceptor): + """LLM side (OpenAI / Anthropic native clients): stamp the Bearer token and + the conversation id on every outbound request via a transport interceptor. + + The conversation id is carried per instance so one crew.kickoff() groups + cleanly. Both the OpenAI and Anthropic native clients build an httpx client + around this interceptor; Gemini does not support interceptors and is wired + separately (see build_llm). 
+ """ + + def __init__(self, conversation_id: str) -> None: + self.conversation_id = conversation_id + + def on_outbound(self, request: httpx.Request) -> httpx.Request: + request.headers["Authorization"] = f"Bearer {token}" + request.headers["X-Redpanda-Genai-Conversation"] = self.conversation_id + return request + + def on_inbound(self, response): + return response + + async def aon_outbound(self, request): + return self.on_outbound(request) + + async def aon_inbound(self, response): + return response + + +def build_llm(conversation_id: str) -> LLM: + """Build the native LLM for REDPANDA_LLM_PROVIDER_TYPE, pointed at the + gateway provider URL and carrying the Bearer token + conversation header. + """ + if provider_type == "anthropic": + # Native Anthropic SDK posts to {base_url}/v1/messages. The SDK sends + # x-api-key natively; the interceptor adds Authorization: Bearer (which + # the gateway authenticates on) plus the conversation header. + return LLM( + provider="anthropic", + model=model, + base_url=provider_url, + api_key=GATEWAY_API_KEY, + interceptor=GatewayInterceptor(conversation_id), + ) + + if provider_type in ("google", "gemini"): + # Native google-genai SDK posts to {base_url}/v1beta/models/{model}:generateContent. + # It does NOT support transport interceptors, so the Bearer token and the + # conversation header are set as fixed client headers via http_options. + from google.genai import types + + return LLM( + provider="gemini", + model=model, + api_key=GATEWAY_API_KEY, + client_params={ + "http_options": types.HttpOptions( + base_url=provider_url, + headers={ + "Authorization": f"Bearer {token}", + "X-Redpanda-Genai-Conversation": conversation_id, + }, + ), + }, + ) + + # openai (and openai-compatible): native OpenAI SDK posts to + # {base_url}/chat/completions. The interceptor stamps both headers. 
+ return LLM( + provider="openai", + model=model, + base_url=provider_url, + api_key=GATEWAY_API_KEY, + interceptor=GatewayInterceptor(conversation_id), + ) + + +def mcp_server_params(conversation_id: str) -> list: + """MCP headers are fixed per connection, so build params per conversation.""" + servers = [] + for name in os.environ.get("REDPANDA_MCP_SERVERS", "").split(","): + name = name.strip() + if name: + servers.append( + { + "url": f"{mcp_base}/{name}", + "transport": "streamable-http", + "headers": { + "Authorization": f"Bearer {token}", + "X-Redpanda-Genai-Conversation": conversation_id, + }, + } + ) + return servers + + +def kickoff(prompt: str, llm: LLM, tools) -> str: + agent = Agent(role="Assistant", goal="Help the user", backstory="", llm=llm, tools=tools) + crew = Crew( + agents=[agent], + tasks=[Task(description=prompt, agent=agent, expected_output="A reply")], + ) + return str(crew.kickoff()) + + +def run_conversation(conversation_id: str, prompt: str) -> str: + # One crew.kickoff() == one conversation: build the LLM and MCP clients with + # this id so the model call and every tool call carry the same header.
+ llm = build_llm(conversation_id) + servers = mcp_server_params(conversation_id) + if not servers: + return kickoff(prompt, llm, []) + with MCPServerAdapter(servers) as tools: + return kickoff(prompt, llm, list(tools)) + + +if __name__ == "__main__": + print(run_conversation("user-123-conversation-1", "What tools can you call?")) +---- +-- + +ADK Java:: ++ +-- +[source,java] +---- +package com.redpanda.example; + +import com.fasterxml.jackson.databind.JsonNode; +import com.fasterxml.jackson.databind.ObjectMapper; +import com.google.adk.agents.LlmAgent; +import com.google.adk.models.langchain4j.LangChain4j; +import com.google.adk.runner.Runner; +import com.google.adk.sessions.InMemorySessionService; +import com.google.adk.sessions.Session; +import com.google.adk.tools.mcp.McpToolset; +import com.google.adk.tools.mcp.StreamableHttpServerParameters; +import com.google.genai.types.Content; +import com.google.genai.types.Part; +import dev.langchain4j.model.anthropic.AnthropicChatModel; +import dev.langchain4j.model.chat.ChatModel; +import dev.langchain4j.model.googleai.GoogleAiGeminiChatModel; +import dev.langchain4j.model.openai.OpenAiChatModel; +import java.net.URI; +import java.net.URLEncoder; +import java.net.http.HttpClient; +import java.net.http.HttpRequest; +import java.net.http.HttpResponse; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.List; +import java.util.Locale; +import java.util.Map; + +public final class Main { + + private Main() {} + + public static void main(String[] args) throws Exception { + String token = accessToken(); + String appName = "redpanda-self-managed-agent"; + String userId = "user-123"; + + // ADK owns the session and mints its id - that id IS the conversation, never + // a hardcoded constant. langchain4j fixes customHeaders at build time, so we + // create the session first (createSession with a null id mints one), then + // pass its id to the chat model and the runner. 
The same id rides the LLM + // call and every MCP tool call. One run is one conversation. + InMemorySessionService sessions = new InMemorySessionService(); + Session session = sessions.createSession(appName, userId).blockingGet(); + String sessionId = session.id(); + + String model = env("REDPANDA_LLM_MODEL"); + ChatModel chat = buildChatModel(token, sessionId); + + // MCP: same bearer + conversation id, fixed per toolset construction. The + // agent is built from the toolsets, so the model can call their tools. + Map<String, String> mcpHeaders = + Map.of( + "Authorization", "Bearer " + token, + "X-Redpanda-Genai-Conversation", sessionId); + String mcpBase = env("REDPANDA_MCP_BASE_URL"); + List<Object> tools = new ArrayList<>(); + for (String name : mcpServers()) { + tools.add( + new McpToolset( + StreamableHttpServerParameters.builder() + .url(mcpBase + "/" + name) + .headers(mcpHeaders) + .build())); + } + + LlmAgent agent = + LlmAgent.builder() + .name("assistant") + .description("Self-managed agent on the Redpanda AI Gateway.") + .instruction("You are a helpful agent.") + .model(LangChain4j.builder().chatModel(chat).modelName(model).build()) + .tools(tools) + .build(); + + // Build the runner over the SAME session service, so it sees the session we + // just minted above. + Runner runner = + Runner.builder().agent(agent).appName(appName).sessionService(sessions).build(); + + Content message = Content.fromParts(Part.fromText("What tools can you call?")); + runner + .runAsync(userId, sessionId, message) + .blockingForEach(event -> System.out.println(event.stringifyContent())); + } + + /** + * buildChatModel constructs the native langchain4j ChatModel for the configured provider. + * + *

REDPANDA_LLM_PROVIDER_TYPE selects the wire: "anthropic" speaks /v1/messages and "google" + * speaks /v1beta/...:generateContent (the gateway forwards both to the upstream), everything else + * speaks OpenAI chat-completions. All three point at the same provider-scoped + * REDPANDA_LLM_PROVIDER_URL. + * + *

Auth is the gateway bearer, sent on the Authorization header via langchain4j's + * customHeaders(Map) - fixed at build time, so it also carries the (fixed) conversation id. The + * gateway authenticates on that bearer and ignores the native x-api-key/x-goog-api-key, so we + * never send a real provider key (OpenAI/Anthropic require a non-empty apiKey, so we pass a dummy + * placeholder; Gemini sends no key at all). + */ + private static ChatModel buildChatModel(String token, String sessionId) { + String base = env("REDPANDA_LLM_PROVIDER_URL"); + String model = env("REDPANDA_LLM_MODEL"); + Map<String, String> headers = + Map.of( + "Authorization", "Bearer " + token, + "X-Redpanda-Genai-Conversation", sessionId); + String type = System.getenv("REDPANDA_LLM_PROVIDER_TYPE"); + switch (type == null ? "" : type.toLowerCase(Locale.ROOT)) { + case "anthropic": + // Native Anthropic Messages API. langchain4j posts to {baseUrl}/messages, so the base URL + // carries the version segment: {provider-url}/v1 -> {provider-url}/v1/messages. + return AnthropicChatModel.builder() + .baseUrl(base + "/v1") + .apiKey("redpanda") // dummy; gateway injects the real key and ignores x-api-key + .modelName(model) + .customHeaders(headers) + .build(); + case "google": + case "gemini": + // Native Gemini API. langchain4j posts to {baseUrl}/models/{model}:generateContent, so the + // base URL carries the version segment: {provider-url}/v1beta. We do NOT call apiKey(...) - + // leaving it null suppresses the x-goog-api-key header; auth rides the Authorization bearer + // in customHeaders (requires langchain4j 1.15.0+). + return GoogleAiGeminiChatModel.builder() + .baseUrl(base + "/v1beta") + .modelName(model) + .customHeaders(headers) + .build(); + default: // openai (and openai-compatible) + // OpenAI chat-completions. langchain4j posts to {baseUrl}/chat/completions; the provider + // URL is the base as-is (the gateway's OpenAI upstream already includes /v1).
+ return OpenAiChatModel.builder() + .baseUrl(base) + .apiKey("redpanda") // dummy; gateway injects the real key and ignores it + .modelName(model) + .customHeaders(headers) + .build(); + } + } + + /** mcpServers reads the comma-separated REDPANDA_MCP_SERVERS list. */ + private static List<String> mcpServers() { + List<String> out = new ArrayList<>(); + String raw = System.getenv("REDPANDA_MCP_SERVERS"); + if (raw != null) { + for (String name : raw.split(",")) { + name = name.trim(); + if (!name.isEmpty()) { + out.add(name); + } + } + } + return out; + } + + /** accessToken runs the OAuth2 client_credentials grant against the gateway IDP. */ + private static String accessToken() throws Exception { + String form = + "grant_type=client_credentials" + + "&client_id=" + + enc(env("REDPANDA_CLIENT_ID")) + + "&client_secret=" + + enc(env("REDPANDA_CLIENT_SECRET")); + HttpRequest request = + HttpRequest.newBuilder(URI.create(env("REDPANDA_TOKEN_URL"))) + .header("Content-Type", "application/x-www-form-urlencoded") + .POST(HttpRequest.BodyPublishers.ofString(form)) + .build(); + HttpResponse<String> response = + HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); + JsonNode node = new ObjectMapper().readTree(response.body()); + return node.get("access_token").asText(); + } + + private static String enc(String value) { + return URLEncoder.encode(value, StandardCharsets.UTF_8); + } + + /** + * env reads a required env var, failing with a clear message (not an opaque downstream NPE) when + * it is unset. Export the values from the Setup tab.
+ */ + private static String env(String k) { + String v = System.getenv(k); + if (v == null || v.isEmpty()) { + throw new IllegalStateException("missing env var " + k + " - export it from the Setup tab"); + } + return v; + } +} +---- +-- + +ADK Go:: ++ +-- +[source,go] +---- +package main + +import ( + "context" + "fmt" + "log" + "net/http" + "os" + "strings" + + "golang.org/x/oauth2" + "golang.org/x/oauth2/clientcredentials" + + "github.com/modelcontextprotocol/go-sdk/mcp" + "google.golang.org/genai" + + "google.golang.org/adk/agent" + "google.golang.org/adk/agent/llmagent" + "google.golang.org/adk/model/gemini" + "google.golang.org/adk/runner" + "google.golang.org/adk/session" + "google.golang.org/adk/tool" + "google.golang.org/adk/tool/mcptoolset" +) + +const appName = "redpanda-self-managed-agent" + +// convoKey carries the ADK session id on the context. +type convoKey struct{} + +// convoTransport stamps the session id (read from the context) as the +// conversation header. It sits beneath the oauth2 transport, so one http.Client +// carries the bearer AND the conversation id. ADK threads the ctx you pass to +// runner.Run down to both the LLM HTTP call and the MCP tool-call POSTs. +type convoTransport struct{ base http.RoundTripper } + +func (t *convoTransport) RoundTrip(r *http.Request) (*http.Response, error) { + if id, ok := r.Context().Value(convoKey{}).(string); ok && id != "" { + r = r.Clone(r.Context()) + r.Header.Set("X-Redpanda-Genai-Conversation", id) // = ADK session id + } + return t.base.RoundTrip(r) +} + +func main() { + ctx := context.Background() + + // OAuth2 client_credentials: x/oauth2 fetches and refreshes the bearer and + // its Transport sets it on every request; convoTransport underneath adds the + // conversation header. One client instruments the LLM call and every MCP call. 
+ cc := clientcredentials.Config{ + ClientID: mustEnv("REDPANDA_CLIENT_ID"), + ClientSecret: mustEnv("REDPANDA_CLIENT_SECRET"), + TokenURL: mustEnv("REDPANDA_TOKEN_URL"), + } + hc := &http.Client{Transport: &oauth2.Transport{ + Source: cc.TokenSource(ctx), + Base: &convoTransport{base: http.DefaultTransport}, + }} + + // genai refuses to construct the Gemini-API client without a non-empty + // APIKey, but the real auth is the bearer the oauth2 transport sets - this + // just satisfies the constructor (the gateway ignores the x-goog-api-key). + model, err := gemini.NewModel(ctx, mustEnv("REDPANDA_LLM_MODEL"), &genai.ClientConfig{ + APIKey: "redpanda-gateway", + HTTPClient: hc, + HTTPOptions: genai.HTTPOptions{BaseURL: mustEnv("REDPANDA_LLM_PROVIDER_URL")}, + }) + if err != nil { + log.Fatal(err) + } + + // Each MCP server becomes a Toolset over the SAME client; the agent is built + // from them via llmagent.Config.Toolsets, so the model can call the tools. + mcpBase := mustEnv("REDPANDA_MCP_BASE_URL") + var toolsets []tool.Toolset + for _, name := range mcpServers() { + ts, err := mcptoolset.New(mcptoolset.Config{ + Transport: &mcp.StreamableClientTransport{Endpoint: mcpBase + "/" + name, HTTPClient: hc}, + }) + if err != nil { + log.Fatal(err) + } + toolsets = append(toolsets, ts) + } + + a, err := llmagent.New(llmagent.Config{ + Name: "assistant", + Model: model, + Description: "Self-managed agent on the Redpanda AI Gateway.", + Instruction: "You are a helpful agent.", + Toolsets: toolsets, + }) + if err != nil { + log.Fatal(err) + } + + sessionService := session.InMemoryService() + r, err := runner.New(runner.Config{ + AppName: appName, + Agent: a, + SessionService: sessionService, + }) + if err != nil { + log.Fatal(err) + } + + // ADK owns the session; its id IS the conversation. Create it, put the id on + // ctx, and the transport stamps it on the LLM call and every MCP tool call. 
+ resp, err := sessionService.Create(ctx, &session.CreateRequest{AppName: appName, UserID: "user-123"}) + if err != nil { + log.Fatal(err) + } + sessionID := resp.Session.ID() + ctx = context.WithValue(ctx, convoKey{}, sessionID) + + const prompt = "What tools can you call?" + fmt.Printf("> %s\n\n", prompt) + msg := genai.NewContentFromText(prompt, genai.RoleUser) + for ev, err := range r.Run(ctx, "user-123", sessionID, msg, agent.RunConfig{}) { + if err != nil { + log.Fatal(err) + } + if ev.LLMResponse.Content == nil { + continue + } + for _, p := range ev.LLMResponse.Content.Parts { + fmt.Print(p.Text) // the assistant's reply, streamed as parts arrive + } + } + fmt.Println() +} + +// mustEnv reads a required env var, exiting with a clear message (not an opaque +// downstream panic) when it is unset. Export the values from the Setup tab. +func mustEnv(k string) string { + v := os.Getenv(k) + if v == "" { + log.Fatalf("missing env var %s - export it from the Setup tab", k) + } + return v +} + +// mcpServers reads the comma-separated REDPANDA_MCP_SERVERS list. Empty is fine +// - the agent then runs with no MCP tools. +func mcpServers() []string { + var out []string + for _, p := range strings.Split(os.Getenv("REDPANDA_MCP_SERVERS"), ",") { + if p = strings.TrimSpace(p); p != "" { + out = append(out, p) + } + } + return out +} +---- +-- + +====== + +NOTE: ADK Go ships only Gemini-shaped models, so the ADK Go sample works against a Google provider only. For an OpenAI or Anthropic provider, use one of the other frameworks. + +== Observe the agent + +Because the agent's traffic flows through the gateway, ADP attributes its activity without any instrumentation in your code: + +* *Cost & Usage*: Spend, tokens, and latency roll up to the agent. See them on the agent's *Cost & Usage* tab and in xref:control:budgets.adoc[budgets]. +* *Transcripts*: Each session that carries the conversation header appears on the agent's *Transcripts* tab. 
See xref:monitor:transcripts.adoc[See what your agent did]. + +Transcript message text is recorded per LLM provider and is off by default. A transcript always shows token usage, latency, and tool calls; it shows the prompt and response text only when input and output message recording is turned on for the provider. See xref:gateway:configure-provider.adoc[Configure an LLM provider]. + +== Troubleshooting + +[cols="1,2"] +|=== +|Symptom |What to check + +|`401` on the token request +|The Client ID or client secret is wrong, or the secret expired or was revoked. The Client ID must be the full `serviceaccounts/` value. Issue a new secret on the *Credentials* tab. + +|`403` with `model_not_allowed` +|The model is not on the provider's allow-list. Pick a model the provider serves. The *Setup* tab fills in a valid model for you. + +|`404` from the LLM endpoint +|The provider name in the URL does not match a configured provider. Confirm the segment after `/providers/` matches the provider's name exactly. + +|The *Transcripts* tab stays empty +|The agent is not sending the `X-Redpanda-Genai-Conversation` header. Stamp it on every model call and tool call with the framework's session identifier. + +|A transcript shows usage but no message text +|Message recording is off for the agent's LLM provider. Turn on input and output message recording in the provider settings. Recording applies to future conversations only. 
+|=== + +== Next steps + +* xref:gateway:connect-agent.adoc[Connect your app to AI Gateway] +* xref:monitor:transcripts.adoc[See what your agent did] +* xref:connect:create-agent.adoc[Create an agent] diff --git a/modules/monitor/pages/byoa-telemetry.adoc b/modules/monitor/pages/byoa-telemetry.adoc deleted file mode 100644 index 518f711..0000000 --- a/modules/monitor/pages/byoa-telemetry.adoc +++ /dev/null @@ -1,162 +0,0 @@ -= Send BYOA Telemetry -:description: Emit OpenTelemetry traces from your BYOA (Bring Your Own Agent) so the Agentic Data Plane can attribute calls, costs, and tool invocations to your agent. Covers the minimum required span contract, common optional attributes, and how it differs from a Redpanda-managed agent. -:page-topic-type: how-to -:personas: agent_builder, platform_engineer -:learning-objective-1: Identify the resource attributes and span attributes a BYOA agent must emit so transcripts and cost rollups can attribute calls correctly -:learning-objective-2: Choose between optional enrichment attributes that improve cost and usage reporting fidelity (model, tool, agent name, conversation, cache tokens) -:learning-objective-3: Validate a BYOA agent's telemetry by reading the resulting transcript and confirming non-zero metric values - -A *BYOA* (Bring Your Own Agent) is an agent you operate yourself, outside Redpanda's managed runtime. To make it visible across transcripts, cost rollups, and the agent registry, your agent must emit OpenTelemetry traces with a specific minimum set of resource attributes and span attributes. Emit the right attributes from the start to avoid missing traces and misattributed cost data. - -For the full OTLP ingestion flow (deploying the Connect pipeline, authenticating, sending traces over HTTP or gRPC), see xref:monitor:ingest-custom-traces.adoc[Ingest custom traces]. This page focuses on _what_ to emit; that page covers _how_ to send it. 
- -After reading this page, you will be able to: - -* [ ] {learning-objective-1} -* [ ] {learning-objective-2} -* [ ] {learning-objective-3} - -== Why this matters - -When an agent runs, the Agentic Data Plane reconstructs a turn-by-turn transcript from the spans the agent (and its LLM, MCP server, sub-agent calls) emit. The transcripts UI groups, labels, and totals fields read directly from span attributes. If your agent omits a required attribute, the corresponding column in cost and usage reporting or in transcripts shows as empty, zero, or unattributed. - -Redpanda-managed agents emit the contract automatically through the runtime. BYOA agents must emit it themselves. - -== Minimum required contract - -These attributes _must_ appear on your agent's spans for transcripts and cost rollups to surface non-empty values. - -=== Resource attributes - -Set on the OTel `Resource` so every span the agent emits inherits them: - -[cols="1,3"] -|=== -|Attribute |Required value - -|`service.name` -|A stable identifier for your agent. Surfaces as the agent identity on transcripts and as the `Name` column in the governance Agents list, and the `service.name` filter chip. Use a slug-style name like `support-bot-prod` or `pricing-agent-eu`. - -// TODO: confirm whether `service.name` should match an `AgentRegistryService` resource name once BYOA registration ships, or whether the two stay decoupled (transcripts attribute by `service.name`, governance attributes by registered resource name). Open Q with team-ai. -|=== - -=== Span attributes - -Set on every relevant span: - -[cols="1,3"] -|=== -|Attribute |Required value - -|`gen_ai.conversation.id` -|A stable identifier shared across every span in the same conversation (system prompt, user turn, assistant turn, tool call, sub-agent call). Drives the `Conversation ID` in the transcript header and the cross-service-trace filter. Use a UUID per conversation. 
- -|`gen_ai.operation.name` -|One of `invoke_agent`, `chat`, or `execute_tool`. Drives turn-role inference (`SYSTEM` / `USER` / `ASSISTANT` / `TOOL`). A span with no `gen_ai.operation.name` cannot be classified into a transcript role. - -|`gen_ai.request.model` -|The LLM model identifier the agent calls. Surfaces in the transcript turn header and in the *Cost & Usage* model breakdown. Required on `chat`-operation spans; optional on `invoke_agent` and `execute_tool` spans. - -|`gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` -|Token counts on LLM-call spans. Drive the token totals in cost and usage reporting and the per-turn USD-cost calculation in transcripts. Without them the cost column reads `0`. -|=== - -If your agent emits these five attributes plus the resource `service.name`, every cost and usage report and every transcript field has a non-empty value to render. - -== Recommended enrichment - -Cost and usage reporting degrades gracefully without these, but their presence lets the UI build richer views. - -[cols="1,3"] -|=== -|Attribute |Why it helps - -|`gen_ai.provider.name` -|Labels the LLM provider (`openai`, `anthropic`, `gemini`, `bedrock`). Drives the provider filter and per-provider grouping in cost and usage reporting. Without it, spend groups everything under an unknown-provider bucket. - -|`gen_ai.agent.name` -|A human-readable agent label distinct from `service.name`. Use when the same service runs multiple logical agents (for example, `support-bot/refunds`, `support-bot/onboarding`). - -|`gen_ai.tool.name` -|On `execute_tool` spans, identifies which tool was invoked. Drives the `Tool name` attribute filter and the per-tool latency view in transcripts. - -|`gen_ai.usage.cache_read_input_tokens` -|Cache-hit token count on LLM-call spans. Surfaces in the `CACHED` bucket in cost and usage reporting and in per-turn cost. Without it, the `CACHED` bucket reads `0` even when your agent reuses a system prompt that the upstream cached. 
- -|`gen_ai.input.messages` and `gen_ai.output.messages` -|Conversation content. Used to reconstruct turn content, and required for transcript history reconstruction when older spans are evicted from `redpanda.otel_traces` (see xref:monitor:concepts.adoc#history-reconstruction[Reconstructed transcript history]). Without them, evicted spans render as empty turns rather than reconstructed turns. -|=== - -[NOTE] -==== -Latency and timestamps come from OTel span `start_time` and `end_time` automatically; you don't need to add a separate `latency` attribute. -==== - -== Span hierarchy - -Transcripts read your agent's span tree to lay out turns. The recognized span types (matched by `gen_ai.operation.name` and span name) are documented in xref:monitor:concepts.adoc[Observability]. The four span shapes are: - -* *Top-level span*: One per agent invocation. Sets `gen_ai.operation.name = "invoke_agent"`, carries the conversation ID and service name. -* *Reasoning or chat spans*: Set `gen_ai.operation.name = "chat"` for LLM calls. Carry the model, token counts, and provider attributes. -* *Tool spans*: Set `gen_ai.operation.name = "execute_tool"` for tool invocations. Carry the tool name and arguments. -* *Sub-agent spans*: Set `gen_ai.operation.name = "invoke_agent"` nested under a parent agent's span when one BYOA agent calls another. - -Parent-child relationships are expressed through OTel's standard `parent_span_id`. Keep the tree faithful to your agent's call graph; the transcript turn order follows it. - -== Validate with a transcript - -After your agent emits a few traces, confirm they surface in ADP: - -. Confirm the `Name` column in the governance Agents list shows your `service.name` value. If it shows blank or `unknown`, the resource attribute didn't make it through. -. Open one of the agent's transcripts and confirm the `Conversation ID` in the summary header matches the UUID your agent emitted. -. 
Look at the assistant turn in the *Detailed* view for token counts and latency. Non-zero values mean the LLM-call span attributes are correctly emitted. -. If you sent a tool call, expand the TOOL turn and confirm the `Tool name` renders. - -If any field shows blank or zero unexpectedly, the corresponding attribute is missing or misnamed in your agent's instrumentation. - -== Authentication - -BYOA agents authenticate against the OTLP ingest endpoint with a service-account access token from your organization. Send the token in `Authorization: Bearer ` (HTTP) or `authorization: Bearer ` (gRPC). - -For the token-acquisition flow and endpoint URL format, see xref:monitor:ingest-custom-traces.adoc[Ingest custom traces]. - -// TODO: confirm the standalone-ADP service-account auth model for OTLP ingest once the standalone product ships. The current page assumes federation in Redpanda Cloud, where service-account credentials come from Cloud Organization IAM. For standalone ADP, replace with the ADP-native auth model and update the cross-link. - -== Where to find code examples - -The xref:monitor:ingest-custom-traces.adoc[Ingest custom traces] page has full HTTP and gRPC examples in Python, Node.js, and Go, each instrumenting an LLM call with the GenAI semantic-convention attributes. Adapt the examples to your agent's framework. The attribute set is the same; only the OTel SDK ergonomics differ. - -// TODO: once the BYOA tutorials track ships at GA, link a worked end-to-end BYOA agent example here (from the Examples folder or a tutorials page). - -== Troubleshooting - -Common symptoms and fixes: - -[cols="1,2"] -|=== -|Symptom |What to check - -|Agent missing from the governance Agents list -|Resource `service.name` is missing or set after the SDK was initialized. Set it at SDK construction. - -|`Conversation ID` missing in transcript header -|`gen_ai.conversation.id` not on the top-level invoke_agent span. 
Add it on the agent's outer span; child spans inherit it through the trace. - -|Token / USD cost columns show `0` for assistant turns -|`gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` aren't on the LLM-call span. The model's response carries them; lift them onto the span before it ends. - -|Tool calls not visible in the transcript -|`gen_ai.operation.name = "execute_tool"` is missing on the tool span. Also confirm the tool span is parented to an assistant span, not the root. - -|Agent shows up in transcripts but not in the agent registry (`AgentRegistryService.ListAgents`) -|Transcripts attribute by `service.name` resource attribute; cost attribution and the agent registry attribute by registered agent resource. BYOA agent registration ships separately. See xref:connect:byoa-register.adoc[Register your own agent (BYOA)]. - -|Older turns in a long conversation render as `is_reconstructed` -|Spans were evicted from `redpanda.otel_traces` retention. Reconstruction works only if your agent emitted `gen_ai.input.messages` and `gen_ai.output.messages` on later spans. See xref:monitor:concepts.adoc#history-reconstruction[Reconstructed transcript history]. -|=== - -== Next steps - -* xref:monitor:ingest-custom-traces.adoc[Ingest custom traces] -* xref:monitor:transcripts.adoc[Read a transcript] -* xref:connect:byoa-register.adoc[Register your own agent (BYOA)] diff --git a/modules/monitor/pages/ingest-custom-traces.adoc b/modules/monitor/pages/ingest-custom-traces.adoc deleted file mode 100644 index 95aa809..0000000 --- a/modules/monitor/pages/ingest-custom-traces.adoc +++ /dev/null @@ -1,625 +0,0 @@ -= Ingest OpenTelemetry Traces from Custom Agents -:description: Configure a Redpanda Connect pipeline to ingest OpenTelemetry traces from custom agents into Redpanda's immutable log for unified governance and observability. 
-:page-topic-type: how-to -:personas: agent_builder, platform_engineer -:learning-objective-1: pass:q[Configure and deploy a Redpanda Connect pipeline to receive OpenTelemetry traces from custom agents through HTTP and publish them to `redpanda.otel_traces`] -:learning-objective-2: Validate trace data format and compatibility with existing MCP server traces -:learning-objective-3: Secure the ingestion endpoint using authentication mechanisms - -// TODO (standalone-ADP rewrite): This page was written when ADP lived inside a Redpanda Cloud cluster, so the setup flow deploys a Redpanda Connect pipeline on the user's own cluster and references cluster-id-shaped URLs (`.pipelines..clusters.rdpa.co`). Post-2026-04-21, ADP ships as its own product surface and users won't necessarily have a Redpanda Cloud cluster. The whole ingestion flow (prereqs, pipeline deployment, endpoint URL format, auth) needs a rewrite once the standalone-ADP ingestion path is defined. Tracking under ADP Docs Plan Workflow #11 (BYOA telemetry). Until then, the content below is accurate for the federated-in-cloud-ui preview but will mislead standalone-ADP users. - -You can extend Redpanda's transcript observability to custom agents built with frameworks like LangChain or instrumented with OpenTelemetry SDKs. By ingesting traces from external applications into the `redpanda.otel_traces` topic, you gain unified visibility across all agent executions, from Redpanda's declarative agents, Remote MCP servers, to your own custom implementations. - -After reading this page, you will be able to: - -* [ ] {learning-objective-1} -* [ ] {learning-objective-2} -* [ ] {learning-objective-3} - -== Prerequisites - -* A Redpanda Connect pipeline host. Ability to manage secrets on that host. -// TODO: Replace with the standalone-ADP ingestion target once defined (may no longer require a Redpanda Cloud cluster). 
-* The latest version of xref:reference:rpk-install.adoc[`rpk`] installed -* Custom agent or application instrumented with OpenTelemetry SDK -* Basic understanding of the https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/[OpenTelemetry span format^] and https://opentelemetry.io/docs/specs/otlp/[OpenTelemetry Protocol (OTLP)^] - -== Quickstart for LangChain users - -If you're using LangChain with OpenTelemetry tracing, you can send traces to Redpanda's `redpanda.otel_traces` glossterm:topic[] to view them in the Transcripts view. - -. Configure LangChain's OpenTelemetry integration by following the https://docs.langchain.com/langsmith/trace-with-opentelemetry[LangChain documentation^]. - -. Deploy a Redpanda Connect pipeline using the `otlp_http` input to receive OTLP traces over HTTP. Create the pipeline in the *Connect* page. For a sample configuration, see <>. -// TODO: Update the deployment entry point once the standalone-ADP ingestion flow is defined. - -. Configure your OTEL exporter to send traces to your Redpanda Connect pipeline using environment variables: -+ -[,bash] ----- -# Configure LangChain OTEL integration -export LANGSMITH_OTEL_ENABLED=true -export LANGSMITH_TRACING=true - -# Send traces to Redpanda Connect pipeline (use your pipeline URL) -export OTEL_EXPORTER_OTLP_ENDPOINT="https://.pipelines..clusters.rdpa.co" -export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer " ----- - -By default, traces are sent to both LangSmith and your Redpanda Connect pipeline. If you want to send traces only to Redpanda (not LangSmith), set: - -[,bash] ----- -export LANGSMITH_OTEL_ONLY="true" ----- - -Your LangChain application will send traces to the `redpanda.otel_traces` topic, making them visible in the Transcripts view alongside Remote MCP server and declarative agent traces. - -For non-LangChain applications or custom instrumentation, continue with the sections below. 
- -== About custom trace ingestion - -Custom agents are applications with OpenTelemetry instrumentation that operate independently of Redpanda's Remote MCP servers or declarative agents (such as LangChain, CrewAI, or manually instrumented applications). - -When these agents send traces to `redpanda.otel_traces`, you gain unified observability alongside Remote MCP server and declarative agent traces. See xref:monitor:concepts.adoc#cross-service-transcripts[Cross-service transcripts] for details on how traces correlate across services. - -=== Trace format requirements - -Custom agents must emit traces in OTLP format. The xref:connect:components:inputs/otlp_http.adoc[`otlp_http`] input accepts both OTLP Protobuf (`application/x-protobuf`) and JSON (`application/json`) payloads. For <>, use the xref:connect:components:inputs/otlp_grpc.adoc[`otlp_grpc`] input. - -Each trace must follow the OTLP specification with these required fields: - -[cols="1,3", options="header"] -|=== -| Field | Description - -| `traceId` -| Hex-encoded unique identifier for the entire trace - -| `spanId` -| Hex-encoded unique identifier for this span - -| `name` -| Descriptive operation name - -| `startTimeUnixNano` and `endTimeUnixNano` -| Timing information in nanoseconds - -| `instrumentationScope` -| Identifies the library that created the span - -| `status` -| Operation status with code (0 = UNSET, 1 = OK, 2 = ERROR) -|=== - -Optional but recommended fields: - -- `parentSpanId` for hierarchical traces -- `attributes` for contextual information - -For complete trace structure details, see xref:monitor:concepts.adoc#understand-the-transcript-structure[Understand the transcript structure]. - -== Configure the ingestion pipeline - -Create a Redpanda Connect pipeline that receives OTLP traces and publishes them to the `redpanda.otel_traces` topic. Choose HTTP or gRPC transport based on your agent's requirements. 
- -=== Create the pipeline configuration - -Create a pipeline configuration file that defines the OTLP ingestion endpoint. - -[tabs] -==== -HTTP:: -+ --- -The `otlp_http` input component: - -* Exposes an OpenTelemetry Collector HTTP receiver -* Accepts traces at the standard `/v1/traces` endpoint -* Converts incoming OTLP data into individual Redpanda OTEL v1 Protobuf messages - -The following example shows a minimal pipeline configuration. Redpanda ADP automatically injects authentication handling, so you don't need to configure `auth_token` in the input. - -[,yaml] ----- -input: - otlp_http: {} - -output: - redpanda: - seed_brokers: - - "${PRIVATE_REDPANDA_BROKERS}" - tls: - enabled: ${PRIVATE_REDPANDA_TLS_ENABLED} - sasl: - - mechanism: "REDPANDA_CLOUD_SERVICE_ACCOUNT" - topic: "redpanda.otel_traces" ----- --- - -gRPC:: -+ --- -The `otlp_grpc` input component: - -* Exposes an OpenTelemetry Collector gRPC receiver -* Accepts traces through the OTLP gRPC protocol -* Converts incoming OTLP data into individual Redpanda OTEL v1 Protobuf messages - -The following example shows a minimal pipeline configuration. ADP automatically injects authentication handling. - -[,yaml] ----- -input: - otlp_grpc: {} - -output: - redpanda: - seed_brokers: - - "${PRIVATE_REDPANDA_BROKERS}" - tls: - enabled: ${PRIVATE_REDPANDA_TLS_ENABLED} - sasl: - - mechanism: "REDPANDA_CLOUD_SERVICE_ACCOUNT" - topic: "redpanda.otel_traces" ----- - -NOTE: Clients must include the authentication token in gRPC metadata as `authorization: Bearer `. --- -==== - -The OTLP input automatically handles format conversion, so no processors are needed for basic trace ingestion. Each span becomes a separate message in the `redpanda.otel_traces` topic. - -=== Deploy the ingestion pipeline - -// TODO (standalone-ADP): Steps below assume a Redpanda Cloud cluster with Connect enabled. The standalone ADP product surface may expose a different deployment entry point. Revisit the whole subsection under Workflow #11. 
- -. In the *Connect* page, click *Create Pipeline*. -. For the input, select the *otlp_http* (or *otlp_grpc*) component. -. Skip to *Add a topic* and select `redpanda.otel_traces` from the list of existing topics. Leave the default advanced settings. -. In the *Add permissions* step, create a service account with write access to the `redpanda.otel_traces` topic. -. In the *Create pipeline* step, enter a name for your pipeline and paste the configuration. Authentication for incoming requests is handled automatically. - -== Send traces from your custom agent - -Configure your custom agent to send OpenTelemetry traces to the pipeline endpoint. After deploying the pipeline, you can find its URL on the pipeline details page in the host UI. -// TODO (standalone-ADP): Confirm where users find the pipeline URL once the ingestion path moves out of the Redpanda Cloud UI. - -[cols="1,3", options="header"] -|=== -| Transport | URL Format - -| HTTP -| `+https://.pipelines..clusters.rdpa.co/v1/traces+` - -| gRPC -| `.pipelines..clusters.rdpa.co:443` -|=== - -=== Authenticate to the pipeline - -The OTLP pipeline authenticates with a service account access token. Obtain an access token using your service account credentials as described in xref:cloud-data-platform:security:cloud-authentication.adoc#authenticate-to-the-cloud-api[Authenticate to the Cloud API]. -// TODO (standalone-ADP): Update the auth model when the standalone ADP ingestion path is defined. - -Include the token in your requests: - -* HTTP: Set the `Authorization` header to `Bearer ` -* gRPC: Set the `authorization` metadata field to `Bearer ` - -=== Configure your OTEL exporter - -Install the OpenTelemetry SDK for your language and configure the OTLP exporter to target your Redpanda Connect pipeline endpoint. 
- -The exporter configuration requires: - -* Endpoint: Your pipeline's URL (the SDK adds `/v1/traces` automatically for HTTP) -* Headers: Authorization header with your bearer token -* Protocol: HTTP to match the `otlp_http` input (or gRPC for `otlp_grpc`) - -[tabs] -====== -HTTP:: -+ --- -.View Python example -[%collapsible] -==== -[,python] ----- -from opentelemetry import trace -from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter -from opentelemetry.sdk.trace import TracerProvider -from opentelemetry.sdk.trace.export import BatchSpanProcessor -from opentelemetry.sdk.resources import Resource - -# Configure resource attributes to identify your agent -resource = Resource(attributes={ - "service.name": "my-custom-agent", - "service.version": "1.0.0" -}) - -# Configure the OTLP HTTP exporter -exporter = OTLPSpanExporter( - endpoint="https://.pipelines..clusters.rdpa.co/v1/traces", - headers={"Authorization": "Bearer YOUR_TOKEN"} -) - -# Set up tracing with batch processing -provider = TracerProvider(resource=resource) -processor = BatchSpanProcessor(exporter) -provider.add_span_processor(processor) -trace.set_tracer_provider(provider) - -# Use the tracer with GenAI semantic conventions -tracer = trace.get_tracer(__name__) -with tracer.start_as_current_span( - "invoke_agent my-assistant", - kind=trace.SpanKind.INTERNAL -) as span: - # Set GenAI semantic convention attributes - span.set_attribute("gen_ai.operation.name", "invoke_agent") - span.set_attribute("gen_ai.agent.name", "my-assistant") - span.set_attribute("gen_ai.provider.name", "openai") - span.set_attribute("gen_ai.request.model", "gpt-4") - - # Your agent logic here - result = process_request() - - # Set token usage if available - span.set_attribute("gen_ai.usage.input_tokens", 150) - span.set_attribute("gen_ai.usage.output_tokens", 75) ----- -==== - -.View Node.js example -[%collapsible] -==== -[,javascript] ----- -const { NodeTracerProvider } = 
require('@opentelemetry/sdk-trace-node'); -const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http'); -const { BatchSpanProcessor } = require('@opentelemetry/sdk-trace-base'); -const { Resource } = require('@opentelemetry/resources'); -const { trace, SpanKind } = require('@opentelemetry/api'); - -// Configure resource -const resource = new Resource({ - 'service.name': 'my-custom-agent', - 'service.version': '1.0.0' -}); - -// Configure OTLP HTTP exporter -const exporter = new OTLPTraceExporter({ - url: 'https://.pipelines..clusters.rdpa.co/v1/traces', - headers: { - 'Authorization': 'Bearer YOUR_TOKEN' - } -}); - -// Set up provider -const provider = new NodeTracerProvider({ resource }); -provider.addSpanProcessor(new BatchSpanProcessor(exporter)); -provider.register(); - -// Use the tracer with GenAI semantic conventions -const tracer = trace.getTracer('my-agent'); -const span = tracer.startSpan('invoke_agent my-assistant', { - kind: SpanKind.INTERNAL -}); - -// Set GenAI semantic convention attributes -span.setAttribute('gen_ai.operation.name', 'invoke_agent'); -span.setAttribute('gen_ai.agent.name', 'my-assistant'); -span.setAttribute('gen_ai.provider.name', 'openai'); -span.setAttribute('gen_ai.request.model', 'gpt-4'); - -// Your agent logic -processRequest().then(result => { - // Set token usage if available - span.setAttribute('gen_ai.usage.input_tokens', 150); - span.setAttribute('gen_ai.usage.output_tokens', 75); - span.end(); -}); ----- -==== - -.View Go example -[%collapsible] -==== -[,go] ----- -package main - -import ( - "context" - "log" - - "go.opentelemetry.io/otel" - "go.opentelemetry.io/otel/attribute" - "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp" - "go.opentelemetry.io/otel/sdk/resource" - sdktrace "go.opentelemetry.io/otel/sdk/trace" - semconv "go.opentelemetry.io/otel/semconv/v1.26.0" - "go.opentelemetry.io/otel/trace" -) - -func main() { - ctx := context.Background() - - // Configure OTLP HTTP 
exporter - exporter, err := otlptracehttp.New(ctx, - otlptracehttp.WithEndpoint(".pipelines..clusters.rdpa.co"), - otlptracehttp.WithHeaders(map[string]string{ - "Authorization": "Bearer YOUR_TOKEN", - }), - ) - if err != nil { - log.Fatalf("Failed to create exporter: %v", err) - } - - // Configure resource - res, _ := resource.New(ctx, - resource.WithAttributes( - semconv.ServiceName("my-custom-agent"), - semconv.ServiceVersion("1.0.0"), - ), - ) - - // Set up tracer provider - tp := sdktrace.NewTracerProvider( - sdktrace.WithBatcher(exporter), - sdktrace.WithResource(res), - ) - defer tp.Shutdown(ctx) - otel.SetTracerProvider(tp) - - tracer := tp.Tracer("my-agent") - - // Create span with GenAI semantic conventions - _, span := tracer.Start(ctx, "invoke_agent my-assistant", - trace.WithSpanKind(trace.SpanKindInternal), - ) - span.SetAttributes( - attribute.String("gen_ai.operation.name", "invoke_agent"), - attribute.String("gen_ai.agent.name", "my-assistant"), - attribute.String("gen_ai.provider.name", "openai"), - attribute.String("gen_ai.request.model", "gpt-4"), - attribute.Int("gen_ai.usage.input_tokens", 150), - attribute.Int("gen_ai.usage.output_tokens", 75), - ) - span.End() - - tp.ForceFlush(ctx) -} ----- -==== --- - -gRPC:: -+ --- -.View Python example -[%collapsible] -==== -[,python] ----- -from opentelemetry import trace -from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter -from opentelemetry.sdk.trace import TracerProvider -from opentelemetry.sdk.trace.export import BatchSpanProcessor -from opentelemetry.sdk.resources import Resource - -resource = Resource(attributes={ - "service.name": "my-custom-agent", - "service.version": "1.0.0" -}) - -# gRPC endpoint without https:// prefix -exporter = OTLPSpanExporter( - endpoint=".pipelines..clusters.rdpa.co:443", - headers={"authorization": "Bearer YOUR_TOKEN"} -) - -provider = TracerProvider(resource=resource) -provider.add_span_processor(BatchSpanProcessor(exporter)) 
-trace.set_tracer_provider(provider) ----- -==== - -.View Node.js example -[%collapsible] -==== -[,javascript] ----- -const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node'); -const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc'); -const { BatchSpanProcessor } = require('@opentelemetry/sdk-trace-base'); -const { Resource } = require('@opentelemetry/resources'); - -const resource = new Resource({ - 'service.name': 'my-custom-agent', - 'service.version': '1.0.0' -}); - -// gRPC exporter with TLS -const exporter = new OTLPTraceExporter({ - url: 'https://.pipelines..clusters.rdpa.co:443', - headers: { - 'authorization': 'Bearer YOUR_TOKEN' - } -}); - -const provider = new NodeTracerProvider({ resource }); -provider.addSpanProcessor(new BatchSpanProcessor(exporter)); -provider.register(); ----- -==== - -.View Go example -[%collapsible] -==== -[,go] ----- -package main - -import ( - "context" - - "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc" - "google.golang.org/grpc" - "google.golang.org/grpc/credentials" -) - -func createGRPCExporter(ctx context.Context) (*otlptracegrpc.Exporter, error) { - return otlptracegrpc.New(ctx, - otlptracegrpc.WithEndpoint(".pipelines..clusters.rdpa.co:443"), - otlptracegrpc.WithDialOption(grpc.WithTransportCredentials(credentials.NewTLS(nil))), - otlptracegrpc.WithHeaders(map[string]string{ - "authorization": "Bearer YOUR_TOKEN", - }), - ) -} ----- -==== --- -====== - -TIP: Use environment variables for the endpoint URL and authentication token to keep credentials out of your code. - -=== Use recommended semantic conventions - -The Transcripts view recognizes https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/[OpenTelemetry semantic conventions for GenAI operations^]. Following these conventions ensures your traces display correctly with proper attribution, token usage, and operation identification. 
- -==== Required attributes for agent operations - -Following the OpenTelemetry semantic conventions, agent spans should include these attributes: - -* Operation identification: -** `gen_ai.operation.name` - Set to `"invoke_agent"` for agent execution spans -** `gen_ai.agent.name` - Human-readable name of your agent (displayed in Transcripts view) -* LLM provider details: -** `gen_ai.provider.name` - LLM provider identifier (for example, `"openai"`, `"anthropic"`, `"gcp.vertex_ai"`) -** `gen_ai.request.model` - Model name (for example, `"gpt-4"`, `"claude-sonnet-4"`) -* Token usage (for cost tracking): -** `gen_ai.usage.input_tokens` - Number of input tokens consumed -** `gen_ai.usage.output_tokens` - Number of output tokens generated -* Session correlation: -** `gen_ai.conversation.id` - Identifier linking related agent invocations in the same conversation - -==== Required attributes for proper display - -Set these attributes on your spans for proper display and filtering in the Transcripts view: - -[cols="2,3", options="header"] -|=== -| Attribute | Purpose - -| `gen_ai.operation.name` -| Set to `"invoke_agent"` for agent execution spans - -| `gen_ai.agent.name` -| Human-readable name displayed in Transcripts view - -| `gen_ai.provider.name` -| LLM provider (for example, `"openai"`, `"anthropic"`) - -| `gen_ai.request.model` -| Model name (for example, `"gpt-4"`, `"claude-sonnet-4"`) - -| `gen_ai.usage.input_tokens` / `gen_ai.usage.output_tokens` -| Token counts for cost tracking - -| `gen_ai.conversation.id` -| Links related agent invocations in the same conversation -|=== - -See the code examples earlier in this page for how to set these attributes in Python, Node.js, or Go. - -=== Validate trace format - -Before deploying to production, verify your traces match the expected format: - -. Run your agent locally and enable debug logging in your OpenTelemetry SDK to inspect outgoing spans. -. 
Verify required fields are present: - * `traceId`, `spanId`, `name` - * `startTimeUnixNano`, `endTimeUnixNano` - * `instrumentationScope` with a `name` field - * `status` with a `code` field (1 for success, 2 for error) -. Check that `service.name` is set in the resource attributes to identify your agent in the Transcripts view. -. Verify GenAI semantic convention attributes if you want proper display in the Transcripts view: - * `gen_ai.operation.name` set to `"invoke_agent"` for agent spans - * `gen_ai.agent.name` for agent identification - * Token usage attributes if tracking costs - -== Verify trace ingestion - -After deploying your pipeline and configuring your custom agent, verify traces are flowing correctly. - -=== Consume traces from the topic - -Check that traces are being published to the `redpanda.otel_traces` topic: - -[,bash] ----- -rpk topic consume redpanda.otel_traces --offset end -n 10 ----- - -You can also view the `redpanda.otel_traces` topic in the *Topics* page of the host UI. - -Look for spans with your custom `instrumentationScope.name` to identify traces from your agent. - -=== View traces in Transcripts - -After your custom agent sends traces through the pipeline, they appear in the *Transcripts* view in ADP alongside traces from Remote MCP servers, declarative agents, and AI Gateway. - -==== Identify custom agent transcripts - -Custom agent transcripts are identified by the `service.name` resource attribute, which differs from Redpanda's built-in services (`ai-agent` for declarative agents, `mcp-{server-id}` for MCP servers). See xref:monitor:concepts.adoc#cross-service-transcripts[Cross-service transcripts] to understand how the `service.name` attribute identifies transcript sources. 
- -Your custom agent transcripts display with: - -* **Service name** in the service filter dropdown (from your `service.name` resource attribute) -* **Agent name** in span details (from the `gen_ai.agent.name` attribute) -* **Operation names** like `"invoke_agent my-assistant"` indicating agent executions - -For detailed instructions on filtering, searching, and navigating transcripts in the UI, see xref:monitor:transcripts.adoc[View Transcripts]. - -==== Token usage tracking - -If your spans include the recommended token usage attributes (`gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens`), they display in the summary panel's token usage section. This enables cost tracking alongside Remote MCP server and declarative agent transcripts. - -== Troubleshooting - -If traces from your custom agent aren't appearing in the Transcripts view, use these diagnostic steps to identify and resolve common ingestion issues. - -=== Pipeline not receiving requests - -If your custom agent cannot reach the ingestion endpoint: - -. Verify the endpoint URL format: - * HTTP: `\https://.pipelines..clusters.rdpa.co/v1/traces` - * gRPC: `.pipelines..clusters.rdpa.co:443` (no `https://` prefix for gRPC clients) -. Check network connectivity and firewall rules. -. Ensure authentication tokens are valid and properly formatted in the `Authorization: Bearer ` header (HTTP) or `authorization` metadata field (gRPC). -. Verify the Content-Type header matches your data format (`application/x-protobuf` or `application/json`). -. Review pipeline logs for connection errors or authentication failures. - -=== Traces not appearing in topic - -If requests succeed but traces do not appear in `redpanda.otel_traces`: - -. Check pipeline output configuration. -. Verify topic permissions. -. Validate trace format matches OTLP specification. - -== Limitations - -* The `otlp_http` and `otlp_grpc` inputs accept only traces, logs, and metrics, not profiles. 
-* Only traces are published to the `redpanda.otel_traces` topic. -* Exceeded rate limits return HTTP 429 (HTTP) or ResourceExhausted status (gRPC). - -== Next steps - -* xref:monitor:transcripts.adoc[] -* xref:connect:components:inputs/otlp_http.adoc[OTLP HTTP input reference] -* xref:connect:components:inputs/otlp_grpc.adoc[OTLP gRPC input reference] diff --git a/modules/reference/pages/glossary.adoc b/modules/reference/pages/glossary.adoc index 11c2489..29ef5ad 100644 --- a/modules/reference/pages/glossary.adoc +++ b/modules/reference/pages/glossary.adoc @@ -7,7 +7,6 @@ * glossterm:AI agent[] * glossterm:AI Gateway[] * glossterm:declarative agent[] -* glossterm:BYOA (bring your own agent)[] * glossterm:MCP server[] * glossterm:MCP tool[] * glossterm:tool[]