feat: add MCP tool server discovery and invocation endpoint #1437

Open
syn-zhu wants to merge 2 commits into kagent-dev:main from syn-zhu:feat/mcp-tool-server-discovery

syn-zhu commented Mar 5, 2026

Summary

Adds three new MCP tools to the kagent-controller's existing MCP endpoint at :8083/mcp for dynamic tool server discovery and invocation across all tool source types.

| Tool | Purpose |
| --- | --- |
| list_tool_servers | List all tool servers (RemoteMCPServer, Service, MCPServer) |
| list_tools | Connect to a tool server and return its available tools |
| call_tool | Invoke a specific tool on a specific tool server |

Closes #1436

Changes

  • internal/mcp/transport.go (new) — Shared MCP transport creation with header resolution, extracted from reconciler pattern
  • internal/mcp/mcp_tool_server_handler.go (new) — Input/output types, ref parsing (Kind/namespace/name), session caching with evict-and-retry, three tool handler methods
  • internal/mcp/mcp_handler.go (modified) — Added sessions sync.Map field, registered three new tools, session cleanup in Shutdown
  • internal/mcp/mcp_tool_server_handler_test.go (new) — 22 unit tests covering ref parsing, resource resolution, listing all types, namespace filtering, and input validation

Design decisions

  • Unified ref format: Kind/namespace/name (e.g. RemoteMCPServer/default/my-server) — unambiguous across resource types
  • Reuses existing URL derivation: ConvertServiceToRemoteMCPServer and ConvertMCPServerToRemoteMCPServer from the translator package
  • MCPServer CRD optional: Gracefully returns empty list if kmcp is not installed
  • Session caching: Proven sync.Map + evict-on-stale pattern from the original kmcp#123 implementation
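
The session caching decision above can be pictured with a small sketch. This is a minimal illustration under stated assumptions, not the code in mcp_tool_server_handler.go: the session type, the dial function, and errStale are hypothetical stand-ins for a real MCP client session, and the retry here fires on any error, whereas a real handler would likely retry only on transport-level failures.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// session is a hypothetical stand-in for an MCP client session.
type session struct {
	id    int
	stale bool
}

// errStale is a hypothetical error representing a dead connection.
var errStale = errors.New("session is stale")

// call simulates invoking a tool over the session.
func (s *session) call(tool string) (string, error) {
	if s.stale {
		return "", errStale
	}
	return fmt.Sprintf("session %d ran %s", s.id, tool), nil
}

// sessionCache keeps one session per server ref in a sync.Map.
type sessionCache struct {
	sessions sync.Map // ref (string) -> *session
	dial     func(ref string) (*session, error)
}

// get returns the cached session for ref, dialing a new one on a miss.
func (c *sessionCache) get(ref string) (*session, error) {
	if v, ok := c.sessions.Load(ref); ok {
		return v.(*session), nil
	}
	s, err := c.dial(ref)
	if err != nil {
		return nil, err
	}
	// LoadOrStore keeps whichever session was stored first if two
	// goroutines dial concurrently (a real implementation would also
	// close the session that lost the race).
	actual, _ := c.sessions.LoadOrStore(ref, s)
	return actual.(*session), nil
}

// callTool tries the cached session, and on failure evicts it and
// retries exactly once with a fresh session.
func (c *sessionCache) callTool(ref, tool string) (string, error) {
	s, err := c.get(ref)
	if err != nil {
		return "", err
	}
	out, err := s.call(tool)
	if err == nil {
		return out, nil
	}
	// Evict only the session that failed, so a replacement stored by
	// another goroutine in the meantime is left in place.
	c.sessions.CompareAndDelete(ref, s)
	s, err = c.get(ref)
	if err != nil {
		return "", err
	}
	return s.call(tool)
}

func main() {
	next := 0
	cache := &sessionCache{dial: func(ref string) (*session, error) {
		next++
		return &session{id: next}, nil
	}}
	ref := "RemoteMCPServer/default/my-server"

	out, _ := cache.callTool(ref, "echo")
	fmt.Println(out) // served by the first session

	// Mark the cached session stale; the next call evicts and redials.
	s, _ := cache.get(ref)
	s.stale = true
	out, _ = cache.callTool(ref, "echo")
	fmt.Println(out) // served by a freshly dialed session
}
```

CompareAndDelete requires Go 1.20 or later; on older toolchains a plain Delete would work at the cost of occasionally evicting a healthy replacement.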

Background

This moves the functionality originally proposed in kagent-dev/kmcp#123 into kagent per reviewer feedback, since kagent already watches all three tool source types and has the existing MCP handler infrastructure.

Test plan

  • go build ./... — compiles cleanly
  • go vet ./... — no issues
  • go test ./go/core/internal/mcp/... — all 22 tests pass
  • E2E: Deploy to cluster, port-forward to :8083, verify list_tool_servers / list_tools / call_tool

🤖 Generated with Claude Code


Copilot AI left a comment

Pull request overview

This PR extends kagent’s MCP and agent toolchain to support dynamic MCP tool server discovery/invocation, while also adding STS-audience scoping support for MCP tool calls and improving observability via trace/header propagation and additional Python client instrumentation.

Changes:

  • Added three MCP tools (list_tool_servers, list_tools, call_tool) with session caching and resource resolution across RemoteMCPServer, labeled Services, and optional MCPServer CRs.
  • Introduced STS audience plumbing end-to-end (CRDs/API types → translator → ADK runtime) and expanded unit tests for header propagation and ADK STS integration behavior.
  • Improved tracing/propagation: aiohttp client OTEL instrumentation (Python) and W3C trace-context capture/forwarding (Go), plus Service appProtocol updates in translator test outputs.
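
The W3C trace-context capture and forwarding mentioned in the last bullet can be sketched roughly as follows. This is an illustration only, not the contents of go/core/internal/tracecontext: the function names capture, inject, and forward are hypothetical, and the only facts assumed are the standard header names traceparent and tracestate.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// ctxKey is an unexported key type so other packages cannot collide with it.
type ctxKey struct{}

// traceHeaders lists the W3C Trace Context header names.
var traceHeaders = []string{"traceparent", "tracestate"}

// capture copies any trace-context headers from an incoming request
// into the returned context. (Hypothetical helper name.)
func capture(ctx context.Context, in http.Header) context.Context {
	saved := http.Header{}
	for _, name := range traceHeaders {
		if v := in.Get(name); v != "" {
			saved.Set(name, v)
		}
	}
	if len(saved) == 0 {
		return ctx // nothing to propagate
	}
	return context.WithValue(ctx, ctxKey{}, saved)
}

// inject writes previously captured headers onto an outgoing request.
// (Hypothetical helper name.)
func inject(ctx context.Context, out http.Header) {
	saved, ok := ctx.Value(ctxKey{}).(http.Header)
	if !ok {
		return
	}
	for _, name := range traceHeaders {
		if v := saved.Get(name); v != "" {
			out.Set(name, v)
		}
	}
}

// forward shows the two steps end to end: capture from the incoming
// headers, then inject into a fresh set of upstream headers.
func forward(in http.Header) http.Header {
	ctx := capture(context.Background(), in)
	out := http.Header{}
	inject(ctx, out)
	return out
}

func main() {
	in := http.Header{}
	in.Set("traceparent", "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	in.Set("Authorization", "Bearer not-forwarded")

	out := forward(in)
	fmt.Println(out.Get("traceparent"))
	fmt.Println(out.Get("Authorization") == "") // only trace headers are copied
}
```

Storing the headers in the context keeps the propagation independent of any particular tracing SDK; whether the PR takes that approach or defers to an OpenTelemetry propagator is not stated in the review.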

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| python/packages/kagent-core/src/kagent/core/tracing/_utils.py | Adds optional aiohttp-client OTEL instrumentation during tracing configure. |
| python/packages/kagent-core/pyproject.toml | Adds aiohttp-client OTEL instrumentation dependency. |
| python/packages/kagent-adk/tests/unittests/test_header_propagation.py | Updates STS header provider signature + adds sts_audience coverage tests. |
| python/packages/kagent-adk/src/kagent/adk/types.py | Adds sts_audience support to MCP tool configs and forwards it into header providers/toolsets. |
| python/packages/kagent-adk/src/kagent/adk/_mcp_toolset.py | Extends MCP toolset wrapper to store sts_audience. |
| python/packages/agentsts-adk/tests/test_adk_integration.py | Updates ADK token propagation tests for audience-scoped exchange + before_tool behavior. |
| python/packages/agentsts-adk/src/agentsts/adk/_base.py | Implements audience-scoped STS exchange (deferred to before_tool_callback) and per-audience caching. |
| helm/kagent-crds/templates/kagent.dev_remotemcpservers.yaml | Adds stsAudience to RemoteMCPServer CRD schema (Helm). |
| helm/kagent-crds/templates/kagent.dev_agents.yaml | Adds stsAudience to Agent→McpServerTool schema (Helm). |
| go/core/test/e2e/mocks/mock_sts_server.go | Mock STS server now includes aud in generated token when provided. |
| go/core/pkg/auth/auth.go | Simplifies sessionKey declaration. |
| go/core/internal/tracecontext/tracecontext.go | New helper package to store/retrieve W3C trace headers in context. |
| go/core/internal/mcp/transport.go | New shared MCP transport creation with header resolution. |
| go/core/internal/mcp/mcp_tool_server_handler_test.go | Adds unit tests for tool server ref parsing, resolution, and tool handlers. |
| go/core/internal/mcp/mcp_tool_server_handler.go | Implements list_tool_servers/list_tools/call_tool handlers with session caching + conversion logic. |
| go/core/internal/mcp/mcp_handler.go | Registers new MCP tools, adds session cache, and closes cached sessions on shutdown. |
| go/core/internal/httpserver/auth/authn.go | Captures incoming tracecontext headers and injects them into outgoing upstream requests. |
| go/core/internal/controller/translator/agent/testdata/outputs/tls-with-system-cas-disabled.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/tls-with-disabled-verify.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/tls-with-custom-ca.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/ollama_agent.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/bedrock_agent.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/basic_agent.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/anthropic_agent.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_system_message_from_secret.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_system_message_from_configmap.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_sts_audience_override.json | New translator golden output for per-agent STS audience override. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_sts_audience.json | New translator golden output for RemoteMCPServer STS audience default. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_streaming.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_skills.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_security_context.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_scheduling_attributes.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_proxy_service.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_proxy_mcpserver_custom_timeout.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_proxy_mcpserver.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_proxy_external_remotemcp.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_proxy.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_passthrough.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_nested_agent.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_mcp_service.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_http_toolserver.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_custom_sa.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_cross_namespace_tools.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_code.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/outputs/agent_with_allowed_headers.json | Updates expected Service port output to include appProtocol. |
| go/core/internal/controller/translator/agent/testdata/inputs/agent_with_sts_audience_override.yaml | New translator input fixture for per-agent STS audience override. |
| go/core/internal/controller/translator/agent/testdata/inputs/agent_with_sts_audience.yaml | New translator input fixture for RemoteMCPServer STS audience default. |
| go/core/internal/controller/translator/agent/adk_api_translator.go | Sets Service appProtocol and adds STS audience resolution/forwarding into ADK tool configs. |
| go/api/v1alpha2/zz_generated.deepcopy.go | Updates deep-copies for new STSAudience fields. |
| go/api/v1alpha2/remotemcpserver_types.go | Adds stsAudience to RemoteMCPServerSpec. |
| go/api/v1alpha2/agent_types.go | Adds stsAudience override to McpServerTool. |
| go/api/config/crd/bases/kagent.dev_remotemcpservers.yaml | Adds stsAudience to RemoteMCPServer CRD schema (base). |
| go/api/config/crd/bases/kagent.dev_agents.yaml | Adds stsAudience to Agent CRD schema (base). |
| go/api/adk/types.go | Adds sts_audience field to ADK MCP server config JSON. |

syn-zhu force-pushed the feat/mcp-tool-server-discovery branch 2 times, most recently from 9cbec8b to d7e9b1b (March 5, 2026 12:18)
Adds three new MCP tools to the kagent-controller's existing MCP endpoint
at :8083/mcp, enabling dynamic discovery and invocation of tools across
all tool source types:

- list_tool_servers: Lists all tool servers (RemoteMCPServer, Service
  with kagent.dev/mcp-service=true label, MCPServer CRs)
- list_tools: Connects to a tool server and returns its tool catalog
- call_tool: Invokes a specific tool on a specific tool server

This moves the functionality originally proposed in kagent-dev/kmcp#123
into kagent per reviewer feedback, since kagent already watches all three
resource types and has the existing MCP handler infrastructure.

Key design decisions:
- Unified ref format: Kind/namespace/name (e.g. RemoteMCPServer/default/my-server)
- Session caching with evict-and-retry for stale connections
- Reuses existing ConvertServiceToRemoteMCPServer and
  ConvertMCPServerToRemoteMCPServer from the translator package
- MCPServer CRD is optional (graceful degradation if not installed)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Simon Zhu <simon.zhu@mongodb.com>
syn-zhu force-pushed the feat/mcp-tool-server-discovery branch from d7e9b1b to 00fef77 (March 5, 2026 12:45)
EItanya left a comment

My main comment here is that refs are very confusing, and we've removed them from other parts of the codebase. Can we please move away from them here as well?

Comment on lines +82 to +97
// --- Ref parsing ---

// parseServerRef parses a "Kind/namespace/name" reference into components.
func parseServerRef(ref string) (kind, namespace, name string, err error) {
	parts := strings.SplitN(ref, "/", 3)
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return "", "", "", fmt.Errorf("invalid server reference %q: must be Kind/namespace/name (e.g. RemoteMCPServer/default/my-server)", ref)
	}
	kind, namespace, name = parts[0], parts[1], parts[2]
	switch kind {
	case "RemoteMCPServer", "Service", "MCPServer":
		return kind, namespace, name, nil
	default:
		return "", "", "", fmt.Errorf("unknown server kind %q: must be RemoteMCPServer, Service, or MCPServer", kind)
	}
}

Why are we using string refs everywhere which need to be parsed? Can't we just use explicit fields for these values?
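
One way to read this suggestion is sketched below. It is only an illustration of the "explicit fields" idea, not code from the PR: ServerTarget, ListToolsInput, and Validate are hypothetical names, and the JSON field names are assumptions. The set of accepted kinds mirrors the switch in parseServerRef.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ServerTarget is a hypothetical replacement for the "Kind/namespace/name"
// string ref: each component is its own field, so nothing needs parsing.
type ServerTarget struct {
	Kind      string `json:"kind"`
	Namespace string `json:"namespace"`
	Name      string `json:"name"`
}

// Validate applies the same checks parseServerRef performs after splitting.
func (t ServerTarget) Validate() error {
	switch t.Kind {
	case "RemoteMCPServer", "Service", "MCPServer":
	default:
		return fmt.Errorf("unknown server kind %q: must be RemoteMCPServer, Service, or MCPServer", t.Kind)
	}
	if t.Namespace == "" || t.Name == "" {
		return errors.New("namespace and name are required")
	}
	return nil
}

// ListToolsInput is a hypothetical tool input that embeds the target
// as structured fields instead of a single ref string.
type ListToolsInput struct {
	Server ServerTarget `json:"server"`
}

func main() {
	raw := []byte(`{"server":{"kind":"RemoteMCPServer","namespace":"default","name":"my-server"}}`)

	var in ListToolsInput
	if err := json.Unmarshal(raw, &in); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	if err := in.Server.Validate(); err != nil {
		fmt.Println("invalid input:", err)
		return
	}
	fmt.Printf("%s %s/%s\n", in.Server.Kind, in.Server.Namespace, in.Server.Name)
}
```

With structured fields, the tool's JSON schema can also describe each component separately, which a single ref string cannot express.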

Comment on lines +42 to +52
type ListToolServersOutput struct {
	Servers []ToolServerSummary `json:"servers"`
}

type ToolServerSummary struct {
	Ref      string `json:"ref"`              // "Kind/namespace/name"
	Kind     string `json:"kind"`             // "RemoteMCPServer", "Service", "MCPServer"
	URL      string `json:"url"`              // resolved endpoint URL
	Protocol string `json:"protocol"`         // "STREAMABLE_HTTP" or "SSE"
	Status   string `json:"status,omitempty"` // "Ready" / "NotReady" (MCPServer only)
}

Can we re-use the types from the list HTTP endpoints?
