Merged
8 changes: 8 additions & 0 deletions .gitignore
@@ -49,3 +49,11 @@ site/node_modules/
# Temporary files
*.tmp
*.bak

# AI assistant local config (per-developer, do not commit)
.gemini/

# Built plugin binaries (compiled artifacts of examples/plugins/*)
examples/plugins/*/awf-plugin-*
!examples/plugins/*/awf-plugin-*.go
!examples/plugins/*/awf-plugin-*_test.go
49 changes: 49 additions & 0 deletions .go-arch-lint.yml
@@ -25,6 +25,7 @@ commonComponents:
  - pkg-httpx
  - pkg-output
  - pkg-registry
  - pkg-mcpserver

vendors:
  go-stdlib:
@@ -205,6 +206,9 @@ components:
  pkg-validation:
    in: ../pkg/validation

  pkg-mcpserver:
    in: ../pkg/mcpserver

  # PROTOBUF
  proto-plugin:
    in: ../proto/plugin/v1
@@ -213,6 +217,9 @@ components:
  application:
    in: application

  application-tools:
    in: application/tools

  # INFRASTRUCTURE LAYER
  infra-agents:
    in: infrastructure/agents
@@ -271,6 +278,12 @@ components:
  infra-otel:
    in: infrastructure/otel

  infra-tools:
    in: infrastructure/tools

  infra-tools-builtins:
    in: infrastructure/tools/builtins

  infra-roles:
    in: infrastructure/roles

@@ -306,6 +319,8 @@ components:
deps:
# DOMAIN — only stdlib (+ pkg via commonComponents)
  domain-workflow:
    mayDependOn:
      - domain-errors
    canUse:
      - go-stdlib
      - go-sync
@@ -344,6 +359,7 @@ deps:
      - domain-errors
      - domain-plugin
      - domain-operation
      - application-tools
      - infra-agents
      - infra-expression
      - infra-github
@@ -352,12 +368,22 @@ deps:
      - infra-repository
      - infra-roles
      - infra-skills
      - infra-tools
      - infra-tools-builtins
      - infra-xdg
    canUse:
      - go-stdlib
      - go-sync
      - uuid

  application-tools:
    mayDependOn:
      - domain-ports
      - domain-errors
      - domain-plugin
    canUse:
      - go-stdlib

# INFRASTRUCTURE — domain + vendors
  infra-agents:
    mayDependOn:
@@ -564,10 +590,31 @@ deps:
    canUse:
      - go-stdlib

  pkg-mcpserver:
    canUse:
      - go-stdlib

  infra-tools:
    mayDependOn:
      - domain-ports
      - domain-plugin
      - domain-errors
    canUse:
      - go-stdlib

  infra-tools-builtins:
    mayDependOn:
      - domain-ports
      - domain-plugin
      - domain-errors
    canUse:
      - go-stdlib

# INTERFACES — wiring layer (app + infra + domain)
  interfaces-cli:
    mayDependOn:
      - application
      - application-tools
      - domain-workflow
      - domain-ports
      - domain-errors
@@ -592,6 +639,8 @@ deps:
      - infra-skills
      - infra-store
      - infra-tokenizer
      - infra-tools
      - infra-tools-builtins
      - infra-updater
      - infra-workflowpkg
      - infra-xdg
Empty file added .zpm/kb/default/journal.wal
Empty file.
11 changes: 11 additions & 0 deletions .zpm/mounts.json
@@ -0,0 +1,11 @@
{
  "version": 1,
  "mounts": [
    {
      "name": "default",
      "path": ".zpm/kb/default",
      "scope": "project",
      "mode": "rw"
    }
  ]
}
12 changes: 6 additions & 6 deletions CLAUDE.md
@@ -217,7 +217,6 @@ func TestWorkflowValidation(t *testing.T) {

## Architecture Rules

- Application layer must persist source metadata (SetSourceData) after successful infrastructure installation; omitting state blocks downstream operations like updates
- Use dual import aliases (e.g., infrastructurePlugin + registry) when consuming refactored packages; explicitly requalify all symbol references to prevent ambiguity
- Keep thin wrapper functions in original location for backward compatibility; delegate completely to extracted packages to maintain single source of truth
- Verify pkg/ package extractions are complete by confirming orphaned imports are removed and make lint passes with zero violations
@@ -240,12 +239,10 @@ func TestWorkflowValidation(t *testing.T) {
- Server owns background task coordination (WaitGroup); pass by pointer to handlers and coordinate shutdown: httpSrv.Shutdown() then sseWG.Wait()
- Always update `.go-arch-lint.yml` when adding new infrastructure components; register the package and document its dependency rules in the commit message
- When implementing infrastructure adapters that follow established patterns (e.g., FilesystemAgentRoleRepository mirrors FilesystemSkillRepository), reuse shared utilities (skills.StripFrontmatter) to maintain single source of truth
- Provide doc.go for new packages in pkg/ and infrastructure/ subdirectories; document architecture assumptions, error codes, protocol behavior, and implementation patterns (aim for 100+ lines)

## Common Pitfalls

- Always provide graceful fallback to stateless mode when optional session ID extraction fails; never fail the entire operation due to extraction errors
- When migrating API JSON field names, parse both old and new keys with new key preferred; use dual-key parsing for backwards compatibility without validation errors
- Leverage Go's map[string]any behavior to silently ignore unsupported provider options; avoids validation errors while maintaining clear intent
- Avoid variable shadowing; never redeclare outer-scope variables with := in inner blocks
- Use index-based loops or pointer ranges when iterating large structs (>128 bytes); avoid per-iteration copying
- Limit function return values to 5; return a struct for 6+ outputs to maintain readability
@@ -284,11 +281,12 @@ func TestWorkflowValidation(t *testing.T) {
- Never use standard YAML unmarshaling for skill metadata; implement frontmatter parsing (YAML header between --- delimiters) to preserve metadata
- Never skip testing XDG directory fallback paths; code will fail on systems without XDG_DATA_HOME and XDG_CONFIG_HOME variables set
- Major feature implementations require supporting infrastructure changes (ExecutionContext getters, helper modifications); document rationale in commit message and update validation plan if discovered
- Prepend MCP-only instructions to system prompts in all MCP provider injectors before applying mutations; verify implementation across Codex, Opencode, and other MCP providers
- Accumulate streaming tool_call deltas by index when assembling tool calls from chunked responses; track name and arguments separately, then validate and return errors for invalid JSON instead of empty slices
- Always test tool handlers and CLI command construction with shell metacharacters, empty inputs, and special characters; verify proper escaping in all output parsing and command formatting
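
The delta-accumulation rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `toolCallDelta`, `toolCall`, and `assembleToolCalls` are hypothetical names, and the delta shape is a simplified stand-in for a provider's streamed chunk format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// toolCallDelta is a hypothetical, simplified shape for one streamed fragment.
type toolCallDelta struct {
	Index     int
	Name      string // typically present only on the first fragment of a call
	Arguments string // partial JSON text, concatenated in arrival order
}

// toolCall is the assembled result for one index.
type toolCall struct {
	Name      string
	Arguments json.RawMessage
}

// assembleToolCalls groups fragments by index, tracks name and arguments
// separately, and returns an error (never an empty slice) when a call has
// no name or its concatenated arguments are not valid JSON.
func assembleToolCalls(deltas []toolCallDelta) ([]toolCall, error) {
	type partial struct {
		name string
		args strings.Builder
	}
	parts := map[int]*partial{}
	for _, d := range deltas {
		p, ok := parts[d.Index]
		if !ok {
			p = &partial{}
			parts[d.Index] = p
		}
		if d.Name != "" {
			p.name = d.Name
		}
		p.args.WriteString(d.Arguments)
	}

	// Emit calls in index order regardless of the order fragments arrived in.
	indices := make([]int, 0, len(parts))
	for i := range parts {
		indices = append(indices, i)
	}
	sort.Ints(indices)

	calls := make([]toolCall, 0, len(indices))
	for _, i := range indices {
		p := parts[i]
		args := p.args.String()
		if p.name == "" {
			return nil, fmt.Errorf("tool call %d: missing name", i)
		}
		if !json.Valid([]byte(args)) {
			return nil, fmt.Errorf("tool call %d (%s): invalid JSON arguments", i, p.name)
		}
		calls = append(calls, toolCall{Name: p.name, Arguments: json.RawMessage(args)})
	}
	return calls, nil
}
```

In this sketch an empty argument string is treated as invalid JSON; a real handler may prefer to normalize it to `{}` before validation.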

## Test Conventions

- Use _Integration suffix for tests requiring live agent execution or system dependencies; keep unit tests suffix-less in domain/application/infrastructure packages
- Separate provider output format validation tests into dedicated *_extract_test.go files; verify extraction patterns before session resume integration tests
- Document provider output format assumptions (JSON wrapper field names, text patterns) in code comments; validate assumptions with assertion-based tests before production
- Update all YAML fixtures when removing option support from code; synchronize fixtures with validation rule changes to prevent accidental bypass of removed validations
- Add //nolint:gosec to test code with controlled inputs when GOSEC flags false positives
@@ -307,6 +305,8 @@ func TestWorkflowValidation(t *testing.T) {
- Always write unit tests for CLI helper functions; parseInputFlags, resolvePromptInput, categorizeError must have >80% coverage before commit
- HTTP servers require unit tests for the server struct itself: route registration, API initialization, graceful shutdown, not just individual handlers
- Organize interface layer test fixtures in tests/fixtures/<interface-type>/ with descriptive names (e.g., api-simple-success.yaml, api-failing.yaml)
- Write tests validating streaming tool call assembly across all scenarios: single chunk, multiple chunks, parallel calls, out-of-order indices, and malformed JSON arguments
- Always write dedicated unit tests for tool handlers (bash, glob, grep, read, write, edit); test option parsing, argument escaping, and error conditions independent of integration tests

## Review Standards

1 change: 1 addition & 0 deletions README.md
@@ -16,6 +16,7 @@ A Go CLI tool for orchestrating AI agents (Claude, Gemini, Codex, GitHub Copilot
- **External Prompt Files** - Load agent prompts from `.md` files with full template interpolation, helper functions, and local override support
- **External Script Files** - Load commands from external script files with shebang-based interpreter dispatch, template interpolation, path resolution, and local override support
- **Conversation Mode** - Multi-turn conversations with native session resume for CLI providers (`claude`, `codex`, `gemini`, `opencode`, `github_copilot`), automatic context window management for HTTP providers, mid-conversation context injection via `inject_context` field, and token tracking across all turns
- **MCP Proxy** - Intercept and audit AI agent tool calls via Model Context Protocol (MCP); re-expose the 6 built-in tools (`Read`, `Write`, `Edit`, `Bash`, `Glob`, `Grep`) with full observability (OTel spans, structured logs); expose custom gRPC plugin operations as MCP tools; optional full interception or additive mode per step
- **OpenAI-Compatible Provider** - Use any Chat Completions API (OpenAI, Ollama, vLLM, Groq) with native HTTP integration, accurate token reporting, and no CLI tool required
- **Parallel Execution** - Run multiple steps concurrently with configurable strategies
- **Loop Constructs** - For-each and while loops with full context access
92 changes: 92 additions & 0 deletions docs/ADR/017-mcp-proxy-stdio-subprocess-for-tool-interception.md
@@ -0,0 +1,92 @@
---
title: "017: MCP Proxy via Per-Step stdio Subprocess for Tool Interception"
---

**Status**: Accepted
**Date**: 2026-05-23
**Issue**: F099
**Supersedes**: N/A
**Superseded by**: N/A

## Context

AWF orchestrates AI agents (Claude, Gemini, Codex, OpenCode, OpenAI Compatible) that invoke file system and shell tools as part of workflow execution. Currently those tool calls are entirely opaque to AWF: the agent CLI receives a prompt, runs, and returns output — AWF cannot intercept, audit, or extend the tool set the agent uses.

F099 must solve three problems simultaneously:

1. **Interception**: Make AWF's 6 built-in tools (`Read`, `Write`, `Edit`, `Bash`, `Glob`, `Grep`) available to agents via a structured protocol, so that tool calls are observable and auditable (OTel spans, structured logs).
2. **Extension**: Allow AWF gRPC plugins (existing `ports.OperationProvider` implementations) to expose their operations as agent tools without agents knowing about AWF's plugin model.
3. **Multi-provider support**: The mechanism must work across five agents with different injection APIs — four stdio CLIs and one HTTP provider — without requiring provider-specific tool-call logic in the domain or application layers.

Two protocol-level questions are load-bearing beyond this feature:

- **Which protocol** governs the host–agent tool call contract? The answer locks in an external-facing API that plugin SDK authors and workflow authors will depend on.
- **What process topology** delivers that protocol? The answer determines crash isolation, subprocess lifecycle complexity, and client compatibility across all five providers.

## Candidates

### Protocol

| Option | Pros | Cons |
|--------|------|------|
| **MCP 2024-11-05 (Model Context Protocol)** | Already supported by Claude, Gemini, Codex, OpenCode; JSON-RPC 2.0 base; standardized `tools/list` + `tools/call` semantics; schema-first tool definitions | Subset selection required; not all features needed |
| **Custom JSON-RPC over stdio** | Full control over schema | No CLI support out-of-box; every provider needs a custom adapter; no ecosystem tooling |
| **OpenAI `tools[]` HTTP format only** | Native to OpenAI Compatible provider; well-documented | Not supported by stdio CLIs (Claude, Gemini, Codex, OpenCode); two protocols required anyway |

### Process Topology

| Option | Description | Files changed | Risk |
|--------|-------------|---------------|------|
| **A: In-process MCP server** | AWF embeds the MCP server as a goroutine; agents connect via UNIX socket | ~10 | High — UNIX socket transport is nonstandard for Claude/Gemini CLI; stdio is the documented path |
| **B: Per-step subprocess `awf mcp-serve`** | AWF spawns `awf mcp-serve --config=<tmp>` per step; agents connect via stdio JSON-RPC | ~15 | Medium — subprocess lifecycle, but proven pattern from `RPCPluginManager` |
| **C: External MCP server via go-plugin gRPC** | Proxy as a go-plugin gRPC plugin loaded by AWF | ~25 | High — unnecessary extra layer; harder to debug; changes the plugin model |

## Decision

**Protocol:** Adopt MCP revision 2024-11-05, the revision already supported by the four stdio CLIs (newer revisions of the specification exist but are not required here). Implement only the required subset: `initialize`, `initialized`, `tools/list`, `tools/call`, `shutdown`. Prompts, resources, `notifications/progress`, and sampling are out of scope and deferred.

**Process topology:** Option B — per-step subprocess `awf mcp-serve`. One `awf mcp-serve` process is spawned per step where `mcp_proxy.enable: true`. The subprocess serves MCP over stdin/stdout. The parent `awf run` process spawns it via `ToolProxyService.Start()` and tears it down via `ToolProxyService.Close()`.

**Public package:** The MCP server implementation lives in `pkg/mcpserver/` (not `internal/`), with zero `internal/` imports enforced by a lint rule and an AST-based architecture test. This gives future external consumers (plugin SDK authors, other AWF tooling) a stable, embeddable MCP server.
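
An AST-based import check of the kind described here could look roughly like the sketch below. It is an assumption about the approach, not the repository's actual architecture test: `forbiddenImports` is a hypothetical helper, and a real test would walk every `.go` file under `pkg/mcpserver/` rather than take source text as a string.

```go
package main

import (
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// forbiddenImports parses only the import declarations of src and returns
// every import path that contains an "internal" path element.
func forbiddenImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}

	var bad []string
	for _, spec := range file.Imports {
		// Import paths are stored as quoted string literals.
		path, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		for _, elem := range strings.Split(path, "/") {
			if elem == "internal" {
				bad = append(bad, path)
				break
			}
		}
	}
	return bad, nil
}
```

Matching on a whole path element avoids false positives on packages whose names merely contain the word, such as a hypothetical `internalize` package.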

**OpenAI Compatible exception:** The HTTP provider cannot use stdio; instead, `ToolRouter` is invoked in-process and its tool definitions are injected as `tools[]` in the Chat Completions request. This is an explicit split: stdio providers use subprocess MCP, HTTP provider uses in-process `tools[]`.

**Key rules established:**

- `pkg/mcpserver` depends on Go stdlib only — no `internal/` imports, no framework deps.
- `ToolProvider` port in domain; `BuiltinToolProvider` + `PluginToolAdapter` in infrastructure; `ToolRouter` + `ToolProxyService` in application.
- Tool names follow `<plugin>_<op>` (snake_case, single underscore separator) to satisfy MCP client name constraints. Dots are forbidden (Claude rejects them).
- Collision detection is fatal at step startup (registration time), not at call time.
- Subprocess lifecycle uses goroutine + buffered channel + 5s SIGTERM→SIGKILL deadline, matching `RPCPluginManager.connectWithTimeout` exactly.
- `awf mcp-serve` is `Hidden: true` — not user-facing; no stability guarantees independent of AWF binary version.
- `USER.MCP_PROXY.*` validation codes extend the error taxonomy (exit code 1) with six new codes: `UNKNOWN_KEY`, `UNKNOWN_PLUGIN`, `UNKNOWN_OPERATION`, `NAME_COLLISION`, `EMPTY_PROXY`, `UNSUPPORTED_PROVIDER`.
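
The naming and collision rules above can be sketched as follows. This is illustrative only: `toolName`, `registry`, and the error message layout are assumptions, and only the `<plugin>_<op>` shape, the ban on dots, the `NAME_COLLISION` code, and failing at registration time come from this ADR.

```go
package main

import (
	"fmt"
	"strings"
)

// toolName builds "<plugin>_<op>" and rejects empty parts and dots.
func toolName(plugin, op string) (string, error) {
	for _, part := range []string{plugin, op} {
		if part == "" {
			return "", fmt.Errorf("tool name part must not be empty")
		}
		if strings.Contains(part, ".") {
			return "", fmt.Errorf("tool name part %q must not contain a dot", part)
		}
	}
	return plugin + "_" + op, nil
}

// registry records which provider owns each exposed tool name.
type registry struct {
	owners map[string]string
}

// newRegistry returns an empty registry ready for registration.
func newRegistry() *registry {
	return &registry{owners: map[string]string{}}
}

// register fails on the first duplicate, so a collision aborts step startup
// instead of surfacing later when the tool is called.
func (r *registry) register(name, owner string) error {
	if prev, ok := r.owners[name]; ok {
		return fmt.Errorf("USER.MCP_PROXY.NAME_COLLISION: tool %q offered by both %q and %q", name, prev, owner)
	}
	r.owners[name] = owner
	return nil
}
```

Because the separator is itself an underscore, two distinct plugin and operation pairs can flatten to the same tool name, which is one reason registration-time detection is useful.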

## Consequences

**What becomes easier:**

- Tool calls from all five agent providers are observable: each `tools/call` produces an OTel span and a structured zap log line.
- AWF gRPC plugins can expose operations to agents with no changes to the plugin manifest — `PluginToolAdapter` wraps the existing `ports.OperationProvider`.
- New tools can be added by implementing `ports.ToolProvider` without touching any agent provider code.
- External consumers can embed `pkg/mcpserver` to build custom MCP-enabled tooling.
- Subprocess crash isolation: a panic in `awf mcp-serve` is visible to the parent as a subprocess exit error but does not crash `awf run`.

**What becomes harder:**

- Each step with `mcp_proxy.enable: true` spawns an extra Go process (~10 MB RSS). At AWF's current scale this is acceptable; at high parallelism it requires monitoring.
- Codex and OpenCode have no `--tools ""` equivalent. Built-in tools cannot be disabled via flag injection; the proxy coexists with native tools and emits a startup `WARN` log. This is an accepted limitation documented in the YAML validation.
- MCP protocol version upgrades require coordinated changes to `pkg/mcpserver`, the hidden `mcp-serve` subcommand, and the per-provider config injection. The committed MCP version (2024-11-05) becomes the wire contract.
- `pkg/mcpserver` becoming public means adding new MCP methods (e.g., `notifications/progress`) is a semver-visible change.
- The OpenAI Compatible provider requires a separate in-process tools path (`tools[]` + `tool_choice` + multi-turn tool-call loop), maintained in parallel with the stdio subprocess path.

## Constitution Compliance

| Principle | Status | Justification |
|-----------|--------|---------------|
| Hexagonal Architecture | Compliant | Domain port `ports.ToolProvider`; application `ToolRouter`/`ToolProxyService`; infrastructure adapters; `pkg/mcpserver` has zero `internal/` imports; `.go-arch-lint.yml` extended with `pkg-mcpserver` and `infra-tools` components scoped appropriately |
| Go Idioms | Compliant | `context.Context` on all blocking ops; goroutine+buffered-channel+select for subprocess lifecycle; `errors.Is`/`fmt.Errorf` wrapping throughout |
| Minimal Abstraction | Compliant | No `ToolPolicy`/`ToolMiddleware`/`ToolCache` ports — decorator pattern is available if needed but not added prematurely; single function-value extension on `cliProviderHooks` (not a new interface) |
| Error Taxonomy | Compliant | Six new `USER.MCP_PROXY.*` codes extend the existing taxonomy; no new exit code category required (all are user/configuration errors, exit code 1) |
| Security First | Compliant | `Bash` tool delegates to `ShellExecutor` (existing shell-escaping, secret masking); subprocess uses SIGTERM→SIGKILL (no zombies); tmp config file written atomically (PID+timestamp suffix) |
| Test-Driven Development | Compliant | Table-driven unit tests per component; AST-based architecture tests for `pkg/mcpserver` import invariant; `make test-race` required on all application/infrastructure new code |
| Documentation Co-location | Compliant | `doc.go` per new package; YAML schema documented in `mcp_proxy.go` struct comments |
1 change: 1 addition & 0 deletions docs/README.md
@@ -35,6 +35,7 @@ Learn how to use AWF effectively:
- [Streaming Output Display & Tool Markers](user-guide/agent-steps.md#streaming-output-display--tool-markers) - Human-readable filtered output and tool-use markers for `--output streaming` and `--output buffered` modes
- [External Prompt Files](user-guide/agent-steps.md#external-prompt-files) - Load prompts from Markdown files with template interpolation
- [Model Validation](user-guide/agent-steps.md#model-validation) - Provider-specific model name validation (Claude, Gemini, Codex)
- [MCP Proxy](user-guide/agent-steps.md#mcp-proxy-tool-interception-and-control) - Tool call interception and observability via Model Context Protocol; expose plugin operations as MCP tools
- [Conversation Mode](user-guide/conversation-steps.md) - Multi-turn conversations with native session resume for CLI providers and context window management
- [Configuration](user-guide/configuration.md) - Project configuration file
- [Workflow Syntax](user-guide/workflow-syntax.md) - YAML workflow definition reference