Skip to content

refactor: vendor upstream tools and anthropic provider, replace stub loop#5

Merged
Redux0223 merged 27 commits intomainfrom
feat/vendor-upstream-tools
Mar 30, 2026
Merged

refactor: vendor upstream tools and anthropic provider, replace stub loop#5
Redux0223 merged 27 commits intomainfrom
feat/vendor-upstream-tools

Conversation

@Redux0223
Copy link
Copy Markdown
Collaborator

@Redux0223 Redux0223 commented Mar 30, 2026

Summary

  • Vendor file tools (read, write, edit) from pi-mono with Zod schemas, fuzzy edit matching, and pluggable Operations interfaces
  • Vendor exec tool (renamed from pi-mono bash) with background/yield and process management
  • Vendor Anthropic streaming provider with extended thinking support (adaptive for Opus 4.6/Sonnet 4.6)
  • Vendor agent loop with parallel tool dispatch and abort signal propagation
  • Vendor web tools with SSRF protection (DNS rebinding guard, private IP rejection)
  • Vendor browser tool schema (host-mode Playwright stub, 16 actions)
  • Replace PR3 stub echo loop with real Anthropic Messages API (when API key explicitly set)
  • Preserve stub fallback for backwards compatibility (no API key = keyword matching + "Acknowledged")
  • Real token usage tracking from API (replaces chars/4 estimate)
  • Gateway-coupled tools remain as hosted-tool protocol (unchanged)
  • All vendored code is MIT licensed

Source Provenance

Source SHA License Files
badlogic/pi-mono cb4e4d8c MIT tools, agent loop, anthropic provider
openclaw/openclaw edb5123f MIT web tools, SSRF, browser, exec extensions

New Dependencies

Package Version License Purpose
@anthropic-ai/sdk ^0.80.0 MIT Anthropic Messages API client
diff ^7.0.0 BSD-3 Unified diff for edit tool
partial-json ^0.1.7 MIT Streaming JSON parse for tool call args

Test Results

  • 43 tests, all passing (0 failures)
  • 19 new unit tests (tools, providers, loop, SSRF)
  • All 8 original integration/contract tests preserved and passing

Test Plan

  • pnpm run check — typecheck passes (0 errors)
  • pnpm run build — build succeeds
  • pnpm vitest run — all 43 tests pass
  • Manual: set anthropicApiKey in SDK options and run a session that uses read/write/edit/exec tools

Redux0223 and others added 25 commits March 30, 2026 15:59
Add @anthropic-ai/sdk ^0.80.0, diff ^7.0.0, partial-json ^0.1.7.
These are needed by vendored Anthropic provider and tools.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mirrors existing sync-from-openclaw.mjs pattern.
Manifest tracks 21 files from badlogic/pi-mono @ cb4e4d8c.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Raw TypeScript copies from pi-mono. These files will NOT compile
(broken imports to @mariozechner/*, @sinclair/typebox, pi-tui).
Subsequent commits adapt imports and schemas.

Source: https://github.com/badlogic/pi-mono @ cb4e4d8c
License: MIT (Mario Zechner)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
OpenClawTool interface with Zod schemas. toAnthropicToolDef() converts
to Anthropic API format. textResult/jsonResult/imageResult helpers
for tool return values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Subset of pi-ai types for Anthropic-only use. Removed @sinclair/typebox
dependency (replaced TSchema with any). Simplified env-api-keys to
anthropic-only. Event stream, JSON parse, surrogate sanitize unchanged.

Source: pi-mono @ cb4e4d8c (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Streaming Messages API with extended thinking (adaptive Opus 4.6/
Sonnet 4.6, budget-based older models), cache control, streaming
JSON tool call parsing.

Removed: stealth mode, OAuth/Claude Code identity, GitHub Copilot.

Source: pi-mono @ cb4e4d8c (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AgentTool, AgentToolResult, AgentEvent, AgentLoopConfig interfaces.
Removed pi-ai model registry dependency, replaced TSchema with any.

Source: pi-mono @ cb4e4d8c (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agentic tool-dispatch loop with parallel/sequential execution,
beforeToolCall/afterToolCall hooks, steering/follow-up messages,
abort signal propagation.

Source: pi-mono @ cb4e4d8c (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
truncate (2000 lines/50KB), path-utils (macOS NFD), file-mutation-queue,
shell (simplified, SHELL env var), child-process, mime detection
(inline magic bytes, no file-type dep).
Stripped pi-mono config imports.

Source: pi-mono @ cb4e4d8c (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TypeBox→Zod, stripped TUI. Preserved: ReadOperations interface,
image MIME detection, offset/limit paging, truncation.

Source: pi-mono @ cb4e4d8c (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TypeBox→Zod, stripped TUI. Preserved: WriteOperations interface,
auto-mkdir, withFileMutationQueue serialization.

Source: pi-mono @ cb4e4d8c (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TypeBox→Zod, stripped TUI. Preserved: EditOperations, fuzzy matching
(Unicode NFKC, smart quotes, trailing whitespace), uniqueness check,
unified diff output via 'diff' package.

Source: pi-mono @ cb4e4d8c (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix TruncationResult field names (truncated, outputLines) in read
and exec tools. Add diff.d.ts type declaration. Add @types/diff dev dep.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Simplified from openclaw bash-process-registry. In-memory session
tracking with add/get/drain/markExited/delete operations.
No scope keys, no sweeper, no supervisor.

Source: openclaw @ edb5123f (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TypeBox→Zod, stripped TUI, renamed bash→exec. Added background/yield
support from openclaw. Preserved: ExecOperations interface, streaming
output buffer, tail truncation.

Source: pi-mono @ cb4e4d8c + openclaw @ edb5123f (both MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actions: list, poll, log, write, kill, remove.
Shares process registry with exec tool.

Source: openclaw @ edb5123f (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
DNS rebinding protection, private IP rejection (RFC 1918, loopback,
link-local, IPv4-in-IPv6), hostname blocking (localhost, *.local,
*.internal, metadata.google.internal). Fail-closed on parse errors.

Source: openclaw @ edb5123f (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SSRF-guarded HTTP fetch with HTML→text extraction.
Simplified from openclaw (removed config chain, secrets, Firecrawl).

Source: openclaw @ edb5123f (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Single-provider search. Returns null if BRAVE_SEARCH_API_KEY not set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Flat object schema (not union) for LLM compatibility.
16 actions. TypeBox→Zod.

Source: openclaw @ edb5123f (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Playwright-based browser automation stub. Removed sandbox/node modes
and gateway dependency. Host-only execution. Full implementation
pending browser lifecycle management.

Source: openclaw @ edb5123f (MIT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
assembleLocalTools() creates all local SDK tools.
Added anthropicApiKey to OpenClawAgentSdkOptions and
OpenClawSessionParams.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BREAKING: replaces stub echo with real Anthropic Messages API.
Session now: calls Claude, dispatches local tool calls
(read/write/edit/exec/process/web_fetch/web_search),
suspends for hosted tool calls (existing protocol preserved).

Falls back to stub behavior when no ANTHROPIC_API_KEY is set.
Usage tracking now uses real API token counts.

WARNING: existing integration tests will fail until C23 updates them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
API key from env var is handled by the Anthropic provider, not
the session layer. Session falls back to stub behavior when no
explicit anthropicApiKey is set in params or SDK options.
This preserves all existing integration test behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update provenance for all vendored files from pi-mono @ cb4e4d8c.
Each entry now has specific adaptation notes instead of "pending".
MIT licensed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@jessicairwinwa jessicairwinwa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — PR #5: Vendor upstream tools and Anthropic provider

Nice work, Redux! Thorough PR. I've gone through all 47 files. Here's my assessment:

Strengths

  1. SSRF protection (src/tools/web/ssrf.ts) — Excellent implementation. Fail-closed by default, covers DNS rebinding, IPv4-mapped IPv6, link-local, carrier-grade NAT, and cloud metadata endpoints. Test coverage is solid.

  2. Provenance trackingmanifests/pi-mono-provenance.json with per-file upstream SHA references and adaptation notes is great practice for vendored code.

  3. Backward compatibility — The stub fallback when no anthropicApiKey is set preserves all existing behavior. Clean opt-in pattern.

  4. Tool interface designOpenClawTool with Zod schemas + toAnthropicToolDef converter is clean. The result helpers reduce boilerplate.

  5. Test coverage — 19 new unit tests (read, write, edit, exec, SSRF, tool interface, provider utils). All 43 passing.

  6. Exec tool — Process registry with background/yield, process tree killing, shell env detection — much more robust than a simple exec.

Issues / Questions

1. Two separate agentic loops
sdk-session.ts has its own runAgenticLoop (simple, non-streaming, 50-turn max) while src/loop/agent-loop.ts is the vendored loop (streaming, parallel tool dispatch, abort signals). Are both needed? The vendored one is more capable.

2. Hardcoded dev path in sync script
scripts/sync-from-pi-mono.mjs has /Users/apple/programme/funny_projects/pi-mono hardcoded. Should accept an env var or CLI arg.

3. require.resolve("playwright") in ESM-only project
src/tools/browser/browser.ts uses require.resolve which doesn't exist in pure ESM. Need createRequire(import.meta.url).resolve or a dynamic import try/catch.

4. Duplicate diff types
Both @types/diff in devDependencies AND custom src/types/diff.d.ts. The @types/diff package is deprecated (diff 7+ ships its own types). Pick one approach.

5. Tool coverage — 8 of 21
Assembly includes read, write, edit, exec, process, web_fetch, web_search, browser(stub). Remaining 13 are absent. Gateway-coupled tools make sense as hosted protocol, but glob and grep are pure local tools that could be vendored here. Intentional deferral?

6. Missing signal propagation in sdk-session loop
The runAgenticLoop in sdk-session.ts doesn't propagate AbortSignal to tool execution or API calls. The vendored loop handles this correctly.

Relationship with PR #4

My PR #4 implemented all 21 tool schemas under src/core/tools/builtin/. This PR vendors production implementations from pi-mono — fundamentally different approach. These will conflict structurally.

Recommendation: If this is the direction, PR #4 should be closed or rebased to only contribute gateway-coupled tool schemas not covered here.

Verdict

Solid foundation. The vendored implementations are substantially more robust than stub schemas. Main concerns are the dual-loop architecture and ESM compatibility in browser.ts.

- Remove deprecated @types/diff, keep custom src/types/diff.d.ts
- Replace hardcoded dev path in sync-from-pi-mono.mjs with
  PI_MONO_ROOT env var / CLI arg
- Replace require.resolve("playwright") with dynamic import()
  for ESM compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the 120-line non-streaming runAgenticLoop with the vendored
agent-loop.ts via adapter layer:

- agent-event-adapter.ts: translates AgentEvent → OpenClawStreamEvent
  (text_delta→assistant_delta, thinking→reasoning, tool events)
- hosted-tool-bridge.ts: blocks vendored loop on hosted tool calls,
  resumes when host provides result via submitHostedToolResult
- model-from-ref.ts: builds Model object from modelRef string

Now uses streaming API, parallel tool dispatch, AbortSignal propagation
(requestStop() → controller.abort()), and beforeToolCall/afterToolCall
hooks from the vendored loop. Removes direct @anthropic-ai/sdk import
from session — all API calls go through the vendored provider.

Stub fallback (no API key) preserved for backwards compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator Author

@Redux0223 Redux0223 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Response to Review

All 6 points addressed in commits 847817e and b0eac76. Here's the breakdown:

1. Two separate agentic loops → Fixed

Replaced the inline runAgenticLoop (120-line non-streaming, sequential-only, no signal propagation) with the vendored src/loop/agent-loop.ts via an adapter layer:

  • agent-event-adapter.ts — translates AgentEventOpenClawStreamEvent (text_delta→assistant_delta, thinking→reasoning, tool events)
  • hosted-tool-bridge.ts — implements hosted tool suspension by blocking the vendored loop's tool.execute() Promise until the host calls submitHostedToolResult
  • model-from-ref.ts — builds Model object from modelRef string

The session no longer imports @anthropic-ai/sdk directly — all API calls go through the vendored streaming provider.

2. Hardcoded dev path → Fixed (847817e)

scripts/sync-from-pi-mono.mjs now uses process.env.PI_MONO_ROOT || process.argv[2] with usage instructions on missing arg.

3. require.resolve("playwright") in ESM → Fixed (847817e)

Replaced with dynamic import("playwright" as string) in the execute() path. The tool is always registered (so its schema is discoverable); Playwright availability is checked at execution time.

4. Duplicate diff types → Fixed (847817e)

Removed @types/diff (deprecated stub). Kept src/types/diff.d.ts as the authoritative declaration.

5. Tool coverage (8 of 21) → Intentional deferral

The plan scope covered the 8 tools listed in the upstream tool name table. glob and grep are good candidates for a follow-up PR — they're pure local tools with no gateway dependency.

6. Missing signal propagation → Fixed (part of point 1)

The vendored loop propagates AbortSignal to API calls, tool execution, beforeToolCall/afterToolCall hooks, and context transforms. requestStop() now calls controller.abort() which flows through the entire chain.

Additional: Missing git tags → Added

Created all 10 planned rollback tags (vendor/raw-sync through vendor/complete) at their correct commit SHAs.

Regarding PR #4

Agreed with the recommendation. This PR vendors production implementations that supersede the stub schemas in PR #4. PR #4 should be closed or rebased to only contribute gateway-coupled tool schemas not covered here.

Verification

All gates pass:

  • pnpm run check — 0 errors
  • pnpm run build — success
  • pnpm vitest run — 43/43 tests pass
  • Import cleanliness: 0 @mariozechner, 0 @sinclair/typebox, 0 pi-tui in compiled source

Copy link
Copy Markdown

@jessicairwinwa jessicairwinwa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up Review — LGTM ✅

All 6 issues addressed. Reviewed the two new commits (847817e, b0eac76).

Verified Fixes

  1. Dual loops → Single vendored loop ✅ — Clean adapter layer (agent-event-adapter.ts, hosted-tool-bridge.ts, model-from-ref.ts). The inline runAgenticLoop is gone; sdk-session.ts no longer imports @anthropic-ai/sdk directly. Architecture is much cleaner now.

  2. Hardcoded path → env/argv ✅ — PI_MONO_ROOT via process.env || process.argv[2] with clear usage error.

  3. require.resolve → dynamic import ✅ — await import("playwright" as string) at execution time. Tool schema always registered. Good.

  4. @types/diff removed ✅ — Clean.

  5. glob/grep deferred — Acknowledged. Follow-up PR makes sense.

  6. AbortSignal propagation ✅ — requestStop()controller.abort() → flows through vendored loop.

One observation (non-blocking)

The hosted tool suspension uses break from for await, which calls eventStream.return() and terminates the agent loop generator. This means the tool result provided via submitHostedToolResult resolves the bridge promise but the loop is already dead — the result flows through the SDK's higher-level protocol on the next submit(). This is consistent with the old inline loop behavior (which also returned early), so not a regression. Just worth documenting for future contributors.

Verdict

Ship it. The vendored loop + adapter layer is a significant architectural improvement. 43/43 tests, CI green.

@Redux0223 Redux0223 merged commit 297552a into main Mar 30, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants