OpenHarness delivers core lightweight agent infrastructure: tool-use, skills, memory, and multi-agent coordination.
Join the community: contribute Harness for open agent development.
One Command (oh) to Launch OpenHarness and Unlock All Agent Harnesses.
Supports CLI agent integration including OpenClaw, nanobot, Cursor, and more.
β’ Streaming Tool-Call Cycle β’ API Retry with Exponential Backoff β’ Parallel Tool Execution β’ Token Counting & Cost Tracking |
β’ 43 Tools (File, Shell, Search, Web, MCP) β’ On-Demand Skill Loading (.md) β’ Plugin Ecosystem (Skills + Hooks + Agents) β’ Compatible with anthropics/skills & plugins |
β’ CLAUDE.md Discovery & Injection β’ Context Compression (Auto-Compact) β’ MEMORY.md Persistent Memory β’ Session Resume & History |
β’ Multi-Level Permission Modes β’ Path-Level & Command Rules β’ PreToolUse / PostToolUse Hooks β’ Interactive Approval Dialogs |
β’ Subagent Spawning & Delegation β’ Team Registry & Task Management β’ Background Task Lifecycle β’ ClawTeam Integration (Roadmap) |
| Claude Code | OpenHarness | |
|---|---|---|
| Lines of Code | 512,664 | 11,733 (44x lighter) |
| Files | 1,884 | 163 |
| Language | TypeScript | Python |
| Tools | ~44 | 43 (98%) |
| Commands | ~88 | 54 (61%) |
| Skills Compatible | β | β anthropics/skills |
| Plugin Compatible | β | β claude-code/plugins |
| Tests | β | 114 unit + 6 E2E suites |
Leverages Python's power with pure focus on Harness architectureβstripped of enterprise overhead like telemetry, OAuth complexity, and hundreds of React components.
An Agent Harness is the complete infrastructure that wraps around an LLM to make it a functional agent. The model provides intelligence; the harness provides hands, eyes, memory, and safety boundaries.
OpenHarness is an open-source Python implementation designed for researchers, builders, and the community:
- Understand how production AI agents work under the hood
- Experiment with cutting-edge tools, skills, and agent coordination patterns
- Extend the harness with custom plugins, providers, and domain knowledge
- Build specialized agents on top of proven architecture
- 2026-04-01 π¨ v0.1.0 β Initial OpenHarness open-source release featuring complete Harness architecture:
Start here: Quick Start Β· Provider Compatibility Β· Showcase Β· Contributing Β· Changelog
- Python 3.10+ and uv
- Node.js 18+ (optional, for the React terminal UI)
- An LLM API key
ANTHROPIC_API_KEY=your_key uv run oh -p "Inspect this repository and list the top 3 refactors"# Clone and install
git clone https://github.com/HKUDS/OpenHarness.git
cd OpenHarness
uv sync --extra dev
# Example: use Kimi as the backend
export ANTHROPIC_BASE_URL=https://api.moonshot.cn/anthropic
export ANTHROPIC_API_KEY=your_kimi_api_key
export ANTHROPIC_MODEL=kimi-k2.5
# Launch
oh # if venv is activated
uv run oh # without activating venv# Single prompt β stdout
oh -p "Explain this codebase"
# JSON output for programmatic use
oh -p "List all functions in main.py" --output-format json
# Stream JSON events in real-time
oh -p "Fix the bug" --output-format stream-jsonOpenHarness currently detects and adapts to a small set of provider profiles in code. The table below is intentionally conservative and reflects the profiles implemented in src/openharness/api/provider.py.
| Provider profile | Detection signal | Auth kind | Voice mode | Notes |
|---|---|---|---|---|
| Anthropic | Default when no custom ANTHROPIC_BASE_URL is set |
API key | Not wired in current build | Default Claude-oriented setup |
| Moonshot / Kimi | ANTHROPIC_BASE_URL contains moonshot or model starts with kimi |
API key | Not wired in current build | Works through an Anthropic-compatible endpoint |
| Vertex-compatible | Base URL contains vertex or aiplatform |
GCP | Not wired in current build | Good fit for Anthropic-style gateways on Vertex |
| Bedrock-compatible | Base URL contains bedrock |
AWS | Not wired in current build | Intended for Bedrock-style deployments |
| Generic Anthropic-compatible | Any other explicit ANTHROPIC_BASE_URL |
API key | Not wired in current build | Useful for proxies and internal gateways |
If you are evaluating cross-provider workflows or want a concrete demo path, start with Anthropic or the Kimi example above, then compare behavior against your own compatible endpoint.
OpenHarness implements the core Agent Harness pattern with 10 subsystems:
openharness/
engine/ # π§ Agent Loop β query β stream β tool-call β loop
tools/ # π§ 43 Tools β file I/O, shell, search, web, MCP
skills/ # π Knowledge β on-demand skill loading (.md files)
plugins/ # π Extensions β commands, hooks, agents, MCP servers
permissions/ # π‘οΈ Safety β multi-level modes, path rules, command deny
hooks/ # β‘ Lifecycle β PreToolUse/PostToolUse event hooks
commands/ # π¬ 54 Commands β /help, /commit, /plan, /resume, ...
mcp/ # π MCP β Model Context Protocol client
memory/ # π§ Memory β persistent cross-session knowledge
tasks/ # π Tasks β background task management
coordinator/ # π€ Multi-Agent β subagent spawning, team coordination
prompts/ # π Context β system prompt assembly, CLAUDE.md, skills
config/ # βοΈ Settings β multi-layer config, migrations
ui/ # π₯οΈ React TUI β backend protocol + frontend
The heart of the harness. One loop, endlessly composable:
while True:
response = await api.stream(messages, tools)
if response.stop_reason != "tool_use":
break # Model is done
for tool_call in response.tool_uses:
# Permission check β Hook β Execute β Hook β Result
result = await harness.execute_tool(tool_call)
messages.append(tool_results)
# Loop continues β model sees results, decides next actionThe model decides what to do. The harness handles how β safely, efficiently, with full observability.
flowchart LR
U[User Prompt] --> C[CLI or React TUI]
C --> R[RuntimeBundle]
R --> Q[QueryEngine]
Q --> A[Anthropic-compatible API Client]
A -->|tool_use| T[Tool Registry]
T --> P[Permissions + Hooks]
P --> X[Files Shell Web MCP Tasks]
X --> Q
| Category | Tools | Description |
|---|---|---|
| File I/O | Bash, Read, Write, Edit, Glob, Grep | Core file operations with permission checks |
| Search | WebFetch, WebSearch, ToolSearch, LSP | Web and code search capabilities |
| Notebook | NotebookEdit | Jupyter notebook cell editing |
| Agent | Agent, SendMessage, TeamCreate/Delete | Subagent spawning and coordination |
| Task | TaskCreate/Get/List/Update/Stop/Output | Background task management |
| MCP | MCPTool, ListMcpResources, ReadMcpResource | Model Context Protocol integration |
| Mode | EnterPlanMode, ExitPlanMode, Worktree | Workflow mode switching |
| Schedule | CronCreate/List/Delete, RemoteTrigger | Scheduled and remote execution |
| Meta | Skill, Config, Brief, Sleep, AskUser | Knowledge loading, configuration, interaction |
Every tool has:
- Pydantic input validation β structured, type-safe inputs
- Self-describing JSON Schema β models understand tools automatically
- Permission integration β checked before every execution
- Hook support β PreToolUse/PostToolUse lifecycle events
Skills are on-demand knowledge β loaded only when the model needs them:
Available Skills:
- commit: Create clean, well-structured git commits
- review: Review code for bugs, security issues, and quality
- debug: Diagnose and fix bugs systematically
- plan: Design an implementation plan before coding
- test: Write and run tests for code
- simplify: Refactor code to be simpler and more maintainable
- pdf: PDF processing with pypdf (from anthropics/skills)
- xlsx: Excel operations (from anthropics/skills)
- ... 40+ more
Compatible with anthropics/skills β just copy .md files to ~/.openharness/skills/.
Compatible with claude-code plugins. Tested with 12 official plugins:
| Plugin | Type | What it does |
|---|---|---|
commit-commands |
Commands | Git commit, push, PR workflows |
security-guidance |
Hooks | Security warnings on file edits |
hookify |
Commands + Agents | Create custom behavior hooks |
feature-dev |
Commands | Feature development workflow |
code-review |
Agents | Multi-agent PR review |
pr-review-toolkit |
Agents | Specialized PR review agents |
# Manage plugins
oh plugin list
oh plugin install <source>
oh plugin enable <name>OpenHarness is useful as a lightweight harness layer around Claude-style tooling conventions:
- OpenClaw-oriented workflows can reuse Markdown-first knowledge and command-driven collaboration patterns.
- Claude-style plugins and skills stay portable because OpenHarness keeps those formats familiar.
- ClawTeam-style multi-agent work maps well onto the built-in team, task, and background execution primitives.
For concrete usage ideas instead of generic claims, see docs/SHOWCASE.md.
Multi-level safety with fine-grained control:
| Mode | Behavior | Use Case |
|---|---|---|
| Default | Ask before write/execute | Daily development |
| Auto | Allow everything | Sandboxed environments |
| Plan Mode | Block all writes | Large refactors, review first |
Path-level rules in settings.json:
{
"permission": {
"mode": "default",
"path_rules": [{"pattern": "/etc/*", "allow": false}],
"denied_commands": ["rm -rf /", "DROP TABLE *"]
}
}React/Ink TUI with full interactive experience:
- Command picker: Type
/β arrow keys to select β Enter - Permission dialog: Interactive y/n with tool details
- Mode switcher:
/permissionsβ select from list - Session resume:
/resumeβ pick from history - Animated spinner: Real-time feedback during tool execution
- Keyboard shortcuts: Shown at the bottom, context-aware
oh [OPTIONS] COMMAND [ARGS]
Session: -c/--continue, -r/--resume, -n/--name
Model: -m/--model, --effort, --max-turns
Output: -p/--print, --output-format text|json|stream-json
Permissions: --permission-mode, --dangerously-skip-permissions
Context: -s/--system-prompt, --append-system-prompt, --settings
Advanced: -d/--debug, --mcp-config, --bare
Subcommands: oh mcp | oh plugin | oh auth
| Suite | Tests | Status |
|---|---|---|
| Unit + Integration | 114 | β All passing |
| CLI Flags E2E | 6 | β Real model calls |
| Harness Features E2E | 9 | β Retry, skills, parallel, permissions |
| React TUI E2E | 3 | β Welcome, conversation, status |
| TUI Interactions E2E | 4 | β Commands, permissions, shortcuts |
| Real Skills + Plugins | 12 | β anthropics/skills + claude-code/plugins |
# Run all tests
uv run pytest -q # 114 unit/integration
python scripts/test_harness_features.py # Harness E2E
python scripts/test_real_skills_plugins.py # Real plugins E2Efrom pydantic import BaseModel, Field
from openharness.tools.base import BaseTool, ToolExecutionContext, ToolResult
class MyToolInput(BaseModel):
query: str = Field(description="Search query")
class MyTool(BaseTool):
name = "my_tool"
description = "Does something useful"
input_model = MyToolInput
async def execute(self, arguments: MyToolInput, context: ToolExecutionContext) -> ToolResult:
return ToolResult(output=f"Result for: {arguments.query}")Create ~/.openharness/skills/my-skill.md:
---
name: my-skill
description: Expert guidance for my specific domain
---
# My Skill
## When to use
Use when the user asks about [your domain].
## Workflow
1. Step one
2. Step two
...Create .openharness/plugins/my-plugin/.claude-plugin/plugin.json:
{
"name": "my-plugin",
"version": "1.0.0",
"description": "My custom plugin"
}Add commands in commands/*.md, hooks in hooks/hooks.json, agents in agents/*.md.
OpenHarness is most useful when treated as a small, inspectable harness you can adapt to a real workflow:
- Repo coding assistant for reading code, patching files, and running checks locally.
- Headless scripting tool for
jsonandstream-jsonoutput in automation flows. - Plugin and skill testbed for experimenting with Claude-style extensions.
- Multi-agent prototype harness for task delegation and background execution.
- Provider comparison sandbox across Anthropic-compatible backends.
See docs/SHOWCASE.md for short, reproducible examples.
OpenHarness is a community-driven research project. We welcome contributions in:
| Area | Examples |
|---|---|
| Tools | New tool implementations for specific domains |
| Skills | Domain knowledge .md files (finance, science, DevOps...) |
| Plugins | Workflow plugins with commands, hooks, agents |
| Providers | Support for more LLM backends (OpenAI, Ollama, etc.) |
| Multi-Agent | Coordination protocols, team patterns |
| Testing | E2E scenarios, edge cases, benchmarks |
| Documentation | Architecture guides, tutorials, translations |
# Development setup
git clone https://github.com/HKUDS/OpenHarness.git
cd OpenHarness
uv sync --extra dev
uv run pytest -q # Verify everything worksUseful contributor entry points:
CONTRIBUTING.mdfor setup, checks, and PR expectationsCHANGELOG.mdfor user-visible changesdocs/SHOWCASE.mdfor real-world usage patterns worth documenting
MIT β see LICENSE.
Oh my Harness!
The model is the agent. The code is the harness.









