NanoDeer

A Reference Implementation for Agent Runtime Engineering

ReAct Loop · Tool Use · Sandbox · Checkpoint · Memory · SSE Streaming

Exploring the runtime behind LLM agents.

English | 中文

NanoDeer is an open-source reference implementation for agent runtime engineering — not another product competing with Claude Code or Cursor, but a walkthrough that takes the agent runtime apart and shows how each core pattern is implemented.

At its core is a straightforward ReAct loop — no middleware chain, no graph DSL, no framework lock-in. Each core module demonstrates exactly one pattern. Non-core modules (subagent, plan, skills, wiki, layers) are kept as extensions — the code stays on disk but is not in the default load path.

Background

At the end of last year I started working on agent-related projects. My understanding at the time was rough: an agent was just AI doing things for you. In early March my mentor mentioned that "harness engineering is getting popular lately, maybe look into it." So I started searching for materials and picked up Claude Code along the way.

By late March, DeerFlow came onto my radar. ByteDance's open-source project showed me for the first time what a proper enterprise-grade Agent harness framework should look like — state machine, middleware chain, sandbox isolation, tiered memory, every piece in its right place.

The story might have ended there. But on the last evening of March, I attended ByteDance's campus recruiting talk. One thing that stuck with me was their motto — "Work with great people on challenging things." During the talk, a message flashed across my phone screen — Claude Code went open source. Something clicked in that moment. DeerFlow showed me what a framework should look like. Claude Code showed me what a product could feel like. With OpenClaw trending in China, everything suddenly connected. That night, back in my dorm, I wrote down the first draft.

The core idea: distill the patterns that work — native ReAct loop, Docker sandbox isolation, tiered memory, inline orchestration — into a focused, auditable foundation where every module has one job and concerns are handled inline.

Design Philosophy

1. No middleware chain

Most agent frameworks route cross-cutting concerns through pre/post hooks. NanoDeer does none of that. Every concern is handled inline in the ReAct loop or by a standalone manager:

Concern	Implementation
Context loading	`ContextManager.load()` — parallel I/O
Sandbox lifecycle	`SandboxManager.acquire()/release()` — idempotent
Bash audit	`_bash_safe()` — inline regex, blocks dangerous patterns
LLM retry	`_call_with_retry()` — exponential backoff for 429/5xx/timeout
Clarification	`_check_clarification()` — checks `[CLARIFICATION]` tag
Convergence guard	Repeated identical tool calls capped, max turn limit

This means you can read the entire execution path in react.py and understand control flow without learning a graph DSL.

2. Core + Extension split

The project is explicitly divided into two layers:

Core (always loaded): ReAct loop, 8 tools, checkpoint, flat-file memory, sandbox, SSE API
Extension (on disk, not default): subagent, plan, skills, wiki, memory layers, 12 additional tools

This was a deliberate cleanup from an earlier version that loaded everything by default. Keeping extension code on disk means exploration work isn't wasted — it's just not in the critical path. Users who need those patterns can activate them manually.

3. Why only bash goes through the sandbox

File tools (read/write/edit) run on the host because workspace directories are volume-mounted into the container — both host and container see the same files. Only bash needs to execute inside the container to isolate arbitrary command execution. This simplifies the sandbox wrapper from 263 lines (with per-tool templates, base64 encoding, and virtual path translation) to 40 lines.
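A rough sketch of what a bash-only wrapper can look like, assuming the container is already running and that the workspace is mounted at `/workspace` inside it. The function names, the `RunResult` fields, and the mount path are all hypothetical. The only external call is the standard `docker exec` CLI, and the runner is injectable so the sketch can be exercised without Docker.

```python
import subprocess
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class RunResult:
    # Field names are assumed for this sketch.
    exit_code: int
    stdout: str
    stderr: str


def docker_exec_argv(container: str, command: str, workdir: str = "/workspace") -> list[str]:
    """Build the argv that runs `command` inside an already-running container."""
    return ["docker", "exec", "-w", workdir, container, "bash", "-lc", command]


def make_sandboxed_bash(container: str, run: Callable[..., Any] = subprocess.run,
                        timeout: float = 60.0) -> Callable[[str], RunResult]:
    """Return a bash tool whose commands execute in the container, not on the host."""
    def bash(command: str) -> RunResult:
        proc = run(docker_exec_argv(container, command),
                   capture_output=True, text=True, timeout=timeout)
        return RunResult(proc.returncode, proc.stdout, proc.stderr)
    return bash


def wrap_for_sandbox(tools: dict[str, Callable[..., Any]], container: str,
                     run: Callable[..., Any] = subprocess.run) -> dict[str, Callable[..., Any]]:
    """Copy `tools`, redirecting only `bash`; file tools keep running on the host."""
    wrapped = dict(tools)
    if "bash" in wrapped:
        wrapped["bash"] = make_sandboxed_bash(container, run)
    return wrapped
```

Because the file tools are passed through untouched, there is no path translation or payload encoding to maintain, which is where most of the earlier wrapper's length went.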

4. Why flat-file memory

The original L1-L4 layered memory model (episodic → semantic → wiki → user) was a beautiful concept but added complexity disproportionate to its practical value. The simplified version uses two flat files: USER.md for preferences and MEMORY.md for facts. The layered model remains as an extension for those who want it.
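The two-file model is small enough to sketch in full. The class below is an illustration under assumptions, not the project's `MemoryStore`: the class name, the `save`/`search` method names, the bullet-per-line format, and the substring search are all made up here. Only the two file names come from the text above.

```python
from pathlib import Path


class FlatMemory:
    """Two append-only Markdown files: USER.md for preferences, MEMORY.md for facts."""

    def __init__(self, root: Path):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.files = {
            "user": self.root / "USER.md",
            "memory": self.root / "MEMORY.md",
        }

    def save(self, kind: str, text: str) -> None:
        """Append one entry as a Markdown bullet. `kind` is 'user' or 'memory'."""
        with self.files[kind].open("a", encoding="utf-8") as fh:
            fh.write(f"- {text.strip()}\n")

    def search(self, query: str) -> list[str]:
        """Case-insensitive substring match across both files."""
        needle = query.lower()
        hits = []
        for path in self.files.values():
            if not path.exists():
                continue
            for line in path.read_text(encoding="utf-8").splitlines():
                if needle in line.lower():
                    hits.append(f"{path.name}: {line.removeprefix('- ')}")
        return hits
```

Everything the agent remembers stays readable and editable with a text editor, which is the main practical argument for the flat layout.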

5. Why ToolManager was replaced with a dict

The original ToolManager + groups.py system handled progressive tool exposure (core tools first, request more via request_tools() meta-tool). This solved a problem that doesn't exist for modern LLMs — they handle 20 tools just fine. The dict lookup is simpler, has zero dependencies, and is immediately readable.
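For comparison, a dict-based registry fits in a few lines. This is a sketch under assumptions: the two stand-in tools, the `dispatch` helper, and the error strings are hypothetical, and the real `default_tools()` may return a different shape.

```python
import re
from pathlib import Path
from typing import Any, Callable

Tool = Callable[..., Any]


def read_file(path: str) -> str:
    """Stand-in core tool."""
    return Path(path).read_text(encoding="utf-8")


def grep(pattern: str, text: str) -> list[str]:
    """Stand-in extension tool: lines of `text` that match `pattern`."""
    return [line for line in text.splitlines() if re.search(pattern, line)]


def default_tools() -> dict[str, Tool]:
    """Core tools only; extension tools are added by the caller."""
    return {"read_file": read_file}


def dispatch(tools: dict[str, Tool], name: str, arguments: dict[str, Any]) -> Any:
    """Look the tool up by name and call it, returning errors as text for the LLM."""
    tool = tools.get(name)
    if tool is None:
        return f"error: unknown tool {name!r}"
    try:
        return tool(**arguments)
    except Exception as exc:  # surfaced to the model instead of crashing the loop
        return f"error: {type(exc).__name__}: {exc}"
```

Activating an extension tool is then a single assignment, for example `tools["grep"] = grep`, with no registry class or group configuration involved.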

6. Why factory was merged into engine

NanoDeerFactory was a thin assembly layer that forwarded parameters from NanoEngine to ReActExecutor. One fewer indirection layer means less code to read when tracing how the executor is built.

7. Reference implementation, not product

This is the most important decision. NanoDeer does not compete with Claude Code, Cursor, Aider, or Continue. It exists to be read — to show how agent runtimes work, to be forked and modified, to serve as teaching material. The value is in the clarity of the code and the reasoning behind each design choice, not in the feature count.

Core Architecture

                      ┌──────────────────────────────┐
                      │      CLI / API / SSE         │
                      │  cli/api.py · cli/repl.py    │
                      └──────────┬───────────────────┘
                                 │
                      ┌──────────▼───────────────────┐
                      │  NanoEngine (engine.py)      │
                      │  — LLM provider routing      │
                      │  — Thread state create/resume│
                      │  — Inline executor assembly  │
                      └──────────┬───────────────────┘
                                 │
                      ┌──────────▼───────────────────┐
                      │  ReActExecutor (react.py)    │
                      │  1. ContextManager.load()    │
                      │  2. SandboxManager.acquire() │
                      │  3. LLM.ainvoke() + retry    │
                      │  4. Clarification check      │
                      │  5. Tool loop (bash audit)   │
                      │  6. Checkpoint.save()        │
                      └──────────────────────────────┘
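The executor steps in the diagram can be approximated by a short skeleton. The sketch below is heavily simplified and assumes its own types: `Reply`, `TransientLLMError`, and the tuple-based message format are hypothetical, context loading and sandbox acquisition are omitted, and the checkpoint step is only marked with a comment. It is meant to show the shape of the loop and the retry, not to reproduce react.py.

```python
import asyncio
import random
from dataclasses import dataclass, field


class TransientLLMError(Exception):
    """Stand-in for 429, 5xx, and timeout failures."""


@dataclass
class Reply:
    text: str
    tool_calls: list = field(default_factory=list)  # list of (name, arguments)


async def call_with_retry(call, *, attempts=4, base_delay=0.5, sleep=asyncio.sleep):
    """Retry `call` on transient errors, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return await call()
        except TransientLLMError:
            if attempt == attempts - 1:
                raise
            # 0.5s, 1s, 2s, ... plus a little jitter to avoid synchronized retries
            await sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


async def react_loop(llm, tools, user_input, *, max_turns=8, sleep=asyncio.sleep):
    # Steps 1-2 (context load, sandbox acquire) are omitted in this sketch.
    messages = [("user", user_input)]
    for _ in range(max_turns):  # max turn limit
        # Step 3: LLM call with retry
        reply = await call_with_retry(lambda: llm(messages), sleep=sleep)
        messages.append(("assistant", reply.text))
        # Step 4: stop and hand control back if the model asks a question or is done
        if "[CLARIFICATION]" in reply.text or not reply.tool_calls:
            return messages
        # Step 5: run each requested tool and feed the result back
        for name, args in reply.tool_calls:
            result = tools[name](**args) if name in tools else f"unknown tool {name}"
            messages.append(("tool", str(result)))
        # Step 6: a checkpoint save would go here
    return messages
```

The whole control flow is one `for` loop with early returns, which is the property the design is trying to preserve as features are added.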

The 10 core patterns:

Module	Pattern	Lines
`react.py`	ReAct loop — context → LLM → tools → checkpoint	~1160
`state.py`	ThreadState / TurnSignals / NextAction data model	43
`context.py`	Parallel context loading (memory + files)	111
`prompt.py`	Static base + dynamic injection prompt assembly	196
`llm.py`	Multi-provider LLM abstraction	41
`sandbox/tools.py`	Sandbox tool wrapping (bash only, 40 lines)	40
`memory/storage.py`	Flat-file memory (USER.md + MEMORY.md)	97
`checkpoint/sqlite.py`	SQLite conversation persistence	287
`cli/api.py`	SSE streaming API	~336
`config.py`	YAML + env var configuration	~195
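As one example from the table, parallel context loading can be sketched with `asyncio.gather`. The function name `load_context`, its arguments, and the shape of the returned dict are assumptions for illustration; the real `ContextManager.load()` may differ. Blocking file reads are pushed to worker threads with `asyncio.to_thread` so they overlap.

```python
import asyncio
from pathlib import Path


def _read_or_empty(path: Path) -> str:
    """Blocking read that treats a missing file as empty."""
    return path.read_text(encoding="utf-8") if path.is_file() else ""


async def load_context(memory_dir: Path, uploads_dir: Path) -> dict:
    """Read both memory files and every uploaded file concurrently."""
    upload_paths = (
        sorted(p for p in uploads_dir.glob("*") if p.is_file())
        if uploads_dir.is_dir() else []
    )
    # gather preserves argument order, so results can be unpacked positionally.
    user, memory, *uploads = await asyncio.gather(
        asyncio.to_thread(_read_or_empty, memory_dir / "USER.md"),
        asyncio.to_thread(_read_or_empty, memory_dir / "MEMORY.md"),
        *(asyncio.to_thread(_read_or_empty, p) for p in upload_paths),
    )
    return {
        "user": user,
        "memory": memory,
        "uploads": dict(zip((p.name for p in upload_paths), uploads)),
    }
```

With only a handful of small files the speedup is modest, but the same structure extends naturally if a slower source (a network fetch, for instance) is added later.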

Tools

8 core tools (always available via default_tools()):

Tool	Category	Runs in
`read_file`	File	Host
`write_file`	File	Host
`edit_file`	File	Host
`bash`	Shell	Sandbox (container)
`web_search`	Web	Host
`web_fetch`	Web	Host
`save_memory`	Memory	Host
`search_memory`	Memory	Host

12 extension tool modules (on disk, import individually; plan_step.py provides both add_step and update_step): ls, glob, grep, git, exec_python, read_image, create_plan, add_step, update_step, list_plans, spawn_subagent, get_subagent_results, invoke_skill

Quick Start

Requirements

Python ≥ 3.10
An LLM API key (Anthropic, OpenAI, DeepSeek, MiniMax, etc.)
Docker (optional, for sandbox isolation)

Install

git clone https://github.com/gzhzk/nanodeer
cd nanodeer

cp .env.example .env
# Edit .env with your API key

pip install -e .

Run (backend only)

nanodeer          # Start API server at http://127.0.0.1:20266
nanodeer-repl     # CLI REPL for debugging
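The API server streams responses over SSE. This README does not document the event names or payloads, so the following is only a sketch of how a client could parse the standard SSE wire format from any line iterator. The function name and the `token` event used in the usage note are hypothetical.

```python
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Group raw SSE lines into events. A blank line ends each event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Dispatch only if at least one data line was seen.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            name, _, value = line.partition(":")
            value = value.removeprefix(" ")  # one optional space after the colon
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)
            # other fields (id, retry) are ignored in this sketch
```

Feeding it the lines `event: token`, `data: {"text": "Hi"}`, and an empty line yields a single event with `event == "token"`, whose `data` string can then be decoded with `json.loads`.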

Test

pip install -e '.[dev]'
pytest

Demo frontend

A Next.js demo frontend is available in demo/frontend/:

cd demo/frontend
npm install
npm run dev       # Opens at http://127.0.0.1:20265

Project Structure

nanodeer/
├── pyproject.toml           # Build config (hatchling), entry points, deps
├── config.yaml              # Runtime config (LLM providers, sandbox, storage)
├── config.yaml.example      # Template — copy to config.yaml and edit
├── .env / .env.example      # API keys
├── AGENTS.md                # Agent development guide (Claude Code context)
├── LICENSE                  # MIT
│
├── src/nanodeer/            # Python source
│   ├── __init__.py          # Package exports: NanoEngine, RuntimeFeatures, config
│   ├── engine.py            # NanoEngine — app entry point, executor assembly
│   ├── config.py            # HarnessConfig — Pydantic models, YAML + env loading
│   │
│   ├── agent/               # Core runtime
│   │   ├── __init__.py
│   │   ├── react.py         # ReActExecutor — main loop (~1160 lines)
│   │   ├── state.py         # ThreadState, TurnSignals, NextAction, SandboxState
│   │   ├── context.py       # ContextManager — memory + uploads, parallel load
│   │   ├── prompt.py        # PromptConfig, build_base/lead_agent_prompt
│   │   ├── llm.py           # ReasoningChatOpenAI (OpenAI-compatible wrapper)
│   │   ├── messages.py      # HumanMessage, AIMessage, ToolMessage, ToolCall
│   │   ├── sandbox_manager.py  # SandboxManager — acquire/release lifecycle
│   │   ├── trace.py         # TraceCollector — structured event emission
│   │   ├── checkpoint/
│   │   │   ├── __init__.py
│   │   │   ├── base.py      # Checkpointer ABC
│   │   │   └── sqlite.py    # SqliteCheckpointer — message + metadata persistence
│   │   └── memory/
│   │       ├── __init__.py
│   │       └── storage.py   # MemoryStore — USER.md + MEMORY.md flat files
│   │
│   ├── sandbox/             # Sandbox isolation
│   │   ├── __init__.py      # SandboxProvider ABC, Sandbox, RunResult, get/set/clear
│   │   ├── docker.py        # DockerSandboxProvider — container lifecycle
│   │   ├── local.py         # LocalSandboxProvider — subprocess fallback
│   │   ├── tools.py         # SandboxToolWrapper — bash only, 40 lines
│   │   └── path.py          # Path validation (retained for extension use)
│   │
│   ├── tools/               # Built-in tool definitions
│   │   ├── __init__.py      # default_tools() → 8 core, extension tools importable
│   │   ├── read_file.py     # Core: read file content
│   │   ├── write_file.py    # Core: write file content
│   │   ├── edit_file.py     # Core: string replacement editing
│   │   ├── bash.py          # Core: execute shell command (sandbox-wrapped)
│   │   ├── web_search.py    # Core: DuckDuckGo search
│   │   ├── web_fetch.py     # Core: fetch URL content
│   │   ├── save_memory.py   # Core: persist to USER.md / MEMORY.md
│   │   ├── search_memory.py # Core: recall from USER.md / MEMORY.md
│   │   ├── ls.py            # Extension: list directory
│   │   ├── glob.py          # Extension: file pattern match
│   │   ├── grep.py          # Extension: search file contents
│   │   ├── git.py           # Extension: git operations
│   │   ├── exec_python.py   # Extension: execute Python code
│   │   ├── read_image.py    # Extension: read image files
│   │   ├── invoke_skill.py  # Extension: load skill workflow
│   │   ├── create_plan.py   # Extension: create task plan
│   │   ├── plan_step.py     # Extension: add/update plan step
│   │   ├── list_plans.py    # Extension: list plans
│   │   ├── spawn_subagent.py # Extension: fork worker agent
│   │   └── get_subagent_results.py # Extension: collect worker results
│   │
│   ├── cli/
│   │   ├── __init__.py
│   │   ├── api.py           # FastAPI app, SSE /api/chat, conversation CRUD
│   │   └── repl.py          # Async CLI REPL for debugging
│   │
│   ├── subagent/            # Extension: SubagentCoordinator, runner, types
│   └── plan/                # Extension: PlanStore, Plan/Step types
│
├── scripts/
│   ├── dev.sh               # One-command launch: backend (+ --with-frontend)
│   └── check.sh             # Run tests (pytest)
│
├── tests/                   # Python test suite (115 tests)
│   ├── conftest.py          # Shared fixtures (thread_id, alt_thread_id)
│   ├── test_agent/          # ReAct, engine, state, messages, trace, factory
│   ├── test_sandbox/        # Docker provider, path, tool wrapping, exec
│   ├── test_agent_memory/   # MemoryStore (USER.md + MEMORY.md)
│   ├── test_tools_integration/ # Tool schema + invocation tests
│   ├── test_subagents/      # Extension test (SubagentCoordinator)
│   ├── test_plan/           # Extension test (PlanStore)
│   ├── test_skills/         # Extension test (skill loader)
│   ├── test_evaluation/     # Archived test (evaluation runner)
│   └── test_cli/            # API upload tests
│
├── demo/frontend/           # Next.js + assistant-ui demo (separate concern)
├── evaluation/              # Evaluation harness (archived)
└── docs/                    # Design documentation (in Chinese)
    ├── harness_architecture.md
    ├── runtime_architecture.md
    ├── sandbox_design.md
    ├── tools_design.md
    ├── prompt_design.md
    ├── memory_design.md
    ├── nanodeer_blueprint_20260401.md
    ├── refactoring_journey.md
    └── archive/             # Docs for removed modules

Storage Layout

~/.nanodeer/
├── memory/                    # Core: flat-file memory (USER.md + MEMORY.md)
├── threads/
│   ├── threads.db             # SQLite — message + metadata persistence
│   └── {thread_id}/
│       └── user-data/         # Volume-mounted to sandbox container
│           ├── workspace/
│           ├── uploads/
│           └── outputs/
└── conversations/
    └── {thread_id}.json       # UI metadata index (title, timestamps)

Extension modules (subagent, plan, wiki, layers) create additional directories under ~/.nanodeer/ when used, but nothing in core references them by default.
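To illustrate the role of threads.db, here is a minimal sketch of SQLite-backed conversation persistence. The table schema, the class name, and the replace-on-save strategy are assumptions for this example and are not taken from checkpoint/sqlite.py.

```python
import json
import sqlite3


class SqliteCheckpoint:
    """Stores each thread's messages as ordered rows in a single table."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " thread_id TEXT NOT NULL,"
            " role TEXT NOT NULL,"
            " content TEXT NOT NULL)"
        )
        self.db.execute(
            "CREATE INDEX IF NOT EXISTS idx_messages_thread ON messages(thread_id, id)"
        )
        self.db.commit()

    def save(self, thread_id: str, messages: list[dict]) -> None:
        """Replace the stored history for one thread in a single transaction."""
        with self.db:  # commits on success, rolls back on error
            self.db.execute("DELETE FROM messages WHERE thread_id = ?", (thread_id,))
            self.db.executemany(
                "INSERT INTO messages (thread_id, role, content) VALUES (?, ?, ?)",
                [(thread_id, m["role"], json.dumps(m["content"])) for m in messages],
            )

    def load(self, thread_id: str) -> list[dict]:
        """Return the thread's messages in insertion order."""
        rows = self.db.execute(
            "SELECT role, content FROM messages WHERE thread_id = ? ORDER BY id",
            (thread_id,),
        ).fetchall()
        return [{"role": role, "content": json.loads(content)} for role, content in rows]
```

Passing a file path such as one under `~/.nanodeer/threads/` instead of `:memory:` makes the history survive restarts, which is what allows a thread to be resumed.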

Extension Modules

The following modules exist as extension patterns — they are importable but not part of the default runtime chain:

Module	Files	What it demonstrates
`subagent/`	coordinator, runner, types	Parallel worker orchestration
`plan/`	storage, types	Structured plan with step tracking
`skills/`	loader	Markdown-based skill workflow
`memory/wiki.py`	WikiStore	LLM-curated structured knowledge
`memory/layers.py`	MemoryLayers	L1-L4 tiered memory model
Extension tools	Individual .py files	Additional tool patterns

To use an extension module, import and configure it manually.

Acknowledgments

To my family — for their silent support and endless patience, which made this possible.

To my mentor — for opening the door to Agent and Harness Engineering, and encouraging me to explore.

Claude Code — my best coding companion, supercharging my AI workflow, and showing me that a product can be both powerful and elegant.

DeerFlow — for showing me what an enterprise-grade Agent framework truly looks like.

OpenClaw — for the layered memory and IM channel inspiration.

NanoClaw — for the Docker sandbox isolation pattern.

assistant-ui — for the beautiful and extensible React chat UI that powers the frontend.

DeepSeek — for providing the deepseek-v4-flash model with exceptional inference efficiency.

MiniMax — for providing the MiniMax-M2.7 model service that powers this project.

Andrej Karpathy — for the LLM wiki concept that inspired the wiki memory system: letting the LLM curate its own structured knowledge base.

Design Inspirations

Source	Pattern
DeerFlow	State machine + `next_action` signal routing
Claude Code	Tool-first design, clarification via tags
OpenClaw	Layered memory, wiki-structured knowledge
NanoClaw	Docker sandbox, volume mounts, path isolation

License

MIT
