A from-scratch re-implementation of Docker's core architecture using plain OS processes instead of containers. The goal is to understand how Docker actually works by building the same pieces — daemon, CLI, Unix socket, state machine, shim process — without any of the namespace or cgroup machinery.
No isolation. No images. Just processes that behave like containers.
| Docker concept | This project |
|---|---|
| dockerd HTTP API over Unix socket | orchestratord FastAPI daemon |
| docker CLI | orchestrate Typer CLI |
| Container state machine | State enum with transition guards |
| docker attach / WebSocket stdio | WebSocket bidirectional stream |
| docker exec | new subprocess in the container's cwd |
| docker logs --follow | async condition-variable fan-out |
| containerd-shim keeping process alive across daemon restarts | per-container shim.py process |
| PR_SET_CHILD_SUBREAPER | shim adopts orphaned grandchildren |
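
As an illustration of the "State enum with transition guards" row, here is a minimal sketch. Only CREATED and RUNNING are named in this README; the PAUSED and EXITED members, the `ALLOWED` table, and the `transition()` helper are assumptions made for the example and may not match what `shared.py` actually defines.

```python
from enum import Enum


class State(str, Enum):
    CREATED = "created"
    RUNNING = "running"
    PAUSED = "paused"   # assumed name for a SIGSTOP'd container
    EXITED = "exited"   # assumed name for a terminated container


# Assumed transition table; the project's real guards may allow a different set.
ALLOWED = {
    State.CREATED: {State.RUNNING},
    State.RUNNING: {State.PAUSED, State.EXITED},
    State.PAUSED: {State.RUNNING, State.EXITED},
    State.EXITED: {State.RUNNING},  # restart
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in the ALLOWED table."""


def transition(current: State, target: State) -> State:
    """Return the new state, or raise if the guard rejects the change."""
    if target not in ALLOWED[current]:
        raise InvalidTransition(
            f"cannot go from {current.value} to {target.value}"
        )
    return target
```

A route handler could catch `InvalidTransition` and map it to an HTTP error response, which is the kind of behavior an error-code test suite would assert on.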
orchestrate (CLI)
│ HTTP + WebSocket over /tmp/orchestrator.sock
▼
orchestratord (daemon)
│ spawns one shim per container
▼
shim.py (one per container, outlives daemon restarts)
│ spawns and waits on the managed process
▼
container process (your "docker run" target)
The daemon is the API server. It creates containers, routes requests, and tails log files, but it is intentionally stateless with respect to running processes — all durable state lives in shim directories on disk.
The shim is the durable layer. It owns the container's stdio, writes an append-only log file, listens on a Unix socket for stdin forwarding, and writes an exit-code file when the process terminates. Because the shim is a separate process, a daemon crash or restart does not kill or disconnect the running container. The daemon re-reads shim directories on startup and reconnects.
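
The core of a shim like this can be sketched in a few lines. This is a simplified illustration, not the project's `shim.py`: it omits the stdin socket, and the `write_atomic()` and `run_shim()` names are hypothetical. It shows the append-only log and the write-then-rename pattern that makes the exit-code file appear atomically.

```python
import os
import subprocess
from pathlib import Path


def write_atomic(path: Path, text: str) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(text)
    os.replace(tmp, path)  # rename is atomic on POSIX within one filesystem


def run_shim(state_dir: Path, cmd: list[str]) -> int:
    """Run cmd, recording PIDs, combined output, and the exit code on disk."""
    state_dir.mkdir(parents=True, exist_ok=True)
    write_atomic(state_dir / "shim.pid", str(os.getpid()))
    with open(state_dir / "log", "ab") as log:  # append-only log file
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,  # the real shim forwards stdin from a socket
            stdout=log,
            stderr=subprocess.STDOUT,
        )
        write_atomic(state_dir / "pid", str(proc.pid))
        code = proc.wait()
    # Readers never observe a half-written exit file because of the rename.
    write_atomic(state_dir / "exit", str(code))
    return code
```

Because every piece of state lands in a file, a reader that starts later needs nothing from the process that launched the shim.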
simple-orchestrator/
├── orchestrator/ # async rewrite (FastAPI + Typer + websockets)
│ ├── orchestrate.py # CLI entry point
│ ├── shared.py # socket path, State enum, ID generator
│ ├── orchestratord/
│ │ ├── __main__.py # daemon entry point (uvicorn)
│ │ ├── routes.py # FastAPI app, all HTTP + WebSocket handlers
│ │ ├── models.py # Container dataclass, out(), resolve()
│ │ ├── process.py # launch(), persist_meta(), tail_log()
│ │ ├── lifecycle.py # lifespan, recovery on startup, reaper
│ │ └── shim.py # per-container shim process
│ ├── docs/
│ │ └── shim-process.md # detailed write-up of the shim design
│ └── tests/ # bash integration test suite
│
└── orchestrator-stdlib/ # earlier version using only Python stdlib
  ├── orchestrate.py # CLI (argparse + http.client over Unix socket)
  ├── orchestratord.py # daemon (http.server + threading)
  └── docs/
    ├── daemon.md # how the threading daemon works
    └── unix_socket.md # how Unix socket HTTP is wired up
The orchestrator-stdlib/ version intentionally uses zero external
dependencies to make the Unix socket and HTTP plumbing visible. The
orchestrator/ version replaces the hand-rolled pieces with FastAPI, Typer,
httpx, and websockets.
Each container gets a directory under /tmp/orchestrator-shims/<id>/:
shim.pid PID of the shim process (liveness sentinel)
pid PID of the container process
meta.json {id, name, cmd, cwd, created_at, started_at}
log append-only stdout + stderr of the container
exit exit code, written atomically on container termination
shim.sock Unix socket for stdin forwarding
On daemon restart, the daemon scans this directory, checks whether each shim
is alive with kill(shim_pid, 0), and reconstructs in-memory state from
meta.json and the presence or absence of exit.
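
The recovery scan described above might look roughly like the following. The function names and the returned record shape are hypothetical, and the last branch (shim dead, no exit file) is an assumption about how such a case could be treated rather than something this README specifies.

```python
import json
import os
from pathlib import Path
from typing import Optional


def pid_alive(pid: int) -> bool:
    """Probe a PID with signal 0, which checks existence without delivering anything."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def recover_one(shim_dir: Path) -> Optional[dict]:
    """Rebuild one container record from the files in its shim directory."""
    meta_file = shim_dir / "meta.json"
    if not meta_file.is_file():
        return None
    record = json.loads(meta_file.read_text())
    exit_file = shim_dir / "exit"
    pid_file = shim_dir / "shim.pid"
    if exit_file.is_file():
        record["state"] = "exited"
        record["exit_code"] = int(exit_file.read_text().strip())
    elif not pid_file.is_file():
        record["state"] = "created"  # meta.json only: never started
    elif pid_alive(int(pid_file.read_text().strip())):
        record["state"] = "running"
    else:
        # Assumption: a dead shim with no exit file is reported as exited
        # with an unknown exit code.
        record["state"] = "exited"
        record["exit_code"] = None
    return record


def recover_all(root: Path) -> dict:
    """Scan every subdirectory of root and index the records by container id."""
    if not root.is_dir():
        return {}
    records = (recover_one(d) for d in sorted(root.iterdir()) if d.is_dir())
    return {r["id"]: r for r in records if r is not None}
```

One caveat of any `kill(pid, 0)` probe is PID reuse: after a reboot or a long uptime, an unrelated process can hold the recorded PID, so a stricter check would compare something beyond the number alone.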
cd orchestrator
pip install -e . # or: uv sync
# terminal 1 — start the daemon
orchestratord --socket /tmp/orch.sock
# terminal 2 — use the CLI
export ORCHESTRATOR_SOCKET=/tmp/orch.sock
orchestrate run --name hello -- echo "hello world"
orchestrate ps -a
orchestrate logs hello
orchestrate run --name shell -- bash
orchestrate attach shell # interactive stdin/stdout
orchestrate exec shell -- ps aux # run a command inside

cd orchestrator
bash tests/run_all.sh

The suite is a set of bash integration tests that start a real daemon, exercise the CLI, and assert on state, log output, and process liveness. Suites cover:

- test_lifecycle: create / run / start / stop / rm / restart
- test_logs: ring buffer cap, log follow (streaming)
- test_exec: stdin EOF protocol, cwd inheritance
- test_pause_unpause: SIGSTOP / SIGCONT, output freezes
- test_process_groups: killpg reaches grandchildren
- test_grandchild_lifecycle: escaped-PGID grandchildren are cleaned up via subreaper
- test_recovery: CREATED and RUNNING containers survive a daemon SIGKILL
- test_errors: correct HTTP error codes for every invalid transition
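
The log behavior that test_logs exercises (a capped buffer plus streaming follow) can be sketched with `asyncio.Condition`, which is one way to build the "async condition-variable fan-out" this README mentions. The `LogBuffer` class below is a hypothetical illustration and is not taken from the project's code; the real daemon tails a log file rather than holding lines in memory only.

```python
import asyncio
from collections import deque


class LogBuffer:
    """Capped in-memory log that wakes every follower when a line arrives."""

    def __init__(self, cap: int = 1000):
        self.lines = deque(maxlen=cap)  # ring buffer: oldest lines fall off
        self.total = 0                  # lines ever appended, including evicted
        self.closed = False
        self.cond = asyncio.Condition()

    async def append(self, line: str) -> None:
        async with self.cond:
            self.lines.append(line)
            self.total += 1
            self.cond.notify_all()      # fan out to all waiting followers

    async def close(self) -> None:
        async with self.cond:
            self.closed = True
            self.cond.notify_all()

    async def follow(self):
        """Yield retained lines, then new ones, until the buffer is closed."""
        cursor = self.total - len(self.lines)  # index of oldest retained line
        while True:
            async with self.cond:
                await self.cond.wait_for(
                    lambda: self.total > cursor or self.closed
                )
                oldest = self.total - len(self.lines)
                cursor = max(cursor, oldest)   # skip lines evicted meanwhile
                batch = list(self.lines)[cursor - oldest:]
                cursor = self.total
                done = self.closed
            for line in batch:                 # yield outside the lock
                yield line
            if done:
                return
```

Each follower keeps its own cursor, so a slow client never blocks a fast one; it only risks missing lines that were evicted from the ring before it caught up.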
Unix socket, not TCP. The socket file acts as a capability: only processes
whose filesystem permissions let them connect to /tmp/orchestrator.sock can
talk to the daemon. There is no port and no binding to a network interface.
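
To make the plumbing concrete, here is a minimal sketch of speaking HTTP over a Unix socket with only the standard library. It subclasses `http.client.HTTPConnection` and swaps the TCP connect for an `AF_UNIX` connect. The class and the `unix_get()` helper are illustrative names, not necessarily what either version of the CLI uses.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix socket path instead of host:port."""

    def __init__(self, socket_path: str, timeout: float = 5.0):
        # "localhost" only fills the Host header; no DNS or TCP is involved.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)  # fails if file permissions deny access
        self.sock = sock


def unix_get(socket_path: str, path: str):
    """Issue a GET over the Unix socket and return (status, body bytes)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

Everything above the `connect()` override is ordinary HTTP, which is why the same daemon can later be reached through a higher-level client without changing the server side.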
Shim process, not pipes. Holding a subprocess.PIPE to a container ties
the container's lifetime to the daemon's. A shim process decouples them: the
container keeps running when the daemon is restarted, and the daemon can
reconnect to an already-running container by re-reading the state directory.
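
The difference from `subprocess.PIPE` can be shown with a small sketch: stdio goes to a file instead of to pipes held by the parent, and the child gets its own session. `launch_detached()` is a hypothetical helper for illustration and is not the project's `launch()`.

```python
import subprocess


def launch_detached(cmd, log_path):
    """Start cmd in its own session with stdout/stderr appended to log_path."""
    with open(log_path, "ab") as log:
        # The child inherits its own copy of the log descriptor, so closing
        # ours on leaving the with-block does not affect it.
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log,              # a file, not a pipe back to the parent
            stderr=subprocess.STDOUT,
            start_new_session=True,  # setsid(): new session and process group
        )
```

With no pipe back to the parent, the child does not receive SIGPIPE or EOF when the parent goes away, and the new session keeps terminal signals aimed at the parent from reaching it. The parent still has to reap the child eventually, or leave that to whichever process adopts it.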
PR_SET_CHILD_SUBREAPER on the shim. If a container forks children that
move to their own process group (escaping killpg), those children would
normally be re-parented to init when the container exits. With the shim
marked as a subreaper (a Linux-only prctl option), they re-parent to the shim
instead, which kills and reaps them before exiting.
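
Python has no built-in wrapper for this prctl option, so one way to set it is through `ctypes`. The sketch below is an assumption about how that could be done, not a copy of the project's shim; it covers the reaping half only and leaves out how orphans are found and killed. The constant 36 is the value of `PR_SET_CHILD_SUBREAPER` in `<linux/prctl.h>`.

```python
import ctypes
import os
import sys

PR_SET_CHILD_SUBREAPER = 36  # from <linux/prctl.h>


def become_subreaper() -> bool:
    """Mark this process as a child subreaper; returns False if unsupported."""
    if not sys.platform.startswith("linux"):
        return False
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)
    rc = libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1), zero, zero, zero)
    return rc == 0


def reap_children() -> list:
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, exit_code) pairs; the exit code is negative
    when the child was killed by a signal.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, os.waitstatus_to_exitcode(status)))
    return reaped
```

Once the flag is set, orphaned descendants show up as children of this process, so the same `waitpid(-1, ...)` loop that reaps the direct child also reaps adopted grandchildren.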
WebSocket for attach/exec, not a raw HTTP upgrade. The stdlib version uses a
raw HTTP 101 Switching Protocols upgrade and then treats the connection as a
plain byte stream. The async version uses FastAPI's native WebSocket support,
which is cleaner but raises a stdin half-close problem: WebSocket has no
equivalent of SHUT_WR, so the client sends a {"stdin_eof": true} sentinel
message to signal end of input instead.
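
A receiver for that sentinel might be structured like the sketch below. The framing is an assumption made for the example: binary frames carry stdin bytes and text frames carry JSON control messages. Only the `{"stdin_eof": true}` message itself comes from this README, and `handle_frame()` is a hypothetical name.

```python
import json
from typing import BinaryIO, Union


def handle_frame(frame: Union[str, bytes], stdin: BinaryIO) -> bool:
    """Apply one client frame to the process's stdin.

    Returns True while stdin is still open, False once EOF was signalled.
    """
    if isinstance(frame, bytes):
        stdin.write(frame)  # assumed: binary frames are raw stdin data
        stdin.flush()
        return True
    try:
        msg = json.loads(frame)
    except json.JSONDecodeError:
        return True  # ignore malformed control frames
    if isinstance(msg, dict) and msg.get("stdin_eof") is True:
        stdin.close()  # the process sees EOF; its stdout stays readable
        return False
    return True
```

Closing only the stdin side is what lets a command such as `sort` or `cat` finish reading, produce its output, and exit while the WebSocket keeps delivering that output to the client.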
File-based state, not SQLite. The shim's state directory IS the persistent
state. Writing a secondary index in SQLite would create a sync problem. The
only gap is CREATED-state containers (never started, so no shim), which are
handled by writing meta.json at create time — the same approach Docker uses
with config.v2.json.