recipes: surface non-AI use cases + ship ci-parallel-pytest by WaylandYang · Pull Request #231 · deeplethe/forkd

WaylandYang · 2026-06-06T08:36:31Z

Phase A — recipes/README.md reorg

The recipes/ directory already had non-AI use cases (postgres-fixture for DB testing, playwright-browser for browser farms, nodejs for generic JS), but the top-level README categorized recipes only by AI agent framework. A new "By problem you're solving" table makes the breadth explicit:

| Problem | Recipes |
|---|---|
| AI agent fan-out | langgraph-react, crewai-fanout, autogen-branch, openai-swarm, mcp-agent, speculative-agent, coding-agent-fork |
| CI test parallelism | postgres-fixture, ci-parallel-pytest (new) |
| Database test fixtures | postgres-fixture |
| Browser automation farms | playwright-browser |
| Notebook / code interpreter | jupyter-kernel, e2b-codeinterpreter |
| Generic compute | python-numpy, coding-agent, nodejs, agent-workbench |

The AI agent lens stays prominent (first row) — this isn't a rebrand, just a widening of the discovery surface.

Phase B — new recipe `ci-parallel-pytest`

```
     ┌──────────────────────────────────────┐
     │ parent snapshot ci-pytest            │
     │ python:3.12-slim + pytest + numpy    │
     │ + pandas + sklearn + your tests      │
     │ (heavy imports already paid)         │
     └────────────────┬─────────────────────┘
                      │ mmap MAP_PRIVATE (CoW)
┌──────────┬──────────┼──────────┬──────────┐
│ worker 1 │ worker 2 │ worker 3 │ worker N │
│ pytest   │ pytest   │ pytest   │ pytest   │
│ slice 1  │ slice 2  │ slice 3  │ slice N  │
└──────────┴──────────┴──────────┴──────────┘
```

A typical Python ML CI re-pays ~1.5 s of `import numpy/pandas/sklearn` on every fresh worker container. With forkd, those imports live in the warmed parent's snapshot; every fork inherits them via mmap CoW. Per-worker fixed cost drops from ~3.5 s (container cold-start + imports) to ~80 ms (forkd spawn) + 0 ms (warmed imports).
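The copy-on-write behaviour that makes inherited imports free can be seen in miniature with Python's own `mmap` module. The sketch below is only an illustration of `MAP_PRIVATE` semantics on an ordinary temp file; it does not touch forkd or a VM snapshot, and the function name `cow_demo` is made up for this example.

```python
import mmap
import os
import tempfile


def cow_demo() -> tuple[bytes, bytes]:
    """Write through a private mapping; return (mapping view, file on disk)."""
    # Stand-in for the warmed parent snapshot: a small file with known bytes.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"parent-state")
        path = f.name
    try:
        with open(path, "r+b") as f:
            # ACCESS_COPY asks for a private copy-on-write mapping
            # (MAP_PRIVATE on POSIX): writes land in this mapping only.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
                m[:6] = b"child!"          # the "child" dirties its pages
                seen_by_child = bytes(m[:])
        with open(path, "rb") as f:
            on_disk = f.read()             # the "parent" data is untouched
    finally:
        os.unlink(path)
    return seen_by_child, on_disk


if __name__ == "__main__":
    child, parent = cow_demo()
    print("child view :", child)
    print("parent file:", parent)
```

The child sees its own modification while the backing file keeps the original bytes, which is the property that lets N workers share one warmed parent without interfering with each other.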

Ships:

- `build.sh`: wraps python:3.12-slim + pinned deps + the demo test project + a prewarm step
- `test_project/`: ~30 representative tests across 5 files (arithmetic, numpy, pandas, sklearn, text)
- `demo.py`: fan-out driver that slices tests across N workers, runs each in a child sandbox, reports per-worker spawn/exec + total wall-clock + sequential baseline
- `README.md`: story, when-to-use / when-not, quickstart, GitHub Actions snippet, comparison vs sequential / pytest-xdist / docker
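The slicing step of the driver can be sketched as a round-robin over sorted file names. This is an assumption about how such a driver might work, not the actual `demo.py` code; `slice_round_robin` is a hypothetical helper, and the file names are the ones that appear in the dev-box output later in this thread.

```python
def slice_round_robin(test_files: list[str], n_workers: int) -> list[list[str]]:
    """Deal test files out to workers one at a time, in sorted order."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    slices: list[list[str]] = [[] for _ in range(n_workers)]
    for i, name in enumerate(sorted(test_files)):
        slices[i % n_workers].append(name)
    # Drop empty slices so no child is spawned with nothing to run.
    return [s for s in slices if s]


if __name__ == "__main__":
    files = [
        "test_arithmetic.py",
        "test_numpy_ops.py",
        "test_pandas_etl.py",
        "test_sklearn_models.py",
        "test_text_processing.py",
    ]
    for worker, chunk in enumerate(slice_round_robin(files, 4)):
        print(f"[{worker}] files={','.join(chunk)}")
```

Round-robin by file is simple but blind to per-file cost; a suite with one heavy file still ends up with one slow slice.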

Numbers

The README quickstart shows projected numbers (i7-12700 / ext4):

| Approach | Wall-clock (4 workers) |
|---|---|
| Sequential, fresh container | ~4-5 s |
| pytest-xdist -n 4 in one container | ~3 s |
| docker × 4 fresh containers | ~5-7 s |
| forkd fan-out (this recipe) | ~1.6 s |

Real dev-box measurements will land as a follow-up commit before merge; this PR stays in draft until then.

Test plan

- `cargo fmt --all -- --check`: n/a (no Rust changes)
- `cargo test`: n/a (no Rust changes)
- Real numbers from dev box (pending; will be follow-up commit)
- Visual review of recipes/README.md categorization

🤖 Generated with Claude Code

Phase A - reorg recipes/README.md
---------------------------------

The recipes/ directory already had postgres-fixture (DB testing), playwright-browser (browser farms), and nodejs (generic JS runtime), all non-AI use cases. But the top-level README only categorized by framework/audience under an AI agent lens, so non-AI users had to drill into individual recipes to discover them.

New "By problem you're solving" table makes it explicit. Same recipes, surfaced for non-AI audiences:

    AI agent fan-out          (5 recipes)
    CI test parallelism       postgres-fixture + ci-parallel-pytest (new)
    Database test fixtures    postgres-fixture
    Browser automation        playwright-browser
    Notebook / interpreter    jupyter-kernel, e2b-codeinterpreter
    Generic compute           python-numpy, coding-agent, nodejs

The AI agent lens stays prominent (first row, 5 recipes); this isn't a rebrand, just a widening of the discovery surface.

Phase B - new recipe: ci-parallel-pytest
----------------------------------------

The pitch: run pytest workers across N forkd microVMs and skip per-worker container cold-start + dependency import cost. A typical Python ML CI re-pays ~1.5 s of `import numpy/pandas/sklearn` on every fresh worker; with forkd, that's in the warmed parent's page cache and inherited via mmap CoW.

Ships:

    build.sh        Wraps python:3.12-slim + pinned pytest/numpy/pandas/sklearn + the demo test project + a prewarm step, builds the ext4 rootfs.
    test_project/   ~30 representative tests across 5 files (arithmetic, numpy, pandas, sklearn, text) so worker slicing has something meaningful to do.
    demo.py         Slices test files across N workers, spawns one child per slice from the snapshot, runs pytest inside each, reports per-worker spawn/exec timing + total wall-clock + sequential baseline.
    README.md       Story, when-to-use / when-not-to-use, quickstart, GitHub Actions integration snippet, comparison table (sequential / pytest-xdist / docker / forkd).

Numbers in the README quickstart are illustrative (i7-12700 / ext4 projected), with a "replace your tests, re-measure" note. Real measurement on the dev box will land as a follow-up commit before PR merge.

Closes the "scope feels narrow" feedback by demonstrating that a non-AI use case (CI test fan-out) ships cleanly on the same primitive: no special daemon mode, no new API, just a recipe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

End-to-end verified on the dev box (Intel i7-12700, ext4):

    Plan: 4 worker(s) × pytest slice off `ci-pytest`.

    === fan-out: 4 workers in parallel ===
    batch spawn (4 children): 81 ms
    [0] PASS exec= 232 ms files=test_arithmetic.py,test_text_processing.py
    [1] PASS exec= 304 ms files=test_numpy_ops.py
    [2] PASS exec= 546 ms files=test_pandas_etl.py
    [3] PASS exec=1458 ms files=test_sklearn_models.py
    fan-out wall-clock: 1601 ms (~20 ms/worker spawn)

    === sequential baseline ===
    [0] PASS spawn=61 ms exec=1507 ms
    sequential wall-clock: 1625 ms (fan-out speedup: 1.01×)

Real-numbers reframing in the README:

The 1.01× fan-out-vs-sequential ratio is honest for THIS suite: one sklearn slice dominates (1458 ms). Fan-out shines when suites have many comparable-cost slices. The cross-suite-invariant number to compare is the **batch spawn cost: 81 ms for 4 children = ~20 ms/worker**, vs ~2-3 s for a fresh container.

Two demo.py changes to land it:

1. **Batch spawn via single POST /v1/sandboxes with n=N** instead of N concurrent POST calls. Concurrent calls race FC's "cannot /snapshot/load after InstanceStart" rejection; the daemon's `restore_many` is purpose-built for the batch case and atomically spawns N children with per-child netns.
2. **`cd /opt/test_project && pytest …`** in the exec instead of bare `pytest`. The guest agent's exec runs from `/`; the Dockerfile's `WORKDIR` isn't honored at exec time, so we have to switch directories explicitly.

Comparison table in README updated to reflect actual measured costs (no more `~80 ms spawn` per worker; the real number is the batch total, not per-worker). The break-even framing also clarified: forkd wins when a per-worker test slice is shorter than the ~3 s container cold-start tax, which is most ML / data-science CI suites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
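The arithmetic behind the 1.01× figure can be checked with a few lines. The sketch below uses only the timings printed in the dev-box output above; the `FanOutRun` class and the "batch spawn + slowest slice" floor are assumptions made for this illustration, not anything taken from `demo.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FanOutRun:
    """Timings from one fan-out run and its sequential baseline, in ms."""
    batch_spawn_ms: int
    exec_ms: tuple[int, ...]
    fanout_wall_ms: int
    sequential_wall_ms: int

    def spawn_per_worker_ms(self) -> float:
        # Batch total divided by the number of children.
        return self.batch_spawn_ms / len(self.exec_ms)

    def critical_path_ms(self) -> int:
        # Assumed model: children run fully in parallel, so wall-clock
        # cannot drop below the batch spawn plus the slowest slice.
        return self.batch_spawn_ms + max(self.exec_ms)

    def speedup(self) -> float:
        return self.sequential_wall_ms / self.fanout_wall_ms


# Values copied from the dev-box output in this commit message.
RUN = FanOutRun(
    batch_spawn_ms=81,
    exec_ms=(232, 304, 546, 1458),
    fanout_wall_ms=1601,
    sequential_wall_ms=1625,
)

if __name__ == "__main__":
    print(f"spawn per worker : {RUN.spawn_per_worker_ms():.2f} ms")
    print(f"critical path    : {RUN.critical_path_ms()} ms")
    print(f"measured fan-out : {RUN.fanout_wall_ms} ms")
    print(f"speedup          : {RUN.speedup():.2f}x")
```

Under this model the floor is 81 + 1458 = 1539 ms, so the measured 1601 ms is within about 62 ms of the best this slicing could achieve; the only way to improve the ratio for this suite is to split the sklearn slice.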

WaylandYang and others added 2 commits June 6, 2026 16:35

WaylandYang marked this pull request as ready for review June 6, 2026 09:14

WaylandYang merged commit c5a255e into main Jun 7, 2026
2 checks passed

WaylandYang deleted the recipes/non-ai-surface branch June 7, 2026 02:36

WaylandYang merged 2 commits into main from recipes/non-ai-surface
