recipes: surface non-AI use cases + ship ci-parallel-pytest #231
Merged
Phase A — reorg recipes/README.md
---------------------------------
The recipes/ directory already had postgres-fixture (DB testing),
playwright-browser (browser farms), and nodejs (generic JS runtime)
— all non-AI use cases. But the top-level README only categorized
by framework/audience under an AI agent lens, so non-AI users had
to drill into individual recipes to discover them.
New "By problem you're solving" table makes it explicit. Same
recipes, surfaced for non-AI audiences:
  AI agent fan-out          (5 recipes)
  CI test parallelism       postgres-fixture + ci-parallel-pytest (new)
  Database test fixtures    postgres-fixture
  Browser automation        playwright-browser
  Notebook / interpreter    jupyter-kernel, e2b-codeinterpreter
  Generic compute           python-numpy, coding-agent, nodejs
The AI agent lens stays prominent (first row, 5 recipes) — this
isn't a rebrand, just a widening of the discovery surface.
Phase B — new recipe: ci-parallel-pytest
----------------------------------------
The pitch: run pytest workers across N forkd microVMs and skip
per-worker container cold-start + dependency import cost. A typical
Python ML CI re-pays ~1.5 s of `import numpy/pandas/sklearn` on
every fresh worker; with forkd, that's in the warmed parent's page
cache and inherited via mmap CoW.
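To check the import cost on your own CI image, a minimal sketch like the one below times a fresh interpreter with and without the heavy imports. It is not part of this PR; the helper name `cold_import_ms` is hypothetical, and the stdlib modules in the example stand in for `numpy`/`pandas`/`sklearn`, which may not be installed where you run it.

```python
import subprocess
import sys
import time


def cold_import_ms(modules: list[str]) -> float:
    """Wall-clock time (ms) for a fresh interpreter that only imports `modules`.

    An empty list measures bare interpreter start-up, which is the baseline
    to subtract from the other measurements.
    """
    code = "import " + ", ".join(modules) if modules else "pass"
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", code], check=True)
    return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    # Replace the stand-ins with ["numpy", "pandas", "sklearn"] on a box
    # that has them installed to see what every fresh worker re-pays.
    baseline = cold_import_ms([])
    with_imports = cold_import_ms(["json", "decimal"])
    print(f"baseline: {baseline:.0f} ms, with imports: {with_imports:.0f} ms")
```

Run it a few times: the first run also pays for reading the files from disk, while later runs are served from the page cache.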
Ships:
  build.sh        Wraps python:3.12-slim + pinned pytest/numpy/
                  pandas/sklearn + the demo test project + a
                  prewarm step, builds the ext4 rootfs.
  test_project/   ~30 representative tests across 5 files
                  (arithmetic, numpy, pandas, sklearn, text) so
                  worker slicing has something meaningful to do.
  demo.py         Slices test files across N workers, spawns one
                  child per slice from the snapshot, runs pytest
                  inside each, reports per-worker spawn/exec
                  timing + total wall-clock + sequential baseline.
  README.md       Story, when-to-use / when-not-to-use, quickstart,
                  GitHub Actions integration snippet, comparison
                  table (sequential / pytest-xdist / docker / forkd).
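The slicing step in demo.py can be pictured with a short sketch. This is an assumption about the strategy, not the recipe's actual code: plain round-robin over sorted file names. The function name `slice_round_robin` is hypothetical; the file names are the five that appear in the verification log in this PR.

```python
def slice_round_robin(files: list[str], n_workers: int) -> list[list[str]]:
    """Deal sorted test files out to workers one at a time.

    Workers that end up with no files are dropped, so asking for more
    workers than files does not spawn idle children.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    slices: list[list[str]] = [[] for _ in range(n_workers)]
    for index, name in enumerate(sorted(files)):
        slices[index % n_workers].append(name)
    return [s for s in slices if s]


if __name__ == "__main__":
    test_files = [
        "test_arithmetic.py",
        "test_numpy_ops.py",
        "test_pandas_etl.py",
        "test_sklearn_models.py",
        "test_text_processing.py",
    ]
    for worker, files in enumerate(slice_round_robin(test_files, 4)):
        print(f"[{worker}] {','.join(files)}")
```

Round-robin ignores test cost, so one expensive file can still dominate the wall-clock; a cost-aware split would need per-file timings from a previous run.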
Numbers in the README quickstart are illustrative (i7-12700 / ext4
projected), with a "replace your tests, re-measure" note. Real
measurement on the dev box will land as a follow-up commit before
PR merge.
Closes the "scope feels narrow" feedback by demonstrating that a
non-AI use case (CI test fan-out) ships cleanly on the same
primitive — no special daemon mode, no new API, just a recipe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end verified on the dev box (Intel i7-12700, ext4):
Plan: 4 worker(s) × pytest slice off `ci-pytest`.
=== fan-out: 4 workers in parallel ===
batch spawn (4 children): 81 ms
[0] PASS exec= 232 ms files=test_arithmetic.py,test_text_processing.py
[1] PASS exec= 304 ms files=test_numpy_ops.py
[2] PASS exec= 546 ms files=test_pandas_etl.py
[3] PASS exec=1458 ms files=test_sklearn_models.py
fan-out wall-clock: 1601 ms (~20 ms/worker spawn)
=== sequential baseline ===
[0] PASS spawn=61 ms exec=1507 ms
sequential wall-clock: 1625 ms (fan-out speedup: 1.01×)
Real-numbers reframing in the README:
The 1.01× fan-out-vs-sequential ratio is honest for THIS suite —
one sklearn slice dominates (1458 ms). Fan-out shines when suites
have many comparable-cost slices. The cross-suite-invariant number
to compare is the **batch spawn cost: 81 ms for 4 children =
~20 ms/worker**, vs ~2-3 s for a fresh container.
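The arithmetic behind that reframing, using only the numbers from the log above, can be written out as a small sketch. The one assumption is the model: fan-out wall-clock is approximated as batch spawn plus the slowest slice, with whatever remains counted as orchestration overhead.

```python
# Measured values copied from the dev-box log (all in milliseconds).
EXEC_MS = [232, 304, 546, 1458]
BATCH_SPAWN_MS = 81
FANOUT_WALL_MS = 1601
SEQUENTIAL_WALL_MS = 1625


def per_worker_spawn_ms(batch_spawn_ms: float, n_workers: int) -> float:
    """Batch spawn cost amortized over the children it created."""
    return batch_spawn_ms / n_workers


def critical_path_ms(batch_spawn_ms: float, exec_ms: list[float]) -> float:
    """Assumed lower bound on fan-out wall-clock: spawn, then the slowest slice."""
    return batch_spawn_ms + max(exec_ms)


if __name__ == "__main__":
    print(per_worker_spawn_ms(BATCH_SPAWN_MS, len(EXEC_MS)))        # 20.25
    bound = critical_path_ms(BATCH_SPAWN_MS, EXEC_MS)
    print(bound)                                                    # 1539
    print(FANOUT_WALL_MS - bound)                                   # 62
    print(f"{SEQUENTIAL_WALL_MS / FANOUT_WALL_MS:.2f}")             # 1.01
```

The sklearn slice alone accounts for 1458 of the 1601 ms, which is why adding workers cannot move the ratio for this suite.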
Two demo.py changes to land it:
1. **Batch spawn via a single POST /v1/sandboxes with n=N** instead
of N concurrent POST calls. Concurrent calls race each other and
can hit Firecracker's (FC's) "cannot /snapshot/load after
InstanceStart" rejection; the daemon's `restore_many` is
purpose-built for the batch case and atomically spawns N children
with per-child netns.
2. **`cd /opt/test_project && pytest …`** in the exec instead of
bare `pytest`. The guest agent's exec runs from `/`; the
Dockerfile's `WORKDIR` isn't honored at exec time, so we have
to switch directories explicitly.
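Both changes can be sketched without touching a running daemon. The sketch below only builds the request and the command string. Everything beyond `POST /v1/sandboxes`, `n`, and `cd /opt/test_project && pytest` is an assumption made for illustration: the base URL, the `snapshot` field name, the JSON body shape, and the idea that the guest agent runs the command through a shell.

```python
import json
import shlex
import urllib.request


def build_batch_spawn_request(base_url: str, snapshot: str, n: int) -> urllib.request.Request:
    """One POST asking the daemon for `n` children, instead of `n` separate POSTs.

    The body layout ({"snapshot": ..., "n": ...}) is hypothetical; check the
    daemon's API reference for the real field names.
    """
    body = json.dumps({"snapshot": snapshot, "n": n}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/sandboxes",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_pytest_command(files: list[str], project_dir: str = "/opt/test_project") -> str:
    """Shell command that changes directory first, because exec starts in `/`."""
    return f"cd {shlex.quote(project_dir)} && pytest {shlex.join(files)}"


if __name__ == "__main__":
    # The address is a placeholder, not the daemon's documented default.
    request = build_batch_spawn_request("http://127.0.0.1:8080", "ci-pytest", 4)
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
    print(build_pytest_command(["test_numpy_ops.py"]))
```

Quoting the directory and file names with `shlex` keeps the command safe if a test path ever contains spaces.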
Comparison table in README updated to reflect actual measured costs
(no more `~80 ms spawn` per worker; the real number is the batch
total, not per-worker). The break-even framing is also clarified:
forkd wins when a per-worker test slice is shorter than the
~3 s container cold-start tax, which is the case for most ML /
data-science CI suites.
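One way to read the break-even framing is as the share of a worker's time spent on fixed start-up cost. The sketch below is illustrative only: it uses the ~3 s container figure and the measured 81 ms batch spawn from this PR, and the helper name `fixed_cost_share` is hypothetical.

```python
CONTAINER_COLD_START_MS = 3000  # "~3 s" figure quoted in this PR, not measured here
FORKD_BATCH_SPAWN_MS = 81       # measured batch spawn for 4 children


def fixed_cost_share(fixed_ms: float, slice_ms: float) -> float:
    """Fraction of a worker's total time that is start-up rather than tests."""
    return fixed_ms / (fixed_ms + slice_ms)


if __name__ == "__main__":
    # Slice durations taken from the fan-out log in this PR.
    for slice_ms in (232, 304, 546, 1458):
        container = fixed_cost_share(CONTAINER_COLD_START_MS, slice_ms)
        forkd = fixed_cost_share(FORKD_BATCH_SPAWN_MS, slice_ms)
        print(f"slice={slice_ms:>4} ms  container={container:.0%}  forkd={forkd:.0%}")
```

A slice shorter than the cold-start cost means more than half of that worker's time is overhead, which is the break-even point the README describes.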
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase A — recipes/README.md reorg
The recipes/ directory already had non-AI use cases (postgres-fixture for DB testing, playwright-browser for browser farms, nodejs for generic JS) — but the top-level README only categorized by AI agent framework. New "By problem you're solving" table makes the breadth explicit:
The AI agent lens stays prominent (first row) — this isn't a rebrand, just a widening of the discovery surface.
Phase B — new recipe `ci-parallel-pytest`
```
       ┌──────────────────────────────────────┐
       │ parent snapshot ci-pytest            │
       │ python:3.12-slim + pytest + numpy    │
       │ + pandas + sklearn + your tests      │
       │ (heavy imports already paid)         │
       └──────────────────┬───────────────────┘
                          │ mmap MAP_PRIVATE (CoW)
┌────────────┬────────────┼────────────┬────────────┐
│ worker 1   │ worker 2   │ worker 3   │ worker N   │
│ pytest     │ pytest     │ pytest     │ pytest     │
│ slice 1    │ slice 2    │ slice 3    │ slice N    │
└────────────┴────────────┴────────────┴────────────┘
```
A typical Python ML CI re-pays ~1.5 s of `import numpy/pandas/sklearn` on every fresh worker container. With forkd, those imports live in the warmed parent's snapshot; every fork inherits them via mmap CoW. Per-worker fixed cost drops from ~3.5 s (container cold-start + imports) to ~80 ms (forkd spawn) + 0 ms (warmed imports).
Ships:
Numbers
The README quickstart shows projected numbers (i7-12700 / ext4):
Real dev-box measurement is the follow-up commit before merge — keeping this PR as draft until those land.
Test plan
🤖 Generated with Claude Code