The patch engine for code agents, in pure Python. Apply unified diffs
and fuzzy search/replace edits with no git, no patch binary, no C
extension — in sandboxes, Pyodide/WASM, Lambda, anywhere pip install
works. And because it runs in-process, it applies a patch in ~25 µs where
spawning a binary costs milliseconds.

    pip install purepatch

    import purepatch
    new_text = purepatch.apply(diff_text, old_text)         # unified diff -> text
    report = purepatch.apply_files(diff_text, root=".")     # multi-file patch
    new_text = purepatch.apply_edit(text, search, replace)  # fuzzy block edit

    purepatch --dry-run < change.patch   # the familiar CLI, agent-friendly
    purepatch -R < change.patch          # un-apply

LLMs edit code by emitting unified diffs and SEARCH/REPLACE blocks, and both arrive slightly wrong: line numbers drifted, context rotted, indentation moved, trailing whitespace differs. The existing Python options either only parse diffs (unidiff) or are long abandoned (python-patch, last release 2019). So every agent framework re-implements patching, badly, or shells out to git.

purepatch is that missing engine:
- GNU patch semantics for unified diffs: cumulative offset tracking, bidirectional position search, fuzz degradation — verified against the real thing (below).
- A fuzzy edit ladder for LLM edit blocks: exact match → trailing whitespace tolerance → indentation transplant (the block the model wrote at top level gets re-indented to where it actually lives). Refuses to guess on ambiguity.
- Errors an agent can act on: failed matches report the closest near-miss (`closest match: line 41, 87% similar`) so the model can correct its edit instead of retrying blind.
- Git extended headers understood: new/deleted files, renames, quoted paths, `\ No newline at end of file`, CRLF content.
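
To make the edit ladder concrete, here is a minimal sketch of the three rungs (exact, trailing whitespace, indentation transplant) using only the standard library. It is an illustration under stated assumptions, not purepatch's implementation: `apply_edit_sketch` and its helpers are hypothetical names, and the real matching rules may differ.

```python
import os
import textwrap


def _dedent(block):
    """Normalize a block: drop trailing whitespace and the common indent."""
    return textwrap.dedent("\n".join(s.rstrip() for s in block)).split("\n")


def _find_all(lines, block, norm):
    """Every start index where `block` equals a window of `lines` under norm()."""
    n = len(block)
    want = norm(block)
    return [i for i in range(len(lines) - n + 1)
            if norm(lines[i:i + n]) == want]


def apply_edit_sketch(content, search, replace):
    """Hypothetical ladder: try strict matching first, then loosen step by step.

    Returns (new_content, rung_name). Raises ValueError when a rung matches
    more than once (refuse to guess) and LookupError when nothing matches.
    """
    lines = content.split("\n")
    block = search.strip("\n").split("\n")
    new = replace.strip("\n").split("\n")
    rungs = [
        ("exact", list),
        ("trailing-ws", lambda b: [s.rstrip() for s in b]),
        ("indent", _dedent),
    ]
    for name, norm in rungs:
        hits = _find_all(lines, block, norm)
        if len(hits) > 1:
            where = [h + 1 for h in hits]
            raise ValueError(f"ambiguous {name} match at lines {where}")
        if len(hits) == 1:
            start = hits[0]
            if name == "indent":
                # Transplant: re-indent the replacement to the indentation
                # the matched block actually has in the file.
                window = lines[start:start + len(block)]
                margin = os.path.commonprefix(
                    [w[:len(w) - len(w.lstrip())] for w in window if w.strip()]
                )
                new = textwrap.indent(
                    textwrap.dedent("\n".join(new)), margin
                ).split("\n")
            lines[start:start + len(block)] = new
            return "\n".join(lines), name
    raise LookupError("no match")
```

A block written at top level (`if x:` / `    return 1`) would match the same code nested inside a function on the third rung, and the replacement would be re-indented to the nested position. Because each rung refuses when it finds more than one hit, a looser rung never silently picks among candidates.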
Following the pure* series methodology: behavior is checked by differential testing against the reference implementations, run in CI on every commit:
- 500 random clean patches: purepatch ≡ GNU patch ≡ git apply ≡ expected output, byte for byte;
- 200 drift scenarios (the file gained unrelated lines): offset behavior matches GNU patch exactly;
- 200 rotted-context scenarios: fuzz behavior matches GNU patch's output wherever GNU patch succeeds;
- 300 property cases: `apply(diff(a,b), a) == b` and `apply(diff(a,b), b, reverse=True) == a`.
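
The drift scenarios above exercise offset tracking and position search. The following is a rough sketch of that idea, assuming exact context matching only (no fuzz). `find_hunk` and `place_hunks` are hypothetical names, and the forward-then-backward probing order is an assumption made for illustration, not a statement about purepatch's or GNU patch's internals.

```python
def find_hunk(lines, context, expected):
    """Return the index nearest to `expected` where `context` occurs, or None.

    Probes the expected position first, then widens the search one line at
    a time in both directions, so the closest occurrence wins.
    """
    n = len(context)
    last = len(lines) - n          # highest valid start index
    if last < 0:
        return None
    expected = max(0, min(expected, last))
    for dist in range(last + 2):
        probes = (expected,) if dist == 0 else (expected + dist, expected - dist)
        for pos in probes:
            if 0 <= pos <= last and lines[pos:pos + n] == context:
                return pos
    return None


def place_hunks(lines, hunks):
    """Place each (old_start_index, context_lines) hunk; return found indices.

    The offset discovered for one hunk is carried into the next guess:
    lines inserted earlier in the file usually shift later hunks equally.
    """
    offset = 0
    placed = []
    for old_start, context in hunks:
        pos = find_hunk(lines, context, old_start + offset)
        if pos is None:
            raise LookupError(f"hunk expected at line {old_start + 1} not found")
        offset = pos - old_start   # cumulative offset for the next hunk
        placed.append(pos)
    return placed
```

With three unrelated lines inserted at the top of a file, the first hunk is found three lines below its recorded position after a short outward search, and the second hunk is then found on the first probe because the offset of 3 has already been applied.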
Per-application latency — how a code agent actually uses a patcher: one
patch at a time. Spawn cost is the binaries' real cost; in-process is
purepatch's real cost. Median of 7, three independent rounds (spread
<10%), outputs verified equal before timing. Reproduce:
`python tools/bench.py --verify`.

| workload | purepatch (in-process) | GNU patch (spawn) | git apply (spawn) |
|---|---|---|---|
| 200-line file, 5 edits | 0.021 ms | 2.6 ms (~120×) | 7.8 ms (~370×) |
| 2k-line file, 30 edits | 0.17 ms | 2.9 ms (17×) | 8.8 ms (52×) |
| 20k-line file, 200 edits | 1.4 ms | 6.0 ms (4.3×) | 16.2 ms (12×) |
Fuzzy apply_edit on a 400-line file: ~0.01 ms per call.
The slow paths are engineered too, because an agent waits on them: hunk placement uses tiered C-speed anchor scans (a patch drifted by 500 lines in a 20k-line file still applies in ~1.5 ms), and a failed fuzzy edit produces its closest-match diagnostics on a 10k-line file in ~10 ms (two-pass candidate scoring; 21× faster than naively diffing every window).
An agent loop applying hundreds of edits per session pays milliseconds total, not seconds — and needs no git in its sandbox.
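
The closest-match diagnostics described above can be approximated with a two-pass search. This sketch is a stand-in built on the standard library's `difflib`, not purepatch's actual scoring: pass 1 computes a cheap upper bound (`quick_ratio`) for every window, and pass 2 computes the exact `ratio` only for windows whose bound could still beat the best score so far. The name `closest_match` and the message format in the usage note are assumptions for illustration.

```python
import difflib


def closest_match(content, search):
    """Return (1-based line number, similarity in 0..1) of the nearest window.

    Returns None when the file is shorter than the search block.
    """
    lines = content.split("\n")
    block = search.strip("\n").split("\n")
    n = len(block)
    if len(lines) < n:
        return None

    sm = difflib.SequenceMatcher(None, autojunk=False)
    sm.set_seq2("\n".join(block))   # difflib caches data about seq2

    def window(start):
        return "\n".join(lines[start:start + n])

    # Pass 1: cheap upper bound on similarity for every window.
    bounds = []
    for start in range(len(lines) - n + 1):
        sm.set_seq1(window(start))
        bounds.append((sm.quick_ratio(), start))
    bounds.sort(key=lambda t: (-t[0], t[1]))

    # Pass 2: exact similarity, most promising windows first. Because
    # quick_ratio() never underestimates ratio(), we can stop as soon as
    # the next bound cannot beat the best exact score found so far.
    best_ratio, best_start = -1.0, 0
    for bound, start in bounds:
        if bound <= best_ratio:
            break
        sm.set_seq1(window(start))
        ratio = sm.ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, start
    return best_start + 1, best_ratio
```

A caller could then format the result along the lines of `f"closest match: line {line}, {ratio:.0%} similar"`. The early exit is what keeps the expensive comparison off most windows; how much that saves depends on how well the cheap bound separates candidates.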

    purepatch.parse(text) -> PatchSet                       # inspect hunks/files
    purepatch.apply(patch, source, reverse=False, max_fuzz=2) -> str
    purepatch.apply_files(patch, root=".", strip=None,      # strip auto-detected
                          reverse=False, dry_run=False) -> ApplyReport
    purepatch.apply_edit(content, search, replace) -> str
    purepatch.find_block(content, search) -> (start, end, strategy)

`ApplyReport.ok`, per-file actions (patched/created/deleted/renamed/failed), and per-hunk offset/fuzz are all inspectable; log them and an agent can explain exactly what happened.
Exceptions: ParseError, HunkApplyError, NoMatchError (with
closest_line / closest_similarity), AmbiguousMatchError (with all
locations).
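
One way an agent loop might use the near-miss fields is to turn them into a retry hint for the model. The helper below is a hypothetical sketch: it reads only the `closest_line` and `closest_similarity` attributes named above, and it assumes `closest_similarity` is a fraction between 0 and 1, which the text here does not confirm.

```python
def retry_hint(exc):
    """Build a short message an agent can feed back to the model.

    Works on any exception object; falls back to the plain error text
    when the near-miss attributes are absent.
    """
    line = getattr(exc, "closest_line", None)
    similarity = getattr(exc, "closest_similarity", None)
    if line is None or similarity is None:
        return f"Edit failed: {exc}"
    # Assumption: similarity is a 0..1 fraction (not a percentage).
    return (
        f"Edit failed: SEARCH block not found. "
        f"Closest text starts at line {line} ({similarity:.0%} similar). "
        f"Re-read that region and resend the edit."
    )


# Intended use (assumes the exception is importable as purepatch.NoMatchError):
#
#     try:
#         new_text = purepatch.apply_edit(text, search, replace)
#     except purepatch.NoMatchError as exc:
#         prompt = retry_hint(exc)
```

Reading the attributes with `getattr` keeps the helper usable for the other exception types, which simply fall through to the generic message.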
- Binary patches are rejected, not applied.
- File modes are parsed from git headers but not applied to the filesystem (chmod is on the roadmap).
- `purepatch` the CLI covers the agent subset (`-p -d -R --fuzz --dry-run`), not every GNU patch flag.
- Like GNU patch, fuzzy hunk placement can in principle pick a wrong spot in pathological inputs; `--fuzz 0` disables tolerance entirely.