docs(claude-md): differential-test-first + bump-and-yank recipe #99
Merged
Two development directions promoted from the v0.2.3 session:
1. **Differential Test First**: when lattice diverges from MLX/HF/llama.cpp, write a 20-line Python script comparing the same primitive across both frameworks BEFORE reading lattice code or spawning agents. This closed a 0.77 PPL gap (Qwen3.5-0.8B, WikiText-2) in 5 seconds that had been misdiagnosed as "FP precision drift" for days. The actual bug was the RoPE pairing convention (interleaved vs stride-half).
   - Quantitative literature bounds cheaply reject hypotheses: f16-vs-f32 PPL <0.01, bf16-vs-f32 <0.05; gaps above those bounds are structural, not numerical.
   - Be skeptical of comments that paraphrase config fields without explaining what the field actually controls in the reference impl.
2. **Bump-and-yank recovery**: crates.io is immutable. When a published release has a correctness bug, bump to the next patch, ship the fix, and yank the broken version. Done in v0.2.3 (yanked 0.2.2, which shipped with the RoPE bug).
Plus: corrected the stale "version = 0.1.0" pin in the publish section to match the current workspace-version convention.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Promotes two development directions from the v0.2.3 release session into project guidance.
1. Differential Test First
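When lattice diverges from MLX / HF transformers / llama.cpp, write a 20-line Python script comparing the same primitive in both frameworks before reading lattice code or spawning investigation agents. The 0.77 PPL gap on Qwen3.5-0.8B (WikiText-2) was misdiagnosed as "FP precision drift" for days; the actual cause was a RoPE pairing convention bug (interleaved vs stride-half), identified in 5 seconds by a script comparing MLX `nn.RoPE(traditional=False)` against both candidates.
Includes:
- Quantitative literature bounds that cheaply reject hypotheses: f16-vs-f32 PPL <0.01, bf16-vs-f32 <0.05; gaps above those bounds are structural, not numerical.
- A reminder to be skeptical of comments that paraphrase config fields without explaining what the field actually controls in the reference implementation.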
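
To illustrate the kind of differential script described above, here is a minimal sketch that compares the two RoPE pairing conventions on the same input. It is not the script from the session: it uses NumPy only, the function names (`rope_interleaved`, `rope_stride_half`, `max_abs_diff`) are hypothetical, and the "reference" output is a stand-in that in practice would come from the reference framework's own primitive (for example MLX `nn.RoPE`).

```python
import numpy as np


def rope_angles(seq_len, head_dim, base=10000.0):
    """Rotation angles of shape (seq_len, head_dim // 2)."""
    inv_freq = base ** (-np.arange(0, head_dim, 2) / head_dim)
    positions = np.arange(seq_len)[:, None]
    return positions * inv_freq[None, :]


def rope_interleaved(x, base=10000.0):
    """Rotate consecutive pairs (x[2i], x[2i+1])."""
    seq_len, dim = x.shape
    theta = rope_angles(seq_len, dim, base)
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out


def rope_stride_half(x, base=10000.0):
    """Rotate pairs (x[i], x[i + dim // 2])."""
    seq_len, dim = x.shape
    theta = rope_angles(seq_len, dim, base)
    cos, sin = np.cos(theta), np.sin(theta)
    half = dim // 2
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)


def max_abs_diff(a, b):
    return float(np.max(np.abs(a - b)))


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))  # (seq_len, head_dim), arbitrary test input

# Stand-in for the reference framework's output (an assumption for this sketch).
reference = rope_stride_half(x)

candidates = {
    "interleaved": rope_interleaved(x),
    "stride_half": rope_stride_half(x),
}
report = {name: max_abs_diff(out, reference) for name, out in candidates.items()}
for name, diff in report.items():
    print(f"{name:12s} max|diff| = {diff:.3e}")
```

A candidate that matches the reference shows a difference at or near zero, while the wrong pairing convention shows a difference orders of magnitude above any floating-point tolerance, which is what separates a structural bug from numerical drift.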
2. Bump-and-yank recipe
crates.io versions are immutable. When a shipped release has a correctness bug, the right pattern is bump + ship fix + yank broken. Done in v0.2.3 (yanked 0.2.2 across all 5 crates). Adds the explicit recipe to the Publishing section.
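
The bump, ship, yank sequence can be sketched as a small dry-run planner. This is an illustration only, not the recipe text added to the Publishing section: the helper names and the crate names in the usage lines are hypothetical, and it assumes the crates are listed in dependency order and that the version bump and fix are committed before publishing. The commands it prints use `cargo publish -p` and `cargo yank --version`.

```python
def bump_patch(version):
    """Return the next patch version, e.g. 0.2.2 -> 0.2.3."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def bump_and_yank_plan(crates, broken_version):
    """Build the ordered command list: publish the fix first, then yank."""
    fixed_version = bump_patch(broken_version)
    plan = [f"# bump workspace version to {fixed_version} and commit the fix"]
    # Publish every crate at the fixed version (assumes dependency order).
    plan += [f"cargo publish -p {crate}" for crate in crates]
    # Only after the fix is live, yank the broken version of each crate.
    plan += [f"cargo yank --version {broken_version} {crate}" for crate in crates]
    return plan


# Hypothetical crate names, for illustration only.
plan = bump_and_yank_plan(["crate-a", "crate-b"], "0.2.2")
for step in plan:
    print(step)
```

Publishing before yanking means users resolving the dependency always have a non-yanked version available.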
Also corrected the stale `version = "0.1.0"` pin in the Publishing section to match the current convention (path deps bump in lockstep with the workspace version).
Test plan
`make ci` doc lint passed (pre-commit hook)
🤖 Generated with Claude Code