84 commits
cccbe67
Non-record writeup: notes on the recurrence band
leon2k2k2k May 1, 2026
0376b4e
chore: consolidate unsaved research notes + run from worktrees before…
leon2k2k2k May 2, 2026
8943d54
fix: correct author handle for PR #2050 — @someone114514 → @AidenGeun…
leon2k2k2k May 2, 2026
3e3f5d3
writeup: add blog post outline (5-part structure)
leon2k2k2k May 4, 2026
717546c
writeup: fix technique descriptions after source cross-check
leon2k2k2k May 4, 2026
dc74ccb
writeup: fix parallel residuals layer (7→8), add three-expert n-gram …
leon2k2k2k May 4, 2026
3575bd7
writeup: respond to two outline comments
leon2k2k2k May 4, 2026
c5e4d7c
writeup: rewrite Part 0 — competition setup, scoring, C1-C4
leon2k2k2k May 4, 2026
99b75d4
writeup: draft Part 1 baseline section
leon2k2k2k May 4, 2026
76f979e
writeup: add tokenizer and model architecture sections to Part 1
leon2k2k2k May 4, 2026
cec9e59
writeup: add training, quantization, and TTT sections to Part 1
leon2k2k2k May 4, 2026
62b2ac3
writeup: add closing paragraph to Part 1 final model section
leon2k2k2k May 5, 2026
407830f
writeup: add Part 2 disqualifications — n-gram tilt and PPM-D sections
leon2k2k2k May 5, 2026
ac24777
writeup: fact-check corrections across Part 1 and Part 2
leon2k2k2k May 5, 2026
94d1d09
writeup: fix internal inconsistency in global SGD doc count
leon2k2k2k May 5, 2026
1fe9390
writeup: add CaseOps leak section and closing Looking Back paragraph
leon2k2k2k May 5, 2026
b43a4ea
writeup: tighten closing paragraph to single sentence
leon2k2k2k May 5, 2026
ef98897
writeup: fix "picture-book" → "picture-perfect" finish
leon2k2k2k May 5, 2026
8ebcc08
writeup: rename "The Final Model" section to "The Evolution"
leon2k2k2k May 5, 2026
6f6ba33
writeup: promote "Drama on the Last Day" to top-level section, fold i…
leon2k2k2k May 5, 2026
1bf836c
writeup: remove all em-dashes throughout draft
leon2k2k2k May 5, 2026
3633339
writeup: rename 060A Research Model to Near-SOTA Model in table
leon2k2k2k May 5, 2026
bf0b915
writeup: add "Let's break it down" transition in baseline section
leon2k2k2k May 5, 2026
25aa472
writeup: tighten baseline quantization paragraph
leon2k2k2k May 5, 2026
b7ca759
writeup: reformat model architecture section as bullet points
leon2k2k2k May 5, 2026
0258c80
writeup: reformat training, quantization, TTT sections as bullet points
leon2k2k2k May 5, 2026
d6d1b9d
writeup: add v2 outline and draft skeleton with hook
leon2k2k2k May 5, 2026
a0fcfd2
writeup: fix title and remove section 0 header
leon2k2k2k May 5, 2026
bb1815a
writeup: fix byte count and scoring description in hook
leon2k2k2k May 5, 2026
1d54375
writeup: trim hook sentence
leon2k2k2k May 5, 2026
e0693ce
writeup: rewrite hook second paragraph
leon2k2k2k May 5, 2026
7192dfa
writeup: add model comparison tables to top of v2
leon2k2k2k May 5, 2026
e687784
writeup: add closing sentence to hook paragraph
leon2k2k2k May 5, 2026
a965a06
Update draft_v2.md
leon2k2k2k May 5, 2026
214b44f
writeup: draft section 1 - the competition
leon2k2k2k May 5, 2026
c82e267
writeup: expand section 1 with probability distribution explanation a…
leon2k2k2k May 5, 2026
dd9fe34
Update draft_v2.md
leon2k2k2k May 5, 2026
f42ab0f
writeup: expand section 1 with per-token cost and uniform baseline co…
leon2k2k2k May 5, 2026
f19fdd6
writeup: draft section 2 - model evolution
leon2k2k2k May 5, 2026
7532ef6
writeup: clarify block structure and add block chain diagram
leon2k2k2k May 5, 2026
be8d300
writeup: add attention/MLP role description
leon2k2k2k May 5, 2026
517d6de
writeup: add U-Net skip connection code block to baseline
leon2k2k2k May 5, 2026
7834905
writeup: restore baseline-to-evolution transition from v1
leon2k2k2k May 5, 2026
7e328c9
writeup: add PR number to depth recurrence, add parallel residuals se…
leon2k2k2k May 5, 2026
597bb87
writeup: remove redundant layers/parameters table
leon2k2k2k May 5, 2026
5b1382d
Update draft_v2.md
leon2k2k2k May 5, 2026
fe493f7
writeup: rewrite training closing as choreographed dance
leon2k2k2k May 5, 2026
29f3879
writeup: add quantization intro paragraph from v1
leon2k2k2k May 5, 2026
ec7dae2
writeup: restore v1 baseline prose, keep v2 U-Net code block
leon2k2k2k May 5, 2026
fb5aad5
writeup: tighten TTT opening sentence
leon2k2k2k May 5, 2026
d9eb241
writeup: add TTT two-step transition sentence
leon2k2k2k May 5, 2026
14b8455
writeup: remove incorrect TTT row from other changes table
leon2k2k2k May 5, 2026
6a7f20e
writeup: add intro sentence to Other Changes table
leon2k2k2k May 5, 2026
754f8c5
writeup: remove redundant transition sentence before section 3
leon2k2k2k May 5, 2026
4dcc044
writeup: add transition line into section 3
leon2k2k2k May 5, 2026
12969bc
writeup: remove transition line before section 3
leon2k2k2k May 5, 2026
59a88c0
writeup: soften other changes intro
leon2k2k2k May 5, 2026
012fbe7
writeup: draft section 3 hook
leon2k2k2k May 5, 2026
d445053
writeup: add closing line to section 2
leon2k2k2k May 5, 2026
55a90b0
writeup: add ablations mention to section 2 closing line
leon2k2k2k May 5, 2026
2f177bf
writeup: draft section 3 - too good to be true
leon2k2k2k May 5, 2026
4673875
writeup: connect hook sentences with while
leon2k2k2k May 5, 2026
835ecda
writeup: restructure n-gram section, move C1-C4 footnote, improve exa…
leon2k2k2k May 5, 2026
237f935
Refine explanation of token prediction and PR #1514
leon2k2k2k May 5, 2026
a1219ba
writeup: rewrite PPM-D section with flashier hook and Russian novel a…
leon2k2k2k May 5, 2026
9bb020e
writeup: restructure PPM-D section, tighten C2 explanation, defer PR …
leon2k2k2k May 5, 2026
a37cf00
writeup: add expected entropy footnote to lesson section
leon2k2k2k May 5, 2026
8d2029d
writeup: implement user's PPM-D edits from GitHub
leon2k2k2k May 5, 2026
c649d91
Update draft_v2.md
leon2k2k2k May 5, 2026
c78613b
writeup: draft section 4 - drama on the last day
leon2k2k2k May 5, 2026
8fc21c5
writeup: add hook line to section 4, clarify PR #2014, add gap values
leon2k2k2k May 5, 2026
55971bd
writeup: rewrite data leak discovery with cleaner explanation and exa…
leon2k2k2k May 5, 2026
5834765
Update draft_v2.md
leon2k2k2k May 5, 2026
0cbea32
writeup: rephrase leaky PRs paragraph, change fifteen to many
leon2k2k2k May 5, 2026
b8d4051
writeup: simplify the ironic detail in section 4
leon2k2k2k May 5, 2026
3e7e24e
writeup: rewrite section 4 closing with proper drama arc
leon2k2k2k May 5, 2026
46ffdc3
writeup: fix timing of leaky PRs - shortly after not final hours
leon2k2k2k May 5, 2026
5f7e97d
Update draft_v2.md
leon2k2k2k May 5, 2026
754522d
writeup: add closing paragraph
leon2k2k2k May 5, 2026
e00c3ab
writeup: remove all em-dashes from draft_v2
leon2k2k2k May 5, 2026
833c99a
writeup: fix PR #1344 loop attribution and GPTQ scope description
leon2k2k2k May 5, 2026
e128626
writeup: fix 8 incorrect PR attributions after leaderboard cross-check
leon2k2k2k May 5, 2026
15dbf5e
writeup: revert 4 PR attributions to true first appearance (pre-leade…
leon2k2k2k May 5, 2026
76b45f3
writeup: fix GPTQ (#374→#535) and SmearGate (#162→#65) after direct P…
leon2k2k2k May 5, 2026
180 changes: 180 additions & 0 deletions caseops-memory-leakage/family-tree.md
@@ -0,0 +1,180 @@
# CaseOps records — family tree with leak/clean annotations

**Updated 2026-05-02 with strict re-audit applied** (see `verdicts.md` for criteria).

Legend: `[C]` = CLEAN (val docs not in train), `[L]` = LEAK (val docs in train), `[?]` = AMBIGUOUS (cannot resolve from PR artifacts alone), `[INHERIT]` = data verdict inherited from the record whose artifacts the PR reuses.

## Tree 1 — Merged trunk (linear ancestry)

```
#1493 [pre-CaseOps boundary, clean by lineage]
#1626 [pre-CaseOps boundary, VarLen, clean by lineage]
↓ ← BOUNDARY: pre-CaseOps to CaseOps
#1729 [C] @romeerp bpb=1.0678 (Apr 18)
│ — first CaseOps record; cached_challenge_fineweb.py from romeerp/parameter-golf-caseops-v1
↓ ←== LEAK INTRODUCED HERE ==
#1736 [L] @dexhunter bpb=1.06549 (Apr 19)
│ — first prepare_caseops_data.py default; train docs 10k+, val docs 0–49,999
│ — OUR CURRENT RESEARCH BASELINE
#1769 [L] @dexhunter bpb=1.06453 (Apr 22)
│ — +MLPClip12; same prep
#1787 [L] @nprime06 bpb=1.06335 (Apr 23)
│ — +Polar Express NS, MIN_LR, SparseAttnGate, FusedCE; same prep
├──→ #1797 [L] @dexhunter bpb=1.06157 (Apr 25) — +SmearGate +LQER int4
│ │
│ ↓ ←== LEAK FIXED HERE ==
│ #1851 [C] @aquariouseworkman bpb=1.06128 (Apr 27)
│ │ — +SmearGate BOS-fix; SWITCHED to /dev/shm/pgolf_data (HF subset, 39 shards)
│ │ — current merged-leaderboard SOTA leader
│ │
│ ├──→ #1855 [L] @codemath3000 bpb=1.06108 (Apr 27)
│ │ — 9-hparam stack; LEAK RE-INTRODUCED — author rebuilt locally with default --val-docs
│ │ — DATASET_AUDIT.md (PR #2018) verified --val-docs=10000 byte-for-byte
│ │
│ └──→ #1868 [C] @Christopher-Lee-McClendon bpb=1.06141 (Apr 29)
│ — 3-seed reproduction of #1851; STAYED on HF dataset
│ — LATEST clean merged record
```
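
The size of the leak introduced at #1736 follows from the two index ranges in its annotation. A minimal sketch, assuming "train docs 10k+" means training starts at document index 10,000 and "val docs 0–49,999" means evaluation reads the first 50,000 documents; `TOTAL_DOCS` and the function name are hypothetical and only serve the illustration:

```python
# Illustrative only: document indices as described in the #1736 annotation.
DEFAULT_VAL_DOCS = 10_000      # default --val-docs: training starts at this doc index
VAL_DOCS_EVALUATED = 50_000    # evaluation reads val docs 0..49,999
TOTAL_DOCS = 1_000_000         # hypothetical corpus size; only needs to be >= 50,000


def leaked_docs(train_start: int, val_end: int, total_docs: int) -> range:
    """Return the doc indices present in both the train split and the evaluated val split."""
    train = range(train_start, total_docs)
    val = range(0, val_end)
    lo = max(train.start, val.start)
    hi = min(train.stop, val.stop)
    return range(lo, max(lo, hi))  # empty range when the splits do not overlap


overlap = leaked_docs(DEFAULT_VAL_DOCS, VAL_DOCS_EVALUATED, TOTAL_DOCS)
print(len(overlap), overlap.start, overlap.stop - 1)  # 40000 10000 49999

# A split whose training docs start where the evaluated val docs end has no overlap.
print(len(leaked_docs(VAL_DOCS_EVALUATED, VAL_DOCS_EVALUATED, TOTAL_DOCS)))  # 0
```

Under that reading, 40,000 of the 50,000 evaluated validation documents would also be training documents.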

## Tree 2 — Unmerged frontier branches off #1855

#1855 became the dominant fork point for the unmerged frontier. Most descendants inherited the leaky local prep workflow.

```
#1855 [L] @codemath3000 bpb=1.06108
├──→ #1908 [C] @romeerp bpb=1.06081 — README explicit HF source; +AWQ-lite GPTQ
├──→ #1923 [L] @jorge-asenjo bpb=1.05971 — +AsymLogit +AWQ-lite; ORIGINAL val=9.66M (default --val-docs=10000), val-only re-pulled from HF after corruption; train still doc 10k+ → leak
├──→ #1945 [C] ← *flipped from [L] in re-audit* @alertcat bpb=1.05943
│ │ — finalize_v18.sh has `snapshot_download(repo_id='romeerp/parameter-golf-caseops-v1', local_dir='/workspace/caseops_data')`
│ │ — README's prepare_caseops_data.py "Data setup" is stale — actual run used HF
│ │ — IF this is correct, #1945 at 1.05943 is a clean-frontier candidate
│ │
│ ├──→ #1953 [?] ← *downgraded from [L] in re-audit* @andrewbaggio1 bpb=1.05855
│ │ │ — V21 + TTT tweaks. PR ships only train_gpt.py + logs. No prep evidence.
│ │ │ — Path matches HF target. Parent #1945 confirmed HF. **Lean CLEAN.**
│ │ │
│ ├──→ #1967 [L] @ndokutovich bpb=1.05851 — V21 + LeakyReLU 0.3 + N-gram Tilt
│ │ │ — setup.sh invokes prepare_caseops_data.py default; ALSO has within/word boundary_lut C1 leak
│ │ │
│ │ └──→ #2018 [L] Simon Marcus bpb=1.04722 (Apr 30)
│ │ │ — multi-parent (#1945, #1967, #1953, #1855); +Gated XSA, LQER top-1, AsymLogit, n-gram tilt
│ │ │ — DATASET_AUDIT.md is gold-standard leak documentation
│ │ │ — note: parent #1945 is CLEAN but #2018 audit explicitly proves LEAK construction
│ │ │
│ │ ├──→ #2118 [L] @aquariouseworkman bpb=1.04350 (May 1)
│ │ │ — CURRENT FRONTIER (claimed); submission.json: "--val-docs=10000 train shards + 50k val eval"
│ │ │ — same author who shipped clean #1851 four days earlier (Apr 27)
│ │ │
│ │ └──→ #2041 [?] ← *downgraded from [L] in re-audit* @jorge-asenjo bpb=1.05692
│ │ — No prep invocation in PR; double-nested path, ambiguous
│ │
│ └──→ #2014 [L] @simonbissonnette bpb=1.05759
│ │ — "uses same shards as PR #1855"; /dev/shm/pgolf_caseops_data_80_l17_final
│ │
│ └──→ #2078 [L] @hi-aduek bpb=1.05804 — #2014 reproduction
├──→ #2007 [L] @Elubrazione bpb=1.05899 — LongCtx + NoQV; triple nesting + ships prep
│ │
│ └──→ #2060 [L] @S0urC10ud bpb=1.05792 — 5-knob retune
│ │
│ └──→ #2100 [L] @someone114514 bpb=1.05807 — LongCtx + No-QV + Prefix3500
├──→ #2019 [C] @aquariouseworkman bpb=1.05847 — README explicit: snapshot_download from HF
├──→ #2031 [C] @deborahnelson8788726 bpb=1.05985 — README explicit: 39 train shards from HF
├──→ #2068 [C] @jayaram1125 bpb=1.06172 (parent #1797) — cached_challenge_fineweb.py from HF
├──→ #2071 [L] @jamesEmerson112 bpb=1.0066 (claimed) (parent #1851)
│ — SEPARATE LEAK: symlink-leak (audit-flagged); SP8192 path symlinked to CaseOps shards
├──→ #2075 [?] ← *downgraded from [L] in re-audit* @deusexnatura — PairGeom-V; ships prep but no explicit invocation
├──→ #2101 [L] @OnlyJundong bpb=1.05845 — AWQ-lite + AsymLogit + GradCentral; ships prep
│ │
│ └──→ #2117 [L] @JulianTang2027 — 3-seed reproduction of #2101
├──→ #2109 [L] @izlley bpb=1.05917 — MP3 marker-pair fusion (CUSTOM dataset variant); val_tokens=36.56M
├──→ #2121 [L] @Kbediako bpb=1.06099 — StageB v2; ships prep
├──→ #2123 [L] @vaibhavmishra1 bpb=1.05933 — closed; superseded by #2124
└──→ #2124 [L] @vaibhavmishra1 bpb=1.05933 — resubmission of #2123
```
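
The node lines in these trees share a regular shape (`#PR [verdict] ... bpb=...`), so they can be tallied mechanically. The following is a rough sketch and not part of the audit tooling; the regexes and the `parse_node` helper are hypothetical and assume only the notation defined in the legend:

```python
import re

# A node is "#<PR number> [<verdict>]"; a bare "#2018" inside a comment does not match.
NODE_RE = re.compile(r"#(?P<pr>\d+)\s+\[(?P<verdict>C|L|\?)\]")
BPB_RE = re.compile(r"bpb=(?P<bpb>\d+\.\d+)")


def parse_node(line):
    """Extract PR number, verdict, and bpb (if any) from one tree line, else None."""
    node = NODE_RE.search(line)
    if node is None:
        return None
    bpb = BPB_RE.search(line)
    return {
        "pr": int(node["pr"]),
        "verdict": node["verdict"],
        "bpb": float(bpb["bpb"]) if bpb else None,
    }


sample = [
    "├──→ #1908 [C] @romeerp bpb=1.06081",
    "├──→ #1945 [C] ← *flipped from [L] in re-audit* @alertcat bpb=1.05943",
    "│ └──→ #2117 [L] @JulianTang2027",
    "│ │ DATASET_AUDIT.md (PR #2018) verified --val-docs=10000 byte-for-byte",
]
for line in sample:
    print(parse_node(line))
```

The last sample line is an annotation rather than a node, so it parses to `None`.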

## Tree 3 — Out-of-CaseOps-scope (in date window but different lineage)

```
#1493 [pre-CaseOps boundary]
#2027 [C] @H1cSuNtDr4C0n3S bpb=1.08064 (Apr 30)
— SP8192 QRescue + JEPA-Lite; non-CaseOps SP8192 lineage; clean by lineage

(separately:)
#1915 [not in working set; bulk-classified clean in state.json]
#2050 [INHERIT] @AidenGeunGeun bpb=1.06083 (Apr 30)
— eval-only on frozen #1915 quantized artifacts; data verdict depends on #1915
```

## Tree 4 — Symlink leak branch (separate mechanism)

```
#1851 [C]
#2071 [L] @jamesEmerson112 bpb=1.0066 (claimed)
— caseops_enabled=False but pod data paths symlinked to CaseOps-tokenized shards
— README admits: "active via symlinked data"
— NOT the val10k-train leak; orthogonal mechanism
```
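
A symlink leak of this kind is invisible in the config (`caseops_enabled=False`) and only shows up once the data path is resolved. One possible check, sketched with throwaway temporary directories; the directory names and the `resolves_under` helper are hypothetical and are not taken from PR #2071:

```python
import os
import tempfile
from pathlib import Path


def resolves_under(data_path, forbidden_root) -> bool:
    """True if data_path, after following symlinks, lies at or below forbidden_root."""
    real = Path(data_path).resolve()
    root = Path(forbidden_root).resolve()
    return real == root or root in real.parents


# Demo on a scratch directory (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    caseops = Path(tmp) / "caseops_shards"
    plain = Path(tmp) / "plain_shards"
    caseops.mkdir()
    plain.mkdir()
    linked = Path(tmp) / "sp8192"
    os.symlink(caseops, linked)  # the path looks non-CaseOps but points at CaseOps shards

    print(resolves_under(linked, caseops))  # True
    print(resolves_under(plain, caseops))   # False
```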

## Where leak transitions occur

| Edge | Author of child | Action |
|---|---|---|
| #1729 [C] → #1736 [L] | @dexhunter | **LEAK INTRODUCED**: first use of `prepare_caseops_data.py` default `--val-docs=10000`, started the leaky CaseOps trunk |
| #1797 [L] → #1851 [C] | @aquariouseworkman | **LEAK FIXED**: switched to `/dev/shm/pgolf_data` (39-shard HF subset); first clean record post-#1736 |
| #1851 [C] → #1855 [L] | @codemath3000 | **LEAK RE-INTRODUCED**: rebuilt locally with `prepare_caseops_data.py` default, despite parent being clean |
| #1851 [C] → #1868 [C] | @Christopher-Lee-McClendon | (clean stays clean) — used HF dataset same as parent |
| #1855 [L] → #1908 [C] | @romeerp | **LEAK FIXED**: README explicit HF source |
| #1855 [L] → #1923 [L] | @jorge-asenjo | (leak stays leak) — only val-side fix, train kept default-prep |
| #1855 [L] → #2019 [C] | @aquariouseworkman | **LEAK FIXED**: snapshot_download from HF |
| #1855 [L] → #2031 [C] | @deborahnelson8788726 | **LEAK FIXED**: HF first-39 explicit |
| #1797 [L] → #2068 [C] | @jayaram1125 | **LEAK FIXED**: cached_challenge_fineweb.py from HF (Tree 2 lists #2068 with parent #1797) |
| #2018 [L] → #2118 [L] | @aquariouseworkman | **REGRESSION**: same author who fixed the leak in #1851 now ships leaky #2118; submission.json states "--val-docs=10000 train shards + 50k val eval" |
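
The basic labels in this table follow directly from the parent and child verdicts. A minimal sketch over a subset of the edges, with verdicts copied from the trees above; the `classify` function is hypothetical, and it does not reproduce the finer "RE-INTRODUCED" and "REGRESSION" labels, which depend on lineage and author history rather than on the two verdicts alone:

```python
# Verdicts copied from the trees above (C = clean, L = leak).
VERDICT = {
    1729: "C", 1736: "L", 1797: "L", 1851: "C", 1855: "L", 1868: "C",
    1908: "C", 1923: "L", 2018: "L", 2019: "C", 2031: "C", 2118: "L",
}

# (parent, child) edges taken from the transition table.
EDGES = [
    (1729, 1736), (1797, 1851), (1851, 1855), (1851, 1868), (1855, 1908),
    (1855, 1923), (1855, 2019), (1855, 2031), (2018, 2118),
]


def classify(parent: int, child: int) -> str:
    """Label an edge from the verdicts of its two endpoints."""
    p, c = VERDICT[parent], VERDICT[child]
    if (p, c) == ("C", "L"):
        return "leak introduced"
    if (p, c) == ("L", "C"):
        return "leak fixed"
    return "clean stays clean" if c == "C" else "leak stays leak"


for parent, child in EDGES:
    print(f"#{parent} [{VERDICT[parent]}] -> #{child} [{VERDICT[child]}]: {classify(parent, child)}")
```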

## Author behaviors

| Author | Records | Shipped status |
|---|---|---|
| @romeerp | #1729 [C], #1908 [C] | Always clean |
| @dexhunter | #1736 [L], #1769 [L], #1797 [L] | Always leaky (started the leak) |
| @nprime06 | #1787 [L] | Leaky |
| @aquariouseworkman | #1851 [C], #2019 [C], #2118 [L] | Mostly clean; regressed on #2118 |
| @codemath3000 | #1855 [L] | Leaky (re-introduced after #1851 fixed it) |
| @Christopher-Lee-McClendon | #1868 [C] | Clean |
| @jorge-asenjo | #1923 [L], #2041 [?] | Leaky (#1923); #2041 ambiguous after re-audit |
| @jamesEmerson112 | #2071 [L] (symlink) | Different leak mechanism |
| @alertcat | #1945 [C] | Clean (HF); flipped from [L] in re-audit |
| @andrewbaggio1 | #1953 [?] | Ambiguous after re-audit; leans clean |
| @ndokutovich | #1967 [L] | Leaky |
| Simon Marcus | #2018 [L] | Leaky (with audit doc) |
| @deborahnelson8788726 | #2031 [C] | Clean (HF) |
| @jayaram1125 | #2068 [C] | Clean (HF) |
| @vaibhavmishra1 | #2123 [L], #2124 [L] | Leaky |

## Key takeaways

1. **The clean trunk is short**: pre-CaseOps → #1729 → (#1851 → #1868). Three merged CaseOps records in total, only two of them (#1851, #1868) after the leak was introduced at #1736.
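2. **The leaky trunk is long**: #1736 → #1855 → V21 (#1945) → #1967/#1953 → #2018 → #2118, with many sibling forks. After the re-audit, #1945 itself is [C] and #1953 is [?], but the line through #1967 and #2018 to #2118 remains leaky.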
3. **Same authors switch verdicts across PRs**: @aquariouseworkman shipped clean #1851 / #2019 and leaky #2118 within a week.
4. **Once a fork "fixes" the leak by going HF, it stays clean** (e.g., #1908, #2019, #2031, #2068 all sit downstream of leaky records, #1855 or #1797 in the case of #2068, but went HF).
5. **Conversely, "fixing" doesn't propagate**: #1851's HF switch didn't stop #1855 from re-introducing the leak using a sibling local prep.