From dff916e8263feedd9377e561e42c901e9688b68f Mon Sep 17 00:00:00 2001
From: Alex Zhao
Date: Sat, 2 May 2026 18:08:09 +0000
Subject: [PATCH 1/4] Update leaderboard with May 1 audited rows

Co-authored-by: Codex
---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index 88840d27b8..76fa5a699e 100644
--- a/README.md
+++ b/README.md
@@ -30,6 +30,10 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
+| Token-Only N-gram Tilt + AsymLogit + One-Phase TTT | 1.0567 | TanishGudise | On PR #2130: PR #2014/#1953 lineage with token-only n-gram tilt (within/word off), AsymLogit, #2060 hparam levers, and one-phase score-first TTT; 3-seed mean 1.05670 (p=0.015 vs PR #2014) | 2026-05-01 | [info](https://github.com/openai/parameter-golf/pull/2130) |
+| Progressive Context Growth + Short-Doc Score-First TTT | 1.0576 | simonbissonnette | On PR #2014: PR #1855/#1953 CaseOps stack with progressive context growth to 3k plus short-doc score-first TTT on the AWQ-lite/AsymLogit lineage; 3-seed mean 1.05759 (p=0.011 vs PR #1953) | 2026-04-30 | [info](https://github.com/openai/parameter-golf/pull/2014) |
+| Long-Context No-Q/V TTT + QK-Gain 5.25 | 1.0586 | andrewbaggio1 | On PR #1953: PR #1945 V21 base with 2560 eval/TTT context, no-Q/V TTT mask, TTT LR 0.75, and QK_GAIN_INIT=5.25; 3-seed mean 1.05855 (p=0.063 vs PR #1945 V21 v2) | 2026-04-30 | [info](https://github.com/openai/parameter-golf/pull/1953) |
+| AWQ-Lite GPTQ + AsymLogit on PR1855 Stack | 1.0594 | alertcat | On PR #1945 commit 70067534: PR #1855 stack plus PR #1908 AWQ-lite mixed GPTQ and PR #1923 AsymLogit; V21 v2 3-seed mean 1.05943 after strict seed-42 rerun (p=0.034 vs PR #1855) | 2026-04-29 | [info](https://github.com/openai/parameter-golf/pull/1945), [commit](https://github.com/openai/parameter-golf/pull/1945/commits/7006753424886886bc27a17f839f6afd01962a08) |
 | BOS-Fixed SmearGate + LQER + SparseAttnGate + 9-Hparam Stack | 1.0611 | codemath3000 | On PR #1855: BOS-fixed #1797-derived stack with LQER, PR #1787 SparseAttnGate/PolarNS/FusedCE base, per-group lrzip compression, and 9 greedy hyperparameter overrides; submitted 3-seed mean 1.06108 with broader reproduction support (p=0.188 vs PR #1868 latest rerun) | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1855), [repro](https://github.com/openai/parameter-golf/pull/1855#issuecomment-4336629746) |
 | BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0614 | aquariouseworkman | On PR #1851 with 3-seed compliance-rerun support from PR #1868: BOS-boundary fix from PR #1851 applied to dexhunter's PR #1797 SmearGate + LQER stack, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
 | PR1736 + PolarNS + MIN_LR + SparseAttnGate + FusedCE + Warm-A TTT | 1.0634 | nprime06 | On PR #1787: PR #1736 CaseOps stack plus Polar Express Newton-Schulz coefficients, MIN_LR=0.1, SparseAttnGate, fused softcapped CE, and PR #1767-style warm-start-A TTT | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1787) |

From cb46d197038a03bcc284b9fcb7cc8284ed63833b Mon Sep 17 00:00:00 2001
From: Alex Zhao
Date: Sat, 2 May 2026 18:20:46 +0000
Subject: [PATCH 2/4] Clarify PR 2130 leaderboard attribution

Co-authored-by: Codex
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 76fa5a699e..92e6a27b5b 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,7 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
-| Token-Only N-gram Tilt + AsymLogit + One-Phase TTT | 1.0567 | TanishGudise | On PR #2130: PR #2014/#1953 lineage with token-only n-gram tilt (within/word off), AsymLogit, #2060 hparam levers, and one-phase score-first TTT; 3-seed mean 1.05670 (p=0.015 vs PR #2014) | 2026-05-01 | [info](https://github.com/openai/parameter-golf/pull/2130) |
+| Token-Only N-gram Tilt + AsymLogit + One-Phase TTT | 1.0567 | TanishGudise | On PR #2130: PR #1797/#1855 lineage with token-only n-gram tilt (within/word off), PR #1923 AsymLogit, PR #2060 hparam levers, and PR #2014-style one-phase score-first TTT; 3-seed mean 1.05670 (p=0.015 vs PR #2014) | 2026-05-01 | [info](https://github.com/openai/parameter-golf/pull/2130) |
 | Progressive Context Growth + Short-Doc Score-First TTT | 1.0576 | simonbissonnette | On PR #2014: PR #1855/#1953 CaseOps stack with progressive context growth to 3k plus short-doc score-first TTT on the AWQ-lite/AsymLogit lineage; 3-seed mean 1.05759 (p=0.011 vs PR #1953) | 2026-04-30 | [info](https://github.com/openai/parameter-golf/pull/2014) |
 | Long-Context No-Q/V TTT + QK-Gain 5.25 | 1.0586 | andrewbaggio1 | On PR #1953: PR #1945 V21 base with 2560 eval/TTT context, no-Q/V TTT mask, TTT LR 0.75, and QK_GAIN_INIT=5.25; 3-seed mean 1.05855 (p=0.063 vs PR #1945 V21 v2) | 2026-04-30 | [info](https://github.com/openai/parameter-golf/pull/1953) |
 | AWQ-Lite GPTQ + AsymLogit on PR1855 Stack | 1.0594 | alertcat | On PR #1945 commit 70067534: PR #1855 stack plus PR #1908 AWQ-lite mixed GPTQ and PR #1923 AsymLogit; V21 v2 3-seed mean 1.05943 after strict seed-42 rerun (p=0.034 vs PR #1855) | 2026-04-29 | [info](https://github.com/openai/parameter-golf/pull/1945), [commit](https://github.com/openai/parameter-golf/pull/1945/commits/7006753424886886bc27a17f839f6afd01962a08) |

From e8db3562fbe802e5fb2fd6749c05847f119ff933 Mon Sep 17 00:00:00 2001
From: Alex Zhao
Date: Sat, 2 May 2026 21:00:21 +0000
Subject: [PATCH 3/4] Remove PR 2130 from leaderboard update

Co-authored-by: Codex
---
 README.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/README.md b/README.md
index 92e6a27b5b..4cbbaea2a5 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,6 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
-| Token-Only N-gram Tilt + AsymLogit + One-Phase TTT | 1.0567 | TanishGudise | On PR #2130: PR #1797/#1855 lineage with token-only n-gram tilt (within/word off), PR #1923 AsymLogit, PR #2060 hparam levers, and PR #2014-style one-phase score-first TTT; 3-seed mean 1.05670 (p=0.015 vs PR #2014) | 2026-05-01 | [info](https://github.com/openai/parameter-golf/pull/2130) |
 | Progressive Context Growth + Short-Doc Score-First TTT | 1.0576 | simonbissonnette | On PR #2014: PR #1855/#1953 CaseOps stack with progressive context growth to 3k plus short-doc score-first TTT on the AWQ-lite/AsymLogit lineage; 3-seed mean 1.05759 (p=0.011 vs PR #1953) | 2026-04-30 | [info](https://github.com/openai/parameter-golf/pull/2014) |
 | Long-Context No-Q/V TTT + QK-Gain 5.25 | 1.0586 | andrewbaggio1 | On PR #1953: PR #1945 V21 base with 2560 eval/TTT context, no-Q/V TTT mask, TTT LR 0.75, and QK_GAIN_INIT=5.25; 3-seed mean 1.05855 (p=0.063 vs PR #1945 V21 v2) | 2026-04-30 | [info](https://github.com/openai/parameter-golf/pull/1953) |
 | AWQ-Lite GPTQ + AsymLogit on PR1855 Stack | 1.0594 | alertcat | On PR #1945 commit 70067534: PR #1855 stack plus PR #1908 AWQ-lite mixed GPTQ and PR #1923 AsymLogit; V21 v2 3-seed mean 1.05943 after strict seed-42 rerun (p=0.034 vs PR #1855) | 2026-04-29 | [info](https://github.com/openai/parameter-golf/pull/1945), [commit](https://github.com/openai/parameter-golf/pull/1945/commits/7006753424886886bc27a17f839f6afd01962a08) |

From bfc3a26ee99d3adc29f1b4138f354c70e6c9e6b1 Mon Sep 17 00:00:00 2001
From: Alex Zhao
Date: Sat, 2 May 2026 23:36:00 +0000
Subject: [PATCH 4/4] Add PR 2135 under grace policy

Co-authored-by: Codex
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 4cbbaea2a5..27c0ca0f84 100644
--- a/README.md
+++ b/README.md
@@ -30,6 +30,7 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
+| Calib32 Token-Only N-gram + AsymLogit Stack | 1.0565 | codemath3000 | On PR #2135: pre-cutoff PR #2130 architecture rerun on clean canonical CaseOps data with GPTQ_CALIBRATION_BATCHES=32; 3-seed mean 1.05651 under grace policy (p=0.014 vs PR #2014) | 2026-05-01 | [info](https://github.com/openai/parameter-golf/pull/2135) |
 | Progressive Context Growth + Short-Doc Score-First TTT | 1.0576 | simonbissonnette | On PR #2014: PR #1855/#1953 CaseOps stack with progressive context growth to 3k plus short-doc score-first TTT on the AWQ-lite/AsymLogit lineage; 3-seed mean 1.05759 (p=0.011 vs PR #1953) | 2026-04-30 | [info](https://github.com/openai/parameter-golf/pull/2014) |
 | Long-Context No-Q/V TTT + QK-Gain 5.25 | 1.0586 | andrewbaggio1 | On PR #1953: PR #1945 V21 base with 2560 eval/TTT context, no-Q/V TTT mask, TTT LR 0.75, and QK_GAIN_INIT=5.25; 3-seed mean 1.05855 (p=0.063 vs PR #1945 V21 v2) | 2026-04-30 | [info](https://github.com/openai/parameter-golf/pull/1953) |
 | AWQ-Lite GPTQ + AsymLogit on PR1855 Stack | 1.0594 | alertcat | On PR #1945 commit 70067534: PR #1855 stack plus PR #1908 AWQ-lite mixed GPTQ and PR #1923 AsymLogit; V21 v2 3-seed mean 1.05943 after strict seed-42 rerun (p=0.034 vs PR #1855) | 2026-04-29 | [info](https://github.com/openai/parameter-golf/pull/1945), [commit](https://github.com/openai/parameter-golf/pull/1945/commits/7006753424886886bc27a17f839f6afd01962a08) |