From 0c846141aec9f6d51c93ca14a12bce337f8ad6da Mon Sep 17 00:00:00 2001
From: Claude
Date: Sat, 21 Mar 2026 23:20:27 +0000
Subject: [PATCH] =?UTF-8?q?docs:=20clarify=20why=20LFM=202.5=20uses=2010:6?=
 =?UTF-8?q?=20not=205:3=20in=20=C2=A74.5?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

10:6 encodes absolute layer counts (10 CfC + 6 GQA = 16 layers total);
reducing to 5:3 would describe a shallower 8-layer architecture. Add
one-sentence note to PARAMETER_GOLF.md §4.5.

https://claude.ai/code/session_01JpxhvpizFcE1iLL9aT5MUF
---
 PARAMETER_GOLF.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/PARAMETER_GOLF.md b/PARAMETER_GOLF.md
index d4d3d77..35b2f40 100644
--- a/PARAMETER_GOLF.md
+++ b/PARAMETER_GOLF.md
@@ -265,8 +265,11 @@ for two reasons:
 
 ### 4.5 Geode-derived layer layout
 
-LFM 2.5's 10:6 CfC:GQA ratio was found empirically. The Geode factorization
-(§D-4.1) provides a principled derivation that eliminates the guesswork.
+LFM 2.5's 10:6 CfC:GQA ratio was found empirically. Note that 10:6 cannot be
+reduced to 5:3: the numbers are absolute layer counts (10 CfC + 6 GQA = 16 layers
+total), not a bare ratio. Reducing to 5:3 would describe a different 8-layer
+model, halving the depth. The Geode factorization (§D-4.1) provides a principled
+derivation that eliminates the guesswork.
 
 The generating function for Q²'s transition sequences: