From e3bb822eaa78fde21365293285338c2d7baea041 Mon Sep 17 00:00:00 2001
From: Ilia Kulikov
Date: Tue, 7 Apr 2026 07:35:17 -0700
Subject: [PATCH] Fix LaTeX rendering in section 2 of thinking mid-training blog

Split the long paragraph so inline math expressions with subscripts are
in separate paragraphs, preventing the Markdown parser from pairing
underscores across expressions as emphasis markers. The loss function is
now a display equation on its own line.
---
 projects/thinking_midtraining/README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/projects/thinking_midtraining/README.md b/projects/thinking_midtraining/README.md
index 6f62edf..d6ac919 100644
--- a/projects/thinking_midtraining/README.md
+++ b/projects/thinking_midtraining/README.md
@@ -58,7 +58,11 @@ $\tilde{\mathcal{D}} = \{\tilde{c}^1, \tilde{c}^2, \ldots, \tilde{c}^N\}$.
 
 ### 2) Thinking SFT Mid-training
 
-We perform supervised fine-tuning (SFT) mid-training on half of the augmented corpus, which we call $\tilde{\mathcal{D}}_{\text{SFT}}$, using standard next-token prediction. Given a base model $\mathcal{M}_{\text{base}}$ parameterized by $\theta$, we optimize the following objective: $\mathcal{L}_{\text{SFT}}(\theta) = -\mathbb{E}_{\tilde{c}^i \sim \tilde{\mathcal{D}}} \left[ \sum_{j=1}^{|\tilde{c}^i|} \log P_\theta(\tilde{c}^i_j \mid \tilde{c}^i_{