
Commit f7a01f6

apartsin and claude committed
Expand biomedical content in 28.3, add missing-output audit, fix stacked captions across 28 files
- Deepen protein (ESM-2/3, AlphaFold3), genomics (DNABERT-2, Evo-2), and drug discovery (ChemBERTa-2, MolGPT) coverage in section 28.3
- Add new Code Fragment 28.3.4 (ESM-2 protein embedding) and expand 28.3.3 (multi-molecule SMILES)
- Create p2_missing_output.py audit script (detects code blocks missing output panes)
- Add section 34.10 entry to table of contents
- Remove stacked/duplicate captions in 28 section files across Parts 2-8 and appendices

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 13c4b7d commit f7a01f6

31 files changed

Lines changed: 360 additions & 102 deletions
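The commit message references a new p2_missing_output.py audit script that flags code blocks lacking an output pane; the script itself is not among the hunks shown below. As a rough illustration only, such an audit could pair each closing `</pre>` with the markup that follows it (the function name and scan heuristic here are hypothetical, not the actual script):

```python
import re

def find_missing_outputs(html: str) -> list[int]:
    """Return offsets of code blocks (</pre>) that are not followed by a
    <div class="code-output"> pane before the next code block begins."""
    missing = []
    for m in re.finditer(r"</pre>", html):
        tail = html[m.end():]
        nxt = tail.find("<pre")
        if nxt != -1:
            tail = tail[:nxt]  # only look until the next code block starts
        if 'class="code-output"' not in tail:
            missing.append(m.start())
    return missing

sample = (
    '<pre><code>print("hi")</code></pre>\n'
    '<div class="code-caption">Fragment 1</div>\n'
    '<pre><code>x = 1</code></pre>\n'
    '<div class="code-output">hi</div>\n'
)
print(find_missing_outputs(sample))  # one offset: the first block lacks an output pane
```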

File tree

appendices/appendix-c-python-for-llm/section-c.1.html

Lines changed: 0 additions & 2 deletions
@@ -95,8 +95,6 @@ <h3>NumPy and Pandas</h3>
 dataset = df[["instruction", "response"]].to_dict(orient="records")
 print(f"Training examples: {len(dataset)}")</code></pre>
 <div class="code-caption"><strong>Code Fragment C.1.2:</strong> This snippet demonstrates this approach using PyTorch. Study the implementation to understand how each component contributes to the overall workflow.</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment C.1.3:</strong> This snippet demonstrates this approach. Study the implementation to understand how each component contributes to the overall workflow.</div>
 <h3>Additional Libraries</h3>
 
 <div class="comparison-table">

appendices/appendix-c-python-for-llm/section-c.2.html

Lines changed: 0 additions & 2 deletions
@@ -60,8 +60,6 @@ <h3>Option 2: Conda (Recommended for GPU Work)</h3>
 # Export environment
 conda env export > environment.yml</code></pre>
 <div class="code-caption"><strong>Code Fragment C.2.1:</strong> This snippet demonstrates this approach using <a href="https://pytorch.org/" target="_blank" rel="noopener">PyTorch</a>. Study the implementation to understand how each component contributes to the overall workflow.</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment C.2.2:</strong> This snippet demonstrates this approach using PyTorch. Study the implementation to understand how each component contributes to the overall workflow.</div>
 <div class="callout key-insight">
 <div class="callout-title">Key Insight: Why Conda for GPU Work?</div>
 <p>The main advantage of Conda over venv for LLM work is CUDA management. Installing PyTorch with <code>conda</code> automatically includes the correct CUDA toolkit version, sidestepping the need to install system-level <a href="https://www.nvidia.com/" target="_blank" rel="noopener">NVIDIA</a> drivers and CUDA separately. This is especially valuable on shared machines or when you need different CUDA versions for different projects.</p>

appendices/appendix-c-python-for-llm/section-c.4.html

Lines changed: 0 additions & 6 deletions
@@ -123,12 +123,6 @@ <h3>Pattern 5: Saving and Loading Checkpoints</h3>
 model.push_to_hub("your-username/my-finetuned-model")
 tokenizer.push_to_hub("your-username/my-finetuned-model")</code></pre>
 <div class="code-caption"><strong>Code Fragment C.4.4:</strong> This snippet demonstrates the <code>call_with_retry</code> function using API integration. The function encapsulates reusable logic that can be applied across different inputs.</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment C.4.2:</strong> This snippet demonstrates this approach. Study the implementation to understand how each component contributes to the overall workflow.</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment C.4.1:</strong> This snippet demonstrates this approach using <a href="https://pytorch.org/" target="_blank" rel="noopener">PyTorch</a>. Study the implementation to understand how each component contributes to the overall workflow.</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment C.4.5:</strong> This snippet demonstrates this approach. Study the implementation to understand how each component contributes to the overall workflow.</div>
 <div class="callout fun-note">
 <div class="callout-title">Fun Fact: The Two-Line LLM</div>
 <p>Thanks to the <code>pipeline</code> API, you can run a language model in two lines of Python: one to create the pipeline, one to call it. The entire transformer architecture, tokenization, and decoding are handled behind the scenes. This is both a blessing (rapid prototyping) and a danger (it is easy to treat the model as a black box without understanding its behavior).</p>
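Caption C.4.4 above refers to a `call_with_retry` helper whose body lies outside this hunk. A minimal sketch of what such a retry wrapper typically looks like (the signature and defaults are assumptions, not the book's actual code):

```python
import time

def call_with_retry(fn, max_retries=3, base_delay=1.0):
    """Call fn(); on exception, retry with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * 2 ** attempt)

# Usage with a callable that fails twice, then succeeds:
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(call_with_retry(flaky, base_delay=0.01))  # prints "ok" on the third attempt
```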

appendices/appendix-e-git-collaboration/section-e.3.html

Lines changed: 0 additions & 2 deletions
@@ -84,8 +84,6 @@ <h3>MLflow</h3>
 # Log the model as an artifact
 mlflow.log_artifact("./output/adapter_model.safetensors")</code></pre>
 <div class="code-caption"><strong>Code Fragment E.3.1:</strong> This snippet demonstrates this approach using experiment tracking, loss computation. Notice how experiment parameters and artifacts are logged together for full reproducibility. Reproducible experiments are the foundation of reliable iteration in production ML systems.</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment E.3.2:</strong> This snippet demonstrates this approach using monitoring, experiment tracking. Notice how the metrics are tagged with request metadata so you can slice dashboards by model, user, or endpoint. Proactive monitoring catches regressions before they reach users and simplifies root-cause analysis.</div>
 <div class="comparison-table">
 <div class="comparison-table-title">Feature Comparison</div>
 <table>

appendices/appendix-i-prompt-templates/section-i.5.html

Lines changed: 2 additions & 6 deletions
@@ -47,9 +47,7 @@ <h3>Code Generation with Specification</h3>
 Input: {{input_description}}
 Output: {{output_description}}
 Edge cases to handle: {{edge_cases}}</code></pre>
-<div class="code-caption"><strong>Code Fragment I.5.1:</strong> Instructs the model to generate code from a specification, including type hints, docstrings, and edge case handling for production quality.</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment I.5.2:</strong> Supplies the function signature and requirements, giving the model a clear contract to implement.</div>
+<div class="code-caption"><strong>Code Fragment I.5.1:</strong> Instructs the model to generate code from a specification, including type hints, docstrings, and edge case handling for production quality. The user message supplies the function signature and requirements, giving the model a clear contract to implement.</div>
 
 <div class="callout tip">
 <div class="callout-title">Tip</div>
@@ -82,9 +80,7 @@ <h3>Code Review and Improvement</h3>
 ```{{language}}
 {{code}}
 ```</code></pre>
-<div class="code-caption"><strong>Code Fragment I.5.3:</strong> Configures the model as a code reviewer that identifies bugs, performance issues, and style violations with actionable fix suggestions.</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment I.5.4:</strong> Presents the code to review with context about the programming language and review focus areas.</div>
+<div class="code-caption"><strong>Code Fragment I.5.3:</strong> Configures the model as a code reviewer that identifies bugs, performance issues, and style violations with actionable fix suggestions. The user message presents the code to review with context about the programming language and review focus areas.</div>
 
 </div>
 
part-2-understanding-llms/module-06-pretraining-scaling-laws/section-6.3.html

Lines changed: 1 addition & 3 deletions
@@ -389,9 +389,7 @@ <h3>Computing the Chinchilla-Optimal Allocation</h3>
 Optimal model size: 288.7B parameters
 Optimal data: 5774B tokens
 </div>
-<div class="code-caption"><strong>Code Fragment 6.3.1:</strong> Empirical data: (parameters, final_loss) from small training runs.</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment 6.3.2:</strong> Example compute budgets.</div>
+<div class="code-caption"><strong>Code Fragment 6.3.1:</strong> Empirical data from small training runs showing (parameters, final_loss) pairs and example compute budgets with optimal model size and data allocations.</div>
 
 <h2>9. Summary Table: Scaling Regimes</h2>
 
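The output pane in this hunk (288.7B parameters, 5774B tokens) is consistent with the Chinchilla rule of thumb of roughly 20 tokens per parameter combined with the approximation C ≈ 6·N·D. A small sketch that reproduces those numbers, assuming a compute budget of 1e25 FLOPs (the budget is inferred from the shown output, not stated in the hunk):

```python
import math

def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Split a FLOPs budget using C ≈ 6·N·D with D ≈ 20·N (Chinchilla)."""
    # C = 6 · N · (20 · N) = 120 · N²  →  N = sqrt(C / 120)
    n_params = math.sqrt(compute_flops / (6 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

n, d = chinchilla_optimal(1e25)
print(f"Optimal model size: {n / 1e9:.1f}B parameters")  # 288.7B
print(f"Optimal data: {d / 1e9:.0f}B tokens")            # 5774B
```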
part-2-understanding-llms/module-07-modern-llm-landscape/section-7.1.html

Lines changed: 1 addition & 3 deletions
@@ -362,9 +362,7 @@ <h3>Pricing Comparison</h3>
 # Similar pattern for Google (Vertex AI) and other providers</code></pre>
 <div class="code-output">GPT-4o: 25 * 37 = 925
 Tokens: 14 in, 8 out</div>
-<div class="code-caption"><strong>Code Fragment 7.1.1:</strong> Approximate pricing comparison (per million tokens, USD).</div>
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment 7.1.2:</strong> Example: Making an API call to compare providers.</div>
+<div class="code-caption"><strong>Code Fragment 7.1.2:</strong> Making an API call to compare providers using the OpenAI-compatible chat format across different model providers.</div>
 
 <div class="callout note">
 <div class="callout-title">Note: Where This Leads Next</div>
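The removed caption 7.1.1 described an approximate per-million-token pricing comparison. The underlying arithmetic is simple enough to sketch; the price figures below are placeholders for illustration only, since real provider pricing changes frequently:

```python
# Placeholder per-million-token prices (USD); check providers for current rates.
PRICES = {
    "gpt-4o":        {"input": 2.50, "output": 10.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}

def request_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost of one request: tokens / 1M times the per-million-token rate."""
    p = PRICES[model]
    return tokens_in / 1e6 * p["input"] + tokens_out / 1e6 * p["output"]

# The hunk's output pane reports 14 input and 8 output tokens:
print(f"${request_cost('gpt-4o', 14, 8):.6f}")  # $0.000115
```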

part-3-working-with-llms/module-10-llm-apis/section-10.3.html

Lines changed: 0 additions & 3 deletions
@@ -446,9 +446,6 @@ <h3>5.2 Helicone</h3>
 print("Check Helicone dashboard for detailed analytics.")</code></pre>
 <div class="code-output">Response received. Tokens: 89
 Check Helicone dashboard for detailed analytics.</div>
-<div class="code-caption"><strong>Code Fragment 10.3.6:</strong> Configuring Portkey as an AI gateway with fallback routing and semantic caching. The OpenAI client is pointed at Portkey's gateway URL, which transparently handles provider failover and caching.</div>
-
-<!-- FIXME: stacked caption, needs manual review -->
 <div class="code-caption"><strong>Code Fragment 10.3.7:</strong> Routing API calls through Helicone for observability by changing the base URL and adding custom headers. Every request is automatically logged with latency, token counts, cost, and tagged properties.</div>
 
 <div class="callout key-insight">

part-3-working-with-llms/module-11-prompt-engineering/section-11.3.html

Lines changed: 0 additions & 2 deletions
@@ -371,8 +371,6 @@ <h2>2. Meta-Prompting: Prompts That Generate Prompts <span class="level-badge in
 })
 print(messages) # ready to pass to any LangChain LLM</code></pre>
 <div class="code-caption"><strong>Code Fragment 11.3.3:</strong> Meta-prompting via <code>generate_expert_prompt()</code>, which uses an LLM to write system prompts for other LLM calls. The meta-prompt template specifies five structural requirements (role definition, output format, quality criteria, edge cases, and two examples) and requests only the prompt text with no commentary.</div>
-<!-- TODO: insert library shortcut code block for this caption -->
-<div class="code-caption"><strong>Code Fragment 11.3.6:</strong> LangChain <code>ChatPromptTemplate</code> shortcut. Declarative templates separate prompt structure from variable content, making prompts versionable, testable, and composable into chains without manual string formatting.</div>
 
 <div class="callout note">
 <div class="callout-title">Note: Meta-Prompting and Iteration</div>
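Caption 11.3.3 describes `generate_expert_prompt()` as sending a meta-prompt with five structural requirements; the template itself is outside this hunk. A plain-Python sketch of what such a meta-prompt might look like (the template wording and function name here are illustrative assumptions, not the book's code):

```python
META_PROMPT = """You are an expert prompt engineer. Write a system prompt for \
an assistant specializing in {domain}. The prompt must include:
1. A clear role definition
2. An explicit output format
3. Quality criteria for responses
4. Handling for edge cases
5. Two worked examples
Return only the prompt text, with no commentary."""

def build_meta_prompt(domain: str) -> list[dict]:
    """Build the messages that ask an LLM to write a system prompt.
    The actual LLM call (and any LangChain wiring) is omitted."""
    return [{"role": "user", "content": META_PROMPT.format(domain=domain)}]

messages = build_meta_prompt("SQL query optimization")
print(messages[0]["role"])  # user
```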

part-3-working-with-llms/module-12-hybrid-ml-llm/section-12.2.html

Lines changed: 1 addition & 4 deletions
@@ -286,10 +286,7 @@ <h2>4. Combining Embeddings with Structured Features <span class="level-badge in
 Embeddings only                     0.534 (+/- 0.035)
 Combined (structured + embeddings)  0.841 (+/- 0.018)
 </div>
-<div class="code-caption"><strong>Code Fragment 12.2.2:</strong> Local embedding with <code>SentenceTransformer('all-MiniLM-L6-v2')</code>, an 80 MB model producing 384-dimensional vectors. The <code>normalize_embeddings=True</code> flag enables direct dot-product similarity. At 5.7 ms per text on CPU with zero API cost, this is orders of magnitude cheaper than cloud embedding APIs.</div>
-
-<!-- FIXME: stacked caption, needs manual review -->
-<div class="code-caption"><strong>Code Fragment 12.2.3:</strong> Feature ablation study comparing structured-only, embeddings-only, and combined feature sets using XGBoost with 5-fold cross-validation. The combined configuration (<code>StandardScaler</code> on structured features concatenated with 384-dim embeddings) outperforms either source alone, demonstrating complementary signal.</div>
+<div class="code-caption"><strong>Code Fragment 12.2.2:</strong> Local embedding with <code>SentenceTransformer('all-MiniLM-L6-v2')</code> producing 384-dimensional vectors, followed by a feature ablation study comparing structured-only, embeddings-only, and combined feature sets using XGBoost with 5-fold cross-validation. The combined configuration outperforms either source alone, demonstrating complementary signal.</div>
 
 <div class="callout key-insight">
 <div class="callout-title">Key Insight</div>
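The merged caption 12.2.2 describes concatenating standardized structured features with 384-dimensional sentence embeddings. The shape bookkeeping of that combined feature matrix can be sketched with NumPy alone; random arrays stand in for the real SentenceTransformer embeddings and tabular data, and the XGBoost cross-validation step is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100

# Stand-ins for real data: 6 tabular columns and 384-dim text embeddings.
structured = rng.normal(size=(n_samples, 6))
embeddings = rng.normal(size=(n_samples, 384))

# Standardize the structured block (as StandardScaler would), then
# concatenate column-wise to form the combined feature matrix.
scaled = (structured - structured.mean(axis=0)) / structured.std(axis=0)
combined = np.hstack([scaled, embeddings])

print(combined.shape)  # (100, 390)
```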
