front-matter/about-book.html (2 additions & 2 deletions)

@@ -27,14 +27,14 @@ <h2>At a Glance</h2>
  <p>Whether you want to build your first RAG pipeline, ship an AI agent to production, or make strategic decisions about LLM adoption at your organization, this book meets you where you are. It is for software engineers, ML practitioners, researchers, product leaders, domain specialists, and educators who want to understand, build, and deploy systems powered by large language models. It assumes familiarity with Python and basic linear algebra; appendices cover the remaining prerequisites.</p>
- <p>The book spans <strong>38 chapters</strong> in 11 parts, plus <strong>22 appendices</strong> (A through V) with framework tutorials, and a <a href="../capstone/index.html">capstone project</a>. For the full chapter map, dependency diagram, audience details, and background requirements, see <a href="section-fm.1a.html">FM.1: What This Book Covers</a>. Twenty tailored <a href="pathways/index.html">reading pathways</a> help you find the most relevant chapters for your goals.</p>
+ <p>The book spans <strong>39 chapters</strong> (numbered 0 through 38) in 11 parts, plus <strong>22 appendices</strong> (A through V) with framework tutorials, and a <a href="../capstone/index.html">capstone project</a>. For the full chapter map, dependency diagram, audience details, and background requirements, see <a href="section-fm.1a.html">FM.1: What This Book Covers</a>. Twenty tailored <a href="pathways/index.html">reading pathways</a> help you find the most relevant chapters for your goals.</p>
- <p>This book was produced through a collaborative process between its human authors and a team of 42 specialized AI writing agents. The authors curated every chapter, validated all technical content, and made all editorial decisions; AI agents proposed initial drafts, generated code examples, created illustrations, and checked cross-references across the 38-chapter structure.</p>
+ <p>This book was produced through a collaborative process between its human authors and a team of 42 specialized AI writing agents. The authors curated every chapter, validated all technical content, and made all editorial decisions; AI agents proposed initial drafts, generated code examples, created illustrations, and checked cross-references across the 39-chapter structure.</p>
  <p>Many of the book's illustrations were produced using Google Gemini's image generation capabilities, with prompts crafted by the authors and refined through iterative feedback. All diagrams and SVG figures were either hand-coded or generated and reviewed for technical accuracy.</p>
front-matter/index.html (1 addition & 1 deletion)

@@ -27,7 +27,7 @@ <h1>Introduction, Pathways & How to Use This Book</h1>
  <div class="overview">
  <h2>Overview of the Front Matter</h2>
  <p>
-   Before you build anything, you need a map. This front matter orients you before you dive into the technical chapters. It answers four questions every reader has at the start: What does this book cover, and who is it for? How should I navigate 38 chapters and 11 parts given my background and goals? How can an instructor build a university course from this material? And what conventions, callout types, and recurring elements will I encounter on every page?
+   Before you build anything, you need a map. This front matter orients you before you dive into the technical chapters. It answers four questions every reader has at the start: What does this book cover, and who is it for? How should I navigate 39 chapters and 11 parts given my background and goals? How can an instructor build a university course from this material? And what conventions, callout types, and recurring elements will I encounter on every page?
  </p>
  <p>
    Whether you plan to read cover to cover or jump straight to the chapters that match your role, spending 15 minutes here will save you hours of backtracking later. Each section below links to a dedicated page with full detail.
front-matter/section-fm.1a.html (1 addition & 1 deletion)

@@ -25,7 +25,7 @@ <h1>What This Book Covers</h1>
  </blockquote>

  <p>
-   Six months from now, you will be building AI systems that did not exist when you started reading. This book is a comprehensive, practitioner-oriented guide to the entire Large Language Model stack. It begins with the mathematical and conceptual foundations of machine learning, moves through the architecture and training of transformers, and culminates in the design, deployment, and governance of production AI agent systems. The journey spans 38 chapters (numbered 0 through 38) organized into eleven parts, plus 22 appendices covering frameworks, tools, and reference material.
+   Six months from now, you will be building AI systems that did not exist when you started reading. This book is a comprehensive, practitioner-oriented guide to the entire Large Language Model stack. It begins with the mathematical and conceptual foundations of machine learning, moves through the architecture and training of transformers, and culminates in the design, deployment, and governance of production AI agent systems. The journey spans 39 chapters (numbered 0 through 38) organized into eleven parts, plus 22 appendices covering frameworks, tools, and reference material.
front-matter/section-fm.5.html (2 additions & 2 deletions)

@@ -47,7 +47,7 @@ <h2>The Production Philosophy</h2>
  </p>
  <p>
  This is not a gimmick. It is a deliberate architectural decision that enables: rapid iteration
- (a chapter can be produced and revised in hours, not months), consistent quality across all 38
+ (a chapter can be produced and revised in hours, not months), consistent quality across all 39
  chapters (every chapter passes through the same 22 quality stages), and deep cross-referencing
  (agents can read and reference the entire book while writing any single section).
  </p>

@@ -79,7 +79,7 @@ <h2>The Human Role</h2>
  While the AI agents produce the content, human oversight plays a critical role at several points:
  </p>
  <ul>
- <li><strong>Book architecture:</strong> The overall structure (11 parts, 38 chapters, section breakdown) was designed by a human author with input from the Curriculum Architect agent.</li>
+ <li><strong>Book architecture:</strong> The overall structure (11 parts, 39 chapters, section breakdown) was designed by a human author with input from the Curriculum Architect agent.</li>
  <li><strong>Quality standards:</strong> The conformance checklist, callout types, page layout standards, and CSS design system were human-defined, then enforced by agents.</li>
  <li><strong>Editorial judgment:</strong> Major decisions about scope (what to include/exclude), tone (technical but accessible), and audience (engineers, researchers, students) were human decisions.</li>
  <li><strong>Review and iteration:</strong> Every chapter is reviewed by a human who can request revisions, flag inaccuracies, or redirect emphasis.</li>
front-matter/syllabi/index.html (1 addition & 1 deletion)

@@ -27,7 +27,7 @@ <h3>How to Use This Section</h3>
  <h2>University Course Syllabi</h2>

  <p>
- With 38 chapters, no single semester can cover everything. The following four syllabi are designed for instructors adopting this book for a single-semester (14-week) university course. Click any card for the complete week-by-week syllabus with hyperlinked chapter references.
+ With 39 chapters, no single semester can cover everything. The following four syllabi are designed for instructors adopting this book for a single-semester (14-week) university course. Click any card for the complete week-by-week syllabus with hyperlinked chapter references.
- <p>In <strong>regression</strong>, the output is a continuous number. Predicting house prices, stock returns, <a class="cross-ref" href="../module-05-decoding-text-generation/section-05.2.html">temperature</a>, or the probability that a user clicks an ad: all regression tasks. The model produces a numeric prediction, and we measure how far off it is from the true value.</p>
+ <p>In <strong>regression</strong>, the output is a continuous number. Predicting house prices, stock returns, <a class="cross-ref" href="../module-05-decoding-text-generation/section-5.2.html">temperature</a>, or the probability that a user clicks an ad: all regression tasks. The model produces a numeric prediction, and we measure how far off it is from the true value.</p>
- <p><strong>Supervised learning</strong> requires human labels (input-output pairs). <strong>Unsupervised learning</strong> finds patterns in data without labels (clustering, dimensionality reduction). <strong>Self-supervised learning</strong> creates its own labels from the data: mask a word and predict it (<a class="cross-ref" href="../../part-2-understanding-llms/module-06-pretraining-scaling-laws/section-06.1.html">BERT</a>), or predict the next word from all previous words (GPT). This is how every large language model is pre-trained. It is the reason LLMs can learn from the entire internet without human annotation.</p>
+ <p><strong>Supervised learning</strong> requires human labels (input-output pairs). <strong>Unsupervised learning</strong> finds patterns in data without labels (clustering, dimensionality reduction). <strong>Self-supervised learning</strong> creates its own labels from the data: mask a word and predict it (<a class="cross-ref" href="../../part-2-understanding-llms/module-06-pretraining-scaling-laws/section-6.1.html">BERT</a>), or predict the next word from all previous words (GPT). This is how every large language model is pre-trained. It is the reason LLMs can learn from the entire internet without human annotation.</p>
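The self-supervised idea in the hunk above (predict the next word from all previous words) can be sketched in a few lines; the toy token list is purely illustrative, not drawn from the book:

```python
# Illustrative sketch: self-supervised next-token labels need no human annotation.
# The targets are simply the input sequence shifted left by one position.
tokens = ["the", "cat", "sat", "on", "the", "mat"]
inputs = tokens[:-1]   # model sees:     the cat sat on the
targets = tokens[1:]   # model predicts: cat sat on the mat
pairs = list(zip(inputs, targets))
print(pairs[0])  # ('the', 'cat')
```

Every (input, target) pair comes from the raw text itself, which is why no annotator is needed.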
  </div>

  <h2>3. Loss Functions and Optimization <span class="level-badge intermediate" title="Intermediate">INTERMEDIATE</span></h2>
  <p>Squaring the errors does two things: it makes all errors positive (so they do not cancel out), and it penalizes large errors more severely than small ones. A prediction that is off by 10 contributes 100 to the loss, while one that is off by 1 contributes just 1.</p>
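The quadratic penalty described in the context line above is easy to check numerically; this is a minimal sketch (the helper name `mse` is ours, not the book's):

```python
# Minimal mean-squared-error sketch: each residual is squared, so an
# error of 10 adds 100 to the sum while an error of 1 adds only 1.
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

print(mse([10.0, 1.0], [0.0, 0.0]))  # (100 + 1) / 2 = 50.5
```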
- <p><strong>For classification</strong>, the standard is <strong><a class="cross-ref" href="../module-04-transformer-architecture/section-04.1.html">Cross-Entropy</a> Loss</strong>:</p>
+ <p><strong>For classification</strong>, the standard is <strong><a class="cross-ref" href="../module-04-transformer-architecture/section-4.1.html">Cross-Entropy</a> Loss</strong>:</p>
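For intuition, here is a toy per-example cross-entropy (the helper and the probability vectors are illustrative assumptions, not code from the chapter):

```python
import math

# Toy per-example cross-entropy: the negative log of the probability
# the model assigned to the true class. Lower is better.
def cross_entropy(probs, true_class):
    return -math.log(probs[true_class])

confident = cross_entropy([0.9, 0.05, 0.05], 0)  # right and near-certain: small loss
hedging = cross_entropy([0.4, 0.3, 0.3], 0)      # right but unsure: larger loss
```

Note how the loss grows as the probability on the correct class shrinks, which is exactly what drives a classifier toward confident, correct predictions.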
- <p>The hyperparameter <span class="math">$\lambda$</span> controls the strength of the penalty. Large weights are penalized quadratically, which pushes all weights toward smaller values without forcing them to zero. This is the most common regularization in deep learning, where it is called <strong><a class="cross-ref" href="section-0.2.html">weight decay</a></strong>. You will see weight decay appear again as a critical hyperparameter when <a class="cross-ref" href="../../part-4-training-adapting/module-14-fine-tuning-fundamentals/section-14.3.html">tuning fine-tuning hyperparameters in Chapter 13</a>.</p>
+ <p>The hyperparameter <span class="math">$\lambda$</span> controls the strength of the penalty. Large weights are penalized quadratically, which pushes all weights toward smaller values without forcing them to zero. This is the most common regularization in deep learning, where it is called <strong><a class="cross-ref" href="section-0.2.html">weight decay</a></strong>. You will see weight decay appear again as a critical hyperparameter when <a class="cross-ref" href="../../part-4-training-adapting/module-14-fine-tuning-fundamentals/section-14.3.html">tuning fine-tuning hyperparameters in Chapter 14</a>.</p>
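A numeric sketch of the L2 penalty described above (the value of lambda and the weights are made-up illustrative numbers):

```python
# L2 regularization adds lam * sum(w^2) to the data loss.
# The penalty is quadratic: a weight of 3.0 contributes 9.0 to the sum,
# while a weight of 0.1 contributes only 0.01, so large weights dominate.
def l2_penalty(weights, lam):
    return lam * sum(w * w for w in weights)

data_loss = 1.0  # hypothetical unregularized training loss
total_loss = data_loss + l2_penalty([3.0, 0.1], lam=0.01)
```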
- <p>Modern deep learning complicates the classical bias-variance tradeoff. Very large neural networks (including LLMs) are so overparameterized that they can memorize the training set perfectly, yet they still generalize well. This phenomenon, sometimes called "benign overfitting" or the "double descent" curve, is an active area of research that connects directly to <a class="cross-ref" href="../../part-2-understanding-llms/module-06-pretraining-scaling-laws/section-06.2.html">scaling laws and the Chinchilla findings in Chapter 6</a>. The classical framework remains a valuable mental model, but reality is richer than the simple U-shaped curve suggests.</p>
+ <p>Modern deep learning complicates the classical bias-variance tradeoff. Very large neural networks (including LLMs) are so overparameterized that they can memorize the training set perfectly, yet they still generalize well. This phenomenon, sometimes called "benign overfitting" or the "double descent" curve, is an active area of research that connects directly to <a class="cross-ref" href="../../part-2-understanding-llms/module-06-pretraining-scaling-laws/section-6.2.html">scaling laws and the Chinchilla findings in Chapter 6</a>. The classical framework remains a valuable mental model, but reality is richer than the simple U-shaped curve suggests.</p>
  </div>

  <h2>6. Cross-Validation and Model Selection <span class="level-badge intermediate" title="Intermediate">INTERMEDIATE</span></h2>