From 0d5eaee9200c2fccd3b9dcd73cbd6d2f85b5d43f Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 11:25:43 +0000
Subject: [PATCH 01/18] fix link to Lloyd-Smith et al. in
 superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index 9c8f1e35..9b6e1609 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -59,7 +59,7 @@ go to the [main setup page](../learners/setup.md#software-setup).
 
 <!-- we know -->
 
-From smallpox to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), some infected individuals spread infection to more people than others. Disease transmission is the result of a combination of biological and social factors, and these factors average out to some extent at the population level during a large epidemic. Hence researchers often use population averages to assess the potential for disease to spread. However, in the earlier or later phases of an outbreak, individual differences in infectiousness can be more important. In particular, they increase the chance of superspreading events (SSEs), which can ignite explosive epidemics and also influence the chances of controlling transmission ([Lloyd-Smith et al., 2005](https://wellcomeopenresearch.org/articles/5-83)).
+From smallpox to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), some infected individuals spread infection to more people than others. Disease transmission is the result of a combination of biological and social factors, and these factors average out to some extent at the population level during a large epidemic. Hence researchers often use population averages to assess the potential for disease to spread. However, in the earlier or later phases of an outbreak, individual differences in infectiousness can be more important. In particular, they increase the chance of superspreading events (SSEs), which can ignite explosive epidemics and also influence the chances of controlling transmission ([Lloyd-Smith et al., 2005](https://www.nature.com/articles/nature04153)).
 
 ![**Chains of SARS-CoV-2 transmission in Hong Kong initiated by local or imported cases.** (**a**), Transmission network of a cluster of cases traced back to a collection of four bars across Hong Kong (n = 106). (**b**), Transmission network associated with a wedding without clear infector–infectee pairs but linked back to a preceding social gathering and local source (n = 22). (**c**), Transmission network associated with a temple cluster of undetermined source (n = 19). (**d**), All other clusters of SARS-CoV-2 infections where the source and transmission chain could be determined ([Adam et al., 2020](https://www.nature.com/articles/s41591-020-1092-0)).](fig/see-intro-superspreading.png)
 

From 8dafa10a046d9d57162917689e986b0bebd9428e Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 11:55:18 +0000
Subject: [PATCH 02/18] change citation name in superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index 9b6e1609..07d217aa 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -65,7 +65,7 @@ From smallpox to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), s
 
 <!-- we dont know -->
 
-The [basic reproduction number](../learners/reference.md#basic), $R_{0}$, measures the average number of cases caused by one infectious individual in a entirely susceptible population. Estimates of $R_{0}$ are useful for understanding the average dynamics of an epidemic at the population-level, but can obscure considerable individual variation in infectiousness. This was highlighted during the global emergence of SARS-CoV-2 by numerous ‘superspreading events’ in which certain infectious individuals generated unusually large numbers of secondary cases ([LeClerc et al, 2020](https://wellcomeopenresearch.org/articles/5-83)).
+The [basic reproduction number](../learners/reference.md#basic), $R_{0}$, measures the average number of cases caused by one infectious individual in a entirely susceptible population. Estimates of $R_{0}$ are useful for understanding the average dynamics of an epidemic at the population-level, but can obscure considerable individual variation in infectiousness. This was highlighted during the global emergence of SARS-CoV-2 by numerous ‘superspreading events’ in which certain infectious individuals generated unusually large numbers of secondary cases ([Leclerc et al, 2020](https://wellcomeopenresearch.org/articles/5-83)).
 
 ![**Observed offspring distribution of SARS-CoV-2 transmission in Hong Kong.** N = 91 SARS-CoV-2 infectors, N = 153 terminal infectees and N = 46 sporadic local cases. Histogram bars indicate the proportion of onward transmission per amount of secondary cases. Line corresponds to a fitted negative binomial distribution ([Adam et al., 2020](https://www.nature.com/articles/s41591-020-1092-0)).](fig/see-intro-secondary-cases-fig-b.png){alt='R = 0.58 and k = 0.43.'}
 

From a8fb2cf2e313d566a381b4d4f3fc57d8057290ba Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 11:57:13 +0000
Subject: [PATCH 03/18] fix spelling in callout in superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index 07d217aa..544f97a2 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -96,7 +96,7 @@ library(tidyverse)
 
 ### The double-colon
 
-The double-colon `::` in R let you call a specific function from a package without loading the entire package into the current environment. 
+The double-colon `::` in R lets you call a specific function from a package without loading the entire package into the current environment. 
 
 For example, `dplyr::filter(data, condition)` uses `filter()` from the `{dplyr}` package.
 

From 167975e992da1b9dd265037b580327a5308cc283 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 11:58:56 +0000
Subject: [PATCH 04/18] add pkg install info to callout in
 superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
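
Note on this change: the callout now says the package must be installed for `::` to work. A minimal sketch of that behaviour follows, assuming only packages that ship with R. `notARealPackage` and `some_function()` are hypothetical names used to trigger the failure case; they are not part of the lesson.

```r
# {tools} ships with R but is not attached by default, so it must be
# reached through its namespace rather than through library().
stopifnot(requireNamespace("tools", quietly = TRUE))

# Call one function from {tools} without attaching the package.
ext <- tools::file_ext("superspreading-estimate.Rmd")
print(ext)

# For a package that is not installed, `::` raises an error.
# "notARealPackage" is a hypothetical name used only for illustration.
result <- tryCatch(
  notARealPackage::some_function(),
  error = function(e) "package not installed"
)
print(result)
```

Checking with `requireNamespace()` first is one way to fail early with a clear message; the lesson itself may prefer to rely on the setup instructions instead.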

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index 544f97a2..f3003ea2 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -96,7 +96,7 @@ library(tidyverse)
 
 ### The double-colon
 
-The double-colon `::` in R lets you call a specific function from a package without loading the entire package into the current environment. 
+The double-colon `::` in R lets you call a specific function from a package without loading the entire package into the current environment. The package must be installed.
 
 For example, `dplyr::filter(data, condition)` uses `filter()` from the `{dplyr}` package.
 

From eb089f6c7cc83a1a811422e6b2b167dc4ad0e847 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 11:59:29 +0000
Subject: [PATCH 05/18] fix typo in superspreading-estimate.Rmd header

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index f3003ea2..376c0400 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -104,7 +104,7 @@ This help us remember package functions and avoid namespace conflicts.
 
 :::::::::::::::::::
 
-## The individual reprodution number
+## The individual reproduction number
 
 The individual reproduction number is defined as the number of secondary cases caused by a particular infected individual. 
 

From 7b18a876a4d649e1080edbb6c309ce68fa327d85 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 12:21:05 +0000
Subject: [PATCH 06/18] grammar fix in dropdown in superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
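
Note on this change: the dropdown describes overdispersion as sample variance exceeding the sample mean. A rough sketch of that comparison in base R follows. The mean (`mean_cases`) and dispersion (`k`) values are assumptions chosen for illustration, not estimates from the lesson data.

```r
set.seed(1)

n <- 10000
mean_cases <- 0.6  # assumed mean number of secondary cases
k <- 0.4           # assumed dispersion parameter

# Poisson draws: the variance should be close to the mean.
pois_draws <- rpois(n, lambda = mean_cases)

# Negative binomial draws, using the mean/dispersion
# parameterisation of stats::rnbinom() (mu and size).
nbinom_draws <- rnbinom(n, mu = mean_cases, size = k)

dispersion_summary <- data.frame(
  distribution = c("Poisson", "negative binomial"),
  mean = c(mean(pois_draws), mean(nbinom_draws)),
  variance = c(var(pois_draws), var(nbinom_draws))
)
print(dispersion_summary)
```

Both samples share roughly the same mean, but the negative binomial sample should show a clearly larger variance, which is the overdispersion the text refers to.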

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index 376c0400..c17b64ea 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -338,7 +338,7 @@ For occurrences of associated discrete events we can use **Poisson** or negative
 
 In a Poisson distribution, mean is equal to variance. But when variance is higher than the mean, this is called **overdispersion**. In biological applications, overdispersion occurs and so a negative binomial may be worth considering as an alternative to Poisson distribution.
 
-**Negative binomial** distribution is specially useful for discrete data over an unbounded positive range whose sample variance exceeds the sample mean. In such terms, the observations are overdispersed with respect to a Poisson distribution, for which the mean is equal to the variance.
+The **negative binomial** distribution is specially useful for discrete data over an unbounded positive range whose sample variance exceeds the sample mean. In such terms, the observations are overdispersed with respect to a Poisson distribution, for which the mean is equal to the variance.
 
 In epidemiology, [negative binomial](https://en.wikipedia.org/wiki/Negative_binomial_distribution) have being used to model disease transmission for infectious diseases where the likely number of onward infections may vary considerably from individual to individual and from setting to setting, capturing all variation in infectious histories of individuals, including properties of the biological (i.e. degree of viral shedding) and environmental circumstances (e.g. type and location of contact).
 

From 3c3f07568a571e15f85067ef73366e425ad56717 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 12:23:09 +0000
Subject: [PATCH 07/18] wording fix in dropdown in superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index c17b64ea..b135b435 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -340,7 +340,7 @@ In a Poisson distribution, mean is equal to variance. But when variance is highe
 
 The **negative binomial** distribution is specially useful for discrete data over an unbounded positive range whose sample variance exceeds the sample mean. In such terms, the observations are overdispersed with respect to a Poisson distribution, for which the mean is equal to the variance.
 
-In epidemiology, [negative binomial](https://en.wikipedia.org/wiki/Negative_binomial_distribution) have being used to model disease transmission for infectious diseases where the likely number of onward infections may vary considerably from individual to individual and from setting to setting, capturing all variation in infectious histories of individuals, including properties of the biological (i.e. degree of viral shedding) and environmental circumstances (e.g. type and location of contact).
+In epidemiology, the [negative binomial distribution](https://en.wikipedia.org/wiki/Negative_binomial_distribution) has been used to model the transmission of infectious diseases where the likely number of onward infections may vary considerably from individual to individual and from setting to setting, capturing all variation in the infectious histories of individuals, including biological properties (e.g. degree of viral shedding) and environmental circumstances (e.g. type and location of contact).
 
 :::::::::::::::::::::::::::::
 

From 96e12ae6df79a10738d686ab0df1872c277f85b0 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 12:36:32 +0000
Subject: [PATCH 08/18] clarify wording of callout in
 superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index b135b435..62ddee30 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -488,7 +488,7 @@ ggplot() +
 
 ### Individual-level variation in transmission
 
-The individual-level variation in transmission is defined by the relationship between the mean ($R_{0}$), dispersion ($k$), and the variance of a negative binomial distribution.
+The individual-level variation in transmission is defined by the relationship between the mean ($R_{0}$), dispersion ($k$), which defines the variance of a negative binomial distribution.
 
 The negative binomial model has $variance = R_{0}(1+\frac{R_{0}}{k})$, so smaller values of $k$ indicate greater variance and, consequently, greater **individual-level variation** in transmission.
 

From be2f9d8b11ca47acc3335fe9c64f784018b49168 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 12:42:44 +0000
Subject: [PATCH 09/18] consistent in-text citation style in
 superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index 62ddee30..20c13115 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -65,7 +65,7 @@ From smallpox to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), s
 
 <!-- we dont know -->
 
-The [basic reproduction number](../learners/reference.md#basic), $R_{0}$, measures the average number of cases caused by one infectious individual in a entirely susceptible population. Estimates of $R_{0}$ are useful for understanding the average dynamics of an epidemic at the population-level, but can obscure considerable individual variation in infectiousness. This was highlighted during the global emergence of SARS-CoV-2 by numerous ‘superspreading events’ in which certain infectious individuals generated unusually large numbers of secondary cases ([Leclerc et al, 2020](https://wellcomeopenresearch.org/articles/5-83)).
+The [basic reproduction number](../learners/reference.md#basic), $R_{0}$, measures the average number of cases caused by one infectious individual in an entirely susceptible population. Estimates of $R_{0}$ are useful for understanding the average dynamics of an epidemic at the population level, but can obscure considerable individual variation in infectiousness. This was highlighted during the global emergence of SARS-CoV-2 by numerous ‘superspreading events’ in which certain infectious individuals generated unusually large numbers of secondary cases ([Leclerc et al., 2020](https://wellcomeopenresearch.org/articles/5-83)).
 
 ![**Observed offspring distribution of SARS-CoV-2 transmission in Hong Kong.** N = 91 SARS-CoV-2 infectors, N = 153 terminal infectees and N = 46 sporadic local cases. Histogram bars indicate the proportion of onward transmission per amount of secondary cases. Line corresponds to a fitted negative binomial distribution ([Adam et al., 2020](https://www.nature.com/articles/s41591-020-1092-0)).](fig/see-intro-secondary-cases-fig-b.png){alt='R = 0.58 and k = 0.43.'}
 
@@ -560,7 +560,7 @@ We can use the maximum likelihood estimates from `{fitdistrplus}` to compare dif
 
 ### The dispersion parameter across diseases
 
-Research into sexually transmitted and vector-borne diseases has previously suggested a '20/80' rule, with 20% of individuals contributing at least 80% of the transmission potential ([Woolhouse et al](https://www.pnas.org/doi/10.1073/pnas.94.1.338)). 
+Research into sexually transmitted and vector-borne diseases has previously suggested a '20/80' rule, with 20% of individuals contributing at least 80% of the transmission potential ([Woolhouse et al., 1997](https://www.pnas.org/doi/10.1073/pnas.94.1.338)). 
 
 On its own, the dispersion parameter $k$ is hard to interpret intuitively, and hence converting into proportional summary can enable easier comparison. When we consider a wider range of pathogens, we can see there is no hard and fast rule for the percentage that generates 80% of transmission, but variation does emerge as a common feature of infectious diseases
 

From b977c587245756bcc1a346c812b2fea7e6aaedfb Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 13:09:48 +0000
Subject: [PATCH 10/18] fix typo in figure legend in
 superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index 20c13115..d98c3d9e 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -688,7 +688,7 @@ During an outbreak, it is common to try and reduce transmission by identifying p
 
 In the presence of individual-level variation in transmission, i.e., with an overdispersed offspring distribution, if this primary case is identified, a larger fraction of the transmission chain can be detected by forward tracing each of the contacts of this primary case  ([Endo et al., 2020](https://wellcomeopenresearch.org/articles/5-239/v3)).
 
-![Schematic representation of contact tracing strategies. Black arrows indicate the directions of transmission, blue and Orange arrows, a successful or failed contact tracing, respectivelly. When there is evidence of individual-level variation in transmission, often resulting in superspreading, backward contact tracing from the index case (blue circle) increase the probability to find the primary case (green circle) or clusters with a larger fraction of cases, potentially increasing the number of quarentined cases (yellow circles). [Claire Blackmore, 2021](https://www.paho.org/sites/default/files/backward_contact_tracing_v3_0.pdf)](fig/contact-tracing-strategies.png)
+![Schematic representation of contact tracing strategies. Black arrows indicate the directions of transmission; blue and orange arrows indicate successful and failed contact tracing, respectively. When there is evidence of individual-level variation in transmission, often resulting in superspreading, backward contact tracing from the index case (blue circle) increases the probability of finding the primary case (green circle) or clusters with a larger fraction of cases, potentially increasing the number of quarantined cases (yellow circles). [Claire Blackmore, 2021](https://www.paho.org/sites/default/files/backward_contact_tracing_v3_0.pdf)](fig/contact-tracing-strategies.png)
 
 When there is evidence of individual-level variation (i.e. overdispersion), often resulting in so-called superspreading events, a large proportion of infections may be linked to a small proportion of original clusters. As a result, finding and targeting originating clusters in combination with reducing onwards infection may substantially enhance the effectiveness of tracing methods ([Endo et al., 2020](https://wellcomeopenresearch.org/articles/5-239/v3)). 
 

From 61b2feb6481a70a68343fc3b4bc463b0374697af Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 13:13:29 +0000
Subject: [PATCH 11/18] fix typo in superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index d98c3d9e..8a5d6297 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -692,7 +692,7 @@ In the presence of individual-level variation in transmission, i.e., with an ove
 
 When there is evidence of individual-level variation (i.e. overdispersion), often resulting in so-called superspreading events, a large proportion of infections may be linked to a small proportion of original clusters. As a result, finding and targeting originating clusters in combination with reducing onwards infection may substantially enhance the effectiveness of tracing methods ([Endo et al., 2020](https://wellcomeopenresearch.org/articles/5-239/v3)). 
 
-Empirical evidence focused on evaluating the efficiency of backward tracing lead to 42% more cases identified than forward tracing supporting its implementation when rigorous suppression of transmission is justified ([Raymenants et al., 2022](https://www.nature.com/articles/s41467-022-32531-6))
+Empirical evidence on the efficiency of backward tracing shows that it identified 42% more cases than forward tracing, supporting its implementation when rigorous suppression of transmission is justified ([Raymenants et al., 2022](https://www.nature.com/articles/s41467-022-32531-6)).
 
 
 ## Probability of cases in a given cluster

From 08b0f6f0c983c1a3dee9b03ca7ab80adc6067e34 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 13:15:27 +0000
Subject: [PATCH 12/18] fix wording in superspreading-estimate.Rmd

---
 episodes/superspreading-estimate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index 8a5d6297..c164edc8 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -735,7 +735,7 @@ cluster_probability_percent <- cluster_probability %>%
 
 Even though we have an $R<1$, a highly overdispersed offspring distribution ($k=0.02$) means that if we detect a new case, there is a `r cluster_probability_percent` probability they originated from a cluster of 25 infections or more. Hence, by following a backwards strategy, contact tracing efforts will increase the probability of successfully contain and quarantining this large number of earlier infected individuals, rather than simply focusing on the new case, who is likely to have infected nobody (because $k$ is very small).
 
-We can also use this number to prevent gathering of certain sized to reduce the epidemic by preventing potential superspreading events. Interventions can target to reduce the reproduction number in order to reduce the probability of having clusters of secondary cases.
+We can also use this number to inform limits on gatherings of a certain size, reducing the epidemic by preventing potential superspreading events. Interventions can aim to reduce the reproduction number in order to reduce the probability of having clusters of secondary cases.
 
 
 ::::::::::::::::::::::::::::::::: challenge

From 9bd59d0788884c6bb64c366ed2c99cdc99083eb7 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 14:40:49 +0000
Subject: [PATCH 13/18] update The double-colon callout in
 superspreading-simulate.Rmd to match superspreading-estimate.Rmd

---
 episodes/superspreading-simulate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-simulate.Rmd b/episodes/superspreading-simulate.Rmd
index 48b2693d..12521b7e 100644
--- a/episodes/superspreading-simulate.Rmd
+++ b/episodes/superspreading-simulate.Rmd
@@ -162,7 +162,7 @@ library(tidyverse)
 
 ### The double-colon
 
-The double-colon `::` in R let you call a specific function from a package without loading the entire package into the current environment. 
+The double-colon `::` in R lets you call a specific function from a package without loading the entire package into the current environment. The package must be installed.
 
 For example, `dplyr::filter(data, condition)` uses `filter()` from the `{dplyr}` package.
 

From 45986edd28dd2871e68d09048755de652af63a8c Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 15:03:02 +0000
Subject: [PATCH 14/18] consistent formatting of bullet points in
 superspreading-simulate.Rmd

---
 episodes/superspreading-simulate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-simulate.Rmd b/episodes/superspreading-simulate.Rmd
index 12521b7e..18100e2b 100644
--- a/episodes/superspreading-simulate.Rmd
+++ b/episodes/superspreading-simulate.Rmd
@@ -337,7 +337,7 @@ epichains::simulate_chains(
 
 - **simulation controls** (`n_chains` and `statistic`),
 - **offspring distribution** (`offspring_dist` and required distribution parameters), and
-- generation time (`generation_time`).
+- **generation time** (`generation_time`).
 
 In the lines above, we described how to specify the offspring distribution and generation time. The **simulation controls** include at least two arguments:
 

From 70b8024d4a77bfeec14c30a5ca4bfb8bf18b9faa Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 15:07:57 +0000
Subject: [PATCH 15/18] fix definition of set.seed() in
 superspreading-simulate.Rmd

---
 episodes/superspreading-simulate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
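
Note on this change: the new definition says a fixed seed makes later random draws identical on every run. A small sketch in base R follows, using `rnorm()` as the example random function. The seed value 33 is arbitrary here.

```r
# Same seed, same draws.
set.seed(33)
first_run <- rnorm(5)

set.seed(33)
second_run <- rnorm(5)

print(identical(first_run, second_run))

# Without resetting the seed, the generator continues its sequence,
# so the next call returns different values.
third_run <- rnorm(5)
print(identical(second_run, third_run))
```

The same pattern should apply to any function that draws from R's random number generator, which is why the lesson sets the seed before each simulation.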

diff --git a/episodes/superspreading-simulate.Rmd b/episodes/superspreading-simulate.Rmd
index 18100e2b..bdac2189 100644
--- a/episodes/superspreading-simulate.Rmd
+++ b/episodes/superspreading-simulate.Rmd
@@ -362,7 +362,7 @@ We can use `simulate_chains()` to create multiple chains and increase the probab
 
 We need to one additional element:
 
-- `set.seed(<integer>)`, which is a random number generator function with a specified seed value, the `<integer>` number, to ensure consistent results across different runs of the code.
+- `set.seed(<integer>)` is a function used to initialise a pseudo-random number generator. By specifying a seed value (the `<integer>`), you ensure that the sequence of numbers produced by subsequent random functions - like `rnorm()` or `simulate_chains()` — is identical every time the code is executed.
 
 With this configuration, each **chain** will represent **one initial case**. These cases per chain are independent, isolated, and without interactions. This means that each chain will have their own pool of susceptibles, which you can configure by using the `pop` or `percent_immune` arguments.
 

From b3aeadec9a5e016642dde80b35add165937258c8 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 15:10:36 +0000
Subject: [PATCH 16/18] change wording in challenge box in
 superspreading-simulate.Rmd

---
 episodes/superspreading-simulate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-simulate.Rmd b/episodes/superspreading-simulate.Rmd
index bdac2189..dbacbb96 100644
--- a/episodes/superspreading-simulate.Rmd
+++ b/episodes/superspreading-simulate.Rmd
@@ -394,7 +394,7 @@ We can visually count how many chains reach to more than 100 infected cases, wit
 
 Use the last run of `epichains::simulate_chains()` for simulating multiple chains. Change the `statistic` from `"size"` to `"length"`. Run the `summary()` function.
 
-- What chain feature this output count for?
+- What chain feature does this output show?
 
 ::::::::: hint
 

From 5dcff3eaca0d4bb15f87fc3b66cc5c21424f51a9 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 15:25:08 +0000
Subject: [PATCH 17/18] update wording in superspreading-simulate.Rmd

---
 episodes/superspreading-simulate.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/superspreading-simulate.Rmd b/episodes/superspreading-simulate.Rmd
index dbacbb96..4a9bbb47 100644
--- a/episodes/superspreading-simulate.Rmd
+++ b/episodes/superspreading-simulate.Rmd
@@ -657,7 +657,7 @@ simulated_chains_map %>%
 
 To increase the probability of simulating uncontrolled outbreak projections given an overdispersed offspring distribution, let's simulate **1000 transmission chains** with 1 initial case each starting at day 0.
 
-We will create a multiple simulation **without** iteration for this section:
+For this section, we will run a simulation with multiple replicates, **without** iteration:
 
 ```{r}
 set.seed(33)

From dfc48ef97783cb8ad36c06f9026bf86820cbcad0 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <joshua.lambert@lshtm.ac.uk>
Date: Mon, 23 Mar 2026 16:45:31 +0000
Subject: [PATCH 18/18] Minor edits to superspreading text from code review

Co-authored-by: Andree Valle Campos <avallecam@gmail.com>
---
 episodes/superspreading-estimate.Rmd | 2 +-
 episodes/superspreading-simulate.Rmd | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
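
Note on this change: the reworded sentence says the mean and dispersion together define the variance, through $variance = R_{0}(1+\frac{R_{0}}{k})$. A short worked sketch of that formula follows. The helper `nbinom_variance()` and the values of `R` and `k_values` are illustrative assumptions, not part of the lesson code.

```r
# Variance of a negative binomial offspring distribution with
# mean R and dispersion k, following the formula in the lesson text.
# nbinom_variance() is a hypothetical helper written for this note.
nbinom_variance <- function(R, k) {
  R * (1 + R / k)
}

R <- 1.5                         # illustrative mean
k_values <- c(0.1, 0.5, 1, 10)   # illustrative dispersion values

variance_by_k <- nbinom_variance(R, k_values)
print(data.frame(k = k_values, variance = variance_by_k))

# Cross-check one value against simulated draws from stats::rnbinom().
set.seed(42)
simulated <- rnbinom(1e5, mu = R, size = 0.5)
print(c(formula = nbinom_variance(R, 0.5), simulated = var(simulated)))
```

Holding the mean fixed, the variance falls as $k$ grows, which matches the statement that smaller $k$ means greater individual-level variation.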

diff --git a/episodes/superspreading-estimate.Rmd b/episodes/superspreading-estimate.Rmd
index c164edc8..c1e50d51 100644
--- a/episodes/superspreading-estimate.Rmd
+++ b/episodes/superspreading-estimate.Rmd
@@ -488,7 +488,7 @@ ggplot() +
 
 ### Individual-level variation in transmission
 
-The individual-level variation in transmission is defined by the relationship between the mean ($R_{0}$), dispersion ($k$), which defines the variance of a negative binomial distribution.
+The individual-level variation in transmission is defined by the relationship between the mean ($R_{0}$) and dispersion ($k$), which together define the variance of a negative binomial distribution.
 
 The negative binomial model has $variance = R_{0}(1+\frac{R_{0}}{k})$, so smaller values of $k$ indicate greater variance and, consequently, greater **individual-level variation** in transmission.
 
diff --git a/episodes/superspreading-simulate.Rmd b/episodes/superspreading-simulate.Rmd
index 4a9bbb47..f74aab3b 100644
--- a/episodes/superspreading-simulate.Rmd
+++ b/episodes/superspreading-simulate.Rmd
@@ -362,7 +362,7 @@ We can use `simulate_chains()` to create multiple chains and increase the probab
 
 We need to one additional element:
 
-- `set.seed(<integer>)` is a function used to initialise a pseudo-random number generator. By specifying a seed value (the `<integer>`), you ensure that the sequence of numbers produced by subsequent random functions - like `rnorm()` or `simulate_chains()` — is identical every time the code is executed.
+- `set.seed(<integer>)` is a function used to initialise a pseudo-random number generator. By specifying a seed value (the `<integer>`), you ensure that the sequence of numbers produced by subsequent random functions, like `rnorm()` or `simulate_chains()`, is identical every time the code is executed.
 
 With this configuration, each **chain** will represent **one initial case**. These cases per chain are independent, isolated, and without interactions. This means that each chain will have their own pool of susceptibles, which you can configure by using the `pop` or `percent_immune` arguments.