Conditional Uncertainty Reduction: Case-Guided Sequential Experimental Design Tutorial#17
Conversation
@thevolatilebit would you mind checking if our action for JuliaFormatter is appropriately updated (https://github.com/Merck/CEEDesigns.jl/blob/main/.github/workflows/Formatter.yml)? I think it is giving a misleading failure. When running manually below, it gives a success message but the action is still failing. Do we need to update the .yml file (e.g. https://github.com/julia-actions/julia-format/blob/master/action.yml)?
@gwenjones can you rebase from main and try to see if everything builds properly? It looks like on main @thevolatilebit switched to using Runic rather than JuliaFormatter, and it's working as expected. Make sure you have the environment in
thevolatilebit
left a comment
Thank you, @gwenjones, for your contribution. Overall this is a strong addition that reproduces the Chen et al. (2026) CNS case study clearly.
Most comments below are about tightening the narrative (context, math typesetting, consistent nomenclature) and a few structural points around the ensemble plotting module.
Narrative / prose
1. Add context on conditional likelihood in the opening
Currently (L. 7) the term conditional likelihood is used without definition. Suggest inserting a paragraph after L. 7:
In our setting, the conditional likelihood is the posterior probability mass the generative model assigns to the target variable $y$ falling in a "desirable" range, given the evidence accumulated so far, i.e. $L(s) = P(y \in Y^\star \mid \text{evidence}(s))$. Concretely in this tutorial, $L(s) = P(0.5 \leq k_\text{puu} \leq 1 \mid \text{QSAR and assay readouts})$. Because the planner is free to stop only once $L(s) \geq \tau$, the policy is steered toward evidence states that are not only low-uncertainty but also confident the compound lies in the promising region. See the [Generative Experimental Designs tutorial](@ref simple_generative) for the underlying similarity-weighted belief.
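To make the suggested definition concrete, here is a minimal, language-agnostic sketch (in Python rather than Julia, purely for illustration) of how $L(s)$ falls out of a similarity-weighted belief over historical compounds. The function name, the kernel form $\exp(-\lambda d)$, and all values are hypothetical, not CEEDesigns.jl API:

```python
import math

def conditional_likelihood(targets, distances, lam, lo=0.5, hi=1.0):
    """Sketch of L(s) = P(lo <= y <= hi | evidence(s)).

    targets   -- historical k_puu values, one per reference compound
    distances -- distance of the current evidence state to each reference
                 compound (already aggregated over observed features)
    lam       -- distance scale; a larger lam makes the kernel more selective
    """
    weights = [math.exp(-lam * d) for d in distances]  # similarity kernel
    total = sum(weights)
    # Posterior mass on compounds whose target lies in the desirable range.
    mass = sum(w for w, y in zip(weights, targets) if lo <= y <= hi)
    return mass / total

# Toy check: the zero-distance compound (y = 0.7, in range) dominates.
L = conditional_likelihood([0.7, 0.2, 0.9], [0.0, 1.0, 1.0], lam=5.0)
```

The stopping rule $L(s) \geq \tau$ then simply compares this number against the chosen threshold.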
2. Explicit paper reference in the opening
The paper is only referenced in passing under "Dataset" (L. 22). Add a sentence after L. 4:
This tutorial implements the case-guided sequential experimental design methodology of Chen et al., 2026, reproducing the CNS/brain-penetration case study from that paper using CEEDesigns.jl.
3. Preamble block at L. 38–43
The `using Plots` / `using CEEDesigns, CEEDesigns.GenerativeDesigns` / `seed!(1)` block is pure setup. Consider labelling it as a "preamble" and hiding it in rendered output via Literate/Documenter while still executing it. Documenter has `@setup` blocks for exactly this purpose: https://documenter.juliadocs.org/stable/man/syntax/#reference-at-setup If kept in the tutorial, introduce it with one sentence: "The following preamble loads the necessary packages and fixes the RNG seed for reproducibility."
4. Rewrite L. 47
Current: "We distinguish between cheap, in-silico predictions and expensive, physical assays by assigning different distance scales (λ values)."
Suggested:
We discount the informativeness gap (or "fidelity gap") between cheap, in-silico predictions and expensive, physical assays by assigning different distance scales (per-feature $\lambda_k$): a smaller $\lambda$ makes a feature less decisive for similarity, so two compounds that differ on that feature are still considered close neighbors in the weighted kernel.
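The effect of the per-feature scales can be sketched numerically. This is a hypothetical illustration in Python (not the CEEDesigns.jl implementation), assuming a kernel of the form $\exp(-\sum_k \lambda_k d_k)$ and using the tutorial's $\lambda = 50$ / $\lambda = 200$ split:

```python
import math

def similarity_weight(feature_dists, lambdas):
    """Weight of a reference compound given per-feature squared distances.

    feature_dists -- {feature: distance on that feature}
    lambdas       -- {feature: lambda_k}; a smaller lambda_k downweights
                     the feature, so disagreement on it matters less
    """
    return math.exp(-sum(lambdas[k] * d for k, d in feature_dists.items()))

# The same disagreement (d = 0.04) on a cheap in-silico feature
# (lambda = 50) is forgiven far more than on a physical assay (lambda = 200).
w_insilico = similarity_weight({"QSAR_PgP": 0.04}, {"QSAR_PgP": 50.0})
w_physical = similarity_weight({"PgP_assay": 0.04}, {"PgP_assay": 200.0})
```

With these numbers the in-silico disagreement still leaves a weight of $e^{-2} \approx 0.14$, while the physical disagreement collapses it to $e^{-8} \approx 3\times10^{-4}$.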
5. Acknowledge the heuristic choice of λ values (paragraph at L. 61–72)
Add a closing sentence:
These specific values ($\lambda = 50$ for in-silico, $\lambda = 200$ for physical measurements) are set heuristically from empirical tuning on this dataset — they are not derived from any theoretical optimality result, and practitioners should re-tune them (e.g. via cross-validated predictive log-likelihood of $k_\text{puu}$) when applying the workflow to other data.
6. Define the ensemble approach before using it (≈ L. 206 / L. 234)
Before L. 211 ("Run a single ensemble of N = 5 independent MCTS planners…"), add:
By ensemble we mean running $N$ independent MCTS-DPW planners on the same initial evidence. Each planner yields its own Pareto front of candidate designs; aggregating by majority vote over the selected action sets at each uncertainty level yields a more robust recommendation (the MLASP) than any single run, and the spread across runs gives an empirical measure of policy variance.
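The aggregation step being suggested can be sketched in a few lines. This is a hypothetical Python illustration (function and data shapes are mine, not the tutorial's `ensemble_fronts.jl` code):

```python
from collections import Counter

def majority_vote(ensemble_fronts):
    """Aggregate N planners' fronts by majority vote per uncertainty threshold.

    ensemble_fronts -- list (one per planner) of {threshold: action_set},
                       where action_set is a frozenset of assay names.
    Returns, per threshold, the winning action set plus its vote share;
    the vote share doubles as an empirical measure of policy variance.
    """
    result = {}
    for t in ensemble_fronts[0]:
        votes = Counter(front[t] for front in ensemble_fronts)
        winner, count = votes.most_common(1)[0]
        result[t] = (winner, count / len(ensemble_fronts))
    return result

fronts = [
    {0.9: frozenset({"PgP", "BCRP"})},
    {0.9: frozenset({"PgP", "BCRP"})},
    {0.9: frozenset({"PgP"})},
]
mlasp = majority_vote(fronts)  # {0.9: ({"PgP", "BCRP"}, 2/3)}
```

A vote share near 1.0 at every threshold would indicate the five planners agree almost everywhere; a share near $1/N$ would flag an unstable policy.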
7. "Initial posterior probability" reads awkwardly (L. 293)
The awkwardness is in "initial," not "posterior" — "initial" vaguely signals "before assays" but invites misreadings. Suggest dropping it and naming the conditioning set explicitly, e.g.:
`P_kpuu_in_range` — posterior probability $P(k_\text{puu} \in [0.5, 1.0] \mid \text{QSAR})$ that the compound lies in the desirable range given QSAR features alone (before any physical assays).
Consistency / typography
8. Math typesetting — use $…$ for math, backticks only for code identifiers
Several places mix prose/code/math styles:
- L. 35: `H(s) ≤ ε`, `L(s) ≥ τ` → $H(s) \leq \varepsilon$, $L(s) \geq \tau$.
- L. 47, 61, 63: "λ values", "λ_k", "λ = 50", "λ = 200" → $\lambda$, $\lambda_k$, $\lambda = 50$, $\lambda = 200$.
- L. 80–83 (bullet list): `H(s)`, `w_i(s)`, `L(s)` are plain text — should be $H(s)$, $w_i(s)$, $L(s)$.
- L. 111, 114: "τ" in prose → $\tau$.
- L. 133: "0.5 ≤ kpuu ≤ 1", "QSAR_PgP < 2", "QSAR_BCRP < 2" → math mode.
- L. 212: "τ = 0.9" in prose → $\tau = 0.9$.
Code identifiers (`kpuu`, `sampler`, `weights`) correctly stay in backticks.
9. kpuu vs k_puu inconsistency
- Narrative uses `k_puu` at L. 13, 18.
- Everything else (code + prose) uses `kpuu`: L. 55, 80, 87, 97, 106, 111, 114, 133, 142, 143, 149–155, 162, 173, 293, 323, 366.

The dataset column is `kpuu`. Prefer $k_\text{puu}$ in narrative math and `kpuu` only when naming the data column — change L. 13 and 18 accordingly.
10. Typo
- L. 380: `examiune` → `examine`.
Additional issues
11. Missing trailing semicolon at L. 49
`in_silico = [...]` lacks a trailing `;` (compare L. 59, `physical = [...];`). In Literate/Documenter output this will print the vector into the rendered doc, which is likely unintended.
12. Minor prose fixes
- L. 212: "5 thresholds spaces evenly" → "spaced evenly".
- L. 298–299: "Provided is at each uncertainty threshold, the assay/s that won the majority vote…" reads rough. Suggest: "For each uncertainty threshold we report the assay set that won the majority vote across the ensemble."
13. Clarify solver depth (L. 119)
"depth = 11 would allow the planner to look ahead through all possible assay orderings" — worth a one-line parenthetical tying "11" to the 4 experiments defined at L. 102–107 so a reader can reconstruct the count.
14. Silent drop of time cost (L. 234–235)
`costs_tradeoff = (1.0, 0.0)` ignores the time axis even though both monetary and time costs are introduced at L. 95. Add a sentence: "we trade off cost vs uncertainty only; extending to time is straightforward by setting the second weight > 0."
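For the suggested sentence, the scalarization involved is just a weighted sum; a hypothetical sketch (Python for illustration; the function name and example costs are mine, not the tutorial's):

```python
def combined_cost(monetary, time, weights=(1.0, 0.0)):
    """Scalarize an experiment's (money, time) costs with tradeoff weights.

    weights = (1.0, 0.0) mirrors the tutorial's choice: optimize monetary
    cost only. Setting the second weight > 0 brings the time axis back in.
    """
    w_money, w_time = weights
    return w_money * monetary + w_time * time

c_money_only = combined_cost(500.0, 14.0)              # -> 500.0
c_with_time = combined_cost(500.0, 14.0, (1.0, 10.0))  # -> 640.0
```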
15. Committed generated artifacts
`docs/make.jl` copies `tutorials/*.jl` into `docs/src/tutorials/` on build (L. 19–22) and Literate regenerates the `.md` (L. 28–35). Both `docs/src/tutorials/ConditionalUncertaintyReduction.jl` and the `.md` appear to have been committed in this PR. I understand this follows the way other tutorials are introduced. That said, shall we consider adding them to `.gitignore`? Committing generated artifacts will cause drift in future PRs.
16. src/ensemble_fronts.jl — scope and hygiene
~630 lines of plotting heuristics. Worth considering whether this belongs in CEEDesigns proper or in a tutorial-support module. Regardless, a few small things:
- L. 67–68: stray inline-comment fragments (`# :global, :per_threshold, or :none` appears twice, and `# Calculate normalized frequencies…`) are leaked into the docstring body.
- L. 249–252: both branches of `if "Action_Set" in names(df_copy)` produce identical `title_text`; the branch can be collapsed.
- L. 443, 450: the fallback for the unmatched point uses `best_row.Average_Utility / scale_factor`, but the main loop relies on exact `Float64` equality of three fields — this is brittle. Either match by row index or document why the fallback is unreachable.
- Neither `ensemble_to_dataframe` nor `plot_ensemble_pareto` is exported from `CEEDesigns.jl` (L. 4 only exports `front, plot_front, make_labels, plot_evals`). The tutorial uses `using CEEDesigns: ensemble_to_dataframe, plot_ensemble_pareto` — fine, but inconsistent with how the other helpers are surfaced. Either export both or mark them explicitly as internal helpers.
Thanks for the comments! All are implemented. A couple notes:
Please let me know if these changes are acceptable, thanks!
@gwenjones Looks great, merged, thank you!
- `ConditionalUncertaintyReduction.jl` outlining the conditional (constraint-aware) uncertainty-reduction MDP and applying it to the CNS example
- `src/ensemble_fronts.jl` for plotting the ensemble Pareto fronts
- `cns_assays.csv`