Merged
84 commits
e1b19a4
Add SVGs for doc
Hrovatin Jan 16, 2026
3ee3a2a
Add optimization landscape SVG
AdrianSosic Feb 24, 2026
7c65827
Update images
Hrovatin Feb 27, 2026
9a5231f
Designing experiments checklists FAQ - unfinished
Hrovatin Aug 3, 2025
3e484c2
Expand checklist
Hrovatin Aug 5, 2025
7036e7e
Add design checklist to FAQ
Hrovatin Oct 23, 2025
e0ba087
Make introduction clearer
Hrovatin Oct 23, 2025
a6ef866
Format readme introduction
Hrovatin Oct 23, 2025
5d97211
Simplify readme introduction language
Hrovatin Oct 24, 2025
bc86405
Move API overview diagram into this repository
Hrovatin Oct 24, 2025
cc9078d
Rename API overview drawio file
Hrovatin Oct 24, 2025
5431b08
Make readme more accessible for new users
Hrovatin Oct 24, 2025
0a3ab21
Format FAQ
Hrovatin Oct 24, 2025
920b616
Update changelog
Hrovatin Oct 24, 2025
f8a482d
Add a few more examples
Hrovatin Oct 27, 2025
86bc7fa
Fix typo
Hrovatin Oct 27, 2025
de32044
Add emoji to example use cases
Hrovatin Oct 28, 2025
bf2164c
reword
Hrovatin Oct 28, 2025
e480028
Intro of Checklist for designing BayBE optimization campaigns
Hrovatin Oct 28, 2025
f037f63
reword
Hrovatin Oct 28, 2025
3e42e5c
typo
Hrovatin Oct 28, 2025
c1e8c4c
Force documentation build - TEMPORARY
Hrovatin Oct 29, 2025
2d03292
Remove offending link and thus reset workflow to original state
Hrovatin Oct 29, 2025
b0f0e0a
Fix the use of the word experiment
Hrovatin Oct 29, 2025
5560a52
Use telegraph style for all headings
Hrovatin Oct 29, 2025
d17acfd
reword
Hrovatin Oct 29, 2025
419f716
rweord
Hrovatin Oct 30, 2025
567d81e
Add example use case
Hrovatin Oct 30, 2025
1dedba4
Remove oxford comma
Hrovatin Oct 30, 2025
d90ce06
typo
Hrovatin Oct 30, 2025
e4c48f3
reword
Hrovatin Oct 30, 2025
b6e839e
Fix file name typo
Hrovatin Oct 30, 2025
f9c5aab
Try not using svg with automatic light/dark mode
Hrovatin Oct 30, 2025
f4aecd9
fix image links in readme
Hrovatin Oct 30, 2025
6750d43
Remove changelog entry
Hrovatin Nov 4, 2025
c795114
Fix typo
Hrovatin Nov 4, 2025
990e44d
Remove unnecessary wording
Hrovatin Nov 4, 2025
a2a0a89
Reword to match API terms properly
Hrovatin Nov 4, 2025
b8ed750
Reword to properly match the API wording
Hrovatin Nov 4, 2025
db9fd81
Fix readme image links
Hrovatin Nov 4, 2025
c184149
reword
Hrovatin Nov 5, 2025
330846a
reword
Hrovatin Nov 5, 2025
409b52a
reword
Hrovatin Nov 5, 2025
32fc26b
reweord
Hrovatin Nov 5, 2025
98b0c1b
Reword features
Hrovatin Nov 5, 2025
cb9906e
Reword user to you
Hrovatin Nov 5, 2025
f4290ed
Unify wording of setting (BO method) and configuration (chosen parame…
Hrovatin Nov 6, 2025
7072d80
Fix rendering style
Hrovatin Nov 6, 2025
72aaa75
reword
Hrovatin Nov 6, 2025
23eb55d
Remove setup checklist
Hrovatin Nov 6, 2025
cc8a04d
Change "design" step to "setup" step
Hrovatin Nov 17, 2025
70b479d
Add source code generating optimization landscape plot
AdrianSosic Dec 17, 2025
6570dc7
Add automatically switching lookup plot
Hrovatin Dec 19, 2025
4205e33
Group features list
Hrovatin Dec 19, 2025
2b30360
Add box formats examples for the feature list
Hrovatin Dec 19, 2025
f178e90
Add target transformations
Hrovatin Dec 19, 2025
57f74e9
Update diagram files
Hrovatin Dec 19, 2025
bc2627a
Replace with automatic diagram
Hrovatin Dec 19, 2025
d77786b
Typo
Hrovatin Jan 15, 2026
c885b4d
Clean the landscape plotting utilities
Hrovatin Jan 15, 2026
a1c09da
Update plots in drawio
Hrovatin Jan 15, 2026
ec5b2ec
Update diagram svg files in readme
Hrovatin Jan 15, 2026
d2b3a84
Update the diagram to reduce size
Hrovatin Jan 16, 2026
3321c57
typo
Hrovatin Jan 16, 2026
072a667
Convert feature list to dropdown box list and add links
Hrovatin Jan 16, 2026
fa49745
Remove links accidentally added to images
Hrovatin Jan 16, 2026
3eaff57
reword
Hrovatin Jan 29, 2026
a2c8d3d
Use svg landscape plot in diagram
Hrovatin Feb 9, 2026
6ea0655
Disable link check
Hrovatin Dec 19, 2025
0742729
Force build
Hrovatin Feb 9, 2026
49d4f37
Update API diagram image
Hrovatin Feb 9, 2026
f4fc0b1
Update links to user guide
Hrovatin Feb 9, 2026
55b6953
Remove bold text in feature bulet points
Hrovatin Feb 9, 2026
b008258
Reword feature bullet points
Hrovatin Feb 9, 2026
1d3bad7
Update colors
Hrovatin Feb 26, 2026
4e36de8
Bugfix - accidentally introduced code
Hrovatin Mar 29, 2026
2a93ea0
Bugfix - accidentally introduced code
Hrovatin Mar 29, 2026
1e2a059
Fix empty lines
Hrovatin Mar 29, 2026
5b891a3
Formating
Hrovatin Mar 31, 2026
4940806
Reword
Hrovatin Apr 1, 2026
2342c87
Small updates
Hrovatin Apr 1, 2026
ed58da6
reword
Hrovatin Apr 1, 2026
83bb051
reword
Hrovatin Apr 11, 2026
1f1895b
Remove automatic lookup file and just use the light one
Hrovatin Apr 11, 2026
4 changes: 4 additions & 0 deletions .gitignore
@@ -36,6 +36,10 @@ htmlcov
# Pictures created by backtesting
*.png

# Picture components for documentation
docs/scripts/graphics/*.svg
docs/scripts/graphics/*.png

# Folders that are temporarily created when building the documentation
docs/_autosummary
docs/_build
231 changes: 166 additions & 65 deletions README.md
@@ -27,35 +27,112 @@

# BayBE — A Bayesian Back End for Design of Experiments

The **Bay**esian **B**ack **E**nd (**BayBE**) is a general-purpose toolbox for Bayesian Design
of Experiments, focusing on additions that enable real-world experimental campaigns.
The **Bay**esian **B**ack **E**nd (**BayBE**) helps to find **good parameter configurations**
within complex parameter search spaces.

## 🔋 Batteries Included
Besides its core functionality to perform a typical recommend-measure loop, BayBE
offers a range of ✨**built‑in features**✨ crucial for real-world use cases.
The following provides a non-comprehensive overview:

- 🛠️ Custom parameter encodings: Improve your campaign with domain knowledge
- 🧪 Built-in chemical encodings: Improve your campaign with chemical knowledge
- 🎯 Numerical and binary targets with min, max and match objectives
- ⚖️ Multi-target support via Pareto optimization and desirability scalarization
- 🔍 Insights: Easily analyze feature importance and model behavior
- 🎭 Hybrid (mixed continuous and discrete) spaces
- 🚀 Transfer learning: Mix data from multiple campaigns and accelerate optimization
- 🎰 Bandit models: Efficiently find the best among many options in noisy environments (e.g. A/B Testing)
- 🔢 Cardinality constraints: Control the number of active factors in your design
- 🌎 Distributed workflows: Run campaigns asynchronously with pending experiments and partial measurements
- 🎓 Active learning: Perform smart data acquisition campaigns
- ⚙️ Custom surrogate models: Enhance your predictions through mechanistic understanding
- 📈 Comprehensive backtest, simulation and imputation utilities: Benchmark and find your best settings
- 📝 Fully typed and hypothesis-tested: Robust code base
- 🔄 All objects are fully (de-)serializable: Useful for storing results in databases or use in wrappers like APIs
<div align="center">

![Complex Search Space](https://raw.githubusercontent.com/Hrovatin/baybe/docs/easy_access/docs/_static/complex_search_space.svg)

</div>

BayBE can help to solve many real-world optimization problems, such as:

- 🧪 Find chemical reaction conditions or process parameters
- 🥣 Create materials, chemical mixtures or formulations with desired properties
- ✈️ Optimize the 3D shape of a physical object
- 🖥️ Optimize a virtual simulation
- ⚙️ Select model hyperparameters
- 🫖 Find tasty espresso machine settings

This is achieved via **Bayesian Design of Experiments**,
which helps to efficiently navigate parameter search spaces.
It balances
exploitation of parameter space regions known to lead to good outcomes
and exploration of unknown regions.
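
One common way to express this trade-off is an upper-confidence-bound score, which adds a multiple of the model's uncertainty to its predicted mean. The sketch below is purely illustrative and is not BayBE's implementation: the candidate names, predicted means, uncertainties and the `beta` weight are made-up assumptions.

```python
# Illustrative only: toy predictions for three candidate configurations.
# The numbers and the `beta` weight are assumptions, not BayBE output.
candidates = {
    "A": {"mean": 80.0, "std": 1.0},   # well explored, good outcome
    "B": {"mean": 70.0, "std": 10.0},  # poorly explored, uncertain
    "C": {"mean": 60.0, "std": 2.0},   # well explored, poor outcome
}


def ucb(mean: float, std: float, beta: float) -> float:
    """Upper confidence bound: predicted mean plus weighted uncertainty."""
    return mean + beta * std


def pick(beta: float) -> str:
    """Return the candidate with the highest UCB score for a given beta."""
    return max(candidates, key=lambda name: ucb(beta=beta, **candidates[name]))


print(pick(beta=0.0))  # pure exploitation: only the predicted mean counts
print(pick(beta=2.0))  # uncertainty is rewarded, favoring exploration
```

With `beta=0.0` the well-explored candidate with the best predicted mean wins, while a larger `beta` shifts the choice toward the uncertain candidate.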

BayBE provides a **general-purpose toolbox** for Bayesian Design of Experiments,
focusing on making this procedure easily accessible for real-world experiments.
Its utility has already been demonstrated in a variety of [real-world experimental campaigns](#citation) in both industry and academia.

## 🔋 Batteries Included
BayBE offers a range of ✨**built&#8209;in&nbsp;features**✨, including:

<details style="border:2px solid #535353; border-radius: 7px; margin: 5px;">
<summary style="background-color: #535353; color: white; padding: 10px; border-radius: 5px; cursor: pointer;">
🛠️ Flexible modeling options
</summary>
<div style="padding: 10px;">
<ul>
<li>Use both continuous and discrete parameters within a single <a href="https://emdgroup.github.io/baybe/stable/examples/Searchspaces/hybrid_space.html">hybrid search space</a>.</li>
<li>Exclude undesired or impossible parameter configurations (e.g., to define a maximal number of mixture components) using <a href="https://emdgroup.github.io/baybe/stable/userguide/constraints.html">constraints</a>.</li>
<li>Choose between different optimization strategies to balance exploration and exploitation of the search space:
<ul>
<li>Smartly acquire training data for model building via <a href="https://emdgroup.github.io/baybe/stable/userguide/active_learning.html">active learning</a>.</li>
<li>Conduct A/B testing via <a href="https://emdgroup.github.io/baybe/stable/examples/Multi_Armed_Bandit/Multi_Armed_Bandit.html">bandit models</a>.</li>
</ul>
</li>
<li>Specify the desired target value via <a href="https://emdgroup.github.io/baybe/stable/userguide/transformations.html">target transformations</a>.</li>
<li>Optimize multiple targets at the same time via <a href="https://emdgroup.github.io/baybe/stable/userguide/objectives.html#paretoobjective">Pareto optimization</a> or <a href="https://emdgroup.github.io/baybe/stable/userguide/objectives.html#desirabilityobjective">desirability scalarization</a>.</li>
</ul>
</div>
</details>
<details style="border:2px solid #535353; border-radius: 7px; margin: 5px;">
<summary style="background-color: #535353; color: white; padding: 10px; border-radius: 5px; cursor: pointer;">
📚 Mechanisms for leveraging additional information
</summary>
<div style="padding: 10px;">
<ul>
<li>Capture relationships between categories via <a href="https://emdgroup.github.io/baybe/stable/userguide/parameters.html#customdiscreteparameter">custom encodings for categorical</a> data.</li>
<li>Use built-in <a href="https://emdgroup.github.io/baybe/stable/userguide/parameters.html#substanceparameter">chemical encodings</a> for chemistry-related parameters.</li>
<li>Add mechanistic process understanding via <a href="https://emdgroup.github.io/baybe/stable/userguide/surrogates.html#using-custom-models">custom surrogate</a> models.</li>
<li>Leverage additional data from similar campaigns to accelerate optimization via <a href="https://emdgroup.github.io/baybe/stable/userguide/transfer_learning.html">transfer learning</a>.</li>
</ul>
</div>
</details>
<details style="border:2px solid #535353; border-radius: 7px; margin: 5px;">
<summary style="background-color: #535353; color: white; padding: 10px; border-radius: 5px; cursor: pointer;">
🔗 Advanced optimization workflows
</summary>
<div style="padding: 10px;">
<ul>
<li>Run campaigns <a href="https://emdgroup.github.io/baybe/stable/userguide/async.html">asynchronously</a> with partial measurements and pending experiments.</li>
<li>Store BayBE objects and use API wrappers with the <a href="https://emdgroup.github.io/baybe/stable/userguide/serialization.html">serialization</a> functionality.</li>
</ul>
</div>
</details>
<details style="border:2px solid #535353; border-radius: 7px; margin: 5px;">
<summary style="background-color: #535353; color: white; padding: 10px; border-radius: 5px; cursor: pointer;">
🔍 Performance evaluation tools
</summary>
<div style="padding: 10px;">
<ul>
<li>Gain <a href="https://emdgroup.github.io/baybe/stable/userguide/insights.html">insights</a> about the optimization campaigns by analyzing model behavior and feature importance.</li>
<li>Conduct benchmarks to select between different Bayesian optimization settings via <a href="https://emdgroup.github.io/baybe/stable/userguide/simulation.html">backtesting</a>.</li>
</ul>
</div>
</details>

## ⚡ Quick Start

Let us consider a simple experiment where we control three parameters and want to
maximize a single target called `Yield`.
To perform Bayesian Design of Experiments with BayBE,
you should first specify the **parameter search space** and **objective** to be optimized.
Based on this information and any **available data** about outcomes of specific parameter configurations,
BayBE will **recommend the next set of parameter configurations** to be **measured**.
To inform the next recommendation cycle, the newly generated measurements can be added to BayBE.

<div align="center">

![Quick Start](https://raw.githubusercontent.com/Hrovatin/baybe/docs/easy_access/docs/_static/quick_start.svg)

</div>

From the user's perspective, the most important part is the "setup" step (top of the figure).

Below we show a simple optimization procedure, starting with the setup step and subsequently
performing the recommendation loop.
The provided example aims to maximize the yield of a chemical reaction by adjusting its parameter configurations
(also known as reaction conditions).

First, install BayBE into your Python environment:
```bash
@@ -66,7 +143,7 @@ For more information on this step, see our

### Defining the Optimization Objective

In BayBE's language, the `Yield` can be represented as a `NumericalTarget`,
In BayBE's language, the reaction yield can be represented as a `NumericalTarget`,
which we wrap into a `SingleTargetObjective`:

```python
@@ -76,21 +153,19 @@ from baybe.objectives import SingleTargetObjective
target = NumericalTarget(name="Yield")
objective = SingleTargetObjective(target=target)
```
In cases where we are confronted with multiple (potentially conflicting) targets,
the `ParetoObjective` or `DesirabilityObjective` can be used instead.
These allow to define additional settings, such as how the targets should be balanced.
In cases where we are confronted with multiple (potentially conflicting) targets
(e.g., yield vs selectivity),
the `ParetoObjective` or `DesirabilityObjective` can be used to define how the targets should be balanced.
For more details, see the
[objectives section](https://emdgroup.github.io/baybe/stable/userguide/objectives.html)
of the user guide.
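
To illustrate the difference between the two approaches, the following standard-library sketch computes a Pareto front and a weighted geometric-mean desirability for three hypothetical (yield, selectivity) measurements. It does not use BayBE's `ParetoObjective` or `DesirabilityObjective`. The data, the weights, and the assumption that both targets are maximized and already scaled to [0, 1] are ours.

```python
import math

# Hypothetical measurements, both targets scaled to [0, 1] and to be maximized.
points = {
    "exp1": (0.9, 0.3),  # (yield, selectivity)
    "exp2": (0.6, 0.6),
    "exp3": (0.5, 0.5),
}


def dominates(a: tuple[float, ...], b: tuple[float, ...]) -> bool:
    """True if `a` is at least as good as `b` everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(data: dict[str, tuple[float, ...]]) -> list[str]:
    """Keep only the points that no other point dominates."""
    return [
        name
        for name, p in data.items()
        if not any(dominates(q, p) for other, q in data.items() if other != name)
    ]


def desirability(values: tuple[float, ...], weights: tuple[float, ...]) -> float:
    """Weighted geometric mean of the scaled target values."""
    total = sum(weights)
    return math.prod(v ** (w / total) for v, w in zip(values, weights))


print(pareto_front(points))
print(max(points, key=lambda n: desirability(points[n], (1.0, 1.0))))
```

The Pareto view returns every non-dominated trade-off and leaves the choice open, whereas the scalarized view commits to a single ranking once the weights are fixed.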

### Defining the Search Space

Next, we inform BayBE about the available "control knobs", that is, the underlying
system parameters we can tune to optimize our targets. This also involves specifying
their values/ranges and other parameter-specific details.

For our example, we assume that we can control three parameters – `Granularity`,
`Pressure[bar]`, and `Solvent` – as follows:
reaction parameters we can tune to optimize the yield.
In this case we tune granularity, pressure and solvent, each being encoded as a `Parameter`.
We also need to specify which values individual parameters can take.

```python
from baybe.parameters import (
@@ -147,20 +222,15 @@ and alternative ways of construction.

### Optional: Defining the Optimization Strategy

As an optional step, we can specify details on how the optimization should be
conducted. If omitted, BayBE will choose a default setting.
As an optional step, we can specify details on how the optimization of the experimental configurations should be
performed. If omitted, BayBE will choose a default Bayesian optimization setting.

For our example, we combine two recommenders via a so-called meta recommender named
`TwoPhaseMetaRecommender`:

1. In cases where no measurements have been made prior to the interaction with BayBE,
a selection via `initial_recommender` is used.
2. As soon as the first measurements are available, we switch to `recommender`.

For more details on the different recommenders, their underlying algorithmic
details, and their configuration settings, see the
[recommenders section](https://emdgroup.github.io/baybe/stable/userguide/recommenders.html)
of the user guide.
the parameters will be recommended with the `initial_recommender`.
2. As soon as the first measurements are available, we switch to the `recommender`.

```python
from baybe.recommenders import (
@@ -175,65 +245,94 @@ recommender = TwoPhaseMetaRecommender(
)
```

For more details on the different recommenders, their underlying algorithmic
details and how their settings can be adjusted, see the
[recommenders section](https://emdgroup.github.io/baybe/stable/userguide/recommenders.html)
of the user guide.
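
The switching behavior described above can be sketched without BayBE as follows. The class and function names are hypothetical stand-ins, not BayBE's `TwoPhaseMetaRecommender` implementation: a random pick plays the role of the initial recommender, and a naive "stay close to the best measurement" rule stands in for the model-based recommender.

```python
import random
from typing import Callable

Config = float  # one-dimensional toy parameter, e.g. a pressure value
Measurements = list[tuple[Config, float]]  # (configuration, target value)
Recommender = Callable[[list[Config], Measurements], Config]


def random_recommender(candidates: list[Config], data: Measurements) -> Config:
    """Stand-in for an initial recommender that needs no data."""
    return random.choice(candidates)


def greedy_recommender(candidates: list[Config], data: Measurements) -> Config:
    """Stand-in for a data-driven recommender: pick the untested
    candidate closest to the best configuration measured so far."""
    best_config, _ = max(data, key=lambda row: row[1])
    tested = {config for config, _ in data}
    untested = [c for c in candidates if c not in tested]
    return min(untested, key=lambda c: abs(c - best_config))


class TwoPhaseSwitch:
    """Hypothetical helper: use `initial` until data exists, then `main`."""

    def __init__(self, initial: Recommender, main: Recommender) -> None:
        self.initial = initial
        self.main = main

    def select(self, data: Measurements) -> Recommender:
        return self.main if data else self.initial


switch = TwoPhaseSwitch(random_recommender, greedy_recommender)
candidates = [1.0, 5.0, 10.0]

print(switch.select([]).__name__)             # no data yet
print(switch.select([(1.0, 79.8)]).__name__)  # first measurement available
print(switch.select([(1.0, 79.8)])(candidates, [(1.0, 79.8)]))
```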

### The Optimization Loop

We can now construct a campaign object that brings all pieces of the puzzle together:
We can now construct a `Campaign` that performs the Bayesian optimization of the experimental configurations:

```python
from baybe import Campaign

campaign = Campaign(searchspace, objective, recommender)
```

With this object at hand, we can start our experimentation cycle.
With this object at hand, we can start our optimization cycle.
In particular:

* We can ask BayBE to `recommend` new experiments.
* We can `add_measurements` for certain experimental settings to the campaign's
database.
* The campaign can `recommend` new experiments.
* We can `add_measurements` of target values for the measured parameter configurations
to the campaign's database.

Note that these two steps can be performed in any order.
In particular, available measurements can be submitted at any time and also several
times before querying the next recommendations.

```python
df = campaign.recommend(batch_size=3)
df = campaign.recommend(
    batch_size=3
)  # Recommend three experimental configurations to test
print(df)
```

The table below shows the three parameter configurations for which BayBE recommends
measuring the reaction yield.

```none
   Granularity  Pressure[bar]    Solvent
15      medium            1.0  Solvent D
10      coarse           10.0  Solvent C
29        fine            5.0  Solvent B
```

Note that the specific recommendations will depend on both the data
already fed to the campaign and the random number generator seed that is used.
Next, we need to conduct the recommended experiments and record the corresponding `Target` values.

```python
df["Yield"] = [
    79.8,
    54.1,
    59.4,
]  # Measured yields for the three recommended parameter configurations
print(df)
```
```none
   Granularity  Pressure[bar]    Solvent  Yield
15      medium            1.0  Solvent D   79.8
10      coarse           10.0  Solvent C   54.1
29        fine            5.0  Solvent B   59.4
```

After having conducted the corresponding experiments, we can add our measured
targets to the table and feed it back to the campaign:
Now, we can add the newly measured `Target` values to the `Campaign`:

```python
df["Yield"] = [79.8, 54.1, 59.4]
campaign.add_measurements(df)
```

With the newly arrived data, BayBE can produce a refined design for the next iteration.
This loop would typically continue until a desired target value has been achieved in
the experiment.
With the newly provided data, BayBE can produce a refined recommendation for the next iteration.
This loop typically continues until a desired `Target` value is achieved in the experiment.
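
The overall control flow of such a loop, including the stopping criterion, might look like the sketch below. To keep it runnable without a laboratory or a BayBE installation, `ToyCampaign` and `run_experiment` are hypothetical stand-ins that only mimic the `recommend` / `add_measurements` pattern. The yield function and the 75% threshold are invented for illustration.

```python
class ToyCampaign:
    """Hypothetical stand-in mimicking a recommend/add_measurements cycle."""

    def __init__(self, candidates: list[float]) -> None:
        self.untested = list(candidates)
        self.measurements: list[tuple[float, float]] = []

    def recommend(self, batch_size: int) -> list[float]:
        # A real campaign would use a model here; we simply take the next
        # untested candidates in order.
        batch = self.untested[:batch_size]
        self.untested = self.untested[batch_size:]
        return batch

    def add_measurements(self, rows: list[tuple[float, float]]) -> None:
        self.measurements.extend(rows)


def run_experiment(pressure: float) -> float:
    """Invented yield curve with its maximum of 80% at 5 bar."""
    return 80.0 - 2.0 * (pressure - 5.0) ** 2


campaign = ToyCampaign(candidates=[1.0, 2.0, 4.0, 5.0, 8.0, 10.0])
desired_yield = 75.0
best = float("-inf")

# Recommend, measure and feed back until the target is reached
# or no candidates are left.
while best < desired_yield and campaign.untested:
    batch = campaign.recommend(batch_size=2)
    results = [(pressure, run_experiment(pressure)) for pressure in batch]
    campaign.add_measurements(results)
    best = max(value for _, value in campaign.measurements)

print(f"Stopped after {len(campaign.measurements)} experiments, best yield {best}")
```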

### Advanced Example: Chemical Substances
BayBE has several modules to go beyond traditional approaches. One such example is the
use of custom encodings for categorical parameters. Chemical encodings for substances
are a special built-in case of this that comes with BayBE.
### Inspect the Progress of the Experimental Configuration Optimization

The plot below shows the progression of a campaign that optimized a direct arylation reaction
by tuning the solvent, base and ligand
(from [Shields, B.J. et al.](https://doi.org/10.1038/s41586-021-03213-y)).
Each line shows the best target value achieved up to a given number of experimental iterations.


Different lines show outcomes of `Campaigns` with different settings.
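
The quantity on the vertical axis of such a plot, the best value found so far, is a running maximum over the measured targets. A minimal sketch, reusing the three yields from the Quick Start table and adding two invented ones:

```python
from itertools import accumulate

# Yield measured in each iteration; the last two values are invented.
yields = [79.8, 54.1, 59.4, 85.2, 81.0]

# Running maximum: best yield achieved up to and including each iteration.
best_so_far = list(accumulate(yields, max))

print(best_so_far)  # [79.8, 79.8, 79.8, 85.2, 85.2]
```

The curve can only stay flat or rise, so a campaign whose line rises earlier reaches good configurations with fewer experiments.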

In the following picture you can see
the outcome for treating the solvent, base and ligand in a direct arylation reaction
optimization (from [Shields, B.J. et al.](https://doi.org/10.1038/s41586-021-03213-y)) with
chemical encodings compared to one-hot and a random baseline:
![Substance Encoding Example](./examples/Backtesting/full_lookup_light.svg)

In particular, the five `Campaigns` differ in how molecules are encoded within
each chemical `Parameter`. Instead of simply one-hot encoding each SMILES string,
`SubstanceParameter` can be used to directly compute chemical fingerprints from
the input SMILES.
We can see that optimization is more efficient when
using chemical encodings (e.g., *MORDRED*) than when using *one-hot* encoding. The latter is, in fact, no better than *randomly* suggesting parameter configurations at each experimental iteration.
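
Why an informative encoding can help is easiest to see from the distances it induces between categories. In the sketch below, the solvent names and the two-dimensional descriptor values are invented placeholders (they are not MORDRED descriptors and are not computed by BayBE). Under one-hot encoding every pair of distinct solvents is equally far apart, whereas a descriptor-based encoding lets a model treat similar solvents as neighbors.

```python
import math

solvents = ["Solvent A", "Solvent B", "Solvent C"]

# One-hot encoding: one indicator dimension per category.
one_hot = {
    name: [1.0 if i == j else 0.0 for j in range(len(solvents))]
    for i, name in enumerate(solvents)
}

# Invented descriptors (e.g. scaled polarity and size); A and B are similar.
descriptors = {
    "Solvent A": [0.10, 0.20],
    "Solvent B": [0.15, 0.25],
    "Solvent C": [0.90, 0.80],
}


def distance(encoding: dict[str, list[float]], a: str, b: str) -> float:
    """Euclidean distance between two categories under an encoding."""
    return math.dist(encoding[a], encoding[b])


for name, encoding in [("one-hot", one_hot), ("descriptors", descriptors)]:
    d_ab = distance(encoding, "Solvent A", "Solvent B")
    d_ac = distance(encoding, "Solvent A", "Solvent C")
    print(f"{name}: d(A, B) = {d_ab:.3f}, d(A, C) = {d_ac:.3f}")
```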

<a id="installation"></a>
(installation)=
## 💻 Installation
@@ -264,7 +363,7 @@ pip install git+https://github.com/emdgroup/baybe.git@main

Alternatively, you can install the package from your own local copy.
First, clone the repository, navigate to the repository root folder, check out the
desired commit, and run:
desired commit and run:

```bash
pip install .
@@ -312,6 +411,8 @@ The available groups are:
## 📡 Telemetry
Telemetry was fully and permanently removed in version 0.14.0.

<a id="citation"></a>
(citation)=
## 📖 Citation
If you find BayBE useful, please consider citing [our paper](https://doi.org/10.1039/D5DD00050E):
