Prodrugs are easily deployable chemical entities with beneficial pharmacokinetic properties; however, their rational design requires careful crafting of release mechanisms and holistic optimization of pharmacokinetic properties. Machine learning is poised to support rational design of prodrugs by efficiently filtering millions of generated designs down to the most promising candidates. Here, we designed and validated a novel machine learning pipeline for rapid and systematic design of prodrugs with desired properties. We also developed a subsampling approach for efficient application of our pair-wise DeepDelta approach to larger datasets (>1500 datapoints).
The associated publication is currently under review.
We would like to thank the Chemprop, Llama, Unsloth, Molecule-RNN, SmilesGPT, MolGAN, Scikit-learn, and Chemical VAE developers for making their code publicly available.
Datasets, saved models, and code for applying DeepDelta to the two example prodrug case studies. Because the files containing the generated prodrugs and their predicted values are large, these results are stored on Zenodo: 10.5281/zenodo.18079221.
Datasets and code for subsampling strategies for efficient application of the pair-wise DeepDelta approach to larger datasets (>1500 datapoints). Because the result files are large, they are stored on Zenodo: 10.5281/zenodo.14894034.
Datasets, code, and results for the analysis of currently approved and investigational prodrugs (Figure S1).
Datasets and code for generative models to directly build promoieties onto drug structures (based on Molecule-RNN).
Datasets, code, and results for the analysis of novel prodrugs designed using generative models (Figure 1).
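As a rough illustration of why subsampling matters for a pair-wise approach: pairing every molecule with every other molecule grows quadratically with dataset size, so a dataset of a few thousand molecules already yields millions of pairs. The sketch below is not the code from this repository. It assumes a simple uniform random subsample of ordered pairs, excludes self-pairs, and defines the label as the second molecule's value minus the first's. The function name `subsample_pairs` and its parameters are hypothetical.

```python
import random


def subsample_pairs(smiles, values, n_pairs, seed=0):
    """Draw up to n_pairs distinct ordered pairs (i != j) uniformly at random.

    Hypothetical helper for illustration only: each returned tuple is
    (smiles_i, smiles_j, values[j] - values[i]).
    """
    if len(smiles) != len(values):
        raise ValueError("smiles and values must have the same length")

    n = len(smiles)
    total = n * (n - 1)  # number of ordered pairs, self-pairs excluded
    n_pairs = min(n_pairs, total)

    rng = random.Random(seed)
    # Sample flat pair indices without replacement; the full pair table
    # is never materialized, so memory stays proportional to n_pairs.
    flat_indices = rng.sample(range(total), n_pairs)

    pairs = []
    for k in flat_indices:
        i, r = divmod(k, n - 1)
        j = r if r < i else r + 1  # shift past the diagonal so j != i
        pairs.append((smiles[i], smiles[j], values[j] - values[i]))
    return pairs


if __name__ == "__main__":
    # Toy data; the property values are made up for the example.
    toy_smiles = ["CCO", "CCN", "CCC"]
    toy_values = [1.0, 2.5, 4.0]
    for first, second, delta in subsample_pairs(toy_smiles, toy_values, n_pairs=4):
        print(first, second, delta)
```

Sampling flat indices rather than building all pairs first keeps the cost tied to the number of pairs requested, which is the point of subsampling on larger datasets. The strategies actually evaluated in this repository may differ from uniform sampling.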
The copyright of this software is owned by Duke University. As such, two licenses are offered:
- An open-source GPLv2 license for non-commercial academic use.
- A custom license with Duke University for commercial use or for uses without the GPLv2 license restrictions.