caffeine-multi-state-design

caffeine-multi-state-design (varCOMETS) is a Python program for multi-state protein design, developed by the Laboratory of Protein Design and Immunoengineering (LPDI) at EPFL. Multi-state design optimizes a protein sequence against several conformations or functional states at once. This implementation was used to design caffeine-inducible nanobody heterodimers with minimized off-target homodimerization, enabling precise control of cellular signaling in synthetic biology applications.

If you use this code, please cite our paper [1] and the original COMETS multi-state design paper [2].

Features

Implements multi-state design algorithms to identify sequences compatible with multiple protein states.
Uses Python and PyMOL for scripting and visualization.
Incorporates Rosetta tools for energy calculations, rotamer sampling, and structural analysis.
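
To make the idea of a multi-state objective concrete, here is a minimal brute-force sketch. It is not the varCOMETS algorithm, which uses bounds and tree search instead of enumeration. The two-letter alphabet, the energy values, and the names target_state, off_target_state, and objective are all hypothetical; the objective (target energy minus off-target energy) is one simple linear combination of state energies, assumed here for illustration.

```python
from itertools import product

# Hypothetical two-letter alphabet and two design positions, to keep the toy tiny.
AMINO_ACIDS = "AV"
N_POSITIONS = 2

# Hypothetical per-sequence energies for each state (lower is more favorable).
# In a real problem each value would itself be a minimum over rotamers.
target_state = {"AA": -1.0, "AV": -3.0, "VA": -2.5, "VV": -2.0}      # e.g. heterodimer
off_target_state = {"AA": -0.5, "AV": -1.0, "VA": -2.4, "VV": -3.0}  # e.g. homodimer


def objective(seq):
    """Favor the target state while penalizing a stable off-target state."""
    return target_state[seq] - off_target_state[seq]


# Enumerate every sequence and keep the one with the lowest objective.
sequences = ["".join(s) for s in product(AMINO_ACIDS, repeat=N_POSITIONS)]
best = min(sequences, key=objective)
print("Best sequence: %s (objective %.1f)" % (best, objective(best)))
```

Note that the sequence with the lowest target-state energy is not automatically the winner: a sequence that also stabilizes the off-target state scores poorly, which is the point of designing against several states at once.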

System Requirements

This code requires Python 2.7 and a Python 2.7-based PyMOL build. It is meant to be run on your local machine.

Energy matrices for the problems in the paper are provided in this repository. To apply the code to a different problem, you need Rosetta to compute the intrabody and pairwise residue energy matrices, in the same format as the files in the Rosetta_energy_matrices/ directory.
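
As a rough illustration of how intrabody and pairwise terms combine, the sketch below sums them for one choice of rotamer per position, following the usual pairwise decomposition E(r) = sum_i E_i(r_i) + sum_{i<j} E_ij(r_i, r_j). The data layout (a list of lists and a dictionary keyed by position pairs) and the name conformation_energy are assumptions made for this example; they are not the file format used in Rosetta_energy_matrices/ or the parser in this repository.

```python
# Toy intrabody (one-body) energies: one_body[position][rotamer].
one_body = [
    [0.5, -1.0],       # position 0 has 2 rotamers
    [0.2, 0.0, -0.3],  # position 1 has 3 rotamers
]

# Toy pairwise energies: pairwise[(i, j)][rotamer_i][rotamer_j], with i < j.
pairwise = {
    (0, 1): [
        [0.1, -0.4, 0.0],
        [0.3, 0.2, -0.6],
    ],
}


def conformation_energy(assignment, one_body, pairwise):
    """Sum one-body and pairwise terms for one rotamer index per position."""
    total = 0.0
    for pos, rot in enumerate(assignment):
        total += one_body[pos][rot]
    for (i, j), table in pairwise.items():
        total += table[assignment[i]][assignment[j]]
    return total


# Rotamer 1 at position 0 and rotamer 2 at position 1.
energy = conformation_energy((1, 2), one_body, pairwise)
print("Energy of assignment (1, 2): %.2f" % energy)
```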

Installation

To use this repository, clone it to your local machine:

git clone https://github.com/LPDI-EPFL/caffeine-multi-state-design.git
cd caffeine-multi-state-design

Runtime

As currently set up, the scripts finish within a second and can be run on a laptop. However, varCOMETS is a provable algorithm: it is guaranteed to find the global minimum within the search space. Because the complexity of the problem is not polynomial (see [2]), the search space grows rapidly with the number of designed residue positions, and the run time can become infeasible. Residues to model must therefore be selected carefully, and keeping the design to about 5-7 residues is probably necessary. An HPC server can likely process larger inputs, but the complexity is still exponential, so even a slightly larger input can exhaust the available resources.
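
A quick back-of-the-envelope calculation shows why the number of designed positions matters. The sketch below counts sequences only, assuming all 20 amino acids are allowed at every position; the actual allowed amino acids are defined in input/ and may be fewer, and each sequence additionally has many rotamer conformations. The function name sequence_space_size is introduced here for illustration.

```python
def sequence_space_size(n_positions, n_amino_acids=20):
    """Number of distinct sequences when every position allows n_amino_acids choices."""
    return n_amino_acids ** n_positions


for n in (3, 5, 7, 10):
    print("%2d positions: %d sequences" % (n, sequence_space_size(n)))
```

Each additional position multiplies the count by the number of allowed amino acids, so restricting the allowed set per position shrinks the space as effectively as dropping positions.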

Usage

The main script for running the multi-state design is msd_caf.py. To execute the design process as shown in the paper, run it with the provided residue groups and the precomputed energy matrices:

python msd_caf.py caffeine A # for group A
python msd_caf.py caffeine B # for group B
python msd_caf.py caffeine C # for group C
python msd_caf.py caffeine D # for group D
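
If you prefer to launch the four groups from one script, a small wrapper along these lines could work. It only assumes the command-line form shown above; the helper names build_commands and run_all are hypothetical and not part of this repository. The block itself only prints the commands, so nothing is executed until run_all is called.

```python
import subprocess

GROUPS = ["A", "B", "C", "D"]


def build_commands(problem="caffeine", groups=GROUPS, python_exe="python"):
    """Build one msd_caf.py command line per residue group."""
    return [[python_exe, "msd_caf.py", problem, group] for group in groups]


def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in commands:
        print("Running: " + " ".join(cmd))
        subprocess.check_call(cmd)


commands = build_commands()
for cmd in commands:
    print(" ".join(cmd))
```

To actually start the runs, call run_all(commands) from the repository root, using a Python 2.7 interpreter for python_exe.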

To visualize results:

pymol

Within PyMOL, run:

run show_in_pymol output/caffeine/caffeine_A.json

Repository Structure

msd_caf.py: Entry point for the caffeine design.

show_in_pymol.py: Utility script to visualize structures and designs in PyMOL.

varbnb/VarbnbMSD.py: Algorithm for multi-state design.

BP/: Belief propagation algorithm used to compute lower bounds.

MPLP/: MPLP algorithm used to compute upper bounds.

dynamicAS/: Dynamic A* algorithm used to explore the multi-state design space.

ematrix/: Code for parsing Rosetta energy matrices.

test-mplp-orig.py, test_pyrosetta.py: Test scripts for validating design methods.

input/: Directory containing the definition of the states and the regions being modeled, as well as allowed amino acids.

output/: Directory where output files, including designed sequences and structures, are saved.

Rosetta_energy_matrices/: Directory containing the energy matrices for the design, as computed by Rosetta.

License

This project is licensed under the MIT License. See the LICENSE file for details.

References

If you use this code, please cite:

[1] Scheller L. et al. "Humanized Caffeine-Inducible Systems for Controlling Cellular Functions", 2025

[2] Hallen M. & Donald B.R., "COMETS (Constrained Optimization of Multistate Energies by Tree Search): A provable and efficient protein design algorithm to optimize binding affinity and specificity with respect to sequence." Journal of Computational Biology 23.5 (2016): 311-321.
