RosettaDDGPrediction

Overview

RosettaDDGPrediction is a Python package to run Rosetta-based protocols for the prediction of the ΔΔG of stability upon mutation of a monomeric protein or the ΔΔG of binding upon mutation of a protein complex, and to analyze the results.

Forked from https://github.com/ELELAB/RosettaDDGPrediction. The major difference between the original repo and this fork is that Rosetta is installed into a conda virtual environment to avoid dependency issues.

Installation

# clone repo
git clone https://github.com/jlingford/ddg_rosetta.git
cd ddg_rosetta

# install all dependencies with conda or mamba (mamba is faster)
# NOTE: rosetta dependency takes a long time to install
mamba env create -f environment.yaml
mamba activate ddg_rosetta

# make python modules executable
python3 setup.py install

# make one of the bash scripts executable
chmod +x ./scripts/summarise_agg_ddg_data.sh

Upon successful installation, you should have three executables (rosetta_ddg_run, rosetta_ddg_aggregate and rosetta_ddg_plot) available to perform the various steps of data collection and analysis.
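
As a quick sanity check after installation, the following minimal sketch looks for the three executables named above on the current PATH. It assumes the ddg_rosetta environment is already activated; the helper name find_missing is illustrative and not part of the package.

```python
import shutil

# Executable names as listed in the installation notes above.
EXPECTED = ("rosetta_ddg_run", "rosetta_ddg_aggregate", "rosetta_ddg_plot")


def find_missing(names=EXPECTED):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All expected executables were found on PATH.")
```

If any name is reported as missing, check that the environment is active and that python3 setup.py install completed without errors.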

Usage

Please refer to USER_GUIDE.pdf for how to use RosettaDDGPrediction. Each module can also be called with the --help flag.

Example code and scripts need to be changed per use case and won't work as is.

Examples of how to run each step:

1. Run Rosetta

Tested on an HPC cluster with SLURM. Submit the job with sbatch ./scripts/long_run_slurm.sh

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=48
#SBATCH <slurm stuff goes here>

## SETUP
# ensure that the pdb file is in the pdb_input dir
# ensure that the mutations are specified in muts/key_muts.txt
# Rosetta is installed inside the conda or mamba env dir for ddg_rosetta
# choose which of the .yaml config files to use for running Rosetta

# set env
module purge
module load miniforge3 #assuming conda is a module on your HPC
conda activate ddg_rosetta

# set variables
ROSETTA_DIR=/path/to/your/miniconda/conda/envs/ddg_rosetta #change for your system
CONFIG_RUN=RosettaDDGPrediction/config_run
CONFIG_SET=RosettaDDGPrediction/config_settings
CONFIG_AGG=RosettaDDGPrediction/config_aggregate
CONFIG_PLT=RosettaDDGPrediction/config_plot
MUT_DIR=muts

# run rosetta
rosetta_ddg_run \
    --pdbfile pdb_input/wt_monomer.pdb \
    --listfile $MUT_DIR/key_muts.txt \
    --configfile-run $CONFIG_RUN/cartesian2020_ref2015.yaml \
    --configfile-settings $CONFIG_SET/rosettampi.yaml \
    --rosettapath $ROSETTA_DIR \
    -n 48 #starts 48 processes in parallel

echo "done rosetta ddg run!"

rosetta_ddg_check_run \
    --configfile-run $CONFIG_RUN/cartesian2020_ref2015.yaml

echo "done rosetta check"

This outputs two new dirs, relax and cartesian, which are populated with the information required by rosetta_ddg_aggregate.

Check the ROSETTA_CRASH.log to see if there were any issues. If the run fails, the next step won't output a full .csv of all mutations.
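
The check described above can be scripted. This is a minimal sketch that only reports whether the log file is absent, empty, or has content. It assumes the log sits in the current working directory and that any content in it deserves a look; it does not parse the log, and the helper name crash_log_status is illustrative.

```python
from pathlib import Path


def crash_log_status(path="ROSETTA_CRASH.log"):
    """Return 'absent', 'empty' or 'has content' for the given log file."""
    log = Path(path)
    if not log.is_file():
        return "absent"
    return "has content" if log.stat().st_size > 0 else "empty"


if __name__ == "__main__":
    status = crash_log_status()
    print(f"ROSETTA_CRASH.log: {status}")
    if status == "has content":
        print("Inspect the log before running the aggregation step.")
```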

2. Aggregate the rosetta data

It is easiest to include this step in the same SLURM script as above.

rosetta_ddg_aggregate \
    -ca $CONFIG_AGG/aggregate.yaml \
    -cr $CONFIG_RUN/cartddg_ref2015.yaml \
    -cs $CONFIG_SET/rosettampi.yaml \
    -mf cartesian/mutinfo.txt \
    -od agg_data \
    -n 48

./scripts/summarise_agg_ddg_data.sh

The rosetta_ddg_aggregate module creates two .csv files required for rosetta_ddg_plot:

ddg_mutations_aggregate.csv
ddg_mutations_structures.csv

"Structures" contains the results of every individual Rosetta run, while "aggregate" contains the averages of the data in "structures". Which file to use depends on the type of plot you want to generate with rosetta_ddg_plot.

There is also the script summarise_agg_ddg_data.sh, which creates one file for my custom plotting script:

rosetta_ddg_scores.csv
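
Before plotting, it can help to confirm that the output files are complete. The sketch below prints the column names and the number of data rows of each .csv file passed on the command line, without assuming any particular column layout. The helper name describe_csv is illustrative.

```python
import csv
import sys


def describe_csv(path):
    """Return (column_names, number_of_data_rows) for a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no lines
        n_rows = sum(1 for _ in reader)
    return header, n_rows


if __name__ == "__main__":
    # Pass one or more .csv paths as command-line arguments.
    for csv_path in sys.argv[1:]:
        columns, n_rows = describe_csv(csv_path)
        print(f"{csv_path}: {n_rows} data rows, columns: {', '.join(columns)}")
```

A row count lower than expected for the number of mutations in the list file suggests that part of the run failed.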

3. Plot the data

Easiest to run locally after copying all the outputs to your local machine. The authors provide a module to generate plots:

rosetta_ddg_plot \
    -i ddg_mutations_structures.csv \
    -o figure \
    -cp ./RosettaDDGPrediction/config_plot/total_heatmap.yaml

Alternatively, run custom python scripts on the rosetta_ddg_scores.csv file to generate different plots.

python3 ./scripts/plots/barplot.py
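
For orientation, a custom plotting script could look roughly like the sketch below. It is not the content of scripts/plots/barplot.py. The column names "mutation" and "ddg" are hypothetical, so check the header of your own rosetta_ddg_scores.csv and adjust them. The plotting function additionally assumes that matplotlib is installed.

```python
import csv
import sys
from pathlib import Path

# Hypothetical column names; adjust to the header of your own file.
LABEL_COL = "mutation"
VALUE_COL = "ddg"


def load_scores(path, label_col=LABEL_COL, value_col=VALUE_COL):
    """Read (label, value) pairs from a CSV file, sorted by value."""
    with open(path, newline="") as handle:
        rows = [(row[label_col], float(row[value_col]))
                for row in csv.DictReader(handle)]
    return sorted(rows, key=lambda pair: pair[1])


def plot_scores(scores, out_path="ddg_barplot.png"):
    """Draw a simple bar plot of the scores; requires matplotlib."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    labels = [label for label, _ in scores]
    values = [value for _, value in scores]
    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_xlabel("Mutation")
    ax.set_ylabel("ddG score")
    ax.tick_params(axis="x", labelrotation=90)
    fig.tight_layout()
    fig.savefig(out_path)


if __name__ == "__main__":
    csv_path = Path(sys.argv[1] if len(sys.argv) > 1 else "rosetta_ddg_scores.csv")
    if csv_path.is_file():
        plot_scores(load_scores(csv_path))
    else:
        print(f"{csv_path} not found")
```

Keeping the loading step separate from the plotting step makes it possible to reuse load_scores for other plot types.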

Citation

RosettaDDGPrediction for high-throughput mutational scans: from stability to binding

Valentina Sora, Adrian Otamendi Laspiur, Kristine Degn, Matteo Arnaudi, Mattia Utichi, Ludovica Beltrame, Dayana De Menezes, Matteo Orlandi, Olga Rigina, Peter Wad Sackett, Karin Wadt, Kjeld Schmiegelow, Matteo Tiberti, Elena Papaleo*

Under revision for Protein Science; preprint on bioRxiv: https://doi.org/10.1101/2022.09.02.506350

