This repository provides a Python framework for performing binding free energy calculations for a user-defined target molecule and a protein-catalyzed capture (PCC) agent with a user-defined epitope sequence. The core of this package is based on "High-throughput virtual screening of protein-catalyzed capture agents for novel hydrogel-nanoparticle fentanyl sensors" and the corresponding repository. Please refer to the original paper for a more technical explanation of the theory and setup behind the free energy calculations performed with this framework.
For more detailed instructions and API descriptions, see the full documentation or consult the source in the docs/ directory.
The framework is organized into four main submodules that work in serial to perform a complete binding free energy calculation using GROMACS:
PCCBuilder: Creates a PCC with a given sequence, calculates GAFF2 parameters for it, and minimizes it.
TargetMOL: Similar to PCCBuilder, creates input structures and force field parameters for a provided target molecule.
FECalc: Brings together the PCC and the target molecule, solvates them in a water box, performs minimization followed by NVT and NPT equilibration, runs a parallel-bias metadynamics (PBMetaD) simulation, and reweights the resulting trajectory for free energy calculations.
postprocess: Uses the reweighted statistics to calculate the 3D binding free energy volume of the PCC-molecule complex, then calculates the binding free energy and dissociation constant.
git clone https://github.com/arminshzd/PCC-FECalc.git
cd PCC-FECalc
pip install -e .
After the package is installed, add AmberTools and libgfortran5 so the sqm program works:
conda install -c conda-forge ambertools libgfortran5
Note: PyMOL is required but not installed automatically because no pip wheel is available. Install PyMOL separately from pymol.org or build it from the open-source GitHub repository. A conda package is also available:
conda install -c conda-forge pymol-open-source
Ensure the pymol executable is on your PATH.
See example/pcc_submit_test.py for a sample calculation setup.
This framework additionally relies on PyMOL and acpype for PCC mutations and GAFF2 parameter generation. acpype is installed automatically as a Python package. PyMOL must be installed separately as described above.
The calculations proceed in four steps, and each step requires a JSON file with the necessary user parameters:
The settings file for the PCCBuilder step is pre-made in FECalc/PCCBuilder_settings.JSON and generally requires no modification.
PCCBuilder.create stops after building the structure and again after
parameter generation so that you can inspect the intermediate files. Invoke
PCCBuilder.create(check=False) to perform all steps without interruption.
The settings file for the TargetMOL step should be created by the user. Two examples are provided: example/ACT_settings.JSON for acetaminophen and example/FEN_settings.JSON for fentanyl. The mandatory entries are:
name: Name of the target. Used for creating subdirectories and making reports.
charge: Total charge of the target.
anchor1: Anchor point defined using the atoms on the target molecule. This is used together with anchor2 to define a vector that determines the relative position and orientation of the target molecule with respect to the PCC during the PBMetaD calculations. See the original publication for a detailed explanation of how this vector is used in the collective variables.
anchor2: See above.
output_dir: Path to the folder to store the parameter calculations and minimization.
input_pdb_dir: Input pdb file of the target molecule structure.
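Because a missing entry would otherwise only surface partway through a run, it can help to check a target settings file before starting. The following is a minimal sketch, not part of the framework: the helper names `missing_target_keys` and `load_target_settings` are hypothetical, and the check only covers the presence of the mandatory keys listed above, not the format of their values.

```python
import json
from pathlib import Path

# Mandatory keys for the target settings file, as listed in this README.
REQUIRED_TARGET_KEYS = (
    "name",
    "charge",
    "anchor1",
    "anchor2",
    "output_dir",
    "input_pdb_dir",
)


def missing_target_keys(settings: dict) -> list:
    """Return the mandatory keys that are absent from a target settings dict."""
    return [key for key in REQUIRED_TARGET_KEYS if key not in settings]


def load_target_settings(path) -> dict:
    """Load a target settings JSON file (hypothetical helper, not the framework's API).

    Raises KeyError if any mandatory key is missing.
    """
    settings = json.loads(Path(path).read_text())
    missing = missing_target_keys(settings)
    if missing:
        raise KeyError("Missing mandatory target settings: " + ", ".join(missing))
    return settings
```

For instance, `load_target_settings("example/ACT_settings.JSON")` would return the acetaminophen settings as a dictionary if all six entries are present.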
The settings file for the FECalc step should be created by the user. An example is provided in example/system_settings.JSON. The mandatory entries are:
PCC_output_dir: Path to the output folder that holds the PCC calculations.
PCC_settings_json: Path to the JSON file for the PCC.
MOL_settings_json: Path to the JSON file for the target.
temperature: Temperature of the simulations.
box_size: Size of the simulation box. Cubic periodic.
complex_output_dir: Path to the output directory for the free energy calculations. The contents of this directory will be as follows:
{PCC sequence}_{target name}/
│-- em/ # Minimization
│-- nvt/ # NVT equilibration
│-- npt/ # NPT equilibration
│-- md/ # PBMetaD simulation
│-- reweight/ # Reweighting
The optional entries are:
scheduler: Scheduler used to allocate hardware resources. Supported options are "local" (default), "slurm", "pbs", and "lsf".
nodes, cores, threads: Hardware layout overrides for the number of nodes, cores per node, and threads per core. Each defaults to 1; scheduler environment variables take precedence. These counts are used to set the correct -ntomp for GROMACS.
metad_settings: Parameters of the metadynamics simulation.
    n_steps: Number of steps for the metadynamics run, with a 2 fs step size. Defaults to 800 ns.
    metad_height: Height of the deposited Gaussians. Defaults to 3.0 kJ/mol.
    metad_pace: Pace of deposition. Defaults to 500 steps.
    metad_bias_factor: Bias factor for the PBMetaD bias. Defaults to 20.
postprocess_settings: Parameters for the post-processing and free energy calculations.
    discard_initial: Initial duration of the PBMetaD simulation to discard for free energy calculations, in ns. Defaults to 100 ns.
    n_folds: Number of folds for block analysis and uncertainty quantification. Defaults to 5.
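Since n_steps is given in steps while its default is quoted as a duration, a short sketch of the conversion and of how the documented defaults could be combined with user overrides may be useful. This is an illustration under stated assumptions, not the framework's implementation: the names `DEFAULTS`, `ns_to_steps`, and `with_defaults` are hypothetical, and the nesting of n_steps and the metad_* keys under metad_settings (and of discard_initial and n_folds under postprocess_settings) is assumed from the list above.

```python
import copy

TIMESTEP_FS = 2.0  # step size stated in this README


def ns_to_steps(duration_ns: float, timestep_fs: float = TIMESTEP_FS) -> int:
    """Convert a simulation length in ns to a number of MD steps (1 ns = 1e6 fs)."""
    return round(duration_ns * 1_000_000 / timestep_fs)


# Documented defaults; the nested layout is an assumption of this sketch.
DEFAULTS = {
    "scheduler": "local",
    "nodes": 1,
    "cores": 1,
    "threads": 1,
    "metad_settings": {
        "n_steps": ns_to_steps(800),  # 800 ns at 2 fs per step
        "metad_height": 3.0,          # kJ/mol
        "metad_pace": 500,            # steps between depositions
        "metad_bias_factor": 20,
    },
    "postprocess_settings": {
        "discard_initial": 100,       # ns
        "n_folds": 5,
    },
}


def with_defaults(user_settings: dict) -> dict:
    """Overlay user-provided optional entries on the defaults (one nesting level deep)."""
    merged = copy.deepcopy(DEFAULTS)
    for key, value in user_settings.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key].update(value)  # keep defaults for keys the user omitted
        else:
            merged[key] = value
    return merged
```

With a 2 fs step, the default 800 ns run corresponds to 400,000,000 steps, and discarding the first 100 ns removes the first 50,000,000 of them.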
The postprocess step does not require a separate settings file because its parameters are set by the settings file in step 3. The post-processing step calculates the 3D integral of the bound basin and the unbound region and reports the results in complex_output_dir/{PCC sequence}_{target name}/metadata.JSON.
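For readers who want to relate the two reported quantities, the textbook relation between a binding free energy and a dissociation constant is K_d = c0 * exp(ΔG / (R T)), where c0 is the standard concentration. The sketch below applies that relation only; the function name `dissociation_constant` and the 1 M standard concentration are assumptions here, and the framework may use different conventions or corrections (see the original paper), so the values in metadata.JSON remain authoritative.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def dissociation_constant(
    delta_g_kj_per_mol: float,
    temperature_k: float,
    standard_conc_molar: float = 1.0,  # assumed 1 M standard state
) -> float:
    """Return K_d in mol/L from a binding free energy in kJ/mol.

    Uses K_d = c0 * exp(dG / (R T)); a more negative dG gives a smaller K_d,
    which corresponds to tighter binding.
    """
    rt = R_KJ_PER_MOL_K * temperature_k
    return standard_conc_molar * math.exp(delta_g_kj_per_mol / rt)
```

A binding free energy of zero gives K_d equal to the standard concentration, and each additional -RT of binding free energy lowers K_d by a factor of e.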
This project is licensed under the MIT License.
This software was created by Armin Shayesteh Zadeh and Andrew L. Ferguson under Army Research Office (ARO) Cooperative Agreement Number W911NF-23-2-0135. ARO, as the Federal awarding agency, reserves a royalty-free, nonexclusive and irrevocable right to reproduce, publish, or otherwise use this software for Federal purposes, and to authorize others to do so in accordance with 2 CFR 200.315(b).