VF-Unity-Parallelized is a streamlined version of VirtualFlow, integrating the features of both VFVS (VirtualFlow for Virtual Screening) and VFLP (VirtualFlow for Ligand Preparation) into a cohesive workflow. Designed to operate seamlessly on SLURM systems, this workflow allows users to easily incorporate any docking software of their choice, ensuring maximum flexibility.
In the core workflow, users supply a SMILES text file, the receptor of interest, and the docking parameters to run large-scale docking simulations. Out of the box, VF-Unity-Parallelized is configured to support docking with QuickVina 2.0 and Smina.
Execution within SLURM environments is highly optimized, with computations distributed in parallel across multiple CPUs and nodes. This design ensures efficient linear scaling relative to the number of molecules provided.
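As an illustration of how work can be divided across a SLURM job array, the sketch below splits a list of molecules into near-equal contiguous chunks, one per array task. It is not taken from dataset_calc.py: the function name chunk_for_task and the 1-based task numbering are assumptions made for this example. SLURM_ARRAY_TASK_ID is the environment variable that SLURM sets for each task of an array job.

```python
import os


def chunk_for_task(items, task_id, num_tasks):
    """Return the slice of `items` handled by 1-based `task_id` out of `num_tasks`.

    Hypothetical helper for illustration; not part of the repository.
    """
    if not 1 <= task_id <= num_tasks:
        raise ValueError("task_id must be between 1 and num_tasks")
    base, extra = divmod(len(items), num_tasks)
    # The first `extra` tasks each take one additional item.
    start = (task_id - 1) * base + min(task_id - 1, extra)
    size = base + (1 if task_id <= extra else 0)
    return items[start:start + size]


if __name__ == "__main__":
    # Default to task 1 so the sketch also runs outside SLURM.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "1"))
    molecules = [f"Molecule{i}" for i in range(1, 11)]
    print(chunk_for_task(molecules, task_id, num_tasks=3))
```

Because every task works on its own chunk and no task waits on another, adding array tasks shortens the wall-clock time for a fixed library, as long as the cluster can schedule the tasks concurrently.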
Please clone the repository using:
git clone https://github.com/VirtualFlow/VFUparr.git
Please ensure that the following packages are installed:
The repository contains the following files and directories:
- DATA: This directory is where users can place the receptor file and the corresponding executables for running the docking process.
- OUTPUTS: This directory is designated for storing the results of the docking simulations.
- all.ctrl: Contains all user-specifiable parameters required for the screening process, including the docking parameterization.
- dataset_calc.py: A Python script for running the docking on specified ligands.
- submit.sh: A Slurm submission script for submitting an array of jobs for processing.
To get started with the docking simulations, follow the steps outlined below. These steps ensure that your configuration is correctly set up for your specific docking scenario:
- Configure Receptor Location:
  - Open all.ctrl and specify the exact location of your receptor in the designated section.
- Set Docking Parameters:
  - Within all.ctrl, enter the appropriate CENTER-X/Y/Z and SIZE-X/Y/Z coordinates to define your docking area.
- Specify SMILES List Path:
  - In all.ctrl, input the path to your SMILES list file. This file defines the molecular inputs for the simulation.
  - Ensure the file adheres to the format: each line contains a SMILES representation followed by a comma and the molecule ID (e.g., C[C@@H](N)C(=O)O, Molecule1). Place one molecule per line.
- Slurm Cluster Account:
  - In submit.sh, replace TODO in #SBATCH --account=TODO with your actual Slurm cluster account name to ensure proper job submission.
- Job Submission Configuration:
  - Adjust the number of jobs to submit for your docking calculation in submit.sh by modifying #SBATCH --array=1-999 accordingly. Ensure this number matches the MAX_NUM_JOBS parameter set in all.ctrl.
- Executable Permissions:
  - Make sure the docking executables have the correct executable permissions by running chmod 777 ./DATA/qvina.
- Submit Your Job:
  - Finally, submit your job to the Slurm cluster with the command: sbatch submit.sh.
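One common way to choose CENTER-X/Y/Z and SIZE-X/Y/Z is to enclose a set of binding-site atoms, for example the atoms of a known bound ligand, in an axis-aligned box. The sketch below shows that calculation. The function name docking_box and the 4 angstrom default padding are illustrative assumptions, not values prescribed by the repository.

```python
def docking_box(coords, padding=4.0):
    """Return (center, size) of an axis-aligned box enclosing `coords`.

    coords: iterable of (x, y, z) tuples in angstroms.
    padding: margin added on every side, in angstroms (illustrative default).
    Hypothetical helper for illustration; not part of the repository.
    """
    points = list(coords)
    if not points:
        raise ValueError("at least one coordinate is required")
    lows = [min(p[axis] for p in points) for axis in range(3)]
    highs = [max(p[axis] for p in points) for axis in range(3)]
    # Center is the midpoint of the extent along each axis.
    center = tuple((lo + hi) / 2 for lo, hi in zip(lows, highs))
    # Size is the extent plus padding on both sides.
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(lows, highs))
    return center, size


if __name__ == "__main__":
    center, size = docking_box([(0.0, 0.0, 0.0), (10.0, 4.0, 2.0)])
    print("CENTER-X/Y/Z:", center)
    print("SIZE-X/Y/Z:", size)
```

The three components of center and size correspond to the X, Y, and Z entries in all.ctrl; how much padding is appropriate depends on the receptor and the ligands being screened.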
By following these steps, you'll be properly set up to conduct your docking simulations. Ensure all paths and parameters are double-checked for accuracy before submitting your job.
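A quick check of the SMILES list before submission can catch malformed lines early. The sketch below tests only the layout described in the steps above (a SMILES string, a comma, then a molecule ID); it does not verify that a SMILES string is chemically valid. The function names parse_smiles_line and check_smiles_file are hypothetical, and skipping blank lines is an assumption of this sketch rather than documented behavior of the workflow.

```python
def parse_smiles_line(line):
    """Split one 'SMILES, ID' line into (smiles, molecule_id)."""
    # Standard SMILES strings contain no commas, so the first comma separates the fields.
    smiles, sep, mol_id = line.partition(",")
    smiles, mol_id = smiles.strip(), mol_id.strip()
    if not sep or not smiles or not mol_id:
        raise ValueError(f"expected 'SMILES, ID', got: {line!r}")
    return smiles, mol_id


def check_smiles_file(path):
    """Return (records, problems) for a SMILES list file.

    records: list of (smiles, molecule_id) tuples for well-formed lines.
    problems: list of (line_number, message) tuples for malformed lines.
    """
    records, problems = [], []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # blank lines are skipped (assumption of this sketch)
            try:
                records.append(parse_smiles_line(line))
            except ValueError as err:
                problems.append((number, str(err)))
    return records, problems
```

Calling check_smiles_file on the list file and inspecting the returned problems shows which line numbers need fixing before the job array is submitted.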
When the docking calculations finish, the results are saved in the working directory of your repository, in files matching the pattern OUTPUT_*_*.txt. These text files contain the following information for each molecule processed:
- SMILES String: The SMILES representation of the molecule's chemical structure.
- Docking Score: A numerical value indicating the predicted affinity between the receptor and the ligand. A score of 10,000 indicates a failed docking calculation.
- Molecule ID: A specific identifier assigned to the molecule for easy reference.
- Input Ligand Location: The path to the file containing the input ligand used in the docking simulation.
- Docking Pose File Location: The path to the file showing the preferred orientation (pose) of the ligand when bound to the protein receptor.
This organized output allows for efficient analysis and interpretation of your docking simulations, enabling a deeper understanding of the interaction between molecules and their potential efficacy.
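The sketch below shows one way to gather the result files into a ranked list. The exact file layout is not specified here, so the parser makes a loud assumption: one record per line, comma-separated, with fields in the order listed above (SMILES, score, molecule ID, ligand path, pose path). Adjust the delimiter and field order to match your actual files. The function name read_results is hypothetical.

```python
import glob

FAILED_SCORE = 10000.0  # score that marks a failed docking calculation


def read_results(pattern="OUTPUT_*_*.txt", delimiter=","):
    """Collect successful docking records from all files matching `pattern`.

    ASSUMPTION: one record per line, fields separated by `delimiter` in the
    order SMILES, score, molecule ID, ligand path, pose path.
    """
    records = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                fields = [field.strip() for field in line.split(delimiter)]
                if len(fields) != 5:
                    continue  # line does not fit the assumed layout
                smiles, score_text, mol_id, ligand_path, pose_path = fields
                try:
                    score = float(score_text)
                except ValueError:
                    continue  # non-numeric score, e.g. a header line
                if score >= FAILED_SCORE:
                    continue  # failed docking
                records.append((score, mol_id, smiles, ligand_path, pose_path))
    # Vina-style scores are more negative for stronger predicted binding,
    # so an ascending sort puts the best-scoring molecules first.
    return sorted(records)


if __name__ == "__main__":
    for score, mol_id, smiles, _, pose_path in read_results()[:10]:
        print(f"{score:8.2f}  {mol_id}  {smiles}  {pose_path}")
```

Run from the directory that holds the OUTPUT files, this prints the ten best-scoring molecules with the paths to their docking poses.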
If you are interested in contributing to VirtualFlow, whether it is to report a bug or to extend VirtualFlow with your own code, please see the file CONTRIBUTING.md and the file CODE_OF_CONDUCT.md.
The project is distributed under the GNU GPL v2.0. Please see the file LICENSE for more details.
Gorgulla, Christoph, et al. "VirtualFlow 2.0-The Next Generation Drug Discovery Platform Enabling Adaptive Screens of 69 Billion Molecules." bioRxiv (2023): 2023-04.