An ellipsoid is a solid, three-dimensional version of an ellipse. The points inside an ellipsoid can be described in quadratic form by a symmetric 3x3 matrix together with a center point, or in parametric form using the directions and magnitudes of the principal axes.
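To make the two descriptions concrete, here is a minimal sketch. It assumes the common convention that a point x is inside the ellipsoid when (x - c)^T A (x - c) <= 1, where c is the center and A is a symmetric 3x3 matrix. The function names `ellipsoid_matrix` and `is_inside` are made up for this illustration and are not part of the project code.

```python
import numpy as np


def ellipsoid_matrix(radii, rotation):
    """Build the quadratic-form matrix A from the parametric description.

    radii    -- the three semi-axis lengths
    rotation -- 3x3 orthonormal matrix whose columns are the axis directions
    """
    radii = np.asarray(radii, dtype=float)
    rotation = np.asarray(rotation, dtype=float)
    return rotation @ np.diag(1.0 / radii**2) @ rotation.T


def is_inside(point, center, A):
    """True when (point - center)^T A (point - center) <= 1."""
    d = np.asarray(point, dtype=float) - np.asarray(center, dtype=float)
    return float(d @ A @ d) <= 1.0


# Axis-aligned ellipsoid with semi-axes 3, 2 and 1, centered on the origin.
A = ellipsoid_matrix([3.0, 2.0, 1.0], np.eye(3))
center = np.zeros(3)
print(is_inside([2.9, 0.0, 0.0], center, A))  # True
print(is_inside([0.0, 0.0, 1.1], center, A))  # False
```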
MVEE is the minimum volume enclosing ellipsoid. Given a set of points we can calculate the smallest ellipsoid that contains those points.
This is interesting for drug discovery as:
- Drug molecules have three-dimensional shapes that fit inside the proteins they bind to, e.g. HIV protease
- If we have one good drug (an active compound) another compound may also be active if it has the same shape
- Comparing the shapes of molecules can be difficult but comparing ellipsoids instead could be easier
Methods to calculate the MVEE from a set of points can be found online:
- Theory
- Another link to the above
- Python code to calculate MVEE and ellipse example
- Matlab MVEE
- Stackoverflow port Matlab MVEE to Python
- Stackoverflow calculate MVEE in Java
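The MVEE code is downloaded to get_ellipse.py and you can run it to calculate an MVEE from some random 2D points.
The MVEE code returns the symmetric matrix of the quadratic form that defines the MVEE. Decomposing that matrix using Singular Value Decomposition (SVD) yields its eigenvalues and eigenvectors (for a symmetric positive-definite matrix the singular values and vectors coincide with them). The eigenvectors give the directions of the ellipsoid axes. Assuming the form (x - c)^T A (x - c) <= 1, each semi-axis length is the reciprocal square root of the corresponding eigenvalue.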
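The sketch below shows one way this calculation can look. It is not the contents of get_ellipse.py. It is an illustrative version of Khachiyan's iterative algorithm, which is an assumption about the approach, and the names `mvee` and `ellipsoid_axes`, the tolerance and the iteration cap are all choices made for this example. The result is approximate, so points may lie very slightly outside the returned ellipsoid.

```python
import numpy as np


def mvee(points, tol=1e-3, max_iter=10000):
    """Approximate minimum volume enclosing ellipsoid (Khachiyan's algorithm).

    Returns (A, center) such that (x - center)^T A (x - center) <= 1
    holds, up to the tolerance, for every input point x.
    """
    P = np.asarray(points, dtype=float)
    n, d = P.shape
    Q = np.column_stack((P, np.ones(n))).T  # lifted points, shape (d+1, n)
    u = np.full(n, 1.0 / n)                 # one weight per point
    for _ in range(max_iter):
        X = (Q * u) @ Q.T
        # M[i] = q_i^T X^-1 q_i for every lifted point q_i
        M = np.einsum("ij,ji->i", Q.T @ np.linalg.inv(X), Q)
        j = int(np.argmax(M))
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step
        converged = np.linalg.norm(new_u - u) < tol
        u = new_u
        if converged:
            break
    center = P.T @ u
    A = np.linalg.inv((P.T * u) @ P - np.outer(center, center)) / d
    return A, center


def ellipsoid_axes(A):
    """Return (semi-axis lengths, directions); directions are the rows."""
    _, s, Vt = np.linalg.svd(A)
    return 1.0 / np.sqrt(s), Vt


# Six points on the coordinate axes at distances 3, 2 and 1 from the origin.
points = np.array([[3, 0, 0], [-3, 0, 0], [0, 2, 0],
                   [0, -2, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
A, center = mvee(points)
radii, directions = ellipsoid_axes(A)
print(radii)  # approximately [1. 2. 3.]
```

SVD returns the singular values in descending order, so the semi-axis lengths come out shortest first.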
When storing molecules in computer files we can use
- SMILES strings
- Mol block records, summary and full documentation
SMILES strings do not include molecular coordinates but Mol block records can.
We use RDKit to work with molecules in code.
Programming in Python is much easier with an IDE. Visual Studio Code is a great IDE for developing in Python.
There are a bunch of programs for viewing and editing molecules. PyMOL is a nice program for viewing molecules, and it has a Python API.
The free version has a manual install.
The PyMOL GUI can run Python scripts. See the manual.
This PyMOL script can display an ellipsoid in PyMOL. The file is downloaded to ELM_ellipsoid.py
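A PyMOL script is ordinary Python that calls the PyMOL API. As a rough illustration (this is not ELM_ellipsoid.py), the sketch below generates the text of a script that draws the three axes of an ellipsoid as Compiled Graphics Object cylinders. The function name `axes_script`, the object name `mvee_axes`, the cylinder radius and the colors are all assumptions made for this example. Generating the script as text means this code runs without PyMOL installed.

```python
def axes_script(center, radii, directions, name="mvee_axes"):
    """Return the text of a PyMOL script that draws each axis as a CGO cylinder.

    center     -- ellipsoid center (x, y, z)
    radii      -- the three semi-axis lengths
    directions -- three unit vectors, one per axis
    """
    colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    lines = [
        "from pymol import cmd",
        "from pymol.cgo import CYLINDER",
        "",
        "obj = []",
    ]
    for radius, direction, color in zip(radii, directions, colors):
        start = [c - radius * d for c, d in zip(center, direction)]
        end = [c + radius * d for c, d in zip(center, direction)]
        # CYLINDER takes: start xyz, end xyz, cylinder radius, start rgb, end rgb
        values = start + end + [0.05] + list(color) + list(color)
        numbers = ", ".join(f"{v:.3f}" for v in values)
        lines.append(f"obj.extend([CYLINDER, {numbers}])")
    lines.append(f'cmd.load_cgo(obj, "{name}")')
    return "\n".join(lines) + "\n"


script = axes_script(
    center=[0.0, 0.0, 0.0],
    radii=[3.0, 2.0, 1.0],
    directions=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
)
print(script)
```

Write the returned text to a `.py` file and run it from the PyMOL command line with `run` to see the axes.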
I have written some example code in Molecule.py. This:
- Starts with an example molecule defined by a SMILES string
- Calculates 3D coordinates using RDKit (a molecular conformation)
- Finds the MVEE that contains the 3D coordinates
- Uses SVD to determine the ellipsoid axes
- Writes out the 3D molecule as a Mol block file
- Writes out a PyMOL script that will load the molecule and display the MVEE
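The first steps of that list can be sketched with RDKit as follows. This is an illustration and not the code in Molecule.py: the helper name `smiles_to_3d`, the aspirin SMILES string and the random seed are assumptions chosen for the example. The resulting coordinate array is what an MVEE routine would take as input.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def smiles_to_3d(smiles, seed=42):
    """Parse a SMILES string and embed one 3D conformation (with hydrogens)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    mol = Chem.AddHs(mol)  # hydrogens contribute to the 3D shape
    # EmbedMolecule returns the conformer id, or -1 if embedding failed
    if AllChem.EmbedMolecule(mol, randomSeed=seed) < 0:
        raise RuntimeError(f"Could not generate 3D coordinates for {smiles!r}")
    return mol


mol = smiles_to_3d("CC(=O)Oc1ccccc1C(=O)O")  # aspirin, used only as an example
coords = mol.GetConformer().GetPositions()   # NumPy array, one row per atom
mol_block = Chem.MolToMolBlock(mol)          # text that can be saved as a .mol file

print(coords.shape)
print(mol_block.splitlines()[-1])
```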
Your computer already has
- Visual Studio Code
- Git Bash
- Python
- PyMOL
You will need to install
- Windows Terminal
- Visual Studio Code extensions
- GitHub account
We will need to spend some time making sure that everything works with your login.
- Use Toggl to track time
- Document using Markdown (as in this document)
- Run the Python code from the command line and load the result into PyMOL
- Get familiar with the Visual Studio Code IDE
- Explore using the debugger
- Change the code so that you can pass any SMILES string as a command line argument. Create ellipsoids for different structures
- Use the PyMOL API to add the ellipsoid axes to the visualization (see Compiled Graphics Objects in the PyMOL manual)
- Use unit tests so you can define and document your program's behavior.
- Use `git` to share and back up your project
- What happens when you try to create an ellipsoid for a "flat" structure like benzene? I guess it will crash. What can be done to fix that?
- More advanced: use RDKit to find all the rings in a molecule. Chop the molecule up into rings and the bits between them (cyclic and acyclic structures). Instead of one ellipsoid for a molecule, create multiple ellipsoids. Issues:
  - Fused rings
  - Atoms adjacent to rings
  - Do some atoms need to be in more than one ellipsoid?
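For the flat-structure question in the list above, one possible starting point is to check how many dimensions the coordinates really span before fitting an ellipsoid. When all points lie exactly in one plane, the matrix that an MVEE calculation has to invert becomes singular. The sketch below is only a suggestion: the function name `effective_dimension` and the relative tolerance are assumptions, and coordinates generated for a real molecule may be nearly flat and not exactly flat.

```python
import numpy as np


def effective_dimension(coords, rel_tol=1e-6):
    """Count the directions in which the points have non-negligible spread."""
    X = np.asarray(coords, dtype=float)
    centered = X - X.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)  # spread along each principal direction
    if s[0] == 0.0:
        return 0  # all points coincide
    return int(np.sum(s > rel_tol * s[0]))


# A regular hexagon in the z = 0 plane, standing in for the carbons of benzene.
angles = np.arange(6) * np.pi / 3.0
hexagon = np.column_stack((np.cos(angles), np.sin(angles), np.zeros(6)))
tetrahedron = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

print(effective_dimension(hexagon))      # 2
print(effective_dimension(tetrahedron))  # 3
```

If the result is less than 3, the program could report a clear error, fit a 2D ellipse in the plane of the molecule, or give the flat direction a small thickness before fitting. Which of these is best for comparing molecular shapes is an open design question.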