One API. 60+ molecular formats and libraries. From PDB to simulation in a single workflow.
Why MolSysMT? | Installation | Quickstart | Supported forms | Documentation | Citation
MolSysMT is a Python library for working with molecular systems across formats and simulation engines through a uniform API. Load a PDB file, query its topology, repair missing atoms, solvate it, and hand it off to OpenMM — all without leaving Python and without writing format-specific boilerplate.
Most molecular simulation libraries are excellent at one thing:
- MDTraj is fast for trajectory analysis, but its topology is read-only.
- MDAnalysis has a rich selection language, but interoperating with OpenMM requires glue code.
- OpenMM is the gold standard for GPU-accelerated simulation, but it does not help you prepare or repair a structure.
MolSysMT sits between these tools. Its role is to make the handoffs transparent:

    import molsysmt as msm

    # PDB file → OpenMM Simulation in ~5 lines, with structure preparation
    mol = msm.convert('1l2y.pdb', to='molsysmt.MolSys')
    mol = msm.build.add_missing_hydrogens(mol, pH=7.4, engine='MolSysMT')
    mol = msm.build.solvate(mol, box_shape='cubic', clearance='12 angstroms',
                            water_model='TIP3P', ionic_strength='0.15 molar')
    sim = msm.convert(mol, to='openmm.Simulation', forcefield='amber14-all.xml')

Key design decisions:
- Single selection language across all forms: the same selection='molecule_type=="protein"' string works on a PDB file, an MDTraj Trajectory, an MDAnalysis Universe, or a native MolSys.
- get/set/convert as the three fundamental operations, with no form-specific accessors to memorize.
- Numba JIT kernels for structure analysis (RMSD, distances, dihedral angles, radius of gyration, PCA, …) with optional CUDA GPU dispatch.
- Native structure preparation (missing heavy atoms, terminal cappings, H placement, solvation with ions) that does not require an OpenMM or PDBFixer installation.
- No heavy mandatory dependencies — MDTraj, MDAnalysis, OpenMM, and RDKit are optional; MolSysMT loads only what your workflow actually needs.
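To make the "one call signature, many forms" idea in the list above concrete, here is a minimal sketch of form-based dispatch in plain Python. It is an illustration only and does not reproduce MolSysMT's internals: the registry `ACCESSORS`, the functions `get_form` and `get`, and the form names `toy:atom_list` and `toy:xyz_text` are all hypothetical.

```python
from typing import Any, Callable, Dict

# Hypothetical registry (not MolSysMT internals):
# form name -> {attribute name -> function that reads it from an item}.
ACCESSORS: Dict[str, Dict[str, Callable[[Any], Any]]] = {
    "toy:atom_list": {
        "n_atoms": lambda item: len(item),
        "elements": lambda item: [atom["element"] for atom in item],
    },
    "toy:xyz_text": {
        # XYZ text: line 1 is the atom count, line 2 a comment, then one atom per line.
        "n_atoms": lambda item: int(item.splitlines()[0]),
        "elements": lambda item: [
            line.split()[0] for line in item.splitlines()[2:] if line.strip()
        ],
    },
}


def get_form(item: Any) -> str:
    """Detect the (toy) form of an item from its Python type."""
    if isinstance(item, list):
        return "toy:atom_list"
    if isinstance(item, str):
        return "toy:xyz_text"
    raise TypeError(f"unknown form for {type(item).__name__}")


def get(item: Any, **attributes: bool):
    """Read the requested attributes with the same call, whatever the form."""
    form = get_form(item)
    values = [
        ACCESSORS[form][name](item)
        for name, wanted in attributes.items()
        if wanted
    ]
    return values[0] if len(values) == 1 else tuple(values)


atoms = [{"element": "O"}, {"element": "H"}, {"element": "H"}]
xyz = "3\nwater\nO 0.000 0.000 0.117\nH 0.000 0.757 -0.469\nH 0.000 -0.757 -0.469\n"

print(get(atoms, n_atoms=True, elements=True))
print(get(xyz, n_atoms=True, elements=True))
```

Both calls return the same tuple even though one item is a list of dictionaries and the other is raw text. Supporting a new form in this sketch means adding one registry entry, with no change to the caller.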
Install with conda from the uibcdf and conda-forge channels:

    conda install -c uibcdf -c conda-forge molsysmt

Requires Python 3.11–3.13. Optional dependencies (MDTraj, MDAnalysis, OpenMM, ParmEd, nglview, RDKit, …) are loaded on demand when available.
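Loading optional packages on demand is commonly done with the standard library's `importlib`. The sketch below shows that general pattern. It is not taken from the MolSysMT code base, and the helper names `optional_import` and `require` are hypothetical.

```python
import importlib
import importlib.util
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the top-level module `name` if it is installed, otherwise None."""
    # find_spec only locates the package; nothing is imported unless it exists.
    if importlib.util.find_spec(name) is None:
        return None
    return importlib.import_module(name)


def require(name: str, feature: str) -> ModuleType:
    """Import an optional package or fail with a message naming the feature."""
    module = optional_import(name)
    if module is None:
        raise ImportError(f"{feature} needs the optional package '{name}'")
    return module


# A standard-library module stands in for an installed optional dependency.
json_module = optional_import("json")
missing = optional_import("package_that_does_not_exist_xyz")
```

With this pattern, the import cost and any installation error are deferred until a feature that needs the package is first used.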
To work from a development checkout, clone the repository and install it in editable mode with the dev extras:

    git clone https://github.com/uibcdf/molsysmt.git
    cd molsysmt
    pip install -e ".[dev]"

The snippets below load a structure, convert it between forms, prepare it, analyze it, and display it:

    import molsysmt as msm

    # Load a structure and query basic attributes
    mol = msm.convert('1l2y.pdb', to='molsysmt.MolSys')

    n_atoms, n_residues, n_chains = msm.get(mol, n_atoms=True, n_groups=True, n_chains=True)
    # (304, 20, 1)

    seq = msm.get(mol, element='molecule', selection='molecule_type=="protein"',
                  attribute='sequence')
    # ['NLYIQWLKDGGPSSGRPPPS']

    # MolSys → MDTraj → MDAnalysis → back to MolSys, topology preserved throughout
    traj = msm.convert(mol, to='mdtraj.Trajectory')
    uni = msm.convert(traj, to='mdanalysis.Universe')
    mol2 = msm.convert(uni, to='molsysmt.MolSys')

    msm.compare(mol, mol2, attributes=['n_atoms', 'sequence', 'bonds'])
    # {'n_atoms': True, 'sequence': True, 'bonds': True}

    # Structure preparation, starting from an unprepared PDB file
    mol = msm.convert('raw_structure.pdb', to='molsysmt.MolSys')

    # Diagnose
    missing_heavy = msm.build.get_missing_heavy_atoms(mol)
    missing_caps = msm.build.get_missing_terminal_cappings(mol)

    # Repair (no external dependencies required)
    mol = msm.build.add_missing_heavy_atoms(mol, engine='MolSysMT')
    mol = msm.build.add_missing_terminal_cappings(mol, engine='MolSysMT')
    mol = msm.build.add_missing_hydrogens(mol, pH=7.4, engine='MolSysMT')

    # Solvate
    mol = msm.build.solvate(mol, box_shape='truncated_octahedral',
                            clearance='12 angstroms', water_model='TIP3P',
                            ionic_strength='0.15 molar', engine='MolSysMT')

    # RMSD, radius of gyration, dihedral angles: Numba JIT, optionally GPU
    rmsd = msm.structure.get_rmsd(mol, selection='backbone')
    rg = msm.structure.get_radius_of_gyration(mol)
    phi = msm.structure.get_dihedral_angles(mol, dihedral='phi')
    psi = msm.structure.get_dihedral_angles(mol, dihedral='psi')

    # Secondary structure via DSSP
    ss = msm.structure.get_secondary_structure(mol)

    # Visualization
    view = msm.view(mol, standard=True)
    view  # inline in Jupyter

MolSysMT can read, write, and interconvert 60+ molecular forms organized in three tiers:
| Tier | Forms |
|---|---|
| Tier 1 (stable, full test coverage) | molsysmt.MolSys, file:pdb, file:h5msm, file:xtc, file:bcif/bcif.gz, file:molsys_yaml, openmm.Topology, mdtraj.Trajectory, mdtraj.Topology, string:pdb_id, string:alphafold_id |
| Tier 2 (best-effort) | MDAnalysis Universe, OpenMM Modeller/Simulation, RDKit Mol, Biopython PDBStructure, ParmEd Structure, NGLView, MolSysViewer |
| Tier 3 (experimental) | NetworkX, pytraj, XYZ, and others |
Any Tier 1 form can be converted to any other Tier 1 form with a single msm.convert() call.
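One common way to offer single-call conversion between many forms is to register direct converters and chain them along the shortest route. The sketch below illustrates that idea with toy data. Whether MolSysMT routes conversions this way is an assumption, and `CONVERTERS`, `find_path`, `convert`, and the `toy:*` form names are hypothetical.

```python
from collections import Counter, deque
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical direct converters: (source form, target form) -> function.
CONVERTERS: Dict[Tuple[str, str], Callable[[Any], Any]] = {
    ("toy:csv", "toy:list"): lambda text: text.split(","),
    ("toy:list", "toy:csv"): lambda items: ",".join(items),
    ("toy:list", "toy:counts"): lambda items: dict(Counter(items)),
}


def find_path(source: str, target: str) -> List[str]:
    """Breadth-first search for the shortest chain of forms from source to target."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for (src, dst) in CONVERTERS:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    raise ValueError(f"no conversion route from {source} to {target}")


def convert(item: Any, source: str, to: str) -> Any:
    """Apply each direct converter along the route, hiding the hops from the caller."""
    path = find_path(source, to)
    for src, dst in zip(path, path[1:]):
        item = CONVERTERS[(src, dst)](item)
    return item


# No direct csv -> counts converter exists; the call goes through toy:list.
element_counts = convert("O,H,H", source="toy:csv", to="toy:counts")
print(find_path("toy:csv", "toy:counts"), element_counts)
```

The caller sees a single `convert` call even when the route has intermediate hops. A pair of forms with no route between them raises a `ValueError` instead of failing silently.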
Full documentation, tutorials, and API reference: https://www.uibcdf.org/MolSysMT/
The devguide/ directory in this repository contains the developer guide,
architecture documentation, and contribution guidelines.
Contributions are welcome. Please open an issue before submitting a pull request for non-trivial changes.
To run the test suite locally:

    # Fast smoke tier (seconds)
    make -C devtools/tests smoke

    # Full suite, distributed across cores
    make -C devtools/tests test

See devguide/testing_strategy.md for the full testing policy.
MolSysMT is distributed under the MIT license. See LICENSE for details.
- Liliana M. Moreno Vargas
- Diego Prada Gracia
See CONTRIBUTORS.md for the full list.
If you use MolSysMT in your research, please cite the software release:
A methods paper describing MolSysMT is in preparation. Please check the documentation for the most up-to-date citation instructions.
Thanks to the developers and maintainers of the libraries MolSysMT builds on: MDTraj, MDAnalysis, OpenMM, AmberTools, ParmEd, nglview, RDKit, Biopython, and others.
- Daniel Ibarrola Sánchez for his contributions to the early development of MolSysMT.