msBayesImpute is a versatile framework for handling missing values in mass spectrometry (MS) proteomics data.
It integrates probabilistic dropout models with Bayesian matrix factorization in a fully data-driven manner,
allowing it to account for both missing at random (MAR) and missing not at random (MNAR) patterns.
This repository contains the Python implementation of msBayesImpute, built on Pyro, a probabilistic programming language.
The R version is available as a separate R package, msBayesImpute.
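To make the MAR/MNAR distinction above concrete, the following is a minimal sketch that simulates both kinds of missingness on a toy log2-intensity matrix. It is not the msBayesImpute model or its API: the function `simulate_missingness`, its parameters (`mar_rate`, `midpoint`, `slope`), and the logistic form of the dropout curve are assumptions chosen for illustration, and it uses only the Python standard library.

```python
import math
import random


def simulate_missingness(intensities, mar_rate=0.05, midpoint=20.0, slope=1.5, seed=0):
    """Mask values with a mix of MAR and MNAR missingness (illustrative only).

    MAR: every value is dropped with the same fixed probability `mar_rate`.
    MNAR: every value is dropped with a logistic probability that rises as
    the intensity falls below `midpoint` (hypothetical dropout curve).
    """
    rng = random.Random(seed)
    masked = []
    for row in intensities:
        masked_row = []
        for x in row:
            # Close to 1 for low intensities, close to 0 for high ones.
            p_mnar = 1.0 / (1.0 + math.exp(slope * (x - midpoint)))
            dropped = rng.random() < mar_rate or rng.random() < p_mnar
            masked_row.append(None if dropped else x)
        masked.append(masked_row)
    return masked


# Toy matrix: 200 proteins x 10 samples, one abundance level per protein plus noise.
rng = random.Random(1)
truth = []
for _ in range(200):
    level = rng.uniform(16.0, 28.0)
    truth.append([level + rng.gauss(0.0, 0.5) for _ in range(10)])

masked = simulate_missingness(truth)

# Compare the true intensities of dropped values with those of observed values.
pairs = [(t, m) for t_row, m_row in zip(truth, masked) for t, m in zip(t_row, m_row)]
dropped = [t for t, m in pairs if m is None]
observed = [t for t, m in pairs if m is not None]
mean_dropped = sum(dropped) / len(dropped)
mean_observed = sum(observed) / len(observed)

print(f"missing fraction: {len(dropped) / len(pairs):.2f}")
print(f"mean true intensity, dropped:  {mean_dropped:.2f}")
print(f"mean true intensity, observed: {mean_observed:.2f}")
```

Because the MNAR component removes low intensities preferentially, the dropped values have a lower mean than the observed ones. This is the bias that an imputation method ignoring the dropout mechanism would carry into its estimates.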
msbayesimputepy/
├── data/ # Example dataset (HeLa cell line proteomics data)
├── msbayesimputepy/ # Python implementation of msBayesImpute
├── msbayesimputepy.egg-info/ # Metadata for the Python package
├── dist/ # Pre-built Python wheel package
├── vignettes/ # Example usage (see quick_guide_python.ipynb)
├── requirements.txt # Package dependencies
└── README.md

Install the Python package from the pre-built wheel in the dist/ folder:

pip install dist/msbayesimputepy-0.2.0-py3-none-any.whl

- See the Jupyter notebook in vignettes/quick_guide_python.ipynb for a quick start.
- Example dataset: provided in the data/ folder (HeLa cell line proteomics).

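After installing the wheel, one way to confirm that the package is visible to the current Python environment is to query its metadata with the standard library. This is a small sketch, not part of the package: the helper `installed_version` is hypothetical, and the distribution name `msbayesimputepy` is assumed from the wheel filename.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the distribution name matches the prefix of the wheel filename.
version = installed_version("msbayesimputepy")
if version is None:
    print("msbayesimputepy is not installed in this environment")
else:
    print(f"msbayesimputepy {version} is installed")
```

If the check reports that the package is missing, the usual cause is that pip installed the wheel into a different environment than the one running the interpreter.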
If you use msBayesImpute in your research, please cite:
He J, et al. bioRxiv (2025). msBayesImpute: A Versatile Framework for Addressing Missing Values in Biomedical Mass Spectrometry Proteomics Data