mrpeg is a Python package for association testing that integrates perturbational screens, eQTL, and GWAS summary data to identify mediating genes of complex traits.
- We detest any use of our software or scientific outcomes to promote racial discrimination.

The Mr. PEG manuscript is described in
Zeyun Lu, Yi Ding, Nathan LaPierre, Lili Wang, Douglas Yao, Nicholas Mancuso, Alexander Gusev
Check here for full documentation.
Installation | Example | Version History | Support
- Before installation, we highly recommend creating a new environment using conda so that it does not affect the software versions of your other projects. For example, use the following commands:

      conda create -n env-mrpeg python=3.10
      conda activate env-mrpeg

- If you are using a Mac with an Apple M1 or newer chip, you should set up conda using miniforge to ensure compatibility (see this link for a previous issue). On most HPC systems, this is usually not necessary.

- Last, users can download the latest repository and then install it with pip:

      git clone https://github.com/gusevlab/mrpeg.git
      cd mrpeg
      pip install .
The mrpeg software is easy to use.
To perform inference:
cd ./data/
mrpeg peg --gwas example_gwas.tsv.gz \
--eqtl example_eqtl.tsv.gz \
--perturb example_perturbation.tsv.gz \
--gwas-cols chrom snp a1 a0 beta se \
--eqtl-cols chrom snp a1 a0 z gene \
--ref-geno plink/geno_chr\* \
--trait "mediating_gene" \
--top-signal 1 \
-o tmp_results_mediating_gene

We also implement a function to compute GWAS signals given gene annotations:
cd ./data/
mrpeg signal --gwas example_gwas.tsv.gz \
--gwas_cols chrom snp pos beta se \
--ref ref_gene_info.tsv.gz \
--ref_cols CHR P_MID_FLANK0 P_MID_FLANK1 ID2 \
--window 0 \
--trait example \
--chr 1 \
-o tmp_results_signal

See here for more details on how to use mrpeg.
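The `--gwas-cols` argument in the commands above names the columns of the GWAS summary file. As a minimal sketch, assuming the file is gzip-compressed, tab-separated, and has a header row (the file name `toy_gwas.tsv.gz` and all values below are made up for illustration, and this is not how mrpeg itself parses input), the following writes such a file and derives z-scores as `beta / se`, which is the standard relation between the `beta`/`se` columns and a `z` column:

```python
import csv
import gzip
import os
import tempfile

# Hypothetical toy records; column names mirror the --gwas-cols example above.
columns = ["chrom", "snp", "a1", "a0", "beta", "se"]
rows = [
    [1, "rs1", "A", "G", 0.10, 0.02],
    [1, "rs2", "C", "T", -0.03, 0.01],
    [1, "rs3", "G", "A", 0.00, 0.05],
]

# Write a small gzip-compressed TSV file with a header row (assumed layout).
path = os.path.join(tempfile.mkdtemp(), "toy_gwas.tsv.gz")
with gzip.open(path, "wt", newline="") as handle:
    writer = csv.writer(handle, delimiter="\t")
    writer.writerow(columns)
    writer.writerows(rows)

# Read it back and compute one z-score per SNP as beta divided by se.
with gzip.open(path, "rt", newline="") as handle:
    reader = csv.DictReader(handle, delimiter="\t")
    z_scores = {rec["snp"]: float(rec["beta"]) / float(rec["se"]) for rec in reader}

print(z_scores)
```

Checking a file this way before running the command line tool can catch mismatched column names early.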
If you want to call the mrpeg inference function directly from Python, you can use the following code as an example:
from mrpeg.peg import infer_peg
# betas is a numpy array of GWAS effects (k by 1)
# ses is a numpy array of GWAS standard errors (k by 1)
# eqtls is a numpy array of eQTL effects (k by 1)
# perturbs is a numpy array of perturbation effects (k by t) of k perturbed genes on t downstream genes
# lds is a numpy array (k by k) of LD across SNPs
infer_peg(beta=betas, se=ses, eqtl=eqtls, perturb=perturbs, ld=lds)

You can customize this function with your own ideas!
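To make the expected array shapes concrete, here is a minimal sketch that builds synthetic inputs with the dimensions listed in the comments above. The sizes `n`, `k`, and `t`, the random values, and the helper `check_peg_inputs` are all hypothetical and are not part of mrpeg. The LD matrix is taken as the correlation matrix of simulated genotypes, which is one common way to obtain it. The numbers carry no biological meaning and only illustrate the layout.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, t = 500, 5, 3  # hypothetical sizes: individuals, rows (k), downstream genes (t)

# Simulated genotypes, used only to obtain a valid k-by-k correlation (LD) matrix.
geno = rng.binomial(2, 0.3, size=(n, k)).astype(float)
lds = np.corrcoef(geno, rowvar=False)

betas = rng.normal(0.0, 0.05, size=(k, 1))    # GWAS effects, k by 1
ses = np.full((k, 1), 0.02)                   # GWAS standard errors, k by 1
eqtls = rng.normal(0.0, 1.0, size=(k, 1))     # eQTL effects, k by 1
perturbs = rng.normal(0.0, 1.0, size=(k, t))  # perturbation effects, k by t


def check_peg_inputs(beta, se, eqtl, perturb, ld):
    """Hypothetical helper: verify the shapes described in the comments above."""
    k_rows = beta.shape[0]
    assert beta.shape == (k_rows, 1)
    assert se.shape == (k_rows, 1)
    assert eqtl.shape == (k_rows, 1)
    assert perturb.ndim == 2 and perturb.shape[0] == k_rows
    assert ld.shape == (k_rows, k_rows)
    assert np.allclose(ld, ld.T)  # an LD matrix is symmetric
    return True


inputs_ok = check_peg_inputs(betas, ses, eqtls, perturbs, lds)

# With mrpeg installed, arrays of this form could be passed as in the example above:
# infer_peg(beta=betas, se=ses, eqtl=eqtls, perturb=perturbs, ld=lds)
```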
| Version | Description |
|---|---|
| 0.2 | Added input format specifications, troubleshooting guide, and performance guide. Fixed documentation errors carried over from a previous project. Added missing intervaltree dependency. Added type hints to closest and signal modules. Added test suite with 50 unit tests. This update was completely done using Claude Code with human verification. |
| 0.1 | Initial Release |
For any questions, comments, bug reports, and feature requests, please contact Zeyun Lu (zeyun_lu@dfci.harvard.edu) and Sasha Gusev (alexander_gusev@dfci.harvard.edu), and open a new thread in the Issue Tracker.
This project has been set up using PyScaffold 4.1.1. For details and usage information on PyScaffold see https://pyscaffold.org/.