This repository provides a Snakemake-driven pipeline that runs ArchR-based analyses of single-nucleus ATAC-seq (snATAC-seq) data from human kidney biopsies. The pipeline produces ArchR projects, peak matrices, motif enrichment results, and derived files used downstream by the companion snRNA-seq analysis in github.com/saezlab/kidney_biopsies_MCFA.
This pipeline generates the following outputs, which the companion RNA repository references in its data/ATAC folder:
- out/archr/Atlas/ATAC/coordinates.csv
- out/archr/Atlas/ATAC/enrich_motifs.rds
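Once the pipeline has finished, the two outputs need to reach the companion repository. The following is a minimal sketch of one way to do that, not part of this repository: the helper name `stage_outputs` is hypothetical, and it assumes the companion repository expects both files directly inside data/ATAC under their original file names.

```python
import shutil
from pathlib import Path

# Output paths as listed in this README, relative to the pipeline repository root.
PIPELINE_OUTPUTS = (
    "out/archr/Atlas/ATAC/coordinates.csv",
    "out/archr/Atlas/ATAC/enrich_motifs.rds",
)


def stage_outputs(pipeline_root, companion_root, outputs=PIPELINE_OUTPUTS):
    """Copy pipeline outputs into <companion_root>/data/ATAC and return the new paths.

    Assumption: the companion repository reads the files from data/ATAC
    using their original file names (flat layout).
    """
    sources = [Path(pipeline_root) / rel for rel in outputs]

    # Fail before copying anything if an expected output has not been produced.
    missing = [str(src) for src in sources if not src.is_file()]
    if missing:
        raise FileNotFoundError(f"missing pipeline outputs: {missing}")

    dest_dir = Path(companion_root) / "data" / "ATAC"
    dest_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src, dest_dir / src.name)) for src in sources]
```

Called as `stage_outputs(".", "../kidney_biopsies_MCFA")` from the root of this repository, it would copy both files into a sibling checkout of the companion repository; the relative location of that checkout is also an assumption.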
- The analysis is implemented with Snakemake. Primary rules live in workflow/rules/ (see workflow/rules/archr.smk) and are organized by ArchR analysis stage: input checks and preprocessing, creation of ArchR projects, dimensionality reduction and clustering, peak calling, motif enrichment and footprinting, and export of peak and gene score matrices.
- Analysis scripts used in the above rules are in workflow/scripts/.
- Default ArchR outputs are written under out/archr/ (ArchR projects, peak files, motif enrichment RDS files). Additional summary tables and figures are written to results/ and plots/ (these output folders are git-ignored).
- Other:
  - Parameters for the initial dimensionality reduction and manual annotation of the snATAC-seq data are held in config/config.yaml and config/manual_annotations_atac.yaml.
  - Cluster/job submission helpers are in config/slurm/.
- Snakemake v7.30.2 (this workflow was implemented and tested with this version).
- Cluster access (Slurm). The config/slurm/ folder contains an example jobscript and utilities for submitting to Slurm, which may need adapting to the job submission requirements of your Slurm system.
- Data (see below).
- Singularity images in SIF format (see below).
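Because the workflow was only tested with one Snakemake release, it can be useful to compare the installed version against it before running. The sketch below is illustrative and not part of this repository; the function names are hypothetical, and requiring an exact match is a deliberately strict assumption (other releases may work but are not covered by the statement above).

```python
import re
import shutil
import subprocess

TESTED_VERSION = "7.30.2"  # the version this README states the workflow was tested with


def parse_version(text):
    """Return the first major.minor.patch number found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def installed_snakemake_version():
    """Return the output of `snakemake --version`, or None if snakemake is not on PATH."""
    if shutil.which("snakemake") is None:
        return None
    result = subprocess.run(
        ["snakemake", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def matches_tested_version(installed, tested=TESTED_VERSION):
    """True if the installed version is exactly the tested one."""
    return parse_version(installed) == parse_version(tested)
```

A typical use would be to call `installed_snakemake_version()` and, if it returns a string, pass it to `matches_tested_version` and print a warning on a mismatch rather than aborting.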
Two inputs are required and will be made available on Zenodo:
- out/archr/Atlas/ATAC/proj_2
- data/annotations/label_transfer_RNA_ATAC_V3_clean.csv
Additionally, the repository contains code to generate the ArchR project proj_2 from the raw fragment files.
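Before launching the workflow, it can help to confirm that both required inputs are in place. The following is a minimal sketch, assuming the two paths are relative to the repository root; the helper name `missing_inputs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# The two required inputs named in this README, relative to the repository root.
REQUIRED_INPUTS = (
    "out/archr/Atlas/ATAC/proj_2",
    "data/annotations/label_transfer_RNA_ATAC_V3_clean.csv",
)


def missing_inputs(repo_root=".", required=REQUIRED_INPUTS):
    """Return the required paths (in order) that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).exists()]
```

Running `missing_inputs()` from the repository root returns an empty list when both inputs are present, and otherwise lists what still needs to be downloaded or generated.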
The repository references several Singularity images. Access links will be added once the .sif files are available on Zenodo.
The initial ArchR project proj_2 was generated using an ArchR conda environment, built from the YAML file and post-deployment script in workflow/envs. Building the conda environment took multiple hours and was vulnerable to connection timeouts, so later parts of the project use a Singularity image instead: archr_1.0.3-base-r4.1_0.0.1.sif. This image is based on the ArchR Docker image from the Greenleaf Lab, with some additional packages.
- Dry run (see what would be executed):

      snakemake -n

- Run on a Slurm cluster using the provided Slurm scripts (example):

      # Adapt config/slurm/config.yaml for your cluster, then:
      snakemake --profile config/slurm

The authors acknowledge support by the state of Baden-Württemberg through bwHPC and the German Research Foundation (DFG) through grant INST 35/1597-1 FUGG, as well as the data storage service SDS@hd supported by the Ministry of Science, Research and the Arts Baden-Württemberg (MWK) and the German Research Foundation (DFG) through grant INST 35/1503-1 FUGG. Charlotte Boys gratefully acknowledges DFG funding through the Clinical Research Unit 5011 InteraKD (Project ID: 445703531).
This project is distributed under the GNU General Public License v3.0. See LICENSE for full details.
Maintainer: Charlotte Boys