Process to generate Chromatin State Expression (CSE) matrix
This pipeline takes BAM files from bulk RNA-seq data generated by nf-core and produces a Chromatin State Expression (CSE) matrix. The annotations are based on the full-stack ChromHMM annotations from here.
- Copy or link the BAM files to a folder called `data`.
- List the sample names (no `.bam` suffix), one per line, in the text file `sampleNames.txt`.
- Run the snakemake command.
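
The sample list can also be generated from the contents of the data folder rather than typed by hand. The following is a minimal sketch, not part of the pipeline itself: the helper name `write_sample_names` is hypothetical, and it assumes every file ending in `.bam` in `data` is a sample that should be processed.

```python
from pathlib import Path


def write_sample_names(data_dir="data", out_file="sampleNames.txt"):
    """Write one sample name per line, derived from the .bam files in data_dir.

    Hypothetical helper: assumes each *.bam file in data_dir is one sample.
    Index files such as *.bam.bai are not matched by the pattern.
    """
    # Path.stem drops only the final ".bam" suffix, e.g. "s1.sorted.bam" -> "s1.sorted"
    names = sorted(p.stem for p in Path(data_dir).glob("*.bam"))
    Path(out_file).write_text("".join(f"{name}\n" for name in names))
    return names


# Example call, run from the pipeline directory:
# write_sample_names("data", "sampleNames.txt")
```

Sorting keeps the file stable across runs. If only a subset of the BAM files should be processed, edit the resulting `sampleNames.txt` before running snakemake.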
The pipeline produces the following outputs:
- `DNX_data_count_anno.tsv` contains the counts for each chromatin state feature (rows) for each sample (columns).
- The `peakCountBySample_broadPeaks` folder contains the MACS3 peak-calling outputs for each sample, for both the + and - strands.
- The `peakCountBySample_broadPeaks_gtf` folder contains the peak annotations in GTF format.
- `peakCountBySample_broadPeaks_featureCounts` contains the counts for each peak feature, annotated with chromatin states, for each sample.
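
For downstream work, the count matrix can be read with the Python standard library. This is a rough sketch under assumptions the text above does not confirm: the function name `load_counts` is hypothetical, the file is assumed to have one header row with the sample names, the first column is assumed to hold the chromatin state feature identifier, and all remaining cells are assumed to be numeric. If the file carries extra annotation columns, the numeric conversion will need adjusting.

```python
import csv


def load_counts(path):
    """Read a tab-separated count table into (samples, counts).

    Assumed layout (not confirmed by the pipeline documentation):
      - header row: a feature column name followed by one column per sample
      - each later row: feature identifier followed by numeric counts
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        samples = header[1:]
        # Map each feature to its list of per-sample counts, skipping blank rows
        counts = {row[0]: [float(x) for x in row[1:]] for row in reader if row}
    return samples, counts


# Example call:
# samples, counts = load_counts("DNX_data_count_anno.tsv")
```

A dictionary keyed by feature keeps the example free of third-party dependencies. For larger analyses a data-frame library would likely be more convenient.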