ChIP_SP: Spatial ChIP (ChIP-SP) as a New Bioinformatics Tool to Characterize Spatial Gene Regulation

Overview

ChIP_SP is an R package implementing the core ChIP-SP algorithm for integrating ChIP-seq transcription factor binding with Hi-C chromatin interaction data to identify spatially linked regulatory regions.

The package is designed to provide the core computational engine for ChIP-SP analysis: Step 0, optional removal of sex chromosomes (chrX, chrY); Step 1, merging Hi-C loop outputs across replicates and resolutions; and Step 2, spatial linking of ChIP-seq peaks to distal Hi-C loop anchors and ranking interactions.

Downstream analyses, including gene annotation, pathway enrichment, and visualization, are intentionally excluded from the package API and are instead provided as reference R scripts in inst/scripts/. This design allows users full flexibility in annotation strategy and downstream interpretation.

Conceptual Framework

Hi-C identifies chromatin loops, but it does not resolve the exact nucleotide-level contact point within each interacting bin. ChIP-SP addresses this limitation by identifying ChIP-seq peaks that overlap one anchor of a Hi-C loop, assigning the partner anchor as a spatially linked regulatory region, and treating the full interacting Hi-C anchor region as a potential transcription factor regulatory site.
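To make the linking idea concrete, here is a minimal base R sketch of the overlap-and-partner logic described above. It is an illustration under stated assumptions, not the package's implementation: the toy tables, their column names (chr1, start1, end1, chr2, start2, end2), and the helper names overlaps() and link_peaks() are all hypothetical.

```r
# Hypothetical toy data; column names are assumptions for illustration only.
peaks <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(1200, 90000, 81000),
  end   = c(1400, 90300, 81200),
  stringsAsFactors = FALSE
)
loops <- data.frame(
  chr1 = c("chr1", "chr2"), start1 = c(1000, 40000),   end1 = c(6000, 45000),
  chr2 = c("chr1", "chr2"), start2 = c(250000, 80000), end2 = c(255000, 85000),
  stringsAsFactors = FALSE
)

# TRUE where two closed intervals on the same chromosome share a position.
overlaps <- function(chr_a, start_a, end_a, chr_b, start_b, end_b) {
  chr_a == chr_b & start_a <= end_b & end_a >= start_b
}

# For each peak, find loops with an anchor the peak overlaps and report the
# opposite (partner) anchor as the spatially linked region.
link_peaks <- function(peaks, loops) {
  out <- list()
  for (i in seq_len(nrow(peaks))) {
    p <- peaks[i, ]
    hit1 <- overlaps(p$chr, p$start, p$end, loops$chr1, loops$start1, loops$end1)
    hit2 <- overlaps(p$chr, p$start, p$end, loops$chr2, loops$start2, loops$end2)
    if (any(hit1)) {  # peak sits on anchor 1, so the partner is anchor 2
      out[[length(out) + 1]] <- data.frame(
        peak_index = i, chr = loops$chr2[hit1],
        start = loops$start2[hit1], end = loops$end2[hit1],
        stringsAsFactors = FALSE
      )
    }
    if (any(hit2)) {  # peak sits on anchor 2, so the partner is anchor 1
      out[[length(out) + 1]] <- data.frame(
        peak_index = i, chr = loops$chr1[hit2],
        start = loops$start1[hit2], end = loops$end1[hit2],
        stringsAsFactors = FALSE
      )
    }
  }
  if (length(out) == 0) {
    return(data.frame(peak_index = integer(0), chr = character(0),
                      start = numeric(0), end = numeric(0)))
  }
  do.call(rbind, out)
}

linked <- link_peaks(peaks, loops)
print(linked)
```

In this toy example the first peak falls on the first anchor of the chr1 loop and the third peak falls on the second anchor of the chr2 loop, so each is linked to the opposite anchor. The second peak overlaps no anchor and is dropped.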

Spatial interactions are ranked using both ChIP-seq peak strength (pileup) and Hi-C loop confidence (FDR). The resulting ranked regions can then be used for downstream analyses such as gene annotation with ChIPpeakAnno, UCSC Genome Browser visualization, KEGG or pathway enrichment analysis, and transcription factor enrichment analysis.
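The exact scoring formula is not given in this README. The sketch below shows one plausible way to combine the two signals, assuming min-max normalization of pileup and of -log10(FDR) followed by a simple sum. The toy values, the column names, and the min_max() helper are hypothetical, and the package may use a different formula.

```r
# Hypothetical linked interactions; pileup and FDR values are made up.
linked <- data.frame(
  region = c("A", "B", "C"),
  pileup = c(50, 200, 120),
  fdr    = c(1e-2, 1e-6, 1e-4),
  stringsAsFactors = FALSE
)

# Rescale a numeric vector to [0, 1]; a constant vector maps to all zeros.
min_max <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(rep(0, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

# Assumed scoring: stronger peaks and smaller FDRs both raise the score.
# An FDR of exactly 0 would need a small floor before the log transform.
linked$pileup_norm <- min_max(linked$pileup)
linked$fdr_norm    <- min_max(-log10(linked$fdr))
linked$score       <- linked$pileup_norm + linked$fdr_norm

# Highest combined score first.
ranked <- linked[order(linked$score, decreasing = TRUE), ]
print(ranked)
```

Summing two normalized terms weights them equally. A weighted sum or a product would be an equally reasonable choice in a sketch like this.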

Installation

Install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("Lattesnow/ChIP_SP")

Then load the package with:

library(ChIP_SP)

Current Package Functions

The current exported package API includes removeXYChromosomes(), mergeHiCLoops(), and chipSPLink().

removeXYChromosomes() is an optional preprocessing function that removes rows on chromosome X and chromosome Y from a genomic interval table. mergeHiCLoops() merges Hi-C loop output files across replicates and/or resolutions into a single loop table. chipSPLink() performs the core ChIP-SP spatial integration: it links ChIP-seq peaks to distal Hi-C loop anchors and returns a ranked output table.

Input Requirements

All input files should be placed in the same working directory unless full paths are provided.

Hi-C loop files are expected to follow the naming pattern *HiC.xls. These files may include multiple biological replicates and multiple loop-calling resolutions.

ChIP-seq peak files are expected to follow the naming pattern *ChIP.xls. These files are typically generated by MACS2 or equivalent peak-calling software.
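Before running the workflow, it can help to confirm that the expected files are actually present. The sketch below is one way to do that in base R. The helper find_single_file() and the demo file names are hypothetical, and the assumption that a single ChIP-seq file is wanted is inferred from the singular chip_file argument name, not stated by the package.

```r
# Return the single file in `dir` matching `pattern`, or stop with a message.
find_single_file <- function(dir, pattern) {
  hits <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(hits) != 1) {
    stop(sprintf("Expected exactly 1 file matching '%s', found %d",
                 pattern, length(hits)))
  }
  hits
}

# Demo in a temporary directory with hypothetical, empty placeholder files.
demo_dir <- tempfile("chipsp_inputs_")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("rep1_HiC.xls", "rep2_HiC.xls", "TF_ChIP.xls")))

# Several Hi-C loop files are allowed; one ChIP-seq peak file is assumed.
hic_files <- list.files(demo_dir, pattern = "HiC\\.xls$", full.names = TRUE)
chip_file <- find_single_file(demo_dir, "ChIP\\.xls$")

print(basename(hic_files))
print(basename(chip_file))
```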

Core Workflow

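Step 0 is an optional preprocessing step that removes chromosome X and chromosome Y rows from a genomic interval table before downstream analysis, using removeXYChromosomes(). In the example below, hic_df is the merged Hi-C loop table produced in Step 1, so the call can be run after merging, as in the minimal workflow at the end of this section.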

Example:

library(ChIP_SP)

hic_df_filtered <- removeXYChromosomes(hic_df, chr_col = "chr")
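For intuition, the filtering in Step 0 is roughly equivalent to the base R subset below. This is a sketch, not the package's code: the helper drop_sex_chromosomes() and the toy table are hypothetical, and it assumes UCSC-style labels ("chrX", "chrY"), so Ensembl-style labels such as "X" would not be matched.

```r
# Hypothetical interval table; only the "chr" column matters here.
intervals <- data.frame(
  chr   = c("chr1", "chrX", "chr7", "chrY"),
  start = c(100, 200, 300, 400),
  end   = c(150, 250, 350, 450),
  stringsAsFactors = FALSE
)

# Keep rows whose chromosome label is neither chrX nor chrY.
drop_sex_chromosomes <- function(df, chr_col = "chr") {
  df[!(df[[chr_col]] %in% c("chrX", "chrY")), , drop = FALSE]
}

autosomal <- drop_sex_chromosomes(intervals)
print(autosomal)
```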

Step 1 merges Hi-C loop outputs across replicates and resolutions. The mergeHiCLoops() function reads Hi-C loop files and combines them into a single merged data frame.

Example:

library(ChIP_SP)

hic_files <- list.files(
  pattern = "HiC\\.xls$",
  full.names = TRUE
)

hic_df <- mergeHiCLoops(hic_files)

The output is a merged data.frame containing Hi-C loops across all replicates and resolutions.
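Conceptually, merging amounts to reading each loop file and stacking the rows. The base R sketch below illustrates that idea with two small tab-delimited files written to a temporary directory. The file names, the column layout, and the added source_file column are hypothetical, and the sketch does not attempt any deduplication or resolution handling that mergeHiCLoops() may perform.

```r
# Write two small tab-delimited files that stand in for per-replicate loop
# calls. File names and columns are hypothetical.
tmp <- tempfile("hic_demo_")
dir.create(tmp)

rep1 <- data.frame(chr1 = "chr1", start1 = 1000, end1 = 6000,
                   chr2 = "chr1", start2 = 250000, end2 = 255000,
                   fdr = 1e-4)
rep2 <- data.frame(chr1 = c("chr1", "chr2"), start1 = c(1000, 40000),
                   end1 = c(6000, 45000),
                   chr2 = c("chr1", "chr2"), start2 = c(250000, 80000),
                   end2 = c(255000, 85000),
                   fdr = c(1e-6, 1e-3))

write.table(rep1, file.path(tmp, "rep1_HiC.xls"),
            sep = "\t", quote = FALSE, row.names = FALSE)
write.table(rep2, file.path(tmp, "rep2_HiC.xls"),
            sep = "\t", quote = FALSE, row.names = FALSE)

files <- list.files(tmp, pattern = "HiC\\.xls$", full.names = TRUE)

# Read every file, tag each row with its source file, and stack the tables.
tables <- lapply(files, function(f) {
  df <- read.delim(f, stringsAsFactors = FALSE)
  df$source_file <- basename(f)
  df
})
merged <- do.call(rbind, tables)
print(merged)
```

Keeping a source column makes it possible to see later which replicate or resolution supported each loop.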

Step 2 performs ChIP–Hi-C spatial integration and ranking. The chipSPLink() function takes a ChIP-seq peak file together with the merged Hi-C loop table and identifies spatially linked regulatory regions.

Example:

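# list.files() returns every match; this example assumes exactly one
# *ChIP.xls file is present in the working directory.
chip_file <- list.files(
  pattern = "ChIP\\.xls$",
  full.names = TRUE
)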

chipsp_results <- chipSPLink(
  chip_file = chip_file,
  hic_df    = hic_df
)

The output is a ranked data.frame containing ChIP-SP integrated spatial regulatory regions, including genomic coordinates such as chr, start, and end, together with ChIP-seq signal (pileup), Hi-C loop confidence (FDR), normalized scores, and the final ChIP-SP ranking score.

A typical minimal workflow is:

library(ChIP_SP)

hic_files <- list.files(pattern = "HiC\\.xls$", full.names = TRUE)
hic_df <- mergeHiCLoops(hic_files)

hic_df <- removeXYChromosomes(hic_df, chr_col = "chr")

chip_file <- list.files(pattern = "ChIP\\.xls$", full.names = TRUE)

chipsp_results <- chipSPLink(
  chip_file = chip_file,
  hic_df    = hic_df
)

Downstream Analyses

Downstream annotation and enrichment analyses are not part of the package API. Reference scripts are provided in inst/scripts/.

These include scripts such as Step0_ChIP_SP_Remove_XY_Chromosomes.R, Step3_ChIP_SP_Gene_annotation_ChIPpeakAnno_UCSC.R, Step4_ChIP_SP_Gene_annotation_Pathwayanalysis_UCSC.R, Step5_ChIP_SP_Gene_annotation_KEGG_ChEA_UCSC.R, and Step6_ClusterGVis_ChIP_SP_Loop.R.

These scripts can be used as templates for ChIPpeakAnno-based gene annotation, pathway enrichment analysis, KEGG and ChEA enrichment, UCSC Genome Browser preparation, and ClusterGVis-based visualization.
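As a small illustration of preparing results for the UCSC Genome Browser, the sketch below writes the top-ranked regions of a results table to a BED file using base R. The toy table, the score column name, and the region names are hypothetical. Coordinates are written unchanged, which assumes they already follow BED's 0-based, half-open convention. The reference scripts in inst/scripts/ remain the authoritative templates.

```r
# Hypothetical results table with the chr/start/end columns described above
# and an assumed "score" column holding the final ranking score.
results <- data.frame(
  chr   = c("chr1", "chr2", "chr5"),
  start = c(250000, 40000, 1200000),
  end   = c(255000, 45000, 1205000),
  score = c(1.8, 0.4, 1.1),
  stringsAsFactors = FALSE
)

# Take the two highest-scoring regions.
top <- head(results[order(results$score, decreasing = TRUE), ], 2)

# BED4 layout: chrom, chromStart, chromEnd, name. Integer coordinates avoid
# scientific notation in the output file.
bed <- data.frame(
  chrom      = top$chr,
  chromStart = as.integer(top$start),
  chromEnd   = as.integer(top$end),
  name       = paste0("chipsp_rank", seq_len(nrow(top))),
  stringsAsFactors = FALSE
)

bed_path <- tempfile(fileext = ".bed")
write.table(bed, bed_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

print(readLines(bed_path))
```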

Package Scope and Design Philosophy

The package includes the core ChIP-SP spatial integration algorithm, Hi-C loop merging, optional sex chromosome removal, and reproducible, modular workflows.

The package excludes genome-specific annotation choices, pathway and transcription factor enrichment APIs, visualization-specific dependencies, and web-based downstream tools.

This modular design supports reproducibility, reviewer transparency, and flexibility across species, annotations, and downstream analysis pipelines.

Notes

The package currently focuses on the core spatial integration workflow. Annotation and interpretation are intentionally left outside the package API. Users can customize downstream analyses according to species, genome build, and study design.

Citation

If you use ChIP_SP in your research, please cite the associated manuscript describing the ChIP-SP method. Citation details will be added once available.

Contact

For questions, bug reports, or feature requests, please open an issue on the GitHub repository or contact tianyi.zhou@childrens.harvard.edu.