Skip to content

Asian-Pan-Genome/Centromere

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

139 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Centromere and Satellite Annotation

This repository hosts a comprehensive data and workflow resource for Centromere and Satellite Annotation in the APGp1 genome assembly.


1. Overview

This project provides precise characterization of centromeric regions, including:

  • Complete Centromere Coordinates
  • Detailed Satellite and Higher-Order Repeat (HOR) Annotation Tracks
  • Robust Pipelines for HOR identification and Satellite Annotation.

2. Data Access

The Annotation directory contains the final, high-quality annotation data.

3. Recommended Workflows

We provide and highly recommend the following two pipelines for future T2T human assemblies and related satellite studies.

SatelliteAnnotationWorkflow

Directory: SatelliteAnnotationWorkflow

This workflow was developed for the comprehensive annotation of centromeric satellites (including $\alpha$, $\beta$, $\gamma$, HSat1, HSat2, and HSat3) in human assemblies. The pipeline outputs precise centromeric coordinates for each genome and provides scripts for result visualization.

HORmining Pipeline

Directory: HORmining

HORmining is a bioinformatics pipeline designed for robust identification of Higher-Order Repeat (HOR) structures within alpha satellite DNA. It integrates two complementary computational approaches: the graph-based HORmon algorithm and the hierarchical tandem repeat mining (HTRM)-based HiCAT algorithm.


4. Integrated Analysis (Code Archive)

The Analysis directory contains the source code used for the final, integrated analysis of global centromere diversity, structural characterization, and evolutionary comparisons.

About

Centromere diversity

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors