RNA-seq Analysis of Breast Cancer Subtypes
Project Overview
This repository contains the analysis code and results for a bulk RNA-seq study on breast cancer subtypes (HER2, TNBC, Non-TNBC) and healthy tissue samples.
The project aims to identify differentially expressed genes (DEGs) and explore enriched biological processes using functional annotation.
Pipeline Summary
The analysis pipeline includes the following steps:
- Quality Control: Performed with FastQC and aggregated using MultiQC.
- Read Alignment: Reads were aligned to the human reference genome (R64-1-1.112) using HISAT2.
- Gene Counting: Gene-level read counts were obtained using featureCounts.
- Differential Expression Analysis: Conducted with DESeq2.
- Functional Enrichment: GO enrichment analysis was performed on DEGs.
- Visualization: Results were visualized using various plots, including dot plots for enriched GO terms.
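The command-line steps above (quality control, alignment, gene counting) can be sketched for a single sample as follows. This is a minimal illustration, not the repository's actual scripts: the sample name, all file paths, the thread count, and the assumption of single-end reads (`-U`) are hypothetical placeholders. By default the script only prints the commands; set `DRY_RUN=0` to execute them. The downstream R steps (DESeq2, GO enrichment, plotting) are not shown.

```shell
#!/usr/bin/env bash
# Sketch of the command-line part of the pipeline for one sample.
# All names and paths below are hypothetical placeholders, not files from this repository.
set -euo pipefail

SAMPLE="sample1"                   # hypothetical sample name
FASTQ="data/${SAMPLE}.fastq.gz"    # hypothetical single-end FASTQ file (assumption)
INDEX="reference/genome_index"     # hypothetical HISAT2 index prefix
GTF="reference/annotation.gtf"     # hypothetical gene annotation file
THREADS=4

# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

pipeline() {
    run mkdir -p qc aligned counts
    # Quality control per FASTQ file, then one aggregated MultiQC report.
    run fastqc -o qc "$FASTQ"
    run multiqc qc -o qc
    # Align reads against the HISAT2 index and write a SAM file.
    run hisat2 -p "$THREADS" -x "$INDEX" -U "$FASTQ" -S "aligned/${SAMPLE}.sam"
    # Count reads per gene; featureCounts accepts SAM or BAM input.
    run featureCounts -T "$THREADS" -a "$GTF" -o counts/gene_counts.txt "aligned/${SAMPLE}.sam"
}

pipeline
```

For paired-end data, the HISAT2 call would take `-1` and `-2` instead of `-U`, and featureCounts would need its paired-end options.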
Repository Structure
├── RNASeq_V2/ # Analysis scripts for different pipeline steps
├── report.pdf # Full project report
└── README.md # Project documentation
Usage Instructions
Prerequisites
To run the analysis, you need the following software and R packages:
- R (DESeq2, ggplot2, clusterProfiler, EnhancedVolcano)
- HISAT2
- featureCounts
- FastQC and MultiQC
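A quick way to confirm that the command-line tools listed above are installed is to look them up on the `PATH`. The helper below is a small sketch; `check_tools` is a hypothetical function name, and the executable names (`R`, `hisat2`, `featureCounts`, `fastqc`, `multiqc`) are the usual ones but may differ on some installations. It does not check the R packages.

```shell
#!/usr/bin/env bash
# Report which of the given command-line tools can be found on the PATH.
# Returns 0 if all tools are found and 1 if at least one is missing.
check_tools() {
    local tool status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

check_tools R hisat2 featureCounts fastqc multiqc \
    || echo "Install the missing tools before running the analysis."
```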
Running the Analysis
Clone the Repository
git clone git@github.com:azertyang/RNAcours.git
cd RNAcours
Execute Analysis
Run the scripts in RNASeq_V2/ one after another, in the order of the pipeline steps listed above, to process the data and perform the analysis.
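One possible way to run the scripts in sequence is a small loop that stops at the first failure. This sketch rests on assumptions that the repository does not confirm: that the scripts are `.sh` or `.R` files located directly in the given directory, and that their file names sort in the intended execution order (for example with numeric prefixes). `run_in_order` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Run every shell or R script in a directory in lexical order,
# stopping at the first script that fails.
# Assumption: file names sort in the intended execution order.
run_in_order() {
    local dir="$1" script
    for script in "$dir"/*; do
        case "$script" in
            *.sh) echo "Running $script"; bash "$script" || return 1 ;;
            *.R)  echo "Running $script"; Rscript "$script" || return 1 ;;
            *)    ;;  # skip anything that is not a shell or R script
        esac
    done
}

# Example call, assuming the scripts live directly in RNASeq_V2/:
# run_in_order RNASeq_V2
```

If the scripts depend on each other's outputs in a different order, calling them explicitly one by one is the safer option.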