This repository contains comprehensive multi-omic time series analyses of three coral species examining molecular responses and regulatory networks across developmental/stress timepoints.
The repository is organized by species with shared multi-species analyses:
timeseries_molecular/
├── D-Apul/              # Acropora pulchra analyses
│   ├── code/            # Analysis scripts (R Markdown)
│   ├── data/            # Raw and processed data
│   ├── output/          # Analysis outputs
│   └── README.md        # Species-specific documentation
├── E-Peve/              # Porites evermanni analyses
│   ├── code/            # Analysis scripts
│   ├── data/            # Raw and processed data
│   ├── output/          # Analysis outputs
│   └── README.md        # Species-specific documentation
├── F-Ptua/              # Pocillopora tuahiniensis analyses
│   ├── code/            # Analysis scripts
│   ├── data/            # Raw and processed data
│   ├── output/          # Analysis outputs
│   └── README.md        # Species-specific documentation
├── M-multi-species/     # Cross-species comparative analyses
│   ├── data/            # Shared metadata and multi-species datasets
│   ├── scripts/         # Multi-species analysis scripts
│   └── output/          # Comparative analysis outputs
└── README.md            # This file
All analysis scripts follow this standardized format:
XX.YY-<species_prefix>-<analysis_description>.Rmd
Where:
- XX.YY: Hierarchical numbering (XX = major analysis type, YY = sub-analysis)
- <species_prefix>: Species designation (D-Apul, E-Peve, F-Ptua)
- <analysis_description>: Brief description of analysis functionality
- .Rmd: R Markdown format (other formats like .qmd accepted)
Examples:
- 00.00-D-Apul-RNAseq-reads-FastQC-MultiQC.Rmd - Initial RNA-seq quality control
- 01.00-D-Apul-RNAseq-trimming-fastp-FastQC-MultiQC.Rmd - Read trimming and QC
- 22.6-D-Apul-multiomic-machine-learning-updatedWGBS.Rmd - Advanced multi-omic ML
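
The naming convention above can be checked mechanically. The following is a minimal sketch in Python (the repository's own scripts are R Markdown; this helper is illustrative and not part of the repository). It assumes the sub-analysis number may have one or two digits, since one of the example names uses `22.6`, and that `.Rmd` and `.qmd` are the only accepted extensions. `parse_script_name` and `ScriptName` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Pattern for "XX.YY-<species_prefix>-<analysis_description>.<ext>".
# Assumption: the sub-analysis number has one or two digits, because the
# README's own example "22.6-D-Apul-..." uses a single digit.
SCRIPT_NAME = re.compile(
    r"^(?P<major>\d{2})\.(?P<minor>\d{1,2})"
    r"-(?P<species>D-Apul|E-Peve|F-Ptua)"
    r"-(?P<description>.+)"
    r"\.(?P<ext>Rmd|qmd)$"
)


class ScriptName(NamedTuple):
    major: int
    minor: int
    species: str
    description: str
    ext: str


def parse_script_name(filename: str) -> Optional[ScriptName]:
    """Return the parts of a script name, or None if it breaks the convention."""
    m = SCRIPT_NAME.match(filename)
    if m is None:
        return None
    return ScriptName(
        int(m["major"]), int(m["minor"]), m["species"], m["description"], m["ext"]
    )
```

A name that returns `None` is a candidate for renaming before it is committed.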
- Each code file generates outputs in corresponding ../output/ directories
- Output directories match script names (minus file extension)
- Intermediate files, plots, and processed data stored in output directories
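
The script-to-output mapping described above can be sketched as a small path helper. This is an illustrative Python snippet, not code from the repository; `output_dir_for` is a hypothetical name, and it assumes each script sits in a species-level `code/` directory next to `output/`.

```python
from pathlib import Path


def output_dir_for(script_path: str) -> Path:
    """Map <species>/code/<name>.Rmd to <species>/output/<name>."""
    script = Path(script_path)
    # script.parent is the code/ directory; output/ is assumed to be its sibling.
    # Path.stem drops only the final extension, so "22.6-..." keeps its dot.
    return script.parent.parent / "output" / script.stem
```

For example, `D-Apul/code/00.00-D-Apul-RNAseq-reads-FastQC-MultiQC.Rmd` maps to `D-Apul/output/00.00-D-Apul-RNAseq-reads-FastQC-MultiQC`.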
- Start with species-specific READMEs in each species directory for detailed workflows
- Follow numerical order of scripts within each species for logical analysis progression
- Check dependencies - later numbered scripts often depend on outputs from earlier ones
- Use metadata files in
M-multi-species/data/for sample information and experimental design - Contribute new analyses by following naming conventions and documenting in appropriate READMEs
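
Following scripts in numerical order is easiest with a numeric sort, because a plain alphabetical sort compares the prefixes as text. A minimal Python sketch, under the assumption that the sub-analysis number is compared as an integer (so `22.6` sorts before `22.10`); whether that matches the authors' intent for single-digit sub-numbers is not stated in this README. `run_order_key` and `in_run_order` are hypothetical names.

```python
import re

# Leading "XX.YY-" prefix of a script name.
_PREFIX = re.compile(r"^(\d+)\.(\d+)-")


def run_order_key(filename: str):
    """Sort key: (major, minor) as integers; unnumbered files go last."""
    m = _PREFIX.match(filename)
    if m is None:
        return (float("inf"), float("inf"), filename)
    return (int(m.group(1)), int(m.group(2)), filename)


def in_run_order(filenames):
    """Return the file names sorted by their numeric prefix."""
    return sorted(filenames, key=run_order_key)
```

Passing the contents of a species `code/` directory to `in_run_order` gives a reading order in which earlier scripts, whose outputs later ones may need, come first.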
This research addresses several key biological hypotheses about coral molecular responses:
- Temporal Molecular Dynamics: Coral gene expression, miRNA regulation, and DNA methylation patterns change systematically across developmental/stress timepoints
- Multi-omic Integration: Transcriptomic, epigenomic, and metabolomic data are interconnected and can predict physiological responses
- Regulatory Networks: miRNA-mRNA interactions and DNA methylation create complex regulatory networks that control coral responses
- Species-Specific Responses: Different coral species exhibit distinct molecular strategies for responding to environmental conditions
- Predictive Modeling: Machine learning approaches can identify molecular signatures predictive of coral physiological state
Experimental design:
- Species: Three coral species with different life strategies
  - Acropora pulchra (fast-growing, branching)
  - Porites evermanni (slow-growing, massive)
  - Pocillopora tuahiniensis (intermediate growth, branching)
- Timepoints: Four sampling timepoints (TP1-TP4) across experimental period
- Colonies: Multiple colonies per species for biological replication
- Data Types: RNA-seq, sRNA-seq/miRNA, WGBS (methylation), metabolomics, lipidomics, physiological measurements
Analysis types:
- Quality Control & Processing: Read QC, trimming, alignment, quantification
- Differential Expression: Time-series gene and miRNA expression analysis
- Target Prediction: miRNA-mRNA interaction prediction and validation
- Co-expression Networks: WGCNA and correlation network analysis
- Epigenetic Analysis: DNA methylation patterns and CpG island annotation
- Machine Learning: Predictive modeling of phenotype from molecular data
- Multi-omic Integration: Cross-platform correlation and network analysis
- Functional Annotation: GO enrichment and pathway analysis
┌─────────────────────────────────────────────────────────────────┐
│                  TIMESERIES MOLECULAR PIPELINE                  │
└─────────────────────────────────────────────────────────────────┘
                          ┌─────────────┐
                          │   SAMPLES   │
                          │   TP1-TP4   │
                          │  3 Species  │
                          │Multi-Colony │
                          └──────┬──────┘
                                 │
                    ┌────────────┼────────────┐
                    │            │            │
               ┌────▼────┐  ┌────▼────┐  ┌────▼────┐
               │ D-Apul  │  │ E-Peve  │  │ F-Ptua  │
               │A.pulchra│  │ P.everm │  │P.tuahin │
               └────┬────┘  └────┬────┘  └────┬────┘
                    │            │            │
       ┌────────────┼────────────┼────────────┼────────────┐
       │            │            │            │            │
   ┌───▼───┐    ┌───▼───┐    ┌───▼───┐    ┌───▼───┐    ┌───▼───┐
   │RNA-seq│    │ sRNA  │    │ WGBS  │    │Metabol│    │ Lipid │
   │       │    │ miRNA │    │Methyl │    │ omics │    │ omics │
   └───┬───┘    └───┬───┘    └───┬───┘    └───┬───┘    └───┬───┘
       │            │            │            │            │
       └────────────┼────────────┼────────────┼────────────┘
                    │            │            │
                ┌───▼────────────▼────────────▼───┐
                │       INTEGRATED ANALYSES       │
                │                                 │
                │   ○ Differential Expression     │
                │   ○ Co-expression Networks      │
                │   ○ miRNA Target Prediction     │
                │   ○ Machine Learning Models     │
                │   ○ Multi-omic Correlation      │
                │   ○ Functional Annotation       │
                └────────────────┬────────────────┘
                                 │
                     ┌───────────▼───────────┐
                     │      BIOLOGICAL       │
                     │       INSIGHTS        │
                     │                       │
                     │ • Regulatory Networks │
                     │ • Temporal Dynamics   │
                     │ • Species Differences │
                     │ • Predictive Models   │
                     └───────────────────────┘
Sam White - Lead Data Engineer
- RNA-seq data processing and quality control
- Genome indexing and read alignment (HISAT2)
- WGBS data processing and bisulfite genome preparation
- Established computational pipelines and file organization standards
Kathleen Durkin - miRNA and Regulatory Networks Specialist
- sRNA-seq/miRNA discovery and expression analysis
- miRNA target prediction (miRanda, RNAhybrid)
- miRNA-mRNA co-expression and correlation networks
- Machine learning approaches for multi-omic integration
- Gene set enrichment and functional annotation
Steven Roberts - Epigenomics Lead
- WGBS methylation analysis and Bismark processing
- CpG island annotation and methylation pattern analysis
- SNP calling from bisulfite sequencing data
- Multi-omic machine learning model development
- Protein annotation and comparative analyses
Ariana S. Huffmyer - Metabolomics and Physiological Integration
- Metabolomics data processing and analysis
- Lipidomics data analysis and integration
- Physiological data integration and PC analysis
- Cross-platform correlation analysis
- Multi-species comparative approaches
Jill Ashey - Network Analysis Specialist
- WGCNA analysis for metabolomics data
- Co-expression network construction and analysis
R. Cunning - Multi-omic Integration
- Cross-platform data integration approaches
- Metabolomics-lipidomics-genomics correlation analysis
| Analysis Type | Primary Contributors | Key Scripts |
|---|---|---|
| Data QC & Processing | Sam White | 00.XX, 01.XX, 02.XX series |
| Gene Expression | Sam White, Kathleen Durkin | 03.XX series |
| miRNA Analysis | Kathleen Durkin | 04.XX, 06.XX, 07.XX series |
| Co-expression Networks | Kathleen Durkin, Jill Ashey | 12.XX, 16.XX, 18.XX series |
| Methylation Analysis | Steven Roberts | 14.XX, 15.XX, 19.XX, 26.XX, 28.XX series |
| Machine Learning | Kathleen Durkin, Steven Roberts | 20.XX, 22.XX, 29.XX series |
| Metabolomics | Ariana Huffmyer, R. Cunning | M-multi-species scripts |
| Functional Annotation | Kathleen Durkin, Steven Roberts | 21.XX, 23.XX, 25.XX, 27.XX series |
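
The table above can double as a lookup from a script's major number to its analysis type. The sketch below is illustrative Python that transcribes only the series listed in the table; series the table does not mention (and the M-multi-species scripts) are reported as unlisted rather than guessed. `analysis_type` is a hypothetical helper name.

```python
# Major script number -> analysis type, transcribed from the table above.
SERIES_TO_ANALYSIS = {
    **dict.fromkeys(["00", "01", "02"], "Data QC & Processing"),
    **dict.fromkeys(["03"], "Gene Expression"),
    **dict.fromkeys(["04", "06", "07"], "miRNA Analysis"),
    **dict.fromkeys(["12", "16", "18"], "Co-expression Networks"),
    **dict.fromkeys(["14", "15", "19", "26", "28"], "Methylation Analysis"),
    **dict.fromkeys(["20", "22", "29"], "Machine Learning"),
    **dict.fromkeys(["21", "23", "25", "27"], "Functional Annotation"),
}


def analysis_type(script_name: str) -> str:
    """Return the analysis type for a script name, or 'unlisted'."""
    major = script_name.split(".", 1)[0]  # text before the first dot
    return SERIES_TO_ANALYSIS.get(major, "unlisted")
```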
- RNA Metadata: M-multi-species/data/rna_metadata.csv
- Physiological Data: E5 Timeseries Repository
- Sample Information: Detailed colony IDs, timepoints, and experimental design in metadata files
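
A quick way to get oriented in a metadata file is to count rows per value of one column. The Python sketch below makes no assumption about which columns `rna_metadata.csv` contains: the column name is passed in by the caller, and the function reports the available columns if that name is absent. `count_by_column` is a hypothetical helper, not part of the repository.

```python
import csv
from collections import Counter


def count_by_column(csv_path: str, column: str) -> Counter:
    """Count rows per distinct value of `column` in a CSV file with a header row."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        fields = reader.fieldnames or []
        if column not in fields:
            # The real column names are not documented here, so list them.
            raise KeyError(f"{column!r} not found; available columns: {fields}")
        return Counter(row[column] for row in reader)
```

Calling it with a deliberately wrong column name is a cheap way to list the real headers before choosing one.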
- Clone the repository
- Review species-specific READMEs for detailed analysis workflows
- Check metadata files to understand experimental design
- Follow numerical script order within each species directory
- Ensure computational dependencies are installed (R, Bioconductor packages, alignment tools)
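
As a rough pre-flight check for the last step, the sketch below looks for command-line tools on the `PATH`. The executable names are assumptions based on the tools this README mentions (FastQC, MultiQC, fastp, HISAT2, Bismark) plus `Rscript`; the list is not exhaustive, and it does not check R or Bioconductor packages. `find_missing` is a hypothetical helper.

```python
import shutil

# Assumed executable names for tools mentioned in this README; adjust as needed.
TOOLS = ("Rscript", "fastqc", "multiqc", "fastp", "hisat2", "bismark")


def find_missing(tools=TOOLS):
    """Return the tool names that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing()
    print("Missing tools:", ", ".join(missing) if missing else "none")
```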
For questions or contributions, please refer to individual script authors or open an issue in the repository.