Timeseries Molecular Analysis Repository

This repository contains multi-omic time-series analyses of three coral species (Acropora pulchra, Porites evermanni, and Pocillopora tuahiniensis). The analyses examine molecular responses and regulatory networks across developmental/stress timepoints.

Repository Organization

The repository is organized by species with shared multi-species analyses:

timeseries_molecular/
├── D-Apul/           # Acropora pulchra analyses
│   ├── code/         # Analysis scripts (R Markdown)
│   ├── data/         # Raw and processed data
│   ├── output/       # Analysis outputs
│   └── README.md     # Species-specific documentation
├── E-Peve/           # Porites evermanni analyses  
│   ├── code/         # Analysis scripts
│   ├── data/         # Raw and processed data
│   ├── output/       # Analysis outputs
│   └── README.md     # Species-specific documentation
├── F-Ptua/           # Pocillopora tuahiniensis analyses
│   ├── code/         # Analysis scripts
│   ├── data/         # Raw and processed data  
│   ├── output/       # Analysis outputs
│   └── README.md     # Species-specific documentation
├── M-multi-species/  # Cross-species comparative analyses
│   ├── data/         # Shared metadata and multi-species datasets
│   ├── scripts/      # Multi-species analysis scripts
│   └── output/       # Comparative analysis outputs
└── README.md         # This file

File Naming Convention and Interaction Guidelines

Code File Naming

All analysis scripts follow this standardized format:

XX.YY-<species_prefix>-<analysis_description>.Rmd

Where:

XX.YY: Hierarchical numbering (XX = major analysis type, YY = sub-analysis)
<species_prefix>: Species designation (D-Apul, E-Peve, F-Ptua)
<analysis_description>: Brief description of analysis functionality
.Rmd: R Markdown format (other formats, such as Quarto .qmd, are also accepted)

Examples:

00.00-D-Apul-RNAseq-reads-FastQC-MultiQC.Rmd - Initial RNA-seq quality control
01.00-D-Apul-RNAseq-trimming-fastp-FastQC-MultiQC.Rmd - Read trimming and QC
22.6-D-Apul-multiomic-machine-learning-updatedWGBS.Rmd - Advanced multi-omic ML (note the single-digit sub-analysis number, a departure from the XX.YY pattern)
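
The naming pattern can be checked mechanically. The Python sketch below is illustrative only (the repository's own scripts are R Markdown). The names SCRIPT_NAME, ScriptName, and parse_script_name are hypothetical, and accepting a one-digit sub-analysis number is an assumption made so that names like 22.6-... still parse.

```python
import re
from typing import NamedTuple, Optional

# Pattern for names such as
# "01.00-D-Apul-RNAseq-trimming-fastp-FastQC-MultiQC.Rmd".
# Assumption: the sub-analysis number may be one or two digits, and only
# the .Rmd and .qmd extensions are recognized.
SCRIPT_NAME = re.compile(
    r"^(?P<major>\d{2})\.(?P<minor>\d{1,2})"
    r"-(?P<species>D-Apul|E-Peve|F-Ptua)"
    r"-(?P<description>.+)"
    r"\.(?P<ext>Rmd|qmd)$"
)


class ScriptName(NamedTuple):
    major: str        # XX, major analysis type (kept as text, e.g. "01")
    minor: str        # YY, sub-analysis (kept as text, e.g. "00" or "6")
    species: str      # D-Apul, E-Peve, or F-Ptua
    description: str  # free-text analysis description
    ext: str          # "Rmd" or "qmd"


def parse_script_name(filename: str) -> Optional[ScriptName]:
    """Split a script filename into its parts, or return None if it does not match."""
    match = SCRIPT_NAME.match(filename)
    if match is None:
        return None
    return ScriptName(
        match["major"],
        match["minor"],
        match["species"],
        match["description"],
        match["ext"],
    )
```

The numbers are kept as text rather than converted to integers, because the README does not say whether a prefix like 22.6 means sub-analysis 6 or 60.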

Output Organization

Each script writes its outputs to the corresponding ../output/ directory (relative to the species code/ directory)
Each output subdirectory is named after its script, minus the file extension
Intermediate files, plots, and processed data are all stored in these output directories
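
The mapping from a script to its output directory can be sketched in a few lines of Python. This is an illustration of the convention described above, not code from the repository; output_dir_for is a hypothetical helper, and it assumes scripts live in a code/ directory that sits next to output/.

```python
from pathlib import Path


def output_dir_for(script_path: str) -> Path:
    """Return the output directory implied by a script's location and name.

    Assumes the layout <species>/code/<script> and <species>/output/<script stem>.
    """
    script = Path(script_path)
    species_dir = script.parent.parent  # step up from code/ to the species directory
    return species_dir / "output" / script.stem  # stem drops only the final extension


# Example: prints D-Apul/output/01.00-D-Apul-RNAseq-trimming-fastp-FastQC-MultiQC
print(output_dir_for("D-Apul/code/01.00-D-Apul-RNAseq-trimming-fastp-FastQC-MultiQC.Rmd"))
```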

How Users Should Interact With Files

Start with the README in each species directory for detailed workflows
Follow the numerical order of scripts within each species for a logical analysis progression
Check dependencies: later-numbered scripts often depend on outputs from earlier ones
Use the metadata files in M-multi-species/data/ for sample information and experimental design
Contribute new analyses by following the naming conventions and documenting them in the appropriate READMEs
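
To see the intended run order for a species, the scripts can be listed sorted by their numeric prefix. The sketch below is a hypothetical Python helper, not part of the repository. It assumes scripts end in .Rmd or .qmd and reads a prefix such as 22.6 as major 22, sub-analysis 6, which is an interpretation rather than a documented rule.

```python
import re
from pathlib import Path

# Leading "XX.YY-" prefix; digit counts are left open to tolerate names like "22.6-...".
PREFIX = re.compile(r"^(\d+)\.(\d+)-")


def sort_key(name: str):
    """Sort by (major, sub-analysis) numerically; unnumbered names go last."""
    match = PREFIX.match(name)
    if match is None:
        return (float("inf"), float("inf"), name)
    return (int(match.group(1)), int(match.group(2)), name)


def scripts_in_order(code_dir: str) -> list:
    """Return the script filenames in code_dir in numerical order."""
    names = [
        path.name
        for path in Path(code_dir).iterdir()
        if path.suffix in {".Rmd", ".qmd"}
    ]
    return sorted(names, key=sort_key)


if __name__ == "__main__":
    # Example call; the path follows the layout shown in Repository Organization.
    if Path("D-Apul/code").is_dir():
        for name in scripts_in_order("D-Apul/code"):
            print(name)
```

A numeric key is used instead of plain alphabetical sorting so that prefixes without zero padding still sort by value.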

Biological Hypotheses and Research Goals

This research addresses several key biological hypotheses about coral molecular responses:

Primary Hypotheses

Temporal Molecular Dynamics: Coral gene expression, miRNA regulation, and DNA methylation patterns change systematically across developmental/stress timepoints
Multi-omic Integration: Transcriptomic, epigenomic, and metabolomic data are interconnected and can predict physiological responses
Regulatory Networks: miRNA-mRNA interactions and DNA methylation create complex regulatory networks that control coral responses
Species-Specific Responses: Different coral species exhibit distinct molecular strategies for responding to environmental conditions
Predictive Modeling: Machine learning approaches can identify molecular signatures predictive of coral physiological state

Research Components

Experimental Design

Species: Three coral species with different life strategies
- Acropora pulchra (fast-growing, branching)
- Porites evermanni (slow-growing, massive)
- Pocillopora tuahiniensis (intermediate growth, branching)
Timepoints: Four sampling timepoints (TP1-TP4) across the experimental period
Colonies: Multiple colonies per species for biological replication
Data Types: RNA-seq, sRNA-seq/miRNA, WGBS (methylation), metabolomics, lipidomics, physiological measurements

Analysis Approaches

Quality Control & Processing: Read QC, trimming, alignment, quantification
Differential Expression: Time-series gene and miRNA expression analysis
Target Prediction: miRNA-mRNA interaction prediction and validation
Co-expression Networks: WGCNA and correlation network analysis
Epigenetic Analysis: DNA methylation patterns and CpG island annotation
Machine Learning: Predictive modeling of phenotype from molecular data
Multi-omic Integration: Cross-platform correlation and network analysis
Functional Annotation: GO enrichment and pathway analysis

Schematic Overview

┌─────────────────────────────────────────────────────────────────┐
│                    TIMESERIES MOLECULAR PIPELINE                │
└─────────────────────────────────────────────────────────────────┘

                           ┌─────────────┐
                           │   SAMPLES   │
                           │   TP1-TP4   │
                           │ 3 Species   │
                           │Multi-Colony │
                           └─────┬───────┘
                                 │
                   ┌─────────────┼────────────┐
                   │             │            │
              ┌────▼───┐     ┌───▼───┐    ┌───▼────┐
              │D-Apul  │     │E-Peve │    │F-Ptua  │
              │A.pulchr│     │P.everm│    │P.tuahin│
              └────┬───┘     └───┬───┘    └───┬────┘
                   │             │            │
           ┌───────┼─────────────┼────────────┼───────┐
           │       │             │            │       │
      ┌────▼──┐ ┌──▼──┐    ┌────▼──┐    ┌───▼───┐ ┌──▼──┐
      │RNA-seq│ │sRNA │    │ WGBS  │    │Metabol│ │Lipid│
      │       │ │miRNA│    │Methyl │    │ omics │ │omics│
      └───┬───┘ └──┬──┘    └───┬───┘    └───┬───┘ └──┬──┘
          │        │           │            │        │
          └────────┼───────────┼────────────┼────────┘
                   │           │            │
               ┌───▼───────────▼────────────▼──┐
               │     INTEGRATED ANALYSES       │
               │                               │
               │ ○ Differential Expression     │
               │ ○ Co-expression Networks      │
               │ ○ miRNA Target Prediction     │
               │ ○ Machine Learning Models     │
               │ ○ Multi-omic Correlation      │
               │ ○ Functional Annotation       │
               └───────────────┬───────────────┘
                               │
                    ┌──────────▼───────────┐
                    │   BIOLOGICAL         │
                    │   INSIGHTS           │
                    │                      │
                    │ • Regulatory Networks│
                    │ • Temporal Dynamics  │
                    │ • Species Differences│
                    │ • Predictive Models  │
                    └──────────────────────┘

Contributors and Their Contributions

Core Contributors

Sam White - Lead Data Engineer

RNA-seq data processing and quality control
Genome indexing and read alignment (HISAT2)
WGBS data processing and bisulfite genome preparation
Established computational pipelines and file organization standards

Kathleen Durkin - miRNA and Regulatory Networks Specialist

sRNA-seq/miRNA discovery and expression analysis
miRNA target prediction (miRanda, RNAhybrid)
miRNA-mRNA co-expression and correlation networks
Machine learning approaches for multi-omic integration
Gene set enrichment and functional annotation

Steven Roberts - Epigenomics Lead

WGBS methylation analysis and Bismark processing
CpG island annotation and methylation pattern analysis
SNP calling from bisulfite sequencing data
Multi-omic machine learning model development
Protein annotation and comparative analyses

Ariana S. Huffmyer - Metabolomics and Physiological Integration

Metabolomics data processing and analysis
Lipidomics data analysis and integration
Physiological data integration and PC analysis
Cross-platform correlation analysis
Multi-species comparative approaches

Jill Ashey - Network Analysis Specialist

WGCNA analysis for metabolomics data
Co-expression network construction and analysis

R. Cunning - Multi-omic Integration

Cross-platform data integration approaches
Metabolomics-lipidomics-genomics correlation analysis

Analysis Categories by Contributor

Analysis Type           Primary Contributors             Key Scripts
Data QC & Processing    Sam White                        00.XX, 01.XX, 02.XX series
Gene Expression         Sam White, Kathleen Durkin       03.XX series
miRNA Analysis          Kathleen Durkin                  04.XX, 06.XX, 07.XX series
Co-expression Networks  Kathleen Durkin, Jill Ashey      12.XX, 16.XX, 18.XX series
Methylation Analysis    Steven Roberts                   14.XX, 15.XX, 19.XX, 26.XX, 28.XX series
Machine Learning        Kathleen Durkin, Steven Roberts  20.XX, 22.XX, 29.XX series
Metabolomics            Ariana Huffmyer, R. Cunning      M-multi-species scripts
Functional Annotation   Kathleen Durkin, Steven Roberts  21.XX, 23.XX, 25.XX, 27.XX series
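
The series numbers in this table can double as a quick lookup from a script name to its analysis category. The Python sketch below transcribes the numbered rows of the table; the dictionary and the category_for helper are hypothetical names. Series not listed in the table, and the unnumbered M-multi-species scripts, return None.

```python
import re

# Series prefixes per category, transcribed from the table above.
SERIES_BY_CATEGORY = {
    "Data QC & Processing": ["00", "01", "02"],
    "Gene Expression": ["03"],
    "miRNA Analysis": ["04", "06", "07"],
    "Co-expression Networks": ["12", "16", "18"],
    "Methylation Analysis": ["14", "15", "19", "26", "28"],
    "Machine Learning": ["20", "22", "29"],
    "Functional Annotation": ["21", "23", "25", "27"],
}

# Inverted view: two-digit series prefix -> category name.
CATEGORY_BY_SERIES = {
    series: category
    for category, series_list in SERIES_BY_CATEGORY.items()
    for series in series_list
}


def category_for(script_name: str):
    """Return the analysis category for a script name, or None if not listed."""
    match = re.match(r"^(\d{2})\.", script_name)
    if match is None:
        return None
    return CATEGORY_BY_SERIES.get(match.group(1))


# Prints "Machine Learning"
print(category_for("22.6-D-Apul-multiomic-machine-learning-updatedWGBS.Rmd"))
```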

Key Datasets and Metadata

RNA Metadata: M-multi-species/data/rna_metadata.csv
Physiological Data: E5 Timeseries Repository
Sample Information: Detailed colony IDs, timepoints, and experimental design in metadata files
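
Before running any analysis it can help to inspect a metadata file's columns and row count. The Python sketch below uses only the standard library and makes no assumption about column names, since this README does not list them. describe_metadata is a hypothetical helper.

```python
import csv


def describe_metadata(csv_path: str):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)  # reading the rows also loads the header
        columns = list(reader.fieldnames or [])
    return columns, len(rows)
```

For example, describe_metadata("M-multi-species/data/rna_metadata.csv"), run from the repository root, returns the header names and the number of data rows in the RNA metadata file.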

Getting Started

Clone the repository
Review species-specific READMEs for detailed analysis workflows
Check metadata files to understand experimental design
Follow numerical script order within each species directory
Ensure computational dependencies are installed (R, Bioconductor packages, alignment tools)
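
A quick way to check the last step is to look for the command-line tools on the PATH. The Python sketch below is a hypothetical helper. The executable names are assumptions based on the tools named in this README (R, FastQC, MultiQC, fastp, HISAT2, Bismark) and may differ on a given system. It does not check R or Bioconductor packages.

```python
import shutil

# Assumed executable names for tools mentioned in this README.
TOOLS = ["Rscript", "fastqc", "multiqc", "fastp", "hisat2", "bismark"]


def missing_tools(tools=TOOLS):
    """Return the subset of tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```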

For questions or contributions, please refer to individual script authors or open an issue in the repository.
