# Deep ECG ASD

## Project Overview

This project investigates whether self-supervised ECG embeddings (JEPA) provide better signal representations than traditional physiological features for predicting:

- Autism-related severity (regression)
- Diagnostic / behavioral classification (classification)
- Autonomic dysregulation markers

## Project Structure

    Deep-Ecg-Asd/
    ├── data/
    │   ├── raw/          # (ignored) original ECG + clinical files
    │   ├── metadata/     # participant_ids + master_labels.csv
    │   └── embeddings/   # JEPA outputs
    ├── scripts/
    │   ├── build_master_labels.py
    │   ├── test_jepa_load.py
    │   └── run_extract_embeddings.py
    ├── configs/
    │   └── paths.yaml
    ├── src/
    │   └── deep_ecg_asd/
    └── ECG_JEPA/         # pretrained model repo

## Pipeline

- Build labels: `python scripts/build_master_labels.py`
  - Creates: `data/metadata/master_labels.csv`
  - Includes: `participant_id`, `severity_score` (ADOS CSS), `class_label` (thresholded)
- Load pretrained JEPA encoder: `python scripts/test_jepa_load.py`
  - Verifies: checkpoint loads, encoder initializes
- Extract embeddings (next step): `python scripts/run_extract_embeddings.py`
  - Output: `data/embeddings/`

## Modeling Plan

We compare three approaches:

### Model 1: Embeddings

- Input: JEPA embeddings
- Goal: learn latent representations of ECG

### Model 2: Physiological Features

- HRV
- Skewness
- Entropy

### Model 3: Hybrid

- Concatenate embeddings + features

## Targets

- Regression: `severity_score` (ADOS CSS)
- Classification: `class_label` (currently threshold-based, may switch to group labels)

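As a rough illustration of the Model 2 and Model 3 inputs from the modeling plan, the sketch below computes a few physiological features and appends them to an embedding vector. It is not the project's implementation. It assumes the features are computed from RR intervals in milliseconds, picks SDNN and RMSSD as the HRV measures, and uses a histogram-based Shannon entropy. All function names and the `n_bins` parameter are hypothetical.

```python
import math
from statistics import mean, pstdev


def hrv_features(rr_ms):
    """Two common time-domain HRV measures (assumed choice: SDNN and RMSSD)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    sdnn = pstdev(rr_ms)                              # population std of RR intervals
    rmssd = math.sqrt(mean([d * d for d in diffs]))   # RMS of successive differences
    return [sdnn, rmssd]


def skewness(values):
    """Population skewness: mean of cubed z-scores (0.0 for a constant series)."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return 0.0
    return mean([((v - m) / s) ** 3 for v in values])


def shannon_entropy(values, n_bins=8):
    """Shannon entropy (bits) of an equal-width histogram of the values."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so the maximum value falls in the last bin.
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def physiological_features(rr_ms):
    """Model 2 input (sketch): [SDNN, RMSSD, skewness, entropy]."""
    return hrv_features(rr_ms) + [skewness(rr_ms), shannon_entropy(rr_ms)]


def hybrid_vector(embedding, rr_ms):
    """Model 3 input (sketch): embedding followed by physiological features."""
    return list(embedding) + physiological_features(rr_ms)


if __name__ == "__main__":
    rr = [800, 810, 790, 805, 795, 820, 780, 800]   # made-up RR intervals (ms)
    emb = [0.1, -0.3, 0.7]                          # stand-in for a JEPA embedding
    print(hybrid_vector(emb, rr))
```

Plain concatenation puts the two feature groups on different scales, so any real comparison would likely standardize the columns before fitting a model.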
## Notes

- Raw ECG data is not tracked in Git
- External / vendor code is excluded
- Paths are configured in `configs/paths.yaml`
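For the last note, a script might read `configs/paths.yaml` roughly as sketched below. The contents of that file are not shown here, so the keys (`raw_dir`, `metadata_dir`, `embeddings_dir`) and both function names are hypothetical. The sketch also assumes the file is a flat mapping of names to paths relative to the repository root. `yaml.safe_load` is from PyYAML, a third-party package.

```python
from pathlib import Path


def resolve_paths(raw, project_root):
    """Map {name: relative path} to {name: Path under project_root}."""
    root = Path(project_root)
    return {name: root / rel for name, rel in raw.items()}


def load_paths(config_file="configs/paths.yaml"):
    """Read the YAML config and resolve its entries against the repo root."""
    import yaml  # PyYAML; imported here so resolve_paths works without it

    config_file = Path(config_file).resolve()
    with config_file.open() as fh:
        raw = yaml.safe_load(fh) or {}
    # Assumes the file sits at <repo>/configs/paths.yaml: root is two levels up.
    return resolve_paths(raw, config_file.parent.parent)


if __name__ == "__main__":
    # Hypothetical config contents, mirroring the directory tree in this README.
    example = {
        "raw_dir": "data/raw",
        "metadata_dir": "data/metadata",
        "embeddings_dir": "data/embeddings",
    }
    for name, path in resolve_paths(example, "Deep-Ecg-Asd").items():
        print(name, path)
```

Resolving every entry against the repository root in one place means the scripts do not depend on the directory they are launched from.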