taowis/ImmunoBridgeAI

ImmunoBridgeAI

ImmunoBridgeAI is a compact, reproducible platform showing how to bridge immune repertoires (TCR/BCR) and single-cell profiles to translational, explainable insights.

Ships with tiny demo datasets and fully working code, so no external data downloads are needed.

Features

  • Load AIRR-style TSV repertoires and compute clonal expansion & diversity.
  • Lightweight TCR clustering via edit distance and simple graph communities.
  • Specificity annotation using a bundled mini VDJdb-style file (demo only).
  • Ingest single-cell counts + metadata, run PCA→UMAP with scikit-learn + umap-learn.
  • Streamlit dashboard with Overview, Repertoire, and Report tabs.
  • Tests and GitHub Actions CI that run on every push and pull request.
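
As an illustration of the clonal expansion and diversity feature, here is a minimal sketch using only the standard library. It is not the code in src/immunobridge/repertoire.py. The function names are hypothetical, the CDR3 strings are made up, and it assumes the AIRR columns `junction_aa` and `duplicate_count` are present in the repertoire file.

```python
import math
from collections import Counter


def clone_counts(rows, key="junction_aa", count_field="duplicate_count"):
    """Sum read counts per clonotype; rows are dicts such as csv.DictReader yields."""
    counts = Counter()
    for row in rows:
        # Assumption: a missing or empty count means the row was observed once.
        counts[row[key]] += int(row.get(count_field) or 1)
    return counts


def shannon_entropy(counts):
    """Shannon entropy (natural log) of the clonotype frequency distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def simpson_index(counts):
    """Probability that two reads drawn with replacement share a clonotype."""
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


def clonality(counts):
    """1 - normalized entropy: near 0 for even repertoires, near 1 when one clone dominates."""
    if not counts:
        raise ValueError("empty repertoire")
    if len(counts) == 1:
        return 1.0  # a single clonotype is treated as fully clonal
    return 1.0 - shannon_entropy(counts) / math.log(len(counts))


# Made-up rows for illustration only (not taken from data/airr_demo.tsv).
rows = [
    {"junction_aa": "CASSLGQAYEQYF", "duplicate_count": "6"},
    {"junction_aa": "CASSPGTGGYEQYF", "duplicate_count": "2"},
    {"junction_aa": "CASSLGQAYEQYF", "duplicate_count": "2"},
]
counts = clone_counts(rows)
print(counts.most_common(1), simpson_index(counts), clonality(counts))
```

Rows for the same clonotype are merged before any metric is computed, so the expanded clone here carries 8 of the 10 reads.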

🔍 AI & Machine Learning Components

ImmunoBridgeAI is designed to demonstrate real-world AI applications in computational immunology and translational research.
The sections below list where AI/ML methods are used today or can be plugged in.

1. Deep Generative Models for Single-Cell Data (scVI / scANVI)

  • Files: src/immunobridge/scanvi_hooks.py, dashboard/app.py
  • Technology: scVI (variational autoencoder) and scANVI (semi-supervised annotation model).
  • Purpose: Learn latent representations of single-cell RNA + VDJ data and annotate cell types.
  • Impact: Enables AI-powered immune cell classification in autoimmune diseases such as T1D and IBD.
  • Demo Mode: Falls back to PCA/UMAP if scvi-tools is not installed.

2. Explainable AI for Clinical Interpretation (SHAP)

  • Files: (planned) dashboard/app.py
  • Technology: SHAP (Shapley additive explanations).
  • Purpose: Explain model predictions (e.g., case vs control) by ranking feature contributions.
  • Impact: Adds transparency for clinicians when interpreting immune repertoire-based AI models.
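
To make the idea of ranking feature contributions concrete, the sketch below computes exact Shapley values by enumerating feature coalitions. It does not use the shap library and is not the planned dashboard code. The scoring function, feature names, and numbers are hypothetical, and exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy "case vs control" score over three repertoire features.
def toy_score(z):
    clonality, top_clone_frac, n_clusters = z
    return 2.0 * clonality + 1.0 * top_clone_frac - 0.5 * n_clusters


features = ["clonality", "top_clone_frac", "n_clusters"]
x = [0.6, 0.3, 4.0]          # one made-up subject
baseline = [0.2, 0.1, 2.0]   # made-up reference values
phi = shapley_values(toy_score, x, baseline)

# Rank features by absolute contribution, largest first.
ranking = sorted(range(len(features)), key=lambda i: -abs(phi[i]))
for i in ranking:
    print(f"{features[i]}: {phi[i]:+.2f}")
```

The contributions sum to the difference between the score at `x` and the score at `baseline`, which is the additive property that SHAP explanations rely on.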

3. Translational AI Data Preparation

  • Files: pipelines/airr2anndata.nf, bin/airr2anndata.py
  • Technology: Nextflow-based preprocessing for AI pipelines.
  • Purpose: Standardize multi-modal immune data (AIRR repertoires + single-cell) for ML/DL workflows.
  • Impact: Ensures hospital and research datasets are AI-ready and reproducible across sites.

4. Graph-Based ML Features

  • Files: src/immunobridge/repertoire.py
  • Technology: NetworkX graph construction from clonotype edit-distance.
  • Purpose: Enable graph neural network (GNN) or clustering models on immune repertoires.
  • Impact: Facilitates discovery of clonal expansion patterns linked to disease.
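
The graph construction can be sketched as follows: connect two clonotypes when their CDR3 edit distance is within a threshold, then group them. This is a standard-library illustration, not the NetworkX code in src/immunobridge/repertoire.py. The threshold of 1, the function names, and the CDR3 strings are assumptions, and connected components stand in for whichever community method the package actually uses.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def build_graph(seqs, max_dist=1):
    """Adjacency sets linking sequences within max_dist edits of each other."""
    unique = list(dict.fromkeys(seqs))  # drop duplicates, keep order
    adj = {s: set() for s in unique}
    for a, b in combinations(unique, 2):
        if levenshtein(a, b) <= max_dist:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def connected_components(adj):
    """Group nodes reachable from one another (depth-first search)."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            comp.add(node)
            stack.extend(adj[node] - seen)
        comps.append(comp)
    return comps


# Made-up CDR3 sequences for illustration only.
cdr3s = ["CASSLGQAYEQYF", "CASSLGQAYEQFF", "CASSLGQTYEQYF", "CASSPGTGGYEQYF"]
clusters = connected_components(build_graph(cdr3s, max_dist=1))
print(sorted(len(c) for c in clusters))
```

The all-pairs comparison is quadratic in the number of clonotypes, which is fine for the demo data but would need indexing or prefiltering on real repertoires.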

5. Manifold Learning for Visualization

  • Files: src/immunobridge/io_10x.py
  • Technology: UMAP (nonlinear manifold learning).
  • Purpose: Reduce high-dimensional single-cell data to 2D/3D for exploration.
  • Impact: Helps researchers visually explore patient immune profiles.

Note: All AI components are designed to run on small demo datasets out of the box.
For real-world research, replace with your own immune repertoire and single-cell datasets, enable scvi-tools, and integrate hospital-specific preprocessing pipelines.


📁 Repository Structure

ImmunoBridgeAI/
├── LICENSE                       # MIT license file
├── CITATION.cff                  # Citation metadata for academic reference
├── README.md                     # Project overview, usage, and documentation
├── requirements.txt              # Minimal Python dependencies for quick install
├── pyproject.toml                # ✅ Modern Python packaging (setuptools build)
├── nextflow.config               # Global config for running Nextflow pipeline
│
├── env/                          # Conda environment definitions
│   └── environment.yml           # Full Conda environment with all dependencies
│
├── data/                         # Tiny bundled demo datasets
│   ├── airr_demo.tsv              # Example AIRR-formatted immune repertoire
│   ├── vdjdb_demo.tsv             # Mini VDJdb-like specificity lookup table
│   ├── sc_demo_counts.csv         # Toy single-cell gene expression matrix
│   └── sc_demo_meta.csv           # Metadata (subject, batch) for demo single-cell data
│
├── src/                          # Core Python package source code
│   └── immunobridge/              # Package: immunobridge
│       ├── __init__.py            # Package initializer, defines exports
│       ├── io_airr.py             # Load and process AIRR-formatted repertoires
│       ├── io_10x.py              # Load single-cell 10x-style count & metadata, run PCA/UMAP
│       ├── repertoire.py          # Diversity metrics, clonotype graph building, community detection
│       ├── specificity.py         # Match clonotypes to VDJdb-style specificity tables
│       ├── viz.py                 # Plotly-based visualizations (UMAP scatter, bar charts)
│       └── scanvi_hooks.py        # Optional scVI/scANVI embedding/annotation integration
│
├── dashboard/                    # Streamlit web app
│   └── app.py                     # Interactive dashboard for single-cell & repertoire exploration
│
├── tests/                        # Automated test suite
│   ├── test_basic.py              # Basic package-level tests (imports, quick checks)
│   └── test_repertoire.py         # Unit tests for repertoire analysis & data loading
│
├── .github/                      # GitHub-specific settings
│   └── workflows/                 # GitHub Actions CI configuration
│       └── ci.yml                  # CI workflow for testing on push/PR
│
├── pipelines/                    # Nextflow pipelines
│   └── airr2anndata.nf            # Mini pipeline: AIRR TSV → AnnData-like CSV for analysis
│
└── bin/                          # Executable helper scripts
    └── airr2anndata.py             # Python script used in the Nextflow pipeline

Quickstart

# 1) Create and activate the Conda environment
conda env create -f env/environment.yml
conda activate immunobridgeai
# 2) Install the package in editable mode (uses pyproject.toml)
pip install -e .
# 3) Run the test suite
pytest -q
# 4) Launch the Streamlit dashboard
streamlit run dashboard/app.py

Data

  • data/airr_demo.tsv: tiny AIRR-like repertoire
  • data/vdjdb_demo.tsv: tiny specificity table
  • data/sc_demo_counts.csv, data/sc_demo_meta.csv: tiny single-cell example

Demo data is synthetic / downsampled and for illustration only.
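
A rough sketch of how the repertoire and specificity tables could be joined is shown below. It assumes exact matching on the CDR3 amino-acid sequence. The column names `cdr3` and `epitope` for the specificity table are guesses and may not match data/vdjdb_demo.tsv, and the inline rows are made-up placeholders rather than the bundled files.

```python
import csv
import io

# Placeholder tables for illustration; column names are assumptions.
VDJDB_DEMO = (
    "cdr3\tepitope\n"
    "CASSLGQAYEQYF\tdemo_epitope_1\n"
)
AIRR_DEMO = (
    "sequence_id\tjunction_aa\tduplicate_count\n"
    "seq1\tCASSLGQAYEQYF\t8\n"
    "seq2\tCASSPGTGGYEQYF\t2\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def annotate(airr_rows, vdjdb_rows):
    """Attach an epitope label to each rearrangement with an exact CDR3 match."""
    lookup = {r["cdr3"]: r["epitope"] for r in vdjdb_rows}
    return [
        {**row, "epitope": lookup.get(row["junction_aa"])}  # None when unmatched
        for row in airr_rows
    ]


annotated = annotate(read_tsv(AIRR_DEMO), read_tsv(VDJDB_DEMO))
for row in annotated:
    print(row["sequence_id"], row["epitope"])
```

To read the bundled files instead, pass the contents of `open(path).read()` to `read_tsv` after checking the real column names.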

Why this fits Systems Immunology & Translational AI

  • Bridges immune repertoire signals and single‑cell states to translational outputs (dashboard + per‑subject report).
  • Autoimmunity focus (T1D/GI demo labels) matches the lab’s emphasis on precision medicine.
  • Industry & hospital‑ready touches: tiny runnable demo, pinned deps, tests, Nextflow skeleton for reproducible ingestion.
  • AI hooks: includes scVI/scANVI latent integration (if installed) with a robust PCA fallback, demonstrating awareness of practical deployment constraints.

scVI/scANVI hooks (optional)

To enable true scVI/scANVI latents:

# Optional, heavy deps — install only if you need them
pip install anndata scvi-tools

The dashboard toggle “Use scVI/scANVI latent if available” will:

  • try scVI/scANVI with labels as semi‑supervised targets,
  • otherwise fall back gracefully to PCA so the demo always runs.
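
The toggle logic described above can be sketched as a check for the optional dependency before choosing a backend. This is an illustration, not the code in src/immunobridge/scanvi_hooks.py, and the function name is hypothetical. scvi-tools is imported under the module name `scvi`.

```python
import importlib.util


def pick_embedding_backend(use_scvi_requested, module_name="scvi"):
    """Return "scvi" only when it was requested and the module is importable.

    In every other case return "pca" so the demo always has an embedding.
    """
    if use_scvi_requested and importlib.util.find_spec(module_name) is not None:
        return "scvi"
    return "pca"


# With the toggle off, PCA is used regardless of what is installed.
print(pick_embedding_backend(False))
# With the toggle on, the result depends on whether scvi-tools is installed.
print(pick_embedding_backend(True))
```

A fuller version would presumably also catch errors raised while training the model and fall back to PCA in that case too.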

Nextflow mini‑pipeline

A tiny, reproducible pipeline converts AIRR‑style TSV into a toy “AnnData‑like” CSV matrix:

nextflow run pipelines/airr2anndata.nf --airr data/airr_demo.tsv --outdir results
# Output: results/anndata_like.csv

This is a placeholder you can expand (QC, clone‑calling, VDJ pairing, etc.) when moving to hospital data.
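
The layout of results/anndata_like.csv is not described here, so the following is only one plausible shape for such a conversion: a count matrix with one row per repertoire and one column per V gene. It is not the logic of bin/airr2anndata.py. The function name, the choice of the AIRR fields `repertoire_id` and `v_call`, and the inline rows are assumptions.

```python
import csv
import io
from collections import Counter


def airr_to_matrix(tsv_text, row_field="repertoire_id", col_field="v_call"):
    """Pivot AIRR-style rows into a CSV count matrix (rows x columns)."""
    counts = Counter()
    for rec in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        counts[(rec[row_field], rec[col_field])] += 1

    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([row_field] + cols)
    for r in rows:
        # Counter returns 0 for (row, column) pairs that were never seen.
        writer.writerow([r] + [counts[(r, c)] for c in cols])
    return out.getvalue()


# Made-up rows for illustration only.
demo = (
    "sequence_id\trepertoire_id\tv_call\n"
    "s1\tsubjA\tTRBV5-1\n"
    "s2\tsubjA\tTRBV5-1\n"
    "s3\tsubjB\tTRBV7-2\n"
)
matrix_csv = airr_to_matrix(demo)
print(matrix_csv)
```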

📖 Citation

If you use ImmunoBridgeAI in your research or presentations, please cite it as:

Lai, K. (2025). ImmunoBridgeAI: AI-powered immunogenomics toolkit for single-cell and immune repertoire analysis in translational autoimmunity research. (Version 0.1.0) [Computer software]. GitHub: https://github.com/taowis/ImmunoBridgeAI. Prepared for submission to the Journal of Open Source Software (JOSS).

About

ImmunoBridgeAI is a reproducible platform showcasing AI applications in immunogenomics. It integrates single-cell multi-omics (scRNA + VDJ) and immune repertoire (TCR/BCR) data, enabling deep generative modeling, clonotype network analysis, and translational reporting for autoimmune disease research.
