ImmunoBridgeAI is a compact, reproducible platform showing how to bridge immune repertoires (TCR/BCR) and single-cell profiles to translational, explainable insights.
Ships with tiny demo datasets and fully working code—no external downloads needed.
- Load AIRR-style TSV repertoires and compute clonal expansion & diversity.
- Lightweight TCR clustering via edit distance and simple graph communities.
- Specificity annotation using a bundled mini VDJdb-style file (demo only).
- Ingest single-cell counts + metadata, run PCA→UMAP with scikit-learn + umap-learn.
- Streamlit dashboard with Overview, Repertoire, Report tabs.
- Tests and GitHub Actions CI that run the test suite on every push and pull request.
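As a rough illustration of the clonal expansion and diversity step, the sketch below sums read counts per clonotype and derives Shannon entropy and clonality (1 minus normalized entropy). It is not the code in `repertoire.py`; the function names and the inline TSV are hypothetical, and it assumes the standard AIRR Rearrangement columns `junction_aa` and `duplicate_count`.

```python
import csv
import io
import math
from collections import Counter

# Hypothetical in-memory stand-in for an AIRR-style TSV (made-up sequences).
AIRR_TSV = (
    "sequence_id\tjunction_aa\tduplicate_count\n"
    "s1\tCASSLGQGAEAFF\t6\n"
    "s2\tCASSPGTGGYEQYF\t2\n"
    "s3\tCASSLGQGAEAFF\t2\n"
    "s4\tCASRDRGNTIYF\t2\n"
)


def clone_counts(tsv_text):
    """Sum duplicate_count per junction_aa (one clonotype per CDR3 amino-acid sequence)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        counts[row["junction_aa"]] += int(row["duplicate_count"])
    return counts


def shannon_entropy(counts):
    """Shannon entropy (natural log) of the clonotype frequency distribution."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def clonality(counts):
    """1 - normalized entropy; values near 1 indicate a few dominant clones."""
    if len(counts) < 2:
        return 1.0  # convention chosen here: a single clonotype is fully clonal
    return 1.0 - shannon_entropy(counts) / math.log(len(counts))


counts = clone_counts(AIRR_TSV)
top_clone, top_count = counts.most_common(1)[0]
print(top_clone, top_count, round(shannon_entropy(counts), 3), round(clonality(counts), 3))
```

Grouping by `junction_aa` alone is a simplification; a stricter clonotype definition might also include V and J gene calls.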
ImmunoBridgeAI is designed to demonstrate real-world AI applications in computational immunology and translational research.
Below are the key points where AI/ML technologies are used or easily pluggable.
- Files: src/immunobridge/scanvi_hooks.py, dashboard/app.py
- Technology: scVI (variational autoencoder) and scANVI (semi-supervised annotation model).
- Purpose: Learn latent representations of single-cell RNA + VDJ data and annotate cell types.
- Impact: Enables AI-powered immune cell classification in autoimmune diseases such as T1D and IBD.
- Demo Mode: Falls back to PCA/UMAP if scVI-tools is not installed.
- Files (planned): dashboard/app.py
- Technology: SHAP (Shapley additive explanations).
- Purpose: Explain model predictions (e.g., case vs control) by ranking feature contributions.
- Impact: Adds transparency for clinicians when interpreting immune repertoire-based AI models.
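Since the SHAP integration is only planned, the following is a sketch of the underlying idea rather than the `shap` package or the project's code: exact Shapley values computed by enumerating feature coalitions, with absent features set to a baseline. The toy scoring function, its weights, and the feature names are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_score(features):
    """Hypothetical case-vs-control score; the weights are illustrative only."""
    return (0.8 * features["clonality"]
            + 0.5 * features["top_clone_freq"]
            - 0.3 * features["shannon"])


x = {"clonality": 0.6, "top_clone_freq": 0.4, "shannon": 1.2}
baseline = {"clonality": 0.2, "top_clone_freq": 0.1, "shannon": 2.0}
phi = shapley_values(toy_score, x, baseline)

# Rank features by absolute contribution to this prediction.
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.3f}")
```

Exact enumeration grows exponentially with the number of features, so it is only practical for a handful of them; approximation methods are the usual choice for real models.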
- Files: pipelines/airr2anndata.nf, bin/airr2anndata.py
- Technology: Nextflow-based preprocessing for AI pipelines.
- Purpose: Standardize multi-modal immune data (AIRR repertoires + single-cell) for ML/DL workflows.
- Impact: Ensures hospital and research datasets are AI-ready and reproducible across sites.
- Files: src/immunobridge/repertoire.py
- Technology: NetworkX graph construction from clonotype edit-distance.
- Purpose: Enable graph neural network (GNN) or clustering models on immune repertoires.
- Impact: Facilitates discovery of clonal expansion patterns linked to disease.
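To make the edit-distance graph idea concrete, here is a dependency-free sketch: Levenshtein distance between CDR3 sequences, an adjacency map linking sequences within a threshold, and connected components as a simple stand-in for community detection. The project itself uses NetworkX; the function names, the threshold of 1, and the sequences below are assumptions for illustration.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def build_graph(seqs, max_dist=1):
    """Adjacency map linking sequences whose edit distance is <= max_dist."""
    adj = {s: set() for s in seqs}
    for a, b in combinations(seqs, 2):
        if edit_distance(a, b) <= max_dist:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def connected_components(adj):
    """Group nodes into clusters with an iterative depth-first search."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adj[node] - seen)
        components.append(component)
    return components


# Hypothetical CDR3 sequences (made up for the example).
cdr3s = ["CASSLGQGAEAFF", "CASSLGQGAEVFF", "CASSLGQGTEVFF", "CASRDRGNTIYF"]
clusters = connected_components(build_graph(cdr3s, max_dist=1))
print(clusters)
```

The all-pairs comparison is quadratic in the number of clonotypes, which is fine for demo-sized data but would need indexing or pre-filtering (for example by length) on full repertoires.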
- Files: src/immunobridge/io_10x.py
- Technology: UMAP (nonlinear manifold learning).
- Purpose: Reduce high-dimensional single-cell data to 2D/3D for exploration.
- Impact: Helps researchers visually explore patient immune profiles.
Note: All AI components are designed to run on small demo datasets out of the box.
For real-world research, replace the demo files with your own immune repertoire and single-cell datasets, enable scvi-tools, and integrate hospital-specific preprocessing pipelines.
ImmunoBridgeAI/
├── LICENSE # MIT license file
├── CITATION.cff # Citation metadata for academic reference
├── README.md # Project overview, usage, and documentation
├── requirements.txt # Minimal Python dependencies for quick install
├── pyproject.toml # ✅ Modern Python packaging (setuptools build)
├── nextflow.config # Global config for running Nextflow pipeline
│
├── env/ # Conda environment definitions
│ └── environment.yml # Full Conda environment with all dependencies
│
├── data/ # Tiny bundled demo datasets
│ ├── airr_demo.tsv # Example AIRR-formatted immune repertoire
│ ├── vdjdb_demo.tsv # Mini VDJdb-like specificity lookup table
│ ├── sc_demo_counts.csv # Toy single-cell gene expression matrix
│ └── sc_demo_meta.csv # Metadata (subject, batch) for demo single-cell data
│
├── src/ # Core Python package source code
│ └── immunobridge/ # Package: immunobridge
│     ├── __init__.py # Package initializer, defines exports
│     ├── io_airr.py # Load and process AIRR-formatted repertoires
│     ├── io_10x.py # Load single-cell 10x-style count & metadata, run PCA/UMAP
│     ├── repertoire.py # Diversity metrics, clonotype graph building, community detection
│     ├── specificity.py # Match clonotypes to VDJdb-style specificity tables
│     ├── viz.py # Plotly-based visualizations (UMAP scatter, bar charts)
│     └── scanvi_hooks.py # Optional scVI/scANVI embedding/annotation integration
│
├── dashboard/ # Streamlit web app
│ └── app.py # Interactive dashboard for single-cell & repertoire exploration
│
├── tests/ # Automated test suite
│ ├── test_basic.py # Basic package-level tests (imports, quick checks)
│ └── test_repertoire.py # Unit tests for repertoire analysis & data loading
│
├── .github/ # GitHub-specific settings
│ └── workflows/ # GitHub Actions CI configuration
│     └── ci.yml # CI workflow for testing on push/PR
│
├── pipelines/ # Nextflow pipelines
│ └── airr2anndata.nf # Mini pipeline: AIRR TSV → AnnData-like CSV for analysis
│
└── bin/ # Executable helper scripts
  └── airr2anndata.py # Python script used in the Nextflow pipeline
conda env create -f env/environment.yml
conda activate immunobridgeai
pytest -q
# 1) Install the package in editable mode (uses pyproject.toml)
pip install -e .
# 2) Then launch Streamlit
streamlit run dashboard/app.py
- data/airr_demo.tsv: tiny AIRR-like repertoire
- data/vdjdb_demo.tsv: tiny specificity table
- data/sc_demo_counts.csv, data/sc_demo_meta.csv: tiny single-cell example
Demo data is synthetic / downsampled and for illustration only.
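One plausible way to use the specificity table is an exact-match lookup from CDR3 sequence to annotated epitope. The sketch below assumes hypothetical column names `cdr3` and `epitope` and placeholder labels, since the schema of the bundled file is not shown here; it is not the logic in `specificity.py`.

```python
import csv
import io

# Hypothetical VDJdb-style table; sequences and epitope labels are placeholders.
VDJDB_TSV = (
    "cdr3\tepitope\n"
    "CASSLGQGAEAFF\tEPITOPE_A\n"
    "CASRDRGNTIYF\tEPITOPE_B\n"
)


def load_lookup(tsv_text):
    """Map each CDR3 sequence to the set of epitopes annotated for it."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        lookup.setdefault(row["cdr3"], set()).add(row["epitope"])
    return lookup


def annotate(clonotypes, lookup):
    """Return {cdr3: sorted epitope list}; unmatched clonotypes map to an empty list."""
    return {c: sorted(lookup.get(c, ())) for c in clonotypes}


lookup = load_lookup(VDJDB_TSV)
annotations = annotate(["CASSLGQGAEAFF", "CASSPGTGGYEQYF"], lookup)
print(annotations)
```

Exact matching is the simplest possible rule; a real analysis might also require matching V/J genes or allow near matches.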
- Bridges immune repertoire signals and single‑cell states to translational outputs (dashboard + per‑subject report).
- Autoimmunity focus (T1D/GI demo labels) matches the lab’s emphasis on precision medicine.
- Industry & hospital‑ready touches: tiny runnable demo, pinned deps, tests, Nextflow skeleton for reproducible ingestion.
- AI hooks: includes scVI/scANVI latent integration (if installed) with a robust PCA fallback, demonstrating awareness of practical deployment constraints.
To enable true scVI/scANVI latents:
# Optional, heavy deps — install only if you need them
pip install anndata scvi-tools
The dashboard toggle “Use scVI/scANVI latent if available” will:
- try scVI/scANVI with labels as semi‑supervised targets,
- otherwise gracefully fall back to PCA so the demo always runs.
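The fallback behaviour described above can be sketched as a small backend selector that checks whether scvi-tools (imported as `scvi`) is available before committing to it. The function name and return labels are hypothetical and this is not the code in `scanvi_hooks.py`; only `importlib.util.find_spec` is a real standard-library call.

```python
import importlib.util


def choose_embedding_backend(prefer_scvi=True, module_name="scvi"):
    """Return "scvi" if requested and importable, otherwise "pca".

    find_spec returns None for a missing top-level module, so the check
    does not import the heavy dependency or raise when it is absent.
    """
    if prefer_scvi and importlib.util.find_spec(module_name) is not None:
        return "scvi"
    return "pca"


backend = choose_embedding_backend(prefer_scvi=True)
print(f"Embedding backend: {backend}")
```

A fuller version would presumably also wrap model training in a `try`/`except` and drop to PCA on failure, so that a broken install never stops the demo.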
A tiny, reproducible pipeline converts AIRR‑style TSV into a toy “AnnData‑like” CSV matrix:
nextflow run pipelines/airr2anndata.nf --airr data/airr_demo.tsv --outdir results
# Output: results/anndata_like.csv
This is a placeholder you can expand (QC, clone‑calling, VDJ pairing, etc.) when moving to hospital data.
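The layout of `anndata_like.csv` is not specified here, so the sketch below shows just one hypothetical possibility: a repertoire-by-clonotype count matrix with one row per `repertoire_id` and one column per `junction_aa`. The function name and the inline TSV are assumptions, and `bin/airr2anndata.py` may do something different.

```python
import csv
import io
from collections import defaultdict

# Hypothetical AIRR-style input (made-up sequences and repertoire IDs).
AIRR_TSV = (
    "repertoire_id\tjunction_aa\tduplicate_count\n"
    "subj1\tCASSA\t3\n"
    "subj1\tCASSB\t1\n"
    "subj2\tCASSA\t2\n"
    "subj1\tCASSA\t1\n"
)


def airr_to_matrix_csv(tsv_text):
    """Pivot AIRR rows into a repertoire x clonotype matrix of summed counts."""
    table = defaultdict(lambda: defaultdict(int))
    clonotypes = set()
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        table[row["repertoire_id"]][row["junction_aa"]] += int(row["duplicate_count"])
        clonotypes.add(row["junction_aa"])

    columns = sorted(clonotypes)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["repertoire_id"] + columns)
    for rep in sorted(table):
        # Clonotypes absent from a repertoire are written as 0.
        writer.writerow([rep] + [table[rep].get(c, 0) for c in columns])
    return out.getvalue()


matrix_csv = airr_to_matrix_csv(AIRR_TSV)
print(matrix_csv)
```

A dense matrix like this is only reasonable for toy data; real repertoires have far too many distinct clonotypes and would call for a sparse representation.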
If you use ImmunoBridgeAI in your research or presentations, please cite it as:
Lai, K. (2025). ImmunoBridgeAI: AI-powered immunogenomics toolkit for single-cell and immune repertoire analysis in translational autoimmunity research. (Version 0.1.0) [Computer software]. GitHub: https://github.com/taowis/ImmunoBridgeAI. Prepared for submission to the Journal of Open Source Software (JOSS).