| title | layout |
|---|---|
| Machine Learning Scientist — Drug Discovery (Foundation Models) | default |
London, UK • linkedin.com/in/ctr26 • github.com/ctr26 • Google Scholar
Focus: Virtual cells, multi‑modal foundation models, omics + imaging, precision medicine
Machine Learning Scientist specialising in virtual cell development for drug discovery. Built and deployed multi‑modal foundation models that integrate knowledge graphs, text, transcriptomic and phenotypic imaging data. Experience spans scales from molecular interactions to whole‑organism imaging. Comfortable leading across science and engineering: TB‑scale MLOps, reproducible pipelines, and cross‑functional collaboration with biology, chemistry and platform teams.
Foundation models • Representation/self‑supervised learning • Multi‑modal fusion • Knowledge graphs • Biological sequence & transcriptomics • High‑content imaging • GNNs • OOD/robustness • Scaling & performance • Reproducible ML (MLOps) • Scientific communication
Senior Machine Learning Scientist — Valence Labs @ Recursion Pharmaceuticals
London, UK • Oct 2024 – Present
- Virtual Cell initiative: Fine‑tuned multi‑modal LLMs over knowledge graphs + text + RNA‑seq + phenotypic imaging to model cell state and predict gene/drug responses; partnered closely with biology to design benchmark tasks and success metrics.
- TxPert: Co‑developed a state‑of‑the‑art transcriptomic perturbation predictor using systems‑biology KGs; contributed training code, data curation, and ablations.
- Boltz2 project: Contributed to proteome‑scale virtual drug screening components; supported evaluation strategy and error analysis across targets.
- Community: Organiser, Virtual Cell Journal Club; fostered a reading group bridging ML and wet‑lab teams.
Senior Research Associate & AI Engineering Lead — EMBL‑EBI (Uhlmann Group & Bio‑Image Archive)
Cambridge, UK • Dec 2022 – Oct 2024
- Team leadership: Supervised 6 PhD students; established coding standards, CI, and peer‑review practices used across the lab.
- Spatial biology: Built deep‑learning pipelines for high‑content cell morphology and single‑cell feature learning; integrated with public bioimage resources.
- Open‑source: Created bioimage_embed (self‑supervised learning for biological images) and shape_embed (cell‑shape DL toolkit); productionised training/inference.
- MLOps: Designed scalable pipelines processing TB‑scale microscopy datasets across HPC and cloud; containerised workflows and automated experiment tracking.
- Academic service: Reviewer — ISBI 2022/2023, ICASSP 2024.
AI/ML Founding Engineer — Amun AI AB
Stockholm, Sweden • 2022 – 2024
- Built a GKE/Kubernetes platform for model serving with NVIDIA Triton/KServe; supported 100+ models for 30+ daily users with auth, monitoring and autoscaling.
AI/ML Engineering Consultant — DeepMirror
Cambridge & London, UK • 2022 – 2024
- MouseMindMapper: Automated brain‑histology segmentation product generating £50k annual revenue; delivered end‑to‑end data, training, packaging and docs.
- Wrote a high‑performance C++ cheminformatics fingerprinting library for production use.
Data Scientist — Brazma Group, EMBL‑EBI
Cambridge, UK • Dec 2019 – Dec 2023
- Co‑authored the successful AI4LIFE €5M grant (federated bioimage AI infrastructure); contributed to platform architecture and model‑sharing strategy.
- Drove large‑scale AI microscopy analyses in the Image Data Resource; collaborated with Google Cloud on representation learning.
- Taught annual deep‑learning courses to 40+ researchers (PhD to PI).
Software Engineer (COVID‑19 Response) — European Nucleotide Archive, EMBL‑EBI
Cambridge, UK • Mar 2020 – Sep 2020
- Built CI/CD for the COVID‑19 Data Portal to enable daily global data updates.
- Scaled NGS alignment and Nextflow/Kubernetes ETL pipelines for surging data volumes.
Computational Microscopist — National Physical Laboratory
London, UK • 2018 – Dec 2019
- Developed novel 3D organoid segmentation methods for cancer research; delivered consultancy to MSquared on advanced imaging.
PhD, Engineering — University of Cambridge • 2014 – 2018 (EPSRC PES‑CDT)
Thesis: “Light‑sheet microscopy for tracking particles in large specimens”
- Designed & built a novel light‑sheet microscope with automated acquisition.
- Developed algorithms for particle tracking, signal optimisation, and micrometre‑scale tomography.
- Supervision: 2× MRes, 1× BSc.
MRes, Photonics — University of Cambridge & UCL • 2013 – 2014 (EPSRC Photonics CDT)
- Structured‑illumination microscopy reconstruction; modules in Computer Vision, Quantum Mechanics, Photonics.
MSci, Physics (First‑Class Honours) — Nottingham Trent University • 2009 – 2013
- Top physics graduate; President, Mountaineering Club (2011–2012).
- Wenkel F, Tu W, Masschelein C, Shirzad H, Eastwood C, Whitfield ST, Bendidi I, Russell CT, et al. TxPert: Leveraging Biochemical Relationships for Out‑of‑Distribution Transcriptomic Perturbation Prediction. arXiv:2505.14919 (2025)
- Harrison PW, Lopez R, Rahman N, Allen SG, Aslam R, Buso N, Russell CT, et al. The COVID‑19 Data Portal: accelerating SARS‑CoV‑2 and COVID‑19 research through rapid open access data sharing. Nucleic Acids Research 49(W1):W619–W623 (2021)
- Ouyang W, Beuttenmueller F, Gómez‑de‑Mariscal E, Pape C, Burke T, Garcia‑López‑de Haro C, Russell C, et al. Bioimage model zoo: a community‑driven resource for accessible deep learning in bioimage analysis. bioRxiv 2022.06.07.495102 (2022)
- Ahlers J, Moré DA, Amsalem O, Anderson A, Bokota G, Boone P, Russell C, et al. napari: a multi‑dimensional image viewer for Python. Zenodo 1–2 (2023)
- Hidalgo‑Cenalmor I, Pylvänäinen JW, Ferreira MG, Russell CT, et al. DL4MicEverywhere: deep learning for microscopy made flexible, shareable and reproducible. Nature Methods 21(6):925–927 (2024)
See Google Scholar for the complete publication list: scholar.google.com/citations?user=XVt7BYQAAAAJ
- Virtual Cell Foundation Model • Patent pending • 2024 • Multi‑modal integration of knowledge graphs, transcriptomics and imaging for cellular state prediction (Hook1/Recursion)
- TxPert: Transcriptomic Perturbation Prediction • Patent pending • 2024 • Systems‑biology knowledge graph integration for gene expression response forecasting (Recursion)
- bioimage_embed — Self‑supervised learning for biological images.
- shape_embed — Deep‑learning toolkit for cell‑shape analysis.
- Contributions to Hypha Platform, BioImage Model Zoo, BIA Binder, Hypha Helm Charts, COVID Workflow Manager.
ML & AI: Foundation‑model fine‑tuning, contrastive/self‑supervised learning, OOD & uncertainty, evaluation/ablation design
Frameworks: PyTorch, TensorFlow, Lightning, Pyro, Hugging Face, scikit‑learn
Vision & Bio: Bioimage analysis, 3D reconstruction, segmentation/super‑resolution, snRNA‑seq/bulk RNA‑seq, histopathology, fluorescence imaging, GNNs, knowledge graphs
Languages: Python (primary), R, MATLAB, C++, Java
Compute: Multi‑GPU training (A100/V100), CUDA, distributed training, SLURM, HPC, GCP/AWS GPU instances
MLOps/Infra: Kubernetes, Docker, NVIDIA Triton, KServe, MLflow, CI/CD, Terraform
Workflows: Nextflow, Snakemake, Apache Airflow
- AI4LIFE (2022) — Co‑investigator on €5M EU grant (federated bioimage AI)
- EPSRC CDT Studentship (2013–2018) — Photonic & Electronic Systems CDT (£120k)
- Nuffield Research Bursary (2012) — Computer vision for liquid‑crystal flows
- Institute of Physics grant support (2009–2012)
- Course lead: Deep Learning for Bioimage Analysis (2019–2023), 40+ participants/year
- Supervision: 6 PhD students (AI & spatial biology) + 3 project students (during PhD)
- Peer review: Nature Methods, Scientific Reports, Journal of Microscopy, ISBI, ICASSP
- Talks & conferences: FOM (2018, 2022, 2023), MMC (2018, 2022), CBIAS (2023)
- Community leadership: Rowing captain/coach (Magdalene College), Mountaineering Club President (NTU)
References available upon request.