feat: add GP-age epigenetic clock #188

Merged
marcbal77 merged 1 commit into bio-learn:master from marcbal77:feature/add-gpage-clock on Feb 11, 2026
@marcbal77
Member

@marcbal77 marcbal77 commented Dec 7, 2025

Summary

  • Add GP-age clock (Gaussian Process regression-based age prediction)
  • 6 model variants: GPAge10, GPAge30, GPAge71, GPAgeA, GPAgeB, GPAgeC
  • GPy added as optional dependency (pip install biolearn[gpage])
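An optional extra like this is usually paired with a lazy import, so the rest of the package keeps working when the extra is not installed. A minimal sketch of that pattern follows; `require_optional` and its parameters are hypothetical names chosen for illustration, not biolearn's actual code.

```python
import importlib


def require_optional(module_name, extra):
    """Import an optional module, or fail with an install hint.

    `require_optional` is a hypothetical helper used for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Point the user at the extra that provides the missing module.
        raise ImportError(
            f"{module_name} is required for this feature. "
            f"Install it with: pip install biolearn[{extra}]"
        ) from exc


# A module that is installed resolves normally.
json_module = require_optional("json", "gpage")
```

Calling the helper only inside the code path that needs GPy would defer the import cost and the error until a GP-age model is actually requested.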

Reference

@marcbal77
Member Author

Note: The .json.zip extension on the model files is misleading; they are actually gzip-compressed JSON files. GPy handles this natively. This differs from the linear clocks, which are stored as CSV files.
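Because the extension does not match the content, a reader that trusts the file name would pick the wrong decompressor. A minimal standard-library sketch of the situation follows. The file name and payload are made up for illustration, and this is not GPy's loader.

```python
import gzip
import json
import os
import tempfile
import zipfile

# Hypothetical stand-in for a model file: gzip-compressed JSON saved
# under a misleading ".json.zip" name.
payload = {"name": "example_model", "n_cpgs": 10}
path = os.path.join(tempfile.mkdtemp(), "example_model.json.zip")
with gzip.open(path, "wt", encoding="utf-8") as fh:
    json.dump(payload, fh)

# The extension says "zip", but the content is not a zip archive.
looks_like_zip = zipfile.is_zipfile(path)

# Reading the file as gzip recovers the original JSON.
with gzip.open(path, "rt", encoding="utf-8") as fh:
    loaded = json.load(fh)
```

Here `looks_like_zip` is false and `loaded` equals `payload`, so a loader that opens the file with gzip works regardless of the extension.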

@marcbal77 marcbal77 marked this pull request as draft December 7, 2025 01:39
@marcbal77 marcbal77 self-assigned this Dec 7, 2025
@marcbal77 marcbal77 added the enhancement New feature or request label Dec 7, 2025
@sarudak
Member

sarudak commented Jan 26, 2026

I'm not clear on what this needs to be ready to merge.

@marcbal77
Member Author

I ran into an error locally that I could not reproduce. I looked further into GPy, but all I recall is that the error had to do with the .json.zip extension. Let me rebase, run make test/format, and then take this out of draft state.

@marcbal77 marcbal77 force-pushed the feature/add-gpage-clock branch from 8beda2c to 10ec7a9 Compare January 27, 2026 22:49
@marcbal77 marcbal77 marked this pull request as ready for review January 27, 2026 22:50
@marcbal77
Member Author

marcbal77 commented Feb 9, 2026

  1. Initially observed an error when trying to open the model file directly - something related to gzip decompression (couldn't replicate the exact error)
  2. Checked file types - confirmed .json.zip files are actually gzip-compressed (not zip):
    $ file biolearn/data/GP-age_model_10_cpgs.json.zip
    gzip compressed data, max compression
  3. Tested model loading:
    from biolearn.model_gallery import ModelGallery
    gallery = ModelGallery()
    model = gallery.get('GPAge10', imputation_method='none')

    Loads successfully, 10 CpG sites

  4. Ran full test suite: 161 passed, 5 skipped

I could not reproduce the original gzip error. GPy's load_model() handles gzip natively despite the misleading .json.zip extension. Tests pass, models load correctly. Rebased on master and marked ready for review.
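The `file` check in step 2 works by inspecting the magic bytes at the start of the file rather than the extension. The same check can be sketched in Python. `sniff_compression` is a hypothetical helper for illustration, not part of biolearn or GPy.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"    # first two bytes of any gzip stream
ZIP_MAGIC = b"PK\x03\x04"   # local file header that starts a non-empty zip


def sniff_compression(data):
    """Classify a byte string by its leading magic bytes."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


# Build one in-memory sample of each format.
gzip_bytes = gzip.compress(b'{"k": 1}')
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("model.json", '{"k": 1}')
zip_bytes = zip_buffer.getvalue()

kinds = (sniff_compression(gzip_bytes), sniff_compression(zip_bytes))
```

A check like this could run in a test to document that the bundled model files are gzip streams despite their names.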


@sarudak sarudak left a comment

I think this is ready to merge. I would like to discuss the separation of GPy as an optional dependency.

@marcbal77 marcbal77 merged commit 7b74cfa into bio-learn:master Feb 11, 2026
1 check passed
@marcbal77
Member Author

We decided to leave GPy as a dependency because of compatibility issues and its dependencies on other large libraries.

@marcbal77 marcbal77 deleted the feature/add-gpage-clock branch February 11, 2026 18:55
Labels

enhancement New feature or request

2 participants