Merged
34 commits
8f0c11a
Fix transfomer-engine version in evo2 docker install (#224)
giogix2 Apr 17, 2025
30bd593
copy files from https://github.com/czi-ai/transcriptformer/commit/f37…
bputzeys May 1, 2025
10fe8bb
Include example and configs for transcriptformer
bputzeys May 1, 2025
fb96220
Include bugfix where file_path is only defined if
bputzeys May 1, 2025
61eba10
TODO: Remove dependency on pytorch-lightning
bputzeys May 1, 2025
759c1bf
Include option to load & use the model without
bputzeys May 1, 2025
8e0a11e
Let TranscriptFormer inherit from HelicalRNAModel
bputzeys May 1, 2025
18a20e6
TODO: Only return embeddings for now
bputzeys May 1, 2025
2bd481b
WIP: Add transcriptformer configurer
bputzeys May 1, 2025
34d33a7
The logging.disable() function is being used
bputzeys May 1, 2025
4383a45
TODO: Move configs to constructor. This should
bputzeys May 1, 2025
0c815bc
TODO: use ensembl_ids for all datasets even if gene_names is not spec…
bputzeys May 1, 2025
0c9f6c0
Use the same dataset as the other examples
bputzeys May 2, 2025
28bb04d
Simplify config setup
bputzeys May 2, 2025
91e6f52
Raise error if mapping to ensembl ids fails
bputzeys May 2, 2025
41042b6
If provided dataset doesn't have an 'assay' column
bputzeys May 2, 2025
e2a784a
Add transcriptformer to CI
bputzeys May 2, 2025
08b8ec9
Move hash values to constants folder
bputzeys May 2, 2025
6eba301
Add support for tf_exemplar and tf_metazoa
bputzeys May 2, 2025
8c31d5f
Include option to return entire output AnnData
bputzeys May 2, 2025
6e638ed
Use index as default gene_col_name
bputzeys May 2, 2025
d0fa909
Set index as default gene_col_name
bputzeys May 2, 2025
cd18bba
Include model card and documentation for
bputzeys May 2, 2025
ebaa297
Remove unused requirements.txt file
bputzeys May 2, 2025
b619847
Bump version number
bputzeys May 2, 2025
26da9dd
Run black on transcriptformer folder
bputzeys May 2, 2025
d35c5ff
remove comments
bputzeys May 2, 2025
f5dbb63
Improce docstrings
bputzeys May 2, 2025
4c24dd4
Set metazoa as default model for transcriptformer
bputzeys May 2, 2025
6885d6f
Replace UCE with TranscriptFormer in example
bputzeys May 2, 2025
1845fe0
Set model to tf_sapiens
bputzeys May 2, 2025
fdd40be
Speed up test
bputzeys May 2, 2025
3ab7a63
Fix notebook execution
bputzeys May 2, 2025
0a4fcd5
Merge pull request #226 from helicalAI/transcriptformer
bputzeys May 2, 2025
4 changes: 4 additions & 0 deletions .github/workflows/main.yml
@@ -93,6 +93,10 @@ jobs:
        run: |
          python examples/fine_tune_models/fine_tune_UCE.py ++device="cuda"

      - name: Execute Transcriptformer
        run: |
          python examples/run_models/run_transcriptformer.py

      - name: Execute Hyena
        run: |
          python examples/run_models/run_hyena_dna.py ++device="cuda"
14 changes: 6 additions & 8 deletions README.md
@@ -31,13 +31,8 @@ Let’s build the most exciting AI-for-Bio community together!

## What's new?

### Saving fine-tuned models

We give users the option now to save fine-tuned models. Please have a look at the examples folder [here](examples/fine_tune_models).

### Evo2
We have integrated [Evo2](https://github.com/ArcInstitute/evo2) into our helical package and have made a model card for it in our [Evo2 model folder](helical/models/evo_2/README.md). If you would like to test the model, take a look at our [example notebook](examples/notebooks/Evo-2.ipynb)!
Let us know what you think and we are happy to help you with the larger model (40B parameters!) if needed!
### TranscriptFormer
We have integrated [TranscriptFormer](https://github.com/czi-ai/transcriptformer) into our helical package and have made a model card for it in our [Transcriptformer model folder](helical/models/transcriptformer/README.md). If you would like to test the model, take a look at our [example notebook](examples/notebooks/Geneformer-vs-TranscriptFormer.ipynb)!

### 🧬 Introducing Helix-mRNA-v0: Unlocking new frontiers & use cases in mRNA therapy 🧬
We’re thrilled to announce the release of our first-ever mRNA Bio Foundation Model, designed to:
@@ -102,6 +97,7 @@ apptainer shell --nv --fakeroot singularity/helical/
- [Geneformer](https://helical.readthedocs.io/en/latest/model_cards/geneformer/)
- [scGPT](https://helical.readthedocs.io/en/latest/model_cards/scgpt/)
- [Universal Cell Embedding (UCE)](https://helical.readthedocs.io/en/latest/model_cards/uce/)
- [TranscriptFormer](https://helical.readthedocs.io/en/latest/model_cards/transcriptformer/)

### DNA models:
- [HyenaDNA](https://helical.readthedocs.io/en/latest/model_cards/hyena_dna/)
@@ -125,7 +121,7 @@ Within the `examples/notebooks` folder, open the notebook of your choice. We rec
| ----------- | ----------- |----------- |
|[Quick-Start-Tutorial.ipynb](./examples/notebooks/Quick-Start-Tutorial.ipynb)| A tutorial to quickly get used to the helical package and environment. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Quick-Start-Tutorial.ipynb)|
|[Helix-mRNA.ipynb](./examples/notebooks/Helix-mRNA.ipynb)|An example of how to use the Helix-mRNA model.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Helix-mRNA.ipynb) |
|[Geneformer-vs-UCE.ipynb](./examples/notebooks/Geneformer-vs-UCE.ipynb) | Zero-Shot Reference Mapping with Geneformer & UCE and compare the outcomes. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Geneformer-vs-UCE.ipynb) |
|[Geneformer-vs-TranscriptFormer.ipynb](./examples/notebooks/Geneformer-vs-TranscriptFormer.ipynb) | Zero-Shot Reference Mapping with Geneformer & UCE and compare the outcomes. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Geneformer-vs-TranscriptFormer.ipynb) |
|[Hyena-DNA-Inference.ipynb](./examples/notebooks/Hyena-DNA-Inference.ipynb)|An example how to do probing with HyenaDNA by training a neural network on 18 downstream classification tasks.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Hyena-Dna-Inference.ipynb) |
|[Cell-Type-Annotation.ipynb](./examples/notebooks/Cell-Type-Annotation.ipynb)|An example how to do probing with scGPT by training a neural network to predict cell type annotations.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Cell-Type-Annotation.ipynb) |
|[Cell-Type-Classification-Fine-Tuning.ipynb](./examples/notebooks/Cell-Type-Classification-Fine-Tuning.ipynb)|An example how to fine-tune different models on classification tasks.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Cell-Type-Classification-Fine-Tuning.ipynb) |
@@ -154,6 +150,7 @@ A lot of our models have been published by talend authors developing these excit
- [scGPT](https://github.com/bowang-lab/scGPT/)
- [Geneformer](https://huggingface.co/ctheodoris/Geneformer)
- [UCE](https://github.com/snap-stanford/UCE)
- [TranscriptFormer](https://github.com/czi-ai/transcriptformer)
- [HyenaDNA](https://github.com/HazyResearch/hyena-dna)
- [anndata](https://github.com/scverse/anndata)
- [scanpy](https://github.com/scverse/scanpy)
@@ -172,6 +169,7 @@ You can find the Licenses for each model implementation in the model repositorie
- [scGPT](https://github.com/helicalAI/helical/blob/release/helical/models/scgpt/LICENSE)
- [Geneformer](https://github.com/helicalAI/helical/blob/release/helical/models/geneformer/LICENSE)
- [UCE](https://github.com/helicalAI/helical/blob/release/helical/models/uce/LICENSE)
- [TranscriptFormer](https://github.com/helicalAI/helical/blob/release/helical/models/transcriptformer/LICENSE)
- [HyenaDNA](https://github.com/helicalAI/helical/blob/release/helical/models/hyena_dna/LICENSE)
- [Evo2](https://github.com/helicalAI/helical/blob/release/helical/models/evo_2/LICENSE)

5 changes: 5 additions & 0 deletions docs/configs/transcriptformer.md
@@ -0,0 +1,5 @@
::: helical.models.transcriptformer.TranscriptFormerConfig
    handler: python
    options:
      show_root_heading: True
      show_source: True
19 changes: 10 additions & 9 deletions docs/index.md
@@ -11,13 +11,8 @@ We will update this repo on a regular basis with new models, benchmarks, modalit
Let’s build the most exciting AI-for-Bio community together!

## What's new?
### Saving fine-tuned models

We give users the option now to save fine-tuned models. Please have a look at the examples folder [here](examples/fine_tune_models).

### Evo 2
We have integrated [Evo 2](https://github.com/ArcInstitute/evo2) into our helical package and have made a model card for it in our [Evo 2 model card](./model_cards/evo_2.md). If you would like to test the model, take a look at our [example notebook](./notebooks/Evo-2.ipynb)!
Let us know what you think and we are happy to help you with the larger model (40B parameters!) if needed!
### TranscriptFormer
We have integrated [TranscriptFormer](https://github.com/czi-ai/transcriptformer) into our helical package and have made a model card for it in our [Transcriptformer model folder](helical/models/transcriptformer/README.md). If you would like to test the model, take a look at our [example notebook](examples/notebooks/Geneformer-vs-TranscriptFormer.ipynb)!

### 🧬 Introducing Helix-mRNA-v0: Unlocking new frontiers & use cases in mRNA therapy 🧬
We’re thrilled to announce the release of our first-ever mRNA Bio Foundation Model, designed to:
@@ -82,6 +77,7 @@ apptainer shell --nv --fakeroot singularity/helical/
- [Geneformer](./model_cards/geneformer.md)
- [scGPT](./model_cards/scgpt.md)
- [Universal Cell Embedding (UCE)](./model_cards/uce.md)
- [TranscriptFormer](./model_cards/transcriptformer.md)

### DNA models:
- [HyenaDNA](./model_cards/hyenadna.md)
@@ -105,7 +101,7 @@ Within the `example/notebooks` folder, open the notebook of your choice. We reco
| ----------- | ----------- |----------- |
|[Quick-Start-Tutorial.ipynb](./notebooks/Quick-Start-Tutorial.ipynb)| A tutorial to quickly get used to the helical package and environment. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Quick-Start-Tutorial.ipynb)|
|[Helix-mRNA.ipynb](./notebooks/Helix-mRNA.ipynb)|An example of how to use the Helix-mRNA model.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Helix-mRNA.ipynb) |
|[Geneformer-vs-UCE.ipynb](./notebooks/Geneformer-vs-UCE.ipynb) | Zero-Shot Reference Mapping with Geneformer & UCE and compare the outcomes. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Geneformer-vs-UCE.ipynb) |
|[Geneformer-vs-TranscriptFormer.ipynb](./notebooks/Geneformer-vs-TranscriptFormer.ipynb) | Zero-Shot Reference Mapping with Geneformer & UCE and compare the outcomes. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Geneformer-vs-TranscriptFormer.ipynb) |
|[Hyena-DNA-Inference.ipynb](./notebooks/Hyena-DNA-Inference.ipynb)|An example how to do probing with HyenaDNA by training a neural network on 18 downstream classification tasks.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Hyena-DNA-Inference.ipynb)|
|[Cell-Type-Annotation.ipynb](./notebooks/Cell-Type-Annotation.ipynb)|An example how to do probing with scGPT by training a neural network to predict cell type annotations.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Cell-Type-Annotation.ipynb) |
|[Cell-Type-Classification-Fine-Tuning.ipynb](./notebooks/Cell-Type-Classification-Fine-Tuning.ipynb)|An example how to fine-tune different models on classification tasks.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Cell-Type-Classification-Fine-Tuning.ipynb) |
@@ -134,10 +130,15 @@ A lot of our models have been published by talented authors developing these exc
- [scGPT](https://github.com/bowang-lab/scGPT/)
- [Geneformer](https://huggingface.co/ctheodoris/Geneformer)
- [UCE](https://github.com/snap-stanford/UCE)
- [TranscriptFormer](https://github.com/czi-ai/transcriptformer)
- [HyenaDNA](https://github.com/HazyResearch/hyena-dna)
- [anndata](https://github.com/scverse/anndata)
- [scanpy](https://github.com/scverse/scanpy)
- [transformers](https://github.com/huggingface/transformers)
- [scikit-learn](https://github.com/scikit-learn/scikit-learn)
- [GenePT](https://github.com/yiqunchen/GenePT)
- [Caduceus](https://github.com/kuleshov-group/caduceus)
- [Evo2](https://github.com/ArcInstitute/evo2)

### Licenses

@@ -148,10 +149,10 @@ You can find the Licenses for each model implementation in the model repositorie
- [scGPT](https://github.com/helicalAI/helical/blob/release/helical/models/scgpt/LICENSE)
- [Geneformer](https://github.com/helicalAI/helical/blob/release/helical/models/geneformer/LICENSE)
- [UCE](https://github.com/helicalAI/helical/blob/release/helical/models/uce/LICENSE)
- [TranscriptFormer](https://github.com/helicalAI/helical/blob/release/helical/models/transcriptformer/LICENSE)
- [HyenaDNA](https://github.com/helicalAI/helical/blob/release/helical/models/hyena_dna/LICENSE)
- [Evo2](https://github.com/helicalAI/helical/blob/release/helical/models/evo_2/LICENSE)


## Citation

Please use this BibTeX to cite this repository in your publications:
1 change: 1 addition & 0 deletions docs/model_cards/transcriptformer.md
9 changes: 9 additions & 0 deletions docs/models/transcriptformer.md
@@ -0,0 +1,9 @@
::: helical.models.transcriptformer.TranscriptFormer
    handler: python
    options:
      members:
        - process_data
        - get_embeddings
        - get_output_adata
      show_root_heading: True
      show_source: True
1 change: 1 addition & 0 deletions docs/notebooks/Geneformer-vs-TranscriptFormer.ipynb
1 change: 0 additions & 1 deletion docs/notebooks/Geneformer-vs-UCE.ipynb

This file was deleted.
