Pretrain a domain-adapted ModernBERT for docket entry description text.
Run initial experiments on a small set of ~6M entries to test different strategies, including:
- Vanilla masked language modeling with ModernBERT models
- Training small models from scratch
- Finetuning models with a combined distillation + MLM loss
- Finetuning "sliced" (layer-reduced) variants of ModernBERT models
Then launch a larger training run with ~50M entries using the best-performing strategies.
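The distillation + MLM strategy can be sketched as a weighted sum of the hard MLM cross-entropy and a temperature-scaled KL divergence against a teacher's logits. This NumPy version is a hedged illustration of that objective; the function name, `alpha`, and `T` defaults are assumptions, not values specified in this plan.

```python
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distill_mlm_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Illustrative combined loss over masked positions (labels == -100 elsewhere).

    alpha weights the soft distillation term against the hard MLM term;
    the KL term is scaled by T^2 as is conventional in distillation.
    """
    mask = labels != -100
    s = student_logits[mask]  # (n_masked, vocab)
    t = teacher_logits[mask]
    y = labels[mask]
    # hard MLM cross-entropy against the true masked tokens
    log_p = np.log(softmax(s) + 1e-12)
    ce = -log_p[np.arange(len(y)), y].mean()
    # soft KL divergence against the temperature-scaled teacher distribution
    p_t = softmax(t / T)
    log_p_s = np.log(softmax(s / T) + 1e-12)
    kd = (p_t * (np.log(p_t + 1e-12) - log_p_s)).sum(axis=-1).mean() * T * T
    return alpha * kd + (1.0 - alpha) * ce
```

When the student matches the teacher exactly, the KL term goes to zero and only the hard MLM term remains, which is a useful sanity check on any implementation.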