These are the evaluation benchmarks for the Mamba2 3B model at the 640k checkpoint.
For the majority of the base scores:
| Metric | Value |
| --- | --- |
| winogrande | 0.700868 |
| truthfulqa_mc2 | 0.362402 |
| social_iqa | 0.326510 |
| sciq | 0.926 |
| piqa | 0.800326 |
| openbookqa | 0.436 |
| lambada | 4.382999 |
| lambada_openai | 4.046894 |
| lambada_standard | 4.719104 |
| hellaswag | 0.758415 |
| copa | 0.85 |
| boolq | 0.717125 |
| arc_easy | 0.709175 |
| arc_challenge | 0.420648 |
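
The table mixes values on two scales: most scores lie in [0, 1], while the three lambada entries are above 1. The sketch below copies the values from the table and summarizes the two groups separately. It assumes, without confirmation from this card, that the lambada values are perplexities (lower is better) and that the remaining values are accuracy-like scores. The helper name `split_by_scale` is hypothetical and used only for illustration.

```python
# Scores copied from the table above (Mamba2 3B, 640k checkpoint).
scores = {
    "winogrande": 0.700868,
    "truthfulqa_mc2": 0.362402,
    "social_iqa": 0.326510,
    "sciq": 0.926,
    "piqa": 0.800326,
    "openbookqa": 0.436,
    "lambada": 4.382999,
    "lambada_openai": 4.046894,
    "lambada_standard": 4.719104,
    "hellaswag": 0.758415,
    "copa": 0.85,
    "boolq": 0.717125,
    "arc_easy": 0.709175,
    "arc_challenge": 0.420648,
}


def split_by_scale(scores):
    """Separate values in [0, 1] from values above 1.

    Assumption: values above 1 are perplexities, not accuracies.
    """
    bounded = {k: v for k, v in scores.items() if 0.0 <= v <= 1.0}
    unbounded = {k: v for k, v in scores.items() if v > 1.0}
    return bounded, unbounded


bounded, unbounded = split_by_scale(scores)

# Unweighted mean over the accuracy-like scores only.
mean_bounded = sum(bounded.values()) / len(bounded)

print(f"accuracy-like tasks: {len(bounded)}, mean = {mean_bounded:.4f}")
for name, value in sorted(unbounded.items()):
    print(f"{name}: {value:.4f} (assumed perplexity)")
```

Averaging only the bounded values avoids mixing the two scales in a single number. The unweighted mean is a rough summary, not an official aggregate for this model.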
For MMLU:
| Model | MMLU |
| --- | --- |
| mamba2_3b_640k | 0.411409 |
Benchmarks for previous checkpoints can be found here.