Add AIME2025, GPQA, HealthBench evaluation_test suites; unify row-limiting via pytest flag; clean up examples #199
ci.yml
on: pull_request
Annotations
1 warning
MCP End-to-End Tests
No files were found with the provided path: coverage.xml. No artifacts will be uploaded.
Artifacts
Produced during runtime
| Name | Status | Size | Digest |
|---|---|---|---|
| coverage-batch-eval | Expired | 31.1 KB | sha256:181615560b48f4a3d86f7e10e58312efb89b6a38fc5f431248f725d2e572ebdf |
| coverage-core-3.10 | Expired | 37 KB | sha256:98d26fff306fbe4582e84dba54d32596c2f40b76a60a26d3c8a2fe1711bbd5ba |
| coverage-core-3.11 | Expired | 37 KB | sha256:9792257363e948352faa9723ab0e5ed2299227e0b2f6fdc3912e0de148796df3 |
| coverage-core-3.12 | Expired | 37 KB | sha256:7011f8b79fe3d4201fd26f1e175bea26caf1dd8aa673e8e484de78fe7a469872 |