Add AIME2025, GPQA, HealthBench evaluation_test suites; unify row-lim… #202

Triggered via push August 10, 2025 17:45
Status Success
Total duration 9m 26s
Artifacts 4

ci.yml

on: push
Lint & Type Check: 1m 25s
Matrix: test-core
Batch Evaluation Tests: 1m 35s
MCP End-to-End Tests: 48s
Upload Coverage: 5s

Annotations

1 warning
MCP End-to-End Tests
No files were found with the provided path: coverage.xml. No artifacts will be uploaded.

Artifacts

Produced during runtime
Name                           Size     Digest
coverage-batch-eval (Expired)  31.1 KB  sha256:e89500b14da594d085184bb7b4843d9e3416675b275a1641fb5c689278145fc4
coverage-core-3.10 (Expired)   37 KB    sha256:955adbf8c8d3138a5230dda27dde539f1324e8e1e3dd2bef1dd0b7286f26a945
coverage-core-3.11 (Expired)   37 KB    sha256:ad2403784d4d59198d3376c522ea242df1fe54c4a1bcfd7df014e87fbb52bc1e
coverage-core-3.12 (Expired)   37 KB    sha256:883a22cf86db6991f0ab8420fb7bba88b874bfbd711692c5f2dfbc5cb009d65f