chore: generalize eval-results LFS rule to catch versioned dumps by antoinezambelli · Pull Request #96 · antoinezambelli/forge

antoinezambelli · 2026-05-31T22:49:57Z

What

Generalize the eval-results Git LFS rule so every result-dump naming variant is captured.

- eval_results.jsonl       filter=lfs ...
- eval_results_rig*.jsonl   filter=lfs ...
+ eval_results*.jsonl        filter=lfs ...

Why

The two old patterns matched only the bare eval_results.jsonl and the now-retired rig-tagged naming (eval_results_rig*.jsonl). The current scheme is version-tagged — eval_results_v0.6.0.jsonl, eval_results_v0.7.0.jsonl — which neither pattern matched:

$ git check-attr filter eval_results_v0.7.0.jsonl
eval_results_v0.7.0.jsonl: filter: unspecified

The two existing v-files are already in LFS (added when a matching git lfs track pattern still existed), but the rule protecting them is gone. The next release dump (eval_results_v0.7.3.jsonl) would therefore have been committed as a plain ~67MB git blob, baked permanently into history. That is the "accidental regular-git churn" this cleans up.
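To make the "LFS pointer vs. plain blob" distinction concrete, here is a minimal sketch of a check for whether a committed blob looks like a Git LFS pointer. It assumes the basic three-line pointer layout from the LFS v1 spec (version, oid, size) and ignores optional extension lines. The helper name is_lfs_pointer and the sample OID and size are hypothetical, not taken from this repo.

```python
import re

# Basic Git LFS v1 pointer layout: version line, sha256 oid, size in bytes.
# Optional "ext-*" lines allowed by the spec are not handled in this sketch.
POINTER_RE = re.compile(
    rb"\Aversion https://git-lfs\.github\.com/spec/v1\n"
    rb"oid sha256:[0-9a-f]{64}\n"
    rb"size [0-9]+\n\Z"
)


def is_lfs_pointer(blob: bytes) -> bool:
    """Return True if the blob looks like an LFS pointer rather than real data."""
    # Pointer files are tiny; a multi-megabyte dump can never be one.
    return len(blob) < 1024 and POINTER_RE.match(blob) is not None


# Hypothetical pointer content (made-up OID and size, for illustration only).
sample_pointer = (
    b"version https://git-lfs.github.com/spec/v1\n"
    b"oid sha256:" + b"0" * 64 + b"\n"
    b"size 12345\n"
)
print(is_lfs_pointer(sample_pointer))
print(is_lfs_pointer(b'{"some": "raw jsonl row"}\n'))
```

A dump committed without a matching .gitattributes rule would fail this check, because the blob stored in git would be the raw JSONL itself.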

The fix

One glob — eval_results*.jsonl — subsumes bare, rig-tagged, and version-tagged forms, so no future variant slips through.
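As a rough illustration of the coverage gap, the sketch below compares the old and new patterns with Python's fnmatch. This is only an approximation of how git matches slash-free .gitattributes patterns against basenames; git check-attr remains the authoritative check. The rig-tagged file name is hypothetical.

```python
from fnmatch import fnmatchcase

OLD_PATTERNS = ["eval_results.jsonl", "eval_results_rig*.jsonl"]
NEW_PATTERNS = ["eval_results*.jsonl"]


def covered(name: str, patterns: list[str]) -> bool:
    """True if any pattern matches the file's basename (case-sensitive)."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


NAMES = [
    "eval_results.jsonl",         # bare form
    "eval_results_rig1.jsonl",    # hypothetical rig-tagged name
    "eval_results_v0.7.0.jsonl",  # current version-tagged form
    "eval_results_v0.7.3.jsonl",  # next release dump
]

for name in NAMES:
    old = covered(name, OLD_PATTERNS)
    new = covered(name, NEW_PATTERNS)
    print(f"{name}: old={old} new={new}")
```

The version-tagged names match only under the new glob, while the bare and rig-tagged names match under both rule sets.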

Verification

git check-attr filter now returns lfs for eval_results_v0.7.0.jsonl and the future eval_results_v0.7.3.jsonl.
Existing v-file pointers unchanged (LF, same OIDs/sizes: 67MB + 45MB); git lfs fsck --pointers = OK.
Diff is .gitattributes only — no code, no proxy overlap.
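The check-attr verification above could be scripted along these lines. This is a sketch, not something shipped in this PR: the helper names are hypothetical, and it assumes git is on PATH and that the output keeps the `<path>: <attribute>: <value>` shape shown in the Why section.

```python
import subprocess


def parse_check_attr(line: str) -> tuple[str, str, str]:
    """Split one `git check-attr` output line into (path, attribute, value)."""
    # rsplit from the right so a path containing ": " stays in one piece.
    path, attr, value = line.rstrip("\n").rsplit(": ", 2)
    return path, attr, value


def filter_value(path: str, repo_dir: str = ".") -> str:
    """Ask git which filter applies to `path` (the file need not exist yet)."""
    result = subprocess.run(
        ["git", "check-attr", "filter", "--", path],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_check_attr(result.stdout.splitlines()[0])[2]


# Parsing the output quoted in the Why section:
print(parse_check_attr("eval_results_v0.7.0.jsonl: filter: unspecified"))
```

Run inside the repo after this change, filter_value("eval_results_v0.7.3.jsonl") would be expected to return lfs, per the verification notes above.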

The two LFS patterns matched only 'eval_results.jsonl' and the now-dead rig-tagged 'eval_results_rig*.jsonl' naming. The current scheme is version-tagged ('eval_results_v0.7.0.jsonl'), which neither pattern matched -- git check-attr reported 'unspecified' for those files, so the next release dump (eval_results_v0.7.3.jsonl) would have committed as a plain ~67MB git blob instead of an LFS pointer. Collapse both into one glob 'eval_results*.jsonl' covering bare, rig-tagged, and version-tagged variants. Existing v-file pointers are unchanged; this only closes the gap for future dumps.

#98) The wheel target is scoped to src/forge, but the sdist target had no configuration, so hatchling swept the entire working tree -- pulling the LFS eval datasets (~112MB) and the eval dashboard's node_modules (~97MB, including Windows .exe/.node binaries) into the source distribution. Add a scoped [tool.hatch.build.targets.sdist] excluding both. The sdist drops from ~26MB to ~690KB; src/forge, tests, docs, the dashboard source, and the prebuilt results HTML are all retained. report.py rebuilds the dashboard via npm on demand, so the committed node_modules was never load-bearing; the eval datasets remain in the repo via LFS. Also refresh a stale .gitignore comment to the versioned eval-results naming (post #96).

antoinezambelli merged commit f7fb366 into main May 31, 2026
2 checks passed

antoinezambelli deleted the az/eval-lfs-hygiene branch May 31, 2026 22:50

antoinezambelli mentioned this pull request Jun 1, 2026

chore: scope sdist to exclude eval datasets and dashboard node_modules #98

Merged

antoinezambelli merged 1 commit into main from az/eval-lfs-hygiene
