refactor(training): T-CONTEXTS partial — _dump_frozen_dataset takes ctx #83

Open
frapercan wants to merge 2 commits into develop from feat/t-contexts-dump-frozen-mine

@frapercan
Owner

Acceptance criteria (master plan §24 Phase 1 — T-CONTEXTS partial)

Sixth incremental Parameter Object slice. _dump_frozen_dataset was already a thin wrapper over export_reranker_parquets (which now takes a ParquetExportContext); this slice removes the redundant repackaging and reduces the signature from 11 keyword arguments to 1.

Changes

  • protea/core/training_dump_helpers.py:
    • _dump_frozen_dataset signature collapses 11 args → 1 (ctx: ParquetExportContext).
    • Body uses ctx.stage_dir to preserve the legacy dump_dir return-key contract.
    • The single call site in TrainRerankerAutoOperation.execute (dump_helper branch) builds the context inline, setting store=None and the producer fields that the wrapper used to fill internally.
    • Forward-reference "ParquetExportContext" in the signature (TYPE_CHECKING import) avoids a circular dep with parquet_export.
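The signature collapse listed above can be sketched as follows. This is a minimal illustration, not the PROTEA code: the dataclass is a hypothetical stand-in that carries only the fields named in this PR (`stage_dir`, `store`, `producer_version`, `producer_git_sha`), and the `export_reranker_parquets` body is a placeholder that writes nothing.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional


@dataclass(frozen=True)
class ParquetExportContext:
    """Hypothetical stand-in; the real class has more fields."""

    stage_dir: Path
    store: Optional[Any] = None
    producer_version: str = "unknown"
    producer_git_sha: str = "unknown"


def export_reranker_parquets(ctx: ParquetExportContext) -> dict[str, Any]:
    """Placeholder exporter: reports what it was given, writes no files."""
    return {"producer_version": ctx.producer_version}


def _dump_frozen_dataset(ctx: ParquetExportContext) -> dict[str, Any]:
    """Thin wrapper: forwards the context and keeps the legacy return key."""
    result = export_reranker_parquets(ctx)
    # Legacy callers read "dump_dir", so it is derived from ctx.stage_dir.
    result["dump_dir"] = str(ctx.stage_dir)
    return result


# The call site builds the context inline instead of passing 11 keyword args.
ctx = ParquetExportContext(
    stage_dir=Path("/tmp/frozen"),
    store=None,
    producer_version="0.0.0",      # illustrative value
    producer_git_sha="deadbeef",   # illustrative value
)
summary = _dump_frozen_dataset(ctx)
```

Freezing the dataclass is a design choice of this sketch: it keeps the context immutable once the call site has built it.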

Smell budget

75 → 74 offenders. params>6: 20 → 19 (_dump_frozen_dataset retired).
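For readers unfamiliar with the `params>6` bucket, a check of that kind could be approximated with the standard `ast` module. This sketch is an assumption about what such a rule measures, not the implementation in scripts/check_smells.py; in particular, whether `self`, `*args`, or `**kwargs` count toward the limit is a guess.

```python
from __future__ import annotations

import ast

MAX_PARAMS = 6  # threshold taken from the "params>6" label


def count_params(func: ast.FunctionDef | ast.AsyncFunctionDef) -> int:
    """Count declared parameters, ignoring a leading self/cls (assumption)."""
    args = func.args
    names = [a.arg for a in args.posonlyargs + args.args + args.kwonlyargs]
    if args.vararg:
        names.append(args.vararg.arg)
    if args.kwarg:
        names.append(args.kwarg.arg)
    if names and names[0] in ("self", "cls"):
        names = names[1:]
    return len(names)


def long_signatures(source: str) -> list[tuple[str, int]]:
    """Return (function name, parameter count) for every offender."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            n = count_params(node)
            if n > MAX_PARAMS:
                offenders.append((node.name, n))
    return offenders


# Hypothetical before/after signatures mirroring the 11 -> 1 collapse.
before = "def dump(*, a, b, c, d, e, f, g, h, i, j, k): ...\n"
after = "def dump(ctx): ...\n"
```

Under these assumptions, `long_signatures(before)` flags the 11-parameter function and `long_signatures(after)` returns an empty list.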

Test plan

  • poetry run ruff check protea scripts
  • poetry run flake8 protea/
  • poetry run pytest tests/ --ignore=tests/test_jobs_pg.py (1163 passed, 11 skipped)
  • poetry run python scripts/check_smells.py (74 known, none new)

Branch naming

Pushed as feat/t-contexts-dump-frozen-mine because another agent had already taken the more natural feat/t-contexts-dump-frozen for an unrelated PR (#81, web benchmark heatmap).

frapercan added 2 commits May 8, 2026 18:05
…t × aspect) cell

Replaces the long flat matrix as the default view of /benchmark with a
3-aspects × N-categories grid of compact heatmap cards. Each cell ranks
embeddings by Fmax with horizontal bars colored on a perceptual scale,
the leader marked with a medal, and a slot reserved for bootstrap CI
whiskers (rendered when persisted).

The original full matrix table stays one click away behind a Heatmap |
Table toggle so the export-friendly raw-numbers view isn't lost.

New components/BenchmarkHeatmap.tsx
- bestRowsByEmbedding: collapses the matrix endpoint's per-K rows to
  one bar per embedding (the cell's best across stages/Ks already in
  the active selection).
- HSL gradient blue→violet by Fmax, bar width proportional. Color is
  supportive; the bar length is the primary signal for accessibility.
- Aspect-tinted card header (MFO blue / BPO violet / CCO emerald) so
  the per-aspect column reads at a glance.
- Hover tooltip exposes stage / K / Fmax. Future CI whiskers will
  render in the same row without changing the cell layout.
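The collapse and bar styling described in the list above can be sketched in a few lines. The component itself is TypeScript; Python is used here only to keep one language across the examples in this thread. The row fields (`embedding`, `stage`, `k`, `fmax`), the embedding names, the hue endpoints (220 to 280), and scaling the width against the cell leader are all assumptions, not details from BenchmarkHeatmap.tsx.

```python
from __future__ import annotations

from typing import TypedDict


class Row(TypedDict):
    embedding: str
    stage: str
    k: int
    fmax: float


def best_rows_by_embedding(rows: list[Row]) -> list[Row]:
    """Keep the highest-Fmax row per embedding, ranked best first."""
    best: dict[str, Row] = {}
    for row in rows:
        current = best.get(row["embedding"])
        if current is None or row["fmax"] > current["fmax"]:
            best[row["embedding"]] = row
    return sorted(best.values(), key=lambda r: r["fmax"], reverse=True)


def bar_style(fmax: float, top: float) -> dict[str, str]:
    """Width relative to the cell leader; hue from blue (220) to violet (280)."""
    width = 0.0 if top <= 0 else max(0.0, min(1.0, fmax / top))
    hue = 220 + 60 * max(0.0, min(1.0, fmax))
    return {
        "width": f"{width * 100:.0f}%",
        "background": f"hsl({hue:.0f}, 70%, 55%)",
    }


# Hypothetical per-K rows for one (category x aspect) cell.
rows: list[Row] = [
    {"embedding": "emb_a", "stage": "knn", "k": 5, "fmax": 0.52},
    {"embedding": "emb_a", "stage": "reranker", "k": 10, "fmax": 0.61},
    {"embedding": "emb_b", "stage": "knn", "k": 5, "fmax": 0.58},
]
ranked = best_rows_by_embedding(rows)
```

Each surviving row keeps its stage and K, which is what a hover tooltip would need to show.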

apps/web/app/[locale]/benchmark/page.tsx
- New viewMode state (default "heatmap").
- Toggle bar (role=tablist, aria-selected) rendered when there's data.
- Existing leaderboards (global + in-selection) stay above the toggle
  unchanged — they're already the per-cell story.

Behavior unchanged:
- Filters (stage, K, evaluation_set), CSV export, leaderboards, full
  matrix table — all preserved. Toggle to "Table" for the prior view.

CI: next build green; backend untouched.

Collapses ``_dump_frozen_dataset`` from 11 keyword-only args to 1
``ParquetExportContext`` argument. The helper was already a thin
wrapper around ``export_reranker_parquets`` (which now takes the
same context); this slice removes the redundant repackaging.

The single call site in ``TrainRerankerAutoOperation.execute``
(``dump_helper`` branch) is updated to build the context inline,
setting ``store=None``, ``producer_version`` and ``producer_git_sha``,
which the wrapper used to fill internally.

Body uses ``ctx.stage_dir`` to preserve the legacy ``dump_dir`` return
key. Forward-reference ``"ParquetExportContext"`` in the signature
avoids a circular import (parquet_export pulls heavy deps).
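The forward-reference pattern mentioned above looks roughly like this. It is a sketch under assumptions: the module path `protea.core.parquet_export` is inferred from the file names in this PR, and the function body is reduced to the return key.

```python
from types import SimpleNamespace
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers, never executed at runtime, so the
    # heavy module is not imported and no import cycle forms.
    # The module path below is an assumption.
    from protea.core.parquet_export import ParquetExportContext


def _dump_frozen_dataset(ctx: "ParquetExportContext") -> dict:
    # The quoted annotation stays a plain string at runtime.
    return {"dump_dir": str(ctx.stage_dir)}


# Any object with a stage_dir attribute works at runtime (duck typing);
# SimpleNamespace stands in for the real context here.
result = _dump_frozen_dataset(SimpleNamespace(stage_dir="/tmp/frozen"))
annotation = _dump_frozen_dataset.__annotations__["ctx"]
```

Because the import sits under `TYPE_CHECKING`, this snippet runs even where the assumed module does not exist.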

Sizes:
- training_dump_helpers.py: -25 LOC (wrapper body shrinks; caller
  picks up 7 LOC for the inline context construction)
- Smell baseline: 75 -> 74 (params>6: 20 -> 19;
  ``_dump_frozen_dataset`` retired)

Local-first 5 green (ruff + flake8 + pytest 1163 + check_smells).
frapercan enabled auto-merge (squash) May 8, 2026 16:21