feat: auto-label detected composite datasets #41

Merged
TrevorBasinger merged 1 commit into tb/test-suite-optimizations from tb/dataset-composite-auto-labels
Mar 17, 2026

TrevorBasinger commented Mar 17, 2026

What this changes

When roar detects that a run output is a dataset-like composite artifact, it now creates local artifact labels for that dataset automatically.

Those labels are attached to the composite artifact itself and include the stable dataset identity fields we want to keep long term, such as dataset type, dataset ID, fingerprint information, split, version hint, and derived modality.
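To make the shape of those labels concrete, here is a minimal sketch of how the detected fields might map onto a label subtree. Everything named here is an assumption for illustration: `DetectedDataset`, `build_dataset_labels`, the top-level `dataset` key, and the individual key names are hypothetical and are not taken from roar's code.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class DetectedDataset:
    """Hypothetical stand-in for the result of dataset detection."""

    dataset_type: str
    dataset_id: str
    fingerprint: str
    split: Optional[str] = None
    version_hint: Optional[str] = None
    modality: Optional[str] = None


def build_dataset_labels(detected: DetectedDataset) -> Dict[str, Any]:
    """Build a nested label dict under an assumed 'dataset' key.

    Optional fields that were not detected are omitted rather than
    stored as None, so the subtree only carries known values.
    """
    fields = {
        "type": detected.dataset_type,
        "id": detected.dataset_id,
        "fingerprint": detected.fingerprint,
        "split": detected.split,
        "version_hint": detected.version_hint,
        "modality": detected.modality,
    }
    return {"dataset": {k: v for k, v in fields.items() if v is not None}}
```

Keeping the identity fields under a single subtree is one way to let the system own those keys without touching labels that users add elsewhere on the artifact.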

Why

We want dataset information to move out of artifact metadata over time and into the label system. This change starts that transition without removing the existing metadata path yet.

It also means users do not need to manually label detected datasets just to make them discoverable and queryable locally.

Notes

This preserves any existing non-dataset labels on the artifact. The system-managed dataset label subtree is updated only when the detected dataset values change.
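The preservation rule can be sketched roughly as follows. This is an illustrative approximation, not the code in this PR: `merge_dataset_labels` and the `dataset` subtree key are hypothetical names, and labels are assumed to be plain nested dicts.

```python
from copy import deepcopy
from typing import Any, Dict, Tuple

# Assumed name of the system-managed subtree; hypothetical.
DATASET_KEY = "dataset"


def merge_dataset_labels(
    existing: Dict[str, Any], detected_subtree: Dict[str, Any]
) -> Tuple[Dict[str, Any], bool]:
    """Return (labels, changed).

    Only the system-managed subtree is replaced; every other label on
    the artifact is carried over unchanged. If the detected values match
    what is already stored, the existing labels are returned as-is and
    changed is False, so the caller can skip the write.
    """
    if existing.get(DATASET_KEY) == detected_subtree:
        return existing, False
    merged = deepcopy(existing)
    merged[DATASET_KEY] = deepcopy(detected_subtree)
    return merged, True


# Usage: a user label survives, and a repeated detection is a no-op.
labels = {"team": "vision", "dataset": {"id": "a"}}
labels, changed = merge_dataset_labels(labels, {"id": "b"})
_, changed_again = merge_dataset_labels(labels, {"id": "b"})
```

Returning a `changed` flag is one way to express "updated only when the detected values change" without comparing label versions at the call site.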

This PR is stacked on top of #40.

Testing

  • focused unit and happy-path tests for dataset label generation and visibility
  • broader dataset / label pytest slice
  • mypy roar

@TrevorBasinger TrevorBasinger merged commit e639003 into tb/test-suite-optimizations Mar 17, 2026