CRUX M7: Data depth — 11 missing stories #924

@noahgift

Parent: #917

Scope

HuggingFace datasets parity — streaming, sharding, parquet, arrow interop, WebDataset.

Contracts in scope

  • contracts/crux-H-{03,06,12,13,14,16,17,18,19,20,21}-v1.yaml

Exit criteria

  • 11 contracts promoted from missing to supported
  • New apr dataset surface OR integration into existing apr pretrain --dataset
  • ShardBatchIter handles all 11 format/streaming patterns
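
As a rough illustration of the batching behaviour the last criterion implies, here is a minimal sketch of lazy, cross-shard batching. It is not the project's ShardBatchIter: `iter_shard_batches`, `read_shard`, and `drop_last` are hypothetical names invented for this example, and the per-format readers (parquet, arrow, WebDataset) are assumed to be supplied by the caller.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_shard_batches(
    shards: Sequence[str],
    read_shard: Callable[[str], Iterable[T]],
    batch_size: int,
    drop_last: bool = False,
) -> Iterator[List[T]]:
    """Yield fixed-size batches while streaming shards one at a time.

    Hypothetical helper: `read_shard` stands in for any format-specific
    reader and is only called when its shard is reached, so at most one
    shard is open at a time.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[T] = []
    for shard in shards:
        for record in read_shard(shard):
            batch.append(record)
            if len(batch) == batch_size:
                yield batch
                batch = []  # start a fresh list; the yielded one is not reused
    # A partial batch may remain after the final shard.
    if batch and not drop_last:
        yield batch


# Usage with in-memory stand-ins for shards (assumed data, for illustration).
fake_shards = {"shard-0": [1, 2, 3], "shard-1": [4, 5]}
batches = list(iter_shard_batches(["shard-0", "shard-1"], fake_shards.__getitem__, 2))
```

Batches are allowed to span shard boundaries here; whether the real iterator does the same, or pads or drops at shard edges, is a design choice the contracts would need to settle.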

Dependencies

Metadata

Labels

  • crux: CRUX competitive-research-UX spec
  • epic: Epic, multi-story umbrella
  • phase-3: CRUX phase_3_missing, implement new stories
