Parent: #917
Scope
HuggingFace datasets parity — streaming, sharding, parquet, arrow interop, WebDataset.
Contracts in scope
contracts/crux-H-{03,06,12,13,14,16,17,18,19,20,21}-v1.yaml
Exit criteria
- 11 contracts promoted
missing → supported
- New
apr dataset surface OR integration into existing apr pretrain --dataset
ShardBatchIter handles all 11 format/streaming patterns
Dependencies
Parent: #917
Scope
HuggingFace datasets parity — streaming, sharding, parquet, arrow interop, WebDataset.
Contracts in scope
contracts/crux-H-{03,06,12,13,14,16,17,18,19,20,21}-v1.yamlExit criteria
missing→supportedapr datasetsurface OR integration into existingapr pretrain --datasetShardBatchIterhandles all 11 format/streaming patternsDependencies
ShardBatchIter(reads.binu32 LE)