Hi,
I have a MosaicML streaming dataset. Each sample in my dataset has shape (Time, N, 3).
What the model needs for training are pairs ((N, 3), (N, 3)), which I extract along the time dimension. I have a function extract() that returns an iterator of pairs from a single (Time, N, 3) input.
How would I combine this with StreamingDataset / StreamingDataLoader? I want extract() to run as part of the data-loading pipeline, so that it happens in the DataLoader workers and does not block the main thread, which is busy training.
Is this supported, and if so, how should I go about implementing it?
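
To make the question concrete, here is a minimal sketch of the pattern I have in mind, in plain Python with no streaming library involved. The extract() below is only a stand-in that pairs consecutive time steps (an assumption for illustration; my real function differs), and PairStream is a hypothetical wrapper name, not something from the library.

```python
from typing import Iterable, Iterator, Sequence, Tuple

Frame = Sequence  # one (N, 3) time step
Pair = Tuple[Frame, Frame]


def extract(sample: Sequence) -> Iterator[Pair]:
    """Stand-in for the real extract(): pairs consecutive time steps (assumption)."""
    for t in range(len(sample) - 1):
        yield sample[t], sample[t + 1]


class PairStream:
    """Hypothetical wrapper: flattens an iterable of (Time, N, 3) samples into pairs."""

    def __init__(self, base: Iterable) -> None:
        self.base = base

    def __iter__(self) -> Iterator[Pair]:
        # Each base sample expands into zero or more training pairs.
        for sample in self.base:
            yield from extract(sample)


# Two toy samples with Time=3 and Time=2, N=1.
samples = [
    [[[0, 0, 0]], [[1, 1, 1]], [[2, 2, 2]]],
    [[[5, 5, 5]], [[6, 6, 6]]],
]
pairs = list(PairStream(samples))
print(len(pairs))  # 3 pairs: two from the first sample, one from the second
```

In the real pipeline, `base` would be the StreamingDataset (or the loop would live in an overridden `__iter__` of a subclass), so that the expansion runs inside the DataLoader workers. What I am unsure about is whether that is a supported way to do it, given that the number of yielded items would no longer match the dataset's sample count.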