Description
I have been part of the Google OSS ML ecosystem for almost a decade, so I understand very clearly what "experimental" means. What I do not understand is why the documentation for grain is so poor, especially in the LLM era, when assisted documentation writing is widely available.
There is functionality I am interested in using and showcasing to other people, but the lack of documentation means I spend more time reading the source code to figure out the right way to use an API than actually using it. For example, look at the documentation of `ParquetIterDataset`. Does it answer any of the following questions?
- What are the `read_kwargs`?
- What if I want to lazy-load the shards?
- What if I want to lazy-load, but also ensure that the next shard is already read, so that I do not create a data pipeline bubble during training?
- How do I maximize performance with a small or a large number of shards?
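To make the third question concrete, this is the read-ahead pattern I mean, sketched in plain Python. The `read_shard` callable here is a stand-in for whatever actually loads one shard (e.g. a parquet read); nothing in this sketch is a grain API, and I would like the docs to tell me whether grain already does something like this for me:

```python
import threading
from queue import Queue

def prefetched_shards(shard_paths, read_shard, buffer_size=1):
    """Yield shards lazily while a background thread reads ahead.

    `read_shard` is a placeholder for the actual shard loader; up to
    `buffer_size` shards are read ahead of the consumer so training
    does not stall waiting for I/O.
    """
    queue = Queue(maxsize=buffer_size)
    sentinel = object()  # marks the end of the shard stream

    def producer():
        for path in shard_paths:
            queue.put(read_shard(path))  # blocks when the buffer is full
        queue.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while (shard := queue.get()) is not sentinel:
        yield shard
```

If `ParquetIterDataset` (or some combination of `read_kwargs`) already provides this kind of read-ahead, the documentation should say so explicitly.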
I have been waiting for the documentation to improve for almost two years now. A few of us would be happy to help, but even for that we need to know the API well (a chicken-and-egg problem!).