Thank you for sharing the STAR work and code. I am trying to reproduce the training and evaluation settings described in the paper, and I have two questions about the OpenVid splits.
- OpenVid-1M 200K training subset
In the paper, STAR is trained on “a subset of OpenVid-1M with 200K text-video pairs.”
Could you please clarify how this 200K subset is defined?
Is there a released ID list / file list (e.g., txt/csv/json) for the 200K samples?
If it is sampled dynamically, could you share the exact sampling procedure, including any filtering rules and the random seed used?
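To make the question concrete, here is a minimal sketch of what I mean by a reproducible sampling procedure. It is only an illustration, not a guess at how STAR actually builds the subset: the function name `sample_subset`, the placeholder IDs, and the seed value are all my own assumptions.

```python
import random


def sample_subset(video_ids, k, seed):
    """Return a reproducible k-element subset of video_ids (hypothetical helper)."""
    rng = random.Random(seed)
    # Sort the input first so the result does not depend on listing order.
    return sorted(rng.sample(sorted(video_ids), k))


# Placeholder IDs for illustration only; these are not real OpenVid-1M names.
all_ids = [f"video_{i:04d}" for i in range(1000)]
subset = sample_subset(all_ids, k=200, seed=0)
```

If the subset was drawn this way, knowing the seed, the ordering of the source list, and any filters applied beforehand would be enough for me to regenerate it.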
- OpenVid30 test set
The paper mentions that OpenVid30 is separated from OpenVid-1M with no overlap with training data, and consists of the first ~100 frames of each video.
Could you share the exact video IDs included in OpenVid30 (or a list file)?
How was “no overlap” ensured in practice (e.g., by video ID, URL, hash, etc.)?
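For reference, this is roughly the check I would run on my side once I have both lists. It is a sketch under my own assumptions: it compares by file stem as a stand-in for the video ID, and `normalize_id`, `find_overlap`, and the example paths are hypothetical rather than taken from the repository.

```python
import os


def normalize_id(entry):
    """Reduce a list entry to a bare ID: drop whitespace, directory, and extension."""
    return os.path.splitext(os.path.basename(entry.strip()))[0]


def find_overlap(train_entries, test_entries):
    """Return the sorted IDs that appear in both the training and test lists."""
    train_ids = {normalize_id(e) for e in train_entries if e.strip()}
    test_ids = {normalize_id(e) for e in test_entries if e.strip()}
    return sorted(train_ids & test_ids)


# Made-up entries for illustration; real lists would be read from the split files.
train = ["videos/clip_a.mp4", "videos/clip_b.mp4"]
test = ["clip_c.mp4", "clip_b.mp4\n"]
overlap = find_overlap(train, test)
```

An empty result would confirm there is no overlap at the ID level. If different clips cut from the same source video should also count as overlapping, the comparison key would need to be the source URL or a content hash instead.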
If these split files already exist in the repository, please point me to the paths. Any guidance would be greatly appreciated.