The table below lists the training annotations and the download links for their corresponding image and video sources:
| Dataset | Link |
|---|---|
| LVIS | https://cocodataset.org/#download (train2017) |
| obj365 | https://www.objects365.org/overview.html |
| openimages | https://storage.googleapis.com/openimages/web/index.html |
| PACO | https://cocodataset.org/#download (train2017) |
| V3Det | https://v3det.openxlab.org.cn/ |
| RefCOCO | https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco.zip |
| RefCOCO+ | https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco+.zip |
| RefCOCOg | https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcocog.zip |
| RefText | https://github.com/Buki2/STAN |
| Visual Genome | https://homes.cs.washington.edu/~ranjay/visualgenome/index.html |
| GRES | https://cocodataset.org/#download (train2014) |
| Google_Refexp | https://cocodataset.org/#download (train2014) |
| Rexverse-2M | https://huggingface.co/datasets/IDEA-Research/Rexverse-2M |
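Several entries in the table (the RefCOCO, RefCOCO+, and RefCOCOg rows) are direct `.zip` links, so fetching them can be scripted. The following is a minimal sketch using only the Python standard library, not part of this repository: the names `REFCOCO_URLS`, `archive_path`, and `download_and_extract` are hypothetical helpers introduced here for illustration, and any output directory you pass in is your own choice.

```python
import urllib.request
import zipfile
from pathlib import Path

# URLs copied verbatim from the table above.
REFCOCO_URLS = {
    "refcoco": "https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco.zip",
    "refcoco+": "https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco+.zip",
    "refcocog": "https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcocog.zip",
}


def archive_path(url: str, root: Path) -> Path:
    """Return the local path under `root` where the archive for `url` is stored."""
    return root / url.rsplit("/", 1)[-1]


def download_and_extract(url: str, root: Path) -> Path:
    """Download a zip archive into `root` (skipped if already present) and unpack it there."""
    root.mkdir(parents=True, exist_ok=True)
    target = archive_path(url, root)
    if not target.exists():
        # urlretrieve streams the response body to the given file path.
        urllib.request.urlretrieve(url, str(target))
    with zipfile.ZipFile(target) as archive:
        archive.extractall(root)
    return target
```

Under these assumptions, calling `download_and_extract(url, Path("data/refer"))` for each value in `REFCOCO_URLS` would place the archives and their contents in a (hypothetical) `data/refer` directory. The other rows point to landing pages rather than files and still need to be downloaded by following the instructions on each site.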
The table below lists the training annotations and the download links for their corresponding video sources. Note: for each video source (.mp4), first extract its frames using extract_mp4_frames.py.