The following table lists the supported datasets and provides links to the corresponding data preparation instructions.
| Dataset | Description |
|---|---|
| ActivityNet | A Large-Scale Video Benchmark for Human Activity Understanding with 19,994 videos. |
| THUMOS14 | Consists of 413 videos with temporal annotations. |
| EPIC-KITCHENS | Large-scale dataset in first-person (egocentric) vision. Latest version is EPIC-KITCHENS-100. |
| EPIC-Sounds | A large-scale dataset of audio annotations capturing temporal extents and class labels. |
| Ego4D-MQ | Ego4D is the world's largest egocentric video dataset. MQ refers to its moment query task. |
| HACS | Shares the same action taxonomy as ActivityNet, but consists of around 50K videos. |
| FineAction | Contains 103K temporal instances of 106 action categories, annotated in 17K untrimmed videos. |
| Multi-THUMOS | Dense, multilabel action annotations of THUMOS14. |
| Charades | Contains 9,848 densely labeled videos of daily activities. |
- If you encounter

  ```
  FileNotFoundError: [Errno 2] No such file or directory: 'xxx/missing_files.txt'
  ```

  it means you may need to generate a `missing_files.txt`, which should record the features that are missing compared to all the videos in the annotation file. You can use `python tools/prepare_data/generate_missing_list.py annotation.json feature_folder` to generate the txt file, e.g.

  ```
  python tools/prepare_data/generate_missing_list.py data/fineaction/annotations/annotations_gt.json data/fineaction/features/fineaction_mae_g
  ```

  In the features provided by this codebase, we have already included this txt file in the zip.
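For reference, the core logic of such a script can be sketched as follows. This is a minimal illustration, not the actual `generate_missing_list.py`: it assumes an ActivityNet-style annotation layout (`{"database": {video_id: {...}}}`) and feature files named `<video_id>.<ext>`; adjust both assumptions for your dataset.

```python
import json
import os


def generate_missing_list(annotation_json, feature_folder, out_path="missing_files.txt"):
    """Write the video IDs that appear in the annotation file but have no
    corresponding feature file in feature_folder, one ID per line."""
    with open(annotation_json) as f:
        annotations = json.load(f)
    # Assumed annotation layout: {"database": {video_id: {...}, ...}}.
    video_ids = set(annotations["database"].keys())
    # Assumed feature naming: <video_id>.<ext> (e.g. v_xxx.npy).
    present = {os.path.splitext(name)[0] for name in os.listdir(feature_folder)}
    missing = sorted(video_ids - present)
    with open(out_path, "w") as f:
        f.write("\n".join(missing))
    return missing
```

Running it against your annotation file and feature folder produces a plain text list that the data loader can use to skip videos whose features are absent.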