Data Preparation

We have pre-trained and fine-tuned SIGMA on Kinetics400, Something-Something-V2, UCF101, and HMDB51 with this codebase.

The pre-processing of Something-Something-V2 can be summarized in 3 steps:
1. Download the dataset from the official website.
2. Convert the videos from .webm to .mp4, keeping the original height of 240px.
3. Generate the annotation files needed by the dataloader (each line has the form "<path_to_video> <video_class>"). The annotations usually consist of train.csv, val.csv and test.csv (here test.csv is the same as val.csv). We share our annotation files (train.csv, val.csv, test.csv) via Google Drive. Each *.csv file is formatted as follows:
```
dataset_root/video_1.mp4  label_1
dataset_root/video_2.mp4  label_2
dataset_root/video_3.mp4  label_3
...
dataset_root/video_N.mp4  label_N
```
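
Step 2 above could be scripted roughly as follows. This is a minimal sketch, not the script used for this codebase: it assumes `ffmpeg` (with `libx264`) is installed and on `PATH`, and the helper names `target_path`, `build_convert_command` and `convert_directory` are hypothetical. The `scale=-2:240` filter keeps the height at 240px and rounds the width to an even number, which `libx264` typically requires.

```python
import subprocess
from pathlib import Path


def target_path(src, out_dir):
    """Map e.g. raw/123.webm to <out_dir>/123.mp4."""
    return Path(out_dir) / (Path(src).stem + ".mp4")


def build_convert_command(src, dst):
    """Build an ffmpeg command that re-encodes one video to H.264 in .mp4."""
    return [
        "ffmpeg",
        "-n",                    # never overwrite an existing output file
        "-i", str(src),
        "-vf", "scale=-2:240",   # height 240px, width kept in ratio and made even
        "-c:v", "libx264",       # assumption: H.264 is an acceptable codec
        str(dst),
    ]


def convert_directory(webm_dir, out_dir):
    """Convert every .webm file in webm_dir; raises if ffmpeg fails."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(webm_dir).glob("*.webm")):
        dst = target_path(src, out_dir)
        subprocess.run(build_convert_command(src, dst), check=True)
```

Calling `convert_directory("<webm_dir>", "<mp4_dir>")` with your own directories would convert the files one at a time; parallelizing the loop is left out for brevity.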
The pre-processing of Kinetics400 can be summarized in 3 steps:
1. Download the dataset from the official website.
2. Resize the short edge of each video to 320px. You can refer to the MMAction2 Data Benchmark for TSN and SlowOnly.
3. Generate the annotation files needed by the dataloader (each line has the form "<path_to_video> <video_class>"). The annotations usually consist of train.csv, val.csv and test.csv (here test.csv is the same as val.csv). Each *.csv file is formatted as follows:
```
dataset_root/video_1.mp4  label_1
dataset_root/video_2.mp4  label_2
dataset_root/video_3.mp4  label_3
...
dataset_root/video_N.mp4  label_N
```
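
An annotation file in this format could be generated with a short script like the one below. It is only a sketch under stated assumptions: the videos are assumed to be stored in one sub-directory per class, labels are assumed to be integer indices assigned in alphabetical order of the class names, and the two fields are separated by a single space as in "<path_to_video> <video_class>". The function names are hypothetical and not part of this codebase.

```python
from pathlib import Path


def build_annotation_lines(video_root, extension=".mp4"):
    """Return '<path_to_video> <video_class>' lines for a class-per-folder layout."""
    root = Path(video_root)
    # Assumption: each sub-directory of video_root holds the videos of one class.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_index = {name: idx for idx, name in enumerate(class_names)}
    lines = []
    for name in class_names:
        for video in sorted((root / name).glob(f"*{extension}")):
            # Paths containing spaces would be ambiguous in this format.
            lines.append(f"{video.as_posix()} {class_to_index[name]}")
    return lines


def write_annotation_file(lines, csv_path):
    """Write one annotation per line, without a header row."""
    Path(csv_path).write_text("\n".join(lines) + "\n")
```

For example, `write_annotation_file(build_annotation_lines("<train_dir>"), "train.csv")` would produce the training list. Check the resulting label indices against the label mapping your checkpoints expect before training.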

Note:

We use decord to decode the videos on the fly during both the pre-training and fine-tuning phases.
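
For illustration, on-the-fly decoding with decord can look like the sketch below. The uniform sampling in `sample_frame_indices` and the default of 16 frames are assumptions made for this example, not the sampling strategy of this codebase; both function names are hypothetical. Only `VideoReader`, `cpu`, `get_batch` and `asnumpy` come from decord.

```python
def sample_frame_indices(num_total_frames, num_samples):
    """Pick num_samples roughly evenly spaced frame indices in [0, num_total_frames)."""
    if num_total_frames <= 0 or num_samples <= 0:
        raise ValueError("both arguments must be positive")
    step = num_total_frames / num_samples
    # Take the middle of each of the num_samples equal segments.
    return [
        min(int(step * i + step / 2), num_total_frames - 1)
        for i in range(num_samples)
    ]


def load_clip(video_path, num_samples=16):
    """Decode the sampled frames of one video into a uint8 array."""
    # Imported lazily so the index helper works without decord installed.
    from decord import VideoReader, cpu

    reader = VideoReader(video_path, ctx=cpu(0), num_threads=1)
    indices = sample_frame_indices(len(reader), num_samples)
    # Resulting shape: (num_samples, height, width, 3).
    return reader.get_batch(indices).asnumpy()
```

Because frames are decoded directly from the .mp4 files, no frame-extraction step is needed before training.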