Dataset Preparation

Kinetics

The Kinetics dataset can be downloaded using the code released by ActivityNet:

Download the videos via the official scripts.
After all the videos are downloaded, resize each video so that its short edge is 256 pixels, then prepare the CSV files for the training, validation, and testing sets as train.csv, val.csv, and test.csv. Each line of a CSV file has the format:

path_to_video_1 label_1
path_to_video_2 label_2
path_to_video_3 label_3
...
path_to_video_N label_N
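Before training, it can help to check that each list file matches the format above. The following is a minimal sketch, not part of the released code: `check_video_list` is a hypothetical helper, and it assumes that paths contain no whitespace and that labels are non-negative integers.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report lines that are not "path_to_video label".
# Assumptions: fields are separated by whitespace, paths contain no spaces,
# and labels are non-negative integers.
check_video_list() {
  local csv_file="$1"
  awk '
    NF != 2 || $2 !~ /^[0-9]+$/ {
      printf "line %d is malformed: %s\n", NR, $0
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  ' "${csv_file}"
}
```

For example, `check_video_list train.csv` prints every malformed line and returns a non-zero status if any was found.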

All the Kinetics models in the Model Zoo are trained and tested with the same data as Non-local Network. For dataset-specific issues, please reach out to the dataset provider.
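The resizing step in this section (short edge of 256) does not come with a command, so here is one possible sketch using ffmpeg's `scale` filter. The function names are hypothetical, and the codec and audio settings are left at ffmpeg's defaults, which may differ from what was used to prepare the Model Zoo data.

```bash
#!/usr/bin/env bash
# Build an ffmpeg scale filter that sets the shorter side to the given size.
# The other side is -2, which keeps the aspect ratio and rounds to an even number.
short_edge_filter() {
  local size="$1"
  printf "scale='if(gt(iw,ih),-2,%s)':'if(gt(iw,ih),%s,-2)'" "${size}" "${size}"
}

# Resize one video (assumption: default codec and audio settings are acceptable).
resize_short_edge() {
  local in_video="$1" out_video="$2" size="${3:-256}"
  ffmpeg -i "${in_video}" -vf "$(short_edge_filter "${size}")" "${out_video}"
}
```

A call such as `resize_short_edge input.mp4 output.mp4` would then write a copy whose shorter side is 256 pixels.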

AVA

The AVA dataset can be downloaded from the official site.

We follow the same download and preprocessing procedure as Long-Term Feature Banks for Detailed Video Understanding.

You can follow these steps to download and preprocess the data:

Download videos

DATA_DIR="../../data/ava/videos"

if [[ ! -d "${DATA_DIR}" ]]; then
  echo "${DATA_DIR} doesn't exist. Creating it.";
  mkdir -p ${DATA_DIR}
fi

wget https://s3.amazonaws.com/ava-dataset/annotations/ava_file_names_trainval_v2.1.txt

# Read one file name per line and download it into DATA_DIR.
while read -r line || [[ -n "${line}" ]]
do
  wget "https://s3.amazonaws.com/ava-dataset/trainval/${line}" -P "${DATA_DIR}"
done < ava_file_names_trainval_v2.1.txt

Cut each video from its 15th to 30th minute

IN_DATA_DIR="../../data/ava/videos"
OUT_DATA_DIR="../../data/ava/videos_15min"

if [[ ! -d "${OUT_DATA_DIR}" ]]; then
  echo "${OUT_DATA_DIR} doesn't exist. Creating it.";
  mkdir -p ${OUT_DATA_DIR}
fi

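# Iterate with a glob instead of parsing ls output, so unusual file names are safe.
for video in "${IN_DATA_DIR}"/*
do
  out_name="${OUT_DATA_DIR}/${video##*/}"
  if [ ! -f "${out_name}" ]; then
    # -ss 900 seeks to the 15th minute; -t 901 is a duration in seconds, not an end time.
    ffmpeg -ss 900 -t 901 -i "${video}" "${out_name}"
  fi
done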
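The arguments `-ss 900 -t 901` can be read as "start at second 900 and keep 901 seconds", that is, 15 minutes plus one extra second. The sketch below shows that arithmetic; `clip_args` is a hypothetical helper, and the one-second padding is simply taken from the command above, whose rationale is not stated here.

```bash
#!/usr/bin/env bash
# Convert a start/end minute pair into ffmpeg seek arguments.
# extra_sec is padding added to the duration (1 in the command above).
clip_args() {
  local start_min="$1" end_min="$2" extra_sec="${3:-0}"
  local start_sec=$(( start_min * 60 ))
  local duration=$(( (end_min - start_min) * 60 + extra_sec ))
  echo "-ss ${start_sec} -t ${duration}"
}

clip_args 15 30 1   # prints: -ss 900 -t 901
```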

Extract frames

IN_DATA_DIR="../../data/ava/videos_15min"
OUT_DATA_DIR="../../data/ava/frames"

if [[ ! -d "${OUT_DATA_DIR}" ]]; then
  echo "${OUT_DATA_DIR} doesn't exist. Creating it.";
  mkdir -p ${OUT_DATA_DIR}
fi

for video in "${IN_DATA_DIR}"/*
do
  # Strip the directory and the file extension (.webm, .mp4, .mkv, ...).
  video_name="${video##*/}"
  video_name="${video_name%.*}"

  out_video_dir="${OUT_DATA_DIR}/${video_name}"
  mkdir -p "${out_video_dir}"

  out_name="${out_video_dir}/${video_name}_%06d.jpg"

  # Extract frames at 30 FPS with a high JPEG quality setting.
  ffmpeg -i "${video}" -r 30 -q:v 1 "${out_name}"
done
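After extraction, a quick sanity check is to count the frames per video. A 901-second clip sampled at 30 FPS should yield roughly 901 × 30 = 27030 frames, although the exact number may vary slightly by video. The helper below is a hypothetical sketch for listing the counts so that obviously incomplete directories stand out.

```bash
#!/usr/bin/env bash
# Print "<video name> <number of .jpg frames>" for each directory under frames_root.
count_frames() {
  local frames_root="$1"
  local dir n
  for dir in "${frames_root}"/*/; do
    [ -d "${dir}" ] || continue
    n=$(find "${dir}" -maxdepth 1 -type f -name '*.jpg' | wc -l)
    # $(( n )) strips the padding that some wc implementations add.
    echo "$(basename "${dir}") $(( n ))"
  done
}
```

For example, `count_frames ../../data/ava/frames` prints one line per video directory.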

Download annotations

DATA_DIR="../../data/ava/annotations"

if [[ ! -d "${DATA_DIR}" ]]; then
  echo "${DATA_DIR} doesn't exist. Creating it.";
  mkdir -p ${DATA_DIR}
fi

wget https://research.google.com/ava/download/ava_train_v2.1.csv -P ${DATA_DIR}
wget https://research.google.com/ava/download/ava_val_v2.1.csv -P ${DATA_DIR}
wget https://research.google.com/ava/download/ava_action_list_v2.1_for_activitynet_2018.pbtxt -P ${DATA_DIR}
wget https://research.google.com/ava/download/ava_train_excluded_timestamps_v2.1.csv -P ${DATA_DIR}
wget https://research.google.com/ava/download/ava_val_excluded_timestamps_v2.1.csv -P ${DATA_DIR}

Download the "frame lists" (train, val) and put them in the frame_lists folder (see the structure below).
Download the person boxes (train, val, test) and put them in the annotations folder (see the structure below). If you prefer to use your own person detector, please see the details here.

Organize the downloaded AVA dataset in the following structure:

ava
|_ frames
|  |_ [video name 0]
|  |  |_ [video name 0]_000001.jpg
|  |  |_ [video name 0]_000002.jpg
|  |  |_ ...
|  |_ [video name 1]
|     |_ [video name 1]_000001.jpg
|     |_ [video name 1]_000002.jpg
|     |_ ...
|_ frame_lists
|  |_ train.csv
|  |_ val.csv
|_ annotations
   |_ [official AVA annotation files]
   |_ ava_train_predicted_boxes.csv
   |_ ava_val_predicted_boxes.csv
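The layout above can be checked with a small script before training. This is a minimal sketch: `check_ava_layout` is a hypothetical helper, and it only tests the entries named explicitly in the tree, not the official AVA annotation files or individual frames.

```bash
#!/usr/bin/env bash
# Report entries of the expected AVA layout that are missing under root.
# Returns 0 if everything listed here exists, 1 otherwise.
check_ava_layout() {
  local root="$1" missing=0 path
  for path in \
    "frames" \
    "frame_lists/train.csv" \
    "frame_lists/val.csv" \
    "annotations/ava_train_predicted_boxes.csv" \
    "annotations/ava_val_predicted_boxes.csv"; do
    if [ ! -e "${root}/${path}" ]; then
      echo "missing: ${root}/${path}"
      missing=1
    fi
  done
  return "${missing}"
}
```

For example, `check_ava_layout ../../data/ava` prints each missing entry and returns a non-zero status if anything is absent.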

If you need the AVA v2.2 annotations, replace v2.1 with v2.2. You can also download some pre-prepared annotations from here.

Charades

Please download the Charades RGB frames from the dataset provider.
Download the frame list from the following links: (train, val).

Please set DATA.PATH_TO_DATA_DIR to point to the folder containing the frame lists, and DATA.PATH_PREFIX to the folder containing RGB frames.

Something-Something V2

Please download the dataset and annotations from the dataset provider.
Download the frame list from the following links: (train, val).
Extract the frames at 30 FPS. (We used ffmpeg-4.1.3 with the command ffmpeg -i "${video}" -r 30 -q:v 1 "${out_name}" in our experiments.) Please put the frames in a structure consistent with the frame lists.

Please put all annotation JSON files and the frame lists in the same folder, and set DATA.PATH_TO_DATA_DIR to the path of that folder. Set DATA.PATH_PREFIX to the path of the folder containing the extracted frames.
