The Kinetics dataset can be downloaded via the code released by ActivityNet:
- Download the videos via the official scripts.
- After all the videos have been downloaded, resize each video so that its short edge is 256 pixels, then prepare the csv files for the training, validation, and testing sets as train.csv, val.csv, and test.csv. The format of each csv file is:
path_to_video_1 label_1
path_to_video_2 label_2
path_to_video_3 label_3
...
path_to_video_N label_N
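The resize and list-building steps above could be scripted roughly as follows. This is only a sketch under assumptions that are not stated in this document: the function names resize_short_edge and write_split_csv are hypothetical, the videos are assumed to be stored as ROOT/<class name>/<video file>, label ids are assigned by sorted class-name order starting at 0, and paths are assumed to contain no spaces (a space is the field separator). The label ids expected by a given pretrained model may follow a different mapping.

```shell
# Resize one video so that its short edge is 256 pixels (requires ffmpeg).
# -2 lets ffmpeg choose the other dimension, keeping the aspect ratio and an even size.
resize_short_edge() {
    ffmpeg -i "$1" -vf "scale='if(gt(iw,ih),-2,256)':'if(gt(iw,ih),256,-2)'" "$2"
}

# Write one "path label" line for every video found under ROOT/<class name>/.
write_split_csv() {
    local root="$1" out="$2" label=0 class_dir video
    : > "${out}"
    for class_dir in "${root}"/*/; do
        [ -d "${class_dir}" ] || continue
        for video in "${class_dir}"*; do
            [ -f "${video}" ] || continue
            printf '%s %s\n' "${video}" "${label}" >> "${out}"
        done
        label=$((label + 1))
    done
}
```

For example, write_split_csv /path/to/kinetics/train train.csv would produce the training list for that hypothetical folder layout.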
All the Kinetics models in the Model Zoo are trained and tested with the same data as the Non-local Network. For dataset-specific issues, please reach out to the dataset provider.
The AVA dataset can be downloaded from the official site.
We followed the same downloading and preprocessing procedure as Long-Term Feature Banks for Detailed Video Understanding.
You can follow these steps to download and preprocess the data:
- Download videos
DATA_DIR="../../data/ava/videos"

# Create the target directory if it does not exist yet.
if [[ ! -d "${DATA_DIR}" ]]; then
  echo "${DATA_DIR} doesn't exist. Creating it.";
  mkdir -p "${DATA_DIR}"
fi

# Fetch the list of train/val video file names, then download each video.
wget https://s3.amazonaws.com/ava-dataset/annotations/ava_file_names_trainval_v2.1.txt

for line in $(cat ava_file_names_trainval_v2.1.txt)
do
  wget "https://s3.amazonaws.com/ava-dataset/trainval/${line}" -P "${DATA_DIR}"
done
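
Downloads of this size can be interrupted, so it may help to check which listed files are still missing before moving on. The helper below is a minimal sketch and not part of the original procedure; the function name list_missing_videos is hypothetical, and it treats an empty file as missing.

```shell
# Print every file name from LIST_FILE that is missing (or empty) in VIDEO_DIR.
list_missing_videos() {
    local list_file="$1" video_dir="$2" name
    while IFS= read -r name || [ -n "${name}" ]; do
        [ -n "${name}" ] || continue
        [ -s "${video_dir}/${name}" ] || printf '%s\n' "${name}"
    done < "${list_file}"
}
```

For example: list_missing_videos ava_file_names_trainval_v2.1.txt ../../data/ava/videos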
- Cut each video from its 15th to 30th minute
IN_DATA_DIR="../../data/ava/videos"
OUT_DATA_DIR="../../data/ava/videos_15min"

if [[ ! -d "${OUT_DATA_DIR}" ]]; then
  echo "${OUT_DATA_DIR} doesn't exist. Creating it.";
  mkdir -p "${OUT_DATA_DIR}"
fi

for video in "${IN_DATA_DIR}"/*
do
  out_name="${OUT_DATA_DIR}/${video##*/}"
  # Skip videos that were already cut; keep 901 seconds starting at second 900.
  if [ ! -f "${out_name}" ]; then
    ffmpeg -ss 900 -t 901 -i "${video}" "${out_name}"
  fi
done
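
Since the cut keeps 901 seconds starting at second 900, each output clip should last about 901 seconds. The following sketch shows one way to check that with ffprobe; the function names clip_duration and duration_ok and the default tolerance of 1 second are assumptions made for illustration, not part of the original procedure.

```shell
# Print the container duration of a video in seconds (requires ffprobe).
clip_duration() {
    ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Succeed if DURATION is within TOLERANCE seconds of 901 (tolerance defaults to 1).
duration_ok() {
    awk -v d="$1" -v tol="${2:-1}" 'BEGIN {
        diff = d - 901
        if (diff < 0) diff = -diff
        exit !(diff <= tol)
    }'
}

# Example use on one clip (CLIP is a placeholder path):
#   duration_ok "$(clip_duration "${CLIP}")" || echo "unexpected length: ${CLIP}"
```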
- Extract frames
IN_DATA_DIR="../../data/ava/videos_15min"
OUT_DATA_DIR="../../data/ava/frames"

if [[ ! -d "${OUT_DATA_DIR}" ]]; then
  echo "${OUT_DATA_DIR} doesn't exist. Creating it.";
  mkdir -p "${OUT_DATA_DIR}"
fi

for video in "${IN_DATA_DIR}"/*
do
  video_name=${video##*/}

  # Strip the file extension: ".webm" is 5 characters, other extensions are assumed to be 4.
  if [[ $video_name = *".webm" ]]; then
    video_name=${video_name::-5}
  else
    video_name=${video_name::-4}
  fi

  out_video_dir=${OUT_DATA_DIR}/${video_name}/
  mkdir -p "${out_video_dir}"

  # Write frames at 30 FPS as <video name>_000001.jpg, <video name>_000002.jpg, ...
  out_name="${out_video_dir}/${video_name}_%06d.jpg"

  ffmpeg -i "${video}" -r 30 -q:v 1 "${out_name}"
done
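
Because each clip starts at second 900 of the original video and frames are written at 30 FPS with 1-based, six-digit numbering, a timestamp in the original video can be mapped to an approximate frame file. The sketch below only illustrates that arithmetic; the function names frame_index and frame_path are hypothetical, and the exact frame a particular data loader selects for a timestamp may differ by an offset.

```shell
# Index of the first frame extracted for TIMESTAMP (seconds in the original video).
# Assumes the clip starts at second 900, 30 FPS, and 1-based %06d numbering.
# awk is used so that zero-padded timestamps such as 0902 are read as decimal.
frame_index() {
    awk -v t="$1" 'BEGIN { print (t - 900) * 30 + 1 }'
}

# Path of that frame for VIDEO_NAME under FRAMES_DIR.
frame_path() {
    printf '%s/%s/%s_%06d.jpg\n' "$1" "$2" "$2" "$(frame_index "$3")"
}
```

With these assumptions, a 901-second clip yields roughly 901 * 30 = 27030 frames per video.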
- Download annotations
DATA_DIR="../../data/ava/annotations"

if [[ ! -d "${DATA_DIR}" ]]; then
  echo "${DATA_DIR} doesn't exist. Creating it.";
  mkdir -p "${DATA_DIR}"
fi

wget https://research.google.com/ava/download/ava_train_v2.1.csv -P "${DATA_DIR}"
wget https://research.google.com/ava/download/ava_val_v2.1.csv -P "${DATA_DIR}"
wget https://research.google.com/ava/download/ava_action_list_v2.1_for_activitynet_2018.pbtxt -P "${DATA_DIR}"
wget https://research.google.com/ava/download/ava_train_excluded_timestamps_v2.1.csv -P "${DATA_DIR}"
wget https://research.google.com/ava/download/ava_val_excluded_timestamps_v2.1.csv -P "${DATA_DIR}"
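
After downloading, a quick summary of an annotation file can confirm that it is complete and readable. The sketch below assumes the commonly documented AVA csv layout (video id, keyframe timestamp, box coordinates, action id, person id, with no header row), which this document does not spell out; the function name summarize_ava_csv is hypothetical.

```shell
# Print "<videos> <keyframes> <rows>" for an AVA annotation csv.
# Assumes column 1 is the video id and column 2 the keyframe timestamp.
summarize_ava_csv() {
    awk -F, '
        { videos[$1] = 1; keyframes[$1 "," $2] = 1; rows++ }
        END {
            nv = 0; for (v in videos) nv++
            nk = 0; for (k in keyframes) nk++
            print nv, nk, rows + 0
        }' "$1"
}
```

For example: summarize_ava_csv ../../data/ava/annotations/ava_val_v2.1.csv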
- Download "frame lists" (train, val) and put them in the frame_lists folder (see structure below).
- Download person boxes (train, val, test) and put them in the annotations folder (see structure below). If you prefer to use your own person detector, please see the details here.
Arrange the AVA dataset in the following structure:
ava
|_ frames
|  |_ [video name 0]
|  |  |_ [video name 0]_000001.jpg
|  |  |_ [video name 0]_000002.jpg
|  |  |_ ...
|  |_ [video name 1]
|     |_ [video name 1]_000001.jpg
|     |_ [video name 1]_000002.jpg
|     |_ ...
|_ frame_lists
|  |_ train.csv
|  |_ val.csv
|_ annotations
   |_ [official AVA annotation files]
   |_ ava_train_predicted_boxes.csv
   |_ ava_val_predicted_boxes.csv
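
The layout above can be verified with a small script before training. This is a minimal sketch that only checks the entries named explicitly in the structure; the function name check_ava_layout is hypothetical, and the official annotation files are not checked because their exact names depend on the AVA version used.

```shell
# Report missing pieces of the AVA layout under ROOT; return non-zero if any are missing.
check_ava_layout() {
    local root="$1" status=0 item
    for item in frames frame_lists/train.csv frame_lists/val.csv \
        annotations/ava_train_predicted_boxes.csv \
        annotations/ava_val_predicted_boxes.csv; do
        if [ ! -e "${root}/${item}" ]; then
            echo "missing: ${root}/${item}"
            status=1
        fi
    done
    return "${status}"
}
```

For example: check_ava_layout ../../data/ava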
You can also replace v2.1 with v2.2 if you need the AVA v2.2 annotations. You can also download some pre-prepared annotations from here.
- Please download the Charades RGB frames from the dataset provider.
- Download the frame lists from the following links: (train, val).
Please set DATA.PATH_TO_DATA_DIR to point to the folder containing the frame lists, and DATA.PATH_PREFIX to the folder containing the RGB frames.
- Please download the dataset and annotations from the dataset provider.
- Download the frame lists from the following links: (train, val).
- Extract the frames at 30 FPS. (We used ffmpeg-4.1.3 with the command
ffmpeg -i "${video}" -r 30 -q:v 1 "${out_name}"
in our experiments.) Please put the frames in a structure consistent with the frame lists.
Please put all annotation json files and the frame lists in the same folder, and set DATA.PATH_TO_DATA_DIR to that path. Set DATA.PATH_PREFIX to the path of the folder containing the extracted frames.
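
Before pointing the configuration at these folders, a quick check that they contain what is expected can save a failed run. The sketch below is illustrative only: the function name check_data_paths is hypothetical, and it assumes the frame lists are stored as .csv files. The first argument is the folder intended for DATA.PATH_TO_DATA_DIR and the second is the folder intended for DATA.PATH_PREFIX.

```shell
# Succeed only if DATA_DIR holds at least one .json and one .csv file
# and FRAME_DIR is an existing directory.
check_data_paths() {
    local data_dir="$1" frame_dir="$2" status=0
    ls "${data_dir}"/*.json > /dev/null 2>&1 \
        || { echo "no json annotations in ${data_dir}"; status=1; }
    ls "${data_dir}"/*.csv > /dev/null 2>&1 \
        || { echo "no csv frame lists in ${data_dir}"; status=1; }
    [ -d "${frame_dir}" ] \
        || { echo "no frame folder at ${frame_dir}"; status=1; }
    return "${status}"
}
```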