- Download the annotation files from [here].
- Extract the file and put the annotations in `HIT/data/jhmdb/annotations`.
- Download the JHMDB videos from their official website and format the `HIT/data/jhmdb/videos` directory as follows:

```
{video_name}
|_ 00001.png
|_ 00002.png
...
```
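If you need to generate the per-frame PNGs yourself, a minimal sketch is given below. It assumes `ffmpeg` is on your PATH, that the downloaded clips are `.avi` files, and that they live in a hypothetical `data/jhmdb/videos_raw` folder; adjust the paths and extension to your setup.

```python
# Sketch: dump each JHMDB video into a folder of zero-padded PNG frames.
# Assumes ffmpeg is installed; the input/output paths below are placeholders.
import subprocess
from pathlib import Path

video_root = Path("data/jhmdb/videos_raw")  # hypothetical location of the raw videos
frame_root = Path("data/jhmdb/videos")      # target layout: {video_name}/00001.png, ...

for video in sorted(video_root.rglob("*.avi")):
    out_dir = frame_root / video.stem
    out_dir.mkdir(parents=True, exist_ok=True)
    # %05d yields 00001.png, 00002.png, ... matching the expected layout.
    subprocess.run(["ffmpeg", "-i", str(video), str(out_dir / "%05d.png")], check=True)
```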
- Download Annotations. Download the AVA Actions annotations from the official dataset website. Organize the annotation files into the following structure:

```
AVA/
|_ annotations/
|  |_ ava_action_list_v2.2.pbtxt
|  |_ ava_action_list_v2.2_for_activitynet_2019.pbtxt
|  |_ ava_include_timestamps_v2.2.txt
|  |_ ava_train_excluded_timestamps_v2.2.csv
|  |_ ava_val_excluded_timestamps_v2.2.csv
|  |_ ava_train_v2.2.csv
|  |_ ava_val_v2.2.csv
```
- Download Videos. Download the list of training/validation file names from the CVDF repository and download all videos following the links provided there. Place the list file and video files as follows:

```
AVA/
|_ annotations/
|  |_ ava_file_names_trainval_v2.1.txt
|_ movies/
|  |_ trainval/
|  |  |_ <MOVIE-ID-1>.mp4
|  |  |_ ...
|  |  |_ <MOVIE-ID-N>.mp4
```
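If you prefer to script the download, the sketch below iterates over `ava_file_names_trainval_v2.1.txt` and fetches each movie. The S3 base URL is an assumption about the CVDF hosting; verify the actual links in the CVDF repository before relying on it.

```python
# Sketch: download the AVA trainval movies listed in the CVDF file-name list.
# BASE_URL is an assumption; confirm the current links in the CVDF repository.
import urllib.request
from pathlib import Path

BASE_URL = "https://s3.amazonaws.com/ava-dataset/trainval/"  # assumed hosting prefix
list_file = Path("data/AVA/annotations/ava_file_names_trainval_v2.1.txt")
movie_dir = Path("data/AVA/movies/trainval")
movie_dir.mkdir(parents=True, exist_ok=True)

for name in list_file.read_text().split():
    target = movie_dir / name
    if target.exists():
        continue  # skip movies that are already downloaded
    print(f"downloading {name} ...")
    urllib.request.urlretrieve(BASE_URL + name, target)
```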
- Create Symbolic Link. Create a symbolic link that references the AVA dataset directory by running the following commands.

```shell
cd /path/to/HIT
mkdir data
ln -s /path/to/AVA data/AVA
```
- Preprocess Videos. Run the following command to process the raw movies.

```shell
python tools/process_ava_videos.py \
  --movie_root data/AVA/movies/trainval \
  --clip_root data/AVA/clips/trainval \
  --kframe_root data/AVA/keyframes/trainval \
  --process_num $[`nproc`/2]
```
This script extracts video clips and key frames from the raw movies. Each video clip lasts exactly one second, covering seconds 895 to 1805 of each movie. All video clips are scaled so that the shorter side is no larger than 360 pixels and are transcoded to 25 fps. The first frame of each video clip is extracted as the key frame, following the definition in the AVA dataset. (Key frames are only used to detect persons and objects.) The output video clips and key frames are saved as follows:
```
AVA/
|_ clips/
|  |_ trainval/
|  |  |_ <MOVIE-ID-1>
|  |  |  |_ [895~1805].mp4
|  |  |_ ...
|  |  |_ <MOVIE-ID-N>
|  |  |  |_ [895~1805].mp4
|_ keyframes/
|  |_ trainval/
|  |  |_ <MOVIE-ID-1>
|  |  |  |_ [895~1805].jpg
|  |  |_ ...
|  |  |_ <MOVIE-ID-N>
|  |  |  |_ [895~1805].jpg
```
This processing can take a long time, so we also provide the processed key frames and clips for download (keyframes, clips).
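For reference only, the per-clip processing described above can be approximated with plain `ffmpeg` calls, roughly as sketched below. The `scale=-2:360` filter assumes landscape input, and the movie path and output naming are placeholders; `tools/process_ava_videos.py` remains the authoritative implementation.

```python
# Illustration: cut one-second clips for seconds 895-1805, transcode to 25 fps with the
# shorter side at 360 (landscape assumption), and grab the first frame as the key frame.
import subprocess
from pathlib import Path

movie = Path("data/AVA/movies/trainval/<MOVIE-ID>.mp4")  # placeholder movie path
clip_dir = Path("data/AVA/clips/trainval") / movie.stem
kframe_dir = Path("data/AVA/keyframes/trainval") / movie.stem
clip_dir.mkdir(parents=True, exist_ok=True)
kframe_dir.mkdir(parents=True, exist_ok=True)

for sec in range(895, 1806):
    clip = clip_dir / f"{sec}.mp4"
    # One-second clip starting at `sec`, 25 fps, height 360 / width chosen automatically.
    subprocess.run(
        ["ffmpeg", "-ss", str(sec), "-t", "1", "-i", str(movie),
         "-r", "25", "-vf", "scale=-2:360", "-y", str(clip)],
        check=True,
    )
    # The first frame of the clip serves as the AVA key frame.
    subprocess.run(
        ["ffmpeg", "-i", str(clip), "-vframes", "1", "-y", str(kframe_dir / f"{sec}.jpg")],
        check=True,
    )
```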
- Convert Annotations. Our code uses COCO-style annotations, so the official CSV annotations have to be converted into COCO JSON format by running the following commands.

```shell
python preprocess_data/ava/csv2COCO.py \
  --csv_path data/AVA/annotations/ava_train_v2.2.csv \
  --movie_list data/AVA/annotations/ava_file_names_trainval_v2.1.txt \
  --img_root data/AVA/keyframes/trainval
python preprocess_data/ava/csv2COCO.py \
  --csv_path data/AVA/annotations/ava_val_v2.2.csv \
  --movie_list data/AVA/annotations/ava_file_names_trainval_v2.1.txt \
  --img_root data/AVA/keyframes/trainval
```
The converted JSON files will be stored in the `AVA/annotations` directory as follows. `*_min.json` means that the JSON file is written without indentation. Alternatively, you can simply download our JSON files here (train, val).

```
AVA/
|_ annotations/
|  |_ ava_train_v2.2.json
|  |_ ava_train_v2.2_min.json
|  |_ ava_val_v2.2.json
|  |_ ava_val_v2.2_min.json
```
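As a small illustration of the `*_min.json` naming: both variants hold the same COCO-style content and differ only in serialization, as in the sketch below (the output file names are just examples).

```python
# Sketch: a *_min.json file is the same data dumped without indentation.
import json

with open("data/AVA/annotations/ava_val_v2.2.json") as f:
    coco_dict = json.load(f)

with open("ava_val_v2.2_pretty.json", "w") as f:  # human-readable, indented
    json.dump(coco_dict, f, indent=4)
with open("ava_val_v2.2_min.json", "w") as f:     # compact, no indentation
    json.dump(coco_dict, f)
```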
- Detect Persons and Objects. The predicted person boxes for the AVA validation set can be downloaded [here]. Note that we only use ground-truth person boxes for training. The object box files are also available for download (train, val). These files should be placed at the following locations:

```
AVA/
|_ boxes/
|  |_ ava_val_det_person_bbox.json
|  |_ ava_train_det_object_bbox.json
|  |_ ava_val_det_object_bbox.json
```
For the person detector, we first trained it on the MSCOCO keypoint dataset and then fine-tuned it on the AVA dataset. The final model weights are available [here].

For the object detector, we use the model provided in the maskrcnn-benchmark repository, which is trained on the MSCOCO dataset. Person boxes are removed from the predicted results.
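To illustrate the person-box removal mentioned above, a filter over COCO-style detection results might look like the sketch below; the list layout and the `category_id` convention (person = 1 in COCO) are assumptions about these files rather than their documented format.

```python
# Sketch: drop person detections from an assumed COCO-style detection-result list.
import json

with open("data/AVA/boxes/ava_val_det_object_bbox.json") as f:
    detections = json.load(f)

PERSON_CATEGORY_ID = 1  # COCO convention; an assumption about these files
object_only = [d for d in detections if d.get("category_id") != PERSON_CATEGORY_ID]

with open("ava_val_det_object_bbox_noperson.json", "w") as f:  # hypothetical output name
    json.dump(object_only, f)
```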
- Detect Keypoints. We use detectron2 as the pose detector. Please install the project as indicated in its GitHub repo. You can use `python preprocess_data/ava/keypoints_detection.py` as a reference to write a script that performs inference on AVA, or you can directly download our files here and put them in `data/AVA/annotations/`.
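For orientation, a minimal detectron2 keypoint-inference sketch is shown below. The model choice, key-frame path, and output handling are placeholders, and `preprocess_data/ava/keypoints_detection.py` remains the reference for the exact output format HIT expects.

```python
# Sketch: run a detectron2 COCO-Keypoints model on a single AVA key frame.
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # detection threshold; tune as needed
predictor = DefaultPredictor(cfg)

image = cv2.imread("data/AVA/keyframes/trainval/<MOVIE-ID>/895.jpg")  # placeholder key frame
instances = predictor(image)["instances"].to("cpu")
boxes = instances.pred_boxes.tensor.numpy()   # person boxes, shape (N, 4)
keypoints = instances.pred_keypoints.numpy()  # COCO keypoints, shape (N, 17, 3): x, y, score
print(boxes.shape, keypoints.shape)
```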
- PS: Some modifications to the project are needed to train on AVA (especially the files `hit/dataset/datasets/jhmdb.py` and `config_files/hitnet.yaml`). Please refer to AlphAction for inspiration on how to change them, or open an issue.