This is an updated version of 3D-ResNets-PyTorch.
This is the PyTorch code for the following papers:
This code includes training, fine-tuning and testing on Kinetics, Moments in Time, ActivityNet, UCF-101, and HMDB-51.
Note that this code may not exactly reproduce the results of the above papers because it includes some updates.
Pre-trained models are available here.
All models are trained on Kinetics.
ResNeXt-101 achieved the best performance in our experiments. (See the paper for details.)
resnet-18-kinetics.pth: --model resnet --model_depth 18 --resnet_shortcut A
resnet-34-kinetics.pth: --model resnet --model_depth 34 --resnet_shortcut A
resnet-34-kinetics-cpu.pth: CPU version of resnet-34-kinetics.pth
resnet-50-kinetics.pth: --model resnet --model_depth 50 --resnet_shortcut B
resnet-101-kinetics.pth: --model resnet --model_depth 101 --resnet_shortcut B
resnet-152-kinetics.pth: --model resnet --model_depth 152 --resnet_shortcut B
resnet-200-kinetics.pth: --model resnet --model_depth 200 --resnet_shortcut B
preresnet-200-kinetics.pth: --model preresnet --model_depth 200 --resnet_shortcut B
wideresnet-50-kinetics.pth: --model wideresnet --model_depth 50 --resnet_shortcut B --wide_resnet_k 2
resnext-101-kinetics.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32
densenet-121-kinetics.pth: --model densenet --model_depth 121
densenet-201-kinetics.pth: --model densenet --model_depth 201
Some fine-tuned models on UCF-101 and HMDB-51 (split 1) are also available.
resnext-101-kinetics-ucf101_split1.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32
resnext-101-64f-kinetics-ucf101_split1.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32 --sample_duration 64
resnext-101-kinetics-hmdb51_split1.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32
resnext-101-64f-kinetics-hmdb51_split1.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32 --sample_duration 64
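Each file name above is paired with the flags needed to rebuild the matching network. As a minimal sketch, the listing can be kept as a lookup table so that the right options are passed whenever a checkpoint is used. The table below copies only a subset of the entries, and the helper name `flags_for` is hypothetical; it is not part of this repository.

```python
import shlex

# Flags copied from the model list above (a subset, for illustration only).
MODEL_FLAGS = {
    "resnet-18-kinetics.pth": "--model resnet --model_depth 18 --resnet_shortcut A",
    "resnet-34-kinetics.pth": "--model resnet --model_depth 34 --resnet_shortcut A",
    "resnet-50-kinetics.pth": "--model resnet --model_depth 50 --resnet_shortcut B",
    "wideresnet-50-kinetics.pth": (
        "--model wideresnet --model_depth 50 --resnet_shortcut B --wide_resnet_k 2"
    ),
    "resnext-101-kinetics.pth": (
        "--model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32"
    ),
    "densenet-121-kinetics.pth": "--model densenet --model_depth 121",
}


def flags_for(checkpoint_name):
    """Return the architecture flags for a checkpoint as an argv-style list."""
    if checkpoint_name not in MODEL_FLAGS:
        raise ValueError("unknown checkpoint: " + checkpoint_name)
    return shlex.split(MODEL_FLAGS[checkpoint_name])


print(flags_for("resnext-101-kinetics.pth"))
```

Keeping the flags next to the file name avoids a mismatch such as loading a shortcut-B checkpoint into a shortcut-A network.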
This table shows the average of the top-1 and top-5 accuracies on Kinetics.
Method | Averaged accuracy (%) |
---|---|
ResNet-18 | 66.1 |
ResNet-34 | 71.0 |
ResNet-50 | 72.2 |
ResNet-101 | 73.3 |
ResNet-152 | 73.7 |
ResNet-200 | 73.7 |
ResNet-200 (pre-act) | 73.4 |
Wide ResNet-50 | 74.7 |
ResNeXt-101 | 75.4 |
DenseNet-121 | 70.8 |
DenseNet-201 | 72.3 |
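
The metric in the table is the mean of two numbers, top-1 and top-5 accuracy. The following is a minimal sketch of that arithmetic in plain Python; the scores and targets are made-up values for illustration, and the function names are hypothetical rather than taken from this repository.

```python
def top_k_correct(scores, target, k):
    """Return True if `target` is among the k highest-scoring class indices."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return target in ranked[:k]


def averaged_accuracy(all_scores, targets):
    """Return the mean of top-1 and top-5 accuracy, in percent."""
    n = len(targets)
    top1 = sum(top_k_correct(s, t, 1) for s, t in zip(all_scores, targets)) / n
    top5 = sum(top_k_correct(s, t, 5) for s, t in zip(all_scores, targets)) / n
    return 100.0 * (top1 + top5) / 2.0


# Two hypothetical clips scored over six classes.
scores = [
    [0.10, 0.50, 0.20, 0.04, 0.11, 0.05],  # target 1 is ranked first
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],  # target 2 is ranked third
]
targets = [1, 2]

print(averaged_accuracy(scores, targets))
```

Here top-1 accuracy is 50% and top-5 accuracy is 100%, so the averaged value is 75%.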
- PyTorch
- v1.0+
conda install pytorch torchvision cudatoolkit -c pytorch
- FFmpeg, FFprobe
wget http://johnvansickle.com/ffmpeg/releases/ffmpeg-release-64bit-static.tar.xz
tar xvf ffmpeg-release-64bit-static.tar.xz
cd ./ffmpeg-3.3.3-64bit-static/; sudo cp ffmpeg ffprobe /usr/local/bin;
- Python 3
- Download videos using the official crawler.
- Convert from avi to jpg files using utils/video_jpg.py
python utils/video_jpg.py avi_video_directory jpg_video_directory
- Generate fps files using utils/fps.py
python utils/fps.py avi_video_directory jpg_video_directory
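The two steps above rely on FFmpeg and FFprobe. As a rough sketch of what such a conversion involves (not necessarily what utils/video_jpg.py and utils/fps.py do), the commands could be assembled as below. The 240-pixel output height, the `image_%05d.jpg` file pattern, and all function names are assumptions made for illustration.

```python
import subprocess
from fractions import Fraction
from pathlib import Path


def ffmpeg_extract_command(video_path, dst_dir):
    """Build an ffmpeg command that writes every frame of a video as a JPEG.

    The scale filter and the output file pattern are assumptions.
    """
    return [
        "ffmpeg", "-i", str(video_path),
        "-vf", "scale=-1:240",
        str(Path(dst_dir) / "image_%05d.jpg"),
    ]


def ffprobe_fps_command(video_path):
    """Build an ffprobe command that prints the average frame rate as a ratio."""
    return [
        "ffprobe", "-v", "error", "-select_streams", "v:0",
        "-show_entries", "stream=avg_frame_rate",
        "-of", "default=noprint_wrappers=1:nokey=1",
        str(video_path),
    ]


def parse_frame_rate(text):
    """Convert ffprobe's rational output (for example '30000/1001') to a float."""
    return float(Fraction(text.strip()))


def probe_fps(video_path):
    """Run ffprobe on a video and return its frame rate (requires ffprobe)."""
    result = subprocess.run(
        ffprobe_fps_command(video_path),
        check=True, capture_output=True, text=True,
    )
    return parse_frame_rate(result.stdout)


print(ffmpeg_extract_command("avi_video_directory/video.avi", "jpg_video_directory/video"))
print(parse_frame_rate("30000/1001"))
```

Building the command as a list rather than a shell string avoids quoting problems with video names that contain spaces.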
- Download videos using the official crawler.
- Place the test set in video_directory/test.
- Convert from avi to jpg files using utils/video_jpg_kinetics.py
python utils/video_jpg_kinetics.py avi_video_directory jpg_video_directory
- Generate n_frames files using utils/n_frames_kinetics.py
python utils/n_frames_kinetics.py jpg_video_directory
- Generate annotation file in json format similar to ActivityNet using utils/kinetics_json.py
- The CSV files (kinetics_{train, val, test}.csv) are included in the crawler.
python utils/kinetics_json.py train_csv_path val_csv_path test_csv_path dst_json_path
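To illustrate the last step, here is a small sketch of how CSV rows might be turned into an ActivityNet-style JSON annotation. The column names (`label`, `youtube_id`, `time_start`, `time_end`), the `<id>_<start>_<end>` video key, and the `labels`/`database` layout are assumptions about the file formats; check them against utils/kinetics_json.py and the crawler's CSV files before relying on them.

```python
import csv
import io
import json


def rows_to_database(csv_text, subset):
    """Map each CSV row to an entry keyed by an assumed video identifier."""
    database = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = "{}_{:06d}_{:06d}".format(
            row["youtube_id"], int(row["time_start"]), int(row["time_end"])
        )
        entry = {"subset": subset}
        if row.get("label"):  # test rows may carry no label
            entry["annotations"] = {"label": row["label"]}
        database[key] = entry
    return database


def build_annotation(train_csv, val_csv):
    """Merge training and validation rows into one annotation dictionary."""
    database = {}
    database.update(rows_to_database(train_csv, "training"))
    database.update(rows_to_database(val_csv, "validation"))
    labels = sorted(
        {e["annotations"]["label"] for e in database.values() if "annotations" in e}
    )
    return {"labels": labels, "database": database}


# Hypothetical rows; real ids and labels come from the crawler's CSV files.
HEADER = "label,youtube_id,time_start,time_end,split\n"
train_csv = HEADER + "riding a bike,vid_a,10,20,train\n"
val_csv = HEADER + "juggling balls,vid_b,0,10,val\n"

print(json.dumps(build_annotation(train_csv, val_csv), indent=2))
```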
- Download videos and train/test splits here.
- Convert from avi to jpg files using utils/video_jpg_ucf101_hmdb51.py
python utils/video_jpg_ucf101_hmdb51.py avi_video_directory jpg_video_directory
- Generate n_frames files using utils/n_frames_ucf101_hmdb51.py
python utils/n_frames_ucf101_hmdb51.py jpg_video_directory
- Generate annotation file in json format similar to ActivityNet using utils/ucf101_json.py
- annotation_dir_path includes classInd.txt, trainlist0{1, 2, 3}.txt, testlist0{1, 2, 3}.txt
python utils/ucf101_json.py annotation_dir_path
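The UCF-101 split files are plain text, so the annotation step mostly comes down to parsing them. The sketch below assumes the usual layout of those files (`<index> <class name>` in classInd.txt and `<class>/<video>.avi` paths in the train and test lists); the function names are hypothetical and the output layout is only an approximation of what utils/ucf101_json.py produces.

```python
from pathlib import PurePosixPath


def parse_class_index(text):
    """Parse classInd.txt lines of the form '<index> <class name>'."""
    mapping = {}
    for line in text.splitlines():
        if line.strip():
            index, name = line.split()
            mapping[int(index)] = name
    return mapping


def parse_split_list(text, subset):
    """Parse trainlist/testlist lines; the class is the leading directory name."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        rel_path = PurePosixPath(line.split()[0])  # drop the optional class index
        entries[rel_path.stem] = {
            "subset": subset,
            "annotations": {"label": rel_path.parent.name},
        }
    return entries


class_ind = "1 ApplyEyeMakeup\n2 ApplyLipstick\n"
train_list = "ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1\n"
test_list = "ApplyLipstick/v_ApplyLipstick_g01_c01.avi\n"

database = {}
database.update(parse_split_list(train_list, "training"))
database.update(parse_split_list(test_list, "validation"))
print(parse_class_index(class_ind))
print(database)
```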
- Download videos and train/test splits here.
- Convert from avi to jpg files using utils/video_jpg_ucf101_hmdb51.py
python utils/video_jpg_ucf101_hmdb51.py avi_video_directory jpg_video_directory
- Generate n_frames files using utils/n_frames_ucf101_hmdb51.py
python utils/n_frames_ucf101_hmdb51.py jpg_video_directory
- Generate annotation file in json format similar to ActivityNet using utils/hmdb51_json.py
- annotation_dir_path includes brush_hair_test_split1.txt, ...
python utils/hmdb51_json.py annotation_dir_path
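HMDB-51 stores one split file per class, such as brush_hair_test_split1.txt. The sketch below assumes the usual format of those files, where each line is `<video>.avi <id>` and the id is 1 for training, 2 for testing, and 0 for clips left out of the split. The clip names and function names are hypothetical, and the output layout is an approximation rather than the exact output of utils/hmdb51_json.py.

```python
SUBSET_BY_ID = {"1": "training", "2": "validation"}


def class_from_split_file(file_name):
    """Recover the class name from '<class>_test_split<N>.txt'."""
    return file_name.rsplit("_test_split", 1)[0]


def parse_hmdb_split(file_name, text):
    """Return {video_id: entry} for the clips used in this split."""
    label = class_from_split_file(file_name)
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or parts[1] not in SUBSET_BY_ID:
            continue  # blank line, or id 0 (clip not used in this split)
        video_id = parts[0].rsplit(".", 1)[0]
        entries[video_id] = {
            "subset": SUBSET_BY_ID[parts[1]],
            "annotations": {"label": label},
        }
    return entries


# Hypothetical clip names for illustration.
split_text = "clip_a.avi 1\nclip_b.avi 2\nclip_c.avi 0\n"
print(parse_hmdb_split("brush_hair_test_split1.txt", split_text))
```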
Assume the structure of data directories is the following:
~/
  data/
    kinetics_videos/
      jpg/
        .../ (directories of class names)
          .../ (directories of video names)
            ... (jpg files)
    results/
      save_100.pth
    kinetics.json
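Before training, it can help to check that the jpg directory really follows the class/video/frames layout shown above. The following is a minimal sketch that walks such a tree and counts the JPEG frames in each video directory, which is roughly the information the n_frames scripts record. The helper names and the demo class and video names are hypothetical.

```python
import tempfile
from pathlib import Path


def count_frames(jpg_root):
    """Return {(class_name, video_name): number_of_jpg_files} for a jpg/ tree."""
    counts = {}
    for class_dir in sorted(p for p in Path(jpg_root).iterdir() if p.is_dir()):
        for video_dir in sorted(p for p in class_dir.iterdir() if p.is_dir()):
            n_jpg = sum(1 for f in video_dir.iterdir() if f.suffix.lower() == ".jpg")
            counts[(class_dir.name, video_dir.name)] = n_jpg
    return counts


def make_demo_tree(root, n_frames=3):
    """Create a tiny tree with a hypothetical class and video name."""
    video_dir = Path(root) / "some_class" / "some_video"
    video_dir.mkdir(parents=True)
    for i in range(1, n_frames + 1):
        (video_dir / "image_{:05d}.jpg".format(i)).touch()


with tempfile.TemporaryDirectory() as tmp:
    make_demo_tree(tmp)
    print(count_frames(tmp))
```

A video directory that reports zero frames usually points to a failed conversion and is worth fixing before training starts.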
Confirm all options.
python main.py -h
Train ResNet-34 on the Kinetics dataset (400 classes) with 4 CPU threads (for data loading).
Batch size is 128.
Models are saved every 5 epochs.
All GPUs are used for training.
To use only some of the GPUs, set CUDA_VISIBLE_DEVICES=...
python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --model resnet \
--model_depth 34 --n_classes 400 --batch_size 128 --n_threads 4 --checkpoint 5
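When the training command is launched from a script, the GPU restriction mentioned above can be applied through the child process environment. This is a minimal sketch: the helper names are hypothetical, the device indices are examples, and `launch_training` is expected to be called from the repository root.

```python
import os
import subprocess
import sys

# The training command from above, as an argv-style list.
TRAIN_ARGS = [
    "main.py", "--root_path", os.path.expanduser("~/data"),
    "--video_path", "kinetics_videos/jpg", "--annotation_path", "kinetics.json",
    "--result_path", "results", "--dataset", "kinetics", "--model", "resnet",
    "--model_depth", "34", "--n_classes", "400", "--batch_size", "128",
    "--n_threads", "4", "--checkpoint", "5",
]


def env_with_gpus(gpu_ids, base_env=None):
    """Copy the environment and restrict CUDA to the given device indices."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def launch_training(gpu_ids):
    """Run the training command with only the listed GPUs visible."""
    return subprocess.run(
        [sys.executable] + TRAIN_ARGS, env=env_with_gpus(gpu_ids), check=True
    )


print(env_with_gpus([0, 1], base_env={})["CUDA_VISIBLE_DEVICES"])
```

The variable has to be set before the process initializes CUDA, which is why it is placed in the environment of the new process rather than changed afterwards.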
Continue training from epoch 101. (~/data/results/save_100.pth is loaded.)
python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --resume_path results/save_100.pth \
--model_depth 34 --n_classes 400 --batch_size 128 --n_threads 4 --checkpoint 5
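To pick the file for --resume_path automatically, the newest checkpoint in the results directory can be found from its file name. The sketch below assumes checkpoints are named `save_<epoch>.pth`, as in the example above, and that one is written whenever the epoch number is a multiple of the --checkpoint value; both helper names are hypothetical.

```python
import re

CHECKPOINT_RE = re.compile(r"save_(\d+)\.pth$")


def saved_epochs(n_epochs, checkpoint):
    """List the epochs at which a checkpoint would be written (assumed rule)."""
    return [e for e in range(1, n_epochs + 1) if e % checkpoint == 0]


def latest_checkpoint(file_names):
    """Return (epoch, file_name) for the highest-numbered checkpoint, or None."""
    best = None
    for name in file_names:
        match = CHECKPOINT_RE.search(name)
        if match and (best is None or int(match.group(1)) > best[0]):
            best = (int(match.group(1)), name)
    return best


names = ["save_5.pth", "save_95.pth", "save_100.pth", "train.log"]
epoch, name = latest_checkpoint(names)
print("resume from", name, "and continue at epoch", epoch + 1)
```

Comparing the epoch as an integer matters here, because a plain string sort would rank save_95.pth after save_100.pth.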
Fine-tune the conv5_x and fc layers of a pretrained model (~/data/models/resnet-34-kinetics.pth) on UCF-101.
python main.py --root_path ~/data --video_path ucf101_videos/jpg --annotation_path ucf101_01.json \
--result_path results --dataset ucf101 --n_classes 400 --n_finetune_classes 101 \
--pretrain_path models/resnet-34-kinetics.pth --ft_begin_index 4 \
--model resnet --model_depth 34 --resnet_shortcut A --batch_size 128 --n_threads 4 --checkpoint 5
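The command above uses --ft_begin_index 4 to train only conv5_x and fc. The sketch below shows one way such an index could select parameters by name. It assumes the residual stages are named `layer1` to `layer4` (so that conv5_x corresponds to `layer4`) and that an index of 0 means the whole network is trained; these are assumptions about the model code, and the function names are hypothetical.

```python
def fine_tuning_modules(ft_begin_index, n_stages=4):
    """Return the top-level module names that stay trainable."""
    stages = ["layer{}".format(i) for i in range(ft_begin_index, n_stages + 1)]
    return stages + ["fc"]


def split_parameters(param_names, ft_begin_index):
    """Split parameter names into (trainable, frozen) lists."""
    if ft_begin_index == 0:  # assumed convention: fine-tune everything
        return list(param_names), []
    modules = fine_tuning_modules(ft_begin_index)
    trainable, frozen = [], []
    for name in param_names:
        top_level = name.split(".", 1)[0]
        (trainable if top_level in modules else frozen).append(name)
    return trainable, frozen


# Hypothetical parameter names in the style of a PyTorch state dict.
names = [
    "conv1.weight",
    "layer3.0.conv1.weight",
    "layer4.0.conv1.weight",
    "fc.weight",
    "fc.bias",
]
trainable, frozen = split_parameters(names, ft_begin_index=4)
print("trainable:", trainable)
print("frozen:", frozen)
```

With index 4, only the last residual stage and the classifier are updated, which suits a small target dataset such as UCF-101.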