PyTorch Implementation of Elliott Loveridge's master's thesis, including code for video classification and model compression.
- PyTorch 1.0.1.post2
- OpenCV
- FFmpeg, FFprobe
- Python 3
- Distiller
- Docker
Pre-trained models can be downloaded from a Google Drive here.
Tested models:
- 3D MobileNetv2
- 3D ResNet
- 3D CSN
MobileNetV2's complexity may be adjusted via the '--width_mult' argument, while network depth is selected via '--model_depth' for both ResNet and CSN.
If using Bath University's OGG service, the data has already been saved in the following directory:
/mnt/slow0/ucf101/data/
- Download videos and train/test splits here.
- Convert from avi to jpg files using utils/video_jpg_ucf101_hmdb51.py
python utils/video_jpg_ucf101_hmdb51.py avi_video_directory jpg_video_directory
- Generate n_frames files using utils/n_frames_ucf101_hmdb51.py
python utils/n_frames_ucf101_hmdb51.py jpg_video_directory
- Generate an annotation file in json format (similar to ActivityNet) using utils/ucf101_json.py; annotation_dir_path includes classInd.txt, trainlist0{1, 2, 3}.txt, and testlist0{1, 2, 3}.txt
python utils/ucf101_json.py annotation_dir_path
All experiments use annotation file 1, saved within this repo.
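Putting the three preparation steps above together, a full run might look like the following; the avi source directory and annotation directory names are placeholders and should be changed to match your own layout.

```bash
# End-to-end UCF-101 preparation (the avi and annotation paths below are placeholders)
python utils/video_jpg_ucf101_hmdb51.py /data/ucf101_avi /data/ucf101_videos/jpg
python utils/n_frames_ucf101_hmdb51.py /data/ucf101_videos/jpg
python utils/ucf101_json.py /data/ucf101_annotations
```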
Data is assumed to be stored under the following schema:
/data/
    ucf101_videos/
        jpg/
            'ucf101-class'/
    results/
        benchmark/
            DDMM/
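As a minimal sketch, the fixed parts of this layout could be created up front; the 'ucf101-class' folders are produced by the jpg conversion step, and the DDMM folders are created by the evaluation script.

```bash
# Create the fixed parts of the expected layout (class and DDMM folders are generated later)
mkdir -p /data/ucf101_videos/jpg
mkdir -p /data/results/benchmark
```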
This project was maintained within Docker to ensure the correct installation of Distiller and other relevant packages. If running this code on Bath University's OGG service, reference the relevant Docker image as shown in the example runs below. The Dockerfile has been provided should you wish to create a similar image yourself.
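If you prefer to build an equivalent image yourself, a standard build from the provided Dockerfile should suffice; the tag below is only an example.

```bash
# Build a local image from the repository's Dockerfile (the tag name is arbitrary)
docker build -t compressed-3d-cnn .
```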
Model configurations are given as follows:
ResNet-18 : --model resnet --model_depth 18 --resnet_shortcut A
ResNet-50 : --model resnet --model_depth 50 --resnet_shortcut B
ResNet-101 : --model resnet --model_depth 101 --resnet_shortcut B
MobileNetV2-1.0x : --model mobilenetv2 --width_mult 1.0
CSN-50 : --model csn --model_depth 50
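Other depths and width multipliers follow the same pattern, for example (treat these values as illustrative and confirm what is supported in opts.py and the model definitions):
ResNet-34 : --model resnet --model_depth 34 --resnet_shortcut A
MobileNetV2-0.7x : --model mobilenetv2 --width_mult 0.7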
Example code runs are saved in ogg-run.sh, and it is assumed this script is used to run tests. Make sure to specify all parameters required for a given run, and make sure the bash script is executable: use 'chmod +x ogg-run.sh'.
An example run is given as follows:
- Docker Code:
docker run --rm --ipc=host --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 -v "$(pwd)":/app -v "/mnt/slow0/ucf101/data":/data elliottloveridge/distiller /app/ogg-run.sh
- NVIDIA_VISIBLE_DEVICES defines which available GPU to use
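NVIDIA_VISIBLE_DEVICES follows the standard NVIDIA container runtime convention, so a different card (or several) can be selected by changing its value, for example:

```bash
# Same run on GPU 1; a comma-separated list such as 0,1 exposes multiple GPUs
docker run --rm --ipc=host --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=1 \
  -v "$(pwd)":/app -v "/mnt/slow0/ucf101/data":/data \
  elliottloveridge/distiller /app/ogg-run.sh
```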
ogg-run.sh then contains examples such as the following:
- Training from scratch:
python /app/compressed-3d-cnn/main.py --root_path /data \
--video_path ucf101_videos/jpg/ \
--annotation_path /app/compressed-3d-cnn/annotation_UCF101/ucf101_01.json \
--result_path results \
--dataset ucf101 \
--n_classes 101 \
--batch_size 32 \
--model mobilenetv2 \
--width_mult 1.0 \
--learning_rate 0.1 \
--n_val_samples 1 \
--n_epochs 20 \
--test
- Evaluation:
python /app/compressed-3d-cnn/utils/video_accuracy.py --root_path /data \
--annotation_path /app/compressed-3d-cnn/annotation_UCF101/ucf101_01.json \
--dataset ucf101 \
--result_path results
- Evaluation will create a DDMM folder (see the data schema above) and store it within the relevant sub-folder (this would be 'benchmark' for the above example)
- Example runs for model-compression methods are saved within ogg-run.sh; see opts.py for all required compression arguments
- The distiller folder within compressed-3d-cnn contains the YAML files required by the compression scheduler
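As a rough sketch only, a compression run might pair the usual training arguments with one of these YAML schedules; note that the --compress flag name and the YAML filename below are assumptions based on Distiller's sample interface, so check opts.py for the actual argument names.

```bash
# Hypothetical compression run - verify the flag name and YAML path against opts.py and the distiller/ folder
python /app/compressed-3d-cnn/main.py --root_path /data \
  --video_path ucf101_videos/jpg/ \
  --annotation_path /app/compressed-3d-cnn/annotation_UCF101/ucf101_01.json \
  --result_path results \
  --dataset ucf101 --n_classes 101 \
  --model mobilenetv2 --width_mult 1.0 \
  --compress /app/compressed-3d-cnn/distiller/pruning_schedule.yaml
```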
There are several augmentation techniques available. Please check spatial_transforms.py and temporal_transforms.py for the details of the augmentation methods.
Note: "RandomHorizontalFlip" and "RandomCrop" were used for training of UCF101
To calculate video accuracy, you should first run the model in '--test' mode to create 'val.json'. Then run the evaluation script, as in the example above and the sketch below.
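Putting the two steps together, an accuracy calculation is roughly:

```bash
# Step 1: produce val.json by running a model in --test mode (full arguments as in the training example above)
python /app/compressed-3d-cnn/main.py --root_path /data ... --test
# Step 2: compute video accuracy from val.json
python /app/compressed-3d-cnn/utils/video_accuracy.py --root_path /data \
  --annotation_path /app/compressed-3d-cnn/annotation_UCF101/ucf101_01.json \
  --dataset ucf101 --result_path results
```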
I'd like to thank Kensho Hara for releasing his codebase, the people who extended this work, and the team working on Distiller, whose library allowed model compression to be implemented.