Gesture Recognition

Models that recognize gestures from a live video stream on CPU.

MS-ASL-100 gesture set (continuous scenario)

Model Name	Complexity (GFLOPs)	Size (Mp)	Top-1 accuracy	Links	GPU_NUM
s3d-rgb-mobilenet-v3-stream-msasl	6.66	4.133	84.7%	model template, snapshot	2

Jester-27 gesture set (continuous scenario)

Model Name	Complexity (GFLOPs)	Size (Mp)	Top-1 accuracy	Links	GPU_NUM
s3d-rgb-mobilenet-v3-stream-jester	4.23	4.133	93.58%	model template, snapshot	4

Datasets

Target datasets:

MS-ASL - for MS-ASL-100 gesture models
Jester - for Jester-27 gesture models

Training pipeline

0. Change directory in your terminal to action_recognition_2.

cd <training_extensions>/pytorch_toolkit/action_recognition_2

If you have not created a virtual environment yet:

./init_venv.sh

Otherwise, activate the existing one:

. venv/bin/activate

Or, if you use conda:

conda activate <environment_name>

1. Select a model template file and instantiate it in some directory.

export MODEL_TEMPLATE=`realpath ./model_templates/gesture_recognition/s3d-rgb-mobilenet-v3-stream-msasl/template.yaml`
export WORK_DIR=/tmp/my_model
python ../tools/instantiate_template.py ${MODEL_TEMPLATE} ${WORK_DIR}

2. Prepare data

Target datasets:

To prepare the MS-ASL data, follow the instructions in DATA_MSASL.md.
To prepare the Jester data, follow the instructions in DATA_JESTER.md.

3. Change the current directory to the directory where the model template has been instantiated.

cd ${WORK_DIR}

4. Training and Fine-tuning

Try both of the following variants and select the better one:

Training from scratch or from pre-trained weights. Use this variant only if you have a lot of data, say tens of thousands of images or more. It assumes a long training process that starts from a large learning rate and gradually decreases it according to a training schedule.
Fine-tuning from pre-trained weights. If the dataset is not big enough, the model tends to overfit quickly, forgetting the data that was used for pre-training and losing generalization ability. Hence, a small starting learning rate and a short training schedule are recommended.
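
The difference between the two variants can be illustrated with a simple step-decay schedule. This is only a sketch: the function name `step_decay_lr`, the milestones, the decay factor, and the learning-rate values are all illustrative assumptions, and the schedule actually used by the toolkit is defined in the model template and may differ.

```python
def step_decay_lr(base_lr, epoch, milestones, gamma=0.1):
    """Return the learning rate for `epoch` under step decay.

    The rate is multiplied by `gamma` once for every milestone
    that the current epoch has reached.
    """
    drops = sum(1 for milestone in milestones if epoch >= milestone)
    return base_lr * gamma ** drops


# Long schedule (training): large starting rate, decayed twice.
# All numbers here are placeholders, not values from the template.
long_schedule = [step_decay_lr(0.01, e, milestones=(30, 60)) for e in (0, 30, 60)]

# Short schedule (fine-tuning): 10x smaller starting rate, one decay.
short_schedule = [step_decay_lr(0.001, e, milestones=(5,)) for e in (0, 5)]

print(long_schedule)
print(short_schedule)
```

The fine-tuning schedule starts where the long schedule only arrives after its first decay, which is the intent behind the recommendation of a small starting learning rate.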

If you would like to start training from pre-trained weights, use the --load-weights parameter with imagenet1000-kinetics700-snapshot.pth (it can be downloaded for any s3d-rgb-mobilenet-v3-stream-XXX model).

If you would like to start fine-tuning from pre-trained weights, use the --load-weights parameter with snapshot.pth.

python train.py \
   --load-weights ${WORK_DIR}/snapshot.pth \
   --train-ann-files ${TRAIN_ANN_FILE} \
   --train-data-roots ${TRAIN_DATA_ROOT} \
   --val-ann-files ${VAL_ANN_FILE} \
   --val-data-roots ${VAL_DATA_ROOT} \
   --save-checkpoints-to ${WORK_DIR}/outputs

NOTE: During fine-tuning, it is recommended to set --base-learning-rate lower than the default value (see ${MODEL_TEMPLATE}) to prevent the model from forgetting what it learned in pre-training during the first iterations.

You can also override parameters such as --epochs, --batch-size, --gpu-num, and --base-learning-rate; otherwise, default values are loaded from ${MODEL_TEMPLATE}.
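
As a rough sketch of how these overrides combine with the training command above, the following helper assembles the argument list for train.py. The helper name `build_train_command`, the annotation and data paths, and the learning-rate and epoch values are hypothetical placeholders; only the command-line flags come from this README.

```python
import shlex


def build_train_command(work_dir, train_ann, train_root, val_ann, val_root,
                        weights_name="snapshot.pth",
                        base_learning_rate=None, epochs=None):
    """Assemble the train.py argument list; optional flags are added only if set."""
    cmd = [
        "python", "train.py",
        "--load-weights", f"{work_dir}/{weights_name}",
        "--train-ann-files", train_ann,
        "--train-data-roots", train_root,
        "--val-ann-files", val_ann,
        "--val-data-roots", val_root,
        "--save-checkpoints-to", f"{work_dir}/outputs",
    ]
    # Omitted options fall back to the defaults from the model template.
    if base_learning_rate is not None:
        cmd += ["--base-learning-rate", str(base_learning_rate)]
    if epochs is not None:
        cmd += ["--epochs", str(epochs)]
    return cmd


# Placeholder paths and values, chosen only for illustration.
cmd = build_train_command(
    "/tmp/my_model", "train.txt", "data/train", "val.txt", "data/val",
    base_learning_rate=0.001, epochs=5,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from inside ${WORK_DIR}. Building a list rather than a single string avoids quoting problems when paths contain spaces.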

5. Evaluation

The evaluation procedure reports quality metrics and complexity figures such as the number of parameters and FLOPs.

To compute the mean accuracy metric, run:

python eval.py \
   --load-weights ${WORK_DIR}/outputs/latest.pth \
   --test-ann-files ${TEST_ANN_FILE} \
   --test-data-roots ${TEST_DATA_ROOT} \
   --save-metrics-to ${WORK_DIR}/metrics.yaml
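
This README does not define "mean accuracy" precisely. One common reading is per-class accuracy averaged over classes, which differs from plain top-1 accuracy when classes are imbalanced. The sketch below shows both under that assumption; the function names are hypothetical and this is not necessarily what eval.py computes.

```python
from collections import defaultdict


def mean_class_accuracy(labels, predictions):
    """Average the per-class accuracy over all classes present in `labels`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for label, prediction in zip(labels, predictions):
        total[label] += 1
        if label == prediction:
            correct[label] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def top1_accuracy(labels, predictions):
    """Fraction of samples whose predicted class matches the label."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


# Toy data: class 1 is rare and always misclassified.
labels = [0, 0, 0, 1]
predictions = [0, 0, 0, 0]
print(mean_class_accuracy(labels, predictions))  # 0.5
print(top1_accuracy(labels, predictions))        # 0.75
```

On the toy data, top-1 accuracy looks reasonable while the class-averaged value exposes that one class is never recognized, which is relevant for gesture sets where some classes have few samples.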

6. Export PyTorch* model to the OpenVINO™ format

To convert the PyTorch* model to the OpenVINO™ IR format, run the export.py script:

python export.py \
   --load-weights ${WORK_DIR}/outputs/latest.pth \
   --save-model-to ${WORK_DIR}/export

This produces the model file model.xml and the weights file model.bin in single-precision floating-point format (FP32). The exported model expects a normalized image in planar RGB format.
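
To make "normalized image in planar RGB format" concrete, the sketch below converts an interleaved frame (height x width x channel) into three separate channel planes in R, G, B order. It assumes the input frame is in BGR order, as OpenCV typically delivers, and uses placeholder mean and scale values; the real normalization constants should be taken from the model configuration. The function name `to_planar_rgb` is hypothetical, and plain Python lists are used only to keep the example dependency-free (with NumPy, the same operation is a channel reversal followed by a transpose).

```python
def to_planar_rgb(frame_bgr, mean=(0.0, 0.0, 0.0), scale=(255.0, 255.0, 255.0)):
    """Convert an interleaved H x W x 3 BGR frame to normalized planar RGB (3 x H x W).

    `mean` and `scale` are given in R, G, B order and are placeholder values.
    """
    height = len(frame_bgr)
    width = len(frame_bgr[0])
    planes = []
    # Output plane 0 (R) reads input channel 2, plane 1 (G) reads 1, plane 2 (B) reads 0.
    for rgb_index, bgr_index in enumerate((2, 1, 0)):
        plane = [
            [(frame_bgr[y][x][bgr_index] - mean[rgb_index]) / scale[rgb_index]
             for x in range(width)]
            for y in range(height)
        ]
        planes.append(plane)
    return planes


# A 1 x 2 frame in BGR order: a pure blue pixel followed by a pure green pixel.
frame = [[(255, 0, 0), (0, 255, 0)]]
planar = to_planar_rgb(frame)
print(planar)  # [[[0.0, 0.0]], [[0.0, 1.0]], [[1.0, 0.0]]]
```

In the output, the first plane holds all red values, the second all green values, and the third all blue values, instead of the three values of each pixel being stored next to each other.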

7. Demo

OpenVINO™ provides the Gesture Recognition demo, which can use the converted model. See the demo for details.
