This repository contains the PyTorch implementation of our paper "Super-Realtime Facial Landmark Detection and Shape Fitting by Deep Regression of Shape Model Parameters".
To install the latest release from PyPI:

```
pip install shapenet
```

To install the latest version directly from GitHub:

```
pip install git+https://github.com/justusschock/shapenet
```
Demonstration videos comparing our method to dlib can be found here as overlay and here as side-by-side view.
For simplicity, we provide several scripts to preprocess the data, train networks, predict from networks, and export the networks via `torch.jit`. To get a list of the necessary and accepted arguments, run a script with the `-h` flag.
* `prepare_all_data`: prepares multiple datasets (the datasets to preprocess can be selected via arguments passed to this script)
* `prepare_cat_dset`: downloads and preprocesses the Cat Dataset
* `prepare_helen_dset`: preprocesses an already downloaded ZIP file of the HELEN dataset (downloading from here is recommended, since this source already contains the landmarks)
* `prepare_lfpw_dset`: preprocesses an already downloaded ZIP file of the LFPW dataset (downloading from here is recommended, since this source already contains the landmarks)
* `train_shapenet`: trains the shapenet with the configuration specified in a separate configuration file (example configurations for all available datasets are provided in the `example_configs` folder)
* `predict_from_net`: predicts landmarks for all images in a given directory (assumes existing ground truths for cropping; otherwise, the cropping to ground truth could be replaced by a detector)
* `export_to_jit`: traces the given model and saves it as a jit `ScriptModule`, which can be accessed from both Python and C++
This implementation uses the delira framework to handle training and validation. It supports mixed-precision training and inference via NVIDIA/APEX (which must be installed separately). The data handling is outsourced to shapedata.
The following gives a short overview of the packages and classes.
The `networks` subpackage contains the actual implementation of the shapenet, with bindings to integrate the `ShapeLayer` and other feature extractors (currently the ones registered in `torchvision.models`).
The `layer` subpackage contains the Python and C++ implementations of the `ShapeLayer` and the affine transformations. These are intended to be used as layers in `shapenet.networks`.
The `jit` subpackage is a less flexible reimplementation of the `shapenet.networks` and `shapenet.layer` subpackages, used to export trained weights as a jit `ScriptModule`.
The `utils` subpackage contains everything that does not fit into the scope of any other package. Currently, it is mainly responsible for parsing configuration files.
The `scripts` subpackage contains all scripts described above and their helper functions.
Currently, pretrained weights are available for grayscale faces and cats. For these networks, the image size is fixed to 224 and the pretrained weights can be loaded via `torch.jit.load("PATH/TO/NETWORK/FILE.ptj")`. The inputs have to be of type `torch.Tensor` with dtype `torch.float`, shape `(BATCH_SIZE, 1, 224, 224)`, and normalized to the range (0, 1).
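As a minimal sketch, preparing an input in the expected format and running a pretrained network could look like the following (assuming PyTorch is installed; the weights path is a placeholder for an actually downloaded `.ptj` file, so the loading lines are commented out):

```python
import torch

# Build a grayscale batch in the format the pretrained networks expect:
# shape (BATCH_SIZE, 1, 224, 224), dtype torch.float, values in (0, 1).
# Here we use random data as a stand-in for a real, normalized image.
batch = torch.rand(1, 1, 224, 224, dtype=torch.float)
assert 0.0 <= batch.min().item() and batch.max().item() <= 1.0

# Load the traced network and predict shape-model landmarks
# (uncomment once the weights file has been downloaded):
# net = torch.jit.load("PATH/TO/NETWORK/FILE.ptj")
# with torch.no_grad():
#     landmarks = net(batch)
```

For a real image, the same normalization can be obtained by converting the grayscale pixel values to float and dividing by 255 before resizing to 224x224.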
If you use our code for your own research, please cite our paper:
```
@article{Kopaczka2019,
  title = {Super-Realtime Facial Landmark Detection and Shape Fitting by Deep Regression of Shape Model Parameters},
  author = {Marcin Kopaczka and Justus Schock and Dorit Merhof},
  year = {2019},
  journal = {arXiv preprint}
}
```
The paper is available as a PDF on arXiv.