This is a PyTorch implementation of the paper Real-time Facial Surface Geometry from Monocular Video on Mobile GPUs (https://arxiv.org/pdf/1907.06724.pdf).
This version has no BatchNorm layers, which matters for fine-tuning. If you want to train this model, you should add these layers manually.
The procedure for conversion was pretty interesting:
- I unpacked the ARCore iOS framework and took the tflite model of facemesh. You can download it here
- The paper doesn't state any architecture details, so I looked at the Netron graph visualization to reverse-engineer the operations and the number of input and output channels.
- I recreated them in PyTorch and transferred the raw weights from the tflite file semi-manually into the PyTorch model definition (see Convert-FaceMesh.ipynb for details).
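
One detail of the weight-transfer step can be sketched in a few lines. This is an illustration, not the code from Convert-FaceMesh.ipynb: it assumes the usual TFLite layouts, where regular convolution kernels are stored as (out, kH, kW, in) and depthwise kernels as (1, kH, kW, channels), while PyTorch's Conv2d expects (out, in, kH, kW) and (channels, 1, kH, kW) for a depthwise layer. The function names are hypothetical.

```python
import numpy as np


def conv_tflite_to_torch(w: np.ndarray) -> np.ndarray:
    """Reorder a regular conv kernel: (out, kH, kW, in) -> (out, in, kH, kW)."""
    return np.ascontiguousarray(w.transpose(0, 3, 1, 2))


def depthwise_tflite_to_torch(w: np.ndarray) -> np.ndarray:
    """Reorder a depthwise kernel: (1, kH, kW, ch) -> (ch, 1, kH, kW).

    Assumes a depth multiplier of 1, so each channel has exactly one filter.
    """
    return np.ascontiguousarray(w.transpose(3, 0, 1, 2))


# Dummy arrays standing in for tensors read from the tflite file.
regular = np.zeros((16, 3, 3, 8), dtype=np.float32)
depthwise = np.zeros((1, 3, 3, 32), dtype=np.float32)
print(conv_tflite_to_torch(regular).shape)
print(depthwise_tflite_to_torch(depthwise).shape)
```

The reordered arrays could then be wrapped with torch.from_numpy and copied into the matching layer's weight. Check the layouts against the Netron graph before relying on this.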
The model expects a cropped face with a 25% margin on every side, resized to 192x192 and normalized to the range [-1, 1].
However, the predict_on_image function normalizes the image itself, so you can pass the resized image directly as an np.array.
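
The input preparation described above can be sketched as follows. This is a minimal NumPy-only illustration under stated assumptions: the face box comes from some external detector, the 25% margin is taken relative to the box width and height, and a nearest-neighbour resize stands in for a proper one (for example from OpenCV or PIL). The helper names are hypothetical and not part of this repository. The final normalization step is only needed when feeding the model directly, since predict_on_image does it for you.

```python
import numpy as np


def expand_box(x0, y0, x1, y1, img_w, img_h, margin=0.25):
    """Grow a face box by `margin` of its size on every side, clipped to the image."""
    w, h = x1 - x0, y1 - y0
    nx0 = max(0, int(round(x0 - margin * w)))
    ny0 = max(0, int(round(y0 - margin * h)))
    nx1 = min(img_w, int(round(x1 + margin * w)))
    ny1 = min(img_h, int(round(y1 + margin * h)))
    return nx0, ny0, nx1, ny1


def resize_nearest(img, size=192):
    """Nearest-neighbour resize to size x size (stand-in for a real resize)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]


def preprocess_face(img, box, size=192, margin=0.25):
    """Crop with margin, resize, and scale uint8 pixels from [0, 255] to [-1, 1]."""
    img_h, img_w = img.shape[:2]
    x0, y0, x1, y1 = expand_box(*box, img_w, img_h, margin)
    crop = img[y0:y1, x0:x1]
    resized = resize_nearest(crop, size)
    return resized.astype(np.float32) / 127.5 - 1.0


# Example with a random image and a made-up detector box (x0, y0, x1, y1).
image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
face = preprocess_face(image, (200, 100, 400, 300))
print(face.shape, face.dtype)
```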
See the Inference-FaceMesh.ipynb notebook for a usage example.