This is a CI/CD example of an evaluation report produced by giskard-vision. It compares a set of facial landmark detection models on the following criteria:
- Performance on partial and entire facial parts
- Performance on images containing faces with different head poses (estimated with [6DRepNet](https://github.com/thohemp/6DRepNet))
- Performance on images containing people from different ethnicities (estimated with [DeepFace](https://github.com/serengil/deepface))
- Robustness against image perturbations such as blurring, resizing, and recoloring (performed with [opencv](https://github.com/opencv/opencv))

| | Best(prediction_time) | | | Best(prediction_fail_rate) | | | Best(metric_value) | | |
|---|---|---|---|---|---|---|---|---|---|
| criteria | FaceAlignment | Mediapipe | OpenCV | FaceAlignment | Mediapipe | OpenCV | FaceAlignment | Mediapipe | OpenCV |
| 300W | | ✓ | | ✓ | | | | | ✓ |
| altered color | | ✓ | | ✓ | | | ✓ | | |
| blurred | | ✓ | | ✓ | | | ✓ | | |
| cropped on left half | | ✓ | | ✓ | | | ✓ | | |
| cropped on upper half | | ✓ | | ✓ | | | | | ✓ |
| latino_ethnicity | | ✓ | | | | ✓ | | | ✓ |
| negative_roll | | ✓ | | ✓ | | | ✓ | | |
| positive_roll | | ✓ | | ✓ | | | ✓ | | |
| resized with ratios: 0.5 | | ✓ | | ✓ | | | ✓ | | |
| white_ethnicity | | ✓ | | ✓ | | | ✓ | | |

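
The ✓ marks in this summary can be read as a per-criterion argmin: in the detailed results of this report, every ✓ coincides with the lowest value among the three models for that criterion. The snippet below is a minimal sketch of that selection, assuming lower is better for all three columns and using the 300W figures from the report. `best_model` is a hypothetical helper written for illustration, not part of giskard-vision.

```python
# 300W rows copied from the detailed results of this report.
results_300w = {
    "FaceAlignment": {"metric_value": 0.214453, "prediction_time": 64.1483, "prediction_fail_rate": 0.05},
    "Mediapipe": {"metric_value": 3.08786, "prediction_time": 4.93624, "prediction_fail_rate": 0.19},
    "OpenCV": {"metric_value": 0.195668, "prediction_time": 29.9249, "prediction_fail_rate": 0.16},
}


def best_model(results, column):
    """Return the model with the lowest value in `column` (assumption: lower is better)."""
    return min(results, key=lambda model: results[model][column])


# One winner per column, which corresponds to one ✓ per column group in the summary.
best = {
    column: best_model(results_300w, column)
    for column in ("prediction_time", "prediction_fail_rate", "metric_value")
}
print(best)
```

For the 300W criterion this selects Mediapipe for prediction time, FaceAlignment for prediction fail rate, and OpenCV for the metric value.
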

| | criteria | model | test | metric | metric_value | Best(metric_value) | prediction_time | Best(prediction_time) | prediction_fail_rate | Best(prediction_fail_rate) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 300W | FaceAlignment | Test | NME_mean | 0.214453 | | 64.1483 | | 0.05 | ✓ |
| 1 | 300W | Mediapipe | Test | NME_mean | 3.08786 | | 4.93624 | ✓ | 0.19 | |
| 2 | 300W | OpenCV | Test | NME_mean | 0.195668 | ✓ | 29.9249 | | 0.16 | |
| 3 | altered color | FaceAlignment | Test | NME_mean | 0.174517 | ✓ | 58.1809 | | 0.02 | ✓ |
| 4 | altered color | Mediapipe | Test | NME_mean | 2.5315 | | 3.96894 | ✓ | 0.79 | |
| 5 | altered color | OpenCV | Test | NME_mean | 0.243567 | | 28.037 | | 0.14 | |
| 6 | blurred | FaceAlignment | Test | NME_mean | 0.204779 | ✓ | 63.3339 | | 0.04 | ✓ |
| 7 | blurred | Mediapipe | Test | NME_mean | 3.26273 | | 5.59357 | ✓ | 0.09 | |
| 8 | blurred | OpenCV | Test | NME_mean | 0.331916 | | 24.6266 | | 0.12 | |
| 9 | cropped on left half | FaceAlignment | Test | NME_mean | 0.0950334 | ✓ | 22.3192 | | 0.820441 | ✓ |
| 10 | cropped on left half | Mediapipe | Test | NME_mean | 2.39931 | | 3.95156 | ✓ | 0.951029 | |
| 11 | cropped on left half | OpenCV | Test | NME_mean | 0.17521 | | 11.1745 | | 0.825882 | |
| 12 | cropped on upper half | FaceAlignment | Test | NME_mean | 0.0948426 | | 26.6648 | | 0.782941 | ✓ |
| 13 | cropped on upper half | Mediapipe | Test | NME_mean | 2.19568 | | 4.02357 | ✓ | 0.941765 | |
| 14 | cropped on upper half | OpenCV | Test | NME_mean | 0.0519043 | ✓ | 10.7939 | | 0.978824 | |
| 15 | latino_ethnicity | FaceAlignment | Test | NME_mean | 0.294063 | | 4.06596 | | 0.142857 | |
| 16 | latino_ethnicity | Mediapipe | Test | NME_mean | 3.20718 | | 0.469689 | ✓ | 0.285714 | |
| 17 | latino_ethnicity | OpenCV | Test | NME_mean | 0.0664585 | ✓ | 3.8904 | | 0 | ✓ |
| 18 | negative_roll | FaceAlignment | Test | NME_mean | 0.0909015 | ✓ | 25.7093 | | 0.0416667 | ✓ |
| 19 | negative_roll | Mediapipe | Test | NME_mean | 3.04958 | | 2.03313 | ✓ | 0.0833333 | |
| 20 | negative_roll | OpenCV | Test | NME_mean | 0.0968505 | | 10.9992 | | 0.125 | |
| 21 | positive_roll | FaceAlignment | Test | NME_mean | 0.330439 | ✓ | 30.441 | | 0.0576923 | ✓ |
| 22 | positive_roll | Mediapipe | Test | NME_mean | 3.13338 | | 2.88912 | ✓ | 0.288462 | |
| 23 | positive_roll | OpenCV | Test | NME_mean | 0.411305 | | 17.3002 | | 0.192308 | |
| 24 | resized with ratios: 0.5 | FaceAlignment | Test | NME_mean | 0.218424 | ✓ | 59.4045 | | 0.04 | ✓ |
| 25 | resized with ratios: 0.5 | Mediapipe | Test | NME_mean | 3.19861 | | 5.10067 | ✓ | 0.12 | |
| 26 | resized with ratios: 0.5 | OpenCV | Test | NME_mean | 0.252987 | | 10.9042 | | 0.18 | |
| 27 | white_ethnicity | FaceAlignment | Test | NME_mean | 0.084737 | ✓ | 29.2241 | | 0.0384615 | ✓ |
| 28 | white_ethnicity | Mediapipe | Test | NME_mean | 3.16841 | | 2.83826 | ✓ | 0.173077 | |
| 29 | white_ethnicity | OpenCV | Test | NME_mean | 0.0986856 | | 15.3149 | | 0.0769231 | |
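
The `NME_mean` metric reported above is presumably a normalised mean error averaged over the test images. The report does not state how the error is normalised, so the sketch below takes the normalising distance as an explicit argument (common choices in the landmark detection literature include the inter-ocular distance or the bounding-box diagonal). The function names `nme` and `nme_mean` are hypothetical and are not taken from giskard-vision.

```python
import math
from statistics import mean


def nme(predicted, ground_truth, norm_distance):
    """Normalised mean error for one image.

    predicted, ground_truth: equal-length sequences of (x, y) landmarks.
    norm_distance: positive normalising length (assumption: supplied by the caller).
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same number of landmarks")
    if norm_distance <= 0:
        raise ValueError("norm_distance must be positive")
    # Euclidean distance between each predicted landmark and its ground truth.
    errors = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return mean(errors) / norm_distance


def nme_mean(samples):
    """Average the per-image NME over (predicted, ground_truth, norm_distance) triples."""
    return mean(nme(p, g, d) for p, g, d in samples)


# Toy example with two landmarks: errors are 0 and 5, normalised by 10.
predicted = [(0.0, 0.0), (3.0, 4.0)]
ground_truth = [(0.0, 0.0), (0.0, 0.0)]
print(nme(predicted, ground_truth, norm_distance=10.0))
```

Under this definition a lower value means the predicted landmarks sit closer to the annotations, which matches how the ✓ marks in the table favour the smallest `metric_value`.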