Models
Default models in the Human library are:
- Face Detection: MediaPipe BlazeFace-Back
- Face Mesh: MediaPipe FaceMesh
- Face Iris Analysis: MediaPipe Iris
- Face Description: HSE FaceRes
- Emotion Detection: Oarriaga Emotion
- Body Analysis: PoseNet (AtomicBits version MNv2-075-16)
- Hand Analysis: MediaPipe Hands
- Object Detection: MobileNet-v3 with CenterNet
All models are modified from their original implementations in the following manner:
- Input pre-processing: image enhancements, normalization, etc.
- Caching: custom caching operations to bypass specific model runs when no changes are detected
- Output parsing: custom analysis of HeatMaps to regions, output values normalization, etc.
- Output interpolation: custom smoothing operations
- Model modifications:
  - Model definition: reformatted for readability, added conversion notes and correct signatures
  - Model weights: quantized to 16-bit float for size reduction
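The caching item above can be illustrated with a minimal sketch. This is not the code used by Human: the class `CachedRunner`, the helper `frameDifference`, and the `threshold` and `maxSkipped` parameters are all hypothetical names chosen for the example. The idea shown is simply to reuse the previous result while the input stays close to the last processed frame, and to force a fresh run after a fixed number of skipped frames.

```typescript
// Illustrative sketch only; all names here are hypothetical, not Human's API.

// Mean absolute difference between two equally sized pixel buffers.
function frameDifference(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) return Infinity; // different sizes: treat as changed
  if (a.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
  return sum / a.length;
}

class CachedRunner<T> {
  runs = 0; // number of real model executions so far
  private model: (frame: ArrayLike<number>) => T;
  private threshold: number;
  private maxSkipped: number;
  private lastFrame: number[] | null = null;
  private lastResult: T | undefined = undefined;
  private skipped = 0;

  constructor(model: (frame: ArrayLike<number>) => T, threshold: number, maxSkipped: number) {
    this.model = model;
    this.threshold = threshold;
    this.maxSkipped = maxSkipped;
  }

  run(frame: ArrayLike<number>): T {
    // Reuse the cached result only if a previous run exists, the skip budget
    // is not exhausted, and the frame is close to the last processed frame.
    const unchanged = this.lastFrame !== null
      && this.skipped < this.maxSkipped
      && frameDifference(frame, this.lastFrame) <= this.threshold;
    if (unchanged) {
      this.skipped++;
      return this.lastResult as T;
    }
    const result = this.model(frame);
    this.lastResult = result;
    this.lastFrame = Array.from(frame); // copy so later mutation cannot affect the cache
    this.skipped = 0;
    this.runs++;
    return result;
  }
}
```

Comparing against the last processed frame, rather than the immediately preceding one, means that slow drift accumulates and eventually triggers a real run.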
Models are not re-trained so any bias included in the original models is present in Human
For any possible bias notes, see specific model cards
Human includes implementations for several alternative models which are normally not 1:1 replacements, but can be switched on-the-fly due to standardized output implementation
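As a rough illustration of why a standardized output makes models interchangeable, the sketch below converts two differently shaped raw outputs into one common result type. The `Keypoint` and `BodyResult` types, the flat array layouts, and the function names are assumptions made for this example and do not describe Human's actual result format.

```typescript
// Hypothetical common result shape; not Human's actual types.
interface Keypoint { name: string; x: number; y: number; z?: number; score: number; }
interface BodyResult { model: string; keypoints: Keypoint[]; }

// Assumed raw layout for a 2D model: [x, y, score, x, y, score, ...]
function from2D(model: string, names: string[], raw: number[]): BodyResult {
  const keypoints = names.map((name, i) => ({
    name, x: raw[i * 3], y: raw[i * 3 + 1], score: raw[i * 3 + 2],
  }));
  return { model, keypoints };
}

// Assumed raw layout for a 3D model: [x, y, z, score, x, y, z, score, ...]
function from3D(model: string, names: string[], raw: number[]): BodyResult {
  const keypoints = names.map((name, i) => ({
    name, x: raw[i * 4], y: raw[i * 4 + 1], z: raw[i * 4 + 2], score: raw[i * 4 + 3],
  }));
  return { model, keypoints };
}

// Consumer code depends only on BodyResult, so it works with either model.
function mostConfident(result: BodyResult): Keypoint | undefined {
  let best: Keypoint | undefined = undefined;
  for (const kp of result.keypoints) {
    if (best === undefined || kp.score > best.score) best = kp;
  }
  return best;
}
```

Because `mostConfident` only reads the shared fields, swapping the underlying model does not require changes in the calling code; the optional `z` field is simply absent for 2D models.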
Body detection can be switched from PoseNet to BlazePose, EfficientPose or MoveNet depending on the use case:
- PoseNet: Works with multiple people in frame and with partially visible people
  Best described as works-anywhere, but not with great precision
- MoveNet-Lightning: Works with a single person in frame and with partially visible people
  Modernized and optimized version of PoseNet with a different model architecture
- MoveNet-Thunder: Variation of MoveNet with higher precision but slower processing
- EfficientPose: Works with a single person in frame and with partially visible people
  Experimental model that shows future promise but is not ready for widespread usage due to performance
- BlazePose: Works with a single person in frame and that person should be fully visible
  If those conditions are met, it returns far more detail (39 vs 17 keypoints) and is far more accurate
  Furthermore, it returns a 3D approximation of each point instead of 2D
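The trade-offs listed above can be turned into a small selection helper. This is a sketch under stated assumptions: `UseCase`, `chooseBodyModel` and `bodyConfig` are hypothetical names, and the `body.enabled` / `body.modelPath` configuration shape is assumed rather than documented on this page. The file names follow the model table on this page; EfficientPose is left out because it has no entry there.

```typescript
// Hypothetical helper; the decision rules paraphrase the list above.
type BodyModel = 'posenet' | 'movenet-lightning' | 'movenet-thunder' | 'blazepose';

interface UseCase {
  multiplePeople: boolean;  // more than one person may be in frame
  fullBodyVisible: boolean; // the person is expected to be fully visible
  preferPrecision: boolean; // accept slower processing for better accuracy
}

function chooseBodyModel(useCase: UseCase): BodyModel {
  // Only PoseNet is described as working with multiple people in frame.
  if (useCase.multiplePeople) return 'posenet';
  // BlazePose needs a fully visible person but returns more detail.
  if (useCase.fullBodyVisible && useCase.preferPrecision) return 'blazepose';
  // MoveNet-Thunder trades speed for precision compared to MoveNet-Lightning.
  return useCase.preferPrecision ? 'movenet-thunder' : 'movenet-lightning';
}

// Assumed configuration shape: a per-module `modelPath` pointing at the model definition file.
function bodyConfig(model: BodyModel): { body: { enabled: boolean; modelPath: string } } {
  return { body: { enabled: true, modelPath: `${model}.json` } };
}

const config = bodyConfig(chooseBodyModel({
  multiplePeople: false,
  fullBodyVisible: false,
  preferPrecision: true,
}));
console.log(config.body.modelPath);
```

The resulting object is meant to be merged into whatever configuration is passed to the library; check the configuration documentation for the exact property names before relying on it.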
Face description can be switched from the default combined model FaceRes to individual models:
- Gender Detection: Oarriaga Gender
- Age Detection: SSR-Net Age IMDB
- Face Embedding: BecauseofAI MobileFace Embedding
Object detection can be switched from mb3-centernet to nanodet
Model Name | Model Definition Size | Model Definition | Weights Size | Weights Name | Num Tensors |
---|---|---|---|---|---|
MediaPipe BlazeFace (Front) | 51K | blazeface-front.json | 393K | blazeface-front.bin | 73 |
MediaPipe BlazeFace (Back) | 78K | blazeface-back.json | 527K | blazeface-back.bin | 112 |
MediaPipe FaceMesh | 88K | facemesh.json | 2.9M | facemesh.bin | 118 |
MediaPipe Iris | 120K | iris.json | 2.5M | iris.bin | 191 |
MediaPipe Meet | 94K | meet.json | 364K | meet.bin | 163 |
MediaPipe Selfie | 82K | selfie.json | 208K | selfie.bin | 136 |
Oarriaga Emotion | 18K | emotion.json | 802K | emotion.bin | 23 |
SSR-Net Age (IMDB) | 93K | age.json | 158K | age.bin | 158 |
SSR-Net Gender (IMDB) | 92K | gender-ssrnet-imdb.json | 158K | gender-ssrnet-imdb.bin | 157 |
Oarriaga Gender | 30K | gender.json | 198K | gender.bin | 39 |
PoseNet | 47K | posenet.json | 4.8M | posenet.bin | 62 |
MediaPipe BlazePose | 158K | blazepose.json | 6.6M | blazepose.bin | 225 |
MediaPipe HandPose (HandDetect) | 126K | handdetect.json | 6.8M | handdetect.bin | 152 |
MediaPipe HandPose (HandSkeleton) | 127K | handskeleton.json | 5.3M | handskeleton.bin | 145 |
Sirius-AI MobileFaceNet | 125K | mobilefacenet.json | 5.0M | mobilefacenet.bin | 139 |
BecauseofAI MobileFace | 33K | mobileface.json | 2.1M | mobileface.bin | 75 |
FaceBoxes | 212K | faceboxes.json | 2.0M | faceboxes.bin | N/A |
NanoDet | 255K | nanodet.json | 7.3M | nanodet.bin | 229 |
MB3-CenterNet | 197K | mb3-centernet.json | 1.9M | mb3-centernet.bin | 267 |
FaceRes | 70K | faceres.json | 6.7M | faceres.bin | 524 |
MoveNet-Lightning | 158K | movenet-lightning.json | 4.5M | movenet-lightning.bin | 178 |
MoveNet-Thunder | 158K | movenet-thunder.json | 12M | movenet-thunder.bin | 178 |
Note: All model definition JSON files are formatted for human readability
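The weight sizes in the table can be added up to estimate how much data a given set of models needs to download. The sketch below does this for the default models listed at the top of this page. The helper `parseSize` is hypothetical, and it assumes that `K` and `M` in the table mean 1024 and 1024 × 1024 bytes respectively, which the table itself does not state.

```typescript
// Parse a size string from the table such as "527K" or "2.9M" into bytes.
// Assumption: K = 1024 bytes, M = 1024 * 1024 bytes.
function parseSize(text: string): number {
  const match = /^(\d+(?:\.\d+)?)([KM])$/.exec(text.trim());
  if (!match) throw new Error(`Unrecognized size: ${text}`);
  const multiplier = match[2] === 'K' ? 1024 : 1024 * 1024;
  return Number(match[1]) * multiplier;
}

// Weights sizes of the default models, copied from the table above.
const defaultWeights: Record<string, string> = {
  'MediaPipe BlazeFace (Back)': '527K',
  'MediaPipe FaceMesh': '2.9M',
  'MediaPipe Iris': '2.5M',
  'FaceRes': '6.7M',
  'Oarriaga Emotion': '802K',
  'PoseNet': '4.8M',
  'MediaPipe HandPose (HandDetect)': '6.8M',
  'MediaPipe HandPose (HandSkeleton)': '5.3M',
  'MB3-CenterNet': '1.9M',
};

function totalBytes(sizes: Record<string, string>): number {
  return Object.values(sizes).reduce((sum, size) => sum + parseSize(size), 0);
}

const totalMiB = totalBytes(defaultWeights) / (1024 * 1024);
console.log(`Default model weights: about ${totalMiB.toFixed(1)} MiB`);
```

The figure is only an estimate, since the table values are rounded and the model definition JSON files are not included in the sum.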
- Face Detection: MediaPipe BlazeFace
- Facial Spatial Geometry: MediaPipe FaceMesh
- Eye Iris Details: MediaPipe Iris
- Face Description: HSE-FaceRes
- Hand Detection & Skeleton: MediaPipe HandPose
- Body Pose Detection: BlazePose
- Body Pose Detection: PoseNet
- Body Pose Detection: MoveNet
- Body Pose Detection: EfficientPose
- Age & Gender Prediction: SSR-Net
- Emotion Prediction: Oarriaga
- Face Embedding: BecauseofAI MobileFace
- Object Detection: NanoDet
- Object Detection: MB3-CenterNet
- Image Filters: WebGLImageFilter
- Pinto Model Zoo: Pinto
Models are included under the license inherited from the original model source
Model code has changed from the source substantially enough that it is considered a derivative work and not a simple re-publishing