MediaPipe examples which stream their detections over OSC to be used in other applications.
Currently this is only tested on Windows and macOS. It's recommended to use Python 3 (>3.7) and a virtual environment.
pip install -r requirements.txt
To run an example, start its script with the python command.
# start pose detection with webcam 0
python pose.py --input 0
# start pose detection with video
python pose.py --input yoga.mp4
Other parameters are either documented in the following list or are algorithm-specific.
- input - The video input path or video camera id (default 0)
- min-detection-confidence - Minimum confidence value ([0.0, 1.0]) for the detection to be considered successful. (default 0.5)
- min-tracking-confidence - Minimum confidence value ([0.0, 1.0]) to be considered tracked successfully. (default 0.5)
- ip - OSC ip address to send to (default 127.0.0.1)
- port - OSC port to send to (default 7500)
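As an illustration of how the parameters above fit together, they could be declared with argparse roughly like this. This is a minimal sketch, not the parser used by the scripts: the helper names `build_parser` and `resolve_input` are hypothetical, and only the flag names and defaults are taken from the list above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented parameters and defaults."""
    parser = argparse.ArgumentParser(description="MediaPipe to OSC example")
    # Kept as a string so it can hold either a camera id ("0") or a file path.
    parser.add_argument("--input", type=str, default="0",
                        help="video input path or video camera id")
    parser.add_argument("--min-detection-confidence", type=float, default=0.5,
                        help="minimum detection confidence in [0.0, 1.0]")
    parser.add_argument("--min-tracking-confidence", type=float, default=0.5,
                        help="minimum tracking confidence in [0.0, 1.0]")
    parser.add_argument("--ip", type=str, default="127.0.0.1",
                        help="OSC ip address to send to")
    parser.add_argument("--port", type=int, default=7500,
                        help="OSC port to send to")
    return parser


def resolve_input(value):
    """Return an int camera id for numeric input, otherwise the path unchanged."""
    return int(value) if value.isdigit() else value


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--input", "yoga.mp4", "--port", "9000"])
print(resolve_input(args.input), args.ip, args.port)
```

Treating a purely numeric input as a camera id and anything else as a file path is an assumption made for this sketch.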
The landmark model currently included in MediaPipe Pose predicts the location of 33 full-body landmarks (see figure below), each with (x, y, z, visibility). Note that the z value should be discarded as the model is currently not fully trained to predict depth, but this is something we have on the roadmap.
Reference: mediapipe/solutions/pose
Additional Parameters
--model-complexity MODEL_COMPLEXITY
Set model complexity (0=Light, 1=Full, 2=Heavy).
--no-smooth-landmarks
Disable landmark smoothing.
--static-image-mode
Enables static image mode.
- count - Indicates how many poses are detected (currently only 0 or 1)
- list of landmarks (33 per pose) (if a pose has been detected)
  - x - X-Position of the landmark
  - y - Y-Position of the landmark
  - z - Z-Position of the landmark
  - visibility - Visibility of the landmark
/mediapipe/pose [count, x, y, z, visibility, x, y, z, visibility ...]
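On the receiving side, the flat argument list of a `/mediapipe/pose` message has to be regrouped into landmarks. The following is a minimal sketch of that step in plain Python, assuming the arguments arrive as a list in exactly the order shown above. The names `Landmark` and `parse_pose` are hypothetical and not part of this repository.

```python
from typing import List, NamedTuple


class Landmark(NamedTuple):
    x: float
    y: float
    z: float
    visibility: float


def parse_pose(args: List[float]) -> List[Landmark]:
    """Turn [count, x, y, z, visibility, ...] into a list of landmarks."""
    if not args:
        return []
    count = int(args[0])
    if count == 0:
        # No pose detected, so no landmark values follow.
        return []
    values = args[1:]
    if len(values) % 4 != 0:
        raise ValueError("expected 4 values per landmark")
    # Step through the payload four values at a time.
    return [Landmark(*values[i:i + 4]) for i in range(0, len(values), 4)]


# Example with a shortened payload of two landmarks.
landmarks = parse_pose([1, 0.1, 0.2, 0.3, 0.9, 0.4, 0.5, 0.6, 0.8])
for lm in landmarks:
    print(lm.x, lm.y, lm.visibility)
```

A real message carries 33 landmarks per pose. The sketch does not enforce that number so that it can be tried with short test payloads.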
The hand detection model is able to detect and track 21 3D landmarks.
- count - Indicates how many hands are detected
- list of landmarks (21 per hand) (if hands have been detected)
  - x - X-Position of the landmark
  - y - Y-Position of the landmark
  - z - Z-Position of the landmark
  - visibility - Visibility of the landmark
/mediapipe/hands [count, x, y, z, visibility, x, y, z, visibility ...]
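Because several hands can share one `/mediapipe/hands` message, a receiver needs to split the payload per hand. The sketch below assumes that the landmarks of each hand are sent back to back, 21 per hand and in detection order. That ordering is an assumption and is not stated above. `split_hands` is a hypothetical helper name.

```python
LANDMARKS_PER_HAND = 21
VALUES_PER_LANDMARK = 4  # x, y, z, visibility


def split_hands(args):
    """Split [count, x, y, z, visibility, ...] into one landmark list per hand."""
    if not args:
        return []
    count = int(args[0])
    values = list(args[1:])
    per_hand = LANDMARKS_PER_HAND * VALUES_PER_LANDMARK
    if len(values) != count * per_hand:
        raise ValueError("payload length does not match hand count")
    hands = []
    for h in range(count):
        # Assumed layout: all values of hand 0, then all values of hand 1, ...
        chunk = values[h * per_hand:(h + 1) * per_hand]
        hands.append([tuple(chunk[i:i + VALUES_PER_LANDMARK])
                      for i in range(0, per_hand, VALUES_PER_LANDMARK)])
    return hands


# Synthetic message with two hands, every value set to 0.5.
message = [2] + [0.5] * (2 * LANDMARKS_PER_HAND * VALUES_PER_LANDMARK)
hands = split_hands(message)
print(len(hands), len(hands[0]))
```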
The face detection model is able to detect multiple faces and 5 keypoints. At the moment only the bounding box is sent over OSC.
All values are normalized to the image width and height.
- count - Indicates how many faces are detected
- list of one bounding box per face (if faces have been detected)
  - xmin - X-Position of the top-left bounding box anchor
  - ymin - Y-Position of the top-left bounding box anchor
  - width - Width of the bounding box
  - height - Height of the bounding box
  - score - Confidence score of the bounding box
/mediapipe/faces [count, xmin, ymin, width, height, score, xmin, ymin, width, height, score ...]
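Since the bounding box values are normalized, a receiver usually scales them back to pixel coordinates before drawing. The following is a minimal sketch of that conversion, assuming the receiver knows the size of the image it draws on. The function name `faces_to_pixels` and the dictionary keys are hypothetical.

```python
def faces_to_pixels(args, image_width, image_height):
    """Convert [count, xmin, ymin, width, height, score, ...] to pixel boxes."""
    if not args:
        return []
    count = int(args[0])
    values = args[1:]
    boxes = []
    for i in range(count):
        # Each face contributes five consecutive values.
        xmin, ymin, width, height, score = values[i * 5:(i + 1) * 5]
        boxes.append({
            "left": round(xmin * image_width),
            "top": round(ymin * image_height),
            "width": round(width * image_width),
            "height": round(height * image_height),
            "score": score,
        })
    return boxes


# One face on a 640x480 image.
boxes = faces_to_pixels([1, 0.25, 0.5, 0.5, 0.25, 0.9], 640, 480)
print(boxes[0])
```

If the payload is shorter than the count announces, the tuple unpacking raises a `ValueError`, which makes truncated messages easy to spot.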
tbd
Currently, there are only very basic receiver examples for Processing. Check out the examples folder.
- Example code and documentation adapted from google/mediapipe
- OSC sending and examples implemented by cansik