diff --git a/CHANGELOG.md b/CHANGELOG.md
index b6aa01e20b..a0a7b628e8 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -8,7 +8,8 @@ Released on December, XX, 2022.
   - Added Continual Transformer Encoders ([#317](https://github.com/opendr-eu/opendr/pull/317)).
   - Added Continual Spatio-Temporal Graph Convolutional Networks tool ([#370](https://github.com/opendr-eu/opendr/pull/370)).
   - Added AmbiguityMeasure utility tool ([#361](https://github.com/opendr-eu/opendr/pull/361)).
-  - Added SiamRPN 2D tracking tool ([#367](https://github.com/opendr-eu/opendr/pull/367))
+  - Added SiamRPN 2D tracking tool ([#367](https://github.com/opendr-eu/opendr/pull/367)).
+  - Added High resolution pose estimation tool ([#356](https://github.com/opendr-eu/opendr/pull/356)).
 - Bug Fixes:
   - Fixed `BoundingBoxList`, `TrackingAnnotationList`, `BoundingBoxList3D` and `TrackingAnnotationList3D` confidence warnings ([#365](https://github.com/opendr-eu/opendr/pull/365)).
diff --git a/docs/reference/high-resolution-pose-estimation.md b/docs/reference/high-resolution-pose-estimation.md
new file mode 100644
index 0000000000..b0128397ee
--- /dev/null
+++ b/docs/reference/high-resolution-pose-estimation.md
@@ -0,0 +1,358 @@
+## high_resolution_pose_estimation module
+
+The *high_resolution_pose_estimation* module contains the *HighResolutionPoseEstimationLearner* class, which inherits from the abstract class *Learner*.
+
+### Class HighResolutionPoseEstimationLearner
+Bases: `engine.learners.Learner`
+
+The *HighResolutionPoseEstimationLearner* class is an implementation of pose estimation for high-resolution images.
+The method first generates a heatmap from a resized version of the input image, then uses this heatmap to crop the input image around the area of interest, and finally runs pose estimation on the crop.
+Since the high resolution pose estimation method is based on the Lightweight OpenPose algorithm, the models that can be used have to be trained with the Lightweight OpenPose tool.
+
+Two parameters, *first_pass_height* and *second_pass_height*, define how the image is resized in each of the two passes and largely determine the speed and accuracy achieved on high-resolution images.
+
+The [HighResolutionPoseEstimationLearner](/src/opendr/perception/pose_estimation/hr_pose_estimation/high_resolution_learner.py) class has the following public methods:
+
+#### `HighResolutionPoseEstimationLearner` constructor
+```python
+HighResolutionPoseEstimationLearner(self, device, backbone, temp_path, mobilenet_use_stride, mobilenetv2_width, shufflenet_groups, num_refinement_stages, batches_per_iter, base_height, first_pass_height, second_pass_height, percentage_arround_crop, heatmap_threshold, experiment_name, num_workers, weights_only, output_name, multiscale, scales, visualize, img_mean, img_scale, pad_value, half_precision)
+```
+
+Constructor parameters:
+
+- **device**: *{'cpu', 'cuda'}, default='cuda'*\
+  Specifies the device to be used.
+- **backbone**: *{'mobilenet', 'mobilenetv2', 'shufflenet'}, default='mobilenet'*\
+  Specifies the backbone architecture.
+- **temp_path**: *str, default='temp'*\
+  Specifies a path where the algorithm looks for pretrained backbone weights; checkpoints and logging files are also saved here.
+  Moreover, the JSON file that contains the evaluation detections is saved here.
+- **mobilenet_use_stride**: *bool, default=True*\
+  Whether to add a stride value in the mobilenet model, which reduces accuracy but increases inference speed.
+- **mobilenetv2_width**: *[0.0 - 1.0], default=1.0*\
+  If the mobilenetv2 backbone is used, this parameter specifies its size.
+- **shufflenet_groups**: *int, default=3*\
+  If the shufflenet backbone is used, it specifies the number of groups to be used in grouped 1x1 convolutions in each ShuffleUnit.
+- **num_refinement_stages**: *int, default=2*\
+  Specifies the number of pose estimation refinement stages that are added to the model's head, including the initial stage.
+- **batches_per_iter**: *int, default=1*\
+  Specifies how many batches are processed before each backward/optimizer step.
+- **base_height**: *int, default=256*\
+  Specifies the height to which images are resized before the forward pass when using Lightweight OpenPose.
+- **first_pass_height**: *int, default=360*\
+  Specifies the height to which the input image is resized during the heatmap generation procedure (first pass).
+- **second_pass_height**: *int, default=540*\
+  Specifies the height to which the cropped image is resized for the second inference step of the pose estimation procedure.
+- **percentage_arround_crop**: *float, default=0.3*\
+  Specifies the percentage of extra padding added around the cropped image.
+- **heatmap_threshold**: *float, default=0.1*\
+  Specifies the threshold value that the heatmap elements should exceed during the first pass in order to trigger the second pass.
+- **experiment_name**: *str, default='default'*\
+  String name to attach to checkpoints.
+- **num_workers**: *int, default=8*\
+  Specifies the number of workers to be used by the data loader.
+- **weights_only**: *bool, default=True*\
+  If True, only the model weights will be loaded; the optimizer, scheduler, num_iter and current_epoch information will not be loaded.
+- **output_name**: *str, default='detections.json'*\
+  The name of the JSON file where the evaluation detections are stored, inside the temp_path.
+- **multiscale**: *bool, default=False*\
+  Specifies whether evaluation will run in the predefined multiple-scales setup or not.
+  It overwrites self.scales to [0.5, 1.0, 1.5, 2.0].
+- **scales**: *list, default=None*\
+  A list of float scales that define the multiscale evaluation setup.
+  Used to manually set the scales instead of going for the predefined multiscale setup.
+- **visualize**: *bool, default=False*\
+  Specifies whether the images along with the poses will be shown, one by one, during evaluation.
+- **img_mean**: *list, default=(128, 128, 128)*\
+  Specifies the mean based on which the images are normalized.
+- **img_scale**: *float, default=1/256*\
+  Specifies the scale based on which the images are normalized.
+- **pad_value**: *list, default=(0, 0, 0)*\
+  Specifies the pad value based on which the images' width is padded.
+- **half_precision**: *bool, default=False*\
+  Enables inference using half (fp16) precision instead of single (fp32) precision. Valid only for GPU-based inference.
+
+
+#### `HighResolutionPoseEstimationLearner.eval`
+```python
+HighResolutionPoseEstimationLearner.eval(self, dataset, silent, verbose, use_subset, subset_size, images_folder_name, annotations_filename)
+```
+
+This method is used to evaluate a trained model on an evaluation dataset.
+It returns a dictionary containing statistics regarding the evaluation; a short usage sketch is given after the method descriptions below.
+
+Parameters:
+
+- **dataset**: *object*\
+  Object that holds the evaluation dataset.
+  Can be of type `ExternalDataset` or a custom dataset inheriting from `DatasetIterator`.
+- **silent**: *bool, default=False*\
+  If set to True, disables all printing of evaluation progress reports and other information to STDOUT.
+- **verbose**: *bool, default=True*\
+  If set to True, enables the maximum verbosity.
+- **use_subset**: *bool, default=True*\
+  If set to True, a subset of the validation dataset is created and used in evaluation.
+- **subset_size**: *int, default=250*\
+  Controls the size of the validation subset.
+- **images_folder_name**: *str, default='val2017'*\
+  Folder name that contains the dataset images.
+  This folder should be contained in the dataset path provided.
+  Note that this is a folder name, not a path.
+- **annotations_filename**: *str, default='person_keypoints_val2017.json'*\
+  Filename of the annotations JSON file.
+  This file should be contained in the dataset path provided.
+
+#### `HighResolutionPoseEstimationLearner.infer`
+```python
+HighResolutionPoseEstimationLearner.infer(self, img, upsample_ratio, stride, track, smooth, multiscale, visualize)
+```
+
+This method is used to perform pose estimation on an image.
+It returns a list of `engine.target.Pose` objects, where each object holds a pose, or an empty list if no detections were made.
+
+Parameters:
+
+- **img**: *object*\
+  Object of type engine.data.Image.
+- **upsample_ratio**: *int, default=4*\
+  Defines the amount of upsampling to be performed on the heatmaps and PAFs when resizing.
+- **stride**: *int, default=8*\
+  Defines the stride value for creating a padded image.
+- **track**: *bool, default=True*\
+  If True, infer propagates pose ids from previous frame results in order to track poses.
+- **smooth**: *bool, default=True*\
+  If True, smoothing is performed on pose keypoints between frames.
+- **multiscale**: *bool, default=False*\
+  Specifies whether inference will run in the predefined multiple-scales setup or not.
+
+
+#### `HighResolutionPoseEstimationLearner.__first_pass`
+```python
+HighResolutionPoseEstimationLearner.__first_pass(self, img)
+```
+
+This method extracts a heatmap from the input image, indicating the approximate locations of humans in the picture.
+
+Parameters:
+
+- **img**: *object*\
+  Object of type engine.data.Image.
+
+
+#### `HighResolutionPoseEstimationLearner.__second_pass`
+```python
+HighResolutionPoseEstimationLearner.__second_pass(self, img, net_input_height_size, max_width, stride, upsample_ratio, pad_value, img_mean, img_scale)
+```
+
+In this method the second inference step is carried out, which estimates the human poses on the image that is provided.
+Following the steps of the proposed method, this image should be the part of the initial high-resolution image that was cropped around the area of interest of the heatmap generated in the first pass.
+
+Parameters:
+
+- **img**: *object*\
+  Object of type engine.data.Image.
+- **net_input_height_size**: *int*\
+  The height to which the image is resized for the pose estimation procedure.
+- **max_width**: *int*\
+  The maximum width that the cropped image should have in order to keep the height-width ratio below a certain value.
+- **stride**: *int*\
+  The stride value of the mobilenet backbone, which reduces accuracy but increases inference speed.
+- **upsample_ratio**: *int, default=4*\
+  Defines the amount of upsampling to be performed on the heatmaps and PAFs when resizing.
+- **pad_value**: *list, default=(0, 0, 0)*\
+  Specifies the pad value based on which the images' width is padded.
+- **img_mean**: *list, default=(128, 128, 128)*\
+  Specifies the mean based on which the images are normalized.
+- **img_scale**: *float, default=1/256*\
+  Specifies the scale based on which the images are normalized.
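+
+The following is a minimal usage sketch of the evaluation workflow described above; it mirrors the evaluation demo included in this PR, so the folder and file names (`image`, `annotation.json`) refer to the single-image test dataset fetched by `download(mode="test_data")` and may differ for other datasets:
+
+```python
+from os.path import join
+
+from opendr.engine.datasets import ExternalDataset
+from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner
+
+pose_estimator = HighResolutionPoseEstimationLearner(device='cuda', first_pass_height=360, second_pass_height=540)
+pose_estimator.download(path=".", verbose=True)      # default pretrained mobilenet model
+pose_estimator.load("openpose_default")
+pose_estimator.download(path=".", mode="test_data")  # single COCO image and its annotation
+
+# The evaluation demo reads the downloaded test data from ./temp/dataset
+eval_dataset = ExternalDataset(path=join("temp", "dataset"), dataset_type="COCO")
+results = pose_estimator.eval(eval_dataset, use_subset=False, silent=True, verbose=True,
+                              images_folder_name="image", annotations_filename="annotation.json")
+print(results)
+```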
+
+
+#### `HighResolutionPoseEstimationLearner.download`
+```python
+HighResolutionPoseEstimationLearner.download(self, path, mode, verbose, url)
+```
+
+Download utility for various Lightweight OpenPose components.
+Downloads files depending on mode and saves them in the path provided.
+It supports downloading:
+1. the default mobilenet pretrained model
+2. mobilenet, mobilenetv2 and shufflenet weights needed for training
+3. a test dataset with a single COCO image and its annotation
+
+Parameters:
+
+- **path**: *str, default=None*\
+  Local path to save the files; defaults to self.temp_path if None.
+- **mode**: *str, default="pretrained"*\
+  Specifies what to download; can be one of "pretrained", "weights", "test_data".
+- **verbose**: *bool, default=False*\
+  Whether to print messages in the console.
+- **url**: *str, default=OpenDR FTP URL*\
+  URL of the FTP server.
+
+#### `HighResolutionPoseEstimationLearner.load`
+```python
+HighResolutionPoseEstimationLearner.load(self, path, verbose)
+```
+This method is used to load a pretrained model that has been trained with Lightweight OpenPose.
+The model is loaded from inside the directory of the path provided, using the metadata .json file included.
+
+Parameters:
+- **path**: *str*\
+  Path of the model to be loaded.
+- **verbose**: *bool, default=False*\
+  If set to True, prints a message on success.
+
+
+#### Examples
+
+* **Inference and result drawing example on a test .jpg image using OpenCV.**
+  ```python
+  import cv2
+  from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner
+  from opendr.perception.pose_estimation import draw
+  from opendr.engine.data import Image
+
+  pose_estimator = HighResolutionPoseEstimationLearner(device='cuda', num_refinement_stages=2,
+                                                        mobilenet_use_stride=False, half_precision=False,
+                                                        first_pass_height=360,
+                                                        second_pass_height=540)
+  pose_estimator.download()  # Download the default pretrained mobilenet model in the temp_path
+
+  pose_estimator.load("./parent_dir/openpose_default")
+  pose_estimator.download(mode="test_data")  # Download test data taken from COCO2017
+
+  img = Image.open('./parent_dir/dataset/image/000000000785_1080.jpg')
+  orig_img = img.opencv()  # Keep original image
+  current_poses = pose_estimator.infer(img)
+  img_opencv = img.opencv()
+  for pose in current_poses:
+      draw(img_opencv, pose)
+  img_opencv = cv2.addWeighted(orig_img, 0.6, img_opencv, 0.4, 0)
+  cv2.imshow('Result', img_opencv)
+  cv2.waitKey(0)
+  ```
+
+
+#### Performance Evaluation
+
+
+In order to assess the performance of the *HighResolutionPoseEstimationLearner*, it has been tested on various platforms and with the different optimizations that Lightweight OpenPose offers.
+The experiments were conducted on a 1080p image.
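+
+FPS figures of this kind can be reproduced by timing repeated calls to `infer`.
+The following is a minimal sketch of such a measurement, based on the benchmarking demo included in this PR (the test-image path follows that demo's layout):
+
+```python
+import time
+from os.path import join
+
+import numpy as np
+
+from opendr.engine.data import Image
+from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner
+
+pose_estimator = HighResolutionPoseEstimationLearner(device='cuda', first_pass_height=360, second_pass_height=540)
+pose_estimator.download(path=".", verbose=True)      # pretrained mobilenet model
+pose_estimator.load("openpose_default")
+pose_estimator.download(path=".", mode="test_data")  # sample 1080p COCO image
+
+img = Image.open(join("temp", "dataset", "image", "000000000785_1080.jpg"))
+
+fps_list = []
+for _ in range(50):
+    start = time.perf_counter()
+    pose_estimator.infer(img)
+    fps_list.append(1.0 / (time.perf_counter() - start))
+print("Average FPS: %.2f" % np.mean(fps_list))
+```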
+
+
+#### Lightweight OpenPose with resizing to 256 pixels
+| **Method** | **CPU i7-9700K (FPS)** | **RTX 2070 (FPS)** | **Jetson TX2 (FPS)** | **Xavier NX (FPS)** |
+|:------------------------------------------------:|-----------------------|-------------------|----------------------|---------------------|
+| OpenDR - Baseline | 0.9 | 46.3 | 4.6 | 6.4 |
+| OpenDR - Full | 2.9 | 83.1 | 11.2 | 13.5 |
+
+
+#### Lightweight OpenPose without resizing
+| Method | CPU i7-9700K (FPS) | RTX 2070 (FPS) | Jetson TX2 (FPS) | Xavier NX (FPS) |
+|-------------------|--------------------|-----------------|------------------|-----------------|
+| OpenDR - Baseline | 0.05 | 2.6 | 0.3 | 0.5 |
+| OpenDR - Full | 0.2 | 10.8 | 1.4 | 3.1 |
+
+
+#### High-Resolution Pose Estimation
+| Method | CPU i7-9700K (FPS) | RTX 2070 (FPS) | Jetson TX2 (FPS) | Xavier NX (FPS) |
+|------------------------|--------------------|----------------|------------------|-----------------|
+| HRPoseEstim - Baseline | 2.3 | 13.6 | 1.4 | 1.8 |
+| HRPoseEstim - Half | 2.7 | 16.1 | 1.3 | 3.0 |
+| HRPoseEstim - Stride | 8.1 | 27.0 | 4.0 | 4.9 |
+| HRPoseEstim - Stages | 3.7 | 16.5 | 1.9 | 2.7 |
+| HRPoseEstim - H+S | 8.2 | 25.9 | 3.6 | 5.5 |
+| HRPoseEstim - Full | 10.9 | 31.7 | 4.8 | 6.9 |
+
+As shown in the tables above, OpenDR Lightweight OpenPose achieves higher FPS when the input image is resized to 256 pixels, since a smaller image is easier to process.
+However, as the accuracy tables below show, this comes at a severe cost in accuracy, with almost no valid detections on high-resolution input.
+
+We have evaluated the effect of using different inference settings, namely:
+- *HRPoseEstim - Baseline*, which refers to directly using the High Resolution Pose Estimation method, which is based on Lightweight OpenPose,
+- *HRPoseEstim - Half*, which refers to enabling inference in half (FP16) precision,
+- *HRPoseEstim - Stride*, which refers to increasing stride by two in the input layer of the model,
+- *HRPoseEstim - Stages*, which refers to removing the refinement stages,
+- *HRPoseEstim - H+S*, which uses both half precision and increased stride, and
+- *HRPoseEstim - Full*, which refers to combining all three available optimizations.
+
+A 1080p image was used as input to the models in all cases.
+
+The average precision and average recall on the COCO evaluation split are also reported in the tables below:
+
+
+#### Lightweight OpenPose with resizing
+| Method | Average Precision (IoU=0.50) | Average Recall (IoU=0.50) |
+|-------------------|------------------------------|---------------------------|
+| OpenDR - Baseline | 0.101 | 0.267 |
+| OpenDR - Full | 0.031 | 0.044 |
+
+
+#### Lightweight OpenPose without resizing
+| Method | Average Precision (IoU=0.50) | Average Recall (IoU=0.50) |
+|-------------------|------------------------------|---------------------------|
+| OpenDR - Baseline | 0.695 | 0.749 |
+| OpenDR - Full | 0.389 | 0.441 |
+
+
+#### High Resolution Pose Estimation
+| Method | Average Precision (IoU=0.50) | Average Recall (IoU=0.50) |
+|------------------------|------------------------------|---------------------------|
+| HRPoseEstim - Baseline | 0.615 | 0.637 |
+| HRPoseEstim - Half | 0.604 | 0.621 |
+| HRPoseEstim - Stride | 0.262 | 0.274 |
+| HRPoseEstim - Stages | 0.539 | 0.562 |
+| HRPoseEstim - H+S | 0.254 | 0.267 |
+| HRPoseEstim - Full | 0.259 | 0.272 |
+
+The average precision and average recall have also been calculated on a 1080p version of the COCO2017 validation dataset, and the results are reported in the table below:
+
+| Method | Average Precision (IoU=0.50) | Average Recall (IoU=0.50) |
+|------------------------|------------------------------|---------------------------|
+| HRPoseEstim - Baseline | 0.518 | 0.536 |
+| HRPoseEstim - Half | 0.509 | 0.520 |
+| HRPoseEstim - Stride | 0.143 | 0.149 |
+| HRPoseEstim - Stages | 0.474 | 0.496 |
+| HRPoseEstim - H+S | 0.134 | 0.139 |
+| HRPoseEstim - Full | 0.141 | 0.150 |
+
+For measuring precision and recall we used the standard approach proposed for COCO, using an Intersection over Union (IoU) metric at 0.5.
+
+
+#### Notes
+
+The COCO dataset evaluation scores are used as the metrics of the algorithm, as explained [here](https://cocodataset.org/#keypoints-eval).
+
+Keypoints and the way poses are constructed follow the original method described [here](https://github.com/Daniil-Osokin/lightweight-human-pose-estimation.pytorch/blob/master/TRAIN-ON-CUSTOM-DATASET.md).
+
+Pose keypoint ids are matched as follows:
+
+| Keypoint ID | Keypoint name | Keypoint abbrev. |
+|-------------|----------------|------------------|
+| 0 | nose | nose |
+| 1 | neck | neck |
+| 2 | right shoulder | r_sho |
+| 3 | right elbow | r_elb |
+| 4 | right wrist | r_wri |
+| 5 | left shoulder | l_sho |
+| 6 | left elbow | l_elb |
+| 7 | left wrist | l_wri |
+| 8 | right hip | r_hip |
+| 9 | right knee | r_knee |
+| 10 | right ankle | r_ank |
+| 11 | left hip | l_hip |
+| 12 | left knee | l_knee |
+| 13 | left ankle | l_ank |
+| 14 | right eye | r_eye |
+| 15 | left eye | l_eye |
+| 16 | right ear | r_ear |
+| 17 | left ear | l_ear |
+
+
+#### References
+[1] OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, [arXiv](https://arxiv.org/abs/1812.08008).
+[2] Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose, [arXiv](https://arxiv.org/abs/1811.12004).
diff --git a/docs/reference/index.md b/docs/reference/index.md index 2b953690bc..8a5c6c1adb 100644 --- a/docs/reference/index.md +++ b/docs/reference/index.md @@ -30,6 +30,7 @@ Neither the copyright holder nor any applicable licensor will be liable for any - [landmark_based_facial_expression_recognition](landmark-based-facial-expression-recognition.md) - pose estimation: - [lightweight_open_pose Module](lightweight-open-pose.md) + - [high_resolution_pose_estimation Module](high-resolution-pose-estimation.md) - activity recognition: - [skeleton-based action recognition](skeleton-based-action-recognition.md) - [continual skeleton-based action recognition Module](skeleton-based-action-recognition.md#class-costgcnlearner) @@ -111,7 +112,8 @@ Neither the copyright holder nor any applicable licensor will be liable for any - heart anomaly detection: - [heart anomaly detection Demo](/projects/python/perception/heart_anomaly_detection) - pose estimation: - - [lightweight_open_pose Demo](/projects/python/perception/lightweight_open_pose) + - [lightweight_open_pose Demo](/projects/python/perception/pose_estimation/lightweight_open_pose) + - [high_resolution_pose_estimation Demo](/projects/python/perception/pose_estimation/high_resolution_pose_estimation) - multimodal human centric: - [rgbd_hand_gesture_learner Demo](/projects/python/perception/multimodal_human_centric/rgbd_hand_gesture_recognition) - [audiovisual_emotion_recognition Demo](/projects/python/perception/multimodal_human_centric/audiovisual_emotion_recognition) diff --git a/projects/opendr_ws/src/opendr_perception/CMakeLists.txt b/projects/opendr_ws/src/opendr_perception/CMakeLists.txt index cf7923501e..640ab5a4d3 100644 --- a/projects/opendr_ws/src/opendr_perception/CMakeLists.txt +++ b/projects/opendr_ws/src/opendr_perception/CMakeLists.txt @@ -29,6 +29,7 @@ include_directories( catkin_install_python(PROGRAMS scripts/pose_estimation_node.py + scripts/hr_pose_estimation_node.py scripts/fall_detection_node.py scripts/object_detection_2d_nanodet_node.py scripts/object_detection_2d_yolov5_node.py diff --git a/projects/opendr_ws/src/opendr_perception/scripts/hr_pose_estimation_node.py b/projects/opendr_ws/src/opendr_perception/scripts/hr_pose_estimation_node.py new file mode 100755 index 0000000000..0a471b224e --- /dev/null +++ b/projects/opendr_ws/src/opendr_perception/scripts/hr_pose_estimation_node.py @@ -0,0 +1,164 @@ +#!/usr/bin/env python +# Copyright 2020-2022 OpenDR European Project +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import argparse +import torch + +import rospy +from sensor_msgs.msg import Image as ROS_Image +from opendr_bridge.msg import OpenDRPose2D +from opendr_bridge import ROSBridge + +from opendr.engine.data import Image +from opendr.perception.pose_estimation import draw +from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner + + +class PoseEstimationNode: + + def __init__(self, input_rgb_image_topic="/usb_cam/image_raw", + output_rgb_image_topic="/opendr/image_pose_annotated", detections_topic="/opendr/poses", device="cuda", + num_refinement_stages=2, use_stride=False, half_precision=False): + """ + Creates a ROS Node for high resolution pose estimation with HR Pose Estimation. + :param input_rgb_image_topic: Topic from which we are reading the input image + :type input_rgb_image_topic: str + :param output_rgb_image_topic: Topic to which we are publishing the annotated image (if None, no annotated + image is published) + :type output_rgb_image_topic: str + :param detections_topic: Topic to which we are publishing the annotations (if None, no pose detection message + is published) + :type detections_topic: str + :param device: device on which we are running inference ('cpu' or 'cuda') + :type device: str + :param num_refinement_stages: Specifies the number of pose estimation refinement stages are added on the + model's head, including the initial stage. Can be 0, 1 or 2, with more stages meaning slower and more accurate + inference + :type num_refinement_stages: int + :param use_stride: Whether to add a stride value in the model, which reduces accuracy but increases + inference speed + :type use_stride: bool + :param half_precision: Enables inference using half (fp16) precision instead of single (fp32) precision. + Valid only for GPU-based inference + :type half_precision: bool + """ + self.input_rgb_image_topic = input_rgb_image_topic + + if output_rgb_image_topic is not None: + self.image_publisher = rospy.Publisher(output_rgb_image_topic, ROS_Image, queue_size=1) + else: + self.image_publisher = None + + if detections_topic is not None: + self.pose_publisher = rospy.Publisher(detections_topic, OpenDRPose2D, queue_size=1) + else: + self.pose_publisher = None + + self.bridge = ROSBridge() + + # Initialize the high resolution pose estimation learner + self.pose_estimator = HighResolutionPoseEstimationLearner(device=device, num_refinement_stages=num_refinement_stages, + mobilenet_use_stride=use_stride, + half_precision=half_precision) + self.pose_estimator.download(path=".", verbose=True) + self.pose_estimator.load("openpose_default") + + def listen(self): + """ + Start the node and begin processing input data. + """ + rospy.init_node('opendr_hr_pose_estimation_node', anonymous=True) + rospy.Subscriber(self.input_rgb_image_topic, ROS_Image, self.callback, queue_size=1, buff_size=10000000) + rospy.loginfo("Pose estimation node started.") + rospy.spin() + + def callback(self, data): + """ + Callback that processes the input data and publishes to the corresponding topics. 
+ :param data: Input image message + :type data: sensor_msgs.msg.Image + """ + # Convert sensor_msgs.msg.Image into OpenDR Image + image = self.bridge.from_ros_image(data, encoding='bgr8') + + # Run pose estimation + poses = self.pose_estimator.infer(image) + + # Publish detections in ROS message + if self.pose_publisher is not None: + for pose in poses: + if pose.id is None: # Temporary fix for pose not having id + pose.id = -1 + # Convert OpenDR pose to ROS pose message using bridge and publish it + self.pose_publisher.publish(self.bridge.to_ros_pose(pose)) + + if self.image_publisher is not None: + # Get an OpenCV image back + image = image.opencv() + # Annotate image with poses + for pose in poses: + draw(image, pose) + # Convert the annotated OpenDR image to ROS image message using bridge and publish it + self.image_publisher.publish(self.bridge.to_ros_image(Image(image), encoding='bgr8')) + + +def main(): + parser = argparse.ArgumentParser() + parser.add_argument("-i", "--input_rgb_image_topic", help="Topic name for input rgb image", + type=str, default="/usb_cam/image_raw") + parser.add_argument("-o", "--output_rgb_image_topic", help="Topic name for output annotated rgb image", + type=lambda value: value if value.lower() != "none" else None, + default="/opendr/image_pose_annotated") + parser.add_argument("-d", "--detections_topic", help="Topic name for detection messages", + type=lambda value: value if value.lower() != "none" else None, + default="/opendr/poses") + parser.add_argument("--device", help="Device to use, either \"cpu\" or \"cuda\", defaults to \"cuda\"", + type=str, default="cuda", choices=["cuda", "cpu"]) + parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False, + action="store_true") + args = parser.parse_args() + + try: + if args.device == "cuda" and torch.cuda.is_available(): + device = "cuda" + elif args.device == "cuda": + print("GPU not found. 
Using CPU instead.") + device = "cpu" + else: + print("Using CPU.") + device = "cpu" + except: + print("Using CPU.") + device = "cpu" + + if args.accelerate: + stride = True + stages = 0 + half_prec = True + else: + stride = False + stages = 2 + half_prec = False + + pose_estimator_node = PoseEstimationNode(device=device, + input_rgb_image_topic=args.input_rgb_image_topic, + output_rgb_image_topic=args.output_rgb_image_topic, + detections_topic=args.detections_topic, + num_refinement_stages=stages, use_stride=stride, half_precision=half_prec) + pose_estimator_node.listen() + + +if __name__ == '__main__': + main() diff --git a/projects/opendr_ws/src/opendr_perception/scripts/pose_estimation_node.py b/projects/opendr_ws/src/opendr_perception/scripts/pose_estimation_node.py index 7aa661f14c..13aecca307 100755 --- a/projects/opendr_ws/src/opendr_perception/scripts/pose_estimation_node.py +++ b/projects/opendr_ws/src/opendr_perception/scripts/pose_estimation_node.py @@ -99,7 +99,7 @@ def callback(self, data): # Publish detections in ROS message if self.pose_publisher is not None: for pose in poses: - # Convert OpenDR pose to ROS2 pose message using bridge and publish it + # Convert OpenDR pose to ROS pose message using bridge and publish it self.pose_publisher.publish(self.bridge.to_ros_pose(pose)) if self.image_publisher is not None: @@ -108,7 +108,7 @@ def callback(self, data): # Annotate image with poses for pose in poses: draw(image, pose) - # Convert the annotated OpenDR image to ROS2 image message using bridge and publish it + # Convert the annotated OpenDR image to ROS image message using bridge and publish it self.image_publisher.publish(self.bridge.to_ros_image(Image(image), encoding='bgr8')) diff --git a/projects/python/perception/pose_estimation/high_resolution_pose_estimation/README.md b/projects/python/perception/pose_estimation/high_resolution_pose_estimation/README.md new file mode 100644 index 0000000000..da48ec4088 --- /dev/null +++ b/projects/python/perception/pose_estimation/high_resolution_pose_estimation/README.md @@ -0,0 +1,11 @@ +# High Resolution Pose Estimation + +This folder contains sample applications that demonstrate various parts of the functionality provided by the High Resolution Pose Estimation algorithm provided by OpenDR. + +More specifically, the applications provided are: + +1. demos/inference_demo.py: A tool that demonstrates how to perform inference on a single high resolution image and then draw the detected poses. +2. demos/eval_demo.py: A tool that demonstrates how to perform evaluation using the High Resolution Pose Estimation algorithm on 720p, 1080p and 1440p datasets. +3. demos/benchmarking_demo.py: A simple benchmarking tool for measuring the performance of High Resolution Pose Estimation in various platforms. + + diff --git a/projects/python/perception/pose_estimation/high_resolution_pose_estimation/demos/benchmarking_demo.py b/projects/python/perception/pose_estimation/high_resolution_pose_estimation/demos/benchmarking_demo.py new file mode 100644 index 0000000000..33142e7c99 --- /dev/null +++ b/projects/python/perception/pose_estimation/high_resolution_pose_estimation/demos/benchmarking_demo.py @@ -0,0 +1,83 @@ +# Copyright 2020-2022 OpenDR European Project +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import cv2 +import time +from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner +import argparse +from os.path import join +from tqdm import tqdm +import numpy as np + + +if __name__ == '__main__': + parser = argparse.ArgumentParser() + parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda") + parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False, + action="store_true") + parser.add_argument("--height1", help="Base height of resizing in heatmap generation", default=360) + parser.add_argument("--height2", help="Base height of resizing in second inference", default=540) + + args = parser.parse_args() + + device, accelerate, base_height1, base_height2 = args.device, args.accelerate,\ + args.height1, args.height2 + + if device == 'cpu': + import torch + torch.set_flush_denormal(True) + torch.set_num_threads(8) + + if accelerate: + stride = True + stages = 0 + half_precision = True + else: + stride = False + stages = 2 + half_precision = False + + pose_estimator = HighResolutionPoseEstimationLearner(device=device, num_refinement_stages=stages, + mobilenet_use_stride=stride, half_precision=half_precision, + first_pass_height=int(base_height1), + second_pass_height=int(base_height2)) + pose_estimator.download(path=".", verbose=True) + pose_estimator.load("openpose_default") + + # Download one sample image + pose_estimator.download(path=".", mode="test_data") + image_path = join("temp", "dataset", "image", "000000000785_1080.jpg") + img = cv2.imread(image_path) + + fps_list = [] + print("Benchmarking...") + for i in tqdm(range(50)): + start_time = time.perf_counter() + # Perform inference + poses = pose_estimator.infer(img) + + end_time = time.perf_counter() + fps_list.append(1.0 / (end_time - start_time)) + print("Average FPS: %.2f" % (np.mean(fps_list))) + + # If pynvml is available, try to get memory stats for cuda + try: + if 'cuda' in device: + from pynvml import nvmlInit, nvmlDeviceGetMemoryInfo, nvmlDeviceGetHandleByIndex + + nvmlInit() + info = nvmlDeviceGetMemoryInfo(nvmlDeviceGetHandleByIndex(0)) + print("Memory allocated: %.2f MB " % (info.used / 1024 ** 2)) + except ImportError: + pass diff --git a/projects/python/perception/pose_estimation/high_resolution_pose_estimation/demos/eval_demo.py b/projects/python/perception/pose_estimation/high_resolution_pose_estimation/demos/eval_demo.py new file mode 100644 index 0000000000..9f71e0bd5c --- /dev/null +++ b/projects/python/perception/pose_estimation/high_resolution_pose_estimation/demos/eval_demo.py @@ -0,0 +1,62 @@ +# Copyright 2020-2022 OpenDR European Project +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner
+import argparse
+from os.path import join
+from opendr.engine.datasets import ExternalDataset
+import time
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda")
+    parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False,
+                        action="store_true")
+    parser.add_argument("--height1", help="Base height of resizing in first inference", default=360)
+    parser.add_argument("--height2", help="Base height of resizing in second inference", default=540)
+
+    args = parser.parse_args()
+
+    device, accelerate, base_height1, base_height2 = args.device, args.accelerate,\
+        args.height1, args.height2
+
+    if accelerate:
+        stride = True
+        stages = 0
+        half_precision = True
+    else:
+        stride = False
+        stages = 2
+        half_precision = False
+
+    pose_estimator = HighResolutionPoseEstimationLearner(device=device, num_refinement_stages=stages,
+                                                          mobilenet_use_stride=stride,
+                                                          half_precision=half_precision,
+                                                          first_pass_height=int(base_height1),
+                                                          second_pass_height=int(base_height2))
+    pose_estimator.download(path=".", verbose=True)
+    pose_estimator.load("openpose_default")
+
+    # Download a sample dataset
+    pose_estimator.download(path=".", mode="test_data")
+
+    eval_dataset = ExternalDataset(path=join("temp", "dataset"), dataset_type="COCO")
+
+    t0 = time.time()
+    results_dict = pose_estimator.eval(eval_dataset, use_subset=False, verbose=True, silent=True,
+                                       images_folder_name="image", annotations_filename="annotation.json")
+    t1 = time.time()
+    print("\n Evaluation time: ", t1 - t0, "seconds")
+    print("Evaluation results = ", results_dict)
diff --git a/projects/python/perception/pose_estimation/high_resolution_pose_estimation/demos/inference_demo.py b/projects/python/perception/pose_estimation/high_resolution_pose_estimation/demos/inference_demo.py
new file mode 100644
index 0000000000..f1b588d943
--- /dev/null
+++ b/projects/python/perception/pose_estimation/high_resolution_pose_estimation/demos/inference_demo.py
@@ -0,0 +1,65 @@
+# Copyright 2020-2022 OpenDR European Project
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+ +import cv2 +from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner +from opendr.perception.pose_estimation import draw +from opendr.engine.data import Image +import argparse +from os.path import join + + +if __name__ == '__main__': + parser = argparse.ArgumentParser() + parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda") + parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False, + action="store_true") + parser.add_argument("--height1", help="Base height of resizing in first inference", default=360) + parser.add_argument("--height2", help="Base height of resizing in second inference", default=540) + + args = parser.parse_args() + + device, accelerate, base_height1, base_height2 = args.device, args.accelerate,\ + args.height1, args.height2 + + if accelerate: + stride = True + stages = 0 + half_precision = True + else: + stride = False + stages = 2 + half_precision = False + + pose_estimator = HighResolutionPoseEstimationLearner(device=device, num_refinement_stages=stages, + mobilenet_use_stride=stride, half_precision=half_precision, + first_pass_height=int(base_height1), + second_pass_height=int(base_height2)) + pose_estimator.download(path=".", verbose=True) + pose_estimator.load("openpose_default") + + # Download one sample image + pose_estimator.download(path=".", mode="test_data") + + image_path = join("temp", "dataset", "image", "000000000785_1080.jpg") + + img = Image.open(image_path) + + poses = pose_estimator.infer(img) + + img_cv = img.opencv() + for pose in poses: + draw(img_cv, pose) + cv2.imshow('Results', img_cv) + cv2.waitKey(0) diff --git a/projects/python/perception/lightweight_open_pose/README.md b/projects/python/perception/pose_estimation/lightweight_open_pose/README.md similarity index 100% rename from projects/python/perception/lightweight_open_pose/README.md rename to projects/python/perception/pose_estimation/lightweight_open_pose/README.md diff --git a/projects/python/perception/lightweight_open_pose/demos/benchmarking_demo.py b/projects/python/perception/pose_estimation/lightweight_open_pose/demos/benchmarking_demo.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/demos/benchmarking_demo.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/demos/benchmarking_demo.py diff --git a/projects/python/perception/lightweight_open_pose/demos/eval_demo.py b/projects/python/perception/pose_estimation/lightweight_open_pose/demos/eval_demo.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/demos/eval_demo.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/demos/eval_demo.py diff --git a/projects/python/perception/lightweight_open_pose/demos/inference_demo.py b/projects/python/perception/pose_estimation/lightweight_open_pose/demos/inference_demo.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/demos/inference_demo.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/demos/inference_demo.py diff --git a/projects/python/perception/lightweight_open_pose/demos/inference_tutorial.ipynb b/projects/python/perception/pose_estimation/lightweight_open_pose/demos/inference_tutorial.ipynb similarity index 100% rename from projects/python/perception/lightweight_open_pose/demos/inference_tutorial.ipynb rename to 
projects/python/perception/pose_estimation/lightweight_open_pose/demos/inference_tutorial.ipynb diff --git a/projects/python/perception/lightweight_open_pose/demos/webcam_demo.py b/projects/python/perception/pose_estimation/lightweight_open_pose/demos/webcam_demo.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/demos/webcam_demo.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/demos/webcam_demo.py diff --git a/projects/python/perception/lightweight_open_pose/jetbot/README.md b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/README.md similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/README.md rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/README.md diff --git a/projects/python/perception/lightweight_open_pose/jetbot/evaluate.sh b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/evaluate.sh similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/evaluate.sh rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/evaluate.sh diff --git a/projects/python/perception/lightweight_open_pose/jetbot/fall_controller.py b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/fall_controller.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/fall_controller.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/fall_controller.py diff --git a/projects/python/perception/lightweight_open_pose/jetbot/flask.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/flask.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/flask.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/flask.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/jetbot.sh b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/jetbot.sh similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/jetbot.sh rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/jetbot.sh diff --git a/projects/python/perception/lightweight_open_pose/jetbot/jetbot_kill.sh b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/jetbot_kill.sh similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/jetbot_kill.sh rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/jetbot_kill.sh diff --git a/projects/python/perception/lightweight_open_pose/jetbot/requirements.txt b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/requirements.txt similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/requirements.txt rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/requirements.txt diff --git a/projects/python/perception/lightweight_open_pose/jetbot/results/.keep b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/results/.keep similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/results/.keep rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/results/.keep diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/protos/human_010_sit.wbo 
b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/protos/human_010_sit.wbo similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/protos/human_010_sit.wbo rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/protos/human_010_sit.wbo diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/protos/human_010_standing.wbo b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/protos/human_010_standing.wbo similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/protos/human_010_standing.wbo rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/protos/human_010_standing.wbo diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/pose_demo.wbt b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/pose_demo.wbt similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/pose_demo.wbt rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/pose_demo.wbt diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/brown_eye.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/brown_eye.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/brown_eye.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/brown_eye.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyebrow005.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyebrow005.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyebrow005.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyebrow005.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyebrow009.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyebrow009.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyebrow009.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyebrow009.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyelashes01.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyelashes01.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyelashes01.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyelashes01.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyelashes04.png 
b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyelashes04.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyelashes04.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/eyelashes04.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/female_elegantsuit01_diffuse.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/female_elegantsuit01_diffuse.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/female_elegantsuit01_diffuse.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/female_elegantsuit01_diffuse.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/female_elegantsuit01_normal.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/female_elegantsuit01_normal.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/female_elegantsuit01_normal.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/female_elegantsuit01_normal.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/keylthnormals.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/keylthnormals.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/keylthnormals.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/keylthnormals.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/keylthtex1.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/keylthtex1.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/keylthtex1.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/keylthtex1.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/male_casualsuit02_diffuse.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/male_casualsuit02_diffuse.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/male_casualsuit02_diffuse.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/male_casualsuit02_diffuse.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/male_casualsuit02_normal.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/male_casualsuit02_normal.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/male_casualsuit02_normal.png rename to 
projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/male_casualsuit02_normal.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/middleage_lightskinned_male_diffuse2.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/middleage_lightskinned_male_diffuse2.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/middleage_lightskinned_male_diffuse2.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/middleage_lightskinned_male_diffuse2.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/short01_diffuse.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/short01_diffuse.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/short01_diffuse.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/simulation_pose/worlds/textures/short01_diffuse.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/static/eu.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/static/eu.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/static/eu.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/static/eu.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/static/opendr.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/static/opendr.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/static/opendr.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/static/opendr.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/static/opendr_logo.png b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/static/opendr_logo.png similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/static/opendr_logo.png rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/static/opendr_logo.png diff --git a/projects/python/perception/lightweight_open_pose/jetbot/templates/index.html b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/templates/index.html similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/templates/index.html rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/templates/index.html diff --git a/projects/python/perception/lightweight_open_pose/jetbot/utils/__init__.py b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/__init__.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/utils/__init__.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/__init__.py diff --git a/projects/python/perception/lightweight_open_pose/jetbot/utils/active.py b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/active.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/utils/active.py rename to 
projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/active.py diff --git a/projects/python/perception/lightweight_open_pose/jetbot/utils/pid.py b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/pid.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/utils/pid.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/pid.py diff --git a/projects/python/perception/lightweight_open_pose/jetbot/utils/pose_controller.py b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/pose_controller.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/utils/pose_controller.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/pose_controller.py diff --git a/projects/python/perception/lightweight_open_pose/jetbot/utils/pose_utils.py b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/pose_utils.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/utils/pose_utils.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/pose_utils.py diff --git a/projects/python/perception/lightweight_open_pose/jetbot/utils/robot_interface.py b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/robot_interface.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/utils/robot_interface.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/robot_interface.py diff --git a/projects/python/perception/lightweight_open_pose/jetbot/utils/visualization.py b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/visualization.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/utils/visualization.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/visualization.py diff --git a/projects/python/perception/lightweight_open_pose/jetbot/utils/webots.py b/projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/webots.py similarity index 100% rename from projects/python/perception/lightweight_open_pose/jetbot/utils/webots.py rename to projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/webots.py diff --git a/src/opendr/perception/pose_estimation/__init__.py b/src/opendr/perception/pose_estimation/__init__.py index d0d7c3cc14..66e9725a6a 100644 --- a/src/opendr/perception/pose_estimation/__init__.py +++ b/src/opendr/perception/pose_estimation/__init__.py @@ -1,6 +1,8 @@ from opendr.perception.pose_estimation.lightweight_open_pose.lightweight_open_pose_learner import \ LightweightOpenPoseLearner +from opendr.perception.pose_estimation.hr_pose_estimation.high_resolution_learner import \ + HighResolutionPoseEstimationLearner from opendr.perception.pose_estimation.lightweight_open_pose.utilities import draw, get_bbox -__all__ = ['LightweightOpenPoseLearner', 'draw', 'get_bbox'] +__all__ = ['LightweightOpenPoseLearner', 'draw', 'get_bbox', 'HighResolutionPoseEstimationLearner'] diff --git a/src/opendr/perception/pose_estimation/hr_pose_estimation/README.md b/src/opendr/perception/pose_estimation/hr_pose_estimation/README.md new file mode 100644 index 0000000000..e9a89a99ca --- /dev/null +++ b/src/opendr/perception/pose_estimation/hr_pose_estimation/README.md @@ -0,0 
+1,11 @@ +# OpenDR Pose Estimation - High Resolution Open Pose + +This folder contains the OpenDR Learner and Target classes implemented for the high resolution pose estimation task. + + +# Sources + +Large parts of the Learner and utilities.py code are taken from [Daniil-Osokin/lightweight-human-pose-estimation.pytorch]( +https://github.com/Daniil-Osokin/lightweight-human-pose-estimation.pytorch) with modifications to make them compatible +with OpenDR specifications. + diff --git a/src/opendr/perception/pose_estimation/hr_pose_estimation/__init__.py b/src/opendr/perception/pose_estimation/hr_pose_estimation/__init__.py new file mode 100644 index 0000000000..e69de29bb2 diff --git a/src/opendr/perception/pose_estimation/hr_pose_estimation/high_resolution_learner.py b/src/opendr/perception/pose_estimation/hr_pose_estimation/high_resolution_learner.py new file mode 100644 index 0000000000..083af21255 --- /dev/null +++ b/src/opendr/perception/pose_estimation/hr_pose_estimation/high_resolution_learner.py @@ -0,0 +1,603 @@ +# Copyright 2020-2022 OpenDR European Project +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License.""" + +# General imports +import torchvision.transforms +import os +import cv2 +import torch +import json +import numpy as np +from tqdm import tqdm + +from urllib.request import urlretrieve + +# OpenDR engine imports +from opendr.engine.data import Image +from opendr.engine.target import Pose +from opendr.engine.constants import OPENDR_SERVER_URL + +# OpenDR lightweight_open_pose imports +from opendr.perception.pose_estimation.lightweight_open_pose.lightweight_open_pose_learner import \ + LightweightOpenPoseLearner +from opendr.perception.pose_estimation.lightweight_open_pose.algorithm.modules.load_state import \ + load_state +from opendr.perception.pose_estimation.lightweight_open_pose.algorithm.modules.keypoints import \ + extract_keypoints, group_keypoints +from opendr.perception.pose_estimation.lightweight_open_pose.algorithm.val import \ + convert_to_coco_format, run_coco_eval, normalize, pad_width + + +class HighResolutionPoseEstimationLearner(LightweightOpenPoseLearner): + + def __init__(self, device='cuda', backbone='mobilenet', + temp_path='temp', mobilenet_use_stride=True, mobilenetv2_width=1.0, shufflenet_groups=3, + num_refinement_stages=2, batches_per_iter=1, base_height=256, + first_pass_height=360, second_pass_height=540, percentage_arround_crop=0.3, heatmap_threshold=0.1, + experiment_name='default', num_workers=8, weights_only=True, output_name='detections.json', + multiscale=False, scales=None, visualize=False, + img_mean=np.array([128, 128, 128], np.float32), img_scale=np.float32(1 / 256), pad_value=(0, 0, 0), + half_precision=False): + + super(HighResolutionPoseEstimationLearner, self).__init__(device=device, backbone=backbone, temp_path=temp_path, + mobilenet_use_stride=mobilenet_use_stride, + mobilenetv2_width=mobilenetv2_width, + shufflenet_groups=shufflenet_groups, + num_refinement_stages=num_refinement_stages, + batches_per_iter=batches_per_iter, + 
base_height=base_height, + experiment_name=experiment_name, + num_workers=num_workers, weights_only=weights_only, + output_name=output_name, multiscale=multiscale, + scales=scales, visualize=visualize, img_mean=img_mean, + img_scale=img_scale, pad_value=pad_value, + half_precision=half_precision) + + self.first_pass_height = first_pass_height + self.second_pass_height = second_pass_height + self.perc = percentage_arround_crop + self.threshold = heatmap_threshold + + def __first_pass(self, img): + """ + This method generates a rough heatmap of the input image in order to determine the approximate location + of humans in the picture. + + :param img: input image for heatmap generation + :type img: numpy.ndarray + + :return: returns the Part Affinity Fields (PAFs) of the humans inside the image + :rtype: numpy.ndarray + """ + + if 'cuda' in self.device: + tensor_img = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).float().cuda() + tensor_img = tensor_img.cuda() + if self.half: + tensor_img = tensor_img.half() + else: + tensor_img = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).float().cpu() + + stages_output = self.model(tensor_img) + + stage2_pafs = stages_output[-1] + pafs = np.transpose(stage2_pafs.squeeze().cpu().data.numpy(), (1, 2, 0)) + return pafs + + def __second_pass(self, img, net_input_height_size, max_width, stride, upsample_ratio, + pad_value=(0, 0, 0), + img_mean=np.array([128, 128, 128], np.float32), img_scale=np.float32(1 / 256)): + """ + This method detects the keypoints and estimates the pose of humans using the cropped image from the + previous step (__first_pass). + + :param img: input (cropped) image for pose estimation + :type img: numpy.ndarray + :param net_input_height_size: the height to which the input image will be resized for inference + :type net_input_height_size: int + :param max_width: this parameter is the maximum width that the resized image should have. 
It is introduced to + avoid cropping images with abnormal ratios, e.g. (30, 800) + :type max_width: int + :param upsample_ratio: Defines the amount of upsampling to be performed on the heatmaps and PAFs when resizing, + defaults to 4 + :type upsample_ratio: int, optional + + :returns: the heatmap of human figures, the part affinity fields (pafs), the scale of the resized image compared + to the initial one and the pad around the image + :rtype: heatmap, pafs -> numpy.ndarray + scale -> float + pad -> list + """ + + height, width, _ = img.shape + scale = net_input_height_size / height + img_ratio = width / height + if img_ratio > 6: + scale = max_width / width + + scaled_img = cv2.resize(img, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_LINEAR) + scaled_img = normalize(scaled_img, img_mean, img_scale) + min_dims = [net_input_height_size, max(scaled_img.shape[1], net_input_height_size)] + padded_img, pad = pad_width(scaled_img, stride, pad_value, min_dims) + + if 'cuda' in self.device: + tensor_img = torch.from_numpy(padded_img).permute(2, 0, 1).unsqueeze(0).float().cuda() + tensor_img = tensor_img.cuda() + if self.half: + tensor_img = tensor_img.half() + else: + tensor_img = torch.from_numpy(padded_img).permute(2, 0, 1).unsqueeze(0).float().cpu() + + stages_output = self.model(tensor_img) + + stage2_heatmaps = stages_output[-2] + heatmaps = np.transpose(stage2_heatmaps.squeeze().cpu().data.numpy(), (1, 2, 0)) + heatmaps = heatmaps.astype(np.float32) + heatmaps = cv2.resize(heatmaps, (0, 0), fx=upsample_ratio, fy=upsample_ratio, interpolation=cv2.INTER_CUBIC) + + stage2_pafs = stages_output[-1] + pafs = np.transpose(stage2_pafs.squeeze().cpu().data.numpy(), (1, 2, 0)) + pafs = pafs.astype(np.float32) + pafs = cv2.resize(pafs, (0, 0), fx=upsample_ratio, fy=upsample_ratio, interpolation=cv2.INTER_CUBIC) + + return heatmaps, pafs, scale, pad + + @staticmethod + def __pooling(img, kernel): # Pooling on input image for dimension reduction + """This method applies a pooling filter on an input image in order to resize it to a fixed shape + + :param img: input image for resizing + :type img: engine.data.Image class object + :param kernel: the kernel size of the pooling filter + :type kernel: int + """ + pool_img = torchvision.transforms.ToTensor()(img) + pool_img = pool_img.unsqueeze(0) + pool_img = torch.nn.functional.avg_pool2d(pool_img, kernel) + pool_img = pool_img.squeeze(0).permute(1, 2, 0).cpu().float().numpy() + return pool_img + + def fit(self, dataset, val_dataset=None, logging_path='', logging_flush_secs=30, + silent=False, verbose=True, epochs=None, use_val_subset=True, val_subset_size=250, + images_folder_name="train2017", annotations_filename="person_keypoints_train2017.json", + val_images_folder_name="val2017", val_annotations_filename="person_keypoints_val2017.json"): + """This method is not used in this implementation.""" + + raise NotImplementedError + + def optimize(self, do_constant_folding=False): + """This method is not used in this implementation.""" + + raise NotImplementedError + + def reset(self): + """This method is not used in this implementation.""" + raise NotImplementedError + + def save(self, path, verbose=False): + """This method is not used in this implementation.""" + raise NotImplementedError + + def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_size=250, upsample_ratio=4, + images_folder_name="val2017", annotations_filename="person_keypoints_val2017.json"): + """ + This method is used to evaluate a trained model on an evaluation 
dataset. + + :param dataset: object that holds the evaluation dataset. + :type dataset: ExternalDataset class object or DatasetIterator class object + :param silent: if set to True, disables all printing of evaluation progress reports and other + information to STDOUT, defaults to 'False' + :type silent: bool, optional + :param verbose: if set to True, enables the maximum verbosity, defaults to 'True' + :type verbose: bool, optional + :param use_subset: If set to True, a subset of the validation dataset is created and used in + evaluation, defaults to 'True' + :type use_subset: bool, optional + :param subset_size: Controls the size of the validation subset, defaults to '250' + :type subset_size: int, optional + param upsample_ratio: Defines the amount of upsampling to be performed on the heatmaps and PAFs + when resizing,defaults to 4 + :type upsample_ratio: int, optional + :param images_folder_name: Folder name that contains the dataset images. This folder should be contained + in the dataset path provided. Note that this is a folder name, not a path, defaults to 'val2017' + :type images_folder_name: str, optional + :param annotations_filename: Filename of the annotations json file. This file should be contained in the + dataset path provided, defaults to 'person_keypoints_val2017.json' + :type annotations_filename: str, optional + + :returns: returns stats regarding evaluation + :rtype: dict + """ + + data = super(HighResolutionPoseEstimationLearner, # NOQA + self)._LightweightOpenPoseLearner__prepare_val_dataset(dataset, use_subset=use_subset, + subset_name="val_subset.json", + subset_size=subset_size, + images_folder_default_name=images_folder_name, + annotations_filename=annotations_filename, + verbose=verbose and not silent) + # Model initialization if needed + if self.model is None and self.checkpoint_load_iter != 0: + # No model loaded, initializing new + self.init_model() + # User set checkpoint_load_iter, so they want to load a checkpoint + # Try to find the checkpoint_load_iter checkpoint + checkpoint_name = "checkpoint_iter_" + str(self.checkpoint_load_iter) + ".pth" + checkpoints_folder = os.path.join(self.parent_dir, '{}_checkpoints'.format(self.experiment_name)) + full_path = os.path.join(checkpoints_folder, checkpoint_name) + try: + checkpoint = torch.load(full_path, map_location=torch.device(self.device)) + except FileNotFoundError as e: + e.strerror = "File " + checkpoint_name + " not found inside checkpoints_folder, " \ + "provided checkpoint_load_iter (" + \ + str(self.checkpoint_load_iter) + \ + ") doesn't correspond to a saved checkpoint.\nNo such file or directory." + raise e + if not silent and verbose: + print("Loading checkpoint:", full_path) + + load_state(self.model, checkpoint) + elif self.model is None: + raise AttributeError("self.model is None. 
Please load a model or set checkpoint_load_iter.") + + self.model = self.model.eval() # Change model state to evaluation + self.model.to(self.device) + if "cuda" in self.device: + self.model = self.model.to(self.device) + if self.half: + self.model.half() + + if self.multiscale: + self.scales = [0.5, 1.0, 1.5, 2.0] + + coco_result = [] + num_keypoints = Pose.num_kpts + + pbar_eval = None + if not silent: + pbar_desc = "Evaluation progress" + pbar_eval = tqdm(desc=pbar_desc, total=len(data), bar_format="{l_bar}%s{bar}{r_bar}" % '\x1b[38;5;231m') + + img_height = data[0]['img'].shape[0] + + if img_height in (1080, 1440): + offset = 200 + elif img_height == 720: + offset = 50 + else: + offset = 0 + + for sample in data: + file_name = sample['file_name'] + img = sample['img'] + h, w, _ = img.shape + max_width = w + kernel = int(h / self.first_pass_height) + if kernel > 0: + pool_img = self.__pooling(img, kernel) + + else: + pool_img = img + + # ------- Heatmap Generation ------- + avg_pafs = self.__first_pass(pool_img) + avg_pafs = avg_pafs.astype(np.float32) + + pafs_map = cv2.blur(avg_pafs, (5, 5)) + pafs_map[pafs_map < self.threshold] = 0 + + heatmap = pafs_map.sum(axis=2) + heatmap = heatmap * 100 + heatmap = heatmap.astype(np.uint8) + heatmap = cv2.blur(heatmap, (5, 5)) + + contours, hierarchy = cv2.findContours(heatmap, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE) + count = [] + coco_keypoints = [] + + if len(contours) > 0: + for x in contours: + count.append(x) + xdim = [] + ydim = [] + + for j in range(len(count)): # Loop for every human (every contour) + for i in range(len(count[j])): + xdim.append(count[j][i][0][0]) + ydim.append(count[j][i][0][1]) + + h, w, _ = pool_img.shape + xmin = int(np.floor(min(xdim))) * int((w / heatmap.shape[1])) * kernel + xmax = int(np.floor(max(xdim))) * int((w / heatmap.shape[1])) * kernel + ymin = int(np.floor(min(ydim))) * int((h / heatmap.shape[0])) * kernel + ymax = int(np.floor(max(ydim))) * int((h / heatmap.shape[0])) * kernel + + extra_pad_x = int(self.perc * (xmax - xmin)) # Adding an extra pad around cropped image + extra_pad_y = int(self.perc * (ymax - ymin)) + + if xmin - extra_pad_x > 0: + xmin = xmin - extra_pad_x + if xmax + extra_pad_x < img.shape[1]: + xmax = xmax + extra_pad_x + if ymin - extra_pad_y > 0: + ymin = ymin - extra_pad_y + if ymax + extra_pad_y < img.shape[0]: + ymax = ymax + extra_pad_y + + if (xmax - xmin) > 40 and (ymax - ymin) > 40: + crop_img = img[ymin:ymax, xmin:xmax] + else: + crop_img = img[offset:img.shape[0], offset:img.shape[1]] + + h, w, _ = crop_img.shape + + # ------- Second pass of the image, inference for pose estimation ------- + avg_heatmaps, avg_pafs, scale, pad = self.__second_pass(crop_img, self.second_pass_height, max_width, + self.stride, upsample_ratio) + total_keypoints_num = 0 + all_keypoints_by_type = [] + for kpt_idx in range(18): + total_keypoints_num += extract_keypoints(avg_heatmaps[:, :, kpt_idx], all_keypoints_by_type, + total_keypoints_num) + + pose_entries, all_keypoints = group_keypoints(all_keypoints_by_type, avg_pafs) + + for kpt_id in range(all_keypoints.shape[0]): + all_keypoints[kpt_id, 0] = (all_keypoints[kpt_id, 0] * self.stride / upsample_ratio - pad[1]) / scale + all_keypoints[kpt_id, 1] = (all_keypoints[kpt_id, 1] * self.stride / upsample_ratio - pad[0]) / scale + + for i in range(all_keypoints.shape[0]): + for j in range(all_keypoints.shape[1]): + if j == 0: # Adjust offset if needed for evaluation on our HR datasets + all_keypoints[i][j] = round((all_keypoints[i][j] + xmin) - 
offset) + if j == 1: # Adjust offset if needed for evaluation on our HR datasets + all_keypoints[i][j] = round((all_keypoints[i][j] + ymin) - offset) + + current_poses = [] + for n in range(len(pose_entries)): + if len(pose_entries[n]) == 0: + continue + pose_keypoints = np.ones((num_keypoints, 2), dtype=np.int32) * -1 + for kpt_id in range(num_keypoints): + if pose_entries[n][kpt_id] != -1.0: # keypoint was found + pose_keypoints[kpt_id, 0] = int(all_keypoints[int(pose_entries[n][kpt_id]), 0]) + pose_keypoints[kpt_id, 1] = int(all_keypoints[int(pose_entries[n][kpt_id]), 1]) + pose = Pose(pose_keypoints, pose_entries[n][18]) + current_poses.append(pose) + + coco_keypoints, scores = convert_to_coco_format(pose_entries, all_keypoints) + + image_id = int(file_name[0:file_name.rfind('.')]) + + for idx in range(len(coco_keypoints)): + coco_result.append({ + 'image_id': image_id, + 'category_id': 1, # person + 'keypoints': coco_keypoints[idx], + 'score': scores[idx] + }) + + if self.visualize: + for keypoints in coco_keypoints: + for idx in range(len(keypoints) // 3): + cv2.circle(img, (int(keypoints[idx * 3] + offset), int(keypoints[idx * 3 + 1]) + offset), + 3, (255, 0, 255), -1) + cv2.imshow('keypoints', img) + key = cv2.waitKey() + if key == 27: # esc + return + if not silent: + pbar_eval.update(1) + + with open(self.output_name, 'w') as f: + json.dump(coco_result, f, indent=4) + if len(coco_result) != 0: + if use_subset: + result = run_coco_eval(os.path.join(dataset.path, "val_subset.json"), + self.output_name, verbose=not silent) + else: + result = run_coco_eval(os.path.join(dataset.path, annotations_filename), + self.output_name, verbose=not silent) + return {"average_precision": result.stats[0:5], "average_recall": result.stats[5:]} + else: + if not silent and verbose: + print("Evaluation ended with no detections.") + return {"average_precision": [0.0 for _ in range(5)], "average_recall": [0.0 for _ in range(5)]} + + def infer(self, img, upsample_ratio=4, stride=8, track=True, smooth=True, + multiscale=False): + """ + This method is used to perform pose estimation on an image. + + :param img: image to run inference on + :rtype img: engine.data.Image class object + :param upsample_ratio: Defines the amount of upsampling to be performed on the heatmaps and PAFs + when resizing,defaults to 4 + :type upsample_ratio: int, optional + :param stride: Defines the stride value for creating a padded image + :type stride: int,optional + :param track: If True, infer propagates poses ids from previous frame results to track poses, + defaults to 'True' + :type track: bool, optional + :param smooth: If True, smoothing is performed on pose keypoints between frames, defaults to 'True' + :type smooth: bool, optional + :param multiscale: Specifies whether evaluation will run in the predefined multiple scales setup or not. + :type multiscale: bool,optional + + :return: Returns a list of engine.target.Pose objects, where each holds a pose, or returns an empty list + if no detections were made. 
+ :rtype: list of engine.target.Pose objects + """ + current_poses = [] + + offset = 0 + + num_keypoints = Pose.num_kpts + + if not isinstance(img, Image): + img = Image(img) + + # Bring image into the appropriate format for the implementation + img = img.convert(format='channels_last', channel_order='bgr') + + h, w, _ = img.shape + max_width = w + + kernel = int(h / self.first_pass_height) + if kernel > 0: + pool_img = self.__pooling(img, kernel) + else: + pool_img = img + + # ------- Heatmap Generation ------- + avg_pafs = self.__first_pass(pool_img) + avg_pafs = avg_pafs.astype(np.float32) + pafs_map = cv2.blur(avg_pafs, (5, 5)) + + pafs_map[pafs_map < self.threshold] = 0 + + heatmap = pafs_map.sum(axis=2) + heatmap = heatmap * 100 + heatmap = heatmap.astype(np.uint8) + heatmap = cv2.blur(heatmap, (5, 5)) + + contours, hierarchy = cv2.findContours(heatmap, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE) + count = [] + + if len(contours) > 0: + for x in contours: + count.append(x) + xdim = [] + ydim = [] + + for j in range(len(count)): # Loop for every human (every contour) + for i in range(len(count[j])): + xdim.append(count[j][i][0][0]) + ydim.append(count[j][i][0][1]) + + h, w, _ = pool_img.shape + xmin = int(np.floor(min(xdim))) * int((w / heatmap.shape[1])) * kernel + xmax = int(np.floor(max(xdim))) * int((w / heatmap.shape[1])) * kernel + ymin = int(np.floor(min(ydim))) * int((h / heatmap.shape[0])) * kernel + ymax = int(np.floor(max(ydim))) * int((h / heatmap.shape[0])) * kernel + + extra_pad_x = int(self.perc * (xmax - xmin)) # Adding an extra pad around cropped image + extra_pad_y = int(self.perc * (ymax - ymin)) + + if xmin - extra_pad_x > 0: + xmin = xmin - extra_pad_x + if xmax + extra_pad_x < img.shape[1]: + xmax = xmax + extra_pad_x + if ymin - extra_pad_y > 0: + ymin = ymin - extra_pad_y + if ymax + extra_pad_y < img.shape[0]: + ymax = ymax + extra_pad_y + + if (xmax - xmin) > 40 and (ymax - ymin) > 40: + crop_img = img[ymin:ymax, xmin:xmax] + else: + crop_img = img[offset:img.shape[0], offset:img.shape[1]] + + h, w, _ = crop_img.shape + + # ------- Second pass of the image, inference for pose estimation ------- + avg_heatmaps, avg_pafs, scale, pad = self.__second_pass(crop_img, self.second_pass_height, + max_width, stride, upsample_ratio) + + total_keypoints_num = 0 + all_keypoints_by_type = [] + for kpt_idx in range(18): + total_keypoints_num += extract_keypoints(avg_heatmaps[:, :, kpt_idx], all_keypoints_by_type, + total_keypoints_num) + + pose_entries, all_keypoints = group_keypoints(all_keypoints_by_type, avg_pafs) + + for kpt_id in range(all_keypoints.shape[0]): + all_keypoints[kpt_id, 0] = (all_keypoints[kpt_id, 0] * stride / upsample_ratio - pad[1]) / scale + all_keypoints[kpt_id, 1] = (all_keypoints[kpt_id, 1] * stride / upsample_ratio - pad[0]) / scale + + for i in range(all_keypoints.shape[0]): + for j in range(all_keypoints.shape[1]): + if j == 0: # Adjust offset if needed for evaluation on our HR datasets + all_keypoints[i][j] = round((all_keypoints[i][j] + xmin) - offset) + if j == 1: # Adjust offset if needed for evaluation on our HR datasets + all_keypoints[i][j] = round((all_keypoints[i][j] + ymin) - offset) + + current_poses = [] + for n in range(len(pose_entries)): + if len(pose_entries[n]) == 0: + continue + pose_keypoints = np.ones((num_keypoints, 2), dtype=np.int32) * -1 + for kpt_id in range(num_keypoints): + if pose_entries[n][kpt_id] != -1.0: # keypoint was found + pose_keypoints[kpt_id, 0] = int(all_keypoints[int(pose_entries[n][kpt_id]), 0]) + 
pose_keypoints[kpt_id, 1] = int(all_keypoints[int(pose_entries[n][kpt_id]), 1]) + pose = Pose(pose_keypoints, pose_entries[n][18]) + current_poses.append(pose) + + return current_poses + + def download(self, path=None, mode="pretrained", verbose=False, + url=OPENDR_SERVER_URL + "perception/pose_estimation/lightweight_open_pose/", + image_resolution=1080): + """ + Download utility for various Lightweight Open Pose components. Downloads files depending on mode and + saves them in the path provided. It supports downloading: + 1) the default mobilenet pretrained model + 2) mobilenet, mobilenetv2 and shufflenet weights needed for training + 3) a test dataset with a single COCO image and its annotation + :param path: Local path to save the files, defaults to self.temp_path if None + :type path: str, path, optional + :param mode: What file to download, can be one of "pretrained", "weights", "test_data", defaults to "pretrained" + :type mode: str, optional + :param verbose: Whether to print messages in the console, defaults to False + :type verbose: bool, optional + :param url: URL of the FTP server, defaults to OpenDR FTP URL + :type url: str, optional + :param image_resolution: Resolution of the test images to download + :type image_resolution: int, optional + """ + valid_modes = ["weights", "pretrained", "test_data"] + if mode not in valid_modes: + raise UserWarning("mode parameter not valid:", mode, ", file should be one of:", valid_modes) + + if path is None: + path = self.temp_path + + if not os.path.exists(path): + os.makedirs(path) + + if mode in ("pretrained", "weights"): + super(HighResolutionPoseEstimationLearner, self).download(path=path, mode=mode, verbose=verbose, url=url) + elif mode == "test_data": + if verbose: + print("Downloading test data...") + if not os.path.exists(os.path.join(self.temp_path, "dataset")): + os.makedirs(os.path.join(self.temp_path, "dataset")) + if not os.path.exists(os.path.join(self.temp_path, "dataset", "image")): + os.makedirs(os.path.join(self.temp_path, "dataset", "image")) + # Path for high resolution data + url = OPENDR_SERVER_URL + "perception/pose_estimation/high_resolution_pose_estimation/" + # Download annotation file + file_url = os.path.join(url, "dataset", "annotation.json") + urlretrieve(file_url, os.path.join(self.temp_path, "dataset", "annotation.json")) + # Download test image + if image_resolution in (1080, 1440): + file_url = os.path.join(url, "dataset", "image", "000000000785_" + str(image_resolution) + ".jpg") + urlretrieve(file_url, os.path.join(self.temp_path, "dataset", "image", "000000000785_1080.jpg")) + else: + raise UserWarning("There are no data for this image resolution (only 1080 and 1440 are supported).") + + if verbose: + print("Test data download complete.") diff --git a/tests/sources/tools/perception/pose_estimation/high_resolution_pose_estimation/__init__.py b/tests/sources/tools/perception/pose_estimation/high_resolution_pose_estimation/__init__.py new file mode 100644 index 0000000000..e69de29bb2 diff --git a/tests/sources/tools/perception/pose_estimation/high_resolution_pose_estimation/test_high_resolution_pose_estimation.py b/tests/sources/tools/perception/pose_estimation/high_resolution_pose_estimation/test_high_resolution_pose_estimation.py new file mode 100644 index 0000000000..aa0a27005d --- /dev/null +++ b/tests/sources/tools/perception/pose_estimation/high_resolution_pose_estimation/test_high_resolution_pose_estimation.py @@ -0,0 +1,95 @@ +# Copyright 2020-2022 OpenDR European Project +# +# Licensed under the 
Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import unittest +import shutil +from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner + +from opendr.engine.datasets import ExternalDataset +from opendr.engine.data import Image +import warnings +import os + +device = os.getenv('TEST_DEVICE') if os.getenv('TEST_DEVICE') else 'cpu' + + +def rmfile(path): + try: + os.remove(path) + except OSError as e: + print("Error: %s - %s." % (e.filename, e.strerror)) + + +def rmdir(_dir): + try: + shutil.rmtree(_dir) + except OSError as e: + print("Error: %s - %s." % (e.filename, e.strerror)) + + +class TestLightweightOpenPoseLearner(unittest.TestCase): + + @classmethod + def setUpClass(cls): + print("\n\n**********************************\nTEST High Resolution Pose Estimation Learner\n" + "**********************************") + + cls.temp_dir = os.path.join(".", "tests", "sources", "tools", "perception", + "pose_estimation", "high_resolution_pose_estimation", "hr_pose_estim_temp") + cls.pose_estimator = HighResolutionPoseEstimationLearner(device=device, temp_path=cls.temp_dir, num_workers=1) + + # Download all required files for testing + cls.pose_estimator.download(mode="pretrained") + cls.pose_estimator.download(mode="test_data") + + @classmethod + def tearDownClass(cls): + # Clean up downloaded files + rmdir(os.path.join(cls.temp_dir, "openpose_default")) + rmdir(os.path.join(cls.temp_dir, "dataset")) + + rmdir(os.path.join(cls.temp_dir)) + + def test_eval(self): + # Test eval will issue resource warnings due to some files left open in pycoco tools, + # as well as a deprecation warning due to a cast of a float to integer (hopefully they will be fixed in a future + # version) + warnings.simplefilter("ignore", ResourceWarning) + warnings.simplefilter("ignore", DeprecationWarning) + + eval_dataset = ExternalDataset(path=os.path.join(self.temp_dir, "dataset"), dataset_type="COCO") + self.pose_estimator.load(os.path.join(self.temp_dir, "openpose_default")) + results_dict = self.pose_estimator.eval(eval_dataset, use_subset=False, verbose=True, silent=True, + images_folder_name="image", annotations_filename="annotation.json") + self.assertNotEqual(len(results_dict['average_precision']), 0, + msg="Eval results dictionary contains empty list.") + self.assertNotEqual(len(results_dict['average_recall']), 0, + msg="Eval results dictionary contains empty list.") + # Cleanup + rmfile(os.path.join(self.temp_dir, "detections.json")) + warnings.simplefilter("default", ResourceWarning) + warnings.simplefilter("default", DeprecationWarning) + + def test_infer(self): + self.pose_estimator.model = None + self.pose_estimator.load(os.path.join(self.temp_dir, "openpose_default")) + + img = Image.open(os.path.join(self.temp_dir, "dataset", "image", "000000000785_1080.jpg")) + # Default pretrained mobilenet model detects 18 keypoints on img with id 785 + self.assertGreater(len(self.pose_estimator.infer(img)[0].data), 0, + msg="Returned pose must have non-zero number of keypoints.") + + +if __name__ == "__main__": + 
unittest.main() diff --git a/tests/test_license.py b/tests/test_license.py index 4de6bf2e66..641e8faadc 100755 --- a/tests/test_license.py +++ b/tests/test_license.py @@ -112,7 +112,7 @@ def setUp(self): skippedFilePaths = [ 'src/opendr/perception/activity_recognition/datasets/utils/decoder.py', - 'projects/python/perception/lightweight_open_pose/jetbot/utils/pid.py', + 'projects/python/perception/pose_estimation/lightweight_open_pose/jetbot/utils/pid.py', 'src/opendr/perception/compressive_learning/multilinear_compressive_learning/algorithm/trainers.py', 'src/opendr/perception/object_detection_2d/retinaface/Makefile', 'src/opendr/perception/multimodal_human_centric/audiovisual_emotion_learner/algorithm/efficientface_modulator.py',
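
For quick reference, a minimal inference sketch using the learner introduced above; the temp path, image file names and the chosen pass heights are illustrative assumptions rather than values fixed by this patch:

```python
# Illustrative inference sketch for HighResolutionPoseEstimationLearner.
# Paths, file names and pass heights below are assumptions, not part of the patch.
import cv2

from opendr.engine.data import Image
from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner, draw

# first_pass_height controls the resolution of the rough heatmap pass,
# second_pass_height the resolution of the pose-estimation pass on the cropped region.
learner = HighResolutionPoseEstimationLearner(device="cuda", backbone="mobilenet",
                                              temp_path="./hr_temp",
                                              first_pass_height=360,
                                              second_pass_height=540)
learner.download(mode="pretrained")            # fetch the default mobilenet weights
learner.load("./hr_temp/openpose_default")     # folder created by download() under temp_path

img = Image.open("high_res_input.jpg")         # any high resolution image (illustrative name)
poses = learner.infer(img)                     # first pass -> crop around detections -> second pass

# Overlay the detected keypoints on the original image.
frame = img.convert(format='channels_last', channel_order='bgr')
for pose in poses:
    draw(frame, pose)
cv2.imwrite("high_res_output.jpg", frame)
```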
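
Similarly, a minimal evaluation sketch against the single-image test dataset fetched by `download(mode="test_data")`, following the folder layout used in the unit test above (the temp directory name is an assumption):

```python
# Illustrative evaluation sketch; mirrors the unit test added in this patch.
import os

from opendr.engine.datasets import ExternalDataset
from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner

temp_dir = "./hr_temp"                                     # illustrative temp path
learner = HighResolutionPoseEstimationLearner(device="cpu", temp_path=temp_dir, num_workers=1)
learner.download(mode="pretrained")                        # model weights
learner.download(mode="test_data")                         # one 1080p COCO image + annotation.json
learner.load(os.path.join(temp_dir, "openpose_default"))

eval_dataset = ExternalDataset(path=os.path.join(temp_dir, "dataset"), dataset_type="COCO")
results = learner.eval(eval_dataset, use_subset=False, silent=True, verbose=False,
                       images_folder_name="image", annotations_filename="annotation.json")
print("AP:", results["average_precision"], "AR:", results["average_recall"])
```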