-
Notifications
You must be signed in to change notification settings - Fork 7.1k
Add utility to draw keypoints #4216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Here is the first dry run! This is super dirty and possibly needs few more iterations 😄 I'm trying to replicate something similar to this colab notebook. Keypoint outputs are slightly tricky! Note that the outputs of keypoint detection models are
Just like utiltities. This utlity too draws all the possible keypoints on the image. Here are some of the current outputs. (Indian road ❤️ ) (Some Image taken from Pascal VOC examples) I still need to implement plotting labels and some functionality to join the points (plot lines with connections) I have kept the label option, since some users might be interested in labeling the keypoints, (E.g. by the keypoint numbers or by names). Small code to reproduce this output and illustrate the API
A review at this point would be great! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for missing this PR @oke-aditya !
I've made a few comments, let me know what you think.
torchvision/utils.py
Outdated
def draw_keypoints( | ||
image: torch.Tensor, | ||
keypoints: torch.Tensor, | ||
labels: Optional[List[str]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of providing labels
(which, for the COCO dataset that we consider we only have one category), I would instead provide a connectivity
argument, which dictates which points should be linked against which other point.
This can be an optional argument, and if provided will connect the different keypoints together (and could thus replace the connect
boolean argument as well.
Here is an implementation that I had for it for the "person" class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @fmassa . Thanks a lot for review 😃
The idea behind labels is if person wants to explicitly mark the keypoint number.
Particularly useful in counting the keypoints, or validating performance of all the keypoints. (Maybe a keypoint number X is performing very poorly? User would need to label these to come to know.
Also it might be particularly useful in face detection etc models.
Maybe users want to do something similar to above.
About the connectivity point, I think you are right. connectivity can be Optional[Tuple[Tuple[int, int]]]
denoting which two points should be connected. So that we can link them up if provided.
I think the utility should be slightly generic and not limited to COCO. Let me know what you think
The person implementation is very nice. (maskrcnn-benchmark is probably one of the best libraries)
I think I can post images side by side how maskrcnn implementation and the current one (when finalized a bit more) looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for keypoints (given that they are fairly small) labelling them might not be very readable, while using a color-code might be easier to spot. This begs the question if the color should represent the instance or the keypoint-id in the instance.
My 2 cents is that we can pass a color tensor that is of size (num_instances, num_keypoints, 3)
, and that internally we can broadcast to the full shape if the user only passed (num_keypoints, 3)
or (num_instances, 1, 3)
or (3)
.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we do not use labels too for drawing segmentation masks. And a good color combination can be sufficient for visualizing keypoints.
We don't use tensors for colors, in other utlitites so maybe we shouldn't use here.
I suggest colors to be either "str" (One color for all keypoints) List[str] specifying color for each the keypoints ids.
None, so that we generate random colors from palette.
Specifying color per instance might be hard and users would need to dynamically adjust this utlity, as number of detected instances would differ. We don't use num_instances in any other utility and would prefer to stick with keypoint ids.
torchvision/utils.py
Outdated
|
||
Args: | ||
image (Tensor): Tensor of shape (3, H, W) and dtype uint8. | ||
keypoints (Tensor): Tensor of shape (num_keypoints, K, 3) the K keypoints location for each of the N instances, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of num_keypoints
, I would maybe call it num_instances
, as it can be confusing compared to K
(which is the number of keypoints per instance)
Also, the 3
here is a bit weird, as we don't use it for now. Maybe I would just check that the tensor has at least 2 dimensions, and the code ignores everything after the 2nd dimension. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it is num_instances
. I will change that.
3 is bit weird, but that makes the API simple, users can directly pass (X, Y, visibility) the format of torchvision keypoints.
We can do visibility based thresholding, to use the third dimension. (this might restrict the API)
Or simply discard as you suggested and leave the above part to end-user.
Let me know your thoughts. I will refactor accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the 3, do you know how other datasets are represented (like for face keypoints)?
My understanding is that the 3 is specific to COCO, and that other datasets only represent the keypoints as a set of (x, y)
coordinates.
Given that we are not handling the visibility in the code (yet), and that the 3rd dimension might be understood as if we handled 3d keypoints (which is not the case), I would say to just assume we have at least 2 coordinates, and skip the rest.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree about visibility. Plus the logic of thresholding / choosing to plot if visible should be left to user.
So we can skip the third dimension.
Bonjour @fmassa ! I hope I have implemented the basic functionality as per requirements. There are two small issues
Well color palette is always the issue with visualization utils and it's back here too! For keypoints colors. Should we For line colors. Should we
As I mentioned earlier, it is useful to keep labels, but again plotting lot of labels can lead to messy image.
We can leave it to user, or have an argument to threshold it. Here are few outputs! The colors are awful Would love to hear thoughts from @datumbox and @NicolasHug too 😃 |
Code to reproduce.
|
@fmassa can you please have a look. I will have more bandwidth over next few weeks. (I can contribute bit more 😅) |
Hey @NicolasHug @datumbox, @fmassa maybe this is missed. 😃 We can either go ahead or continue on this after 0.11 release? |
@oke-aditya I've asked Francisco to pick this up as he is more familiar with the keypoints model. He is currently OOO but he will get back to you. |
No problem 👍 |
Apologies for excessive pings, but I think @fmassa is back 😃 |
Don't know why GitHub has messed up reviewing this PR :( Messages going haywire. Here is the current API on small dummy black image and keypoints.
I think that the code is fine and does handle the cases. (I didn't use numpy and broadcast, it was easier to accommodate this) I will post one run on COCO sample images, too tomorrow. Let me know your thoughts @fmassa P.S. Sadly I'm out of sync :( Usually I get free time during weekend and ping all the maintainers on your off days. 😞 |
Hi! I gave another run and got the outputs for COCO models. Here are the outputs. Connecting the skeleton with connectivity. I joined them randomly nothing special Picture Credits to MS COCO images. The output looks bit bad without line_color, I think we shouldn't keep it white. cc @fmassa |
@fmassa can you please have a review. :) |
So the above commit this is a working code with @fmassa's version (including tests, CI was green). But I still have one small problem. The problem is handling excess colors ! Both of our utils currently E.g. We have 2 boxes but create 3 colors. This works fine.
Why excess colors is Nice !!This is a very useful feature as users can define a big color palette globally and this would automatically accommodate the colors. Back to keypointsThe previous non-numpy version handled this, and there are no troubles. With the current numpy proposed solution.
E.g. if we supply colors for 2 instances and there are just 1.
I believe if someone wants to use this model, he would just set a threshold and not know number of instances detected. So specifying exact number of colors wouldn't work. I believe that we should land this PR soon as @datumbox suggested. So I will go forward with colors being just "str" (which is handled by both logics and works anyways) Also there are a few other points. I will open a ticket including this :) . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oke-aditya LGTM, thanks!
Concerning the extra colours, I think there are a few options to detect and handle it so that the broadcasting can still work. At any case, I think we can discuss these changes on an issue dedicated to extending the functionality of the util.
We should add a good example with a proper picture on the gallery to showcase how to call the keypoint model and visualize the predictions. Since the utility requires preprocessing the model output (cleaning up, removing extra columns etc), we should provide a working example that users can follow.
I'll let @fmassa have the final word on this as he has been involved on the review since the beginning.
Sure Vasilis ! I do have the code for gallery example ready which works with the keypoint_rcnn model. I will send a follow-up PRa and open up a ticket for additional functionality once this is merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let's get this merged as is (i.e., without the handling of multiple colors).
We can discuss about multiple colors in an issue.
Thanks for all your work @oke-aditya ! |
Hey @fmassa! You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py |
Summary: * fix * Outline Keypoints API * Add utility * make it work :) * Fix optional type * Add connectivity, fmassa's advice 😃 * Minor code improvement * small fix * fix implementation * Add tests * Fix tests * Update colors * Fix bug and test more robustly * Add a comment, merge stuff * Fix fmt * Support single str for merging * Remove unnecessary vars. Reviewed By: datumbox Differential Revision: D32298967 fbshipit-source-id: 596a04baa1f04f14246381e4815aeb9dbcccb1b6 Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
* fix * Outline Keypoints API * Add utility * make it work :) * Fix optional type * Add connectivity, fmassa's advice 😃 * Minor code improvement * small fix * fix implementation * Add tests * Fix tests * Update colors * Fix bug and test more robustly * Add a comment, merge stuff * Fix fmt * Support single str for merging * Remove unnecessary vars. Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Closes #3365
Finally! (everything went well in my life!) I hope I can complete the implementation soon!
I will post the outputs of completed prototype here!
- [ ] Adds example to gallery.Next PR