
Add utility to draw keypoints #4216


Merged
merged 35 commits into pytorch:main on Nov 9, 2021

Conversation

oke-aditya
Contributor

@oke-aditya oke-aditya commented Jul 28, 2021

Closes #3365

Finally! (everything went well in my life!) I hope I can complete the implementation soon!
I will post the outputs of completed prototype here!

  • Adds code
  • Adds docs
    - [ ] Adds example to gallery. Next PR
  • Adds tests

@oke-aditya
Contributor Author

oke-aditya commented Jul 28, 2021

Here is the first dry run! This is super dirty and will probably need a few more iterations 😄

I'm trying to replicate something similar to this colab notebook.

Keypoint outputs are slightly tricky! Note that the outputs of keypoint detection models are

keypoints (FloatTensor[N, K, 3]): the locations of the predicted keypoints, in [x, y, v] format.
N -> Number of detected instances
K -> Number of keypoints per instance
3 -> X, Y, Visibility

Just like the other utilities, this utility too draws all the possible keypoints on the image.
Thresholding needs to be done by the user.
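As a concrete sketch of that user-side thresholding (the `scores` key is part of the keypoint model's output dict; the 0.75 cutoff is an arbitrary choice for illustration):

```python
import torch

def filter_by_score(output, score_threshold=0.75):
    # Keep keypoints only for instances whose detection score clears the
    # threshold; the surviving keypoints can then be passed to the drawing util.
    keep = output["scores"] > score_threshold
    return output["keypoints"][keep]

# Tiny fake model output: 2 instances, 3 keypoints each, in [x, y, v] format.
output = {
    "scores": torch.tensor([0.9, 0.3]),
    "keypoints": torch.rand(2, 3, 3),
}
kpts = filter_by_score(output)
print(kpts.shape)  # torch.Size([1, 3, 3])
```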

Here are some of the current outputs.

(Image 1: Indian road ❤️ )

(Image 2, example3: taken from Pascal VOC examples)

I still need to implement plotting labels and some functionality to join the points (plotting lines between connected keypoints).
I'm not sure whether colors should be provided by the user or generated from a color palette.

I have kept the label option, since some users might be interested in labeling the keypoints (e.g. by keypoint numbers or names).

Small code to reproduce this output and illustrate the API

import PIL.Image
import torchvision
from torchvision.transforms import functional as F
from torchvision.utils import draw_keypoints

model = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=True)
model = model.eval()
IMAGE_PATH = 'demo_im3.jpg'

image = PIL.Image.open(IMAGE_PATH)
image_tensor = F.to_tensor(image)
output = model([image_tensor])[0]
kpts = output['keypoints']

image_tensor2 = torchvision.io.read_image(IMAGE_PATH)
res = draw_keypoints(image_tensor2, kpts)
show(res)  # show() is the plotting helper shared in a later comment

A review at this point would be great!

cc @datumbox @NicolasHug

@oke-aditya oke-aditya marked this pull request as ready for review July 29, 2021 08:40
@datumbox datumbox requested a review from fmassa September 6, 2021 10:15
Member

@fmassa fmassa left a comment


Sorry for missing this PR @oke-aditya !

I've made a few comments, let me know what you think.

def draw_keypoints(
image: torch.Tensor,
keypoints: torch.Tensor,
labels: Optional[List[str]] = None,
Member

Instead of providing labels (for the COCO dataset that we consider, we only have one category), I would instead provide a connectivity argument, which dictates which points should be linked to which other points.

This can be an optional argument, and if provided it will connect the different keypoints together (and could thus replace the connect boolean argument as well).

Here is an implementation that I had for it for the "person" class
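For reference, a typical skeleton for the 17 COCO person keypoints could be expressed in exactly that connectivity form (these particular pairs are a common convention, not necessarily the linked implementation):

```python
# Indices follow the usual COCO person keypoint order:
# 0 nose, 1-2 eyes, 3-4 ears, 5-6 shoulders, 7-8 elbows,
# 9-10 wrists, 11-12 hips, 13-14 knees, 15-16 ankles.
COCO_PERSON_SKELETON = (
    (0, 1), (0, 2), (1, 3), (2, 4),            # head
    (5, 6), (5, 7), (7, 9), (6, 8), (8, 10),   # shoulders and arms
    (5, 11), (6, 12), (11, 12),                # torso
    (11, 13), (13, 15), (12, 14), (14, 16),    # legs
)
print(len(COCO_PERSON_SKELETON))  # 16
```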

Contributor Author

Hi @fmassa . Thanks a lot for review 😃

The idea behind labels is that a user may want to explicitly mark the keypoint number.
This is particularly useful for counting keypoints, or validating the performance of each keypoint. (Maybe keypoint number X is performing very poorly? The user would need to label the keypoints to find out.)

It might also be particularly useful in face detection models, etc.

image

Maybe users want to do something similar to above.

About the connectivity point, I think you are right. connectivity can be Optional[Tuple[Tuple[int, int]]], with each inner tuple denoting two points that should be connected, so that we can link them up if provided.

I think the utility should be slightly generic and not limited to COCO. Let me know what you think

The person implementation is very nice. (maskrcnn-benchmark is probably one of the best libraries)
I can post side-by-side images of how the maskrcnn-benchmark implementation and the current one (once finalized a bit more) look.

Member

I think that for keypoints (given that they are fairly small), labelling them might not be very readable, while a color code might be easier to spot. This raises the question of whether the color should represent the instance or the keypoint-id within the instance.

My 2 cents is that we can pass a color tensor that is of size (num_instances, num_keypoints, 3), and that internally we can broadcast to the full shape if the user only passed (num_keypoints, 3) or (num_instances, 1, 3) or (3).

Thoughts?
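A quick sketch of the broadcasting described above, using NumPy for illustration (this is not the code in the PR; shape names follow the comment):

```python
import numpy as np

def broadcast_colors(colors, num_instances, num_keypoints):
    # Expand a (3,), (num_keypoints, 3) or (num_instances, 1, 3) color array
    # to the full (num_instances, num_keypoints, 3) shape via broadcasting.
    colors = np.asarray(colors)
    return np.broadcast_to(colors, (num_instances, num_keypoints, 3))

per_keypoint = np.zeros((17, 3), dtype=np.uint8)  # one color per keypoint id
full = broadcast_colors(per_keypoint, 2, 17)
print(full.shape)  # (2, 17, 3)
```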

Contributor Author

Yes, we don't use labels for drawing segmentation masks either, and a good color combination can be sufficient for visualizing keypoints.

We don't use tensors for colors in the other utilities, so maybe we shouldn't use them here.
I suggest colors be either a str (one color for all keypoints), a List[str] specifying a color for each keypoint id, or None, in which case we generate random colors from a palette.

Specifying a color per instance might be hard, and users would need to adjust this dynamically, as the number of detected instances differs. We don't use num_instances in any other utility, and I would prefer to stick with keypoint ids.


Args:
image (Tensor): Tensor of shape (3, H, W) and dtype uint8.
keypoints (Tensor): Tensor of shape (num_keypoints, K, 3) the K keypoints location for each of the N instances,
Member

instead of num_keypoints, I would maybe call it num_instances, as it can be confusing compared to K (which is the number of keypoints per instance)

Also, the 3 here is a bit weird, as we don't use it for now. Maybe I would just check that the tensor has at least 2 dimensions, and the code ignores everything after the 2nd dimension. Thoughts?

Contributor Author

Sure, it is num_instances. I will change that.

The 3 is a bit weird, but it keeps the API simple: users can directly pass (X, Y, visibility), the format of torchvision keypoints.

We could do visibility-based thresholding to make use of the third dimension (though this might restrict the API),
or simply discard it as you suggested and leave that part to the end user.

Let me know your thoughts. I will refactor accordingly.

Member

About the 3, do you know how other datasets are represented (like for face keypoints)?

My understanding is that the 3 is specific to COCO, and that other datasets only represent the keypoints as a set of (x, y) coordinates.

Given that we are not handling the visibility in the code (yet), and that the 3rd dimension might be understood as if we handled 3d keypoints (which is not the case), I would say to just assume we have at least 2 coordinates, and skip the rest.

Thoughts?

Contributor Author

I agree about visibility. Plus, the logic of thresholding / choosing to plot only visible keypoints should be left to the user.
So we can skip the third dimension.
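Concretely, skipping the third dimension amounts to a slice:

```python
import torch

keypoints = torch.rand(2, 17, 3)  # (num_instances, K, [x, y, visibility])
xy = keypoints[..., :2]           # keep only x and y; drop the third column
print(xy.shape)  # torch.Size([2, 17, 2])
```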

@oke-aditya oke-aditya requested a review from fmassa September 8, 2021 18:20
@oke-aditya oke-aditya changed the title [WIP] Add utility to draw keypoints Add utility to draw keypoints Sep 12, 2021
@oke-aditya
Contributor Author

Bonjour @fmassa !

I hope I have implemented the basic functionality as per requirements.
The keypoints can be connected by an optional connectivity argument of type Tuple[Tuple[int, int]], with each tuple denoting the
keypoint ids to be connected.

There are two small issues

  1. Colors:

Well, the color palette is always the issue with visualization utils, and it's back here too!
We need colors for the keypoints as well as for the lines joining them (if connectivity is provided).

For keypoint colors, should we:
  • generate unique colors for every keypoint by default, or
  • allow users to pass one string for all keypoints, or a color per keypoint id?

For line colors, should we:
  • randomly generate unique colors for every line,
  • allow users to pass one string for all lines, or a color per line, or
  • reuse the colors that were passed for the keypoints?

  2. Labels:

As I mentioned earlier, it is useful to keep labels, but plotting a lot of labels can lead to a messy image.

  3. Visibility:

We can leave it to user, or have an argument to threshold it.
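If left to the user, visibility filtering could be a one-liner before calling the util; a sketch (the 0.5 threshold is arbitrary):

```python
import torch

# One instance with three keypoints in [x, y, visibility] format;
# the middle keypoint is marked as not visible.
keypoints = torch.tensor([
    [[5.0, 5.0, 1.0], [10.0, 10.0, 0.0], [50.0, 50.0, 1.0]],
])
visible = keypoints[..., 2] > 0.5           # boolean mask per keypoint
pts_to_draw = keypoints[0][visible[0]][:, :2]
print(pts_to_draw.tolist())  # [[5.0, 5.0], [50.0, 50.0]]
```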

Here are a few outputs! The colors are awful.

Figure_1

Figure_2

Figure_3

Figure_4

Figure_5

Figure_6

Would love to hear thoughts from @datumbox and @NicolasHug too 😃

@oke-aditya
Contributor Author

Code to reproduce.

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from torchvision.io import read_image
from torchvision.models.detection import keypointrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor, to_pil_image
from torchvision.utils import draw_keypoints


def show(imgs):
    if not isinstance(imgs, list):
        imgs = [imgs]
    fig, axs = plt.subplots(ncols=len(imgs), squeeze=False)
    for i, img in enumerate(imgs):
        img = img.detach()
        img = to_pil_image(img)
        axs[0, i].imshow(np.asarray(img))
        axs[0, i].set(xticklabels=[], yticklabels=[], xticks=[], yticks=[])
    plt.show()


model = keypointrcnn_resnet50_fpn(pretrained=True)
model = model.eval()

IMAGE_PATH = 'demo_im6.jpg'
image_tensor2 = read_image(IMAGE_PATH)

image = Image.open(IMAGE_PATH)
image_tensor = to_tensor(image)
output = model([image_tensor])[0]

kpts = output['keypoints']
res = draw_keypoints(image_tensor2, kpts, colors="red", connectivity=((0, 1), (1, 2), (2, 3), (3, 4)))
show(res)

res = draw_keypoints(image_tensor2, kpts, colors="red")
show(res)

@oke-aditya
Contributor Author

@fmassa can you please have a look? I will have more bandwidth over the next few weeks. (I can contribute a bit more 😅)

@oke-aditya
Contributor Author

Hey @NicolasHug @datumbox @fmassa, maybe this was missed. 😃 Should we go ahead now, or continue on this after the 0.11 release?

@datumbox
Contributor

@oke-aditya I've asked Francisco to pick this up as he is more familiar with the keypoints model. He is currently OOO but he will get back to you.

@oke-aditya
Contributor Author

No problem 👍

@oke-aditya
Contributor Author

Apologies for excessive pings, but I think @fmassa is back 😃

@oke-aditya
Contributor Author

Don't know why GitHub has messed up reviewing this PR :( Messages going haywire.

Here is the current API on small dummy black image and keypoints.


import torch
from torchvision.utils import draw_keypoints

keypoints = torch.tensor(
    [
        [[5, 5, 1.0], [10, 10, 1.0], [50, 50, 1.0]],
        [[20, 20, 1.0], [30, 30, 1.0], [40, 40, 1.0]],
    ],
    dtype=torch.float,
)
img = torch.full((3, 100, 100), 0, dtype=torch.uint8)
# Case 1 single str for all colors
res = draw_keypoints(img, keypoints, colors="red")
show(res)

kp1

# Case 2 single str and connected keypoints
res = draw_keypoints(img, keypoints, colors="red", connectivity=((0, 1), ))
show(res)

kp2

# Case 3 List of strings, one color per instance (applied to all its keypoints).
res = draw_keypoints(img, keypoints, colors=["red", "violet"])
show(res)

kp3

# Case 4 Connecting the above
res = draw_keypoints(img, keypoints, colors=["red", "violet"], connectivity=((0, 1), ))
show(res)

kp4

# Case 5 Nested List containing colors for every keypoint id of each instance.
res = draw_keypoints(img, keypoints, colors=[["pink", "yellow", "brown"], ["orange", "white", "blue"]], connectivity=((0, 1), ))
show(res)

kp5

I think the code is fine and handles these cases. (I didn't use numpy broadcasting; it was easier to accommodate this without it.)

I will post a run on COCO sample images tomorrow, too.

Let me know your thoughts @fmassa

P.S. Sadly I'm out of sync :( Usually I get free time during weekend and ping all the maintainers on your off days. 😞

@oke-aditya
Contributor Author

oke-aditya commented Nov 2, 2021

Hi!

I gave another run and got the outputs for COCO models.
I have a script ready which can become the gallery example! I think it's best to leave that for a new PR, as it will need more thorough review.

Here are the outputs.

boy_kpt1

Connecting the skeleton with connectivity.

boy_kpt2

dog_kpt

I joined them randomly, nothing special.

dog_kpt2

Picture Credits to MS COCO images.

The output looks a bit bad without line_color; I think we shouldn't keep it white.

cc @fmassa

@oke-aditya
Contributor Author

@fmassa can you please review? :)

@oke-aditya
Contributor Author

oke-aditya commented Nov 8, 2021

Hi @datumbox @fmassa

As of the above commit, this is working code with @fmassa's version (including tests; CI was green).

But I still have one small problem.
(Sorry if it seems bad to keep finding problems in others' code, but I really don't want this to go unnoticed.)

The problem is handling excess colors !

Both of our current utils, draw_bounding_boxes and draw_segmentation_masks, handle excess colors!
This is especially good when we don't know the number of boxes / masks in advance. One can define excess colors, and the utility ignores the extra ones.

E.g. We have 2 boxes but create 3 colors. This works fine.

boxes = torch.tensor([[50, 50, 100, 200], [210, 150, 350, 430]], dtype=torch.float)
colors = ["blue", "yellow", "green"]
result = draw_bounding_boxes(dog1_int, boxes, colors=colors, width=5)

Why excess colors are nice!

This is a very useful feature, as users can define a big color palette globally and the utility automatically accommodates it.
The only constraint is that the number of colors shouldn't be less than the number of boxes / masks (i.e. len(colors) >= len(boxes)).
One might not know how many boxes / masks a trained model detects, and so the colors wouldn't cause trouble; we would just ignore the extra ones.

Back to keypoints

The previous non-numpy version handled this without trouble.

With the currently proposed numpy solution, we cannot achieve this, due to the broadcasting: if we supply excess colors, it results in an error.
E.g. if we supply 18 colors instead of 17:

ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (1,18)  and requested shape (1,17)

E.g. if we supply colors for 2 instances when there is just 1:

ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,1)  and requested shape (1,17)

I believe that someone using this model would just set a threshold and not know the number of instances detected, so specifying the exact number of colors wouldn't work.

I believe we should land this PR soon, as @datumbox suggested. So I will go forward with colors being just a "str" (which is handled by both logics and works anyway). There are also a few other points; I will open a ticket including this :)
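One hedged way the excess-colors case could later be reconciled with broadcasting: truncate the palette before broadcasting, so extra colors are ignored the way draw_bounding_boxes ignores them. This is purely a sketch, not what the PR does:

```python
import numpy as np

def fit_colors(colors, num_keypoints):
    # Ignore extra colors instead of letting np.broadcast_to fail when
    # more colors than keypoints are supplied; too few colors still errors.
    colors = np.asarray(colors)
    if colors.ndim == 2 and colors.shape[0] > num_keypoints:
        colors = colors[:num_keypoints]
    return np.broadcast_to(colors, (num_keypoints, colors.shape[-1]))

palette = np.arange(18 * 3).reshape(18, 3)  # 18 colors for 17 keypoints
out = fit_colors(palette, 17)
print(out.shape)  # (17, 3)
```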

@oke-aditya oke-aditya requested a review from datumbox November 8, 2021 13:40
@datumbox datumbox removed their request for review November 9, 2021 11:29
Contributor

@datumbox datumbox left a comment


@oke-aditya LGTM, thanks!

Concerning the extra colours, I think there are a few options to detect and handle them so that the broadcasting can still work. In any case, I think we can discuss these changes in an issue dedicated to extending the functionality of the util.

We should add a good example with a proper picture on the gallery to showcase how to call the keypoint model and visualize the predictions. Since the utility requires preprocessing the model output (cleaning up, removing extra columns etc), we should provide a working example that users can follow.

I'll let @fmassa have the final word on this as he has been involved on the review since the beginning.

@oke-aditya
Contributor Author

Sure Vasilis! I do have the code for the gallery example ready, and it works with the keypoint_rcnn model. I will send a follow-up PR and open a ticket for additional functionality once this is merged.

Member

@fmassa fmassa left a comment


LGTM, let's get this merged as is (i.e., without the handling of multiple colors).

We can discuss about multiple colors in an issue.

@fmassa fmassa merged commit a4ca717 into pytorch:main Nov 9, 2021
@fmassa
Member

fmassa commented Nov 9, 2021

Thanks for all your work @oke-aditya !

@github-actions

github-actions bot commented Nov 9, 2021

Hey @fmassa!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@oke-aditya oke-aditya deleted the add_kypt branch November 9, 2021 13:53
facebook-github-bot pushed a commit that referenced this pull request Nov 15, 2021
Summary:
* fix

* Outline Keypoints API

* Add utility

* make it work :)

* Fix optional type

* Add connectivity, fmassa's advice 😃

* Minor code improvement

* small fix

* fix implementation

* Add tests

* Fix tests

* Update colors

* Fix bug and test more robustly

* Add a comment, merge stuff

* Fix fmt

* Support single str for merging

* Remove unnecessary vars.

Reviewed By: datumbox

Differential Revision: D32298967

fbshipit-source-id: 596a04baa1f04f14246381e4815aeb9dbcccb1b6

Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
cyyever pushed a commit to cyyever/vision that referenced this pull request Nov 16, 2021

Successfully merging this pull request may close these issues.

Add utilities to plot Keypoints
4 participants