-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add KeyPoint feature for prototype datasets #5326
base: main
Are you sure you want to change the base?
Conversation
💊 CI failures summary and remediationsAs of commit 4bab7c3 (more details on the Dr. CI page): 💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Please report bugs/suggestions to the (internal) Dr. CI Users group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
I've made some comments, let me know what you think
@@ -129,6 +159,7 @@ def _decode_captions_ann(self, anns: List[Dict[str, Any]], image_meta: Dict[str, | |||
_ANN_DECODERS = OrderedDict( | |||
[ | |||
("instances", _decode_instances_anns), | |||
("person_keypoints", _decode_person_keypoints_anns), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unrelated to this PR, but I would maybe revisit if we want to have dataset to have a varying return type depending on arguments passed to it.
It might be better to have different flavors of the dataset to be different dataset classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is a design choice that we should address. cc @NicolasHug
In response to #5326 (comment) I've added Here is an example with from torchvision.prototype import datasets, transforms
from torchvision.utils import draw_keypoints
from torchvision.transforms.functional import to_pil_image
def draw_non_right_keypoints(sample, file):
image = sample["image"]
keypoints = sample["keypoints"]
non_right_keypoints = keypoints[[not descr.startswith("right") for descr in keypoints.descriptions], :]
annotated_image = draw_keypoints(
image,
non_right_keypoints.unsqueeze(0),
radius=int(2e-2 * min(sample["image"].shape[-2:])),
colors="red",
)
to_pil_image(annotated_image).save(file)
dataset = datasets.load("celeba")
sample = next(iter(dataset))
draw_non_right_keypoints(sample, "before.jpg")
transform = transforms.HorizontalFlip()
transformed_sample = transform(sample)
draw_non_right_keypoints(transformed_sample, "after.jpg")
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposal for the symmetry LGTM, thanks!
I only have one more comment regarding the resize transform, otherwise good to merge
|
||
class Resize(Transform): | ||
NO_OP_FEATURE_TYPES = {Label} | ||
NO_OP_FEATURE_TYPES = {Label, Keypoint} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok if we leave this as a TODO, but we should change the coordinates of the keypoints following the rescaling factor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in general all geometric transforms need to support Keypoint
's. We can probably share a lot of functionality with the bounding box kernels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I got some questions below
("vertical", description, description.replace("left", "right")) | ||
for description in descriptions | ||
if description.startswith("left") | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you clarify the type of the resulting symmetries
variable?
Below they Keypoint
expects Sequence[Tuple[KeypointSymmetry, Tuple[int, int]]]
. It's non obvious to me how the current code has the specific type.
As discussed offline with @datumbox, we'll put this PR on hold until we have full support for bounding boxes and segmentation masks. |
We currently have two different APIs for keypoints:
draw_keypoints
requires the keypoints to be in shape(num_instances, num_keypoints_per_instance, 3)
where the last channel contains the x and y coordinate.KeyKeypointRCNN
's such askeypointrcnn_resnet50_fpn
require the keypoints to be in shape(num_instances, num_keypoints_per_instance, 3)
where the last channel contains the x and y coordinate and the visibility.We currently also have two datasets that provide keypoints: COCO and CelebA. Of those COCO provides a visibility flag while CelebA doesn't. Skimming through other datasets, it seems the visibility is not a regular part of the annotations. Thus, for the
KeyPoint
feature that this PR adds, I went for only the x and y coordinates.Below you can find example implementations that call both APIs for COCO and CelebA:
COCO
Example for CelebA