Added the KeyPoints TVTensor #8817
Description
Adds and integrates the KeyPoints TVTensor (requested in #8728), which represents points (or vertices) attached to a picture.
Details
Inner workings
The KeyPoints represent a tensor of shape `[..., 2]`, which allows arbitrarily complex structures to be represented (polygons, skeletons, or even SAM-like point prompts). Whenever `__new__` is called, the shape of the source tensor is checked.
Tensors of shape `[2]` are reshaped to `[1, 2]`, similarly to BoundingBoxes.
KeyPoints, like BoundingBoxes, carry around a `canvas_size` attribute which represents the size of a batch-typical picture.
Kernels
Kernels for all operations should be supported (if I missed one, I will fix it). They are essentially adaptations of the BoundingBoxes code.
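To make the shape handling and the "adapted from BoundingBoxes" idea concrete, here is a minimal sketch in plain `torch`. It is not the code from this PR: `as_keypoints` and `crop_points` are hypothetical names, and the sketch assumes points are stored as `(x, y)` while the canvas size is `(height, width)`.

```python
import torch


def as_keypoints(data):
    """Normalize input to a float tensor of shape [..., 2] (hypothetical helper)."""
    points = torch.as_tensor(data, dtype=torch.float32)
    if points.ndim == 1:
        # A single point of shape [2] becomes [1, 2], as described for KeyPoints.
        points = points.unsqueeze(0)
    if points.ndim == 0 or points.shape[-1] != 2:
        raise ValueError(f"Expected shape [..., 2], got {list(points.shape)}")
    return points


def crop_points(points, top, left, height, width):
    """Hypothetical crop kernel: shift points into the frame of the crop.

    Assumes (x, y) point order; returns the points and the new canvas size
    as (height, width).
    """
    offset = torch.tensor([left, top], dtype=points.dtype)
    return points - offset, (height, width)


single = as_keypoints([10.0, 20.0])  # shape [1, 2]
cropped, new_canvas_size = crop_points(
    as_keypoints([[10.0, 20.0], [30.0, 40.0]]),
    top=5, left=10, height=50, width=60,
)
```

Compared with a box kernel, the point version only needs the translation; whether out-of-canvas points should be clamped or kept is a design choice this sketch leaves open.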
Particularities
Maintainers may notice that a `TYPE_CHECKING` section was added that differs significantly from the implementation.
I marked this section as `EVIL` since it is a trick, but it cannot introduce vulnerabilities: `TYPE_CHECKING` is always `False` at runtime, and only ever `True` for the linter.
For the last few months, I have had issues with `BoundingBoxes` initialization in my weird `PyLance` + `Mypy` mix: no overload is ever detected as matching it. By "re-defining" it, I got it solved on my machine.
Converters
Added a converter, `convert_box_to_points`, in `torchvision.transforms.v2.functional._meta` and exported in `torchvision.transforms.v2`, which (as its name states) converts a `[N, 4]` `BoundingBoxes` TVTensor into a `[N, 4, 2]` KeyPoints TVTensor.
Other changes
For the purposes of my custom type checking, I also changed `tv_tensors.wrap` to use 3.8-compatible generics.
Since `wrap` only ever outputs the same TVTensor subclass as its `like` argument, I used a `TypeVar` bound to `TVTensor` to ensure that type-checking passes no matter the checker used.
Methodology
Discussion
Since many converters of BoundingBoxes are based on changing the bboxes to polygons and then operating on the points, I believe it is possible to lower the line count and increase reliability, for a negligible computational latency cost, by using the KeyPoints kernels and converting with the method described in the details above.
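As a rough illustration of that idea, the sketch below round-trips boxes through corner points in plain `torch`. It is an assumption-laden stand-in, not the PR's `convert_box_to_points`: the function names are hypothetical, boxes are assumed to be in XYXY format, and the corner order is arbitrary.

```python
import torch


def boxes_to_corner_points(boxes):
    """[N, 4] XYXY boxes -> [N, 4, 2] corner points (corner order is an assumption)."""
    x1, y1, x2, y2 = boxes.unbind(-1)
    return torch.stack(
        [
            torch.stack([x1, y1], dim=-1),  # top-left
            torch.stack([x2, y1], dim=-1),  # top-right
            torch.stack([x2, y2], dim=-1),  # bottom-right
            torch.stack([x1, y2], dim=-1),  # bottom-left
        ],
        dim=-2,
    )


def corner_points_to_boxes(points):
    """[N, 4, 2] points -> [N, 4] axis-aligned XYXY boxes enclosing them."""
    mins = points.amin(dim=-2)
    maxs = points.amax(dim=-2)
    return torch.cat([mins, maxs], dim=-1)


def transpose_points(points):
    """Stand-in point kernel: swap x and y, as an image transpose would."""
    return points.flip(-1)


boxes = torch.tensor([[1.0, 2.0, 5.0, 4.0]])
points = boxes_to_corner_points(boxes)  # shape [1, 4, 2]
new_boxes = corner_points_to_boxes(transpose_points(points))
```

Under this pattern a box kernel reduces to "convert, apply the point kernel, take the enclosing box", which is where the line-count saving would come from; the extra stack and min/max reductions are the latency cost mentioned above.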