🚀 The feature
(This was originally pitched in this long feedback thread. It was recommended that I open a separate issue.)
I am proposing to a) make Datapoint part of the public API so that 3rd parties can create custom features, and b) provide a dispatch mechanism by which a 3rd-party feature can make a transform (either 1st-party or 3rd-party) dispatch to the feature's own implementation of that transform.
E.g., this would enable me to write a Polygon subclass of Datapoint in my own library, which would be able to implement its own affine transformation that both torchvision.transforms.affine and torchvision.transforms.RandomAffine (and the other standard transforms) dispatch to when they operate on a Polygon tensor.
There is already some degree of dispatching going on among some transforms. E.g., for affine and RandomAffine, any Datapoint subclass can simply implement its own affine method. The affine function checks isinstance(inp, Datapoint) and then calls inp.affine(...). And RandomAffine._transform simply calls affine under the hood, which performs the aforementioned dispatching.
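The method-based dispatch described above can be sketched in plain Python. This is a toy illustration, not torchvision's code: the Datapoint, Polygon, affine, and RandomAffine below are simplified stand-ins with assumed signatures (a rotation angle in degrees plus a translation), and the vertex-list representation of Polygon is hypothetical.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Tuple


class Datapoint:
    """Toy stand-in for the Datapoint base class (not the real torchvision class)."""

    def affine(self, angle: float, translate: Tuple[float, float]) -> "Datapoint":
        raise NotImplementedError(f"{type(self).__name__} does not implement affine")


@dataclass
class Polygon(Datapoint):
    """Hypothetical 3rd-party feature: a list of (x, y) vertices."""

    vertices: List[Tuple[float, float]]

    def affine(self, angle: float, translate: Tuple[float, float]) -> "Polygon":
        # Rotate each vertex about the origin by `angle` degrees, then translate.
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        tx, ty = translate
        return Polygon(
            [(cos_t * x - sin_t * y + tx, sin_t * x + cos_t * y + ty) for x, y in self.vertices]
        )


def affine(inp, angle: float, translate: Tuple[float, float] = (0.0, 0.0)):
    """Functional entry point: defer to the input's own method if it is a Datapoint."""
    if isinstance(inp, Datapoint):
        return inp.affine(angle, translate)
    raise TypeError(f"unsupported input type: {type(inp).__name__}")


class RandomAffine:
    """Toy transform: samples an angle, then calls `affine`, which does the dispatching."""

    def __init__(self, degrees: float, seed: int = 0):
        self.degrees = degrees
        self._rng = random.Random(seed)

    def __call__(self, inp):
        angle = self._rng.uniform(-self.degrees, self.degrees)
        return affine(inp, angle)


# Both the functional and the transform reach Polygon.affine without knowing about Polygon.
polygon = Polygon([(1.0, 0.0), (0.0, 1.0)])
rotated = affine(polygon, 90.0)
jittered = RandomAffine(degrees=30.0)(polygon)
```

In this sketch neither affine nor RandomAffine imports or mentions Polygon, which is the "drop-in" property the proposal is after. It only works for operations that the base class exposes as methods.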
Motivation, pitch
The crown jewel of torchvision.transforms v2 is its added support for features like bounding boxes and segmentation masks. Presently, the development of new features and transforms is gated on the development efforts of the intrepid torchvision team. It would be great to enable users to implement their own features (e.g. Polygon or ImageTrack) and for torchvision's transforms (or 3rd-party transforms) to be able to dispatch on those features. This would reduce the burden on the torchvision team: users could develop bespoke features that have "drop-in" compatibility with standard torchvision pipelines, without the team needing to officially support those features.
Alternatives
No response
Additional context
This proposal had some initial feedback here: #6753 (comment)
and the torchvision team provided positive feedback on this here: #6753 (comment)
@rsokl Thanks for taking the time to clean up the proposal on this separate thread. We agree that the Datapoint needs to become public. After discussions with @pmeier and @vfdev-5 we decided to keep it non-public for a bit longer so that we are able to make any necessary changes based on user feedback.
May I propose the following? Start working on the proposed Polygon extension on your side as if Datapoint were public. Comment on any obstacle you face with the developer API, so that we can make the necessary changes to assist your work. Finally, post the location of your repo here, so that if we make a breaking change, we can reach out and assist you. This will allow you to progress without worrying about breakages, and it will allow us to build confidence in the developer API, which will speed up the process of making the base class public.
One final note. From reading your proposal, I suspect that we might need to make some additional changes to ensure that most dispatchers are accessible via the methods of the Datapoint object. This will allow users to dispatch for custom objects.
There is some movement here. We are implementing a new and public way to register kernels for custom datapoints in #7747. We should have this for the next release.
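To give a feel for what registering kernels for custom datapoints could look like, here is a toy sketch in plain Python. It is not the API from #7747: register_toy_kernel, _KERNELS, _dispatch, horizontal_flip, and Polygon are all hypothetical names made up for this illustration, and the real mechanism may differ in both naming and behavior.

```python
from typing import Callable, Dict, List, Tuple

# Maps (functional, input type) -> kernel. Hypothetical registry, for illustration only.
_KERNELS: Dict[Tuple[Callable, type], Callable] = {}


def register_toy_kernel(functional: Callable, datapoint_cls: type) -> Callable:
    """Decorator that registers a kernel as the implementation of `functional` for `datapoint_cls`."""

    def decorator(kernel: Callable) -> Callable:
        _KERNELS[(functional, datapoint_cls)] = kernel
        return kernel

    return decorator


def _dispatch(functional: Callable, inp, *args, **kwargs):
    # Walk the MRO so that subclasses inherit kernels registered for a parent type.
    for cls in type(inp).__mro__:
        kernel = _KERNELS.get((functional, cls))
        if kernel is not None:
            return kernel(inp, *args, **kwargs)
    raise TypeError(f"no kernel registered for {type(inp).__name__}")


def horizontal_flip(inp, width: float):
    """Toy functional: it knows nothing about concrete types and only looks up a kernel."""
    return _dispatch(horizontal_flip, inp, width)


class Polygon:
    """Hypothetical 3rd-party datapoint: a list of (x, y) vertices."""

    def __init__(self, vertices: List[Tuple[float, float]]):
        self.vertices = list(vertices)


@register_toy_kernel(horizontal_flip, Polygon)
def _horizontal_flip_polygon(poly: Polygon, width: float) -> Polygon:
    # Mirror every vertex across the vertical axis of an image of the given width.
    return Polygon([(width - x, y) for x, y in poly.vertices])


flipped = horizontal_flip(Polygon([(1.0, 2.0), (3.0, 4.0)]), width=10.0)
```

Compared with method-based dispatch, a registry like this lets a 3rd-party library attach an implementation for any functional without that functional having to exist as a method on the base class.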