-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove categories metadata from (OneHot)Label datapoint #7171
Conversation
@@ -167,16 +167,16 @@ class FiveCrop(Transform): | |||
""" | |||
Example: | |||
>>> class BatchMultiCrop(transforms.Transform): | |||
... def forward(self, sample: Tuple[Tuple[Union[datapoints.Image, datapoints.Video], ...], datapoints.Label]): | |||
... def forward(self, sample: Tuple[Tuple[Union[datapoints.Image, datapoints.Video], ...], datapoints.LabelWithCategories]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, auto refactoring was a little to eager. Need to revert this.
output = one_hot(inpt.as_subclass(torch.Tensor), num_classes=num_categories) | ||
return datapoints.OneHotLabel(output, categories=inpt.categories) | ||
return datapoints.OneHotLabel(output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does change the functionality, but not by much. Previously, users could use this transform and it would do the right thing by default if the label had some categories
attached to it. Now they have to explicitly pass it to the constructor.
Note that the -1
default is somewhat weird. It is a sentinel to let the torch kernel know to "figure it out". This means it checks all values available in the input and infers the number of categories from them. Obviously, if the input doesn't contain the smallest and largest value, this will give false results.
We (@NicolasHug, @vfdev-5, and I) had a longer offline discussion about this and decided to not include the Closing this PR as we don't need to remove the |
We had an offline discussion about this and the consensus is that right now there is no critical need for the labels to have a
categories
field. It is nice to have for visualization and if there is user demand for it later we can re-add it. Still, as is it is a liability since we haven't done any perf tests with it and there might be other implications from sharing the same list of categories between all label instances coming from a dataset.This PR removes the attribute and all functionality that comes with it. Diff is so large because I ported the "old" behavior over to
torchvision.prototype.datasets
since they use the attribute quite a bit.cc @bjuncek