Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove categories metadata from (OneHot)Label datapoint #7171

Closed
wants to merge 1 commit into from

Conversation

pmeier
Copy link
Collaborator

@pmeier pmeier commented Feb 3, 2023

We had an offline discussion about this and the consensus is that right now there is no critical need for the labels to have a categories field. It is nice to have for visualization and if there is user demand for it later we can re-add it. Still, as is it is a liability since we haven't done any perf tests with it and there might be other implications from sharing the same list of categories between all label instances coming from a dataset.

This PR removes the attribute and all functionality that comes with it. Diff is so large because I ported the "old" behavior over to torchvision.prototype.datasets since they use the attribute quite a bit.

cc @bjuncek

@@ -167,16 +167,16 @@ class FiveCrop(Transform):
"""
Example:
>>> class BatchMultiCrop(transforms.Transform):
... def forward(self, sample: Tuple[Tuple[Union[datapoints.Image, datapoints.Video], ...], datapoints.Label]):
... def forward(self, sample: Tuple[Tuple[Union[datapoints.Image, datapoints.Video], ...], datapoints.LabelWithCategories]):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, auto refactoring was a little to eager. Need to revert this.

output = one_hot(inpt.as_subclass(torch.Tensor), num_classes=num_categories)
return datapoints.OneHotLabel(output, categories=inpt.categories)
return datapoints.OneHotLabel(output)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does change the functionality, but not by much. Previously, users could use this transform and it would do the right thing by default if the label had some categories attached to it. Now they have to explicitly pass it to the constructor.

Note that the -1 default is somewhat weird. It is a sentinel to let the torch kernel know to "figure it out". This means it checks all values available in the input and infers the number of categories from them. Obviously, if the input doesn't contain the smallest and largest value, this will give false results.

@pmeier
Copy link
Collaborator Author

pmeier commented Feb 6, 2023

We (@NicolasHug, @vfdev-5, and I) had a longer offline discussion about this and decided to not include the datapoints.Label and datapoints.OneHotLabel class in the first iteration of the transforms v2 release at all. I'll post a longer explanation for this and all other changes that we made the last couple of weeks compared to the initial announcement in #6753 as soon as we are sure that we can release the rest.

Closing this PR as we don't need to remove the categories any longer, if we don't release the label datapoints in the first place.

@pmeier pmeier closed this Feb 6, 2023
@pmeier pmeier deleted the rip-label-categories branch February 9, 2023 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants