-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rename features._Feature to datapoints._Datapoint #7002
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pmeier LGTM, thanks! I had a quick look, even though I understand you made the change with the IDE, just to be safe. Let's wait for green tests before merging.
d675ff4 also added quite a few extra lines, since we now either need two lines if we import |
I also proposed VisionModality in the past: #5045 (comment) Datapoint IMO mostly makes think of input data points, e.g. input images or some abstract input data Also, I saw that you put a bunch of transforms as instance method on _Datapoint class. I wonder what is the guidance in selecting the subset (especially for solarize/equalize which seemed quite arbitrary and not very special compared to tons of other image transforms). In theory, this list would only be growing. And if not very needed, maybe some other dispatch mechanism is better? |
Yup, and we acknowledged so in #6753 (comment). The list above is far from exhaustive of the names that we had in our pool. In this particular case, "VisionModality" has the "vision" restriction explained above while "modality" is also being too disconnected from what it would be representing.
Right now, the rule of thumb is that all dispatchers should have an associated method on the class, but of course there are some exceptions for now. We agreed to flesh out this part in later iterations of the API.
Not sure what you are saying here? Yes, adding new kernels or dispatchers also entails new methods on the datapoint subclasses. This architecture leaves us the option for users to inject custom datapoint subclasses into our API if they just implement these methods as proposed in #6753 (comment). We agreed that this is a really neat feature to have and will go for it in the next iterations: #6753 (comment) In general, please post you feedback in the dedicated thread: #6753. Otherwise it gets lost easily. |
IMO solarize/equalize and others are mostly related only to images, not to boxes or masks. so it's strange to see them on the base class _datapoint |
I guess I missed it because I"ve unsubscribed from that thread some long time ago as I'm not believer in this proposed object-oriented design, and transforms treating automatically (?) all sorts of "modalities". I have a feeling that it will lead to over-engineered codes and sometimes surprising and hard to work-around behaviors about what exactly gets transformed and how to prevent it (sometimes we tag along with a input example some masks/boxes that we do not want to be transformed in any way as they correspond to ground truth and are for some visualization). I will be happy to be wrong about this though. In the meantime, I only hope that the purely-functional, "plumbing" functions that do not require any object-oriented wrappers My feedback on this is not new, so nothing new to add :( |
So far the feedback from others that have tried the new API has been positive on all accounts. Please have a look at the feedback thread and if you have concerns, please post them there.
The new API has three layers, with the lowest level being the functional kernels. The are namespaced by attaching the supported type to the name, i.e. The dispatchers and transforms itself continue to work with plain tensors, which will be treated as images or videos where applicable. However, neither of them work with plain tensors for any other type like bounding boxes. |
Great! These are all I need: image kernels, various box/mask format conversion/aug kernels :) Hope the parts of transforms that are sampling the random params are also available as staticmethods.
We will see how it plays out in the future. I hope it's successful! My personal experience reading codes using detectron2/mmdetection have been so far that it's overcomplicated: too many wrappers and layers of indirection and fragile outside of the standard task, but of course it's also a matter of personal taste and of specific task at hand. |
Summary: * rename features._Feature to datapoints.Datapoint * _Datapoint to Datapoint * move is_simple_tensor to transforms.utils * fix CI * move Datapoint out of public namespace Reviewed By: datumbox Differential Revision: D41836898 fbshipit-source-id: ff11dcb220346d98d07c807a12f3b3e59fba6146
After some extensive offline discussions, we decided to rename the class from
_Feature
toDatapoint
and the namespace fromfeatures
todatapoints
.This is the result of trying to balance between something generic and something too narrow. Here are a few examples:
VisionTensor
somewhat exclude things like labels, which are very much part of our API, but go far beyond vision tasks.Apart from
Datapoint
we also hadDataTensor
,torchvision.tensors
SampleTensor
,torchvision.tensors
as candidates.
DataTensor
was rejected since it is very generic (every tensor carries data) and is fairly close to theTensor.data
attribute from PyTorch core.SampleTensor
was rejected due to the "sample" phrase. Although it is technically correct that we are sampling something from a dataset here, it mostly carries the "random sampling" connotation that is to be avoided.Finally,
Datapoint
is not perfect as well. After this PR, it basically describes the following scenario:although one could argue that
is also a valid interpretation.
SampleTensor
has this issue as well. That is something that we need to handle through documentation.Per title. Refactoring was done through IDE in most cases. The ten extra affected lines (897 additions and 887 deletions) stem from the fact that
datapoints._Datapoint
is four characters longerfeatures._Feature
and in very few cases this was sufficient for an extra line break.cc @vfdev-5 @datumbox @bjuncek