-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrating download utils to torchdata #7549
Comments
Thanks for the suggestion @biphasic.
We had an initial push for this in #6060. Unfortunately, we hit quite a few roadblocks that would need to be resolved before we could adapt that. In addition, the prototype datasets are currently on hold and thus we currently don't depend on
Could you explain how exactly you would benefit as a downstream library from this? Our idea was to just switch the backend and thus the users would have no touching points with |
Why is that if I may ask? Is there a place where your team defines a roadmap for torchvision?
The benefit would be not having to rely on torchvision for download utils and torchdata for datapipes, but just torchdata for all of that functionality. I'm not sure I understand what you mean by saying that the users would have no touching points with torchdata, isn't the goal of torchvision/prototype to use datapipes? |
Hi @biphasic
We came to the conclusion that for now, there are more suitable opportunities to improve data-loading speed in torchvision. We might revisit in the future. Re roadmap: we used to disclose those more actively, but recent events have made that difficult. To give you an idea, the rough roadmap for now is to 1) release the transforms V2 work 2) improve data-loading speed via improvements to Resize() 3) revamp video decoding utilities. Going back to the original goal of this issue: any additional dependency comes with significant maintenance cost, and there's no immediate plan to take a dependency on torchdata. So as @pmeier already mentioned, it's unlikely that we'll migrate our download functionality to torchdata (on top of the blockers @pmeier already noted). Also @biphasic please note that while our download utils are effectively public, we really don't recommend that you rely on them in your library: those were introduced a long time ago as public but they should have been private from the beginning. We might deprecate and make them private in the future, so please migrate away from them if you can. Regardless of whether torchvision relies on torchdata for download functionality, there shouldn't be anything preventing you from relying on torchdata on your side? The download functionalities over there are public, so perhaps this is a better alternative for you |
Thanks a lot for the clarification @NicolasHug . While I understand that datapipes might not bring a big speed improvement, for me there's a big plus in their API. Would you say then there's a risk of torchdata not being actively developed anymore? In that case I might hold off as well then for the time being |
Related on need of a better / reliable download utility across PyTorch. I was arguing that the basic (at least http-based) such download-a-file utility should be included directly in core.
@biphasic maintainers of core might be open to accept such contributions (see discussion in pytorch/pytorch#68320) |
@biphasic just making sure you saw it: https://github.com/pytorch/data/#torchdata-see-note-below-on-current-status Since torchdata development is paused, we won't be migrating to torchdata utils anytime soon, so I'll close this issue. Feel free to provide feedback for torchdata on pytorch/data#1196 |
@NicolasHug thanks a lot for letting me know! |
Hello,
following Patrick's suggestion, I'm opening an issue here to discuss if the torchvision maintainers would be open to move the download utils upstream to torchdata. I'm the maintainer of a similar package which is modeled after torchvision to a large extent and would benefit greatly from such a move, much like torchaudio/text I would assume.
cc @pmeier
The text was updated successfully, but these errors were encountered: