Does torchdata already work with GCP and Azure blob storage #794
Comments
Technically speaking, with |
Have you by any chance observed any perf impact from using fsspec vs. the S3 integration? If not, then agreed: fsspec is a good option and we just need to spend some time authoring a tutorial.
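For reference, here is a minimal sketch of the fsspec route mentioned above. It is not taken from torchdata itself: `read_bytes` is a hypothetical helper, and the `memory://` backend only stands in for a real bucket so the snippet runs offline. The assumption is that swapping the URL scheme (for example `gs://` with gcsfs installed, or `az://` with adlfs installed) is the only change needed, with credentials passed as storage options.

```python
import fsspec


def read_bytes(url: str, **storage_options) -> bytes:
    """Read a whole object through fsspec; the URL scheme selects the backend.

    Hypothetical helper for illustration, not part of torchdata or fsspec.
    """
    with fsspec.open(url, mode="rb", **storage_options) as f:
        return f.read()


if __name__ == "__main__":
    # The in-memory filesystem stands in for a cloud bucket (assumption:
    # real backends behave the same way behind the fsspec interface).
    demo_url = "memory://demo-bucket/shard-0.bin"
    with fsspec.open(demo_url, mode="wb") as f:
        f.write(b"hello")

    print(read_bytes(demo_url))

    # Assumed cloud equivalents; these need the matching backend package
    # and valid credentials, so they are left commented out:
    # read_bytes("gs://my-bucket/shard-0.bin")      # requires gcsfs
    # read_bytes("az://my-container/shard-0.bin")   # requires adlfs
```

Whether this path is fast enough compared with the native S3 integration is exactly the open perf question in this thread; the sketch only shows that the calling code stays the same across providers.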
After observing the performance regression last time, I didn't get a chance to take a deeper look at the culprit. However, I discussed it with @ydaiming earlier, and he said that the S3 integration works better on archive files but not on small files compared to Overall, in some cases, |
I am going to take a quick look into |
My benchmark shows that using |
Our benchmarking results show that even for archives (large files) |
🚀 The feature
We already have an S3 integration, and it seems like the S3 API already works with both Azure and GCP.
Motivation, pitch
So ideally we can already support Azure and GCP without much additional work.
Alternatives
Build a new integration for each of Azure and GCP using their native APIs.
h/t: @chauhang for the idea