When I open a file on S3 like this:
import fsspec
fs = fsspec.filesystem('s3', anon=True)
path = "coiled-datasets/uber-lyft-tlc/part.93.parquet"
fs.open(path, mode="rb")
The `fs.open` call often takes ~0.5-1.5 seconds to run. Here's a snakeviz profile (again, just of the `fs.open` call) where it looks like most of the time is spent in a `details` call that hits S3:
I think this is mostly to get the file size (though I'm not sure why the size is needed at file object creation time), because if I pass the file size to `fs.open`, then things are much faster:
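For reference, here is a minimal sketch of what passing the size could look like. It assumes the filesystem forwards a `size` keyword from `fs.open` to the file object, which s3fs appears to do; other backends may ignore or reject it. The helper name `open_with_known_size` is hypothetical and only there for illustration.

```python
def open_with_known_size(fs, path, size):
    """Open `path` for reading, supplying a size that is already known.

    Assumption: the filesystem forwards the `size` keyword to its file
    object (s3fs appears to), so no metadata request is needed at open
    time. Other backends may ignore or reject the keyword.
    """
    return fs.open(path, mode="rb", size=size)


# Example usage (needs network access and s3fs installed):
#   import fsspec
#   fs = fsspec.filesystem("s3", anon=True)
#   path = "coiled-datasets/uber-lyft-tlc/part.93.parquet"
#   size = fs.info(path)["size"]  # one metadata request; the result can be reused
#   f = open_with_known_size(fs, path, size)
```

This only helps when the size is already available from somewhere else (a listing, a manifest, an earlier `fs.info` call). Otherwise the metadata request just moves to a different place.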
@martindurant do you have a sense for what's possible here to speed up opening files?
The actual use case I'm interested in is passing a bunch (100k) of netCDF files to Xarray, whose `h5netcdf` engine requires open file objects.
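One possible workaround for the many-files case, sketched under assumptions: list the prefix once with `fs.find(..., detail=True)`, which in fsspec returns a mapping of path to info dict including `size`, then reuse those sizes when opening each file. The helper name `open_all_with_sizes`, the `prefix` argument, and the `.nc` suffix filter are all hypothetical, and this again assumes the backend accepts a `size` keyword in `fs.open`.

```python
def open_all_with_sizes(fs, prefix, suffix=".nc"):
    """Open every matching file under `prefix`, reusing sizes from one listing.

    Assumptions: `fs.find(prefix, detail=True)` returns {path: info} with a
    "size" entry per file, and `fs.open` accepts a `size` keyword (s3fs
    appears to). `suffix` is a hypothetical filter for netCDF files.
    """
    listing = fs.find(prefix, detail=True)
    files = []
    for path, info in sorted(listing.items()):
        # Skip directories and anything that is not a netCDF file.
        if info.get("type") == "file" and path.endswith(suffix):
            files.append(fs.open(path, mode="rb", size=info["size"]))
    return files
```

The resulting list of file objects could then be handed to something like `xarray.open_mfdataset(files, engine="h5netcdf")`. I haven't measured whether this actually removes the per-file overhead at the 100k scale.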