consider using filesystem-spec #15
Comments
After some digging, it seems that the FTP protocol is not cacheable (see https://filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/implementations/ftp.html#FTPFileSystem), which makes fsspec much less attractive for now, but it still looks very promising for simplifying our code in the data fetchers |
My mistake, this works for caching FTP:
import fsspec
import xarray as xr

# Wrap the FTP server in a whole-file cache stored under ./tmp
fs = fsspec.filesystem("filecache",
                       target_protocol='ftp',
                       target_options={'host': 'ftp.ifremer.fr'},
                       cache_storage='./tmp')
with fs.open('/ifremer/argo/dac/coriolis/1900067/1900067_prof.nc') as of:
    ds = xr.open_dataset(of) |
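For anyone wanting to try the `filecache` wrapper without network access, here is a minimal sketch that swaps the FTP target for the local `file` protocol. The file name and payload are placeholders invented for illustration, not real Argo data; only the caching mechanics are the same as in the FTP snippet above.

```python
import os
import tempfile

import fsspec

# Hypothetical source file standing in for a remote Argo profile
src_dir = tempfile.mkdtemp()
cache_dir = tempfile.mkdtemp()
src_path = os.path.join(src_dir, "profile.bin")
payload = b"placeholder bytes, not real Argo data"
with open(src_path, "wb") as f:
    f.write(payload)

# Same wrapper as in the FTP example, but targeting the local filesystem
fs = fsspec.filesystem(
    "filecache",
    target_protocol="file",
    cache_storage=cache_dir,
)

# First open copies the whole file into cache_dir, then reads the local copy
with fs.open(src_path, "rb") as f:
    data = f.read()

cached_entries = os.listdir(cache_dir)
```

After the first read, `cache_dir` should hold a copy of the file under a hashed name, which is what later opens of the same path are served from.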
Wow! I'm so glad to see that something came of this suggestion. So exciting to see this library progress. 👏 |
Thanks @rabernat |
Cool. @martindurant, creator of fsspec, has always been very helpful and responsive when working with Pangeo folks. I'm sure he'd be glad to know you're using fsspec and would try to help resolve any technical challenges. |
Indeed, happy to see it, and that it proved fairly easy to do
This is probably, and unfortunately, true: fsspec was initially born by factoring out internal implementation details from dask. |
Note that the concept of fetching and loading datasets with specific arguments is superficially similar to what Intake does. I don't know the world of argo, but it might be interesting for you, as indeed it has been for some of pangeo, whether online file-based catalogues of datasets (e.g., at https://catalog.pangeo.io/browse/master/ ), or catalogs derived from online data services (perhaps intake-esm being a good example) |
I certainly understand that, and to be honest, I started from a long way back with file systems! |
Yes, I would love an intake catalogue entry with Argo data, and I'm working with the France data center and Ifremer to have it. |
Hi @martindurant, with fsspec implemented in argopy, I ran into an error on my workstation (I'm back at the office) that wasn't happening on my laptop:
Is that error familiar to you? |
As mentioned here, |
@quai20 this was discussed here fsspec/filesystem_spec#322 |
My bad 😃 ! Thanks @gmaze |
I had a quick look at your backend code, and I wanted to suggest you investigate filesystem-spec: https://filesystem-spec.readthedocs.io/en/latest
Using fsspec might allow you to remove some of your code related to file downloading, caching, etc. It might also make it easier to point at different endpoints for the data (e.g. ftp, http, s3). We use it, for example, in llcreader, which is similar to this project (tries to provide a uniform API for reading ECCO LLC data regardless of where it is stored).
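To illustrate the point about switching endpoints, here is a small sketch in which one reader function handles two different backends. The `read_bytes` helper, the paths, and the payload are hypothetical names made up for this example; an in-memory filesystem and a local temporary file stand in for ftp/http/s3 so the snippet runs offline.

```python
import os
import tempfile

import fsspec


def read_bytes(url, **storage_options):
    """Hypothetical helper: read a whole file, whatever protocol `url` uses."""
    with fsspec.open(url, mode="rb", **storage_options) as f:
        return f.read()


payload = b"placeholder bytes, not real Argo data"

# Backend 1: fsspec's in-memory filesystem
with fsspec.open("memory://demo/profile.bin", mode="wb") as f:
    f.write(payload)

# Backend 2: a plain local path (no protocol prefix means local files)
local_path = os.path.join(tempfile.mkdtemp(), "profile.bin")
with fsspec.open(local_path, mode="wb") as f:
    f.write(payload)

# The calling code is identical; only the URL changes
from_memory = read_bytes("memory://demo/profile.bin")
from_local = read_bytes(local_path)
```

Pointing the same helper at an `ftp://`, `https://`, or `s3://` URL would presumably only require the matching backend to be installed and any credentials passed as keyword arguments.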
An added benefit of using fsspec is its end-to-end compatibility with dask, which is somewhat related to #14.