Open on-disk kerchunk references as a virtual dataset #118
Comments
I can take a crack at this one.
You might have to fight @norlandrhagen haha
Oh! I can back off :) I do think there is an argument to be made for having different methods for
Sorry @jsignell! I should have mentioned it in this issue :) If I open a PR, would you mind taking a look at it?
Yeah this is an interesting question. The same thing will arise for Zarr stores too: should there be a different function to open zarr arrays backed by chunk manifests vs zarr arrays backed by actual bytes on-disk in the store? I think in that context it would be confusing to have two functions, especially as "mixed" zarr stores are possible (and useful).
I would say it makes sense to my brain to have separate functions for "just reading" vs "doing work" so that I can form an expectation about how long something will take to run. But I would expect the
It might be useful to be able to open an existing kerchunk json/parquet file as a virtual dataset, e.g. to make changes to it before writing it back out.
This is essentially the kerchunk version of suggestion (2) here #63 (comment).
This should be really easy to implement: we already have a function for doing it (`dataset_from_kerchunk_refs`), we just have to teach `open_virtual_dataset` that existing kerchunk json/parquet files are also valid filetypes to pass in.
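As a rough illustration of the filetype detection that would be needed, here is a minimal sketch using only the standard library. The helper name `detect_kerchunk_format` is hypothetical and not part of this project. It assumes that a kerchunk JSON file is a JSON object with a top-level `"refs"` key, and that kerchunk parquet references are stored as a directory with a `.zmetadata` file at its root. How the result would be wired into `open_virtual_dataset` is left open.

```python
import json
from pathlib import Path


def detect_kerchunk_format(path):
    """Return "json", "parquet", or None for the given path.

    Hypothetical helper, for illustration only.
    """
    p = Path(path)

    if p.is_dir():
        # Assumption: a kerchunk parquet store is a directory that
        # holds a `.zmetadata` file at its root.
        return "parquet" if (p / ".zmetadata").is_file() else None

    if p.is_file() and p.suffix == ".json":
        try:
            with p.open() as f:
                doc = json.load(f)
        except (json.JSONDecodeError, UnicodeDecodeError):
            return None
        # Assumption: kerchunk JSON keeps its references under "refs".
        if isinstance(doc, dict) and "refs" in doc:
            return "json"

    return None
```

If this returns `"json"` or `"parquet"`, the path could be handed to the existing kerchunk-reading code path; if it returns `None`, the file would be treated like any other input.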