cloud optimized netCDF and zarr #4

Open
jhamman opened this issue Sep 10, 2018 · 3 comments

@jhamman

jhamman commented Sep 10, 2018

As part of the Pangeo project, we have been exploring the concept of "cloud optimized netCDF" - building off of "cloud optimized GeoTIFF". Zarr is an open-source Python library and storage spec "providing an implementation of chunked, compressed, N-dimensional arrays." The spec is simple, clearly documented, and well suited for use in cloud object stores.

Last year, we (@rabernat, myself, and others from the xarray/dask/pangeo projects) wrote an experimental xarray backend for zarr, and we have been testing it on public clouds ever since. The community is eager to see some formal effort put behind these concepts.

This proposal would do the following:

Other possible development objectives include:

per: https://twitter.com/rabernat/status/1039210134600396800

NumFOCUS project: Xarray
ESIP member institution: NCAR

cc @mrocklin @rabernat @shoyer @alimanfoo @WardF

@alimanfoo

Thanks @jhamman, I fully support this proposal.

Regarding other possible development objectives, it might be worth also linking to the work that @tjcrone, @shikharsg and @dazzag24 are doing to add support for Azure blob storage (zarr-developers/zarr-python#293), and to the work @martindurant is doing on support for consolidated metadata (zarr-developers/zarr-python#268). Along with zarr-developers/zarr-python#252, these are all working towards extending and optimising support for cloud storage. Although we have, or are close to having, working solutions across multiple cloud platforms, I think there is still work to be done to improve performance and robustness.

It may also be worth mentioning, as a possible development objective, work towards implementations of the Zarr storage specification in other programming languages. @jakirkham has been reaching out and has initiated a number of conversations, see https://github.com/zarr-developers/zarr/issues/291, https://github.com/zarr-developers/zarr/issues/289, https://github.com/zarr-developers/zarr/issues/286, https://github.com/zarr-developers/zarr/issues/285, https://github.com/zarr-developers/zarr/issues/284, https://github.com/zarr-developers/zarr/issues/279.

@WardF

WardF commented Sep 11, 2018

Tagging @DennisHeimbigner. We are looking at these issues (the intersection of the model/spec between netCDF and zarr) as we begin our work towards adding native zarr support to the core netCDF C library.

@jhamman

jhamman commented Sep 27, 2018

@WardF and @DennisHeimbigner - any additional thoughts here?
