-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support specifying single HDF Group in open_virtual_dataset #165
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on @scottyhq!
Seems to work fine, my comments are mostly about refactoring.
One other thing is that the group kwarg could also be supported when reading references from Zarr, but that could also just be flagged as an issue and left for a later PR, as the logic will be independent of the kerchunk-parsing logic here.
Co-authored-by: Tom Nicholas <tom@cworthy.org>
Co-authored-by: Tom Nicholas <tom@cworthy.org>
virtualizarr/xarray.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why the diff is so big on this file, i guess it's the ruff formatting...
Yeah, sounds like a plan. I think this is now refactored and minimally functional. I'm sure working with other example datasets with groups (https://github.com/pydata/xarray-data/blob/master/cmip6.nc) will uncover improvements or the need for additional tests. |
virtualizarr/tests/test_xarray.py
Outdated
tmpfile = fsspec.open_local( | ||
f"filecache::{url}", filecache=dict(cache_storage="/tmp", same_names=True) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with this fsspec
function. Is this not something that can just be done with pathlib
?
virtualizarr/tests/test_xarray.py
Outdated
indexes={}, | ||
drop_variables=["listOfCovarianceTerms", "listOfPolarizations"], | ||
) | ||
tmpref = "/tmp/cmip6.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pytest has a fixture tmpdir
- I think we just want to write to that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great @scottyhq - just a couple of very minor comments and then we can merge.
for more information, see https://pre-commit.ci
Thank you @scottyhq ! |
docs/releases.rst
api.rst
@TomNicholas @forrestfwilliams I took a pass at this for basic functionality of loading a single group. Seems to be working for the couple datasets I'm trying out from https://nisar.jpl.nasa.gov/data/sample-data/