-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
modified: xarray/backends/api.py #817
Conversation
xarray/backends/api.py
Outdated
if is_remote_uri(filename_or_obj): | ||
if engine !='netcdf4': | ||
raise ValueError('can only read remote gzipped netCDF files ' | ||
"with engine='netcdf4'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is actually not quite right -- pydap would also work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
didn't deal with the mess, but added pydap
to the possibilities.
I think this is probably correct, but the heuristics here are starting to get convoluted enough that I worry about test coverage. Is there any way we can test this? Maybe try to pull the gzip logic into a helper function (an extended variant of |
Certainly can take a crack at this. This is working for my application, but would be good to clean up the code to simplify the cases as you suggest. Regarding testing, it would be good to have a test module that includes grabbing files for local access and accessing openDAP datasets (from UNIDATA?) to ensure that the installed backends are all working as expected. |
We do have existing tests for backends: https://github.com/pydata/xarray/blob/master/xarray/test/test_backends.py This includes a test accessing an OpenDAP dataset from the OpenDAP test server (via pydap, at the end). But in my experience, there servers are somewhat unreliable (maybe available 90%), so we don't require that test to pass for the build to pass. Also, even in the best case scenario network access is slow. So it would be nice to modularize this enough that our logic is testable without actually using opendap. |
I'd be a little hesitant to enforce that we get access to opendap (or other remote datasets) on travis. I've seen this come back to bite us in the past. We can allow tests to fail on travis and just print a warning via pytest if it is something we really want to see tested in that way. |
Currently the way we handle this is that the only test that accessing |
OK - understood. Incidentally, I didn't include pydap because it doesn't seem to be python 3.* compatible. I tested it on 2.7 for my application and it seems to work. |
@shoyer - are we okay with the final logic here? @swnesbitt - can we get an entry in the what's new docs? |
if engine is not None and engine != 'scipy': | ||
raise ValueError('can only read gzipped netCDF files with ' | ||
"default engine, engine='scipy'") | ||
# if the string ends with .gz, then gunzip and open as netcdf file | ||
if sys.version_info[:2] < (2, 7): | ||
raise ValueError('reading a gzipped netCDF not ' | ||
'supported on Python 2.6') | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This try
/except
clause needs to only wrap the call to ScipyDataStore
-- otherwise it's catching too many potentially wrong conditions.
I am reluctant to merge this without having any way to test the logic. Without automated tests this issue is likely to recur. That said, I suppose we could leave the refactoring for a TODO. Let's add a note on that and also one minimal test to verify that we raise an error if you try to use |
How is this issue progressed so far? I am running into the same problem now and I might able to offer some help in testing. |
@tsaoyu - I don't think anyone has worked on developing a test case for this feature. I assume @swnesbitt would appreciate help there. |
Regarding the testing issue, another option is to use something like vcrpy to record and playback http responses for opendap requests. I've had good luck with that for Siphon. |
I am trying to understand the logic by looking through the comments/discussion on this commit. To bring the concept on the same table, there are 3 backends that can be used to handle gzipped netCDF file namely So what we need to do is to make both |
The only way we could make reading a gzipped netCDF4 file is to load the entire file into memory. That's why we didn't support this before. It's also less relevant for netCDF4, because netCDF4 supports in-file compression directly. With netCDF3, we can use scipy's netcdf reader, which supports Python file objects. But netCDF4-Python does not support Python file objects. This issue is concerned about supporting paths ending with |
Closing this as stale and out of date with our current backends. @swnesbitt (or others) - feel free to open a new PR if you feel there is more to do here. |
This pull request addresses issue #816, allowing the user to open remote openDAP files that are gzipped using the netCDF4 openDAP-enabled backend. Thanks to @dopplershift and @shoyer for the quick guidance!