reading large netcdf file with python 3 #535
Comments
No, there is no known issue with Python 3 compatibility. It's difficult to say without more to go on. I'd suggest posting the file somewhere, but at 30 GB that would be difficult. |
Since the error is occurring when opening the dataset, the variable data has not been read yet (only the metadata about the variables, dimensions and groups). Could you create a version of the file without the data written to the variables (just the variables, dimensions and attributes defined)? If compression is turned on, the file size should be small, since all the variable data would be set to the _FillValue and would compress down to nearly nothing. |
I have tried the kitchen sink tool to reduce the file size. With a file size of 8 MB, it works fine with Python 3.5. With another file of 6 GB, it returns the same error. |
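A minimal sketch of the metadata-only copy suggested above, assuming netCDF4-python is available on a setup that can still open the original file (Python 2 in this thread). The helper name `copy_structure` is hypothetical, and the sketch only handles primitive-typed variables in the root group; nested groups and user-defined types are not copied.

```python
def copy_structure(src_path, dst_path):
    """Copy dimensions, variable definitions and attributes, but no variable data.

    Hypothetical helper, not part of netCDF4-python. Assumes all variables live
    in the root group and use primitive types.
    """
    # Imported lazily so the function can be defined even where netCDF4 is absent.
    from netCDF4 import Dataset

    with Dataset(src_path, "r") as src, Dataset(dst_path, "w", format="NETCDF4") as dst:
        # Global attributes.
        dst.setncatts({name: src.getncattr(name) for name in src.ncattrs()})

        # Dimensions; unlimited dimensions stay unlimited and start out empty.
        for name, dim in src.dimensions.items():
            dst.createDimension(name, None if dim.isunlimited() else len(dim))

        # Variable definitions only. Nothing is written to them, so every value
        # is the fill value, which zlib compresses to almost nothing.
        for name, var in src.variables.items():
            fill = getattr(var, "_FillValue", None)
            out = dst.createVariable(
                name, var.dtype, var.dimensions, zlib=True, fill_value=fill
            )
            # _FillValue has to be passed at creation time, so skip it here.
            out.setncatts(
                {k: var.getncattr(k) for k in var.ncattrs() if k != "_FillValue"}
            )
```

The resulting file keeps the same number of variables, dimensions and attributes as the original, which is what matters if the failure is triggered by the metadata rather than by the data.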
What platform are you on? |
I am using Anaconda 3 64-bit on Windows. |
Since you are on Windows, I wonder if it is related to Unidata/netcdf-c#188 , the fix for which should go in today. |
When you trimmed the file size with the nco tool, did you retain the same number of variables, dimensions and attributes? |
No, actually, I trimmed the file size by reducing the number of variables and dimensions. |
Unidata/netcdf-c#188 has been merged into master. Since you're using Anaconda Windows, I understand it may be difficult to try this - but if you have the ability to rebuild the C library from source it would be much appreciated if you could try this fix and let us know if it works. |
I am having the same issue with a large netCDF file (~6 GB) using Python 3.5.1 on Windows 10. I can open the same file using Python 2.7.11 just fine. I get the error: The netCDF file is here: https://www.dropbox.com/s/3ia5wrh5u8z9spr/states.nc?dl=0 (please note that it is ~6 GB in size) |
Although redundant, I think it's necessary to push this issue a little. I have the same issue on Windows 10. I am trying to append to an already existing file that is about 5 GB. It gives also As 1.2.4 (which I'm using with Anaconda) was released after the fix, the problem still seems to exist. Edit: Tested it with ritviksahajpal's file. Here the error also occurs when reading it. |
I experience the same with current Anaconda and Windows 7. The error seems to exist in both the "conda" and the "conda-forge" versions. |
Any update on this? The error still persists. |
I can report that as of yesterday, the problem still exists. My circumstances are: For example: two netcdf4 files, a big file with over 3 million points in the time series (3.2GB). A small file with 9999 points in the time series (9.8 MB). This code will open the small file (using xarray): |
Has anybody tried @WardF's suggestion of upgrading the netcdf-c library to the current master, which includes the fix for Unidata/netcdf-c#188? It sure sounds like this could fix the issue. |
Here, we've demonstrated that my files, large and small, can be opened on a Mac, but not on Windows. |
In those instruction we pin But conda-forge does ship version @WardF I can backport the fix for Unidata/netcdf-c#188 on |
How can I double check if I have the offending versions noted in Unidata/netcdf-c#188? It looks - from my Anaconda Navigator listing - like I have msvc 14 runtime as vs2015_runtime version 14.0.25420, and netcdf4 version 1.2.7, neither indicating an update is needed. I'm trying a conda update --all anyway. |
Try this: |
Thanks Ryan, I can confirm that it is not working on libnetcdf 4.4.0. I mangled my python installation trying to upgrade, so need to reinstall. |
@msquared6, Unidata/netcdf-c#188 fixes a Windows-specific issue with large files. |
So getting libnetcdf >= 4.4.1 should be enough to resolve--and will be manageable once they figure out what's going on with some opendap links on windows. |
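One way to check an environment against that threshold is sketched below. It relies on the `__version__` and `__netcdf4libversion__` attributes that netCDF4-python exposes; the helper names `parse_version`, `has_large_file_fix` and `report` are hypothetical, and the 4.4.1 minimum is simply the version named in this thread.

```python
import re


def parse_version(text):
    """Return the leading dotted numeric part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError("no numeric version found in %r" % (text,))
    return tuple(int(part) for part in match.group(0).split("."))


def has_large_file_fix(libnetcdf_version, minimum="4.4.1"):
    """True if the C library version is at least the one named in this thread."""
    return parse_version(libnetcdf_version) >= parse_version(minimum)


def report():
    """Print the installed versions; returns None if netCDF4 cannot be imported."""
    try:
        import netCDF4
    except ImportError:
        print("netCDF4 is not installed in this environment")
        return None
    lib = netCDF4.__netcdf4libversion__  # version of the underlying C library
    print("netCDF4-python:", netCDF4.__version__)
    print("libnetcdf:", lib)
    ok = has_large_file_fix(lib)
    print("libnetcdf >= 4.4.1:", ok)
    return ok


if __name__ == "__main__":
    report()
```

The comparison is done on integer tuples rather than strings, so a four-part version such as 4.4.1.1 sorts after 4.4.1 and 4.3.3.1 sorts before it.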
@WardF or @jswhit, I just want to reiterate what @ocefpaf said so it doesn't get lost here:
The problem on Windows right now is:
So @ocefpaf is offering to backport the 4.4.1 fix for libnetcdf to 4.4.0, but he can't find the relevant code. |
You're assuming they were done in a PR--don't do that. I'm guessing a81f150e886239 and b19b807e8bbe81. |
OK, I have reinstalled my conda, freshly downloaded as per IOOS3 instructions and I now have this: 3.6.0 | packaged by conda-forge | (default, Feb 9 2017, 14:54:13) [MSC v.1900 64 bit (AMD64)] Please help a python neophyte properly update to libnetcdf 4.4.1.1, I wrecked my installation thinking I knew the right command. conda update ??? Thanks. |
@mmartini-usgs I just pushed a patched version for
I don't recommend using |
@ocefpaf hmm... sorry to continue the hijack - I need a step by step tutorial on how to re-create my env. I thought I knew what I was doing before and clearly... I still don't understand anaconda and python install structure very well. I am thinking this is delete C:\Users\username\AppData\Local\Continuum\Miniconda3\envs\envinquestion and then redo the following that er- includes conda update? |
No need to delete nor to re-configure. All you need to do is:
deactivate
conda env remove -n IOOS3
conda env create --file environment.yml
The deactivate is only to exit the env in case you are inside it. |
Many thanks, large files now work on my installation. |
That's great - so that confirms that the problem is fixed by recent updates to the C lib. Closing the issue now. |
Dear fellows, I have encountered the same problem as in this issue. As I understand it, libnetcdf version 4.4.0/4.4.1 and netCDF4 version 1.2.7 solved this problem. However, although I have the latest version of Anaconda for Python 3.6, when I tried to install netCDF4 via "conda install netCDF4", the conda installer still installed libnetcdf 4.3.3.1 and netCDF4 1.2.4. How can I tell the conda installer to install the latest version of libnetcdf and netCDF4? Any hints are appreciated. Update: I figured it out myself. I need to use the conda-forge channel to get the latest version. Sorry for bothering. |
I have a netCDF file of about 30 GB. When I try to read the file with Python 3.5, it gives an error:
File "netCDF4\_netCDF4.pyx", line 1795, in netCDF4._netCDF4.Dataset.__init__ (netCDF4\_netCDF4.c:12278)
RuntimeError: Unknown error
I also tried different options for the netCDF operator, such as versions, but nothing helps. The strange thing is that when I read the same file with Python 2, it works. Is there a compatibility issue between the netcdf4 libraries and Python 3?