Possible bug relating to the setting of Variable chunksizes #1323
Comments
The current code will not call
I think chunking is only used by default if there is an unlimited dimension. Try this:

import netCDF4
import numpy as np

def write(**kwargs):
    nc = netCDF4.Dataset('chunk.nc', 'w')
    x = nc.createDimension('x', 8000)
    y = nc.createDimension('y', 400)
    z = nc.createDimension('z', None)  # unlimited dimension
    tas = nc.createVariable('tas', 'f8', ('z', 'y', 'x'), **kwargs)
    tas[0:10, :, :] = np.random.random(32000000).reshape(10, 400, 8000)
    print(tas.chunking())  # chunk sizes, or 'contiguous'
    nc.close()

write()
so even if you specify

I can see how this can be confusing, since the default for the contiguous kwarg is False, yet the library default is True unless there is an unlimited dimension. The netcdf4-python docs do say this, though: "Fixed size variables (with no unlimited dimension) with no compression filters are contiguous by default."
As near as I can tell, when a variable is created, it has default chunksizes computed automatically.
Thanks for the background, @jswhit and @DennisHeimbigner - it's very useful. So, not a bug then, but maybe a feature request! Would it be possible to get netCDF4-python to write a variable that has no unlimited dimensions using the default chunking strategy? I guess that you don't want to change the existing API, so perhaps that could be controlled by a new keyword to createVariable? Thanks,
@davidhassell it is already being reported - variables with no unlimited dimension are not chunked by default (they are contiguous).
Hi @jswhit, I see that what I wrote was ambiguous - sorry! I'll try again: I would like to create chunked variables, chunked with the netCDF default chunk sizes, that have no unlimited dimensions. As far as I can tell this is not currently possible, but would you be open to creating this option?
@davidhassell thanks for clarifying, I understand now. Since the python interface doesn't have access to the default chunking algorithm in the C library, I don't know how this would be done. I'm open to suggestions though.
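To keep the cases discussed in this thread straight, here is a small pure-Python summary of the storage layout one would expect when contiguous=False (the createVariable default). It is only a sketch of the behaviour described in these comments, not code from netCDF4-python; the function name expected_storage and its parameters are hypothetical, and the compressed case reflects the compression workaround raised in this thread.

```python
def expected_storage(has_unlimited_dim, chunksizes=None, compressed=False):
    """Summarise the layout described in this thread for contiguous=False.

    Hypothetical helper for illustration only; it does not call netCDF4.
    """
    if chunksizes is not None:
        # Explicit chunk sizes are always honoured.
        return "explicit chunks"
    if has_unlimited_dim or compressed:
        # The C library picks its own default chunk sizes.
        return "default chunks"
    # Fixed-size, uncompressed variables are written contiguously.
    return "contiguous"


if __name__ == "__main__":
    print(expected_storage(has_unlimited_dim=False))
    print(expected_storage(has_unlimited_dim=True))
    print(expected_storage(has_unlimited_dim=False, chunksizes=(100, 1000)))
```

The case the feature request is about is the first one: a fixed-size, uncompressed variable with chunksizes=None, which currently ends up contiguous.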
A potential workaround that doesn't require having an unlimited dimension is to turn on compression (
Hello,
I have found it impossible (at v1.6.5) to get netCDF4 to write out a file with the default chunking strategy - it either writes contiguous, or with explicitly set chunksizes, but never with the default chunks.
To test this I used the following function:
and ran it as follows:
Surely it's the case that if
contiguous=False, chunksizes=None
then the netCDF default chunking strategy should be used?
I found that if I changed line https://github.com/Unidata/netcdf4-python/blob/v1.6.5rel/src/netCDF4/_netCDF4.pyx#L4307 to read:
then I could get the default chunking to work as expected:
However, this might not be the best way to do things - what do you think?
Many thanks,
David