Error while saving an altered dataset to NetCDF when loaded from a file #8694
Comments
Thanks for raising this. Please see #6323 for discussion on handling As you already found,
Thank you for pointing out the resources. I like the idea of the The documentation could benefit from being more explicit about the issue of invalid encoding and its consequences. Specifically, the section Writing encoded data could mention that some operations may lead to invalid encoding information that can cause errors when writing to a file, for example with
@tarik Thanks for taking the time to share your thoughts. Regarding the documentation changes, we always welcome contributions.
What happened?
When attempting to save an altered xarray dataset to a NetCDF file using the `to_netcdf` method, an error occurs if the original dataset was loaded from a file. The error does not occur when the dataset is created directly in memory.
What did you expect to happen?
The altered xarray dataset is saved as a NetCDF file by the `to_netcdf` method.
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
Anything else we need to know?
Findings:
The issue is related to the encoding information of the dataset becoming invalid after filtering data with the `where` method. The `to_netcdf` method uses the stored encoding information instead of the actual shape of the data.
In the provided examples, the maximum length of the strings stored in "player_1" and "player_2" is originally 8 characters. After filtering with the `where` method, the maximum string length becomes 5 in "player_1" and remains 8 in "player_2". However, the encoding information of both variables still indicates a length of 8, in particular the attribute `char_dim_name`.
Workaround:
A workaround is to call the `drop_encoding` method on the dataset before saving it with `to_netcdf`. With the encoding information removed, `to_netcdf` is forced to use the actual shapes of the data, which prevents the broadcasting error.
Environment
INSTALLED VERSIONS
commit: None
python: 3.9.14 (main, Aug 24 2023, 14:01:46)
[GCC 11.4.0]
python-bits: 64
OS: Linux
OS-release: 6.3.1-060301-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2024.1.1
pandas: 2.2.0
numpy: 1.26.3
scipy: 1.12.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.0.3
pip: 23.3.2
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None
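The workaround described under "Workaround" above can be sketched in memory, without any file I/O. This is only an illustration under stated assumptions, not the original example from this issue: the string values are made up, and the `char_dim_name` value `"string8"` is set by hand to stand in for whatever a backend attaches on load. It shows that `drop_encoding` returns a new dataset whose variables carry no encoding, while the original dataset keeps its (possibly stale) encoding.

```python
import numpy as np
import xarray as xr

# Hypothetical data: variable names follow the issue, values are invented.
ds = xr.Dataset(
    {
        "player_1": ("game", np.array(["alice", "benedict"])),
        "player_2": ("game", np.array(["charlie", "danielle"])),
    }
)

# Assumption: mimic encoding that a backend might attach when loading a file.
for name in ("player_1", "player_2"):
    ds.variables[name].encoding["char_dim_name"] = "string8"

# Inspect the per-variable encoding before writing.
for name, var in ds.variables.items():
    print(name, var.encoding)

# drop_encoding returns a new dataset; the original is left untouched.
clean = ds.drop_encoding()
for name, var in clean.variables.items():
    print(name, var.encoding)
```

Calling `clean.to_netcdf(...)` afterwards would then let the writer derive the encoding from the data itself, which is the behaviour the workaround relies on.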