xarray version 0.16.1 breaking argopy #64

Closed
gmaze opened this issue Oct 15, 2020 · 4 comments · Fixed by #65
Labels: internals (Internal machinery), invalid (This doesn't seem right)

Comments


gmaze commented Oct 15, 2020

It seems that since xarray moved to 0.16.1, argopy is broken!

MCVE Code Sample

import xarray  # Loading version 0.16.1
import argopy  # Loading master version
l = argopy.fetchers.ArgoDataFetcher(src='erddap', cache=0).region([-60, -55, 40., 45., 0., 10., '2007-08-01', '2007-09-01'])
# l = argopy.fetchers.ArgoDataFetcher(src='erddap', cache=0).profile(6902746, 34)
l.to_xarray()

The same code works with an xarray version strictly lower than 0.16.1 (e.g. 0.15.1, 0.16.0).

This throws an error like those found in our unit tests here, before we locked the xarray version to 0.15.1. This is the reason we currently display an "outdated" message for the argopy requirements.

Full error stack

ValueError Traceback (most recent call last)
in
1 l = argopy.fetchers.ArgoDataFetcher(src='erddap', cache=0).region([-60, -55, 40., 45., 0., 10., '2007-08-01', '2007-09-01'])
2 # l = argopy.fetchers.ArgoDataFetcher(src='erddap', cache=0).profile(6902746, 34)
----> 3 l.to_xarray()

~/git/github/euroargodev/argopy/argopy/fetchers.py in to_xarray(self, **kwargs)
271 ",".join(self.Fetchers.keys()))
272 xds = self.fetcher.to_xarray(**kwargs)
--> 273 xds = self.postproccessor(xds)
274 return xds
275

~/git/github/euroargodev/argopy/argopy/fetchers.py in postprocessing(xds)
251
252 def postprocessing(xds):
--> 253 xds = self.fetcher.filter_data_mode(xds)
254 xds = self.fetcher.filter_qc(xds)
255 xds = self.fetcher.filter_variables(xds, self._mode)

~/git/github/euroargodev/argopy/argopy/data_fetchers/erddap_data.py in filter_data_mode(self, ds, **kwargs)
467
468 def filter_data_mode(self, ds, **kwargs):
--> 469 ds = ds.argo.filter_data_mode(errors="ignore", **kwargs)
470 if ds.argo._type == "point":
471 ds["N_POINTS"] = np.arange(0, len(ds["N_POINTS"]))

~/git/github/euroargodev/argopy/argopy/xarray.py in filter_data_mode(self, keep_error, errors)
308
309 # Create one dataset for each of the data_mode:
--> 310 argo_r, argo_a, argo_d = ds_split_datamode(ds)
311
312 # Fill in the adjusted field with the non-adjusted wherever it is NaN

~/git/github/euroargodev/argopy/argopy/xarray.py in ds_split_datamode(xds)
221 """
222 # Real-time:
--> 223 argo_r = ds.where(ds['DATA_MODE'] == 'R', drop=True)
224 for v in plist:
225 vname = v.upper() + '_ADJUSTED'

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/common.py in where(self, cond, other, drop)
1266 cond = cond.isel(**indexers)
1267
-> 1268 return ops.where_method(self, cond, other)
1269
1270 def close(self: Any) -> None:

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/ops.py in where_method(self, cond, other)
191 # alignment for three arguments is complicated, so don't support it yet
192 join = "inner" if other is dtypes.NA else "exact"
--> 193 return apply_ufunc(
194 duck_array_ops.where_method,
195 self,

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, *args)
1090 # feed datasets apply_variable_ufunc through apply_dataset_vfunc
1091 elif any(is_dict_like(a) for a in args):
-> 1092 return apply_dataset_vfunc(
1093 variables_vfunc,
1094 *args,

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/computation.py in apply_dataset_vfunc(func, signature, join, dataset_join, fill_value, exclude_dims, keep_attrs, *args)
405 )
406
--> 407 list_of_coords = build_output_coords(args, signature, exclude_dims)
408 args = [getattr(arg, "data_vars", arg) for arg in args]
409

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/computation.py in build_output_coords(args, signature, exclude_dims)
222 else:
223 # TODO: save these merged indexes, instead of re-computing them later
--> 224 merged_vars, unused_indexes = merge_coordinates_without_align(
225 coords_list, exclude_dims=exclude_dims
226 )

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/merge.py in merge_coordinates_without_align(objects, prioritized, exclude_dims)
327 filtered = collected
328
--> 329 return merge_collected(filtered, prioritized)
330
331

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/merge.py in merge_collected(grouped, prioritized, compat)
227 variables = [variable for variable, _ in elements_list]
228 try:
--> 229 merged_vars[name] = unique_variable(name, variables, compat)
230 except MergeError:
231 if compat != "minimal":

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/merge.py in unique_variable(name, variables, compat, equals)
118 if compat == "broadcast_equals":
119 dim_lengths = broadcast_dimension_size(variables)
--> 120 out = out.set_dims(dim_lengths)
121
122 if compat == "no_conflicts":

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/variable.py in set_dims(self, dims, shape)
1438 # don't use broadcast_to unless necessary so the result remains
1439 # writeable if possible
-> 1440 expanded_data = self.data
1441 elif shape is not None:
1442 dims_map = dict(zip(dims, shape))

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/variable.py in data(self)
357 return self._data
358 else:
--> 359 return self.values
360
361 @data.setter

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/variable.py in values(self)
508 def values(self):
509 """The variable's data as a numpy.ndarray"""
--> 510 return _as_array_or_item(self._data)
511
512 @values.setter

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/variable.py in _as_array_or_item(data)
270 data = data.get()
271 else:
--> 272 data = np.asarray(data)
273 if data.ndim == 0:
274 if data.dtype.kind == "M":

~/anaconda/envs/py38/lib/python3.8/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
81
82 """
---> 83 return array(a, dtype, copy=False, order=order)
84
85

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/indexing.py in __array__(self, dtype)
683
684 def __array__(self, dtype=None):
--> 685 self._ensure_cached()
686 return np.asarray(self.array, dtype=dtype)
687

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/indexing.py in _ensure_cached(self)
680 def _ensure_cached(self):
681 if not isinstance(self.array, NumpyIndexingAdapter):
--> 682 self.array = NumpyIndexingAdapter(np.asarray(self.array))
683
684 def __array__(self, dtype=None):

~/anaconda/envs/py38/lib/python3.8/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
81
82 """
---> 83 return array(a, dtype, copy=False, order=order)
84
85

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/indexing.py in __array__(self, dtype)
653
654 def __array__(self, dtype=None):
--> 655 return np.asarray(self.array, dtype=dtype)
656
657 def __getitem__(self, key):

~/anaconda/envs/py38/lib/python3.8/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
81
82 """
---> 83 return array(a, dtype, copy=False, order=order)
84
85

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/core/indexing.py in __array__(self, dtype)
558 def __array__(self, dtype=None):
559 array = as_indexable(self.array)
--> 560 return np.asarray(array[self.key], dtype=None)
561
562 def transpose(self, order):

~/anaconda/envs/py38/lib/python3.8/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
81
82 """
---> 83 return array(a, dtype, copy=False, order=order)
84
85

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/coding/variables.py in __array__(self, dtype)
68
69 def __array__(self, dtype=None):
---> 70 return self.func(self.array)
71
72 def __repr__(self):

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/coding/times.py in decode_cf_datetime(num_dates, units, calendar, use_cftime)
156 if use_cftime is None:
157 try:
--> 158 dates = _decode_datetime_with_pandas(flat_num_dates, units, calendar)
159 except (KeyError, OutOfBoundsDatetime, OverflowError):
160 dates = _decode_datetime_with_cftime(

~/anaconda/envs/py38/lib/python3.8/site-packages/xarray/coding/times.py in _decode_datetime_with_pandas(flat_num_dates, units, calendar)
121 with warnings.catch_warnings():
122 warnings.filterwarnings("ignore", "invalid value encountered", RuntimeWarning)
--> 123 pd.to_timedelta(flat_num_dates.min(), delta) + ref_date
124 pd.to_timedelta(flat_num_dates.max(), delta) + ref_date
125

~/anaconda/envs/py38/lib/python3.8/site-packages/numpy/core/_methods.py in _amin(a, axis, out, keepdims, initial, where)
41 def _amin(a, axis=None, out=None, keepdims=False,
42 initial=_NoValue, where=True):
---> 43 return umr_minimum(a, axis, None, out, keepdims, initial, where)
44
45 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,

ValueError: zero-size array to reduction operation minimum which has no identity

Problem Description

Obviously, argopy must work with the latest versions of xarray.

I noticed in the breaking changes of the 0.16.1 release that:

DataArray.astype() and Dataset.astype() now preserve attributes. Keep the old behavior by passing keep_attrs=False

This may be the origin of the issue here.
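
To make the quoted change concrete (this snippet is just my own illustration, with made-up attributes), on xarray >= 0.16.1:

import xarray as xr

da = xr.DataArray([1, 2], attrs={"units": "degC"})

print(da.astype(float).attrs)                    # {'units': 'degC'} since 0.16.1 (empty dict before)
print(da.astype(float, keep_attrs=False).attrs)  # {} -> the pre-0.16.1 behaviour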

Versions

Output of `argopy.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.8.6 | packaged by conda-forge | (default, Oct 7 2020, 18:42:56)
[Clang 10.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 18.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

argopy: 999
xarray: 0.16.1
pandas: 1.1.3
numpy: 1.19.2
scipy: 1.5.2
fsspec: 0.8.4
erddapy: 0.7.2
netCDF4: 1.5.4
pydap: installed
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: installed
rasterio: 1.1.7
cfgrib: 0.9.8.4
iris: 2.4.0
bottleneck: 1.3.2
dask: 2.30.0
distributed: 2.30.0
matplotlib: 3.3.2
cartopy: 0.18.0
seaborn: 0.11.0
numbagg: installed
gsw: 3.4.0
setuptools: 49.6.0.post20201009
pip: 20.2.3
conda: None
pytest: 6.1.1
IPython: 7.18.1
sphinx: None

gmaze added the bug label Oct 15, 2020

gmaze commented Oct 15, 2020

The error above, which we get with xarray 0.16.1 but not with previous versions, is raised at the line:

argo_r = ds.where(ds['DATA_MODE'] == 'R', drop=True)

which triggers an enigmatic:

ValueError: zero-size array to reduction operation minimum which has no identity
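
For reference, here is a minimal, untested sketch (synthetic data, not the erddap fetcher; all names and values are made up) of what I believe exercises the same code path: a CF-encoded TIME coordinate is decoded lazily by xr.decode_cf, and an all-False selection then forces decode_cf_datetime to run on a zero-size array:

import numpy as np
import xarray as xr

# Synthetic stand-in for an erddap-fetched dataset (names and values are illustrative)
ds = xr.Dataset(
    {
        "DATA_MODE": ("N_POINTS", np.array(["D", "D", "D"])),
        "TEMP": ("N_POINTS", [3.1, 3.2, 3.3]),
    },
    coords={
        # CF-encoded time: xr.decode_cf wraps it in a lazy datetime decoder
        "TIME": ("N_POINTS", [0.0, 1.0, 2.0], {"units": "days since 2007-08-01"}),
        "N_POINTS": [0, 1, 2],
    },
)
ds = xr.decode_cf(ds)

# The mask is all False, so the result should simply be an empty dataset.
# With xarray 0.16.1 this is where we hit:
#   ValueError: zero-size array to reduction operation minimum which has no identity
argo_r = ds.where(ds["DATA_MODE"] == "R", drop=True)
print(argo_r)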


gmaze commented Oct 15, 2020

I should note that the mask here is all False, so the argo_r output will be empty.

This is what is returned with an xarray version older than 0.16.1:

<xarray.Dataset>
 Dimensions:                   (N_POINTS: 0)
 Coordinates:
     TIME                      (N_POINTS) datetime64[ns] 
     LATITUDE                  (N_POINTS) float64 
     LONGITUDE                 (N_POINTS) float64 
   * N_POINTS                  (N_POINTS) int64 
 Data variables:
     CONFIG_MISSION_NUMBER     (N_POINTS) float64
[...]

This resembles this old issue: pydata/xarray#1329


gmaze commented Oct 15, 2020

So it's the time decoding of an empty array that fails here.
If we remove the TIME variable before the where statement, it works and returns an empty dataset:

ds2 = ds.copy(deep=True)
ds2 = ds2.drop_vars('TIME')
ds2.where(ds2['DATA_MODE'] == 'R', drop=True)

returns:

<xarray.Dataset>
Dimensions:                   (N_POINTS: 0)
Coordinates:
    LATITUDE                  (N_POINTS) float64 
    LONGITUDE                 (N_POINTS) float64 
  * N_POINTS                  (N_POINTS) int64 
Data variables:
    CONFIG_MISSION_NUMBER     (N_POINTS) float64 
[...]

and no error is raised anymore.

To fix this, we should catch the possible xarray error and work around it, as done in atmtools/typhon#37.
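
Something along these lines could work (a rough, untested sketch, not necessarily what will end up in #65; the helper name safe_where_eq and the error-message check are just illustrative):

import numpy as np
import xarray as xr

def safe_where_eq(xds, key, value):
    """xds.where(xds[key] == value, drop=True), guarded against the xarray 0.16.1
    failure when the selection is empty and TIME is still lazily decoded."""
    try:
        return xds.where(xds[key] == value, drop=True)
    except ValueError as e:
        if "zero-size array to reduction operation minimum" not in str(e):
            raise
        # Empty selection: redo it without TIME, then put back an empty,
        # already-decoded TIME coordinate so downstream code still finds it.
        ds2 = xds.drop_vars("TIME")
        out = ds2.where(ds2[key] == value, drop=True)
        out["TIME"] = xr.DataArray(np.empty(0, dtype="datetime64[ns]"), dims="N_POINTS")
        return out.set_coords("TIME")

# e.g. argo_r = safe_where_eq(ds, "DATA_MODE", "R")

This keeps the nominal code path untouched and only does the extra work when the selection turns out to be empty.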


gmaze commented Oct 15, 2020

I implemented a fix in #65,
but I still don't understand why this popped up in 0.16.1 and not before...

gmaze added the internals and invalid labels and removed the bug label Nov 11, 2020
gmaze closed this as completed in #65 Nov 17, 2020