nansum vs nanmean for all-nan vectors #2889

Closed
mathause opened this issue Apr 11, 2019 · 3 comments

Comments

@mathause
Collaborator

import xarray as xr
import numpy as np

ds = xr.DataArray([np.NaN, np.NaN])

ds.mean()  # nan
ds.sum()   # 0.0

Problem description

ds.mean() returns NaN, while ds.sum() returns 0. This comes from numpy (cf. np.nanmean vs. np.nansum), so it might have to be discussed upstream, but I wanted to ask the xarray community for their opinion first. This is also relevant for #422 (what happens if all the weights are NaN or sum to 0).
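The same asymmetry is visible in plain numpy; a minimal illustration of the upstream behaviour (output comments assume numpy of this era):

import numpy as np

arr = np.array([np.nan, np.nan])

np.nansum(arr)   # 0.0 -- the sum over an empty set of valid values
np.nanmean(arr)  # nan, with a "Mean of empty slice" RuntimeWarning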

Expected Output

I would expect both to return np.nan.

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.3 | packaged by conda-forge | (default, Mar 27 2019, 23:01:00)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.4.176-96-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.5.0.1
pydap: None
h5netcdf: 0.7.1
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: 1.2.0
PseudonetCDF: None
rasterio: 1.0.22
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.1.5
distributed: 1.26.1
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 41.0.0
pip: 19.0.3
conda: None
pytest: 4.4.0
IPython: 7.4.0
sphinx: 2.0.1

@dcherian
Contributor

This is correct though, isn't it?

Mean = nansum(non-NaN values) / count(non-NaN values) = 0/0, which is NaN.
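Spelled out on the example above (a sketch of the reasoning, not xarray's internal code path):

import numpy as np

arr = np.array([np.nan, np.nan])
valid = arr[~np.isnan(arr)]   # empty array -- no valid values remain
total = valid.sum()           # 0.0, which is what nansum reports
count = valid.size            # 0
with np.errstate(divide="ignore", invalid="ignore"):
    mean = total / count      # 0.0 / 0 -> nan, which is what nanmean reports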

@fujiisoup
Member

fujiisoup commented Apr 11, 2019

Thanks @mathause
I also think the current behavior is not perfect, but it is the best option available.

I would expect both to return np.nan

I expect np.nansum(ds) to be equivalent to np.sum over the non-NaN values, and thus 0, while the mean should be NaN, as @dcherian pointed out.

To me, a future average function (cf. #422, taking weights) should also return np.nan for all-NaN slices, as sketched below.
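A hypothetical sketch of such a function (the name weighted_mean and its signature are illustrative only, not xarray's API); for an all-NaN slice both the weighted sum and the weight sum are 0, so the division yields NaN:

import numpy as np
import xarray as xr

def weighted_mean(da, weights, dim):
    # Ignore weights wherever the data is NaN.
    masked_weights = weights.where(da.notnull())
    with np.errstate(divide="ignore", invalid="ignore"):
        # For an all-NaN slice both sums are 0, so 0 / 0 -> nan.
        return (da * masked_weights).sum(dim) / masked_weights.sum(dim)

da = xr.DataArray([np.NaN, np.NaN], dims="x")
w = xr.DataArray([0.5, 0.5], dims="x")
weighted_mean(da, w, "x")  # nan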

@mathause
Collaborator Author

Thanks for the feedback. I think I see the reasoning behind it now.
