netCDF + lazy backend: Error when sel is used with slice, reverse arrange #6560

Closed
4 tasks done
jesieleo opened this issue May 2, 2022 · 12 comments · Fixed by #7586
Comments

@jesieleo

jesieleo commented May 2, 2022

What happened?

import xarray as xr
data = xr.open_dataset('data_example.nc')

# wrong
a = data.sel(time='1979-1',isobaricInhPa=200).z[:,  ::10,::10][:, ::-1,:]

It seems that combining sel() with [:, ::10, ::10] followed by [:, ::-1, :] makes the second coordinate wrong.
Why does this go wrong only when sel() and [:, ::10, ::10][:, ::-1, :] are used together? On the second axis (latitude), the data do not correspond to the coordinates.

What did you expect to happen?

The data correspond to the coordinates on Latitude.

Minimal Complete Verifiable Example

problem_example.zip

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS


commit: None
python: 3.8.12 (default, Oct 12 2021, 03:01:40) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: AMD64 Family 25 Model 33 Stepping 0, AuthenticAMD
byteorder: little
LC_ALL: None
LANG: en
LOCALE: ('Chinese (Simplified)_China', '936')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.3.0
pandas: 1.4.2
numpy: 1.22.3
scipy: 1.7.3
netCDF4: 1.5.7
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: 0.9.10.1
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.1
cartopy: 0.20.2
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
setuptools: 56.0.0
pip: 22.0.4
conda: None
pytest: None
IPython: 8.3.0
sphinx: 4.5.0

@jesieleo jesieleo added the "bug" and "needs triage" labels May 2, 2022
@jesieleo jesieleo changed the title from "error when" to "error when using .sel() and [:, ::10, ::10][:, ::-1, :]" May 2, 2022
@max-sixty max-sixty added the "needs mcve" label (https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports) and removed the "bug" and "needs triage" labels May 2, 2022
@max-sixty
Collaborator

@jesieleo — I'm probably being unclear, and I'm sure you're well-intentioned, but this & the previous issues aren't constructive for either of us so far.

In the politest possible way: please make an issue that is complete and "copy & pastes into an IPython prompt or Binder notebook, returning the result.". You cannot reference data that is only on your computer. This will require work from your end to take the example you have there, and reduce it to the bare minimum.

@jesieleo
Author

jesieleo commented May 2, 2022 via email

@jesieleo jesieleo changed the title from "error when using .sel() and [:, ::10, ::10][:, ::-1, :]" to "Error of multiple data slices" May 2, 2022
@max-sixty
Collaborator

OK, I would like to make your experience on GitHub a good one! It's a magical place.

I have added a zip file with a .ipynb file, don't know if it is ok?

But no, this is not sufficient. Can you show your problem on an array like xr.DataArray(np.arange(24).reshape(2,3,4)), adding extras only as necessary? Please have another read of the links above showing how to minimize the examples.
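
For readers wondering what an example in the requested style could look like, here is a minimal sketch. It assumes only numpy and xarray; the dimension names and latitude values are made up for illustration. Note that, as the rest of this thread shows, the bug only appears with a lazily loaded file, so this in-memory skeleton behaves correctly and would still need a file round trip to trigger the problem.

```python
import numpy as np
import xarray as xr

# Small in-memory array with one labelled dimension (values are illustrative).
da = xr.DataArray(
    np.arange(24).reshape(2, 3, 4),
    dims=("time", "latitude", "longitude"),
    coords={"latitude": [10.0, 20.0, 30.0]},
)

# Same pattern as in the report: strided slice, then reverse the second axis.
result = da[:, ::2, ::2][:, ::-1, :]

# State the expectation explicitly so a mismatch is obvious to whoever runs it.
print(result.shape)
print(result.latitude.values)
```

Anyone can paste this into a blank notebook, which is the property being asked for here.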

@jesieleo
Author

jesieleo commented May 3, 2022

But no, this is not sufficient. Can you show your problem on an array like xr.DataArray(np.arange(24).reshape(2,3,4)), adding extras only as necessary? Please have another read of the links above showing how to minimize the examples.

OK, thanks. I tried to simplify the problem to a single expression:
a = data3.sel(time='1979-1',isobaricInhPa=200).z[:, ::10,::10][:, ::-1,:]
My question is that the coordinates do not correspond to the data: the latitude coordinate has 73 entries, but the actual data has only 72 along latitude.

@jesieleo jesieleo changed the title from "Error of multiple data slices" to "Multilayer data slice error" May 3, 2022
@jesieleo jesieleo changed the title from "Multilayer data slice error" to "Error when sel is used with slice, reverse arrange" May 3, 2022
@max-sixty
Collaborator

What is data3? We need an example that I can paste into a blank notebook and will show the problem.

@jesieleo
Author

jesieleo commented May 4, 2022

What is data3? We need an example that I can paste into a blank notebook and will show the problem.

I packed everything into one .zip this time. The coordinates along the second dimension of variable a do not correspond to the data.

@max-sixty
Collaborator

But no, this is not sufficient. Can you show your problem on an array like xr.DataArray(np.arange(24).reshape(2,3,4)), adding extras only as necessary? Please have another read of the links above showing how to minimize the examples.

Don't bundle data — build a new example from a basic array and demonstrate the bug.

@jesieleo
Author

jesieleo commented May 5, 2022

Don't bundle data — build a new example from a basic array and demonstrate the bug.

Since I don't know how the GRIB file was generated, I can't reproduce it. The data I use comes from the official ECMWF website and was converted with engine='cfgrib', so this problem may be caused by the 'cfgrib' engine. It is not a big error, but it is a hidden danger.
I'm not competent enough, sorry to bother you.

@jesieleo jesieleo closed this as completed May 5, 2022
@dcherian
Contributor

dcherian commented May 5, 2022

OK turns out my attempts at a minimal example didn't work.

There's some bug here (on xarray main)

  1. the size of latitude is different in .sizes, .shape, .data.shape and
  2. latitude really should be 73 elements. I don't see why a subsetting happens when slicing with ::-



@jesieleo here's how I tried to create a "minimal example"

import xarray as xr
import numpy as np

ds = xr.Dataset(
    {
        "z": (
            ("time", "isoBaricInhPa", "latitude", "longitude"),
            np.ones((1, 5, 721, 1440)), # use numpy.ones to create array of same shape.
        )
    },
    coords={"latitude": np.linspace(-90, 90, 721)},
)
ds.isel(isoBaricInhPa=1).z[:, ::10, :][:, ::-1, :]

Though this doesn't replicate it!

@dcherian dcherian reopened this May 5, 2022
@dcherian dcherian added the "bug", "topic-indexing", and "topic-backends" labels and removed the "needs mcve" label (https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports) May 5, 2022
@dcherian
Contributor

dcherian commented May 5, 2022

OK it has to do with the lazy backend. Calling data_example.load() fixes things. @jesieleo in the meantime, if your data is small, use load_dataset or load to fix it.

Though I still can't make a minimal example. The example below works fine.

import numpy as np
import xarray as xr
ds = xr.Dataset(
    {
        "z": (
            ("time", "isoBaricInhPa", "latitude", "longitude"),
            np.ones((1, 5, 721, 1440)),
        )
    },
    coords={"latitude": np.linspace(-90, 90, 721)},
).to_netcdf("test.nc")
ds = xr.open_dataset('test.nc')
ds.isel(time=[0], isoBaricInhPa=1).z[:, ::10, :][:, ::-1, :]
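
The workaround suggested in this comment (read the data into memory before slicing) might look like the sketch below. It assumes a netCDF backend such as netCDF4, h5netcdf, or scipy is installed. The array is deliberately shrunk from the thread's 1440 longitudes to 20 to keep the file small, and the temporary path is a placeholder.

```python
import os
import tempfile

import numpy as np
import xarray as xr

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.nc")  # placeholder file name
    xr.Dataset(
        {
            "z": (
                ("time", "isoBaricInhPa", "latitude", "longitude"),
                np.ones((1, 2, 721, 20)),  # smaller than in the thread, for speed
            )
        },
        coords={"latitude": np.linspace(-90, 90, 721)},
    ).to_netcdf(path)

    # load_dataset reads everything into memory and closes the file,
    # so later indexing runs on plain NumPy arrays, not the lazy backend.
    ds = xr.load_dataset(path)

a = ds.isel(time=[0], isoBaricInhPa=1).z[:, ::10, ::10][:, ::-1, :]
print(a.sizes)
print(a.values.shape)
```

For a dataset that is already open, calling .load() on it before slicing has the same effect. Either way this is only practical when the data fits in memory.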

@max-sixty
Collaborator

@dcherian thanks a lot for looking into this. @jesieleo thanks for your patience in us finding the bug.

@dcherian dcherian changed the title from "Error when sel is used with slice, reverse arrange" to "netCDF + lazy backend: Error when sel is used with slice, reverse arrange" May 5, 2022
@jesieleo
Author

jesieleo commented Nov 4, 2022

First of all, thank you for your help @dcherian @max-sixty. I finally reproduced this problem.
Changing ds.isel(time=[0], isoBaricInhPa=1).z[:, ::10, :][:, ::-1, :] to
ds.isel(time=[0], isoBaricInhPa=1).z[:, ::10, ::10][:, ::-1, :] gives a minimal example.

import numpy as np
import xarray as xr
ds = xr.Dataset(
    {
        "z": (
            ("time", "isoBaricInhPa", "latitude", "longitude"),
            np.ones((1, 5, 721, 1440)),
        )
    },
    coords={"latitude": np.linspace(-90, 90, 721)},
).to_netcdf("test.nc")
ds = xr.open_dataset('test.nc')
a = ds.isel(time=[0], isoBaricInhPa=1).z[:, ::10, ::10][:, ::-1, :]
print(a.sizes)
print(a.shape)
print(a)

Frozen({'time': 1, 'latitude': 73, 'longitude': 144})
(1, 73, 144)
<xarray.DataArray 'z' (time: 1, latitude: 72, longitude: 144)>
array([[[1., 1., ..., 1., 1.],
        [1., 1., ..., 1., 1.],
        ...,
        [1., 1., ..., 1., 1.],
        [1., 1., ..., 1., 1.]]])
Coordinates:
  * latitude  (latitude) float64 90.0 87.5 85.0 82.5 ... -82.5 -85.0 -87.5 -90.0
Dimensions without coordinates: time, longitude


dcherian added a commit to dcherian/xarray that referenced this issue Mar 5, 2023
There was a bug in estimating the last index of the slice.
Index a range object instead.

Closes pydata#6560
dcherian added a commit that referenced this issue Mar 27, 2023
* Fix lazy slice rewriting.

There was a bug in estimating the last index of the slice.
Index a range object instead.

Closes #6560

* Support reversed slices with Zarr

* Update xarray/core/indexing.py

* Fix test

* bring back xfail

* Smaller test

* Better?

* fix typing
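
The commit message above describes the fix as indexing a range object instead of estimating the last index of the combined slice by hand. The sketch below illustrates that idea in plain Python. It is not the code from xarray/core/indexing.py, and the function name compose_slices is hypothetical.

```python
def compose_slices(first: slice, second: slice, size: int) -> slice:
    """Return one slice equivalent to applying `first`, then `second`,
    to an axis of length `size`. Illustrative only, not xarray's code."""
    # Let Python do the index arithmetic: slicing a range yields a range
    # whose start/stop/step describe exactly the positions that survive.
    picked = range(size)[first][second]
    if len(picked) == 0:
        return slice(0, 0, 1)
    stop = picked.stop
    if stop < 0:
        # A negative stop would be read as "count from the end" when used
        # in a slice, so express "run past index 0" as None instead.
        stop = None
    return slice(picked.start, stop, picked.step)


# The case from this issue: 721 latitudes, every 10th, then reversed.
n_lat = 721
combined = compose_slices(slice(None, None, 10), slice(None, None, -1), n_lat)
print(combined, len(range(n_lat)[combined]))  # slice(720, None, -10) 73
```

The combined slice selects 73 positions, matching the 73 latitude coordinates in the report, whereas the lazily indexed data came back with only 72.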