Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xarray DataArray attributes breaking grdimage #1578

Closed
reint-fischer opened this issue Oct 12, 2021 · 4 comments · Fixed by #1581
Closed

xarray DataArray attributes breaking grdimage #1578

reint-fischer opened this issue Oct 12, 2021 · 4 comments · Fixed by #1581
Labels
bug Something isn't working
Milestone

Comments

@reint-fischer
Copy link

reint-fischer commented Oct 12, 2021

Description of the problem

When trying to plot a timeslice of a field from an open xarray DataSet with the coordinates time, latitude, and longitude, I get a warning indicating the presence of a datacube, and a ValueError in accessors.py expecting 2 values but getting none. I found that this problem disappears when I perform an operation on the DataArray, e.g. multiplying with 1. The only thing that seems to change with this operation is that the attributes in xarray are dropped.

Full code that generated the error
Using the original DataArray:

print(ds['msl'].isel(time=i))

<xarray.DataArray 'msl' (latitude: 89, longitude: 241)>
array([[102200.34 , 102197.32 , 102193.22 , ..., 101123.69 , 101133.4  ,
        101143.98 ],
       [102182.86 , 102179.41 , 102176.59 , ..., 101061.734, 101072.31 ,
        101083.1  ],
       [102170.555, 102166.234, 102161.92 , ..., 100998.49 , 101009.28 ,
        101020.08 ],
       ...,
       [101951.47 , 101926.21 , 101900.52 , ..., 101845.7  , 101869.66 ,
        101891.67 ],
       [102002.19 , 101976.51 , 101950.17 , ..., 101921.25 , 101941.1  ,
        101961.83 ],
       [102049.89 , 102023.99 , 101997.445, ..., 101993.34 , 102012.984,
        102031.98 ]], dtype=float32)
Coordinates:
  * longitude  (longitude) float32 150.0 150.2 150.5 150.8 ... 209.5 209.8 210.0
  * latitude   (latitude) float32 72.0 71.75 71.5 71.25 ... 50.5 50.25 50.0
    time       datetime64[ns] 1996-11-04T06:00:00
Attributes:
    units:          Pa
    long_name:      Mean sea level pressure
    standard_name:  air_pressure_at_mean_sea_level

Breaks:

ds = xr.open_dataset(data_path) #

fig = pygmt.Figure()

msl=ds['msl'].isel(time=i)

fig.grdimage(grid=msl)

fig.show()

Using the DataArray multiplied by 1:

print(ds['msl'].isel(time=i)*1)

<xarray.DataArray 'msl' (latitude: 89, longitude: 241)>
array([[102200.34 , 102197.32 , 102193.22 , ..., 101123.69 , 101133.4  ,
        101143.98 ],
       [102182.86 , 102179.41 , 102176.59 , ..., 101061.734, 101072.31 ,
        101083.1  ],
       [102170.555, 102166.234, 102161.92 , ..., 100998.49 , 101009.28 ,
        101020.08 ],
       ...,
       [101951.47 , 101926.21 , 101900.52 , ..., 101845.7  , 101869.66 ,
        101891.67 ],
       [102002.19 , 101976.51 , 101950.17 , ..., 101921.25 , 101941.1  ,
        101961.83 ],
       [102049.89 , 102023.99 , 101997.445, ..., 101993.34 , 102012.984,
        102031.98 ]], dtype=float32)
Coordinates:
  * longitude  (longitude) float32 150.0 150.2 150.5 150.8 ... 209.5 209.8 210.0
  * latitude   (latitude) float32 72.0 71.75 71.5 71.25 ... 50.5 50.25 50.0
    time       datetime64[ns] 1996-11-04T06:00:00

Works:

ds = xr.open_dataset(data_path)

fig = pygmt.Figure()

msl=ds['msl'].isel(time=i)*1

fig.grdimage(grid=msl)

fig.show()

ERA5

Full error message

grdinfo [WARNING]: Detected a data cube (/Users/rfische1/Documents/UMD/Assistantship/POLARA/data/ERA5/ERA5_1995-2020-2.nc) but -Q not set - skipping
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/var/folders/hh/2wq7ys4544ngb10v0y46_f_h0000gr/T/ipykernel_12727/718688776.py in <module>
      5 msl=ds['msl'].isel(time=i)
      6 
----> 7 fig.grdimage(
      8     grid=msl
      9 )

~/opt/anaconda3/envs/name/lib/python3.8/site-packages/pygmt/helpers/decorators.py in new_module(*args, **kwargs)
    580                     )
    581                     warnings.warn(msg, category=SyntaxWarning, stacklevel=2)
--> 582             return module_func(*args, **kwargs)
    583 
    584         new_module.aliases = aliases

~/opt/anaconda3/envs/name/lib/python3.8/site-packages/pygmt/helpers/decorators.py in new_module(*args, **kwargs)
    723                         kwargs[arg] = separators[fmt].join(f"{item}" for item in value)
    724             # Execute the original function and return its output
--> 725             return module_func(*args, **kwargs)
    726 
    727         return new_module

~/opt/anaconda3/envs/name/lib/python3.8/site-packages/pygmt/src/grdimage.py in grdimage(self, grid, **kwargs)
    171                 kwargs["I"] = stack.enter_context(shading_context)
    172 
--> 173             fname = stack.enter_context(file_context)
    174             arg_str = " ".join([fname, build_arg_string(kwargs)])
    175             lib.call_module("grdimage", arg_str)

~/opt/anaconda3/envs/name/lib/python3.8/contextlib.py in enter_context(self, cm)
    423         _cm_type = type(cm)
    424         _exit = _cm_type.__exit__
--> 425         result = _cm_type.__enter__(cm)
    426         self._push_cm_exit(cm, _exit)
    427         return result

~/opt/anaconda3/envs/name/lib/python3.8/contextlib.py in __enter__(self)
    111         del self.args, self.kwds, self.func
    112         try:
--> 113             return next(self.gen)
    114         except StopIteration:
    115             raise RuntimeError("generator didn't yield") from None

~/opt/anaconda3/envs/name/lib/python3.8/site-packages/pygmt/clib/session.py in virtualfile_from_grid(self, grid)
   1335         >>> # The output is: w e s n z0 z1 dx dy n_columns n_rows reg gtype
   1336         """
-> 1337         _gtype = {0: "GMT_GRID_IS_CARTESIAN", 1: "GMT_GRID_IS_GEO"}[grid.gmt.gtype]
   1338         _reg = {0: "GMT_GRID_NODE_REG", 1: "GMT_GRID_PIXEL_REG"}[grid.gmt.registration]
   1339 

~/opt/anaconda3/envs/name/lib/python3.8/site-packages/xarray/core/extensions.py in __get__(self, obj, cls)
     34 
     35         try:
---> 36             accessor_obj = self._accessor(obj)
     37         except AttributeError:
     38             # __getattr__ on data object will swallow any AttributeErrors

~/opt/anaconda3/envs/name/lib/python3.8/site-packages/pygmt/accessors.py in __init__(self, xarray_obj)
     32             # From the shortened summary information of `grdinfo`,
     33             # get grid registration in column 10, and grid type in column 11
---> 34             self._registration, self._gtype = map(
     35                 int, grdinfo(self._source, per_column="n", o="10,11").split()
     36             )

ValueError: not enough values to unpack (expected 2, got 0)

System information

Please paste the output of python -c "import pygmt; pygmt.show_versions()":

PyGMT information:
  version: v0.4.1
System information:
  python: 3.8.11 (default, Aug  6 2021, 08:56:27)  [Clang 10.0.0 ]
  executable: /Users/name/opt/anaconda3/envs/name/bin/python
  machine: macOS-10.16-x86_64-i386-64bit
Dependency information:
  numpy: 1.21.2
  pandas: 1.3.3
  xarray: 0.19.0
  netCDF4: 1.5.7
  packaging: 21.0
  ghostscript: 9.54.0
  gmt: 6.2.0
GMT library information:
  binary dir: /Users/name/opt/anaconda3/envs/name/bin
  cores: 8
  grid layout: rows
  library path: /Users/name/opt/anaconda3/envs/name/lib/libgmt.dylib
  padding: 2
  plugin dir: /Users/name/opt/anaconda3/envs/name/lib/gmt/plugins
  share dir: /Users/name/opt/anaconda3/envs/name/share/gmt
  version: 6.2.0
@reint-fischer reint-fischer added the bug Something isn't working label Oct 12, 2021
@weiji14
Copy link
Member

weiji14 commented Oct 12, 2021

Hi @reint-fischer, thanks for trying out PyGMT!

So the Detected a data cube error indicates that you may be passing in a array with 3 dimensions (e.g. time, latitude and longitude). The solution would be to use .isel to select a single time slice (as you have discovered), so that there are only latitude and longitude coordinates, and ds.shape should return a tuple with two numbers (e.g. (180, 360)).

As to why it's not working, could you please provide the output of print(ds) for the two xarray.DataArray objects you used, the one that worked and the one that didn't? It's a bit hard to tell from your code what's going on (also, I think you posted the same code twice, could you edit your original post to clarify things?). Once you've done that, we should be able to see what's going on, and how you can modify your data to get it plotted.

@reint-fischer
Copy link
Author

reint-fischer commented Oct 12, 2021

hi @weiji14 ,

sorry for the confusion, I updated my original post. The crucial part that I forgot is the multiplying by 1: ds['msl'].isel(time=i)*1. I have printed the two DataArrays and as you can see, the only difference seems to be the attributes.

Relating to the datacube message I have tried alternative ways of selecting the data, but I always have a 2D slice as input and it only works when the attributes are not there.

I should also note again that the multiplication by 1 is a good workaround for me, so the bug is not very urgent, it just took me a while to figure it out and I think it would be nice to prevent people from running into this in the future.

Thanks for replying so quickly

@weiji14
Copy link
Member

weiji14 commented Oct 12, 2021

Ah I see what's going on now. So this is the same issue as mentioned in #524 (comment). Long story short, the error happens because grdimage runs grdinfo to extract some metadata before plotting, and grdinfo is attempting to read off of your original ERA5 NetCDF file (accessed using ds.encoding["source"]) but cannot since it is a datacube.

Your multiplying by 1 workaround (which is very smart by the way!) works because it resets ds.encoding["source"] to nothing (i.e. it won't be able to find your ERA5 NetCDF file). What we might do is to modify this try-except block:

pygmt/pygmt/accessors.py

Lines 30 to 39 in 5cb1035

try:
self._source = self._obj.encoding["source"] # filepath to NetCDF source
# From the shortened summary information of `grdinfo`,
# get grid registration in column 10, and grid type in column 11
self._registration, self._gtype = map(
int, grdinfo(self._source, per_column="n", o="10,11").split()
)
except KeyError:
self._registration = 0 # Default to Gridline registration
self._gtype = 0 # Default to Cartesian grid type

We'll need to update it to fallback nicely when grdinfo doesn't work (e.g. for datacubes). I'll try to work on a bugfix for this, but in the meantime, you can continue using your workaround, or do something like this:

msl = ds['msl'].isel(time=i)
msl.encoding.pop("source")  # removes link to NetCDF file
fig.grdimage(grid=msl)

To help us out a little, do you have a link to a publicly available dataset like the ERA5 NetCDF file you used? Would make it easier for us to test things out. Edit: I've opened a bugfix at #1581, hopefully we'll get this fixed by the next PyGMT v0.5.0 release coming end of Oct.

@weiji14
Copy link
Member

weiji14 commented Oct 28, 2021

@reint-fischer, the bugfix has been merged at #1581 so fig.grdimage(grid=msl) should work on your sliced data cube. We're just about to release PyGMT v0.5.0 tomorrow, so if you update PyGMT next week, things should work 😁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants