Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

N + 1 sized grids for X and Y #15

Open
hugke729 opened this issue Nov 4, 2016 · 7 comments
Open

N + 1 sized grids for X and Y #15

hugke729 opened this issue Nov 4, 2016 · 7 comments

Comments

@hugke729
Copy link

hugke729 commented Nov 4, 2016

For plotting (specifically pcolor), it's often helpful to have X and Y grids that include both end points. I think it would be convenient to have something like Xp1 and Yp1, just like there is for Zp1.

Having this baked into the Dataset would enable use of all of Dataset's plotting methods, and it would save boilerplate code for creating Xp1 and Yp1 everytime I need them.

@rabernat
Copy link
Member

rabernat commented Nov 4, 2016

This makes sense from a physical point of view. But there are two obstacles:

  1. You are incorrect about xarray's plotting methods: xarray does NOT support having N+1 sized grids for pcolormesh. Instead, it infers the N+1 point from the existing dimensions. In fact this is a topic of discussion on active xarray issues (Don't infer x/y coordinates interval breaks for cartopy plot axes pydata/xarray#781) and pull requests (New infer_intervals keyword for pcolormesh pydata/xarray#1079). Some people would like to be able to specify the N+1 point (since the inferring won't always work correctly for all datasets), but xarray does not support the CF concept of cell_boundaries at this time.
  2. The MITgcm MDS output does not actually write the N+1 points (Xp1 and Yp1), and there is no universal way to infer them, especially for LLC grids. I am curious how you are doing it now.

Have you tried just using the dataset.plot.pcolormesh method directly as is? Does the automatic inferring of intervals not work correctly? If so, you should consider opening an xarray issue.

@rabernat
Copy link
Member

rabernat commented Nov 4, 2016

p.s. the relevant xarray code for inferring intervals is here:
https://github.com/pydata/xarray/blob/master/xarray/plot/plot.py#L545

@hugke729
Copy link
Author

hugke729 commented Nov 4, 2016

Thanks for the clarification.

  1. I hadn't noticed that xarray's plotting defaults to inferring the edges since it isn't their docstring. Their method is good enough for my purposes.
  2. I'm only using Cartesian grids (albeit uneven), so calculating all edge points is easy.

Good to know it's an ongoing issue and that function arguments may change in the future.

@hugke729 hugke729 closed this as completed Nov 4, 2016
@rabernat rabernat reopened this Nov 5, 2016
@rabernat
Copy link
Member

rabernat commented Nov 5, 2016

I would prefer to keep this open.

While xarray handles your cartesian case fine, I don't think it will handle the llc case correctly, because the coordinates themselves are 2D arrays. For that case, we really would want to provide (Ny+1, Nx+1)-shaped arrays to pcolor. That may require a custom plotting function in xmitgcm, or it may be supported by a future version of xarray. In any case, it is a real issue.

Thanks @hugke729 for your feedback!

@JiaweiZhuang
Copy link

JiaweiZhuang commented Jul 11, 2017

I am not an MITgcm user but having exactly the same issue regarding xarray. We are developing a xarray-based Python package for the Cubed-Sphere (CS) grids in GEOS-Chem and GFDL-FV3. The CS grids have similar features as the Lat-Lon-Cap (LLC) grids -- we need an additional "tile" dimension in an xarray DataSet to represent 6 tiles, pretty similar to how xmitgcm handles faces.

I am very interested in implementing N+1 sized coordinates as it will benefit all kinds of quadrilateral grids. For example, if I use DataArray's plotting method on each tile (i.e. call dr.plot.pcolormesh 6 times), with cell centers being the x and y arguments, I will get ugly white lines between tiles because one row and one column are missing:
cs

If I instead use the original plt.pcolormesh() and provide cell boundaries, it looks much better:
cs_bnds
Similar to the LLC grid, inferring boundaries just doesn't work well because the tiles span over the poles.

Passing cell boundaries to xarray's plottig method should be possible, because the wrapper is tiny. However, the main barrier is that DataArray (not DataSet) currently cannot contain N+1 sized coordinated variables. If you try to assign such a coordinate by dr.assign_coords(), you will get
ValueError: cannot add coordinates with new dimensions to a DataArray

On the other hand, say you already have a DataSet containing cell bounds ("lat_b" and "lon_b") like below:

<xarray.Dataset>
Dimensions:  (pfull: 20, phalf: 21, tile: 6, time: 8, x: 48, x_b: 49, y: 48, y_b: 49)
Coordinates:
  * x        (x) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 ...
  * y        (y) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 ...
  * time     (time) float64 9.0 10.0 11.0 12.0 13.0 14.0 15.0 16.0
  * pfull    (pfull) float64 7.673 47.07 100.7 152.7 204.4 255.8 307.2 358.6 ...
  * phalf    (phalf) float64 1.0 26.0 77.26 128.5 179.8 231.1 282.3 333.6 ...
  * x_b      (x_b) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 ...
  * y_b      (y_b) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 ...
    lon      (tile, y, x) float32 305.783 307.37 308.986 310.631 312.306 ...
    lat      (tile, y, x) float32 -34.8911 -35.5988 -36.2858 -36.951 ...
    lon_b    (tile, y_b, x_b) float32 305.0 306.573 308.174 309.805 311.465 ...
    lat_b    (tile, y_b, x_b) float32 -35.2644 -35.9889 -36.6926 -37.3743 ...
Dimensions without coordinates: tile
Data variables:
    ps       (tile, time, y, x) float64 9.999e+04 9.999e+04 9.999e+04 ...
    temp     (tile, time, pfull, y, x) float64 261.3 261.3 261.4 261.4 261.5 ...
    omega    (tile, time, pfull, y, x) float64 -0.000111 -8.006e-05 ...
Attributes:
    filename:   atmos_daily.tile1.nc
    title:      Cubed-Sphere_C48
    grid_type:  regular
    grid_tile:  N/A

The cell bound variables will be dropped automatically when you extract a DataArray by calling ds["temp"]

Coordinates:
  * x        (x) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 ...
  * y        (y) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 ...
  * time     (time) float64 9.0 10.0 11.0 12.0 13.0 14.0 15.0 16.0
  * pfull    (pfull) float64 7.673 47.07 100.7 152.7 204.4 255.8 307.2 358.6 ...
    lon      (tile, y, x) float32 305.783 307.37 308.986 310.631 312.306 ...
    lat      (tile, y, x) float32 -34.8911 -35.5988 -36.2858 -36.951 ...

So there must be a way to preserve cell bounds. I wonder if it is possible to:
Let DataArray accept N+1 sized coordinate variables, and be able to inherit them from the parent DataSet.
Will this be too drastic and conflict with DataArray's basic structure? I use xarray a lot but haven't quite looked into its source code, so if anyone has tried this please do tell me. If that sounds doable I would definitely give it a shot.



PS1:
In the xarray document I notice:

The data model of xarray does not support datasets with cell boundaries yet. If you want to use these coordinates, you’ll have to make the plots outside the xarray framework.

It doesn't seem necessary to support the cell bound format specified by the CF-convention . The CF-convention uses an array of size (N,2) instead of (Nx+1) for 1D grid bounds, and (Nx,Ny,4) instead of (Nx+1,Ny+1) for 2D bounds. N+1 sized grids is more elegant and also more consistent with pcolormesh as it expects (Nx+1,Ny+1) not (Nx,Ny,4).

PS2:
Having cell bounds as persistent coordinate variables is also important for conservative (area-weighted) regridding, which is discussed in pydata/xarray#486

@rabernat
Copy link
Member

Hi @JiaweiZhuang--thanks for your interest. I know you have gotten some responses on the related xarray issue.

I wanted to also draw your attention to xgcm: https://github.com/xgcm/xgcm. The point of xgcm is to represent and operate on Arakawa grids using xarray data structures. I am about to make a new release of xgcm which will come closer to supporting the features you need. As you pointed out, MITgcm and geos-chem have many similar issues related to these tiled grids.

I will update you soon once I make this new release.

@JiaweiZhuang
Copy link

@rabernat Thanks for pointing me to xgcm. I've just looked through its documentation and it indeed addresses xarray's limitations. Please keep me informed of the updates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants