Retrieve regridding weights as numpy arrays #2
Comments
It remains to be seen whether the numpy implementation of sparse matrix multiplication will be slower than the Fortran implementation in ESMPy. But calling a lot of |
I'll let @rokuingh provide more information on the implementation plan for retrieving and working with weights. I don't want to put my foot in my mouth. 😄 A couple things:
ESMPy uses Fortran ordering since everything underneath is built on it. Hence when using
A number of times we've encountered issues in user code related to dimension ordering. I don't have a good sense how xarray's
My guess is that ESMF is faster. Whether the performance gain is enough to overcome issues in usability is a different question! A question for you, do you have a sense how array copies work with the |
Dimension ordering has been the Achilles heel of ESMPy. I suppose that is the price of building on a Fortran package. We are hoping to resolve this soon! |
It seems all numpy arrays that point to ESMPy data fields are Fortran-ordered.
I think that's equivalent to np.swapaxes. Again I would want to avoid swapping axes at all, if possible.
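The memory-layout point can be illustrated with plain numpy, independent of ESMPy. This is only a sketch on a made-up 2x3 array: it shows that transposing (or np.swapaxes) returns a view with different strides, while converting a C-ordered array with np.asfortranarray has to copy the data.

```python
import numpy as np

# A small C-ordered (row-major) array, e.g. laid out as (lat, lon).
c_arr = np.arange(6.0).reshape(2, 3)

# Transposing or swapping axes only changes the strides: no data is copied,
# and the result of transposing a C-contiguous array is F-contiguous.
view_t = c_arr.T
view_s = np.swapaxes(c_arr, 0, 1)

# Asking for a Fortran-ordered array with the SAME shape forces a copy,
# because the bytes have to be physically rearranged.
f_copy = np.asfortranarray(c_arr)

print(view_t.flags["F_CONTIGUOUS"], np.shares_memory(c_arr, view_t))
print(f_copy.flags["F_CONTIGUOUS"], np.shares_memory(c_arr, f_copy))
```

So the cost of matching a Fortran-ordered field depends on whether the axes may be reversed (free) or the shape must be preserved (one copy per conversion).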
I am doing a deep copy in both directions. I don't think the resulting array is safe if it just points to I actually don't understand why ESMPy uses
I am not sure, considering the current heavy use of memory copies and array rearrangements. We'll need to do a benchmark once regridding weights are visible. |
This is just how ESMF works underneath which ESMPy attempts to mirror as much as possible. I agree it would be more convenient in cases where only weights are needed to use grids/meshes directly. Here is the underlying ESMF call that ESMPy wraps: http://www.earthsystemmodeling.org/esmf_releases/last_built/ESMF_refdoc/node5.html#SECTION050365900000000000000. |
My sincere apologies for the delay on getting you the weight file write code. We have a snapshot tag that includes writing weights to file:
The only change to your code should be the addition of a filename argument to the regrid call: rh = ESMF.Regrid(srcfield, dstfield, filename='/tmp/weights.nc', ...) Please let me know if you have any questions or find any issues! Though tested, note this is considered "development code". |
@bekozi Thanks! I am not worried about any code or API changes because I plan to interface only with the weight file. As long as the format of Is that snapshot also available on conda? I only tried to build ESMF a long time ago and have forgotten how to do that now... Or could you just send me a sample weight file? Just lat-lon to lat-lon would be great. |
It is now. I sat bolt upright last night realizing I forgot to build one. 😄
@bekozi Thanks! But I got an error when using the The test environment is docker-miniconda3 |
Thanks for the reproducer. This is a very obscure bug related to Python 3. The code will work in Python 2.7. Will that be okay for a bit? I'll be looking into the Python 3 failure - it's somehow connected to |
That wasn't as bad as I predicted. We were not testing a new enough
Index: src/addon/ESMPy/src/ESMF/interface/cbindings.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- src/addon/ESMPy/src/ESMF/interface/cbindings.py (revision 6f48610bd5243c1c33cfcd4e6ae9c2788c6c4a56)
+++ src/addon/ESMPy/src/ESMF/interface/cbindings.py (revision 9e1e6544f77b6ae10af92c65ef7ac9b2b761beb6)
@@ -1984,9 +1984,13 @@
raise TypeError('dstMaskValues must have dtype=int32')
dstMaskValues_i = ESMP_InterfaceInt(dstMaskValues)
+ # Need to create a C string buffer for Python 3.
+ b_filename = filename.encode('utf-8')
+ b_filename = ct.create_string_buffer(b_filename)
+
rc = _ESMF.ESMC_FieldRegridStoreFile(srcField.struct.ptr,
dstField.struct.ptr,
- filename,
+ b_filename,
srcMaskValues_i,
dstMaskValues_i,
ct.byref(routehandle), |
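For context, here is a minimal ctypes sketch of why a patch like this is needed. It does not call ESMF at all. In Python 3, str is Unicode text and ctypes will not convert it to a C char* implicitly, so the filename is first encoded to bytes and then wrapped in a NUL-terminated buffer. The path is just the example value used earlier in this thread.

```python
import ctypes as ct

filename = "/tmp/weights.nc"

# In Python 3, ctypes refuses to build a C string directly from str.
try:
    ct.c_char_p(filename)
    str_accepted = True
except TypeError:
    str_accepted = False

# Encode to bytes, then create a mutable, NUL-terminated char buffer
# that can be passed where a C function expects char*.
b_filename = ct.create_string_buffer(filename.encode("utf-8"))

print(str_accepted)       # whether the raw str was accepted
print(b_filename.value)   # bytes up to (not including) the terminating NUL
print(len(b_filename))    # buffer size, including the terminating NUL
```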
@bekozi Thanks for looking into this! I am OK with Python2.7 for now. |
@bekozi The Also, can it also write out grid information as in the Fortran version? |
Hmmm...I was able to reproduce the issue on my end. Thanks for the code. It doesn't appear the weight file write is appropriately tested on the ESMPy side. I ran the grids through the CLI app and the indexing aligns, so the issue is definitely in the Python interface. I'll see what I can find...may take a bit since I didn't code the original implementation. Sorry for the trouble. |
@bekozi That's fine! Take your time. |
Hi, @JiaweiZhuang. A new snapshot with a conda build is available that addresses the weight file write. I added a test for your specific case. Again, please let me know if you find any issues! |
@bekozi Thanks so much! I've checked that both bilinear and conservative algorithms are working correctly. Now I will have a lot to update in xESMF! There are two additional issues, though. They are not urgent but kind of confuse me.
The two issues are actually not related to this thread or to xESMF itself. I am willing to continue the discussion here, but please do tell me if you want to move to another place to discuss ESMPy-specific things. |
Glad to hear it. 😅 We can move this discussion over to the ESMF support list. I'll create a ticket and will email you. I'll also create a ticket for getting the grid information into the ESMPy-generated weight file. |
Closing this issue because ESMPy can now output regridding weights. See #6 for an example. Will reopen this if any bug in writing weights turns up. |
Most regridding schemes are linear, i.e. the output data field depends linearly on the input data field. Any linear transform can be viewed as a matrix-vector multiplication y = A*x, where A is a matrix (typically sparse) containing the regridding weights, and x and y are the input and output data fields flattened to 1D.
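A toy illustration of y = A*x, with made-up numbers rather than real ESMF weights: a 2x2 source field is flattened (C order is an assumption here) and averaged column-wise onto a 1x2 destination grid. The matrix is dense only because it is tiny; real weight matrices would be stored sparse.

```python
import numpy as np

# Hypothetical 2x2 source field, flattened in C order: [x00, x01, x10, x11].
src = np.array([[1.0, 2.0],
                [3.0, 4.0]])
x = src.ravel()

# Each destination cell is the mean of one source column.
# Rows of A index destination cells, columns index source cells.
A = np.array([[0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5]])

y = A @ x                 # regridding as a matrix-vector product
dst = y.reshape(1, 2)     # back to the destination grid shape

# Linearity: regridding a combination equals combining the regridded fields.
x2 = np.array([10.0, 20.0, 30.0, 40.0])
lhs = A @ (2.0 * x + x2)
rhs = 2.0 * (A @ x) + A @ x2
print(dst, np.allclose(lhs, rhs))
```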
In practice, linear regridding schemes are broken into two steps:
1. regrid = ESMF.Regrid(...)
2. destfield = regrid(sourcefield, destfield), where regrid was created in the previous step.
However, ESMPy's regrid object is like a black box. It knows how to perform regridding, but there's no way to explicitly view the regridding weights (the matrix A).
In the Fortran version of ESMF, the function ESMF_RegridWeightGen dumps regridding weights to NetCDF files. The content of the file is described in the "12.8 Regrid Interpolation Weight File Format" section of the ESMF documentation (link). The matrix A is stored in sparse form by the variables row, col and S in that NetCDF file. But in ESMPy there's no function equivalent to ESMF_RegridWeightGen.
Being able to view the regridding weights at the Python level will solve many troubles:
- transpose() function to match ESMPy's expectation, we can write the sparse matrix multiplication directly in numpy, taking care of dimension broadcasting.
- dask.array easily and natively. Otherwise, we need to let each dask worker call the underlying Fortran routine separately to regrid each hyperslab, which sounds like a very ugly solution.
@bekozi seems to have some Python tools for ESMF_RegridWeightGen. Maybe we could start with that.