For nearest-neighbor remapping, ensure results are independent of processor count if there are equidistant source points #276
Comments
Following up: Is this something that's on the roadmap to be in the ESMF version used in the CESM3 release? No worries if not, but in that case I'll need to make some of my tooling more robust and official.
Yep, it's on the roadmap for ESMF 8.8.0, which is what we're targeting for CESM3. I'm hoping to get it done soon-ish, so we can make sure that it works for a while before the release.
Excellent, thanks!
The CESM person who needs this may have another workaround, so punting to 8.9.0 for now.
@anntsay Do you mean me, with the tweaked input files? If not, what is the workaround you're referring to?
Ann's comment came from my brief/vague verbal comment. Yes, @samsrabin , I was referring to you here. My understanding was that you have a workaround that can/will be applied in the upcoming CESM3 release code, so the lack of a fix in ESMF won't be a hold-up for CESM3... but actually, I just read back through the comments here and see that we had given the message that the more robust fix would make it into 8.8, which now no longer looks like it will be possible. Sorry: that earlier comment came before we pushed the 8.8 release timing earlier by a couple of months. So, @samsrabin , can you let us know how much of a problem it will be for you that the ESMF release that will be used in CESM3 likely won't have this more robust fix in place?
Very little problem at all, I just need to make the scripts I used for the tweaking more robust. Thanks!
Thanks @samsrabin , and sorry about this. There were different forces pulling in different directions regarding the ESMF 8.8 release timing. It's still possible that ESMF 8.9 will be ready in time for CESM3, but that will depend largely on the CESM3 timing, so at this point we can't count on it.
For nearest-neighbor remapping, there is currently some logic that handles equidistant source points by arbitrarily using the point with the smallest ID. But, according to @oehmke , that logic isn't applied in the multi-processor case, because the IDs currently aren't sent between processors. As a result, nearest-neighbor mapping can give different results with different processor counts if there are equidistant source points. @oehmke proposes adding a send of the IDs so that the multi-processor case can break ties using the ID, as the single-processor case does.
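To illustrate the proposal, here is a minimal Python sketch (not ESMF code) of a tie-break that stays the same however the source points are split across processors. The names `Candidate`, `local_nearest`, and `global_nearest` are hypothetical, and the sketch assumes equidistant points compare exactly equal in floating point. The idea is that each processor reports its best candidate together with its ID, and the reduction uses the same (distance, ID) ordering.

```python
from typing import Iterable, NamedTuple, Optional


class Candidate(NamedTuple):
    dist: float  # distance from the destination point to this source point
    src_id: int  # global ID of the source point (hypothetical field name)


def local_nearest(cands: Iterable[Candidate]) -> Optional[Candidate]:
    """Best candidate on one processor: smallest distance, then smallest ID."""
    return min(cands, key=lambda c: (c.dist, c.src_id), default=None)


def global_nearest(partitions: Iterable[Iterable[Candidate]]) -> Optional[Candidate]:
    """Reduce the per-processor winners using the same (distance, ID) ordering."""
    winners = [w for w in (local_nearest(p) for p in partitions) if w is not None]
    return local_nearest(winners)


# Four equidistant source points plus one farther point, split three ways.
points = [
    Candidate(1.0, 7),
    Candidate(1.0, 3),
    Candidate(1.0, 9),
    Candidate(1.0, 5),
    Candidate(2.0, 1),
]
one_proc = global_nearest([points])
two_procs = global_nearest([points[:2], points[2:]])
four_procs = global_nearest([[points[0]], [points[4], points[2]], [points[3]], [points[1]]])
```

Because the ID travels with the distance, all three decompositions pick the same source point. If only the distance were communicated, the winner among tied points would depend on which processor happened to hold which point.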
Discussed in https://github.com/orgs/esmf-org/discussions/261
Originally posted by samsrabin July 10, 2024
Requirements
Affiliation(s)
NSF-NCAR
ESMF Version
No response
Issue
In CTSM, we use ESMF to read some input files. One particular pair of input files, specifying crop sowing window start and end dates, is at half-degree resolution. We tell ESMF to do nearest-neighbor[1] spatial interpolation as necessary to match the simulation grid.
When I do a run at 10°x15° resolution, some of the simulation gridcell centers are located exactly at the "corners" of four half-degree input pixels, meaning that those four neighbors are equally near. It doesn't matter to me which of those ESMF chooses as the "nearest neighbor," as long as it's consistent.
Unfortunately, it's not: At least one gridcell has a different "nearest neighbor" chosen depending on how many processors the job is split across.
As an example, I've made a figure based on two cases that are identical in setup except that Case 1 used 128 processors and Case 2 used 64. Due to this issue, a certain crop in the gridcell centered at latitude 0, longitude 30°E[2] gets a sowing window of days 7-82 in Case 1 and 336-46 in Case 2.
The white/gray/black in this figure represents the half-degree sowing window files. Gray pixels match the values in Case 1, black pixels match Case 2, and white pixels match neither. The red lines intersect at the center of the 10x15 CTSM gridcell.
[Figure: screenshot_1104]
It looks like Case 1 reads from the pixel to the southwest, whereas Case 2 reads from the pixel to the northwest.
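The geometry of the tie can be checked with a short sketch. It assumes the half-degree pixels have edges on multiples of 0.5°, so the four pixels sharing the corner at (0°, 30°E) have centers offset by ±0.25° in latitude and longitude. The function name `great_circle_deg` is my own, and this is not how ESMF computes distances internally.

```python
import math


def great_circle_deg(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Central angle in degrees between two points, via the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return math.degrees(2 * math.asin(math.sqrt(a)))


# Center of the 10x15 gridcell discussed above.
center = (0.0, 30.0)

# Assumed centers of the four half-degree pixels that meet at that point.
neighbors = {
    "SW": (-0.25, 29.75),
    "NW": (0.25, 29.75),
    "SE": (-0.25, 30.25),
    "NE": (0.25, 30.25),
}

distances = {
    name: great_circle_deg(center[0], center[1], lat, lon)
    for name, (lat, lon) in neighbors.items()
}
spread = max(distances.values()) - min(distances.values())
```

Under that assumption, `spread` is zero to within rounding, so distance alone cannot separate the southwest pixel (Case 1) from the northwest pixel (Case 2).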
Some notes:
Tagging @ekluzek, @billsacks, and @briandobbins, who have expressed interest in this. By the way, I think I mentioned to y'all that I was having an ERP test pass but the equivalent PEM test fail—this is why! The read of sowing windows only happens at the very beginning of the test, so changing processor count halfway through makes no difference.
Autotag
@oehmke
Footnotes
[1] It needs to be nearest-neighbor because dates are modulo: interpolating between Jan. 2 (day 2) and Dec. 31 (day 365) should give Jan. 1 (day 1), not July 2-3 (day [2+365]/2 = 183.5). To my knowledge, that's not something ESMF can do.
[2] There are other crops in this gridcell that also get different sowing windows. There are no crops in any other gridcell that get different sowing windows, but that doesn't necessarily mean no other gridcell has different "nearest" neighbors chosen. That might be happening, just with input pixels that don't differ.
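The modulo arithmetic in the first footnote can be sketched as follows. This only illustrates why a plain average fails for day-of-year values; it is not an ESMF feature. The function names and the 365-day year are assumptions made for the example.

```python
import math


def naive_midpoint(day_a: float, day_b: float) -> float:
    """Plain arithmetic mean, which ignores the wrap-around at year end."""
    return (day_a + day_b) / 2


def circular_midpoint(day_a: float, day_b: float, year_length: int = 365) -> float:
    """Midpoint along the shorter arc of the annual cycle.

    Days are mapped to angles on a circle, the two unit vectors are summed,
    and the resulting angle is mapped back to a day of year. A result of 0
    is equivalent to day `year_length`. The midpoint is undefined when the
    two days are exactly half a year apart.
    """
    ang_a = 2 * math.pi * day_a / year_length
    ang_b = 2 * math.pi * day_b / year_length
    x = math.cos(ang_a) + math.cos(ang_b)
    y = math.sin(ang_a) + math.sin(ang_b)
    return (math.atan2(y, x) * year_length / (2 * math.pi)) % year_length


plain = naive_midpoint(2, 365)        # mid-year, far from either input
wrapped = circular_midpoint(2, 365)   # close to day 1
```

The plain average lands in early July, while the circular midpoint lands on Jan. 1, which is the behavior the footnote describes.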