zero-diff bug fixes in make_bcs package #601
@weiyuan-jiang, I have one more question: I can see how passing the min/max lat/lon into LRRasterize() is important for the EASE and EASEv2 grids because these grids do not fully cover the globe. However, in #596 @biljanaorescanin reports that she hasn't seen the "Rasterization completed successfully" message for the CF grids either: #596 (comment)
I think this PR is non-zero-diff for the EASEv2 grid. For the CF grid, it should be zero-diff. I have printed out the x values; they are exactly 180.000000 and -180.000000. Of course, it is quite possible that a value could be 180.000001, since the last digit is not reliable. But this PR is already non-zero-diff anyway.
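How reliable the last printed digit is depends on the real kind of the coordinates, which the thread does not state. As a rough sketch, assuming the x values were stored as 32-bit IEEE-754 reals (an assumption, not something confirmed here), the spacing between representable values near 180 can be checked in Python. The helper names `to_float32` and `float32_spacing` are made up for this illustration.

```python
import struct

def to_float32(x):
    """Round a Python float (a double) to the nearest 32-bit float and return it."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def float32_spacing(x):
    """Gap between x (rounded to 32 bits, x > 0) and the next larger 32-bit float."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - to_float32(x)

# Assumption: single-precision coordinates. Near 180 the spacing is 2**-16.
print(float32_spacing(180.0))
# A perturbation in the sixth decimal is smaller than half that spacing,
# so it rounds back to exactly 180.0 in single precision.
print(to_float32(180.000001) == 180.0)
```

Under that assumption, a printed 180.000000 would correspond to exactly 180.0. If the values are double precision instead, the spacing is far finer and a sixth-decimal difference can be represented.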
We could not see the message "Rasterization completed successfully" because this message is redirected to >/dev/null. I tested the NL3 C24 case. If I remove ">/dev/null" in the job script, the success message is printed. @gmao-rreichle @biljanaorescanin
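The effect of the redirect can be reproduced in isolation. The sketch below does not run make_bcs; `CHILD` is a hypothetical stand-in command that only prints the success message, and `visible_output` is a made-up helper name.

```python
import subprocess
import sys

# Hypothetical stand-in for the rasterization step: it only prints the message.
CHILD = [sys.executable, "-c", "print('Rasterization completed successfully')"]

def visible_output(discard_stdout):
    """Run the stand-in step and return the stdout the caller can still see."""
    # DEVNULL plays the role of `>/dev/null` in the job script.
    target = subprocess.DEVNULL if discard_stdout else subprocess.PIPE
    done = subprocess.run(CHILD, stdout=target, text=True, check=True)
    return done.stdout  # None when stdout was discarded rather than captured

print(visible_output(discard_stdout=True))
print(visible_output(discard_stdout=False))
```

In both cases the step itself exits successfully; only the visibility of the message changes, which matches the behavior described for the job script.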
This line is obviously a bug: tmpstring is not assigned at this point. We can probably just remove it. Line 752 in 1046239
@weiyuan-jiang, thanks for the explanation and for adding the non-0-diff label. If the PR is indeed not 0-diff for EASEv2, figuring out how to proceed will need more thought. We would lose the 0-diff backward compatibility for all existing EASEv2 bcs datasets, including the brand-new NLv5 datasets.
I agree. We should be able to just remove this line.
Note from the Gallery, but if you are updating this file, I might recommend making the following changes:
The commands on the left are all Fortran extensions that some compilers (e.g., NAG) do not support because, well, Fortran can do it now. I have a WIP PR (#500) trying to change this everywhere in GCM GC, but that's something I only go back to every so often (low priority task). But I figure this is one file where the changes can be tested to make sure they work! 😄
Nightly tests say this is zero diff as it is now. It is not zero diff for boundary conditions; is that what you wanted to say, @weiyuan-jiang?
Yes. These routines are used to create BCs.
I think the PR is incomplete. There are many hard-coded ranges, LL (lower left) and UR (upper right), for the EASE grid. For example, Line 1133 in 1046239
It is not 180 any more.
Thanks for the update, @biljanaorescanin. Yes, the "non-0-diff" aspect should only apply to creating bcs, and it should only apply to EASEv2 bcs (but I'm not sure about the latter). The important test is to generate a bcs dataset and see to what extent the output differs from what we generated with a recent tag (or "develop"). I suggest starting with NLv3 for c180 and EASEv2 M36.
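One way to carry out such a comparison is a file-by-file check of two output directories. The following is a minimal sketch, not part of make_bcs; the function name `differing_files` and the directory arguments are hypothetical. It uses a strict byte-for-byte comparison, so files that embed timestamps or other run metadata would be flagged even if the science content is identical.

```python
import filecmp
from pathlib import Path

def _relative_files(root):
    """Return the set of file paths under root, relative to root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}

def differing_files(control, candidate):
    """List files that differ byte-for-byte or exist in only one directory."""
    control_dir, candidate_dir = Path(control), Path(candidate)
    names = _relative_files(control_dir) | _relative_files(candidate_dir)
    differing = []
    for name in sorted(names):
        left, right = control_dir / name, candidate_dir / name
        if not (left.is_file() and right.is_file()):
            differing.append(name)  # present on one side only
        elif not filecmp.cmp(left, right, shallow=False):
            differing.append(name)  # contents differ
    return differing
```

A call such as `differing_files("bcs_develop", "bcs_branch")` (both directory names are placeholders) returns the relative paths to inspect more closely.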
Fine with me, although at this point it's hard to predict when we can merge this PR, and it's unlikely to be 0-diff for bcs, so throwing in an unrelated change that should be 0-diff may not be ideal.
@gmao-rreichle I will test this as soon as the bcs package runs again; we are still debugging at the moment.
…ESM/GEOSgcm_GridComp into bugfix/wjiang/pass_range_to_easev2
I pushed more changes to avoid a crash in debug mode. It is slow in debug mode. I think we don't have to call "create_mapping" every time. We should be able to improve it.
@weiyuan-jiang there are still errors; this is from the C24 NLv3 CS run:
I saw it and am fixing it...
@gmao-rreichle @biljanaorescanin Now I can finish make_bcs with the NL3 and m9 choices. It is time to check the difference.
@weiyuan-jiang @gmao-rreichle With the last commit we are now able to run both the EASE grid case and the CF case with no errors. As we suspected, it is not zero diff.
My control/develop EASE m36 NLv3 case from May:
Files that differ are:
I can take a closer look at all the differences next week. But the sole fact that .til and catchment.def are different makes all past bcs files non-zero-diff in comparison. Some differences are not related to this branch but to the peat update; for example, soil_param.dat has 5 tiles that are different. Maybe I should just rerun the control to isolate the impact of this PR alone; I didn't realize I hadn't run these cases after the last peat change.
For the C90 NLv3 CS case, similar differences:
The same files are affected as in the EASE case summary.
#setenv NCPUS `/usr/bin/lscpu | grep '^CPU(s)' | cut -d ':' -f2 | head -1 `
#@ NCPUS = $NCPUS / 4
#@ NCPUS = $NCPUS * 3
setenv NCPUS 20
@biljanaorescanin:
I'm not sure why NCPUS was hardcoded here, overwriting the earlier setting(s).
I'm also not sure if NCPUS is needed as a global environment variable. Maybe we could just use
set NCPUS = 20
?
For now, I'm leaving it as an env variable, but if NCPUS is just a local variable, we should use "set", not "setenv".
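For reference, the commented-out lines in the snippet derive NCPUS from the machine's CPU count by dividing by 4 and then multiplying by 3, using csh integer arithmetic. A small Python sketch of that arithmetic follows; the function name `three_quarters_of` is hypothetical, and `os.cpu_count()` stands in for the `lscpu` call in the script.

```python
import os

def three_quarters_of(ncpus):
    """Mirror the two csh steps: integer-divide by 4 first, then multiply by 3."""
    return (ncpus // 4) * 3

# Logical CPU count of the current machine (fall back to 1 if it is unknown).
detected = os.cpu_count() or 1
print(detected, three_quarters_of(detected))

# Because the division truncates first, the result is always a multiple of 3
# and can be smaller than computing 3/4 in one step:
print(three_quarters_of(30), (30 * 3) // 4)
```

On a node where this value exceeds 20, the hard-coded `setenv NCPUS 20` in the script is the lower setting.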
Yes, we can use:
set NCPUS = 20
I am still trying to figure out why we use 20 and not more.
I talked to Matt and he thinks it was either a memory issue or that it stopped scaling around that point.
If we remove this section from the make_bcs script: GEOSgcm_GridComp/GEOSagcm_GridComp/GEOSphysics_GridComp/GEOSsurface_GridComp/Utils/Raster/make_bcs Lines 1315 to 1321 in e91a8ad
we get zero diff for the EASE grid cases m9 and m36, that is, when this branch without the "if loop" section is compared to develop. Two things:
So the question is: why did we ever have this section? Maybe it was preparation for running some cases with different oceans? I am able to run GEOSldas without these files, and experiments are zero diff if I use the m36 bcs files produced by this branch vs. those produced by the develop branch. (Those .til files are never used in the experiment setup.)
@biljanaorescanin: Are you saying we could and should remove the lines in question? If that's the case, we should remove this block.
@biljanaorescanin, @mathomp4: Since the PR turned out to be 0-diff for make_bcs after all, it would be ok to now add the changes suggested by @mathomp4. If the final tests unexpectedly show non-0-diff, we'd have to undo the changes, but that's unlikely. So if you're still interested in cleaning this up, please go ahead.
@mathomp4 I've changed all of them in the "Raster" directory.
@sdrabenh, I just approved this PR for the land group. It's trivially 0-diff for running the GCM because the changes are confined to the ./Raster (bcs generation) directory. We successfully tested the bcs generation. Please let me know if you have any questions.
This PR addresses issue #596.
Without passing the range into LRRasterize(), the "completed successfully" message was never printed out for the EASE grid.
GEOSgcm_GridComp/GEOSagcm_GridComp/GEOSphysics_GridComp/GEOSsurface_GridComp/Utils/Raster/rasterize.H
Line 153 in 1046239
The PR contains additional cleanup and bug fixes. Without the fixes, array indices were out of bounds and array elements were assigned garbage numbers, hence the non-zero-diff results for some intermediate files generated in M09 and M36 bcs. These intermediate files do not seem to be used and are no longer created in the present branch, so effectively the PR is 0-diff for make_bcs.
The PR is trivially 0-diff for running the GCM or GEOSldas.
Adds functionality to point out non-0-diff results between:
In this case, the default string for the output dir is appended to clarify expected non-0-diff results from make_bcs.