Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add quiet nan and signaling nan options to FieldFill. #133

Merged
merged 4 commits into from
May 15, 2023

Conversation

danrosen25
Copy link
Member

@danrosen25 danrosen25 commented Mar 21, 2023

Questions

  • If we add nan (a.k.a. quiet nan) and snan (a.k.a. signaling nan) options to the FieldFill will it help developers and scientists initialize fields or background data and see data issues? Yes, we believe this to be helpful.
  • Is the ieee_arithmetic intrinsic module available with all compilers? IEEE arithmetic is supported for all Fortran compilers listed there except g95 https://fortranwiki.org/fortran/show/Fortran+2003+status

Testing

I've added a test to ESMF_FieldRegridUTest. There's also a snan (signaling nan) test that will fail IF floating point exceptions are added to the build; this test is turned off.
Intel fpe flags

+ESMF_F90OPTFLAG_G       += -traceback -check arg_temp_created,bounds,format,output_conversion,stack,uninit -fpe0
+ESMF_CXXOPTFLAG_G       += -traceback -Wcheck -fp-trap=common

Testing complete with gfortran (mac os) and intel (linux)

@billsacks
Copy link
Member

Thanks for this. I like the idea of providing a mechanism for initializing fields to NaN.

According to https://fortranwiki.org/fortran/show/Fortran+2003+status, IEEE arithmetic is supported for all Fortran compilers listed there except g95 (is that compiler actually still used anywhere?)... though there may be some other compilers not listed in the table, and we have definitely found instances where this table lists a compiler as supporting something when in fact the support is incomplete / buggy.

For what it's worth, here is the inf/nan module we use in CESM:

https://github.com/escomp/CESM_share/blob/main/src/shr_infnan_mod.F90.in

It's quite possible that it is out of date in terms of its list of compilers that support ieee arithmetic - e.g., you found that gnu supports it, though it's possible that its support isn't complete (see also my comment in the next paragraph).

Also, as a side-note: When working with NaNs in gfortran, I ran into this issue: ESMCI/ccs_config_cesm#4. I don't think this should affect the changes here, but is something to be aware of when working with NaNs.

I'd suggest that, if we include this, then we include at least one unit test that can indicate whether this is working for a given machine/compiler – e.g., setting a field to NaN and then checking that the value is NaN.

@danrosen25
Copy link
Member Author

@theurich @billsacks
I cleaned up the code and I wanted to build a test that would fail due to a signaling NaN (see link) but when I run it on a Mac it doesn't fail. I learned that Macs treat signaling NaNs like quiet NaNs. I don't really know where this test belongs but I think it could eventually end up in the esmx-app-prototypes. I could also set a desired exit code for a test in ctest but that's a problem because this test won't fail on Mac OS.
https://github.com/esmf-org/nuopc-app-prototypes/tree/feature/snan/ESMX_NaN

@danrosen25
Copy link
Member Author

@danrosen25 Add unit test to FieldFill for filling nan and snan.

@danrosen25 danrosen25 marked this pull request as ready for review May 9, 2023 15:54
Copy link
Member

@theurich theurich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes look good, and results under https://earthsystemmodeling.org/esmf-test-summary/feature_fill_nan/index.html are as expected.

This is ready to be merged into develop.

@theurich
Copy link
Member

@danrosen25 after this PR is closed, please open a new one for looking at adding additional compiler flags under under the ESMF_BOPT=g option. Thanks.

@danrosen25
Copy link
Member Author

@theurich Should we worry about any of these results?

Combo No. Machine OS Compiler Version MPI Version BOPT NetCDF v8.5.0b18-9-g15f3b88
18 breve Linux aocc 3.2.0 mpiuni None O None 36
24 cheyenne Linux nvhpc 21.11 openmpi 4.1.1 O 4.8.1 916
26 cheyenne Linux nvhpc 22.2 mpt 2.25 O 4.8.1 914
29 cheyenne Linux nvhpc 22.2 mpiuni None O 4.8.1 643
52 discover Linux pgi 20.4 mpiuni None O None 631
87 grits Darwin gfortranclang 12.2.0_14.0.0 mpich 4.1.1 O 4.9.2 3
89 gust Unicos cce 15.0.1 mpi 8.1.25 O 4.9.1 81
92 gust Unicos cce 15.0.1 mpiuni None O 4.9.1 78
93 gust Unicos nvhpc 23.1 mpi 8.1.25 O 4.9.1 10
96 gust Unicos nvhpc 23.1 mpiuni None O 4.9.1 2
109 hera Linux pgi 19.1 intelmpi 2019.0.5 O None 854
114 hera Linux pgi 19.1 mpiuni None O None 631
134 jet Linux nvhpc 22.2 intelmpi 2022.3.0 O None 856
135 jet Linux nvhpc 22.2 mpiuni None O None 631
142 orion Linux pgi 2019 openmpi 4.0.2 O 4.7.4 914
149 perlmutter Unicos aocc 3.2.0 mpiuni None O 4.9.0 36
150 perlmutter Unicos aocc 3.2.0 mpi 8.1.22 O 4.9.0 94

@theurich
Copy link
Member

No, they are as expected. The branch is behind a fix that went into develop recently to bring many of those counts down. I did not see anything in the results I looked at that would indicate there is any issue with the NaNs, especially not on a fundamental level, like not finding the ieee module, etc. Please merge into develop at your convenience. Thanks.

@danrosen25 danrosen25 merged commit 013f236 into develop May 15, 2023
@danrosen25 danrosen25 deleted the feature/fill_nan branch May 15, 2023 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants