-
@weiyuan-jiang @mathomp4 Do you still have access to Orion?
-
We discussed this problem last year. Please see GOCART issue #127.
-
@junwang-noaa @weiyuan-jiang thanks for this tip. i looked over #127, and as @weiyuan-jiang mentioned, it seems that the fix for the errors in that issue was merged several versions of mapl ago. that said, we can run w/ aerosols (e.g., cpld_control_p8) on hera with is the suggestion to use a newer version of gnu on cheyenne/orion? cheyenne has 11.1.0, 11.2.0, & 12.1.0, while the newest available on orion is 11.3.0. thanks!
-
I think we have also seen issues with MPI_Win_Fence (which I think I saw in the thread?) when using Open MPI 4.0. The solution then was to move to Open MPI 4.1. Note that here at GMAO, we mainly use GCC 12.1 and Open MPI 4.1.3 (I think), though I've tested up to Open MPI 4.1.5 with no issues. I am pretty sure GEOS/MAPL only likes GCC 11.3+ and 12.1+. It does not like GCC 13 at the moment. We are looking into that.
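In case it helps when comparing the modules on each machine, here is a rough sketch of those ranges as a quick check. The function names (`parse_version`, `gcc_ok`, `openmpi_ok`) are made up for illustration, and the thresholds only encode my "pretty sure" statements above (GCC 11.3+ or 12.1+, not 13; Open MPI 4.1 or newer). They are not an official MAPL requirement, and passing this check does not guarantee a working build.

```python
import re


def parse_version(text):
    """Turn '11.3.0' or a module name like 'gnu/10.2.0' into a tuple of ints."""
    nums = re.findall(r"\d+", text.split("/")[-1])
    if not nums:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(n) for n in nums)


def gcc_ok(version):
    """Assumed range from this thread: GCC 11.3+ or 12.1+, nothing older, not 13."""
    v = parse_version(version)
    if v[0] == 11:
        return v >= (11, 3)
    if v[0] == 12:
        return v >= (12, 1)
    return False


def openmpi_ok(version):
    """Assumed range from this thread: Open MPI 4.1 or newer (4.0 hit MPI_Win_Fence issues)."""
    return parse_version(version) >= (4, 1)


if __name__ == "__main__":
    # Module combinations mentioned elsewhere in this discussion.
    for gcc, ompi in [("gnu/10.2.0", "openmpi/4.0.5"), ("gnu/10.1.0", "openmpi/4.1.1")]:
        print(gcc, gcc_ok(gcc), ompi, openmpi_ok(ompi))
```

With the combinations mentioned in this discussion, gnu/10.2.0 with openmpi/4.0.5 falls outside both ranges, while gnu/10.1.0 with openmpi/4.1.1 falls outside only the GCC range.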
-
I just wanted to let you know that switching to an entirely new compiler, especially a new major GNU version, is one of the very few things that is more involved with spack-stack. You can read all the glorious details about it here: JCSDA/spack-stack#99. That said, it's not impossible and we can certainly do it, but we should consider other options first. Given what @mathomp4 said, it seems likely that a simple update of openmpi might suffice to solve the problem. In general, I would also suggest that as the UFS transitions to spack-stack, there needs to be communication in advance of software updates. In other words, before changes are made that require major updates of the software stack, this needs to be communicated and planned together. For example, it may be a bit of a stretch to expect that every system and every user has gcc 11 on their system. It's a bit of a pain to get gcc 11 on older Ubuntu 20 and Red Hat/CentOS 7 systems. These things need to be considered before making such updates imo.
-
thanks for your input, @mathomp4 & @climbfuji. based on what @climbfuji explained regarding moving to a new major gnu version, i would opt to test with a newer openmpi. however, with gnu/10.2.0 on orion, the most recent openmpi available is 4.0.5, which seems too old based on @mathomp4's MPI_Win_Fence comment. further, on cheyenne (the primary machine where this really is a bottleneck), we have already tested with openmpi/4.1.1 (and gnu/10.1.0) without success. we could test w/ openmpi/4.1.4 there and keep the gnu version fixed, but that's about the last option i think we'd have before running into the gnu upgrade issue.
-
thank you @climbfuji for taking the time to install the stack w/
i'm looking around and it seems this is a relatively common issue w/ some openmpi versions, but i have not yet found a fix. the expdir is here:
-
The UFS Weather Model test cpld_control_p8 (S2SWA) fails when built and run with spack-stack/1.3.1. The experiment setup is available on Cheyenne and Orion. Detailed information is available at ufs-community/ufs-weather-model#1737. Can the GOCART team take a look at the setup on Orion?