E3SM hangs in MPI finalize on Summit #2847
Tahsin Kurc also noted that updating the IBM MPI library version resolves the issue. The hang in MPI finalize is a known issue: https://www.olcf.ornl.gov/for-users/system-user-guides/summit/summit-user-guide/#known-issues. I verified that the MPI hello world program no longer hangs after updating the MPI library version locally (spectrum-mpi/10.2.0.10-20181214 to spectrum-mpi/10.2.0.11-20190201).
I will create a PR soon to fix this issue.
The upcoming PR will also update the essl module (essl/6.1.0-20180406 is not available, so we need to use essl/6.1.0-2 instead).
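For anyone who wants to check which side of the fix a given module string falls on, here is a minimal sketch. It assumes module names follow the `name/version-suffix` pattern seen above; the helper names `parse_module_version` and `at_least_fixed` are hypothetical. Only the two spectrum-mpi versions mentioned in this thread were actually tested, so treating every later version as fixed is an assumption.

```python
def parse_module_version(module):
    """Split 'spectrum-mpi/10.2.0.11-20190201' into ('spectrum-mpi', (10, 2, 0, 11)).

    Assumes the part after '/' is dotted integers, optionally followed
    by '-<suffix>' (a date stamp or build number), as in this thread.
    """
    name, _, version = module.partition("/")
    numeric = version.split("-", 1)[0]
    return name, tuple(int(part) for part in numeric.split("."))


# The version reported in this thread as no longer hanging in MPI_Finalize.
FIXED_NAME, FIXED_VERSION = parse_module_version("spectrum-mpi/10.2.0.11-20190201")


def at_least_fixed(module):
    """Return True if `module` is spectrum-mpi at or above the fixed version.

    Assumption: versions newer than the one tested here keep the fix.
    """
    name, version = parse_module_version(module)
    return name == FIXED_NAME and version >= FIXED_VERSION
```

With this, `at_least_fixed("spectrum-mpi/10.2.0.10-20181214")` is false and `at_least_fixed("spectrum-mpi/10.2.0.11-20190201")` is true.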
Upgrading the summit essl and MPI modules. With the older version of MPI modules, MPI_Finalize call hangs. The older version of essl module is no longer available. Fixes #2847
Upgrading the summit cmake, essl and MPI modules. With the older version of MPI modules, MPI_Finalize call hangs. The older version of essl module is no longer available. Fixes #2847
Upgrading the summit cmake, essl and MPI modules. Also updating the ROMIO version to prevent OOM errors. Fixes #2847 [BFB]
…e_fixes Upgrading the summit cmake, essl and MPI modules. Also updating the ROMIO version to prevent OOM errors. Fixes #2847 [BFB]
Maint 5.6 merge Merge maint-5.6 into master, conflicts are resolved on this branch. Clean up xml Test suite: scripts_regression_tests.py Test baseline: Test namelist changes: Test status: bit for bit, Fixes User interface changes?: Update gh-pages html (Y/N)?: Code review: jgfouca
SHOC interface bug fix - TKE clipping
E3SM hangs on Summit when running an F case at ne4_ne4 resolution. This problem was reported by Tahsin Kurc (@tkurc) while testing E3SM+PIO2+ADIOS on Summit.
I observed the same issue with PIO2 tests on Summit.
This issue can be recreated using a simple MPI hello world program on Summit.
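Since a hang in finalize never returns on its own, one way to make the reproduction scriptable is to wrap the launch in a timeout. The sketch below is only an illustration: `run_with_timeout` is a hypothetical helper, and the command you pass to it (for example, the launcher plus your hello world binary) is whatever you normally use on Summit; nothing here is taken from the E3SM scripts.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run `cmd` and report whether it finished within `timeout_s` seconds.

    Returns "completed" if the process exited (with any exit code) and
    "hung" if the timeout expired. On timeout, subprocess.run kills the
    child it started; processes spawned by a job launcher may need
    separate cleanup.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "completed"
```

For example, `run_with_timeout([sys.executable, "-c", "pass"], 30)` exercises the helper with a command that exits immediately; substituting the real launch line for the hello world program is left to the reader.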