MPICH_MEMORY_REPORT affects performance of E3SM on Cori #2915
I can look at this. Certainly the information reported by this environment variable is not worth a slowdown, but I did not see any overall slowdown when I first added it. You note a percentage slowdown within a single call. Do you have a test I can run that would show this?
@ndkeen This issue can also be reproduced with PnetCDF's E3SM-IO benchmark program. I will send you more information on how to run it on Cori.
Here are two runs of PnetCDF's E3SM-IO program on Cori:
[MPICH_MEMORY_REPORT is not set]
This PR is a follow-up to PR #2915, which optionally runs hipInit prior to MPI_Init to prevent occasional segmentation faults during MPI_Init (see OLCFDEV-1655).
In Cori's machine settings, MPICH_MEMORY_REPORT is turned on by default.
With this setting, MPICH prints a summary of the min/max memory high water mark and the associated ranks. While this might be useful for memory profiling or debugging, it adds overhead to E3SM cases.
I have been running some E3SM benchmark tests on Cori. With MPICH_MEMORY_REPORT set to 1, PnetCDF's ncmpi_wait_all() time increases about 20% to 30%.
I suggest setting this environment variable only when DEBUG="TRUE".
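As a rough illustration of the suggestion above, here is a minimal Python sketch that sets MPICH_MEMORY_REPORT only for debug cases. The helper name `mpich_env` and the way the DEBUG value is passed in are assumptions made for illustration. This is not how the E3SM machine configuration actually applies the setting.

```python
import os


def mpich_env(debug_value, base_env=None):
    """Return a copy of the environment for launching a case.

    Hypothetical helper: MPICH_MEMORY_REPORT is set to "1" only when
    debug_value is "TRUE" (case-insensitive); otherwise it is removed
    so the memory report and its overhead are avoided.
    """
    env = dict(os.environ if base_env is None else base_env)
    if str(debug_value).upper() == "TRUE":
        env["MPICH_MEMORY_REPORT"] = "1"
    else:
        # Drop the variable even if the machine settings enabled it.
        env.pop("MPICH_MEMORY_REPORT", None)
    return env
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when launching a job, so the caller's own environment is left untouched.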