MPICH_MEMORY_REPORT affects performance of E3SM on Cori #2915

Open
dqwu opened this issue May 9, 2019 · 3 comments

Comments

@dqwu
Contributor

dqwu commented May 9, 2019

In Cori's machine settings, MPICH_MEMORY_REPORT is turned on by default.

    <environment_variables compiler="intel">
      <env name="FORT_BUFFERED">yes</env>
      <env name="MPICH_MEMORY_REPORT">1</env>
    </environment_variables>

With this setting, Cray MPICH prints a summary of the min/max memory high-water marks and the associated ranks. While this might be useful for memory profiling and debugging, it adds a performance overhead to E3SM cases.

I have been running some E3SM benchmark tests on Cori. With MPICH_MEMORY_REPORT set to 1, PnetCDF's ncmpi_wait_all() time increases by about 20% to 30%.

I suggest setting this environment variable only when DEBUG="TRUE".
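
For example, Cori's config_machines.xml block could scope the variable to debug builds. A minimal sketch, assuming the <environment_variables> selector accepts a DEBUG attribute the same way it accepts compiler="intel" (please double-check against CIME's config schema):

    <environment_variables compiler="intel">
      <env name="FORT_BUFFERED">yes</env>
    </environment_variables>
    <!-- Sketch: emit the Cray MPICH memory report only for DEBUG cases -->
    <environment_variables DEBUG="TRUE">
      <env name="MPICH_MEMORY_REPORT">1</env>
    </environment_variables>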

@ndkeen
Contributor

ndkeen commented May 9, 2019

I can look at this. Certainly the information reported using this env variable is not worth a slowdown. But I did not see any overall slowdown when I first added it. You note a percent slowdown within a call. Do you have a test I can run that would show this?

@dqwu
Contributor Author

dqwu commented May 9, 2019

@ndkeen
I tested E3SM with PIO2, not PIO1.

This issue can also be reproduced with PnetCDF's E3SM-IO benchmark program; see
https://github.com/Parallel-NetCDF/E3SM-IO

I will send you more information on how to run PnetCDF's E3SM-IO program on Cori.
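
For reference, the comparison is just two otherwise identical runs with the variable toggled. A rough sketch of the Cori job steps (the e3sm_io binary name, the rank count, and $BENCH_ARGS are placeholders; use whatever the E3SM-IO README prescribes):

    # Run 1: Cray MPICH memory report enabled (Cori's current E3SM default)
    export MPICH_MEMORY_REPORT=1
    srun -n 512 ./e3sm_io $BENCH_ARGS

    # Run 2: memory report not set
    unset MPICH_MEMORY_REPORT
    srun -n 512 ./e3sm_io $BENCH_ARGS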

@dqwu
Contributor Author

dqwu commented May 10, 2019

Here are two runs of PnetCDF's E3SM-IO program on Cori:
[export MPICH_MEMORY_REPORT=1]

   0: ==== benchmarking G case using varn API ========================
   0: History output file                = g_case_hist_varn.nc
   0: MAX heap memory allocated by PnetCDF internally is 19.53 MiB
   0: Total number of variables          = 52
   0: Total write amount                 = 81604.07 MiB = 79.69 GiB
   0: Total number of requests           = 176636648
   0: Max number of requests             = 19780
   0: Max Time of open + metadata define = 0.2362 sec
   0: Max Time of I/O preparing          = 0.0504 sec
   0: Max Time of ncmpi_iput_varn        = 0.0525 sec
   0: Max Time of ncmpi_wait_all         = 43.2970 sec
   0: Max Time of close                  = 0.0109 sec
   0: Max Time of TOTAL                  = 43.6485 sec
   0: I/O bandwidth (open-to-close)      = 1869.5718 MiB/sec
   0: I/O bandwidth (write-only)         = 1884.7496 MiB/sec
   0: -----------------------------------------------------------

[MPICH_MEMORY_REPORT is not set]

   0: ==== benchmarking G case using varn API ========================
   0: History output file                = g_case_hist_varn.nc
   0: MAX heap memory allocated by PnetCDF internally is 19.53 MiB
   0: Total number of variables          = 52
   0: Total write amount                 = 81604.07 MiB = 79.69 GiB
   0: Total number of requests           = 176636648
   0: Max number of requests             = 19780
   0: Max Time of open + metadata define = 0.3534 sec
   0: Max Time of I/O preparing          = 0.0594 sec
   0: Max Time of ncmpi_iput_varn        = 0.0954 sec
   0: Max Time of ncmpi_wait_all         = 33.5573 sec
   0: Max Time of close                  = 0.0076 sec
   0: Max Time of TOTAL                  = 34.0763 sec
   0: I/O bandwidth (open-to-close)      = 2394.7487 MiB/sec
   0: I/O bandwidth (write-only)         = 2431.7824 MiB/sec
   0: -----------------------------------------------------------

jgfouca pushed a commit that referenced this issue Oct 9, 2024
This PR is a follow-up to PR #2915, which optionally runs hipInit
prior to MPI_Init to prevent occasional segmentation faults during
MPI_Init (see OLCFDEV-1655).