Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cacluated PIO stride and numtasks not correct for varying threading on Titan #1427

Closed
worleyph opened this issue Apr 19, 2017 · 17 comments
Closed

Comments

@worleyph
Copy link
Contributor

a) For env_mach_pes.xml with (among other components):

<entry id="NTASKS">
    <value component="ATM">1350</value>
    <value component="CPL">1024</value>
    <value component="OCN">512</value>
    <value component="LND">336</value>

<entry id="NTHRDS">
    <value component="ATM">1</value>
    <value component="CPL">1</value>
    <value component="OCN">1</value>
    <value component="LND">1</value>

<entry id="ROOTPE">
    <value component="ATM">0</value>
    <value component="CPL">0</value>
    <value component="OCN">1360</value>
    <value component="LND">1024</value>

the PIO settings are

<entry id="PIO_STRIDE">
    <value component="ATM">16</value>
    <value component="CPL">16</value>
    <value component="OCN">16</value>
    <value component="LND">16</value>

b) If this is instead changed to

<entry id="NTASKS">
    <value component="ATM">675</value>
    <value component="CPL">512</value>
<value component="OCN">512</value>
    <value component="LND">168</value>

<entry id="NTHRDS">
     <value component="ATM">2</value>
    <value component="CPL">2</value>
    <value component="OCN">1</value>
    <value component="LND">2</value>

<entry id="ROOTPE">
     <value component="ATM">0</value>
    <value component="CPL">0</value>
    <value component="OCN">680</value>
    <value component="LND">512</value>

then the PIO settings in env_run.xml are now

<entry id="PIO_STRIDE">
    <value component="ATM">8</value>
    <value component="CPL">8</value>
    <value component="OCN">8</value>
    <value component="LND">8</value>

So, even though OCN is pure MPI in both cases, the stride is 16 in the first and 8 in the second. This is appropriate on systems at NERSC and the ALCF, but not on Titan.

@jayeshkrishna
Copy link
Contributor

Is this a behaviour that has changed in the latest CIME?

@worleyph
Copy link
Contributor Author

The old CIME did not have explicit lists of PIO strides and numtasks, correct? The calculation was done elsewhere. Perhaps this has always been broken. I'll run with an old version and check what the model log says was being done.

@worleyph
Copy link
Contributor Author

Okay ...

CIME 5.2 also had this list, but the stride was set at $PES_PER_NODE, so if a component was threaded, than this was one PIO task every few nodes, not every node. This was an "error" in the opposite direction than in CIME5.3, and there was never more than one PIO task per node (by default). I prefer this?

I'll look at old versions next (if I can find anything).

@jayeshkrishna
Copy link
Contributor

Thanks @worleyph ! It would really help if you can get the behaviour pre-CIME-5.2.

@worleyph
Copy link
Contributor Author

CIME5.1 also had the default of

 <entry id="PIO_STRIDE" value="$PES_PER_NODE">

@worleyph
Copy link
Contributor Author

CIME2 did things differently - in one example stride was 4 even though this was two pio tasks for the 2-way threaded components and four pio tasks for the MPI-only OCN (on its own nodes). So, looks like the focus was on the number of pio tasks rather than on stride?

In the "old days" I never trusted PIO strides and numiotasks to be set correctly, so always manually set something reasonable before running. The new logic generating a reasonable default is better, but I think that CIME5.3 needs to switch back to the CIME5.2 decision (use $PES_PER_NODE) or calculate the stride that guarantees no more than one pio task per node for all components, which is only an issue on Titan.

@mt5555 also has (always) preferred a max number of PIO tasks, but that tends to be system and resolution specific.

Note that pio_root being set to 1 needs to be evaluated. This was justified a long time ago (except that POP always required it to be 0). I haven't seen a study showing that this is worthwhile now (and potentially has issues when hyperthreading, and also ensures that the default timing files always miss the true I/O cost).

@amametjanov
Copy link
Member

This is an issue on Mira too. This line appears to be incorrect: https://github.com/ACME-Climate/ACME/blob/master/cime/scripts/lib/CIME/case_setup.py#L178. The stride should be coming from machine-specific settings, instead of $PES_PER_NODE: https://github.com/ACME-Climate/ACME/blob/master/cime/config/acme/machines/config_pio.xml#L25.

jgfouca pushed a commit that referenced this issue Jun 2, 2017
Correct a budget diagnostic printed in the coupler log files. The fresh water budget was missing a freshwater term associated the salt flux due to ice melt/growth. This salt flux should be represented as an additional freshwater flux in the coupler water budget.

The complete equation for net surface forcing of the POP model is

PREC_F+EVAP_F+ROFF_F+IOFF_F+MELT_F+SALT_F*(sflux_factor/salinity_factor) + QFLUX/lh_fusion1e4s_scaling

The freshwater representation of the SALT_F term was missing from the coupler budget. I've added that term and verified that the net ocean fresh water budget better represents the freshwater tendency calculated by the ocean component. I doubled checked the net budget against the global salinity trend in a model run. The new term in the table for the missing flux is weqsalt which stands for the fresh water equivalent of the salt flux.

Test suite: cime regression tests and a coupled run where I backed out the freshwater flux from the salinity timeseries.
Test baseline: cime_master
Test namelist changes:
Test status: [bit for bit, roundoff, climate changing]

Fixes #1429

Code review: Tony, Rob, and Mariana.
@worleyph
Copy link
Contributor Author

On TItan, the PIO_STRIDE is now $PES_PER_NODE in all cases, same as in CIME5.2 and CIME5.1 . I'm okay with this, but will wait to hear from @amametjanov and @jayeshkrishna before closing this. Has something special been done for Mira? Or is there a more general fix in the pipe. Based on @amametjanov 's last comment, Az does not like using $PES_PER_NODE on Mira, at least.

@rljacob
Copy link
Member

rljacob commented Oct 13, 2017

We just updated CIME on master so might be worth checking again.

@worleyph
Copy link
Contributor Author

That is what I was testing - master updated with the latest CIME.

@rljacob
Copy link
Member

rljacob commented Oct 13, 2017

Ok.

@worleyph
Copy link
Contributor Author

Just looked at this on Cori-KNL, Cori-Haswell, and Edison.

For --compset A_WCYCL1850S --res ne30_oECv3_ICG and the default pe layout:

Cori-KNL has PIO_STRIDE == 64
Cori-Haswell has PIO_STRIDE == $PES_PER_NODE
Edison has PIO_STRIDE == $PES_PER_NODE

@amametjanov
Copy link
Member

Hi Pat. PIO_STRIDE stayed at 128 with the new CIME on Cetus. This is okay.

The strides are being set from here, which is also setting Cori-KNL's stride to 64.

@worleyph
Copy link
Contributor Author

Hi Az, Thanks. If things are working as they are supposed to and if you are happy with this, could you please close this issue?

@amametjanov
Copy link
Member

Thanks.

@ndkeen
Copy link
Contributor

ndkeen commented Oct 13, 2017

I need to revisit pio stride default on cori-knl

1 similar comment
@ndkeen
Copy link
Contributor

ndkeen commented Oct 13, 2017

I need to revisit pio stride default on cori-knl

rljacob added a commit that referenced this issue Apr 12, 2021
Correct a budget diagnostic printed in the coupler log files. The fresh water budget was missing a freshwater term associated the salt flux due to ice melt/growth. This salt flux should be represented as an additional freshwater flux in the coupler water budget.

The complete equation for net surface forcing of the POP model is

PREC_F+EVAP_F+ROFF_F+IOFF_F+MELT_F+SALT_F*(sflux_factor/salinity_factor) + QFLUX/lh_fusion1e4s_scaling

The freshwater representation of the SALT_F term was missing from the coupler budget. I've added that term and verified that the net ocean fresh water budget better represents the freshwater tendency calculated by the ocean component. I doubled checked the net budget against the global salinity trend in a model run. The new term in the table for the missing flux is weqsalt which stands for the fresh water equivalent of the salt flux.

Test suite: cime regression tests and a coupled run where I backed out the freshwater flux from the salinity timeseries.
Test baseline: cime_master
Test namelist changes:
Test status: [bit for bit, roundoff, climate changing]

Fixes #1429

Code review: Tony, Rob, and Mariana.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants