Update PE layouts on edison after cime5 #1220

Closed
wants to merge 19 commits into from


ndkeen
Contributor

@ndkeen ndkeen commented Jan 17, 2017

No description provided.

rljacob and others added 18 commits December 10, 2016 21:48
Conflicts resolved by removing the old mpas-cice and mpas-o buildnml
and committing the beta0 version with the necessary cime5 changes.
Removed old config_grid.xml
Also converted AV1C-L to use beta0 settings in cam config_component.xml

* rljacob/cime/cime5-upgrade: (111 commits)
  For titan, don't put numa_node on aprun for pgi
  Revert "Add beta0 changes to the cime5 mpas-cice buildnml"
  Revert "Add beta0 changes to the cime5 mpas-o buildnml"
  Do not switch AV1C-L to 04p2 just yet
  Add beta0 changes to the cime5 mpas-cice buildnml
  Add beta0 changes to the cime5 mpas-o buildnml
  Update CIME5 config_grids w ocean cavities
  Set CO2_PPMV value for ACME AV1C-04P2 compset
  Add beta0 compsets to cime5 cam config files
  Replace ERP_Ln3.ne30_oEC.A_WCYCL2000 with ERP_Ld3.ne30_oEC.A_WCYCL2000
  Revert "Switch to using several config_pes.xml files"
  Switch skybridge and redsky back to openmpi-1.6
  Unshare sharedlibs, fix bug in machines.py
  Switch to using several config_pes.xml files
  update 3 config_pes files after testing
  Set up ACME MPASLI config_pes
  Set up ACME MPAS-cice config_pes
  Update CICE config_pes for DTEST
  Set up ACME MPAS-O config_pes.xml
  Add ACME cice config_pes.xml
  ...
Another merge to next

* rljacob/cime/cime5-upgrade:
  Set skybridge env back to cime2 equivalent for HOMME
  Change skybridge and redsky envs to be exact match to cime2
bless_test_results: hot fix to support acme baseline naming

[BFB]

* jgfouca/bless_test_hotfix:
  bless_test_results: hot fix to support acme baseline naming
  Hotfix for anvil SAVE_TIMING_DIR
remove unsupported FVM code from HOMME

This code is under active development at NCAR - it allows tracer
advection to be done with the CSLAM algorithm. The code in ACME was
obsolete and not maintained. If ACME wants to adopt CSLAM for v3, we
can work with NCAR to bring in the new version.

[BFB] for all acme tests except HOMME: some HOMME tests will have roundoff level
differences in PS output field
Update melvin gcc to 5.3.0

Fixes build failure in ERP_Ln9.ne30_ne30.FC5.melvin_gnu.cam-outfrq9s

[BFB]

* jgfouca/update_melvin_compiler:
  Update melvin gcc to 5.3.0
…1189)

Adjust default walltimes for sandia HPCs

Redsky was set to 50 minutes, which is too short to run some of our
tests.

On the other end, skybridge was defaulting to too much time.

[BFB]

* jgfouca/adjust_walltimes_for_sandia_hpcs:
  Adjust default walltimes for sandia HPCs
…st_results' into next (PR# 1190)

compare_test_results: Support ACME's $compiler/$name baseline scheme

[BFB]

* jgfouca/cime/support_acme_baseline_scheme_in_compare_test_results:
  compare_test_results: Support ACME's $compiler/$name baseline scheme
Fix pgiacc environment for titan

Use 'module use' command instead of environment variable
to add to MODULEPATH. Allows this change to happen before we
attempt to load modules. Still seeing build errors with pgiacc.

Fixes #1173

[BFB]

* jgfouca/titan_env_fix:
  Only do titan pgiacc fix for ACME
  Changing pgi_acc to pgiacc in config_machines.xml
  Special case for titan and pgiacc
  Renaming Depends.titan.pgi_acc to Depends.titan.pgiacc
  Fix pgiacc environment for titan
Update redsky configuration

ERP_Ld3.ne30_oEC.A_WCYCL2000 was getting killed on redsky due to running
out of memory. This PR doubles the number of tasks for this grid on
redsky and also increases the jobmax for the ec queue for both
redsky and skybridge.

[BFB]

* jgfouca/machine_files/redsky_updates:
  Update redsky configuration
…1195)

Allow for suite-specific test walltimes in ACME

Adds an optional field to the test suite definitions to
allow a maxtime to be specified for a suite.

Also allows for multiple inheritance of test suites.
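
As a rough illustration of the two features described above, here is a minimal sketch of suite definitions that carry an optional maxtime and can inherit from more than one parent suite. The table layout, the function names, the suite names, the default walltime, and the `SMS`/`ERS` test names are all assumptions made for this example; they are not taken from the ACME scripts.

```python
# Hypothetical suite table: name -> (parent suites, optional maxtime, tests).
# A maxtime of None means the suite does not set its own walltime.
SUITES = {
    "smoke": ((), "00:30:00", ("SMS.f19_g16.X",)),
    "restart": ((), None, ("ERS.f19_g16.X",)),
    "developer": (
        ("smoke", "restart"),  # multiple inheritance: two parent suites
        "01:00:00",
        ("ERP_Ld3.ne30_oEC.A_WCYCL2000",),
    ),
}


def get_suite_tests(name, suites=SUITES):
    """Return the sorted tests of a suite, including all inherited suites."""
    parents, _, tests = suites[name]
    result = set(tests)
    for parent in parents:
        result.update(get_suite_tests(parent, suites))
    return sorted(result)


def get_suite_walltime(name, suites=SUITES, default="02:00:00"):
    """Return the suite's own maxtime, or the default if it sets none."""
    _, maxtime, _ = suites[name]
    return maxtime if maxtime is not None else default
```

In this sketch a suite without a maxtime falls back to a global default rather than to a parent's value; whether the real implementation inherits walltimes from parents is not stated in the commit message.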

[BFB]

* jgfouca/cime/timing_info_for_test_suites:
  Allow for suite-specific test walltimes in ACME
Each model is now required to provide its own element_state.F90, which contains:

elem_state_t
elem_accum_t
elem_derived_t

In addition, the semi-lagrange code was moved out of the generic prim_advection_mod base class and isolated to its own subroutines in src/preqx.

[BFB]
Keep MPAS SCRATCH files open during init, run and finalization

Previously, SCRATCH files were opened and closed in each init_mct, run_mct and
final_mct call. With a large number of MPI ranks, that may
overwhelm a file system. This change instead opens the per-process temporary
files once in the init_mct call and closes them once in the final_mct call,
keeping them open across all run_mct calls.
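
The open-once pattern described above can be sketched as follows. The actual change lives in the MPAS Fortran driver routines (init_mct, run_mct, final_mct); this Python class, its method names, and the `open_count` counter are hypothetical and only illustrate the file-handle lifecycle.

```python
import tempfile


class ScratchComponent:
    """Hypothetical component that keeps one scratch file per process."""

    def __init__(self):
        self.scratch = None
        self.open_count = 0  # counts how often the file system is asked to open

    def init(self):
        # Open the per-process scratch file once, at initialization.
        self.scratch = tempfile.TemporaryFile(mode="w+")
        self.open_count += 1

    def run(self, step):
        # Reuse the already-open handle; no open/close per run call.
        self.scratch.write("step %d\n" % step)

    def finalize(self):
        # Close the scratch file once, at finalization.
        self.scratch.close()
        self.scratch = None
```

With N ranks and M run calls, this reduces the number of open/close operations from roughly N*(M+2) to N, which is the load reduction the commit is after.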

Fixes #1106
[BFB]
…1200)

Fix machine config for edison.

Add PROJECT info.
Change CESMSCRATCHROOT to not conflict with other nersc machines.

[BFB]
Fix git describe errors on some platforms

Need to run it from the root of the repo on edison and cori.
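
The commit message does not say how the fix was implemented. One way to run `git describe` from the repository root regardless of the caller's working directory is sketched below; `find_repo_root` and `git_describe` are hypothetical helper names, not functions from CIME.

```python
import os
import subprocess


def find_repo_root(start_dir):
    """Walk upward from start_dir until a directory containing .git is found."""
    current = os.path.abspath(start_dir)
    while True:
        if os.path.exists(os.path.join(current, ".git")):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a repository.
            return None
        current = parent


def git_describe(start_dir):
    """Run `git describe` with the repository root as the working directory."""
    root = find_repo_root(start_dir)
    if root is None:
        raise RuntimeError("No git repository found above %s" % start_dir)
    output = subprocess.check_output(["git", "describe"], cwd=root)
    return output.decode().strip()
```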

[BFB]
Add expect statement for jobid search

This search must succeed. With the previous code, a failed search produced
a nasty and mostly useless stack trace. With this change, we should get
an informative error message.
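
A minimal sketch of the idea, assuming the job id is the first run of digits in the submit output. The `expect` helper here is a local stand-in written for this example, and `parse_jobid` and its default pattern are hypothetical; neither is copied from the CIME source.

```python
import re


def expect(condition, error_msg):
    """Stop with a readable message when a required condition does not hold."""
    if not condition:
        raise SystemExit("ERROR: " + error_msg)


def parse_jobid(submit_output, pattern=r"(\d+)"):
    """Extract the job id from the batch system's submit output."""
    match = re.search(pattern, submit_output)
    # Without this check, match.group(1) on None would raise an
    # AttributeError with a stack trace that says little about the cause.
    expect(match is not None,
           "Could not find a job id in submit output: %r" % submit_output)
    return match.group(1)
```

For example, `parse_jobid("Submitted batch job 12345")` returns `"12345"`, while output containing no digits stops with the error message instead of a traceback.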

[BFB]

* jgfouca/cime/jobid_fail_err_msg:
  Add expect statement for jobid search
Add back cron_script from pre-CIME5 ACME.

It somehow got lost in the transition and is needed for titan runs.

[BFB]
Change cron_script to always submit to dashboard.

[BFB]
Fix incorrect CMPASO and GMPAS compsets in CIME5

In moving from CIME2 to CIME5, the MPAS compsets were incorrectly translated and
need to be fixed. The CMPASO compsets are currently defined to include active
MPAS-CICE (which makes them G compsets) instead of DICE, while one of the GMPAS
compsets is missing and the other is misnamed. Results will be non-BFB, but
only for CMPASO tests, since the components they use will change back to the
correct set. This should not impact fully-coupled results.

[non-BFB]
[FCC]
@ndkeen ndkeen self-assigned this Jan 17, 2017
@ndkeen ndkeen closed this Jan 17, 2017
@ndkeen
Contributor Author

ndkeen commented Jan 17, 2017

I went the wrong direction

jgfouca pushed a commit that referenced this pull request Jun 2, 2017
remove empty mpi-serial settings from config_machines.xml
Remove the requirement to have mpilib mpi-serial explicitly listed in the config_machines.xml file since it is assumed supported on all systems. Remove the empty executable field for that library.

Test suite: scripts_regression_tests.py, preview_run (thanks for this!), SMS_Mmpi-serial.f19_g17.X.hobart_intel
Test baseline:
Test namelist changes:
Test status: bit for bit
Fixes #1220

User interface changes?: no

Code review:rjacob, jayesh, ...

5 participants