Remove potential race conditions #78

Merged
merged 1 commit into from
Jan 27, 2015

Conversation

jgfouca
Member

@jgfouca jgfouca commented Jan 9, 2015

IOP tests need extra decoration on their sub test cases in
order to ensure no race condition.

Test cases that need clones should put clones within original
test case directory.

@jeff-cohere
Contributor

Is this change only intended for Melvin and Redsky? Do you foresee such race conditions on other machines? (Sorry for the delay in looking at this--was out Friday).

@jgfouca
Member Author

jgfouca commented Jan 12, 2015

The changes to the test case scripts will affect everyone. The RUNDIR and EXEROOT changes are only for melvin and redsky. Other machines will have to make similar changes if they are to avoid races.

@worleyph
Contributor

I looked at this briefly and thought that it would affect melvin and redsky only. On Titan, you do not want to put bld and run in your case directory unless your case directory is in the Lustre project-shared or scratch areas. Similar issues probably arise on Mira and at NERSC (limited disk space in home directories). In any case, this change is purely for the test scripts, correct? Will it also affect users building cases?

@jgfouca
Member Author

jgfouca commented Jan 13, 2015

The changes to the test cases, e.g. ERH_script, will affect everyone.
I would think on Titan, you'd want everything, including the case directory, on a fast file system. What's the use case for having case and bld on separate file systems?

@worleyph
Contributor

The fast file systems on Titan get swept. People like to put the case directories in their home directories, and the bld and run directories are in scratch (by default with the current scripts). ERH_script would not be called by a user who just wants to build a new science case, correct? This is still limited to testing? If so, being swept after 30 days may not be an issue?

@rljacob
Member

rljacob commented Jan 13, 2015

Could you add a description of the race-condition failure or provide the link to Confluence?

@jgfouca
Member Author

jgfouca commented Jan 13, 2015

@worleyph, yes the ERH_script is a "test case" so only used in ERH tests that are in our test categories.
@rljacob, do you want me to add it to this conversation or to the commit message?

@rljacob
Member

rljacob commented Jan 13, 2015

This conversation.

@jgfouca
Member Author

jgfouca commented Jan 13, 2015

OK, I'll try. If the following conditions are met, you have a race:

  1. Test A creates a "sub" test case B
  2. There is also a test case "B" listed in your test category
  3. On a batch system, test A is launched and begins to run
  4. As A is running, the create_test script gets to "B" and begins to build it
  5. testroot/B/bld is now simultaneously the bld area for two cases and bad things happen
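
The five conditions above can be sketched as a small collision check. This is only an illustration, not code from the test scripts: the helper names `bld_dir` and `find_bld_collisions`, and the assumption that every case builds in `<test_root>/<case_name>/bld`, are hypothetical.

```python
import os


def bld_dir(test_root, case_name):
    # Assumed layout: every case builds in <test_root>/<case_name>/bld.
    return os.path.join(test_root, case_name, "bld")


def find_bld_collisions(test_root, suite, sub_cases):
    """Return the build directories that more than one case would use.

    suite:     test case names listed in the test category.
    sub_cases: dict mapping a parent test to the "sub" test cases it creates.
    """
    owners = {}
    for case in suite:
        owners.setdefault(bld_dir(test_root, case), []).append(case)
    for parent, children in sub_cases.items():
        for child in children:
            label = "%s (sub test case of %s)" % (child, parent)
            owners.setdefault(bld_dir(test_root, child), []).append(label)
    # Any directory with two or more owners is a potential race.
    return {path: who for path, who in owners.items() if len(who) > 1}


# Test A creates sub test case B, and B is also listed in the category.
collisions = find_bld_collisions("testroot", ["A", "B"], {"A": ["B"]})
for path, who in collisions.items():
    print(path, "is shared by", who)
```

A check like this only finds the overlap statically; whether the race actually fires still depends on the batch system starting A while create_test is building B.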

@jeff-cohere
Contributor

If any of the acme_developer tests create "sub" tests, I think this means we have to address this race condition on all ACME-supported machines, since that test category is supposed to run everywhere. But it sounds like we don't have a solution that is "elastic" enough for that yet. Can someone correct me if I'm wrong?

@jgfouca
Member Author

jgfouca commented Jan 13, 2015

Yeah, it sounds like, in some cases, the builds need to go in the case directory and in other cases they don't. To me, the key factor is whether the cases are being created by create_test or not. If create_test is the driver, then builds need to go in the case directory to avoid conflicts, otherwise they need to go in the scratch area. Thoughts?
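
A rough sketch of that rule, assuming a hypothetical helper `choose_exeroot` and a `driven_by_create_test` flag, neither of which exists in the scripts:

```python
import os


def choose_exeroot(case_dir, scratch_root, case_name, driven_by_create_test):
    """Pick a build directory (EXEROOT) for a case.

    Hypothetical helper: the name and parameters are assumptions made
    for illustration only.
    """
    if driven_by_create_test:
        # Test-driven case: build inside the case directory so that no
        # other test case can end up sharing the build area.
        return os.path.join(case_dir, "bld")
    # User-created case: keep the build in the scratch area.
    return os.path.join(scratch_root, case_name, "bld")


print(choose_exeroot("testroot/B.some_id", "/scratch/user", "B", True))
print(choose_exeroot("/home/user/cases/B", "/scratch/user", "B", False))
```

The same choice would presumably apply to RUNDIR.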

@rljacob
Member

rljacob commented Jan 13, 2015

Do you need both the changes to the scripts AND the change to RUNDIR and EXEROOT to avoid the race condition?

@jgfouca
Member Author

jgfouca commented Jan 13, 2015

The changes to the test case scripts are mostly for cleanliness. The cloned cases usually have a "refN" appended, so I haven't seen a conflict. This just avoids clutter in your test root.

@jgfouca
Member Author

jgfouca commented Jan 15, 2015

Hi all,
Our redsky tests are still on hold until this is resolved. Any thoughts? One idea I had is that we can remove any test case from our suite that is also being created as a sub test case by another case. In other words, in my example above, we can remove test case "B" from acme_integration since "A" will create a "B" test case anyway. I don't like this as much as fixing the test scripts such that test cases cannot conflict with each other, but it might be less controversial. Thoughts?
-Jim

@jeff-cohere
Contributor

Removing redundancies in tests is a good short-term fix, but it requires intimate knowledge about which tests create subtests. I think this would allow us some time to think about a more general strategy. We should also ask CSEG if they see these race conditions (though they have been putting enough recent effort into their testing system that it's already beginning not to resemble ours[!]).

As an aside, a name-mangling scheme for cases that involved process IDs would be a way to ensure that different tests and subtests don't interfere with each other.

@worleyph
Contributor

I agree with @jnjohnsonlbl that a way to generate unique case names is the long-term solution. Perhaps the "parent" case name can be prepended to the case name of the sub test case (by adding an optional parameter of some sort to the scripts)?
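
A minimal sketch of both naming ideas (parent-name prefix and process-ID suffix). The function `unique_sub_case_name`, the `__` separator, and the `.p<pid>` suffix are all hypothetical choices, not anything the scripts currently support:

```python
import os


def unique_sub_case_name(sub_case, parent_case=None, use_pid=False):
    """Build a sub test case name that should not clash with other cases.

    Hypothetical helper; the separator and suffix format are assumptions.
    """
    name = sub_case
    if parent_case:
        # Prepend the parent case name, as suggested above.
        name = "%s__%s" % (parent_case, name)
    if use_pid:
        # Append the ID of the process creating the case.
        name = "%s.p%d" % (name, os.getpid())
    return name


print(unique_sub_case_name("B"))                   # unchanged name
print(unique_sub_case_name("B", parent_case="A"))  # parent-prefixed name
print(unique_sub_case_name("B", use_pid=True))     # PID-suffixed name
```

Note that a process ID is only unique on one node at one time, so the parent-name prefix is probably the more robust of the two on a batch system.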

@jgfouca
Member Author

jgfouca commented Jan 15, 2015

That's a good idea. Unless there are objections, I'll move forward with the mangling.

@jgfouca jgfouca force-pushed the jgfouca/ccsm_utils/fix-potential-race-conditions branch from 73b79a1 to 9502f7f Compare January 26, 2015 23:32
@jgfouca
Member Author

jgfouca commented Jan 26, 2015

I struggled with this a bit but eventually settled on mangling the test-id parameter for sub test cases needed by ERS_IOP* tests. The test name itself could not be mangled because the scripts parse that name and infer information about the test from it. Thoughts?

@jeff-cohere
Contributor

Is mangling the test-id parameter sufficient to prevent this race condition? It's unfortunate that the test name is tightly coupled, but we might not need to change it to get over this hump.

Thanks for continuing to look into this in any case, Jim!

@jgfouca
Member Author

jgfouca commented Jan 26, 2015

Yes, it should be sufficient. You can see the potential for races if you look at our test suite:
ERS.f19_g16_rx1.A
ERS_IOP.f19_g16_rx1.A
ERS_IOP4c.f19_g16_rx1.A
ERS_IOP4p.f19_g16_rx1.A
... all of these will try to create an ERS.f19_g16_rx1.A-${TEST_ID} directory in the RUNDIR and EXEROOT. By default, the TEST_ID has a timestamp, which reduces the odds of a conflict, but batch systems might fire off several of these at the same time, so it's not foolproof at all. Adding some source test case data to the TEST_ID should prevent potential conflicts.
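
To illustrate, here is a sketch of how a mangled TEST_ID keeps the directories apart. The `<test name>-<TEST_ID>` naming comes from the description above; the helpers `case_dir` and `sub_case_dir`, the choice to append the parent's test type to the ID, and the sample timestamp are assumptions, not the exact scheme in the commit:

```python
import os


def case_dir(root, test_name, test_id):
    # Directory naming as described above: <test name>-<TEST_ID>.
    return os.path.join(root, "%s-%s" % (test_name, test_id))


def sub_case_dir(root, parent_test, test_id):
    """Directory for the ERS sub test case created by an ERS_IOP* parent.

    Hypothetical mangling: append the parent's test type to the TEST_ID.
    The test name itself is left alone because the scripts parse it.
    """
    parent_type, rest = parent_test.split(".", 1)
    sub_test = "ERS." + rest
    mangled_id = "%s.%s" % (test_id, parent_type)
    return case_dir(root, sub_test, mangled_id)


suite = [
    "ERS.f19_g16_rx1.A",
    "ERS_IOP.f19_g16_rx1.A",
    "ERS_IOP4c.f19_g16_rx1.A",
    "ERS_IOP4p.f19_g16_rx1.A",
]
test_id = "20150126_233200"  # hypothetical timestamp shared by all four tests

# The plain ERS test keeps its TEST_ID; the sub test cases get mangled ones.
dirs = [case_dir("rundir", suite[0], test_id)]
dirs += [sub_case_dir("rundir", parent, test_id) for parent in suite[1:]]
for d in dirs:
    print(d)
```

Even with an identical timestamp, the four directories differ because each mangled ID carries the name of the test that created the sub test case.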

@jeff-cohere
Contributor

Okay, I'll do the merge tomorrow. Thanks!

@jgfouca
Member Author

jgfouca commented Jan 27, 2015

I can merge it if you need to head home. I'd like to get it in before tests kick off tonight.

@jeff-cohere
Contributor

I'm just in the middle of something else at the moment, so if you can merge, it would be great. For what it's worth, the changes look good, so I can send you an ASCII seal of approval if you want. :-)

@jgfouca
Member Author

jgfouca commented Jan 27, 2015

Let me elaborate: redsky has been a bit busier than usual lately and I was having problems with tests not getting a chance to run in the 24-hour testing window. I bumped the window to 48 hours with tests running every other day of the month (*/2 in cron), so if we don't run tonight, we won't run until Thursday and I won't have conclusive evidence of a fix for our Friday meeting.

@jgfouca jgfouca merged commit 9502f7f into master Jan 27, 2015
@jeff-cohere
Contributor

No need to elaborate--I am well familiar with the need to push something through a crowded queue! Thanks for the info, though.

@jgfouca jgfouca deleted the jgfouca/ccsm_utils/fix-potential-race-conditions branch April 7, 2015 21:18
rljacob added a commit that referenced this pull request Sep 7, 2016
e616da0 conditional was backward
e4b520f Merge pull request #513 from jedwards4b/mask_grid_fix
b79f247 fix issue with task count for archive tools
58e1f5b Merge pull request #512 from jedwards4b/user_mods_path_fix
81888fa skip save_timings tests for cesm
83bcaff dont look for _NX and _NY for MASK
af51d92 add back MASK_GRID removed in earlier tag - used by clm component
e4bbd32 fix pylint issue
5adc5d1 fix issue with user mods path
f21e864 get correct xml var
35126b9 Merge pull request #508 from ESMCI/jgfouca/decouple_provenance
5329354 New command-line access to provenance capabilities
2ab2202 Merge pull request #507 from ESMCI/jgfouca/fix_indent_error
5c45efd Fix indent error in hist_utils
e5d6423 Merge pull request #502 from jedwards4b/undo_move_changes
327ea5d Merge pull request #503 from ESMCI/jgfouca/get_climate_working_for_cime
612f5a4 set suffix None
67a8165 Get new sandia desktop machine 'climate' running scripts_regression_tests
66a5b48 undo the changes to the hist_utils move tool and remove suffix in ssp test (no hist files produced)
7262448 Merge pull request #501 from jedwards4b/user_mods_and_pe_layouts
75309c3 improved handling of user_mods_dir
2a1f3ea get it right
944f415 typo fix
f3e9c9d component_compare_move should not expect
4bcb7d8 Merge pull request #499 from ESMCI/jedwards/perl_xml_workaround
e78458d fix issues with user_mods user_nl_ files and pe layouts
a1ad4b4 Merge pull request #495 from ESMCI/jgfouca/make_builds_more_thread_safe
57a4726 lnd build should do SMP build if overall case is SMP
779fd21 Merge pull request #494 from ESMCI/jgfouca/remove_PT_from_acme_tests
bc9cd7c Remove _PT from acme PET tests
593069e Merge pull request #493 from ESMCI/jgfouca/no_baseline_should_be_compare_fail
109254b Merge pull request #492 from ESMCI/jgfouca/make_builds_more_thread_safe
c19b9ef Do not completely fail if no hists were compared
7838d29 Make builds thread safe
5f5b15b Merge pull request #489 from ESMCI/douglasjacobsen/update_lanl_machine_files
8e4303f Add `-std=c99` for gnu compilers when building csm_share
7fc7463 Remove redundant definition of (p)netcdf variables for LANL machines
e90e1a0 workaround for problem resolving vars in perl
45cc32d Change PEM testcase to ERP
f56435e Change unsupported PMT testcase to PEM
f79c22f We probably want 1.8 afterall
5301f38 Merge pull request #488 from ESMCI/jgfouca/env_changes_for_sandia_machs
2fc853b Merge pull request #487 from jedwards4b/config_pes_fix
173f765 Change skybridge back to openmpi1.6, update config for redsky
9ed61aa fix pylint issue
f7d434a correct calculation of pes_per_node when specified in config_pes.xml file;
651a5a3 update ChangeLog
3c8ce42 Merge branch 'jedwards4b-multiinstance_plus_nck_fixes'
35e2094 merge to master
ea7c23a make sure these test always build threaded
8224cb4 fix pylint issues
2df47af Merge pull request #484 from ekluzek/fixquerymachines
7019fcb Suggestions from Jim, add comment about machine name in manage_case and modify default for mach option in create_newcase
3e30771 remove debug print statements
02dc89b git rid of the dot
3539f8c rework and clean up _hists_match
3bbd222  add a documentation note to hist_utils.py
b354e17 fix issue with user_mods/test_mods
d9c0e98 Fix minor bug
01b5888 Rewrite NCK test using SystemTestsCompareTwo
f0f7571 change debug log message
3922258 response to review comments
cf2de41 need to copy CaseDocs to baseline dir
76a7dbc make help message consistant with create_test
ec5430f make help message consistant with create_test
1e8b10f skip this test in cesm
f8d9bdb add special case for cpl compare in multiinst cases
d14f787 fix issues with create_test and scripts_regression_test
32eecba support for multiinstance cases
4f3232a add option allow_baseline_overwrite to generate_baseline, fix issue 310
d3ea142 Merge remote-tracking branch 'remotes/esmci/master' into fixquerymachines
f29b0ea Add missing OS for oci5 machine for acme
0a224eb Allow manage_case --query-machines to work and remove "(required)" from -mach
cff1801 Merge pull request #481 from jedwards4b/fix_manage_case
e7b934b provide a machine name in manage_case
0342608 Merge pull request #477 from ESMCI/jgfouca/fix_single_submit_and_test_cleanup
3de40f3 Fix single submit, cleanup scripts_regr_test by encapsulating run_cmd
98db22a Merge pull request #475 from jedwards4b/pes_config_fix
3ae8178 get children from each section
4b5c50f make sure all settings in config_pes are used
d06ae19 Fix boneheaded mistake in create_test
9d27a45 Merge pull request #461 from billsacks/create_test_help
82a8376 Merge pull request #466 from ESMCI/santos/config-build-fallback
b1fdd18 Use `config_build` as fallback `config_compilers`.
def6e59 Merge pull request #463 from jedwards4b/fix_create_test
00a9cd2 fix issues introduced in PR 459
9e62b16 Merge pull request #460 from quantheory/python-config
915df23 Add `configure.configure`.
4552326 Fix issue with mpi-serial on yellowstone.
ac35f0a Clean up some help text
3cf9eec Use `configure` description as docstring.
a3a0a39 fix issue with create_test command line args for cesm users
febb8cb Merge pull request #459 from ESMCI/jgfouca/remove_compiler_in_baseline_dir
9c5352e Restore setup_standard_logging_options
23dd5af Move `CIME.macros` to `CIME.XML.build`.
329ab29 Update cprnc README with configure changes.
0a051b4 Write compiler/mpilib/debug info from configure.
7a520f5 Change how `configure` gets compiler/mpilib/debug.
324339d Allow `configure` to autodetect machine.
b050581 Fix erroneous syntax in write statement.
7fe9067 Update CESM `config_build.xml` file.
c4229fe Translate `configure` script to Python.
0e7ca89 Remove `os_` from `MacroMaker` constructor.
66506ba Merge pull request #458 from billsacks/unit_tests_change_back_to_original_dir
91eb28a Fix pylint errors in scripts_regression_tests
002f046 Return to the original directory after unit tests
5e0f491 Revert "Partial revert to find bugs"
2a17a80 I don't understand this
f1808dd Minor fix, add -o
cf3f1e7 Partial revert to find bugs
cbc8a68 progress
726cfba Merge pull request #456 from fischer-ncar/testreporter_fix
cf58d1d Update to testreporter to handle new changes to TestStatus logs
9e9d5c2 Updating to ESMCI master
bae3a8c Merge pull request #450 from jedwards4b/cesm_workflow_fix
4b0a72f add -o short option
f7a5dd4 add --allow-baseline-overwrite flag to create_test
c09ed15 refuse to overwrite existing baseline directory in cesm workflow
cfab668 Merge pull request #442 from jedwards4b/bluewaters_update
6abeca7 update modules on bluewaters
ef030cf Merge pull request #439 from jedwards4b/edison_module_updates
cbde559 Merge pull request #441 from jedwards4b/pylint_version
63c345d should be < 5
9ef4050 check for pylint version
e39eb93 add disable for pylint
0e484d9 fix setup issue in pea test
36f7039 Merge pull request #430 from jedwards4b/pea_test_fix
fc081a8 rebase and update based on pr review
8974922 update documentation
169cc31 update documentation
acd6b2b add two build capability to system_tests_compare_two and rewrite pea to use it
4097183 force regeneration of Macros file in pea test
8377b0e update netcdf and pnetcdf on edison
4fb5c77 Merge pull request #437 from ESMCI/jgfouca/fix_longstanding_nightly_fail
38ab9f1 Merge pull request #410 from ESMCI/sarich/eos_config
78eef72 Merge pull request #436 from ESMCI/jgfouca/fix_pylint_err_in_compare_two
8b8cbef Ensure exceptions are added to TestStatus.log
1e4f0d2 Remove unused argument from run_indv
9099b4e Merge pull request #434 from ESMCI/santos/fix-recursion
50f145b Merge pull request #435 from billsacks/fix_pylint_problems
3d62b25 Fix problems discovered by pylint / code_checker
aa6a8a1 Prevent infinite recursion in `Case`.
94b27aa Merge branch 'jgfouca/hist_tools_conv_to_python' (PR #413)
164cfd9 Make comparison matchups more robust
6fb0220 Fix user docs for compare_test_results
a5e7531 Merge branch 'fix_issue_417' (PR #419)
f6ea4a1 improved reporting of baseline file count mismatch
da2da68 Merge pull request #427 from bertinia/archive_schema
96c3c18 correct location of debug log in help message, store baselines with original filename
4e5facf Add usage example for typical CESM workflow
9fac682 Get rid of pdb trace that I believe was mistakenly left in
df31225 Make a very obvious simplification to code
0221d99 Merge pull request #425 from jedwards4b/namelist_compare_fix
e8a6f92 Update config_archive.xml and archive.xsd for validation.
51e181b Remove unneeded global
51afde0 Update hist infra to better-support user-chosen baseline_root
aec4b2f Merge pull request #426 from ESMCI/jgfouca/melvin_git
8a430e3 Make sure to load git on melvin after purge
39b5632 fix issue matching case name if case has both G and C actions
7847e04 minor help string fix
19a5b30 More fixes from review
cf4b7df fix issue in component_generate_baseline, get only most recent files
0c483c7 Remove last cwd default args
9b6943b Remove dangerous cwd defaults, add documentation to hist_utils public API
a0e010e Add new compare_test_results, counterpart to bless_test_results
311ce87 move code around in configure so that project is resolved
4953517 Merge pull request #404 from billsacks/two_part_system_tests_clone
8231716 Merge remote-tracking branch 'esmci/master' into two_part_system_tests_clone
26345c0 bless_test_results: Need sane error code
30a48da remove check for None
39a778b fixes in hist_utils
cc502e4 Fix mistake caught by code review
949cebb Merge pull request #420 from jedwards4b/nag_port
b7f163a disable pnetcdf with nag
33f93bf use $ENV{MEMBERWORK} because MEMBERWORK is expected to be an environment variable
e070e90 look in env_batch if var is otherwise unresolved before giving up
196c7dc Merge pull request #414 from jedwards4b/more_early_resolve_issues
8dbbc43 still cannot use pnetcdf with nag
5765f99 use $PROJECT in eos config file
876201d fix issues for nag compiler
a2f464a Update comments based on feedback from Jim Edwards
bb15173 fix typo in eos xml, now mpi-serial should no longer set pnetcdf variables.
51b3cf4 Add a flush after setting BUILD_COMPLETE for case2
882aedc Set case2 BUILD_COMPLETE after case1 builds
ba2135f Rewrite PET using the new SystemTestsCompareTwo infrastructure
4107bad fix issue in perl cice path was corrupted
0f6dfd5 Upgrade history tools to python
b33c9e1 remove unused MASK_GRID variable
212b185 Merge pull request #411 from jedwards4b/batch_fix_reorder_scripts_regression_tests
00755f5 trying again Revert "Revert "More early resolve issues""
aa9c4dc add timestamp to testcase names
631b7fa fix indent
40f5b47 add support for special queue on yellowstone
60290c6 moved config_tests.xml to cime_config directory, reorder tests in scripts_regression tests
0b52c77 moved config_tests.xml to cime_config directory, reorder tests in scripts_regression tests
b4bf40a location of config_tests.xml
6a7b86e remove debugging argument
b1d9662 initial eos configuration
4560c5a Reorganize unit tests based on discussion with Jim Edwards
23ebbfd add timestamp to testcase names
87b6cd9 update python version
02b0567 fix up eos information
e1464e2 add eos to supported acme machines, test
756d13e Merge pull request #407 from ESMCI/jayeshkrishna/pio2/latest_master_081616
c9ce402 Merge pull request #409 from ESMCI/santos/remove-esmf
d72a111 Remove `*_comp_esmf` modules for stub and xcpl.
b78608a fix indent
518552b add support for special queue on yellowstone
1ff6c2b moved config_tests.xml to cime_config directory, reorder tests in scripts_regression tests
d334f39 moved config_tests.xml to cime_config directory, reorder tests in scripts_regression tests
525492d Merge pull request #370 from ESMCI/wilke/scripts/xmlchange
4ac5402 Merge branch 'master' into wilke/scripts/xmlchange
abfbee4 Merge branch 'ParallelIO_branch' (PIO2 master)
f2a27a2 Add some documentation on LII and REP tests
9331f85 Merge pull request #403 from ESMCI/revert-398-more_early_resolve_issues
fe0ae4d Revert "More early resolve issues"
f95584b Merge pull request #402 from billsacks/do_not_get_cwd_in_arg_default
f210e1c Change implementation of default caseroot for check_lockedfiles
492e038 Change implementation of default test_dir for TestStatus
c533185 Merge remote-tracking branch 'esmci/master' into two_part_system_tests_clone
cdb9fb2 Merge pull request #392 from ESMCI/jgfouca/guard_against_test_obj_init_throw
062bc53 Merge pull request #398 from jedwards4b/more_early_resolve_issues
f09e6e8 clean up debug code
de9c59d clean up debug code
87d0c1a update clone cimeroot variable
5b514cc fix more early resolve issues
448c428 Merge pull request #395 from jedwards4b/fix_vars_resolved_too_early
6be24a0 Merge pull request #397 from ESMCI/jayeshkrishna/pio/more_pio_rearr_opts
6da59d3 remove redundunt xml read
73a0e45 fix issue with resolving DIN_LOC_ROOT in perl
90422ae variables were being resolved too early causing env vars to be used incorrectly
dab4528 Merge pull request #394 from jedwards4b/external_system_support
6e14a4b work on support for external systems
a14d719 Exceptions in SystemTest constructors should leave TestStatus in decent state
1fe2631 Merge pull request #93 from Katetc/master
4d798ad Changes required for nightly cdash to work with new Hobart and Nag 6.1
eed3ba1 Merge branch 'jayeshkrishna/shr/cime_more_pio_rearr_opts' into jayeshkrishna/pio/more_pio_rearr_opts
81f373f Merge pull request #92 from Katetc/master
b874c8a Merge branch 'ParallelIO_pio1_branch' into jayeshkrishna/pio/more_pio_rearr_opts
941a4b8 Changes required for the new Hobart nag 6.1
418c9ad Merge pull request #8 from NCAR/master
3f0fd7c Merge pull request #90 from NCAR/jayeshkrishna/pio1_0/pio1_more_rearr_opts
72bc74a Disable logging for test_unittests
84c01b0 Revert "Implementation #2 of running other unit tests from scripts_regression_tests"
54af6ca Merge branch 'esmci_master' into two_part_system_tests_clone
13d7ea7 Fix error in build_indv call
cda8a0f Make _common_setup optional
8b546a4 Move common_setup xml changes into config_tests.xml
aa412ee Remove 'Clone' from SystemTestsCompareTwo name
15cbdea Remove now-unused SystemTestsCompareTwo and associated unit tests
a0f8795 Tweak unit tests
ba49873 Add unit tests of runs failing
e8f52a0 Get test_run_phase_internal_calls working
c16fa9c Make _link_to_case2_output not a staticmethod
25ab9f1 Begin recording calls to stub methods
a116a08 Rework SystemTestsCompareTwoClone for recently-changed test infrastructure
8eab267 Tweak unit tests
daa2a4b Begin adding unit tests for SystemTestsCompareTwoClone
04f17b2 Merge pull request #91 from NCAR/ejh_fix_test_names
65e731c removed test that depended on changing netCDF error string
0381d05 Reword a comment
42f8c59 Fixes #364 , ignore trailing equal signs in split
643e54a Merge remote-tracking branch 'esmci/master' into two_part_system_tests_clone
a05e16e Add a comment
a4e379b Add separate_builds argument to SystemTestsCompareTwoClone constructor
f798050 Extract code to setup cases into a method
8660a9c Implement new functionality needed in user_nl_utils
bf416c5 Add 'WARNING' to output
1e5e620 Move case1 flush to immediately after case1 setup
482e3c5 Add a note
d6ca1ff Link to case2 output in case1 run directory
8911951 Set RUN_WITH_SUBMIT for case2
213a68a Minor fix
48ee90b Add some robustness in test setup
057ff74 Start on implementation of two-run tests using clones
f21d807 Completely remove references to two builds in SystemTestsCompareTwo
1b6fff6 Merge pull request #88 from NCAR/ejh_strerr
86d23b9 added test for fortran pio_strerror
83a8514 Require run one suffix to be 'base'
7f877f8 adding fortran interface to PIOc_strerror()
c0dd8dd changed signature of PIOc_strerror()
1c5f431 added PIOc_strerror() and test for it
9bea831 Revert "Implementation #3 of running other unit tests from scripts_regression_tests"
b33215c Implementation #3 of running other unit tests from scripts_regression_tests
fdfa1a5 Implementation #2 of running other unit tests from scripts_regression_tests
de42303 Implementation #1 of running other unit tests from scripts_regression_tests
6cdc316 Fix some log messages
d9a961d Rewrite ERS test using new infrastructure
b31fcbf Clean up documentation of available tests
4ce194f Merge remote-tracking branch 'esmci/master' into two_part_system_tests
5cbb5d9 Changing PIO1 flow control logic for io2comp and comp2io
469ae01 Merge pull request #87 from NCAR/ejh_docs
e51cceb Comment and clean up
004aeb9 Fixed doc build for async vs. non-async builds
eee2b03 Add unit tests for SystemTestsCompareTwo
e887ef2 Fix syntax errors
c1ed0b7 Rework/cleanup of SystemTestsCompareTwo
ea39095 Merge pull request #7 from NCAR/master
74374c7 Add a comment
3f8e12f Write LII test using the new infrastructure
e406c31 Add _common_setup to SystemTestsCompareTwo
6e7d094 Add missing import statement
57a07da Initial implementation of SystemTestsCompareTwo and REP test
1749053 Add a utility class to copy and modify user_nl_files in system tests
2e9be53 Cime hooks for more PIO1 rearranger options
5600642 Adding more runtime rearranger options
7febfee Merge pull request #84 from NCAR/ejh_darray4
5831e9f more comments
ed31fb0 minor cleanup
d6bd625 removed some dead code, improved comments
3114702 more comments, some code cleanup
ecbc513 more documentation changes
23b06b4 added some comments
7cd49c1 added config.h include
80dfb55 added test_darray_async.c for async darray testing
1e317dc split off pio_darray_async.c
c902eb9 Merge pull request #81 from NCAR/ejh_darray3
5068c60 cleanup and documentation
d827fa6 documentation and spacing changes
6c968a0 Merge pull request #78 from NCAR/ejh_darray2
b7db2ac getting non-async build to work
58cb4db put messages back in until async build is the only build
e7dd43e fixing problems when built without logging, took out unneeded msgs
77152b4 more work on darray test
79de1e3 more work on darray test
4717e60 more work on darray test
0f1fe88 development of darray test
541a070 cleanout of test_darray
bce138f starting to add darray test
a28ab94 Merge pull request #77 from NCAR/ejh_darray1
ede6182 documentation fix
08dc6ff documentation and spacing cleanup
046fc26 Merge pull request #76 from NCAR/revert-75-ejh_cleanup6
1b7ffb6 Revert "Ejh cleanup6"
2559cf4 added logging statement to debug cdash problem
cb8f4ee Merge pull request #75 from NCAR/ejh_cleanup6
bf1ed73 more cleanup
87e32a2 more cleanup
c1df0fb more cleanup
995ebc5 more cleanup
cadc332 breaking branch to test cdash building of branches
1d705fc Merge pull request #74 from NCAR/ejh_cleanup4
6c09c56 more cleanup
617c65a cleanup
82d9faa working on put issue
985ba72 more log messages
be9af21 stopped faking the stride for puts
329777e more logging
612998f more logging
386b641 compensate for poor handling of NULL for stride by pnetcdf
50ab03b more log messages
b3a83a3 more log messages
08b7a7e more log messages
09d5b65 more log messages
abde5d2 more log messages
f845325 more log messages
70238ae more log messages
fcd502d more log messages
887f651 more log messages
7f17935 more log messages
7b4f744 more log messages
c871a94 more log messages
26298e6 more log messages
07bf495 turned pnetcdf back on in test_intercomm
4a391ca more logging statements to find bug
e94b1fc Merge pull request #72 from NCAR/ejh_test_error_handling
ab93ed0 cleanup of error handling in file functions
5c49a3a cleanup of error handling in file functions
c659512 clean up error handling
5651c78 now MPI_abort does not overwrite ret_val
611b76d more logging to try and get enddef working on caldera
5d0d400 changed type of num_elems from size_t to PIO_Offset
8a640f3 no longer run test_intercomm for serial builds
b2f19be tryhing to isolate put problem
378ac20 tryhing to isolate put problem
2f2bb44 tryhing to isolate put problem
e42ad0b turned pnetcdf testing back on for test_intercomm.c
4e50dd0 changes to fix cdash tests, added logging to fortran
4ff0a36 commented out extra declaration
b62a9bb more logging statements
8585feb temporarily turned off pnetcdf in test_intercomm.c
0cd713c added temporary mpi_intercomm_merge() function for MPI_SERIAL builds
7b92cdb more log messages to find bug
5d1dba0 fixed error handling of MPI errors on transfer of parameter data to pio_msg
751adee took out use of log in fortran test
30be388 trying to fix problem with put on some platforms
a9c9029 took out logging statement
30e7878 took out log statement that was causing too much output
3671790 Merge pull request #71 from NCAR/ejh_async4
f1f9a06 removed use of MPI_Comm_create_group
8a9e0e5 Merge pull request #70 from NCAR/ejh_async3
3a205bb turn on async for yellowstone
9ebc917 clean up
1ad2762 response to code review comments
eecde10 Merge branch 'ejh_async2'
696d403 got get_var1 working
e7eced0 got get_var/vara working
de1c4e7 got get_vars working
98eed4d futher cleanup of var1 functions
2fe1f56 cleaned up put_var1 functions
7119c0b return error for async use with varm
4d335b3 added file for varm functions
32e3165 further development of async
d6bfc03 more development of async changes
fdd7d8b more development of async code
58b9466 put_vars with async
138876a Merge branch 'master' into ejh_async2
cea8e39 first pass at PIOc_put_vars_tc()
24a34ff Merge pull request #68 from Katetc/master
ce91865 Added documentation for the Unit and Performance tests in PIO2. Fixed some of the markup on the Examples page.
46ba2cc further development of async code
5f597a8 further development of async code
68c3068 created internal function because pnetcdf does not have inq_type()
7e01a84 got redef function working with async
cf375e1 more async changes
be7b7b4 more async changes
7ec0477 more async changes
a5d6df3 more async changes
b61b5a3 more async changes
a4be288 more async changes
b3d7b94 more async changes
7417325 more async changes
d9ef4b9 more async changes
2810747 continued async development
149d94e continued async development
f60fe96 continued async development
2e17460 continued async development
1728b3e continued async development
01679e5 more cleanup
a055a39 fixed rename_att function for async
bc0d033 Merge pull request #6 from NCAR/master
36c065c Changed Pnetcdf required version to 1.6.1 in documentation.
0ce232f got rename_var working with async
f446718 got rename_var working with async
6c0ee7b got rename_dim working
10ff143 got inq_attid working for async
5b14021 got inq_attid working for async
2a250cf got inq_attname working with async
16e3799 got inq_format working with async
ddaa30a cleaning up call to netcdf layer in PIOc_nc_async
0337dbe cleaning up call to netcdf layer in PIOc_nc_async
9020e2d moved logging code to pioc_support
4866ec1 further development of async code
99632fd further development of async code
06b279e got put_att working with async
e667561 now get_att generalized by type
1c35182 now get_att generalized by type
f5b581a now get_att generalized by type
fed2d0f further development of async get_att
5912c2a added PIOc_inq_type
ae43f47 added PIOc_inq_type
cd4421f further async development
6e7e780 further async development
8938469 further async development
19ea322 further development of async code
54c8287 further cleanup of async code
0a6fb4f further cleanup of async code
f788f6b further cleanup of async code
96e7498 further cleanup of async code
a0aaea9 further cleanup of async code
10b178a code cleanup
80a361f further async development
8b46080 further development of async
891b29c further development of async
f028866 further development of async
2a77fe1 further development of async
e2504bb further development of async
0783073 further development
fb378ca added log message
90f714e continued development of async
fc48722 better error handling
af76675 further async development
b16d251 cleaning up error handling
bb17ac4 rearranged order of functions, started to use LOG in pio_msg
b312291 added logging
8600910 removed some unneeded msg constants
90c689f more inq functions working with async
aee5878 development of async inq functions
3b0ad26 got attribut put working
badb925 got non-async build working again
d2664af got async option working
b16142f got non-async build working
8e34e30 manually merged async changes from ejh_27
3faa0e9 Merge pull request #67 from NCAR/ejh_async_files
458a3fa removed extra index for async ncids
c854f34 changes to pioc.c to support async, also some temporary copies of code files for async development
0c4cbe6 Merge pull request #66 from NCAR/ejh_pio_file_async
32228f5 Merge pull request #65 from NCAR/ejh_example_valgrind
4b825ba Merge pull request #64 from NCAR/ejh_doc
aebdcdd change in response to review feedback
a4cdd06 changing in pio_file.c to support async
648b350 changed example data size, added valgrind suppression file
04db212 added documentation to iosystem_desc_t
17da322 Added a catch for NC_EINVAL errors on file opening (in this case, try plain netcdf before giving up and throwing a total error). This addresses the runtime error in test ERP_Ln9.f19_f19.FW5.yellowstone_intel.cam-outfrq9s
467cb31 Merge pull request #62 from Katetc/master
d5a24c0 Changes to pioperformance.F90 to get it to build and run with PIO1. Also adding the hacky makefile I built for my yelowstone work dir to build pioperf against PIO1 as an example and backup.
c3f5a1d changes to pioperformance.F90 to support the PIO1 library.
a7ce0fa Merge pull request #61 from NCAR/ejh_24
5712329 free MPI group
238583a added comments
81ccf39 now duplicate MPI communicators in init_intracomm
820aeda Merge pull request #5 from NCAR/master
4383369 Merge branch 'jayeshkrishna/pio1_0/pio1_rearr_opts' into pio1_0 makes communications opts runtime rather than compile time
7f46411 add PIO: prefix to timers
c39db3e Fixing PIO Source line too long issue
1aafa5c Updating timing events and associated logic in pio
1053052 Removing some comments - No code change
a45b901 Simplifying the initialization of PIO rearranger options
1e0d159 Allow rearranging data with a collective without any compile-time flags
83195da Moving PIO rearranger options to runtime
22fdf30 Merge pull request #4 from NCAR/master

git-subtree-dir: cime
git-subtree-split: e616da0
jonbob pushed a commit that referenced this pull request Sep 6, 2023
Fixed a bug involving use of invalid MPAS cell IDs. Fixes issue #78
Also added checks to make sure that invalid MPAS IDs of cells/edges/vertices are not used.

* origin/mperego/interface_fix:
  mpas-albany-landice: fix to velocity solver interface
mwarusz pushed a commit to mwarusz/E3SM that referenced this pull request Apr 16, 2024
4 participants