Move case bld directory to $HOME/acme_scratch/$PROJECT/$CASE on titan #503
@worleyph can you take care of this this week? It's needed for the automatic testing on Titan. |
Can someone describe (again) what the original problem was, and why this change solves it? Nothing is changing except moving the bld directory out of the case directory and putting it in a new acme_scratch directory in the user's home directory. I don't like this personally, having gotten used to the existing location (in the case directory). I have run the test suites with the current setup (though not recently), with no problems. |
@worleyph, if your build directory is in /scratch, you get disastrously slow build times; if your run directory is not in /scratch, the compute nodes cannot see it. This change forces these directories into the right places. It also simplifies our nightly testing situation: our Jenkins scripts use $CESMSCRATCHROOT/jenkins as the testroot for create_test, and I'd like to not have to make a special case for titan. Looking at it differently, this change brings titan's behavior in line with all our other machines. No other machine puts its build directories under CASEROOT. |
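The placement rule in the comment above (build off scratch, run on scratch) can be sketched as a small check. This is only an illustration: the function names `is_under` and `check_case_dirs` and the example paths are hypothetical and are not part of the ACME or CIME scripts.

```python
from pathlib import PurePosixPath
from typing import List


def is_under(path: str, root: str) -> bool:
    """Return True if `path` equals `root` or lies somewhere below it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def check_case_dirs(build_dir: str, run_dir: str, scratch_root: str) -> List[str]:
    """Collect violations of the rule described in the thread.

    Assumption (from the discussion, not from any tool): builds on the
    scratch file system are slow, and run directories outside it are
    invisible to the compute nodes.
    """
    problems = []
    if is_under(build_dir, scratch_root):
        problems.append("build directory is on scratch: expect slow builds")
    if not is_under(run_dir, scratch_root):
        problems.append("run directory is off scratch: compute nodes cannot see it")
    return problems
```

Called with a build directory under a home directory and a run directory under the scratch root, the function returns an empty list; swapping the two reports both problems.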
agreed
agreed
The current settings also have the required behavior.
Okay - this is the basis of the problem then.
No other system builds in the home directory. This changes the meaning of $CESMSCRATCHROOT in an equally unexpected way, in that scratch is not the "usual" scratch location. So Jenkins requirements are screwing around with other ways that the model is built and used, which is also different from other systems. I'll give this some thought this afternoon. |
Okay ... if we are going this direction, I'd like to make one change to the proposal: $ENV{HOME}/acme_scratch/$PROJECT (adding $PROJECT to the definition of CESMSCRATCHROOT; this reintroduces some symmetry with the $RUNDIR location that was lost previously). This still takes some control away from the user - I have not given up on building in lustre. It used to work, and may again some day, and may be a requirement in some cases when the home directory is too full to permit builds there. The workaround will require the user to modify env_build.xml, so this is just an educational issue. Please push my suggestion to your pull request. I'll then do a quick test. I assume that I can skip waiting for the automatic testing in next? I haven't done the "quick and dirty" merging process before. Can I ask @mrnorman to do this? Matt should weigh in on this anyway, given that he is the Titan POC. Thanks. |
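For readers following the proposal, here is a minimal sketch of how the suggested location is composed from the user's home directory, the project, and the case name. The function `titan_scratch_paths` and its dictionary keys are hypothetical; the real definition lives in the machine configuration, and the layout below the per-case directory is not shown in this thread.

```python
import posixpath
from typing import Dict


def titan_scratch_paths(home: str, project: str, case: str) -> Dict[str, str]:
    """Compose the proposed Titan locations.

    Mirrors $ENV{HOME}/acme_scratch/$PROJECT as suggested above, plus the
    per-case directory named in the PR title. Illustration only.
    """
    scratch_root = posixpath.join(home, "acme_scratch", project)
    return {
        "CESMSCRATCHROOT": scratch_root,
        "case_dir": posixpath.join(scratch_root, case),
    }
```

With the placeholder inputs `"/home/alice"`, `"cli112"`, and `"mycase"`, the per-case directory comes out as `/home/alice/acme_scratch/cli112/mycase`.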
@worleyph, as to your previous comment, I agree with most of what you said. I will just note that the correctness of the current settings depends on the user's selection of CASEROOT. The problem I was seeing in Jenkins is that it was picking a caseroot based on the CESMSCRATCHROOT and therefore builds took forever. I do agree that CESMSCRATCHROOT is not typically a home directory. Melvin is the only other machine where this is the case. We got forced into this situation by the terrible performance of titan's scratch area. I am fine with adding PROJECT in the way you suggested. |
Branch force-pushed from 84e4d69 to 0c5557f.
I just force pushed the $PROJECT change. There's no need to wait for automatic testing, it isn't working on Titan anyway (times out). You can give this PR to @mrnorman if you'd like. |
@worleyph , I should add that I'm open to keeping the build directory as a derivative of CASEROOT. The only thing that really must change in my opinion is the CESMSCRATCHROOT. |
Probably irrelevant, but I just ran three experiments on Titan, all on titan-ext8 with a load of around 0.53 when not building (this was true between each experiment):
a) source and build directory in $HOME.
b) source and build directory in /lustre/atlas1/cli112/scratch
c) source in home and build directory in /lustre/atlas1/cli112/scratch
So the penalty (when building all components) of building in /lustre/atlas1/scratch/cli112 is approximately 25% at the moment. (Note: the atmosphere build takes 13-15m, land 7-9m, ice 6-7m, and ocn 8-9m.) |
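The "approximately 25%" figure is a relative slowdown against the build in $HOME. The per-experiment timings are not preserved in this thread, so the sketch below only shows the arithmetic; the helper `relative_penalty` is hypothetical and the numbers in the usage note are placeholders, not the measured values.

```python
def relative_penalty(baseline_minutes: float, other_minutes: float) -> float:
    """Return the fractional slowdown of `other_minutes` relative to the baseline.

    A result of 0.25 means the second build took 25% longer than the first.
    """
    if baseline_minutes <= 0:
        raise ValueError("baseline must be positive")
    return (other_minutes - baseline_minutes) / baseline_minutes
```

For instance, a placeholder baseline of 40 minutes against 50 minutes gives `relative_penalty(40.0, 50.0) == 0.25`.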
25% isn't bad. The nightly process uses the ACME create_test, so a large number of builds are kicked off at once. What impact would that have on the difference between scratch and home? |
I have no idea. Perhaps lustre would not handle this as well as home, but ... we would need to determine this experimentally. |
Let me do some timings |
Just for fun, I changed the optimization flags from -O2 to -O0 for the source+build in $HOME. This decreased compile time to 28m 36s. I'll try the lustre builds with this as well when I get the chance. |
@jgfouca , does jenkins clean up after itself? Note that
does not delete old executables, or the source copied over from ocn and ice, or weird stuff like: ./pgi/mpich/nodebug/threads/MCT/noesmf/clm/obj (@rljacob, it looks like the clean_build logic doesn't do everything that it should for cime?) This will quickly fill up a home directory allocation if the bld directories are not simply deleted. |
@worleyph , yes, Jenkins attempts to remove all artifacts from the previous run. |
@jgfouca, things are not going smoothly. When testing your branch, and then when testing next after merging in your branch, the build is dying with the error message: PGF90-S-0285-Source line too long (/autofs/nccs-svm1_home1/worley/ACPI/SVN/ACME/jgfouca/machines-acme/change_titan_scratch/ACME/components/homme/src/share/prim_advection_mod.F90: 1647) This line is a comment, and is identical to the version in master as far as I can tell. master builds fine with both the original location of ./bld and with your new location, introduced by editing config_machines.xml manually. I need you to figure this out, perhaps by creating a new branch from master and adding your commit again. If you can then verify that a -res ne30_m120 -compset A_B1850 case works for you, I'll look at merging this in again. |
@worleyph , OK I'll look at it. |
Branch force-pushed from 0c5557f to bddf281.
Explicitly set titan's CESMSCRATCHROOT to a subdirectory within the user's home and ensure rundir gets set to a place that the compute nodes can see. [BFB]
Branch force-pushed from bddf281 to 2469413.
@worleyph , I'm not seeing the build error. Probably because my repo is not living in such a long path. The comment should probably be changed to not be so fragile, but that's an unrelated issue to this PR. |
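One way to hunt for lines at risk of a "source line too long" error is to scan a source file for lines near the limit. The sketch below assumes the 132-character free-form line limit from the Fortran standard; whether PGI applies exactly that limit, and whether it applies it before or after preprocessing, is not established in this thread. The helper `find_long_lines` is hypothetical.

```python
from typing import List, Tuple


def find_long_lines(text: str, limit: int = 132) -> List[Tuple[int, int]]:
    """Return (line_number, length) for every line longer than `limit`.

    Line numbers are 1-based. The default of 132 is the free-form line
    length in the Fortran standard; treat it as an assumption about the
    compiler in question.
    """
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

Running it with a lower `limit` gives a margin for lines that only grow past the limit once a long path is substituted into them, which is the cause conjectured in the comment above.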
@mrnorman, could you please take this over? Verify that it works for you, and then do the usual merge without further testing. This change is Titan-specific and compile-only, so any successful build should be a sufficient test. Thanks. |
@worleyph , I'm getting timings now. It looks like create_test does not run that much faster in HOME than on scratch. It may not be worth the trouble. |
Hard decision ... the poor performance in the past was very real, and it could return. My suggestion is to put this on hold for the moment, but if you see slow build times return (and can attribute it to the file system), then complete the pull request. However, @mrnorman should weigh in here as I would have no qualms about this PR being merged in. |
@worleyph , sorry to keep changing my tune, but all the data finally came in: |
Great - this is unambiguous then. |
@mrnorman , thoughts? |
Sounds like a good change to me. Am I still on hold? I remember Pat asking me to hold off while something was fixed. |
No, you are free to merge. |
@mrnorman , since the "hold" request I reassigned this to you. My only request now is that you verify that this works for you (single build for any case) before the merge. |
Something has come up, and I have to take today and tomorrow off from this point. I'll work on this on Monday. |
If I can get to this, I'll give it a try. If not, it will still be yours on Monday :-). |
Passed a simple build test. Since this is Titan and build specific, skipping acme-developer (before next) and integration (before master). @jgfouca, is there a GitHub issue # or a Jira task associated with this? |
SEG-143 |
#503) Move case bld directory to $HOME/acme_scratch/$PROJECT/$CASE on titan Explicitly set titan's CESMSCRATCHROOT to a subdirectory within the user's home and ensure rundir gets set to a place that the compute nodes can see. [BFB] * jgfouca/machines-acme/change_titan_scratch: Change titan config.
Also, @jgfouca , how did you test this? Did you run the full acme_developer test suite or integration test suite in your performance evaluations, or just a single case? |
…tch' (PR #503) Move case bld directory to $HOME/acme_scratch/$PROJECT/$CASE on titan Explicitly set titan's CESMSCRATCHROOT to a subdirectory within the user's home and ensure rundir gets set to a place that the compute nodes can see. Built ./create_newcase -case XXX -res ne30_m120 -compset A_B1850 -mach titan -compiler pgi -project YYY successfully. [BFB] SEG-143 * origin/jgfouca/machines-acme/change_titan_scratch: Change titan config.
Integration testing skipped. |
@jgfouca , you should advertise to ACME the impact of this change, so people do not get confused about where things are now built on Titan, and where they need to go and what they need to do if their home directories fill up. |
@jgfouca , should I delete this branch? (Or do you want to do this?) |
@jgfouca - I'll make the announcement. |
@worleyph, I ran "./create_test --no-run acme_developer" |
I will delete, thanks for integrating. |
…tch' (PR #503)

Move case bld directory to $HOME/acme_scratch/$PROJECT/$CASE on titan

Explicitly set titan's CESMSCRATCHROOT to a subdirectory within the user's home and ensure rundir gets set to a place that the compute nodes can see.

Built ./create_newcase -case XXX -res ne30_m120 -compset A_B1850 -mach titan -compiler pgi -project YYY successfully.

[BFB]
SEG-143

* origin/jgfouca/machines-acme/change_titan_scratch:
  Change titan config.
Explicitly set titan's CESMSCRATCHROOT to a subdirectory within
the user's home and ensure rundir gets set to a place that the
compute nodes can see.
[BFB]
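
For readers unfamiliar with the layout, the following is a rough sketch of how the two locations might resolve for a case, not the actual machine configuration changed by this PR. The build path follows the PR title; `COMPUTE_SCRATCH`, `HOME_DIR`, `BLD_DIR`, `RUN_DIR`, and the run directory layout are hypothetical names and placeholders introduced only for illustration, since the PR text does not state where the run directory ends up.

```shell
#!/bin/sh
# Illustrative only: placeholder project and case names from the PR's test command.
PROJECT="YYY"
CASE="XXX"

# Fall back to a dummy home if HOME is unset (assumption for portability).
HOME_DIR="${HOME:-/home/user}"

# Hypothetical placeholder for a filesystem the compute nodes can see.
# The real path is site-specific and is NOT given in this PR.
COMPUTE_SCRATCH="${COMPUTE_SCRATCH:-/path/to/compute/visible/scratch}"

# Scratch root under the user's home, as the commit message describes.
CESMSCRATCHROOT="$HOME_DIR/acme_scratch/$PROJECT"

# Build directory per the PR title: $HOME/acme_scratch/$PROJECT/$CASE
BLD_DIR="$CESMSCRATCHROOT/$CASE"

# Run directory on compute-visible storage (layout is an assumption).
RUN_DIR="$COMPUTE_SCRATCH/$PROJECT/$CASE/run"

echo "build dir: $BLD_DIR"
echo "run dir:   $RUN_DIR"
```

The point of the split is the one raised in the discussion: builds avoid the slow scratch filesystem, while the run directory stays somewhere the compute nodes can reach.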