Merge pull request #404 from billsacks/two_part_system_tests_clone

Infrastructure for clone-based two-part system tests

This adds infrastructure for doing two-part system tests (i.e., our typical system tests, where you do two runs and compare them) via clones. From discussion on a recent CIME telecon, there was general agreement that we want to migrate our system tests to this style where possible: It's easier to write, understand and maintain tests set up like this, where there are two separate cases - rather than trying to do two different runs from the same case, which requires careful saving and restoring of the appropriate files.

One slight oddity: A TestStatus.log file appears in the clone directory (rather than the test's main case directory). At least for passing tests, this only includes a date/time stamp, so this didn't seem to be a problem. But it's possible that this could include more information if a test failed at a certain point. Presumably this is because I chdir to the case2 case directory before running case2 (I tried not doing this chdir, but this caused the run to fail) - and presumably there is some code somewhere that opens TestStatus.log in the current directory. I haven't tried to dig into this, because it didn't seem like a big deal to me, but can come back to it if others feel it's a problem.

This also adds a new test based on this infrastructure (REP - a basic reproducibility test), and rewrites one existing test based on this infrastructure (LII).

This also adds some unit tests written using the unittest module, and adds the capability for scripts_regression_tests to discover other unittests that may be distributed throughout the code base.

Test suite: scripts_regression_tests.py on yellowstone
Test baseline:
Test namelist changes:
Test status: bit for bit

All passed except for the same 4 failures that I got in testing master:

test_check_code (__main__.CheckCode) ... FAIL
test_b_full (__main__.E_TestTestScheduler) ... FAIL
test_save_timings (__main__.TestSaveTimings) ... FAIL
test_full_system (__main__.Z_FullSystemTest) ... FAIL

Fixes #214
Fixes #291
Fixes #292
Connects to #290

User interface changes?: none

Code review: @jedwards4b @jgfouca @mvertens @rljacob
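
The two-case pattern described above can be illustrated with a small, self-contained sketch. This is not the CIME implementation: it assumes a case can be modeled as a plain dict, and `CompareTwoSketch`, `_run`, `RepSketch`, `ThreadsSketch` and `LengthSketch` are hypothetical names invented for illustration. Only the hook names `_case_one_setup` and `_case_two_setup` mirror the ones used in this PR.

```python
import copy


class CompareTwoSketch(object):
    """Hypothetical stand-in for a two-case comparison test (not CIME's API)."""

    def __init__(self, case):
        # Assumption: a "case" is just a dict of settings in this sketch.
        self._case1 = case
        # The second case is an independent clone, so changes made for run two
        # never need to be saved and restored in the first case.
        self._case2 = copy.deepcopy(case)
        self._case = self._case1  # the currently active case

    def _case_one_setup(self):
        raise NotImplementedError

    def _case_two_setup(self):
        raise NotImplementedError

    def _run(self, case):
        # Placeholder "model run": the result depends on the run length only,
        # not on the thread count.
        return [i * i for i in range(case["stop_n"])]

    def run_phase(self):
        # Set up and run case one.
        self._case = self._case1
        self._case_one_setup()
        result1 = self._run(self._case1)

        # Set up and run the clone, then compare the two results.
        self._case = self._case2
        self._case_two_setup()
        result2 = self._run(self._case2)
        return result1 == result2


class RepSketch(CompareTwoSketch):
    """Two identical runs (analogous in spirit to a reproducibility test)."""

    def _case_one_setup(self):
        pass

    def _case_two_setup(self):
        pass


class ThreadsSketch(CompareTwoSketch):
    """Second run uses a single thread; results should still match."""

    def _case_one_setup(self):
        pass

    def _case_two_setup(self):
        self._case["nthrds"] = 1


class LengthSketch(CompareTwoSketch):
    """Second run is longer, so the comparison is expected to fail."""

    def _case_one_setup(self):
        pass

    def _case_two_setup(self):
        self._case["stop_n"] += 1


if __name__ == "__main__":
    settings = {"stop_n": 5, "nthrds": 2}
    for cls in (RepSketch, ThreadsSketch, LengthSketch):
        print(cls.__name__, cls(dict(settings)).run_phase())
```

Because each subclass only overrides the two setup hooks, the per-test code stays short, and the first case is left untouched by whatever the second setup changes.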
E3SM-Project · Aug 18, 2016 · 4953517
2 parents 949cebb + 8231716
commit 4953517
Showing 19 changed files with 1,380 additions and 121 deletions.
diff --git a/cime_config/config_tests.xml b/cime_config/config_tests.xml
@@ -4,13 +4,15 @@
 
 The following are the test functionality categories:
   1) smoke tests
-  2) restart tests
-  3) threading/pe-count modification tests
-  4) sequencing (layout) modification tests
-  5) multi-instance tests
-  6) archiving (short-term and long-term) tests
-  7) performance tests
-  8) spinup tests
+  2) basic reproducibility tests
+  3) restart tests
+  4) threading/pe-count modification tests
+  5) sequencing (layout) modification tests
+  6) multi-instance tests
+  7) archiving (short-term and long-term) tests
+  8) performance tests
+  9) spinup tests
+  10) other component-specific tests
 
 NOTES:
 - unless otherwise noted everything is run in one executable directory
@@ -29,6 +31,12 @@ SMS    smoke startup test (default length)
 
 SBN    smoke build-namelist test (just run preview_namelist and check_input_data)
 
+======================================================================
+    Basic reproducibility Tests
+======================================================================
+
+REP    reproducibility: do two identical runs give the same results?
+
 ======================================================================
     Restart Tests
 ======================================================================
@@ -143,6 +151,13 @@ SPO    smoke spinup-ocean test
 
 LAR    long term archive test
 
+======================================================================
+    Other component-specific tests
+======================================================================
+
+LII    CLM initial condition interpolation test
+
+
 -->
 
 <config_test>
@@ -323,8 +338,11 @@ LAR    long term archive test
     <DESC>CLM initial condition interpolation test (requires configuration with non-blank finidat)</DESC>
     <INFO_DBUG>1</INFO_DBUG>
     <CCSM_TCOST>0</CCSM_TCOST>
-    <REST_OPTION>none</REST_OPTION>
     <DOUT_S>FALSE</DOUT_S>
+    <CONTINUE_RUN>FALSE</CONTINUE_RUN>
+    <REST_OPTION>never</REST_OPTION>
+    <HIST_OPTION>$STOP_OPTION</HIST_OPTION>
+    <HIST_N>$STOP_N</HIST_N>
   </test>
 
   <test NAME="PEA">
@@ -349,7 +367,10 @@ LAR    long term archive test
     <INFO_DBUG>1</INFO_DBUG>
     <BFBFLAG>TRUE</BFBFLAG>
     <BUILD_THREADED>TRUE</BUILD_THREADED>
-    <REST_OPTION>never</REST_OPTION>
+    <CONTINUE_RUN>FALSE</CONTINUE_RUN>
+    <REST_OPTION>none</REST_OPTION>
+    <HIST_OPTION>$STOP_OPTION</HIST_OPTION>
+    <HIST_N>$STOP_N</HIST_N>
     <CCSM_TCOST>1</CCSM_TCOST>
     <DOUT_S>FALSE</DOUT_S>
   </test>
@@ -397,6 +418,17 @@ LAR    long term archive test
     <DOUT_S>FALSE</DOUT_S>
   </test>
 
+  <test NAME="REP">
+    <DESC>reproducibility test: do two runs give the same answers?</DESC>
+    <CCSM_TCOST>0</CCSM_TCOST>
+    <INFO_DBUG>1</INFO_DBUG>
+    <DOUT_S>FALSE</DOUT_S>
+    <CONTINUE_RUN>FALSE</CONTINUE_RUN>
+    <REST_OPTION>never</REST_OPTION>
+    <HIST_OPTION>$STOP_OPTION</HIST_OPTION>
+    <HIST_N>$STOP_N</HIST_N>
+  </test>
+
   <test NAME="SBN">
     <DESC>smoke build-namelist test (just run preview_namelist and check_input_data)</DESC>
     <INFO_DBUG>1</INFO_DBUG>

diff --git a/utils/python/CIME/SystemTests/README b/utils/python/CIME/SystemTests/README
@@ -1,11 +1,13 @@
 The following are the test functionality categories:
   1) smoke tests
-  2) restart tests
-  3) threading/pe-count modification tests
-  4) sequencing (layout) modification tests
-  5) multi-instance tests
-  6) performance tests
-  7) spinup tests (TODO)
+  2) basic reproducibility tests
+  3) restart tests
+  4) threading/pe-count modification tests
+  5) sequencing (layout) modification tests
+  6) multi-instance tests
+  7) performance tests
+  8) spinup tests (TODO)
+  9) other component-specific tests
 
 Some tests not yet implemented in python.  They can be found in
 cime/scripts/Testing/Testcases
@@ -23,6 +25,12 @@ SMS    smoke startup test (default length)
        if $IOP_ON is set then suffix is base_iop
        success for non-iop is just a successful coupler
 
+======================================================================
+    Basic reproducibility Tests
+======================================================================
+
+REP    reproducibility: do two identical runs give the same results?
+
 ======================================================================
     Restart Tests
 ======================================================================
@@ -138,3 +146,8 @@ SSP    smoke CLM spinup test (only valid for CLM compsets with CLM45 and CN or B
          short term archiving is on
        do a hybrid non-spinup run run from the restart files generated in the first phase
 
+======================================================================
+    Other component-specific tests
+======================================================================
+
+LII    CLM initial condition interpolation test
diff --git a/utils/python/CIME/SystemTests/lii.py b/utils/python/CIME/SystemTests/lii.py
@@ -1,75 +1,35 @@
 """
-Implementation of the CIME LII test.  This class inherits from SystemTestsCommon
+Implementation of the CIME LII test.
 
 This is a CLM specific test:
-Verifies that namelist variable 'use_init_interp' works correctly
+Verifies that interpolation of initial conditions onto an identical
+configuration gives identical results:
 (1) do a run with use_init_interp false (suffix base)
 (2) do a run with use_init_interp true (suffix init_interp_on)
 """
 
-import shutil, glob
+from CIME.SystemTests.system_tests_compare_two import SystemTestsCompareTwo
 from CIME.XML.standard_module_setup import *
-from CIME.SystemTests.system_tests_common import SystemTestsCommon
+from CIME.SystemTests.test_utils.user_nl_utils import append_to_user_nl_files
 
 logger = logging.getLogger(__name__)
 
-class LII(SystemTestsCommon):
+class LII(SystemTestsCompareTwo):
 
     def __init__(self, case):
-        """
-        initialize a test object
-        """
-        SystemTestsCommon.__init__(self, case)
+        SystemTestsCompareTwo.__init__(self, case,
+                                       separate_builds = False,
+                                       run_two_suffix = 'interp',
+                                       run_one_description = 'use_init_interp set to false',
+                                       run_two_description = 'use_init_interp set to true')
+
+    def _case_one_setup(self):
+        append_to_user_nl_files(caseroot = self._get_caseroot(),
+                                component = "clm",
+                                contents = "use_init_interp = .false.")
+
+    def _case_two_setup(self):
+        append_to_user_nl_files(caseroot = self._get_caseroot(),
+                                component = "clm",
+                                contents = "use_init_interp = .true.")
 
-    def build_phase(self, sharedlib_only=False, model_only=False):
-
-        # Make copies of the namelist files for each part of the test. Enclose the
-        # copies in conditionals so that we only do this namelist setup the first time
-        # the build script is invoked - otherwise, if the build is rerun, the namelist
-        # files would build up repeated instances of the setting of force_init_intep.
-        #
-        # Note the use of shell wildcards to make sure we apply these mods to
-        # multi-instance versions
-
-        if not os.path.exists("user_nl_nointerp"):
-            os.makedirs("user_nl_nointerp")
-            for filename in glob.glob(r'user_nl_clm*'):
-                shutil.copy(filename, os.path.join("user_nl_nointerp",filename))
-                with open(os.path.join("user_nl_nointerp",filename), "a") as newfile:
-                    newfile.write("use_init_interp = .false.")
-
-        if not os.path.exists("user_nl_interp"):
-            os.makedirs("user_nl_interp")
-            for filename in glob.glob(r'user_nl_clm*'):
-                shutil.copy(filename, os.path.join("user_nl_interp",filename))
-                with open(os.path.join("user_nl_interp",filename), "a") as newfile:
-                    newfile.write("use_init_interp = .true.")
-
-        self.clean_build()
-        self.build_indv(sharedlib_only=sharedlib_only, model_only=model_only)
-
-    def run_phase(self):
-        '''
-        Do a run with init_interp false, a run with init_interp true and
-        compare
-        '''
-        caseroot = self._case.get_value("CASEROOT")
-
-        self._case.set_value("CONTINUE_RUN",False)
-        self._case.set_value("REST_OPTION","none")
-        self._case.set_value("HIST_OPTION","$STOP_OPTION")
-        self._case.set_value("HIST_N","$STOP_N")
-        self._case.flush()
-        for user_nl_dir in ("nointerp", "interp"):
-            for filename in glob.glob(r'user_nl_%s/*'%user_nl_dir):
-                shutil.copy(filename,
-                            os.path.join(caseroot,os.path.basename(filename)))
-
-            stop_n = self._case.get_value("STOP_N")
-            stop_option = self._case.get_value("STOP_OPTION")
-            logger.info("doing a %d %s initial test with init_interp set to %s, no restarts written"
-                        % (stop_n, stop_option, user_nl_dir == "interp"))
-
-            self.run_indv(suffix=user_nl_dir)
-
-        self._component_compare_test("nointerp", "interp")
diff --git a/utils/python/CIME/SystemTests/pet.py b/utils/python/CIME/SystemTests/pet.py
@@ -6,63 +6,39 @@
 (2) do another initial run with nthrds=1 for all components (suffix: single_thread)
 """
 
-import shutil
 from CIME.XML.standard_module_setup import *
 from CIME.case_setup import case_setup
-from CIME.SystemTests.system_tests_common import SystemTestsCommon
+from CIME.SystemTests.system_tests_compare_two import SystemTestsCompareTwo
 
 logger = logging.getLogger(__name__)
 
-class PET(SystemTestsCommon):
+class PET(SystemTestsCompareTwo):
+
+    _COMPONENT_LIST = ('ATM','CPL','OCN','WAV','GLC','ICE','ROF','LND')
 
     def __init__(self, case):
         """
         initialize a test object
         """
-        SystemTestsCommon.__init__(self, case)
+        SystemTestsCompareTwo.__init__(self, case,
+                                       separate_builds = False,
+                                       run_two_suffix = 'single_thread',
+                                       run_one_description = 'default threading',
+                                       run_two_description = 'threads set to 1')
 
-    def build_phase(self, sharedlib_only=False, model_only=False):
+    def _case_one_setup(self):
         # first make sure that all components have threaded settings
-        for comp in ['ATM','CPL','OCN','WAV','GLC','ICE','ROF','LND']:
+        for comp in self._COMPONENT_LIST:
             if self._case.get_value("NTHRDS_%s"%comp) <= 1:
                 self._case.set_value("NTHRDS_%s"%comp, 2)
-        self._case.flush()
 
+        # Need to redo case_setup because we may have changed the number of threads
         case_setup(self._case, reset=True)
 
-        self.clean_build()
-        self.build_indv(sharedlib_only=sharedlib_only, model_only=model_only)
-
-    def _pet_first_phase(self):
-        #Do a run with default threading
-        self._case.set_value("CONTINUE_RUN",False)
-        self._case.set_value("REST_OPTION","none")
-        self._case.set_value("HIST_OPTION","$STOP_OPTION")
-        self._case.set_value("HIST_N","$STOP_N")
-        self._case.flush()
-
-        stop_n = self._case.get_value("STOP_N")
-        stop_option = self._case.get_value("STOP_OPTION")
-        logger.info("doing a %d %s initial test with default threading, no restarts written"
-                    % (stop_n, stop_option))
-
-        self.run_indv()
-
-    def _pet_second_phase(self):
+    def _case_two_setup(self):
         #Do a run with all threads set to 1
-        for comp in ['ATM','CPL','OCN','WAV','GLC','ICE','ROF','LND']:
+        for comp in self._COMPONENT_LIST:
             self._case.set_value("NTHRDS_%s"%comp, 1)
-        self._case.flush()
-        shutil.copy("env_mach_pes.xml", os.path.join("LockedFiles","env_mach_pes.xml"))
-
-        stop_n = self._case.get_value("STOP_N")
-        stop_option = self._case.get_value("STOP_OPTION")
-        logger.info("doing a %d %s initial test with threads set to 1, no restarts written"
-                    % (stop_n, stop_option))
 
-        self.run_indv(suffix="single_thread")
-        self._component_compare_test("base", "single_thread")
-
-    def run_phase(self):
-        self._pet_first_phase()
-        self._pet_second_phase()
+        # Need to redo case_setup because we may have changed the number of threads
+        case_setup(self._case, reset=True)
diff --git a/utils/python/CIME/SystemTests/rep.py b/utils/python/CIME/SystemTests/rep.py
@@ -0,0 +1,22 @@
+"""
+Implementation of the CIME REP test
+
+This test verifies that two identical runs give bit-for-bit results
+"""
+
+from CIME.SystemTests.system_tests_compare_two import SystemTestsCompareTwo
+
+class REP(SystemTestsCompareTwo):
+
+    def __init__(self, case):
+        SystemTestsCompareTwo.__init__(self, case,
+                                       separate_builds = False,
+                                       run_two_suffix = 'rep2')
+
+    def _case_one_setup(self):
+        pass
+
+    def _case_two_setup(self):
+        pass
+
+
diff --git a/utils/python/CIME/SystemTests/system_tests_common.py b/utils/python/CIME/SystemTests/system_tests_common.py
@@ -30,9 +30,23 @@ def __init__(self, case, expected=None):
         self._casebaseid = self._case.get_value("CASEBASEID")
         self._test_status = TestStatus(test_dir=caseroot, test_name=self._casebaseid)
 
+        self._init_environment(caseroot)
+        self._init_locked_files(caseroot, expected)
+        self._init_case_setup()
+
+    def _init_environment(self, caseroot):
+        """
+        Do initializations of environment variables that are needed in __init__
+        """
         # Needed for sh scripts
         os.environ["CASEROOT"] = caseroot
 
+    def _init_locked_files(self, caseroot, expected):
+        """
+        If the file LockedFiles/env_run.orig.xml does not exist, copy the current
+        env_run.xml file. If it does exist, restore values changed in a previous
+        run of the test.
+        """
         if os.path.isfile(os.path.join(caseroot, "LockedFiles", "env_run.orig.xml")):
             self.compare_env_run(expected=expected)
         elif os.path.isfile(os.path.join(caseroot, "env_run.xml")):
@@ -44,6 +58,10 @@ def __init__(self, case, expected=None):
             shutil.copy(os.path.join(caseroot,"env_run.xml"),
                         os.path.join(lockedfiles, "env_run.orig.xml"))
 
+    def _init_case_setup(self):
+        """
+        Do initial case setup needed in __init__
+        """
         if self._case.get_value("IS_FIRST_RUN"):
             self._case.set_initial_test_values()
 
@@ -149,6 +167,12 @@ def run_phase(self):
         """
         self.run_indv()
 
+    def _get_caseroot(self):
+        """
+        Returns the current CASEROOT value
+        """
+        return self._caseroot
+
     def _set_active_case(self, case):
         """
         Use for tests that have multiple cases