
Subsections

7.1 Features to be tested
7.2 Testing script
7.3 Common create_test use cases
- 7.3.1 Validation of an Installation
- 7.3.2 Verification of Source Code Changes

7 Testing

This section describes some of the practical tests that can be used to validate a model installation and to verify that key model features still work after source code modifications. Several pre-configured test cases are included in the CCSM3.0 distribution. Their use is strongly encouraged. This section will be expanded as more pre-configured test cases are created.

These tests are provided for validation and verification only and must not be used as starting points for production runs. The user should always start a production run by invoking the create_newcase command.

7.1 Features to be tested

Below is a summary of some of the critical aspects of CCSM3 that are tested regularly as part of the CCSM development process. These tests are performed over a range of resolutions, component sets, and machines. For those features where pre-configured test cases exist, the leading letters of the test case names are given.

Exact restart: The ability to stop a run and then restart it with bit-for-bit continuity is key to making long production runs. This is tested by executing a 10 day startup run, writing restarts at the end of day 5, and then restarting the run from the component restart files produced at day 5 and running for an additional 5 days. The results at day 10 should be identical for both runs. Note that processor configuration (MPI processes, OpenMP threads, load balance) is not changed during these tests. Test case names begin with "ER".
Exact restart for branch runs: The exact restart capability is tested in a similar way for branch runs. Test case names begin with "BR".
Exact restart for hybrid runs: The exact restart capability is tested in a similar way for hybrid runs. Test case names begin with "HY".
Software Trapping: Short CCSM runs are carried out regularly with hardware and software trapping turned on. Often, such trapping will detect code that produces floating-point exceptions, out-of-bounds array indexing and other run-time errors. Runs with trapping turned on are typically much slower than production runs, but they provide extra confidence in the robustness of the code. Trapping is turned on automatically in these test cases via setting $DEBUG to TRUE in env_run. Test case names begin with "DB".
Regression: Often, a source code change is not expected to change results for one or more test cases. It is always a good idea to test this expectation. A regression test runs a test case and compares results with a previous run of the same test case. The regression test fails if results are not identical. Regression testing can be enabled for any pre-configured test case when the test case is created.
Parallelization: Several CCSM components produce bit-for-bit identical results on different processor configurations. Tests are carried out to verify this capability with many different processor counts using MPI, OpenMP, hybrid parallel, and no parallelization. Pre-configured test cases that compare runs made with different process configurations are under development.
Science Validation: CCSM science validation takes several forms. First, the model should run for several months or years without any failures. Second, multi-year and multi-century runs are reviewed by scientists for problems, trends, and comparison with observations. Third, an automated climate validation test has been developed at NCAR that compares two multi-decadal runs for climate similarity. This test is performed as a part of port validation and for model changes. Pre-configured test cases for automated climate validation are under development.

7.2 Testing script

The create_test script is used to create the pre-configured tests mentioned in section 7.1. A brief overview of create_test follows.

7.2.1 create_test

The script create_test in $CCSMROOT/ccsm3/scripts/ generates test cases automatically. create_test was developed to facilitate rapid testing of CCSM3 under many different configurations with minimal manual intervention. Many of the scripts and tools used by create_newcase are also used by create_test. Following is a usage summary for create_test:

      method 1:
      ========================================================
 
      create_test -test <testcase> -mach <machine> -res <resolution>
         -compset <component-set> [-testroot <test-root-directory>]
         [-ccsmroot <ccsm-root-directory>] [-pes_file <PES_file>]
         [-clean <clean_option>] [-inputdataroot <input-data-root-directory>]
         [-testid <id>] [-regress <regress_action> -regress_name <baseline_name>
         [-baselineroot <baseline_root_directory>] ]
 
      method 2:
      =========================================================
 
      create_test -testname <full-test-name> [-testroot <test-root-directory>]
         [-ccsmroot <ccsm-root-directory>] [-pes_file <PES_file>] [-testid <id>]
         [-clean <clean_option>] [-inputdataroot <input-data-root-directory>]
         [-baselineroot <baseline_root_directory>]
 
      help option:
      =========================================================
 
      create_test -help

An up-to-date version of the above text can be generated by typing the create_test command without any arguments. create_test must be invoked from the $CCSMROOT/ccsm3/scripts/ directory. Detailed descriptions of options and usage examples can be obtained via the -help option:

 > create_test -help

Note that the -ccsmroot and -testid options are primarily used by developers to test create_test. Normally these options should be avoided as incorrect use can lead to obscure errors.

In general, test cases are created for a supported machine, resolution and component-set combination. Test cases generated using the second method above use the standard CCSM3 shorthand test case naming convention. For example,

 > create_test -testname TER.01a.T42_gx1v3.B.blackforest \
               -testroot /ptmp/$USER/tst

creates an exact restart test, ER.01a, at T42_gx1v3 resolution for component set B on machine blackforest. An equivalent command using the first method is:

 > create_test -test ER.01a -mach blackforest -res T42_gx1v3 -compset B \
               -testroot /ptmp/$USER/tst

Note that when using method 2, an extra "T" must be prepended to the test name given in the -testname argument. The -testroot option causes create_test to create the test in the directory

     /ptmp/$USER/tst/TER.01a.T42_gx1v3.B.blackforest.TESTID/

where TESTID is a unique integer string identifying the test. TESTID is generated automatically by create_test. The intent of TESTID is to make it difficult to accidentally overwrite a previously created test with a new one.

If the -testroot option is not used, create_test creates the test in the $CCSMROOT/ccsm3/scripts/ directory. It is much safer to create tests in a completely separate directory. We recommend that the -testroot option always be used and that the test root directory always be outside of the $CCSMROOT/ tree.
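
The correspondence between the two methods can be sketched as a small shell function. This is an illustration only and not part of the CCSM3 distribution: the function name testname_to_args is hypothetical, and the sketch assumes that the resolution, component set, and machine fields contain no dots.

```sh
# Hypothetical helper (not part of CCSM3): split a full test name of the form
#   T<test>.<resolution>.<compset>.<machine>
# into the equivalent method 1 arguments.
# Assumption: only the <test> field (e.g. ER.01a) contains a dot.
testname_to_args() {
    name=$1
    # Strip the leading "T" that method 2 requires.
    rest=${name#T}
    # The test itself has two dot-separated parts, e.g. "ER" and "01a".
    prefix=${rest%%.*};  rest=${rest#*.}
    version=${rest%%.*}; rest=${rest#*.}
    res=${rest%%.*};     rest=${rest#*.}
    compset=${rest%%.*}; mach=${rest#*.}
    echo "-test $prefix.$version -mach $mach -res $res -compset $compset"
}
```

For example, testname_to_args TER.01a.T42_gx1v3.B.blackforest prints the method 1 arguments used in the example above.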

7.2.2 Using create_test

The following steps should be followed to create a new test case. Sample commands follow most steps.

Go to the CCSM scripts directory $CCSMROOT/ccsm3/scripts/
```
 > cd $CCSMROOT/ccsm3/scripts/
```

Execute create_test:

 > ./create_test -testname TER.01a.T42_gx1v3.B.blackforest \
                 -testroot /ptmp/$USER/tst

Note that create_test prints the location of the test directory

 ...
 Successfully created new case root directory \
       /ptmp/$USER/tst/TER.01a.T42_gx1v3.B.blackforest.163729
 ...

Go to the test directory:

 > cd /ptmp/$USER/tst/TER.01a.T42_gx1v3.B.blackforest.163729

Build the model interactively:

 > ./TER.01a.T42_gx1v3.B.blackforest.163729.build

Edit the test script to modify the default batch queue setting (optional):
```
 > vi ./TER.01a.T42_gx1v3.B.blackforest.163729.test
```

Submit the test script to the batch queueing system (using llsubmit, or qsub, or ...):
Note that users must submit the test script

       TER.01a.T42_gx1v3.B.blackforest.163729.test

NOT the run script

       TER.01a.T42_gx1v3.B.blackforest.163729.run.

 > llsubmit ./TER.01a.T42_gx1v3.B.blackforest.163729.test

When the batch job completes, the result of the test will be written to the file Teststatus.out. Examine Teststatus.out to find out if the test passed. If the last line of Teststatus.out is "PASS", then the test passed. Otherwise, the test either failed or did not run.
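
The check of Teststatus.out can be automated with a few lines of shell. The following is a minimal sketch, not a CCSM3 tool: the function name test_passed is hypothetical, and it assumes that the last line of the file consists of exactly the word PASS with no surrounding whitespace.

```sh
# Hypothetical helper (not part of CCSM3): succeed only if the last line
# of the given Teststatus.out file is PASS.
# Assumption: the last line is exactly "PASS", with no extra whitespace.
test_passed() {
    status_file=$1
    # A missing file means the test did not run (or has not finished).
    [ -f "$status_file" ] || return 1
    [ "$(tail -n 1 "$status_file")" = "PASS" ]
}
```

It could be used from the test directory as: test_passed ./Teststatus.out && echo "test passed"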

A script, batch.$MACH, is created in the test root directory the first time create_test is invoked for that test root directory. This file contains both the interactive command to build the test and the batch submission command to run the test. Each time a new test is subsequently created in the same test root directory, build and run commands for the new test are appended to batch.$MACH. Thus, batch.$MACH can be used as follows to easily build and run a whole sequence of tests that share the same test root directory.

      > cd /ptmp/$USER/tst
      > ./batch.$MACH

The test scripts (TER.01a.T42_gx1v3.B.blackforest.163729.test, etc.) should be modified prior to running batch.$MACH if the default queue is not adequate. Finally, if a new series of tests is to be created, batch.$MACH should be deleted before create_test is invoked again.

7.2.3 Available tests

The following list comprises the most commonly used tests. A full list of all available tests is obtained by running create_test -help.

SM.01a = Smoke test.
Model does a 5 day startup run.
The test passes if the run finishes without errors.
ER.01a = Exact restart test for startup run.
Model does a 10 day startup run, writing a restart file at day 5. A restart run is then done starting from day 5 of the initial 10 day startup run.
The test passes if the restart run is identical to the last 5 days of the startup run.
ER.01b = Same as ER.01a but with IPCC_MODE set to 1870_CONTROL
ER.01e = Same as ER.01a but with IPCC_MODE set to RAMP_CO2_ONLY
DB.01a = Software trapping test.
Model does a 5 day startup run where $DEBUG is set to TRUE in env_run.
The test passes if the run finishes without errors.
BR.01a = Branch run test.
Model executes a startup reference run followed by branch run using the reference run restart output for initialization.
The test passes if the branch run is identical to the reference run.
BR.02a = Exact restart test of branch run.
Model does a branch run test followed by an exact restart test of the branch run (executes a startup, branch, and continue run).
The test passes if the restart run is identical to the last 5 days of the branch run.
HY.01a = Hybrid run test.
Model executes a startup reference run followed by a hybrid run using the reference run output for initialization.
The test passes if the hybrid run finishes without errors (this is effectively a smoke test of a hybrid run).
HY.02a = Exact restart test of hybrid run.
Model executes a hybrid run followed by an exact restart test of the hybrid run (executes a startup, hybrid, and continue run).
The test passes if the restart run is identical to the hybrid run.

Users are encouraged to provide feedback and suggestions for new CCSM tests by sending email to csm@ucar.edu.

7.3 Common create_test use cases

The two most common applications of CCSM tests are described in more detail below.

7.3.1 Validation of an Installation

After installing CCSM3.0, it is recommended that users run the following suite of low-resolution tests before starting scientific runs. If these tests all pass it is very likely that installation has been done properly for the tested resolution and component set.

TER.01a.T31_gx3v5.B
TDB.01a.T31_gx3v5.B
TBR.01a.T31_gx3v5.B
TBR.02a.T31_gx3v5.B
THY.01a.T31_gx3v5.B
THY.02a.T31_gx3v5.B

The sample session below demonstrates how this test suite can be built and run on the IBM "bluesky" machine at NCAR. Note that when testing on machines located at other sites, the -inputdataroot option can be used to tell create_test where to find input data sets. See the text generated by create_test -help for more about this option. Also see section 7.3.2 for an example showing use of the -inputdataroot option.

Create a new directory for testing:
```
 > mkdir -p /ptmp/$USER/tstinstall
```
Go to the CCSM scripts directory $CCSMROOT/ccsm3/scripts/
```
 > cd $CCSMROOT/ccsm3/scripts/
```

Execute create_test for each test case:

 > ./create_test -testname TER.01a.T31_gx3v5.B.bluesky \
                 -testroot /ptmp/$USER/tstinstall
 > ./create_test -testname TDB.01a.T31_gx3v5.B.bluesky \
                 -testroot /ptmp/$USER/tstinstall
 > ./create_test -testname TBR.01a.T31_gx3v5.B.bluesky \
                 -testroot /ptmp/$USER/tstinstall
 > ./create_test -testname TBR.02a.T31_gx3v5.B.bluesky \
                 -testroot /ptmp/$USER/tstinstall
 > ./create_test -testname THY.01a.T31_gx3v5.B.bluesky \
                 -testroot /ptmp/$USER/tstinstall
 > ./create_test -testname THY.02a.T31_gx3v5.B.bluesky \
                 -testroot /ptmp/$USER/tstinstall

Go to the test directory:
```
 > cd /ptmp/$USER/tstinstall/
```
Edit the test scripts to modify the default batch queue (optional):
```
 > vi ./T*/*test
```
Build and submit all of the tests:
```
 > ./batch.bluesky
```
Check test results.
When the batch jobs complete, look for test results in T*/Teststatus.out. If a test fails, then model installation may not be correct.
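
When several tests share a test root directory, the results in T*/Teststatus.out can be collected in one pass. The sketch below is illustrative and not part of CCSM3: the function name summarize_tests and the PASS / NOT-PASSED labels are hypothetical, and a test counts as passed only if the last line of its Teststatus.out is exactly PASS.

```sh
# Hypothetical helper (not part of CCSM3): print one line per test
# directory under a test root and return non-zero if any test did not pass.
# Assumption: test directories match T*/ and a passing test ends its
# Teststatus.out with a line that is exactly "PASS".
summarize_tests() {
    testroot=$1
    failures=0
    for dir in "$testroot"/T*/; do
        # Skip the unexpanded pattern when no test directories exist.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        if [ -f "$dir/Teststatus.out" ] &&
           [ "$(tail -n 1 "$dir/Teststatus.out")" = "PASS" ]; then
            echo "PASS $name"
        else
            # Either the test failed or it did not run.
            echo "NOT-PASSED $name"
            failures=$((failures + 1))
        fi
    done
    [ "$failures" -eq 0 ]
}
```

For the session above it could be invoked as: summarize_tests /ptmp/$USER/tstinstall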

Users are advised to run additional tests that specifically exercise features they plan to use (such as other component sets, model resolutions, dynamical cores, etc.).

7.3.2 Verification of Source Code Changes

When model source code is modified, several tests should be run to verify that essential model features have not been broken. The following suite should be run at a resolution of the user's choosing:

TER.01a.*.B
TDB.01a.*.B
TBR.02a.*.B
THY.02a.*.B

Users are advised to run additional tests that specifically exercise features they plan to use (such as other component sets, model resolutions, dynamical cores, etc.).
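
The "*" in the suite above stands for the chosen resolution. As a minimal sketch, the full test names for a given resolution and machine can be generated with a hypothetical helper such as expand_suite below (not part of CCSM3). Each printed name is in the form expected by the -testname option.

```sh
# Hypothetical helper (not part of CCSM3): print the full test names of the
# source code verification suite for a given resolution and machine.
expand_suite() {
    res=$1
    mach=$2
    # The four tests of the suite, all for component set B.
    for t in TER.01a TDB.01a TBR.02a THY.02a; do
        echo "$t.$res.B.$mach"
    done
}
```

For example, expand_suite T31_gx3v5 bluesky prints four names, the first being TER.01a.T31_gx3v5.B.bluesky.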

In some cases, source code modifications are not intended to cause any changes in model output. Examples include structural improvements or addition of new features that can be turned off at run time (or, less preferably, at build time). The regression testing features of create_test can be used to quickly determine if model output has been changed unintentionally.

CCSM regression testing has two phases: baseline data set generation and comparison with a baseline data set. In the first phase, a test case is run using a "known-good" version of CCSM and model output is stored as a "baseline". A "known-good" version might be an unmodified copy of CCSM3.0, or it might be a version in which the user has confidence due to previous testing. In the second phase, the same test case is run using newly modified CCSM source code and model output is compared with the "baseline". The test passes if there are no differences. Baseline data set generation and comparison are both enabled using the -regress and -regress_name options to create_test. Use -regress generate for baseline generation or -regress compare for baseline comparison. Every baseline data set must have a unique name. When generating a new baseline data set, use the -regress_name option to specify the new name. When comparing with an existing baseline data set, use the -regress_name option to specify an existing name.
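
Conceptually, the comparison phase is a check that each stored baseline file is identical to the corresponding file from the new run. The sketch below illustrates that idea with the standard cmp utility. It is not how create_test is implemented: the function name compare_with_baseline and the flat directory layout are assumptions made for illustration only.

```sh
# Illustrative sketch only: create_test performs its own comparison.
# Hypothetical helper: compare every regular file in a baseline directory
# with the file of the same name in a new run directory.
# Assumption: both directories hold the files directly, with equal names.
compare_with_baseline() {
    baseline_dir=$1
    run_dir=$2
    differences=0
    for ref in "$baseline_dir"/*; do
        [ -f "$ref" ] || continue
        new="$run_dir/$(basename "$ref")"
        # cmp -s is silent and returns non-zero if the files differ
        # or if the new file is missing.
        if ! cmp -s "$ref" "$new"; then
            echo "DIFFERS: $(basename "$ref")"
            differences=$((differences + 1))
        fi
    done
    [ "$differences" -eq 0 ]
}
```

The function succeeds only when no differences are found, which mirrors the pass criterion described above.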

The sample session below demonstrates baseline data set generation and comparison for a single test case, ER.01a, using the IBM "bluesky" machine at NCAR. Note that when testing on machines located at other sites, the -baselineroot option can be used to tell create_test where baseline data sets should be located. Also, the -inputdataroot option can be used to tell create_test where to find input data sets. Both of these options are used in the sample session below. See the text generated by create_test -help for more about these options. In the sample session, assume that the "known-good" version is stored in $CCSMROOTOK/ccsm3/ and the version under test is stored in $CCSMROOT/ccsm3/.

Create a new directory for testing:
```
 > mkdir -p /ptmp/$USER/tstregress
```
Create the baseline data set:
1. Create a new directory for the baseline data set:
```
 > mkdir -p /ptmp/$USER/mybaseline
```
2. Go to the CCSM scripts directory for the "known-good" version $CCSMROOTOK/ccsm3/scripts/:
3. Execute create_test with baseline generation. This use of the -inputdataroot option assumes that CCSM input data sets exist in /ptmp/$USER/CCSM3.0/inputdata/.
```
 > create_test -test ER.01a -mach bluesky -res T31_gx3v5 -compset B \
               -inputdataroot /ptmp/$USER/CCSM3.0/inputdata -regress generate \
               -regress_name mynewbaseline -baselineroot /ptmp/$USER/mybaseline \
               -testroot /ptmp/$USER/tstregress
```
4. Go to the test directory:
```
 > cd /ptmp/$USER/tstregress/TER.01a.T31_gx3v5.B.bluesky.G.130344/
```
5. Build the model interactively:
```
 > ./TER.01a.T31_gx3v5.B.bluesky.G.130344.build
```
6. Edit the test script to select queue for batch submission (optional):
```
 > vi ./TER.01a.T31_gx3v5.B.bluesky.G.130344.test
```
7. Submit the test script to the batch queue system (using llsubmit, or qsub, or ...):
```
 > llsubmit ./TER.01a.T31_gx3v5.B.bluesky.G.130344.test
```
8. Check test results.
  When the batch jobs complete, look for test results in Teststatus.out. If a test fails, then something is wrong with the "known-good" model (i.e. maybe it is not good after all). Fix the problem and start over from step 1. The new baseline data set will be stored in the specified baseline root directory with the specified name if and only if the ER.01a test case passes. Verify that model output was stored in the directory
  /ptmp/$USER/mybaseline/mynewbaseline/TER.01a.T31_gx3v5.B.bluesky.
Compare with the baseline data set:
1. Go to the CCSM scripts directory for the version under test $CCSMROOT/ccsm3/scripts/:
2. Execute create_test with baseline comparison:
```
 > create_test -test ER.01a -mach bluesky -res T31_gx3v5 -compset B \
               -inputdataroot /ptmp/$USER/CCSM3.0/inputdata -regress compare \
               -regress_name mynewbaseline -baselineroot /ptmp/$USER/mybaseline \
               -testroot /ptmp/$USER/tstregress
```
3. Go to the test directory:
```
 > cd /ptmp/$USER/tstregress/TER.01a.T31_gx3v5.B.bluesky.C.162532/
```
4. Build the model interactively:
```
 > ./TER.01a.T31_gx3v5.B.bluesky.C.162532.build
```
5. Edit the test script to select queue for batch submission (optional):
```
 > vi ./TER.01a.T31_gx3v5.B.bluesky.C.162532.test
```
6. Submit the test script to the batch queueing system (using llsubmit, or qsub, or ...):
```
 > llsubmit ./TER.01a.T31_gx3v5.B.bluesky.C.162532.test
```
7. Check test results: When the batch jobs complete, look for test results in Teststatus.out. If baseline comparison failed, then the two models are not producing identical results.

Note that coupler log files are currently used for baseline comparison. Since these are ASCII text files, it is possible, though unlikely, that a very small error appearing late in a test run might escape detection. The baseline data sets also include full-precision coupler history files. In a future patch these will also be compared, to reduce the chance of undetected bit-for-bit changes.

