Regular system testing and validation of the CCSM are required to ensure that model quality and integrity are maintained throughout the development process. This section establishes the system testing standards and the procedures that will be used to verify that the standards have been met. It is assumed that component model development teams have unit tested their components before making them available for system testing. See section ) for more information on testing of individual components and unit testing of individual subroutines and modules within components.
There are two general categories of model evaluations: frequent short test runs and infrequent long validation integrations.
Model testing refers to short (3 to 31 day) model runs designed to verify that the underlying mechanics and performance of the coupled model continue to meet specifications. This includes verifying that the model actually starts up and runs, benchmarking model performance and the relative speed and cost of each model component, and checking that the model restarts exactly. These tests are done on each of the target platforms. Model testing does not address whether the model answer is correct; it merely verifies that the model operates mechanically as specified.
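The exact-restart check can be illustrated with a minimal sketch. It uses a toy model (a logistic map) as a hypothetical stand-in for a real component, and compares a continuous run with a run that is stopped, written to a "restart file", and resumed. The function names and the byte-level comparison are assumptions made for illustration; they are not part of the CCSM test scripts.

```python
import struct

def step(state):
    """Advance a toy model one timestep (logistic map; stand-in for a real component)."""
    return [3.9 * x * (1.0 - x) for x in state]

def run(state, nsteps):
    """Integrate the toy model for nsteps timesteps."""
    for _ in range(nsteps):
        state = step(state)
    return state

def to_bytes(state, fmt="d"):
    """Serialize the state: 'd' stores 8-byte floats, 'f' stores 4-byte floats."""
    return struct.pack(f"<{len(state)}{fmt}", *state)

def from_bytes(raw, n, fmt="d"):
    """Read a state of n values back from its serialized form."""
    return list(struct.unpack(f"<{n}{fmt}", raw))

def restarts_exactly(initial, nsteps, restart_at, fmt="d"):
    """Compare a continuous run with a run interrupted by a restart at restart_at."""
    continuous = run(initial, nsteps)
    restart_data = to_bytes(run(initial, restart_at), fmt)  # "write" the restart file
    resumed = from_bytes(restart_data, len(initial), fmt)   # "read" it back
    restarted = run(resumed, nsteps - restart_at)
    # Bit-for-bit comparison of the two final states
    return to_bytes(continuous) == to_bytes(restarted)

initial_state = [0.1, 0.2, 0.3]
print(restarts_exactly(initial_state, 10, 5))           # restart saved at full precision
print(restarts_exactly(initial_state, 10, 5, fmt="f"))  # restart truncated to 4-byte floats
```

In this sketch the restart is exact only when the restart data keep the full precision of the running state, which is why the comparison is made on the raw bytes and not with a tolerance.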
Model validation involves longer (at least 1 year) integrations to ensure that the model results are in acceptable agreement with both previous model climate statistics and observed characteristics of the real climate system. Model validation occurs with each minor CCSM version (e.g., CCSM2.1, CCSM2.2) or at the request of the CCSM scientists and working groups. Once requested, model validation is carried out only after CCSM scientists have been consulted and the model testing phase has been successfully completed. The model validation results are documented on a publicly accessible web page (http://www.cesm.ucar.edu/models/ccsm2.0beta/testing/status.html).
Port validation is defined as verification that the differences between two otherwise identical model simulations obtained on different machines or using different environments are caused by machine roundoff errors only.
Formal testing of the CCSM is required for each tagged version of the model. The CCSM quality assurance lead is responsible for ensuring that these tests are run, either by running them personally or by having a qualified person run them. If a model component is identified as having a problem, the liaison for that component is expected to make resolving that problem their highest priority. The results of the testing and benchmarking will be included in the tagged model to document the run characteristics of the model. The actual testing and analysis scripts will be part of the CCSM CVS repository to encourage use by outside users.
Model validation occurs with each minor CCSM version (e.g., CCSM2.1, CCSM2.2) or at the request of the CCSM scientists and working groups. Before starting a validation run, the CCSM quality assurance lead will consult with the CCSM scientists to design the validation experiment.
Pre-Validation Run Steps:
Validation Steps:
Port validation is defined as verification that the differences between two otherwise identical model simulations obtained on different machines or using different environments are caused by machine roundoff errors only. Roundoff errors can be caused by using two machines with different internal floating point representation, or by using a different number of processing elements on the same machine which may cause a known re-ordering of some calculations, or by using different compiler versions or options (on a single machine or different machines) which parse internal computations differently.
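The effect of re-ordering calculations can be seen in a small sketch. Floating-point addition is not associative, so summing the same values in different groupings, as might happen when partial sums are formed on a different number of processing elements, can change the last bits of the result. The function `grouped_sum` and the sample values are hypothetical and chosen only for illustration.

```python
def grouped_sum(values, split):
    """Sum values[:split] and values[split:] separately, then combine them,
    mimicking partial sums computed on two processing elements."""
    first = 0.0
    for v in values[:split]:
        first += v
    second = 0.0
    for v in values[split:]:
        second += v
    return first + second

values = [0.1, 0.2, 0.3]
sum_a = grouped_sum(values, 2)  # (0.1 + 0.2) + 0.3
sum_b = grouped_sum(values, 1)  # 0.1 + (0.2 + 0.3)

# Same inputs, different grouping: the results differ at roundoff level
print(sum_a == sum_b, abs(sum_a - sum_b))
```

Both results are equally legitimate roundings of the same sum, which is why a valid port is expected to differ from the original only at this level.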
The following paper offers a primary reference for port validation (hereafter referred to as RW):
Rosinski, J.M. and D.L. Williamson: The Accumulation of Rounding Errors and Port Validation for Global Atmospheric Models. SIAM Journal on Scientific Computing, Vol. 18, No. 2, March 1997.
As established in RW, three conditions of model solution behavior must be fulfilled to successfully validate a port of atmospheric general circulation models:
The extent to which these conditions apply to models other than an atmospheric model has not yet been established. Also, note that the third condition is not the focus of this section (see section 13.2).
Validation of the full CCSM system, defined as the combination of all active model components participating in the full computation, is a two-step process:
Validation of each component model alone should be performed by the model developers, and it may not be necessary to perform the standalone tests as part of regular, frequent validation testing.
To validate the fully coupled CCSM, the objective is to establish a procedure which will allow one to conclude confidently that the port of the full system (all components active) is valid. However, there are at least two potential problems which should be noted:
The general procedure for port validation of the full CCSM is to examine the growth of differences between two solutions over a suitable number of integral timesteps. This error growth can be compared to the growth of differences between two solutions on a single machine, where the differing solution was produced by introducing a random perturbation of the smallest amplitude which can be felt by the model at the precision of the machine.
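A minimal sketch of such a perturbation follows. It assumes the perturbation is applied as a relative change of fixed amplitude with a random sign at each point; the source does not state whether the amplitude is relative or absolute, so this choice, the function name `perturb`, and the sample temperatures are all assumptions.

```python
import random

def perturb(field, amplitude=1.0e-14, seed=0):
    """Return a copy of field with each value scaled by (1 + s * amplitude),
    where s is +1 or -1 chosen at random (assumed relative perturbation)."""
    rng = random.Random(seed)  # fixed seed so the perturbed run is reproducible
    return [x * (1.0 + rng.choice((-1.0, 1.0)) * amplitude) for x in field]

# Hypothetical surface temperatures in kelvin
control = [288.15, 273.16, 301.4, 255.0]
perturbed = perturb(control)

# Relative size of the change at each point
rel_diffs = [abs(p - c) / abs(c) for p, c in zip(perturbed, control)]
print(max(rel_diffs))
```

A fixed-amplitude, random-sign perturbation is used here so that every point is changed by an amount the machine can represent; a uniformly distributed random factor could leave some points unchanged after rounding.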
It is recommended that the procedure examine the growth of differences in a state variable which resides at the primary physical interface (that is, the surface), where the accumulation of errors in all components will act quickly and where the action of the CCSM coupler is also significant (for example, grid mapping).
It is also recommended that the procedure be performed on a coupled system where the exchange of information between active components is frequent. Exchanges of information at model day boundaries may mask the detection of an invalid port because the magnitude of the error differences could reach roundoff saturation levels prior to an exchange of data. See example 5 in section 13.3.4.
The recipe for CCSM validation is as follows:
The errors should satisfy the first two conditions described in RW.
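The comparison of the two error growth curves can be sketched as follows. This is one possible reading of the comparison described above, not the formal statement of the RW conditions: it flags timesteps at which the port difference exceeds the perturbation difference by more than a chosen factor. The function name, the `factor` parameter, and all sample values are hypothetical.

```python
def steps_exceeding_perturbation(port_rms, perturbation_rms, factor=1.0):
    """Return the timestep indices at which the port difference is larger
    than factor times the perturbation difference."""
    return [
        i
        for i, (port, pert) in enumerate(zip(port_rms, perturbation_rms))
        if port > factor * pert
    ]

# Hypothetical RMS difference curves, one value per timestep
perturbation_curve = [1e-14, 3e-14, 1e-13, 5e-13, 2e-12]
good_port_curve = [8e-15, 2e-14, 9e-14, 4e-13, 2e-12]
bad_port_curve = [1e-6, 2e-6, 3e-6, 4e-6, 5e-6]

print(steps_exceeding_perturbation(good_port_curve, perturbation_curve))
print(steps_exceeding_perturbation(bad_port_curve, perturbation_curve))
```

An empty list suggests the port differences grow no faster than the perturbation differences over the tested window; any flagged timestep would call for a closer look at the plotted curves.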
Specific recommendations for a port validation of CCSM:
| Item | Recommendation |
| --- | --- |
| Length of test | 5-30 days |
| Field to examine | 2-D surface temperature on atmospheric grid |
| Frequency of samples | Every timestep |
| Size of perturbation | Smallest which can be felt on original machine (1.0E-14) |
| Error statistic | RMS difference of field, area-averaged |
Note that the field being examined must be processed using the full machine precision. The field must be saved at full machine precision during the model
history archival step, and the error statistic must be computed at full machine precision.
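The error statistic can be sketched as an area-weighted RMS difference computed in double precision. The sketch assumes a regular latitude-longitude grid with cosine-of-latitude area weights; a model on a Gaussian grid would use its own weights. The function name and the sample fields are hypothetical.

```python
import math

def area_weighted_rms(field_a, field_b, lats_deg):
    """Area-weighted RMS difference of two 2-D fields.

    Each field is a list of latitude rows, each row a list of longitude values.
    Weights are cos(latitude), an assumption valid for a regular lat-lon grid.
    """
    weighted_squares = []
    weights = []
    for row_a, row_b, lat in zip(field_a, field_b, lats_deg):
        w = math.cos(math.radians(lat))
        for a, b in zip(row_a, row_b):
            weighted_squares.append(w * (a - b) ** 2)
            weights.append(w)
    # math.fsum avoids accumulating extra roundoff in the statistic itself
    return math.sqrt(math.fsum(weighted_squares) / math.fsum(weights))

# Hypothetical surface temperature fields (kelvin) on a 3 x 2 grid
lats = [-60.0, 0.0, 60.0]
control = [[250.0, 251.0], [300.0, 301.0], [260.0, 261.0]]
ported = [[v + 0.5 for v in row] for row in control]

print(area_weighted_rms(control, ported, lats))
```

Python floats are 8-byte values, so the statistic is computed at full machine precision provided the fields were read from history files saved at that precision.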
Example 1. Perturbation Error Growth
A typical perturbation error growth of the globally averaged RMS difference of surface temperature, using a control run and a low-order bit perturbation of CCM on 16 PEs of the IBM SP. Two days (144 atmospheric timesteps) are shown. Note that the first few timesteps satisfy the first condition of RW.
Example 2. Machine Port
Black line is the perturbation error growth on the original machine (same as example 1). Red line is the growth of differences between the simulation on the original machine and the simulation on 64 PEs of an SGI Origin 2000, and the blue line is the growth of differences from a simulation on 32 PEs of an IBM SP. Note that the first two days (144 timesteps) satisfy the second condition of RW.
Example 3. Bad Port I
Same as example 2, but blue line is a port where the default greenhouse gas concentration was modified accidentally in the atmospheric source code. The first and second conditions of RW are violated.
Example 4. Bad Port II
Same as example 2, but blue line is a port where the second order diffusion
coefficient was raised by 15% in the atmospheric model namelist input. The
first and second conditions of RW are violated.
Example 5. Frequency of Model Data Exchange
Same as example 2, but blue line is a port where the ocean model vertical diffusion coefficient was lowered intentionally. Although the first and second RW conditions are satisfied, the port is bad by construction. The problem is that the ocean and atmosphere were directed to exchange data only at day boundaries (every 72 atmospheric timesteps), so the coupler did not communicate the ocean solution to the atmosphere until the start of the second day. By the time the atmospheric model received the information, the error in the ocean model solution had already reached the roundoff saturation level. For port validation, this example demonstrates that exchanges of data between components must occur more frequently than the time scale at which the roundoff error reaches its saturated (level) value.
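The requirement in example 5 can be expressed as a simple check, sketched here under stated assumptions. Saturation is estimated as the first timestep at which the error curve reaches a given fraction of its maximum; that definition, the 0.9 fraction, the function names, and the synthetic curve are all hypothetical choices made for illustration.

```python
def saturation_step(rms_curve, fraction=0.9):
    """First timestep index at which the curve reaches fraction * its maximum
    (an assumed, simple estimate of when roundoff error has saturated)."""
    threshold = fraction * max(rms_curve)
    for i, value in enumerate(rms_curve):
        if value >= threshold:
            return i

def exchange_is_frequent_enough(exchange_interval_steps, rms_curve, fraction=0.9):
    """True if components exchange data before the error curve saturates."""
    return exchange_interval_steps < saturation_step(rms_curve, fraction)

# Synthetic error curve: doubles every timestep from 1e-14 until it levels off at 1.0
curve = [min(1.0, 1e-14 * 2.0 ** i) for i in range(100)]

print(saturation_step(curve))
print(exchange_is_frequent_enough(72, curve))  # exchange once per day (72 timesteps)
print(exchange_is_frequent_enough(1, curve))   # exchange every timestep
```

With this synthetic curve, a once-per-day exchange comes after saturation and so fails the check, which mirrors the situation described in example 5.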