
3 The CCSM Scripts

Three levels of c-shell scripts are used to build and run the model. The run script coordinates the building and running of the complete system, the component setup scripts configure each individual CCSM component model, and the tool scripts handle generic operations, such as file transfer and positioning.

The CCSM execution is controlled by a single script, referred to as the ``run script''. In the CCSM distribution, this file is $HOME/ccsm2.0/scripts/test.a1.run. By convention, the name of the script is the name of the CASE with a suffix of ".run". For example, if the CASE name were "test.a1", the run script would be named "test.a1.run" and located in the scripts directory, $SCRIPTS.
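For illustration (the case name here is hypothetical and not part of the distribution), a case named b20.007 would be driven by a run script named as follows:

setenv CASE  b20.007                             # hypothetical case name
#  run script, by the convention above:  $SCRIPTS/b20.007.run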

Once the run script has defined the common environment, each of the component models (cpl, atm, ocn, ice, lnd) is configured using a component setup script. The common build and run environment defined in the run script is automatically propagated to each of the component model setup scripts. These variables define such things as the machine architecture, the number of CPUs to run on, and common experiment and file naming conventions.

Finally, when all of the setup scripts have successfully completed, the run script executes all CCSM components simultaneously.

3.1 The CCSM Run Script: test.a1.run

The coordinated build and execution of the CCSM is controlled by a single UNIX c-shell script. The example script distributed with CCSM is called test.a1.run. This script has the following tasks:



a. Set batch system options
b. Define build and run environment variables common to all components
c. Select multi-processing and resolution specifications
d. Run the setup script for each component (see next section)
e. Run the CCSM integration
f. Archive/harvest/resubmit when the run is finished


Below, the various steps of the test.a1.run script are outlined.

3.1.1 Set batch system options



#! /bin/csh -f
#===============================================================================
# CVS Id: CCSM.Scripts.tex,v 1.7 2002/05/13 20:07:10 southern Exp $
# CVS Source:  $
# CVS Name: ccsm2_0_beta47 $


The first section of the CCSM run script documents the revision control version of the script. The Concurrent Versions System (CVS) is used as the revision control software for the CCSM. The ``CVS Name:'' string should be included when reporting CCSM problems.



#===============================================================================
#  This is a CCSM coupled model NQS batch job script
#===============================================================================

#-------------------------------------------------------------------------------
# a. Set batch system options
#-------------------------------------------------------------------------------

# ------------ NCAR IBM SP: blackforest ------------
# @ shell = /usr/bin/csh
# @ output = poe.stdout.$(jobid).$(stepid)
# @ error  = poe.stderr.$(jobid).$(stepid)
# @ network.MPI = csss,not_shared,us
# @ environment  = MP_EUILIB=us; MP_EAGER_LIMIT=0
# @ node_usage = not_shared
# @ checkpoint = no
# @ ja_report = yes
# @ wall_clock_limit = 3800
# @ class = csl_reg
# @ job_type = parallel
# @ job_name = test.a1
## @ account_no = 00000000
## @ task_geometry = {(0,1,2,3)(4)}
# @ task_geometry = {(0)(1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11,12,13,14)
   (15,16,17,18)(19,20,21,22)(23,24,25,26)(27,28,29,30)(31,32,33,34)
   (35,36,37,38)(39,40,41,42)(43,44,45,46)(47,48,49,50)(51,52,53,54)
   (55,56,57,58)(59,60,61,62)(63,64,65,66)(67)}
# @ queue
# ------------ NCAR SGI O2K: ute ------------
# QSUB -q ded_28 -l mpp_p=30            # select batch queue
#  SUB -lT    59:00 -lt   59:00         # set CPU time limits
# QSUB -mb -me -eo                      # combine stderr & stdout
# ------------ NCAR Compaq: prospect ------------
#PBS -q reg
#PBS -l nodes=3:ppn=4
#PBS -l walltime=0:59:00
#-----------------------------------------------------------------------


The CCSM is supported on two platforms, the IBM SP and the SGI Origin 2000. Typically, the CCSM is run via a batch queuing system. The commands in the default script define the settings for three different batch queue environments at NCAR: an IBM SP (blackforest), an SGI Origin 2000 (ute), and a Compaq system (prospect). The batch queueing system is machine- and site-dependent.

The #!/bin/csh -f line indicates that this is a c-shell script. The "-f" option keeps the user's personalized $HOME/.cshrc file from being executed, to avoid introducing aliases that could adversely affect the operation of this script. The CVS lines document the revision control version of this script. The remaining lines in this section control the batch submission environment for the three different platforms. Refer to the relevant system documentation for more information on these options.

3.1.2 Define the common run environment



echo -------------------------------------------------------------------------
echo  b.  Set env variables available to model setup scripts
echo -------------------------------------------------------------------------

setenv MSSNAME `echo $LOGNAME | tr '[a-z]' '[A-Z]'`  # LOGNAME in caps

setenv CASE        test.a1                       # case name
setenv GRID        T42_gx1v3                     # T42_gx1v3 or T31_gx3
setenv RUNTYPE     startup                 # startup, continue, branch, hybrid
setenv SETBLD      auto                          # auto, true, false
setenv BASEDATE    0001-01-01                    # initial start date yyyy-mm-dd


In this section, the common run environment is defined. All of the components will share this environment. The variables defined in this section are:

MSSNAME (string) follows a convention used by the NCAR Mass Store System. The first element in a user's Mass Store directory path is the user's login name in capital letters. This may be used by each of the components.
CASE (string) is the name that identifies this model run. The CASE name is propagated throughout the CCSM environment; it determines where the model run scripts are located and where the model is actually run, and it forms part of the output file path names. Currently CASE can be up to 16 characters long.
GRID (string) specifies the CCSM horizontal grid. The format is atm/lnd_ocn/ice, where atm/lnd is the resolution of the atmosphere and land components and ocn/ice is the resolution of the ocean and sea-ice components. The currently distributed grids are T42_gx1v3 and T31_gx3 for all configurations other than latm, and T62_gx1v3 for latm.
RUNTYPE (string) specifies the state in which the CCSM is to begin a run. A startup run begins a new CCSM run from conditions that might involve reading data from external files or initializing variables internally or some combination. A hybrid run indicates the start of a new CCSM run largely from existing CCSM restart or initial files. A continuation run extends an existing CCSM run from its current endpoint, guaranteeing exact restart. A branch run defines a new CCSM run that is started from bit-for-bit exact restart files but with a new case name.
SETBLD (string) controls whether or not the model executable is built. If SETBLD = true, all of the CCSM component executables will be rebuilt if gmake determines it is needed. For SETBLD = false, the component executables will not be rebuilt. For SETBLD = auto the components will not be rebuilt if RUNTYPE is continue. For all other RUNTYPE parameters, SETBLD = auto will invoke gmake. This ensures that the same component executables are used during the entire integration when running production runs using RUNTYPE = continue.
BASEDATE (integer) defines the baseline date for this run. BASEDATE conforms to ISO-8601 format YYYY-MM-DD, where YYYY is the year in the Gregorian calendar, MM is the month of the year ranging from 01 for January to 12 for December and DD is the day of the month from 01 to 31.



setenv REFCASE  test.a1                          # Runtype = branch data case
setenv REFDATE  0001-01-06                       # Runtype = branch start date


REFCASE (string) is the common reference case to use when starting up with a RUNTYPE of branch. It coordinates, across all of the components, the CASE from which the CCSM will be branching. REFCASE is ignored unless RUNTYPE is set to branch.
REFDATE (string) coordinates the date in REFCASE from which the branch run is to begin. REFDATE is ignored unless RUNTYPE is set to branch.
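For illustration, a branch run from an existing case would set RUNTYPE, REFCASE and REFDATE together; the case name and date below are hypothetical:

setenv RUNTYPE  branch                           # start from another case's restart files
setenv REFCASE  b20.006                          # hypothetical existing case to branch from
setenv REFDATE  0010-01-01                       # restart date within REFCASE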



setenv CASESTR "fully coupled $GRID test"        # short descriptive text string
setenv MSSDIR   mss:/$MSSNAME/csm/$CASE          # MSS directory path name
setenv MSSDIR   null:/dev/nul                    # MSS directory path name
setenv MSSRPD   0                                # MSS file retention period
setenv MSSPWD   $LOGNAME                         # MSS file write password


CASESTR (string) is designed to hold a short description of this CASE. CASESTR will appear in the model output and in the global attributes of the output data files.
MSSDIR (string) defines the destination of the output datasets. mss:/PATH/name indicates that the datasets should be written to the NCAR Mass Storage System. WARNING: Some components (i.e. ice and ocn) do not obey this directive! cp:/file/path indicates that the datasets are to be copied into the directory /file/path. null:/dev/nul means do nothing; in this case, the archiver and harvester scripts in the $SCRIPTS directory (see below) can be used to send the output files to their final destination. Note that the default script sets MSSDIR twice, so the second setting (null:/dev/nul) is the one that takes effect.
MSSRPD (integer) sets the NCAR Mass Storage System's retention period in days. The atmosphere and land models interpret a 0 MSSRPD value to mean that output data files will not be copied to the Mass Storage System.
MSSPWD (string) sets the NCAR Mass Storage System's write password.
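For illustration, a user without access to the NCAR MSS could instead direct the output to local disk via the cp: form described above (subject to the warning that some components do not obey MSSDIR); the path below is hypothetical:

setenv MSSDIR   cp:/scratch/$LOGNAME/csm/$CASE   # copy output files to this local directory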



setenv CSMROOT  /fs/cgd/home0/$LOGNAME/ccsm2.0   # root directory of source
setenv SCRIPTS  $CSMROOT/scripts/$CASE           # run scripts are here
setenv TOOLS    $CSMROOT/scripts/tools           # some tools are here
setenv LOGDIR   $CSMROOT/scripts/$CASE/logs      # save stdout here
setenv CSMCODE  $CSMROOT/models                  # base dir for src code
setenv CSMUTL   $CSMROOT/models/utils            # Util directory
setenv CSMSHR   $CSMROOT/models/csm_share        # shared code dir
setenv CSMBLD   $CSMROOT/models/bld              # makefiles are here
setenv CSMDATA  /fs/cgd/csm/inputdata            # base dir for input data
setenv LID      "`date +%y%m%d-%H%M%S`"          # time-stamp/file-ID string


CSMROOT (string) defines the root directory of the CCSM code distribution directory tree. The model source code, scripts and documentation are located underneath this directory. In the default case, all of the following environment variables are derived from CSMROOT.
SCRIPTS (string) is the directory containing the run scripts for the current CASE.
TOOLS (string) is the directory containing CCSM utility tools.
LOGDIR (string) is the directory to which copies of the standard out log files (printout) from each of the component models will be copied.
CSMCODE (string) points to the root directory of the CCSM source code for all components.
CSMUTL (string) is the directory containing utility codes, such as the Earth System Modeling Framework (ESMF) routines.
CSMSHR (string) is the directory holding CCSM code that is shared across a number of different components, such as timers, orbital settings, physical constants and message-passing wrappers.
CSMBLD (string) is the directory containing the makefiles and site-specific gnumake macros necessary to build the model executables.
CSMDATA (string) is the root directory for the input and boundary datasets. This directory differs from the others in that it is not created by the CCSM run script. It is assumed that $CSMDATA already exists and contains the data files from the CCSM2.0 input data distribution tar files downloaded from the CCSM2.0 release home page (www.cesm.ucar.edu/models/ccsm2.0).
LID (string) defines a unique time-stamp string of the form YYMMDD-hhmmss that is incorporated into the filenames of all of the component output files of the current run.
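For example, a job started on 13 May 2002 at 14:30:07 would get LID = 020513-143007, and the output log files of that run would carry names such as cpl.log.020513-143007.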



setenv EXEROOT  /ptmp/$LOGNAME/$CASE  # run model here
setenv OBJROOT  $EXEROOT              # build code here
setenv LIBROOT  $EXEROOT/lib          # Location of supplemental libraries
setenv INCROOT  $LIBROOT/include      # Location of supplemental includes/modfiles
setenv ARCROOT  /XXX/$LOGNAME/archive/$CASE # archive root directory


EXEROOT (string) is the directory where the model actually executes. Subdirectories for each of the model components will be created under EXEROOT. These directories will contain the input and output datasets, the namelists, the model executables and other run artifacts.
OBJROOT (string) defines the directory where the model object files are to be created. While most systems allow OBJROOT and EXEROOT to be the same, some systems need these to be different.
LIBROOT (string) is the directory where supplemental libraries (such as ESMF) will be built and maintained.
INCROOT (string) is the directory for include and module files needed by the supplemental libraries.
ARCROOT (string) defines a directory to be used by the CCSM tools when running the data harvesting scripts.



setenv LFSINP  $CSMDATA                       # LOCAL INPUTDATA FSROOT
setenv LMSINP  /CCSM/inputdata                # LOCAL INPUTDATA MSROOT
setenv LMSOUT  /$MSSNAME/csm/$CASE            # LOCAL OUTPUT MSROOT
setenv MACINP  dataproc.ucar.edu              # REMOTE INPUTDATA MACHINE
setenv RFSINP  /fs/cgd/csm/inputdata          # REMOTE INPUTDATA FSROOT
setenv RMSINP  /CCSM/inputdata                # REMOTE INPUTDATA MSROOT
setenv MACOUT  dataproc.ucar.edu              # REMOTE OUTPUT MACHINE
setenv RFSOUT  /ptmp/$LOGNAME/archive/$CASE   # REMOTE OUTPUT FSROOT


These environment variables configure how input data are acquired and where output data are saved.

LFSINP (string) is the local file system disk location of the input data.
LMSINP (string) is the root directory location on the local mass storage device (NCAR MSS or LANL/NERSC HPSS) of the input data.
LMSOUT (string) is the root directory on the local mass storage device (NCAR MSS or LANL/NERSC HPSS) for the output data.
MACINP (string) is the remote machine to copy the inputdata from if the data cannot be located locally. The Unix scp command is used to transfer the files.
RFSINP (string) is the remote file system directory on the MACINP machine for acquiring input data via scp.
RMSINP (string) is the root directory on a remote mass storage device (NCAR MSS or LANL/NERSC HPSS) to acquire the input data from if those data cannot be located locally.
MACOUT (string) is the remote machine to which the output data will be copied (via the Unix scp command).
RFSOUT (string) is the remote file system directory on the MACOUT machine where the output data will be sent.



foreach DIR ( $EXEROOT $LIBROOT $INCROOT $OBJROOT $LOGDIR)
  if !(-d $DIR) mkdir -p $DIR
end


This foreach loop creates the listed directories if they don't already exist.



#--- logic to set BLDTYPE based on SETBLD above
setenv BLDTYPE $SETBLD
if ($SETBLD =~ auto*) then
  setenv BLDTYPE true
  if ($RUNTYPE == 'continue') setenv BLDTYPE false
endif
if ($BLDTYPE != 'true' && $BLDTYPE != 'false') then
  echo "error in BLDTYPE: $BLDTYPE"
  exit 1
endif


This logic resolves the setenv SETBLD auto option. For setenv SETBLD auto, gmake is run on the components only if RUNTYPE is set to startup, branch, or hybrid. If RUNTYPE is continue, component executables are assumed to already exist and rebuilds are not carried out. This ensures that the component executables are unchanged during the entire integration.



echo -------------------------------------------------------------------------
echo  b1.  Determine os/machine/site
echo -------------------------------------------------------------------------

setenv OS   unknown
setenv ARCH unknown
setenv MACH unknown
setenv SITE unknown

setenv OS `uname -s`                               # operating system
if ($status == 0) then                             # architecture
 if ( $OS == 'AIX')    setenv ARCH IBM
 if ( $OS == 'OSF1')   setenv ARCH CPQ
 if ( $OS == 'IRIX64') setenv ARCH SGI
endif

setenv MACHKEY `hostname`
if ($status == 0) then
 if ($MACHKEY =~ bb*    ) setenv MACH babyblue     # machine
 if ($MACHKEY =~ bf*    ) setenv MACH blackforest
 if ($MACHKEY =~ s*     ) setenv MACH seaborg
 if ($MACHKEY =~ prosp* ) setenv MACH prospect
 if ($MACHKEY =~ ute*   ) setenv MACH ute
 if ($MACHKEY =~ n*     ) setenv MACH nirvana
 if ($MACHKEY =~ eag*   ) setenv MACH eagle
 if ($MACHKEY =~ fal*   ) setenv MACH falcon
 setenv SITE ncar                                  # site, default is ncar
 if ($MACHKEY =~ n*     ) setenv SITE lanl
 if ($MACHKEY =~ s*     ) setenv SITE nersc
 if ($MACHKEY =~ eag*   ) setenv SITE ornl
 if ($MACHKEY =~ fal*   ) setenv SITE ornl
endif


This section attempts to identify the operating system, machine, and site on which the CCSM is being run.

OS (string) is the operating system name returned by uname -s.
ARCH (string) is the machine architecture (IBM, CPQ, or SGI) derived from $OS.
MACHKEY (string) is the hostname of the machine on which the CCSM is being built.
MACH (string) is the specific machine name (e.g. blackforest, ute) derived from $MACHKEY.
SITE (string) is the site (ncar, lanl, nersc, or ornl) derived from $MACHKEY; the default is ncar. If $SITE remains ``unknown'', the script will halt. These variables select the site- and machine-specific files $TOOLS/modules.$OS.$MACH and $CSMBLD/Macros.$OS used to build the CCSM.



echo -------------------------------------------------------------------------
echo  b2.  Create ccsm_joe
echo -------------------------------------------------------------------------

setenv CSMJOE $SCRIPTS/ccsm_joe
rm -f $CSMJOE
$TOOLS/ccsm_checkenvs > $CSMJOE


The ccsm_joe file documents the job environment variables that are in effect for the run. This will aid in debugging any problems that might be experienced. ccsm_joe is also used by the data harvester and utility tools to get environment variables for the case.

The ccsm_getrestart utility positions restart files from the archive area. Use of this tool is commented out in the default version. It is a handy way to gather restart datasets from a central directory and copy them into the appropriate executable directories. This is often used when carrying out a branch or hybrid RUNTYPE and can also be used for a continue RUNTYPE.

3.1.3 Select multi-processing and resolution specs



echo -------------------------------------------------------------------------
echo  c. Select multi-processing and resolution specs
echo -------------------------------------------------------------------------

set MODELS   = (  atm   lnd   ice   ocn    cpl  )  # generic model names.
set SETUPS   = (  atm   lnd   ice   ocn    cpl  )  # setup script name

if ($GRID == T42_gx1v3 ) then
 set NTASKS  = (    8     3    16    40      1  )  # use NTASK = 1 for data model
 set NTHRDS  = (    4     4     1     1      4  )  # use NTHRD = 1 for data model
else if ($GRID == T31_gx3 ) then
 set NTASKS  = (    4     4     2     4      1  )  # use NTASK = 1 for data model
 set NTHRDS  = (    4     1     1     1      2  )  # use NTHRD = 1 for data model
else if ($GRID == T62_gx1v3 ) then
 set NTASKS  = (    1     1    16    40      1  )  # use NTASK = 1 for data model
 set NTHRDS  = (    1     1     1     1      4  )  # use NTHRD = 1 for data model
else
 echo "unsupported configuration: $GRID"
 exit 1
endif


This section defines the arrays of model components and their threading and tasking layouts. The MODELS array defines the generic name of the model components to be coupled. Unless new components are being added, there should be no reason to change these settings. For each MODEL array element, a corresponding element definition is expected in the $SETUPS, $NTASKS and $NTHRDS arrays.

The SETUPS array defines the specific names of the component setup scripts and should align with the ordering of the $MODELS array. The names set here identify which setup scripts (i.e. $SCRIPTS/$SETUPS.setup.csh) will be called to build the individual model components. In this example, $SCRIPTS/atm.setup.csh will be called to build the atmosphere. If the data atmosphere is to be used instead of the active atmosphere model, "datm" should be used as the first element in the SETUPS array.

The NTASKS array sets the number of MPI tasks to be used for each model component.

The NTHRDS array sets the number of OpenMP threads to be used for each MPI task.

The example configuration is set up to execute on an IBM SP with 4 processors per node.
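As a worked example, the total number of processors implied by a layout is the sum over the five components of NTASKS times NTHRDS; for the T42_gx1v3 layout above this is 8x4 + 3x4 + 16x1 + 40x1 + 1x4 = 104. The following minimal csh sketch of that arithmetic is for illustration only and is not part of the distributed script:

@ PES = 0
foreach n (1 2 3 4 5)
  @ PES += $NTASKS[$n] * $NTHRDS[$n]     # MPI tasks times OpenMP threads per task
end
echo "total processors requested: $PES"  # 104 for the T42_gx1v3 layout above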



setenv ATM_GRID `echo $GRID | sed s/_.\*//`; setenv LND_GRID   $ATM_GRID
setenv OCN_GRID `echo $GRID | sed s/.\*_//`; setenv ICE_GRID   $OCN_GRID


This section obtains grid information for use in the component setup scripts.

ATM_GRID is set to the first part of $GRID for use in the atm.setup.csh and lnd.setup.csh scripts.

OCN_GRID is set to the second part of $GRID for use in the ocn.setup.csh and ice.setup.csh scripts.
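For example, with the default setenv GRID T42_gx1v3, the two sed commands above yield:

#   ATM_GRID = LND_GRID = T42
#   OCN_GRID = ICE_GRID = gx1v3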

3.1.4 Run the setup script for each component

This section compiles and builds the CCSM component executables. In addition, many of the architecture dependent environment variables are set in this section.



echo -------------------------------------------------------------------------
echo  d. Prepare $GRID component models for execution
echo      - create execution directories for atm,cpl,lnd,ice,ocn
echo      - invoke component model setup scripts found in $SCRIPTS
echo -------------------------------------------------------------------------

#--- create working directories
foreach DIR ( $EXEROOT $LIBROOT $INCROOT $OBJROOT $LOGDIR)
  if !(-d $DIR) mkdir -p $DIR
end

#--- run machine dependent commands (i.e. modules on SGI).
if (-f $TOOLS/modules.$OS.$MACH) source $TOOLS/modules.$OS.$MACH  || exit 1


The foreach DIR loop creates the directories in the DIR list.

$TOOLS/modules.$OS.$MACH contains site-specific module and environment settings. The modules available at various sites change with time, so if problems are encountered when compiling or linking in message passing libraries, the module settings should be examined.



#--- create env variables for use in components
foreach n (1 2 3 4 5)
  set model = $MODELS[$n]
  setenv ${model}_dir $EXEROOT/$model; setenv ${model}_setup $SETUPS[$n]
  setenv ${model}_in  $model.stdin   ; setenv ${model}_out $model.log.$LID
end

#--- get restart files
#$TOOLS/ccsm_getrestart


This loop pre-defines, for each component, environment variables for the run directory, the setup script name, and the standard input and standard output file names.
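For example, for n = 1 (the atm component), the loop defines the following variables (values assume the default environment set earlier in the script):

#   atm_dir   = $EXEROOT/atm     # run directory for the atmosphere
#   atm_setup = atm              # setup script name (atm.setup.csh)
#   atm_in    = atm.stdin        # standard input (namelist) file name
#   atm_out   = atm.log.$LID     # standard output (log) file name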



echo -------------------------------------------------------------------------
echo  d1. Build Earth System Modeling Framework   http://www.esmf.scd.ucar.edu
echo -------------------------------------------------------------------------

setenv EXEDIR $EXEROOT/esmf     ; if !(-d $EXEDIR) mkdir -p $EXEDIR
cd $EXEDIR
echo `date` $EXEDIR/esmf.log.$LID | tee esmf.log.$LID
$SCRIPTS/esmf.setup.csh >>& esmf.log.$LID || exit 1


Various components of the CCSM use the Earth System Modeling Framework (ESMF) utilities. In this step the ESMF package is built, with the output from the build process recorded in a log file. The ESMF documentation can be accessed from the URL shown above.



echo -------------------------------------------------------------------------
echo  d2. Execute component setup.csh scripts, build models
echo -------------------------------------------------------------------------
foreach n (1 2 3 4 5)
#--- activate stdin/stdout redirect work-around ---
#--- setup env variables for components and grids ---
  setenv MODEL  $MODELS[$n]         ; setenv SETUP  $SETUPS[$n]
  setenv NTHRD  $NTHRDS[$n]         ; setenv NTASK  $NTASKS[$n]
  setenv OBJDIR $OBJROOT/$MODEL/obj ; if !(-d $OBJDIR) mkdir -p $OBJDIR
  setenv EXEDIR $EXEROOT/$MODEL     ; if !(-d $EXEDIR) mkdir -p $EXEDIR
  setenv THREAD FALSE               ; if ($NTHRD > 1) setenv THREAD TRUE


The foreach loop cycles through the five-element arrays defined above. Each cycle through the loop will run the setup script for the MODELS component corresponding to the value of n (FORTRAN ordering).

First, a number of environment variables are defined identifying the specific component to be built ($MODEL) and the setup script name ($SETUP) that will be run to build the component. Next, the number of OMP threads ($NTHRD) and number of MPI tasks associated with that component ($NTASK) are resolved.

Names for the model execution ($EXEDIR) and object ($OBJDIR) directories are defined and these directories are created.

Finally, a true/false flag for OMP threading ($THREAD) is set based on the value of $NTHRD.



  cd   $EXEDIR
  echo `date` $EXEDIR/$MODEL.log.$LID | tee $MODEL.log.$LID
  $SCRIPTS/$SETUP.setup.csh             >>& $MODEL.log.$LID
  if ($status != 0) then
    echo  ERROR: $MODEL.setup.csh failed, see $MODEL.log.$LID
    echo  ERROR: cat $cwd/$MODEL.log.$LID
    exit  99
  endif


In this step, the setup script is run for each CCSM component. In $EXEDIR, a log file of the build process is created. The component setup script writes its standard output into the log file. If the component setup script returns with a nonzero status, a diagnostic message is printed and the test.a1.run script exits with a nonzero return code.



#--- create model directories and processor counts for each platform
#--- ($EXEROOT/all for SGI, poe.cmdfile for AIX, prun.cmdfile for OSF1)
  if ($n == 1) then
    rm -rf  $EXEROOT/poe.cmdfile $EXEROOT/all; mkdir -p  $EXEROOT/all
    echo "#! /bin/csh -f" >! $EXEROOT/prun.cmdfile
    @ PROC = 0
    if ($BLDTYPE == 'true') then
      cd $EXEROOT
      tar -cf $EXEROOT/$CASE.exe.$LID.tar $MODEL/$MODEL
    endif
  else
    if ($BLDTYPE == 'true') then
      cd $EXEROOT
      tar -rf $EXEROOT/$CASE.exe.$LID.tar $MODEL/$MODEL
    endif
  endif


The first time through the component build loop, a number of utility files, directories and counters are initialized. To keep the run script simple, all of these items are created whether they are needed or not.

On the SGI, mpirun requires that all of the model component executables exist in the same directory. If this directory, $EXEROOT/all, exists, it is first deleted, then recreated.

On the Compaq systems, the file $EXEROOT/prun.cmdfile will be created listing the model executables. Here, the first line of this file is created. The rest of the file will be made further on in this script.

The variable $PROC is initialized. $PROC will sum to the total number of processors requested.

If $BLDTYPE == 'true', the component executable is added to the executable tar file. The executable tar file holds copies of the component executables that are used for this run and for any subsequent runs where $BLDTYPE == 'false'.



  @ M = 0
  while ( $M < $NTASK )
    echo "env OMP_NUM_THREADS=$NTHRD $MODEL/$MODEL"   >>! $EXEROOT/poe.cmdfile
    echo "if ("\$"RMS_RANK == $PROC) ./$MODEL/$MODEL" >>! $EXEROOT/prun.cmdfile;
    @ M++ ; @ PROC++
  end
  ln -s $EXEROOT/$MODEL/$MODEL  $EXEROOT/all/.
end


Some machine-specific bookkeeping is attended to here. The IBM SP and the Compaq machines require text files identifying the names of the component executables to be run under MPI, so one line per MPI task is appended to those files. The counters for the number of tasks ($M) and the processors ($PROC) are incremented. The SGI O2K requires that all of the executables be run from a single directory, hence all of the model executables are linked into the all/ directory. Again, to keep the run script simple, all of these items are created whether they are needed or not.
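For illustration, with the default T42_gx1v3 layout (68 MPI tasks in total) the while loop above writes one line per MPI task into poe.cmdfile. A sketch of its first lines, shown for illustration only:

env OMP_NUM_THREADS=4 atm/atm
env OMP_NUM_THREADS=4 atm/atm
   [ ... six more atm lines, then 3 lnd, 16 ice, 40 ocn and 1 cpl line ... ]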



#--- save the latest executables to the active exe.tar
if ($BLDTYPE == 'true') then
  rm -f $EXEROOT/$CASE.exe.tar
  cp $EXEROOT/$CASE.exe.$LID.tar $EXEROOT/$CASE.exe.tar
endif


Finally, if $BLDTYPE == 'true', the executable tar file for this build is made to be the default set of executables for this CASE.

3.1.5 Run the CCSM integration

The various supported platforms each have different environment settings that need to be specified to achieve optimum performance. Once these are set, the model is executed.



echo -------------------------------------------------------------------------
echo  e. Setup hardware specific env variables
echo -------------------------------------------------------------------------

cd $EXEROOT
chmod 755 $EXEROOT/prun.cmdfile

if ( $OS == 'AIX') then
 limit datasize  unlimited    ; setenv XLSMPOPTS "stack=86000000"
 setenv MP_EUILIB us          ; setenv MP_RMPOOL 1
 setenv MP_NODES $PROC        ; setenv MP_PROCS $PROC
 setenv MP_PGMMODEL mpmd      ; setenv MP_CMDFILE       poe.cmdfile
 setenv MP_STDOUTMODE ordered ; setenv MP_SHARED_MEMORY yes
 setenv MP_EAGER_LIMIT 65536  ; setenv MP_INFOLEVEL      6
else if ( $OS == 'IRIX64') then
 setenv TRAP_FPE "UNDERFL=FLUSH_ZERO; OVERFL=ABORT,TRACE; DIVZERO=ABORT,TRACE"
 setenv OMP_DYNAMIC FALSE    ; setenv MPC_GANG OFF; setenv _DSM_WAIT SPIN
 setenv _DSM_VERBOSE         ; setenv _DSM_PLACEMENT ROUND_ROBIN
endif
env | egrep '(MP_|LOADL|XLS|FPE|DSM|OMP|MPC)' # document above env vars


Settings for the IBM SP are:

limit datasize unlimited maximizes the virtual memory allocation.
XLSMPOPTS "stack=86000000" reserves 86 Mbytes of stack space for each thread.
MP_EUILIB us requests User Space protocol for communications. This boosts performance for production runs by prohibiting other users from using the nodes where the model is running.
MP_RMPOOL 1 tells POE to allocate nodes from resource manager pool 1.
MP_NODES $PROC sets the number of nodes over which the parallel tasks will be run.
MP_PROCS $PROC is the total number of processes for the model.
MP_PGMMODEL mpmd identifies the programming model to be MPMD (Multiple Processes, Multiple Datastreams).
MP_CMDFILE poe.cmdfile names the text file specifying the names of the component executables to be run under MPI.
MP_STDOUTMODE ordered asks that standard out be buffered and flushed in the order of the tasks that wrote to standard out.
MP_SHARED_MEMORY yes requests that all tasks running on the same node use shared memory for message passing on that node rather than communicating across the switch.
MP_EAGER_LIMIT 65536 maximizes the message size of the receive data buffer for optimal performance.
MP_INFOLEVEL 6 requests that all available informational messages be written to standard output.


Environment settings for the SGI Origin 2000 are:

TRAP_FPE "UNDERFL=FLUSH_ZERO; OVERFL=ABORT,TRACE; DIVZERO=ABORT,TRACE" traps floating point errors by flushing to zero values that become too small to represent and by aborting with a traceback on overflow or divide-by-zero.
OMP_DYNAMIC FALSE forbids the use of dynamic scheduling for OpenMP threads.
MPC_GANG OFF disallows ``gang scheduling'', for higher performance in combination with the OMP_DYNAMIC FALSE setting.
_DSM_WAIT SPIN instructs each thread to wait in a loop without giving up the CPU until a synchronization event such as a lock or barrier succeeds.
_DSM_VERBOSE requests that all available informational messages be written to standard output.
_DSM_PLACEMENT ROUND_ROBIN specifies round-robin memory allocation for stack, data, and text.



echo -------------------------------------------------------------------------
echo  f. Run the model, execute models simultaneously allocating CPUs
echo -------------------------------------------------------------------------

#exit                 # UNCOMMENT to EXIT HERE, BUILD ONLY

echo "`date` -- CSM EXECUTION BEGINS HERE"
if ( $OS == 'AIX')    timex poe
if ( $OS == 'OSF1')   prun  -n $PROC csh -c prun.cmdfile
if ( $OS == 'IRIX64') mpirun -v -d $EXEROOT/all                     \
         -np $NTASKS[1] "env OMP_NUM_THREADS=$NTHRDS[1] $MODELS[1]" : \
         -np $NTASKS[2] "env OMP_NUM_THREADS=$NTHRDS[2] $MODELS[2]" : \
         -np $NTASKS[3] "env OMP_NUM_THREADS=$NTHRDS[3] $MODELS[3]" : \
         -np $NTASKS[4] "env OMP_NUM_THREADS=$NTHRDS[4] $MODELS[4]" : \
         -np $NTASKS[5] "env OMP_NUM_THREADS=$NTHRDS[5] $MODELS[5]"   &
wait
echo "`date` -- CSM EXECUTION HAS FINISHED"


Finally, the CCSM is run. On the IBM SP ($OS == 'AIX'), the Parallel Operating Environment (POE) is invoked; the model configuration to run is read from the file specified by the $MP_CMDFILE environment variable. On the Compaq ($OS == 'OSF1'), the prun command executes the commands listed in prun.cmdfile. On the SGI ($OS == 'IRIX64'), mpirun is called with the parallel tasking and threading information specified for each component.

The wait command suspends the execution of the test.a1.run script until all background processes are complete.

3.1.6 Archive and harvest

In this step, the printed logs are archived and the output datasets are harvested.



echo -------------------------------------------------------------------------
echo  g. save model output stdout & stderr to $LOGDIR
echo     archive and submit harvester
echo -------------------------------------------------------------------------

cd $EXEROOT
if (! -d $LOGDIR) mkdir -p $LOGDIR
gzip  */*.$LID
if ($LOGDIR != "" ) cp -p */*.$LID.* $LOGDIR
#$SCRIPTS/ccsm_archive


Once the model has finished executing, the model standard output files are compressed and copied to $LOGDIR. If desired, the c-shell comment symbol, #, can be removed from the last line to run the ccsm_archive tool script to archive the log file.



if ($OS == 'AIX')  then
  set num = `llq | grep -i $LOGNAME | grep -i share | wc -l`
  cd $SCRIPTS
#  if ($num < 1) llsubmit $CASE.har
endif
#if ($OS != 'AIX')      qsub $SCRIPTS/$CASE.har


A data harvester script ($SCRIPTS/$CASE.har) is used to transfer CCSM output data from the execution directories to a long-term storage device. Separating the harvesting function from the model execution allows model execution to continue even if the connections to the storage device are temporarily interrupted. By default, the harvester is turned off and all of the output data will accumulate in the components' execution directories. Removing the c-shell comment symbol, #, will submit the harvester script for this case to the batch queue.

3.1.7 Resubmit



echo -------------------------------------------------------------------------
echo  h. Resubmit another run script $CASE.run
echo -------------------------------------------------------------------------

set echo
if ( -e  $SCRIPTS/RESUBMIT ) then
  @ N = `cat $SCRIPTS/RESUBMIT`
  if ( $N > 0 ) then
    echo "Note: resubmitting run script $CASE.run"
    @ N--
    echo $N >! $SCRIPTS/RESUBMIT
    cd $SCRIPTS
    if ($OS == 'AIX')  llsubmit $CASE.run
    if ($OS != 'AIX')      qsub $CASE.run
  endif
endif

echo =========================================================================
echo  i. end of nqs shell script
echo =========================================================================


The test.a1.run script ends with a test to see whether the model should be automatically resubmitted to the batch queue. If the file $SCRIPTS/RESUBMIT exists and contains a number greater than 0, the number in the file is decremented and rewritten, and the test.a1.run script is resubmitted to the batch queue.

WARNING: if $CASE.run has a RUNTYPE setting of startup, hybrid or branch, the model will uselessly repeat the run that was just made. To avoid this, leave the counter in the RESUBMIT file at 0 until full production has begun using RUNTYPE ``continue''.
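For example, to chain additional segments of a production run, one can seed the RESUBMIT file with the number of extra submissions desired; a minimal sketch, assuming the case has already been switched to RUNTYPE ``continue'':

cd $SCRIPTS
echo 5 >! RESUBMIT        # test.a1.run will resubmit itself five more times
# in $CASE.run, for production:
#   setenv RUNTYPE  continue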

3.2 Sample Component Setup Script: cpl.setup.csh

The CCSM is designed to allow a new component model to easily replace an existing component in the system. To encapsulate the different build procedures required by different component models, each CCSM component has its own setup script.

In this section, the coupler setup script is used as an example of a typical component setup script. The component setup scripts, $SCRIPTS/*.setup.csh, are called by $SCRIPTS/test.a1.run. Each component setup script prepares the component for execution by defining the run environment, positioning any restart or input datasets and building the component.

If the setup script is unable to complete any of these tasks, it will abort with a non-zero error status. The test.a1.run script checks the error status and will halt if an error is detected.

3.2.1 Document the setup script



#! /bin/csh -f
#===============================================================================
# CVS $Id: sample.setup.csh.tex,v 1.4 2002/06/18 21:25:52 southern Exp $
# CVS Source:  $
# CVS $Name:  $
#===============================================================================
# cpl.setup.csh: Preparing a CSM coupler, cpl5, for execution
# 
# (a) set environment variables, preposition input data files
# (b) create the namelist input file
# (c) build this component executable
#
# For help, see:  http://www.cesm.ucar.edu/models/cpl
#===============================================================================

cat $0;$TOOLS/ccsm_checkenvs || exit -1            # cat this file, check envs


The first section typically documents the setup script.

The first line of this section identifies this as a C-shell script. The "-f" option prevents the user's personalized $HOME/.cshrc file from being executed to avoid introducing aliases that could adversely affect the operation of this script.

The CVS lines document the revision control version of this script.

The echo lines document the purpose of each section of the script. The output from the echo commands will appear in the component log files.

The cat command combines two functions on one line. The "cat $0" command prints a copy of the entire setup script into the output log file in order to document the exact options set by this script. Then $TOOLS/ccsm_checkenvs writes the environment variables that have been set by test.a1.run into the same output log file. If any of the required environment variables are not set, the setup script will exit with an error status of -1.

3.2.2 Set local component script variables



echo ---------------------------------------------------------------------------
echo  a. set environment variables, preposition input data files             
echo ---------------------------------------------------------------------------

if ($GRID =~  T21*   ) set ATM =  (   T21  64  32 ) # name, x dimension, y dimension
if ($GRID =~  T31*   ) set ATM =  (   T31  96  48 )
if ($GRID =~  T42*   ) set ATM =  (   T42 128  64 )
if ($GRID =~  T62*   ) set ATM =  (   T62 192  94 )
if ($GRID =~  T85*   ) set ATM =  (   T85 256 128 )
if ($GRID =~  *gx3   ) set OCN =  (   gx3 100 116 )
if ($GRID =~  *gx1v3 ) set OCN =  ( gx1v3 320 384 )

if (!($?ATM) || !($?OCN)) echo  'unknown GRID = ' $GRID 
if (!($?ATM) || !($?OCN)) exit -1


Here the CCSM resolution variable is parsed into the atmosphere and ocean grid names and the number of points in the longitude and latitude directions are defined. If this is unsuccessful, the script aborts with a nonzero return status.

3.2.3 Define and position the input datasets



\rm -f map_*2*.nc

if ($GRID == T42_gx1v3) then
 set MAP_A2OF_FILE = map_T42_to_gx1v3_aave_da_010709.nc
 set MAP_A2OS_FILE = map_T42_to_gx1v3_bilin_da_010710.nc
 set MAP_O2AF_FILE = map_gx1v3_to_T42_aave_da_010709.nc
 set MAP_R2O_FILE  = map_r05_to_gx1v3_roff_smooth_010718.nc
else if ($GRID == T31_gx3) then

   [ .. many lines deleted for brevity ... ]

else if ($GRID == T85_gx1v3) then
 set MAP_A2OF_FILE = map_T85_to_gx1v3_aave_da_020405.nc
 set MAP_A2OS_FILE = map_T85_to_gx1v3_bilin_da_020405.nc
 set MAP_O2AF_FILE = map_gx1v3_to_T85_aave_da_020405.nc
 set MAP_R2O_FILE  = map_r05_to_gx1v3_roff_smooth_010718.nc
else
 echo "Using unsupported configuration, no mapping files set"
 exit 1
endif
$TOOLS/ccsm_getinput cpl/cpl5/$MAP_A2OF_FILE $MAP_A2OF_FILE || exit 1
$TOOLS/ccsm_getinput cpl/cpl5/$MAP_A2OS_FILE $MAP_A2OS_FILE || exit 1
$TOOLS/ccsm_getinput cpl/cpl5/$MAP_O2AF_FILE $MAP_O2AF_FILE || exit 1
$TOOLS/ccsm_getinput cpl/cpl5/$MAP_R2O_FILE  $MAP_R2O_FILE  || exit 1


This section controls the acquisition of the mapping datasets needed for the coupler. In general, each component requires a unique set of input data files. All input datasets are uniquely named by a description and a six-digit number documenting the creation date (format: yymmdd) of the file. While hard-wiring the filenames restricts the degree of automation, it ensures that exactly the data the user requests are input into the model.

A few of the tools from the $TOOLS directory make their first appearance here. The utility $TOOLS/ccsm_getinput will attempt to copy datasets from the input data directory into the current working directory.

If a copy of the data file is unavailable, the script will abort.



set RUN_TYPE = $RUNTYPE
if ($RUNTYPE == startup) set RUN_TYPE = initial
if ($RUNTYPE == hybrid)  set RUN_TYPE = initial

set BASEDATE_NUM = `echo $BASEDATE | sed -e 's/-//g'`
if ($RUNTYPE == branch) then
 set REST_BFILE = $REFCASE.cpl5.r.${REFDATE}-00000
 echo  set REST_BFILE = $REST_BFILE
 $TOOLS/ccsm_getfile $REFCASE/$MODEL/rest/${REST_BFILE}  || exit 99
 set BASEDATE_NUM = `echo $REFDATE | sed -e 's/-//g'`
else
 set REST_BFILE  = 'null'
endif


A number of common variables are defined in the $SCRIPTS/test.a1.run. Individual CCSM components often need to translate the common variables into different names or formats that the component can read. Here, $RUNTYPE, $BASEDATE, $REFCASE and $REFDATE are evaluated for use by the coupler. The coupler recognizes both ``startup'' and ``hybrid'' runtypes as coupler ``initial'' runs. The coupler needs a different date format ($BASEDATE_NUM) than supplied by $BASEDATE. For ``branch'' runs, the coupler uses $REFCASE and $REFDATE to generate the branch filename and date.
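For example, with the default settings, BASEDATE = 0001-01-01 becomes BASEDATE_NUM = 00010101 after the sed command removes the dashes; for a branch run, REFDATE = 0001-01-06 would instead yield 00010106.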

3.2.4 Write the input namelist



echo ---------------------------------------------------------------------------
echo  b. create the namelist input file                                      
echo ---------------------------------------------------------------------------

cat >! $MODEL.stdin << EOF
  &inparm
  case_name   = '$CASE '
  case_desc   = '$CASE $CASESTR '
  rest_type   = '$RUN_TYPE '
  rest_date   =  $BASEDATE_NUM
  rest_bfile  = '${REST_BFILE} '
  rest_pfile  = '$SCRIPTS/rpointer.$MODEL'
  map_a2of_fn = '$MAP_A2OF_FILE'
  map_a2os_fn = '$MAP_A2OS_FILE'
  map_o2af_fn = '$MAP_O2AF_FILE'
  map_r2o_fn  = '$MAP_R2O_FILE'
  rest_freq   = 'monthly'
  rest_n      = 3
  diag_freq   = 'ndays'
  diag_n      = 1     
  stop_option = 'ndays'
  stop_n      = 5
  hist_freq   = 'monthly'
  hist_n      = 1
  info_bcheck = 2
  orb_year    = 1990
  flx_epbal   = 'off'
  flx_albav   = 0
  mss_dir     = '$MSSDIR/$MODEL/ '
  mss_rtpd    =  $MSSRPD
  mss_pass    = '$MSSPWD'
  mss_rmlf    = 0
  nx_a = $ATM[2] , ny_a = $ATM[3],  nx_l = $ATM[2] , ny_l = $ATM[3]
  nx_o = $OCN[2] , ny_o = $OCN[3],  nx_i = $OCN[2] , ny_i = $OCN[3]
  /
EOF
echo o contents of $MODEL.stdin: ; cat $MODEL.stdin ; echo ' '


This section constructs the input namelist that is used to control runtime operation of the component. In the namelist input file, a wide range of predefined parameters are set to control the behavior of the component. The namelist input file, in this case called cpl.stdin, is a text file that is read by the component model. Namelist input for components consists of text strings enclosed in quotes, integer and real numerical values and logicals.

The "cat" command uses the c-shell here-document option to create the file $EXEDIR/cpl.stdin with all the settings being evaluated to the current values of the specified environment variables.

&inparm is the namelist group name, which matches the groupname defined within the coupler.
case_name = '$CASE ' (string) sets a unique text string (16-characters or less) that is used to identify this run. The CASE variable is set in $SCRIPTS/test.a1.run and is used extensively in the CCSM as an identifier. Since CASE will be used in file and directory names, it should only contain standard UNIX filenaming characters such as letters, numbers, underscores, dashes, commas or periods.
case_desc = '$CASE $CASESTR ' (string) provides 80 characters to further describe this run. This description appears in the output logs and in the header data for the output data sets. CASESTR is set in the $SCRIPTS/test.a1.run script.
rest_type = '$RUN_TYPE ' (string) specifies the state in which the coupler is to begin the run. The rest_type settings initial, branch, and continue correspond to the CCSM RUNTYPE values startup or hybrid, branch, and continue.
rest_date = $BASEDATE_NUM (integer) is the initial date of the simulation. This variable is ignored on continuation or branch runs.
rest_bfile = '${REST_BFILE}' (string) specifies the branch file to use when starting a branch run. This is ignored unless rest_type is set to 'branch'.
rest_pfile = '$SCRIPTS/rpointer.$MODEL' (string) is the complete filepath and filename of the restart "pointer file" used for continuation runs.
map_a2of_fn = '$MAP_A2OF_FILE' (string) is the filename of the map for atmosphere-to-ocean flux fields.
map_a2os_fn = '$MAP_A2OS_FILE' (string) is the filename of the map for atmosphere-to-ocean state fields.
map_o2af_fn = '$MAP_O2AF_FILE' (string) is the filename of the map for ocean-to-atmosphere flux fields.
map_r2o_fn = '$MAP_R2O_FILE' (string) is the filename of the map for land-runoff-to-ocean.
rest_freq = 'monthly' (string) instructs the coupler to have all the CCSM components write out restart files on the first day of every month.
rest_n = 3 (integer) sets the number of days between restart writes when rest_freq is set to 'ndays'. Since rest_freq is 'monthly', this setting is ignored.
diag_freq = 'ndays' (string) sets the frequency at which diagnostics are printed from the coupler. In this case, the setting ndays will use the number of days set by diag_n.
diag_n = 1 (integer) specifies the number of time periods for the time unit set in diag_freq.
stop_option = 'ndays' (string) controls the length of the CCSM run.
stop_n = 5 (integer) specifies that this integration will run for 5 days.
hist_freq = 'monthly' (string) controls the frequency of history file output.
hist_n = 1 (integer) is the option when hist_freq = 'ndays' or 'nstep'. Since hist_freq is 'monthly' this setting is ignored.
info_bcheck = 2 (integer) specifies that high-precision printed output is to be written every day into the coupler log file. This is used for verifying that two runs are exactly the same.
orb_year = 1990 (integer) is the calendar year that is used to determine the solar orbit and resulting solar angles.

flx_epbal = 'off' (string) turns off evaporation/precipitation balancing.
flx_albav = 0 (integer) turns off daily-average albedos.
mss_dir = '$MSSDIR/$MODEL/' (string) sets the pathname of the NCAR Mass Storage System (MSS) files.
mss_rtpd = $MSSRPD (integer) sets the retention period when using the NCAR MSS.
mss_pass = '$MSSPWD' (string) sets the write password when using the NCAR MSS.
mss_rmlf = 0 (integer) does not remove local files after mswrite.
nx_a (integer) is the longitude (x) dimension of the atmosphere grid.
ny_a (integer) is the latitude (y) dimension of the atmosphere grid.
nx_l (integer) is the longitude (x) dimension of the land grid.
ny_l (integer) is the latitude (y) dimension of the land grid.
nx_o (integer) is the longitude (x) dimension of the ocean grid.
ny_o (integer) is the latitude (y) dimension of the ocean grid.
nx_i (integer) is the longitude (x) dimension of the sea-ice grid.
ny_i (integer) is the latitude (y) dimension of the sea-ice grid.
/ marks the end of the inparm namelist group.
EOF marks the end of the here-document begun with the ``cat'' command.

Detailed information on the coupler namelist variables can be found in the coupler User's Guide.
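For example, to lengthen the test integration from 5 days to 31 days, the stop settings in the here-document above could be changed as follows (a hypothetical edit; the batch wall-clock limit in the run script may also need to be increased):

  stop_option = 'ndays'
  stop_n      = 31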

3.2.5 Build the component executable



echo ---------------------------------------------------------------------------
echo  c. Build an executable in $OBJDIR
echo ---------------------------------------------------------------------------

cd $OBJDIR

# Filepath: List of source code directories (in order of importance).
#--------------------------------------------------------------------

\cat >! Filepath << EOF
$SCRIPTS/src.$MODEL
$CSMCODE/cpl/cpl5
$CSMSHR
EOF

# run make
#--------------------------------------------------------------------

if ($BLDTYPE == 'true') then
  cc -o makdep $CSMBLD/makdep.c                              || exit 2
  gmake -j 6 VPFILE=Filepath MODEL=cpl5 EXEC=$EXEDIR/$MODEL \
      -f  $CSMBLD/Makefile MACFILE=$CSMBLD/Macros.$OS    || exit 2
else
  echo "BLDTYPE = $BLDTYPE"
endif


The CCSM uses the GNU make tool (also known as ``gmake'') to build the model executables. Each of the component setup scripts creates a list of source code directories from which to gather the source code for that component. This list is called Filepath and is used as the input to the gmake VPATH list. The Filepath file is written in each component's $OBJDIR directory.

The Filepath directories are listed in order of precedence. If a file is found in more than one of the directories listed in Filepath, the version found in the directory listed first is used to build the code. The first directory, $SCRIPTS/src.cpl, is typically used to hold modified coupler source code. If a directory in the Filepath list is empty or does not exist, no error results. In general, the directories $SCRIPTS/src.$MODEL can be used to store locally modified source code; each component script gives this directory top priority when searching for source code.
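For example, to build the coupler with a locally modified version of one of its source files, the modified copy can be placed in the override directory before the run script is submitted; the file name below is hypothetical:

mkdir -p $SCRIPTS/src.cpl
cp ~/my_mods/my_modified_routine.F90 $SCRIPTS/src.cpl/   # hypothetical modified source file
# gmake will now use this copy in preference to the one in $CSMCODE/cpl/cpl5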

First the makdep code is compiled. This utility program is called by the Makefile and checks for source code dependencies. This is done by seeing if any of the header or include files have been updated since the model was last built and ensures that the F90 modules are constructed in the proper order.

Once makdep is compiled, the GNU make program, gmake, is used to actually build the model. The -j 6 option uses 6 parallel processes to build the model. The -f $CSMBLD/Makefile option points to the generic CCSM Makefile, while MACFILE=$CSMBLD/Macros.$OS points to the machine-specific make options. MODEL identifies the component being built and VPFILE points to the Filepath list. Finally, the actual executable to be built is $EXEDIR/$MODEL.



# document the source code used, cleanup $EXEDIR/obj files
#--------------------------------------------------------------------

grep 'CVS' *.[hf]*                       
#gmake -f $CSMBLD/Makefile MACFILE=$CSMBLD/Macros.$OS mostlyclean

echo ' '
echo ===========================================================================
echo  End of setup shell script  `date` 
echo ===========================================================================


The final portion of the script documents the source code CVS tags and optionally cleans up the object files that were created.

At this point, control is returned to test.a1.run.

