The cesm_setup command does the following:
Creates the Macros file if it does not exist. Calling cesm_setup -clean does not remove this file.
Creates the files user_nl_xxx (where xxx denotes the set of components targeted for the specific case). As an example, for a B_ compset, xxx would denote [cam,clm,rtm,cice,pop2,cpl]. In CESM1.1, these files are where all user component namelist modifications are now made. cesm_setup -clean does not remove these files.
Creates the file $CASEROOT/$CASE.run, which runs the CESM model and performs short-term archiving of output data (see running CESM). This script also contains the necessary batch directives to run the model on the required machine for the requested PE layout.
Any user modifications to env_mach_pes.xml must be done before cesm_setup is invoked. In the simplest case, cesm_setup can be run without modifying this file and the default settings will then be used. The cesm_setup command must be run in the $CASEROOT directory.
After cesm_setup -clean is run, the $CASEROOT directory will appear as if create_newcase had just been run, with the exception that previously created Macros and user_nl_xxx files will not be touched and local modifications to the env_*.xml files will be preserved. After further modifications are made to env_mach_pes.xml, cesm_setup must be rerun before you can build and run the model.
If env_mach_pes.xml variables need to be changed after cesm_setup has been called, then cesm_setup -clean must be run first, followed by cesm_setup.
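As an illustration of this ordering, a minimal first-time setup might look like the sketch below. It assumes the CESM1 xmlchange utility in $CASEROOT with the -file/-id/-val syntax; the task and thread values are placeholders, not recommendations.
# Run from the case directory created by create_newcase
cd $CASEROOT
# Optional: adjust the PE layout *before* invoking cesm_setup (placeholder values)
./xmlchange -file env_mach_pes.xml -id NTASKS_OCN -val 128
./xmlchange -file env_mach_pes.xml -id NTHRDS_OCN -val 1
# Generate the Macros, user_nl_xxx, and $CASE.run files
./cesm_setup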
The following summarizes the new directories and files that are created by cesm_setup. For more information about the files in the case directory, see the Section called BASICS: What are the directories and files in my case directory? in Chapter 7.
Table 2-2. Result of invoking cesm_setup
File or Directory | Description |
---|---|
Macros | File containing machine-specific makefile directives for your target platform/compiler. This is only created the first time that cesm_setup is called. Calling cesm_setup -clean will not remove the Macros file once it has been created. |
user_nl_xxx[_NNNN] files | Files where all user modifications to component namelists are made. xxx denotes the set of components targeted for the specific case. NNNN goes from 0001 to the number of instances of that component (see the multiple instance discussion below). For example, for a B_ compset, xxx would denote [cam,clm,rtm,cice,pop2,cpl]. For a case with only 1 instance of each component (the default), NNNN will not appear in the user_nl file names. A user_nl file of a given name will only be created once. Calling cesm_setup -clean will not remove any user_nl files. Changing the number of instances in env_mach_pes.xml will only cause new user_nl files to be added to $CASEROOT. |
$CASE.run | File containing the necessary batch directives to run the model on the required machine for the requested PE layout. Runs the CESM model and performs short-term archiving of output data (see running CESM). |
CaseDocs/ | Directory that contains all the component namelists for the run. This is for reference only and files in this directory SHOULD NOT BE EDITED since they will be overwritten at build time and run time. |
env_derived | File containing environmental variables derived from other settings. Should not be modified by the user. |
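For illustration only, the sketch below appends a setting to the CAM user_nl file; the variable name and value are hypothetical placeholders, since the namelist variables that are actually recognized depend on the component.
# Hypothetical sketch: add a namelist setting to user_nl_cam in $CASEROOT.
# "some_namelist_var" is a placeholder, not a real CAM namelist variable.
cat >> user_nl_cam << 'EOF'
 some_namelist_var = 42
EOF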
env_mach_pes.xml variables determine the number of processors for each component, the number of instances of each component, and the layout of the components across the hardware processors. Optimizing the throughput and efficiency of a CESM experiment often involves customizing the processor (PE) layout for load balancing. CESM1 has significant flexibility with respect to the layout of components across different hardware processors. In general, the CESM components -- atm, lnd, ocn, ice, glc, rof, and cpl -- can run on overlapping or mutually unique processors. While each component is associated with a unique MPI communicator, the driver runs on the union of all processors and controls the sequencing and hardware partitioning. The component processor layout is determined by three settings: the number of MPI tasks, the number of OpenMP threads per task, and the root MPI processor number from the global set.
For example, the following env_mach_pes.xml settings
<entry id="NTASKS_OCN" value="128" />
<entry id="NTHRDS_OCN" value="1" />
<entry id="ROOTPE_OCN" value="0" />
would cause the ocean component to run on 128 hardware processors with 128 MPI tasks using one thread per task starting from global MPI task 0 (zero).
In this next example:
<entry id="NTASKS_ATM" value="16" />
<entry id="NTHRDS_ATM" value="4" />
<entry id="ROOTPE_ATM" value="32" />
the atmosphere component will run on 64 hardware processors using 16 MPI tasks and 4 threads per task starting at global MPI task 32. There are NTASKS, NTHRDS, and ROOTPE input variables for every component in env_mach_pes.xml. There are some important things to note.
NTASKS must be greater than or equal to 1 (one) even for inactive (stub) components.
NTHRDS must be greater than or equal to 1 (one). If NTHRDS is set to 1, threading parallelization is generally off for that component. NTHRDS should never be set to zero.
The total number of hardware processors allocated to a component is NTASKS * NTHRDS.
The coupler processor inputs specify the pes used for coupler computations such as mapping, merging, diagnostics, and flux calculation. This is distinct from the driver, which always automatically runs on the union of all processors to manage model concurrency and sequencing.
The root processor is set relative to the MPI global communicator, not the hardware processor counts. An example of this is below.
The layout of components on processors has no impact on the science. The scientific sequencing is hardwired into the driver, and changing processor layouts does not change intrinsic coupling lags or coupling sequencing. ONE IMPORTANT POINT is that for a fully active configuration, the atmosphere component is hardwired in the driver to never run concurrently with the land or ice component. Performance improvements associated with processor layout concurrency are therefore constrained in this case; in particular, there is never a performance reason not to overlap the atmosphere component with the land and ice components. Beyond that constraint, the land, ice, coupler and ocean models can run concurrently, and the ocean model can also run concurrently with the atmosphere model.
If all components have identical NTASKS, NTHRDS, and ROOTPE set, all components will run sequentially on the same hardware processors.
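As a concrete sketch of that fully sequential case, the loop below gives every component named above an identical layout (the component IDs follow the list earlier in this section, the xmlchange syntax is assumed from the CESM1 scripts, and 64 tasks with 1 thread is just a placeholder):
# Assumed CESM1 xmlchange syntax; run in $CASEROOT before cesm_setup.
# All components then share MPI tasks 0-63 and run sequentially on the same pes.
for comp in ATM LND ICE OCN GLC ROF CPL; do
    ./xmlchange -file env_mach_pes.xml -id NTASKS_${comp} -val 64
    ./xmlchange -file env_mach_pes.xml -id NTHRDS_${comp} -val 1
    ./xmlchange -file env_mach_pes.xml -id ROOTPE_${comp} -val 0
done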
The root processor is set relative to the MPI global communicator, not the hardware processor counts. For instance, in the following example:
<entry id="NTASKS_ATM" value="16" />
<entry id="NTHRDS_ATM" value="4" />
<entry id="ROOTPE_ATM" value="0" />
<entry id="NTASKS_OCN" value="64" />
<entry id="NTHRDS_OCN" value="1" />
<entry id="ROOTPE_OCN" value="16" />
the atmosphere and ocean are running concurrently, each on 64 processors with the atmosphere running on MPI tasks 0-15 and the ocean running on MPI tasks 16-79. The first 16 tasks are each threaded 4 ways for the atmosphere. The batch submission script ($CASE.run) should automatically request 128 hardware processors, and the first 16 MPI tasks will be laid out on the first 64 hardware processors with a stride of 4. The next 64 MPI tasks will be laid out on the second set of 64 hardware processors.
If you set ROOTPE_OCN=64 in the preceding example, then a total of 176 processors would have been requested and the atmosphere would have been laid out on the first 64 hardware processors in 16x4 fashion, and the ocean model would have been laid out on hardware processors 113-176. Hardware processors 65-112 would have been allocated but completely idle.
Note: env_mach_pes.xml cannot be modified after "./cesm_setup" has been invoked without first invoking "cesm_setup -clean". For an example of changing pes, see the Section called BASICS: How do I change processor counts and component layouts on processors? in Chapter 7.
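A sketch of that clean/modify/re-setup sequence is shown below, again assuming the CESM1 xmlchange syntax; the ROOTPE_OCN value is a placeholder taken from the example above.
cd $CASEROOT
./cesm_setup -clean
# Change the PE layout only after -clean has been run (placeholder value)
./xmlchange -file env_mach_pes.xml -id ROOTPE_OCN -val 16
# Regenerate $CASE.run and related files for the new layout
./cesm_setup
# The model must then be rebuilt before running with the new layout.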