Chapter 2. Using the CLM tools to create your own input datasets

Table of Contents
Common environment variables and options used in building the FORTRAN tools
General information on running the FORTRAN tools
Using NCL scripts
The File Creation Process
Using the cprnc tool to compare two history files
Using interpinic to interpolate initial conditions to different resolutions
Creating an output SCRIP grid file at a resolution to run the model on
Creating mapping files that mksurfdata_map will use
Creating a domain file for CLM and DATM
Using mksurfdata_map to create surface datasets from grid datasets
Converting unstructured grid output to gridded datasets for post processing
How to Customize Datasets for particular Observational Sites
Conclusion of tools description

There are several tools provided with CLM that allow you to create your own input datasets at resolutions you choose, to interpolate initial conditions to a different resolution, or to compare CLM history files between different cases. The tools are all available in the models/lnd/clm/tools directory. Most of the tools are FORTRAN stand-alone programs in their own directory, but there is also a suite of NCL scripts in the shared/ncl_scripts directory, and some of the tools are scripts that may also call the ESMF regridding program. Some of the NCL scripts are very specialized and not meant for general use, and we won't document them here. They are still documented in the scripts themselves and in the README file in the tools directory.

The tools are divided into three directories for three categories: clm4_0, clm4_5, and shared. The first two hold tools that are designed to work with the CLM4.0 or the CLM4.5 version of the model, respectively. The last holds shared utilities that can be used by either version, or that have a "-phys" option so you can specify which version you want to use.

The generally important scripts and programs are as follows.

  1. tools/cprnc (relative to the top level directory) to compare NetCDF files with a time axis.

  2. shared/mkmapgrids to create SCRIP grid data files from old CLM format grid files that can then be used to create new CLM datasets (deprecated). There is also an NCL script (shared/mkmapgrids/mkscripgrid.ncl) to create SCRIP grid files for regular latitude/longitude grids.

  3. shared/mkmapdata to create SCRIP mapping data files from SCRIP grid files (uses ESMF).

  4. shared/gen_domain to create a domain file for datm from a mapping file. The domain file is then used by BOTH datm AND CLM to define the grid and land-mask.

  5. mksurfdata_map to create surface datasets from grid datasets (clm4_0 and clm4_5 versions).

  6. interpinic to interpolate initial condition files (clm4_0 and clm4_5 versions).

  7. shared/mkprocdata_map to interpolate output on unstructured grids (such as the CAM HOMME dy-core "ne" grids like ne30np4) into a 2D regular lat/long grid format that can be plotted easily. Can be used by either clm4_0 or clm4_5.

In the sections to come we will describe in detail how to use each of these tools in turn. First, however, we will discuss the common environment variables and options that are used by all of the FORTRAN tools. Second, we go over the outline of the entire file creation process for all input files needed by CLM for a new resolution; then we turn to each tool. In the last section we will discuss how to customize files for particular observational sites.

Common environment variables and options used in building the FORTRAN tools

The FORTRAN tools all have similar makefiles, and similar options for building. All of the Makefiles use GNU Make extensions and thus require that you use GNU make to use them. They also auto-detect the type of platform you are on, using "uname -s", and set the compiler, compiler flags and such accordingly. There are also environment variables that can be set for things that must be customized. All the tools use NetCDF and hence require the path to the NetCDF libraries and include files. On some platforms (such as Linux) multiple compilers can be used, and hence there are environment variables that can be set to change the FORTRAN and/or "C" compilers used. The tools other than cprnc also allow finer control, by letting the user add compiler flags of their choice for both FORTRAN and "C", pick the compiler and linker, and add linker options. Finally, the tools other than cprnc allow you to control optimization with the OPT flag (off by default for mkmapgrids, on by default for the mksurfdata_map, mkprocdata_map and interpinic programs) so that the tool will run faster. To get even faster performance, the interpinic program also allows you to use the SMP option to turn on shared memory parallelism across multiple processors. When SMP=TRUE you set the number of threads used by the program with the OMP_NUM_THREADS environment variable.

Options used by all: cprnc, interpinic, and mksurfdata_map

LIB_NETCDF -- sets the location of the NetCDF library.
INC_NETCDF -- sets the location of the NetCDF include files.
USER_FC -- sets the name of the FORTRAN compiler.

Options used by: interpinic, mkprocdata_map, mkmapgrids, and mksurfdata_map

MOD_NETCDF -- sets the location of the NetCDF FORTRAN module.
USER_LINKER -- sets the name of the linker to use.
USER_CPPDEFS -- adds any CPP defines to use.
USER_CFLAGS -- adds any "C" compiler flags to use.
USER_FFLAGS -- adds any FORTRAN compiler flags to use.
USER_LDFLAGS -- adds any linker flags to use.
USER_CC -- sets the name of the "C" compiler to use.
OPT -- set to TRUE to compile the code optimized (TRUE or FALSE)
SMP -- set to TRUE to turn on shared memory parallelism (i.e. OpenMP) (TRUE or FALSE)
Filepath -- list of directories to build source code from.
Srcfiles -- list of source code filenames to build executable from.
Makefile -- customized makefile options for this particular tool.
mkDepends -- figures out dependencies between source files, so make can compile in order.
Makefile.common -- General tool Makefile that should be the same between all tools.

Options used only by cprnc:

EXEDIR -- sets the location where the executable will be built.
VPATH -- colon delimited path list to find the source files.

More details on each environment variable.

LIB_NETCDF

This variable sets the path to the NetCDF library file (libnetcdf.a). If not set it defaults to /usr/local/lib. In order to use the tools you need to build the NetCDF library and be able to link to it. In order to build the model with a particular compiler you may have to compile the NetCDF library with the same compiler (or at least a compatible one).

INC_NETCDF

This variable sets the path to the NetCDF include directory (in order to find the include file netcdf.inc). If not set it defaults to /usr/local/include.

MOD_NETCDF

This variable sets the path to the NetCDF module directory (in order to find the NetCDF FORTRAN-90 module file when NetCDF is used with a FORTRAN-90 use statement). When not set it defaults to the LIB_NETCDF value.
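As a minimal sketch of the three NetCDF settings described above, the following shell fragment fills in the documented defaults for any variable that is not already set and then warns if the expected files are not found. The fallback logic here only mirrors the defaults stated in the text; it is not taken from the tool Makefiles, and on your machine the NetCDF installation will most likely live somewhere else.

```shell
# Use any value already in the environment; otherwise fall back to the
# defaults documented above.
LIB_NETCDF="${LIB_NETCDF:-/usr/local/lib}"
INC_NETCDF="${INC_NETCDF:-/usr/local/include}"
MOD_NETCDF="${MOD_NETCDF:-$LIB_NETCDF}"   # module directory follows LIB_NETCDF
export LIB_NETCDF INC_NETCDF MOD_NETCDF

# Report what a subsequent build would see.
echo "LIB_NETCDF=$LIB_NETCDF"
echo "INC_NETCDF=$INC_NETCDF"
echo "MOD_NETCDF=$MOD_NETCDF"

# Warn (without stopping) if the files named in the text are missing.
[ -f "$LIB_NETCDF/libnetcdf.a" ] || echo "warning: libnetcdf.a not found in $LIB_NETCDF" >&2
[ -f "$INC_NETCDF/netcdf.inc" ]  || echo "warning: netcdf.inc not found in $INC_NETCDF" >&2
```

Running this before building makes it easier to spot a wrong path before the compiler or linker reports it.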

USER_FC

This variable sets the command name of the FORTRAN-90 compiler to use when compiling the tool. The default compiler depends on the platform. On some platforms, for example AIX, this variable is NOT used.

USER_LINKER

This variable sets the command name of the linker to use when linking the object files from the compiler together to build the executable. By default this is set to the value of the FORTRAN-90 compiler used to compile the source code.

USER_CPPDEFS

This variable adds additional optional values to define for the C preprocessor. Normally, there is no reason to do this as there are very few CPP tokens in the CLM tools. However, if you modify the tools there may be a reason to define new CPP tokens.

USER_CC

This variable sets the command name of the "C" compiler to use when compiling the tool. The default compiler depends on the platform. On some platforms, for example AIX, this variable is NOT used.

USER_CFLAGS

This variable adds additional compiler options for the "C" compiler to use when compiling the tool. By default the compiler options are picked according to the platform and compiler that will be used.

USER_FFLAGS

This variable adds additional compiler options for the FORTRAN-90 compiler to use when compiling the tool. By default the compiler options are picked according to the platform and compiler that will be used.

USER_LDFLAGS

This variable adds additional options to the linker that will be used when linking the object files into the executable. By default the linker options are picked according to the platform and compiler that is used.
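To illustrate how the compiler and flag variables fit together, here is a hedged sketch for a Linux machine. The compiler names (gfortran, gcc), the "-g" flags, the make command name (gmake) and the tool source directory are all assumptions chosen for illustration; substitute whatever applies on your system. The fragment only prints the build command rather than running it.

```shell
# Hypothetical choices for a Linux machine with the GNU compilers installed.
export USER_FC=gfortran
export USER_CC=gcc
export USER_LINKER="$USER_FC"   # same as the documented default: link with the FORTRAN compiler
export USER_FFLAGS="-g"         # extra FORTRAN flags, added to the platform defaults
export USER_CFLAGS="-g"         # extra "C" flags, added to the platform defaults
export USER_LDFLAGS=""          # no extra linker flags in this sketch

# Assumptions: GNU make is installed as "gmake", and the tool sources
# live in this directory relative to the top level of the distribution.
MAKE_CMD="${MAKE_CMD:-gmake}"
TOOL_SRC="models/lnd/clm/tools/clm4_5/mksurfdata_map/src"

# Print the command instead of running it; remove the echo to build.
echo "cd $TOOL_SRC && $MAKE_CMD"
```

Because the variables are exported, they are visible to make as environment variables when the build is started from the same shell.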

SMP

This variable flags if shared memory parallelism (using OpenMP) should be used when compiling the tool. It can be set to either TRUE or FALSE. By default it is set to FALSE, so shared memory parallelism is NOT used. When set to TRUE you can set the number of threads by using the OMP_NUM_THREADS environment variable. Normally, the most you would set this to is the number of on-node CPU processors. Turning this on should make the tool run much faster.

Caution

Note that, depending on the compiler, answers may be different when SMP is activated.
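A minimal sketch of the SMP workflow follows: SMP is set at build time, and OMP_NUM_THREADS is set at run time. Here the thread count is taken from the number of online processors reported by getconf, falling back to a single thread when that query is unavailable; this choice of one thread per processor is an assumption based on the guidance above, not a requirement of the tools.

```shell
# Build time: request an OpenMP-enabled executable.
export SMP=TRUE
# (run the build for the tool here, e.g. with GNU make in its source directory)

# Run time: use one thread per on-node processor, or 1 if the count
# cannot be determined on this platform.
NCPU=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
export OMP_NUM_THREADS="$NCPU"

echo "SMP=$SMP OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

If you need results that can be compared against a non-threaded build, keep the caution above in mind and build with SMP=FALSE instead.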

OPT

This variable flags if compiler optimization should be used when compiling the tool. It can be set to either TRUE or FALSE. By default it is set to FALSE for mkmapgrids and TRUE for mksurfdata_map, mkprocdata_map and interpinic. Turning this on should make the tool run much faster.

Caution

Note that you should expect answers to be different when OPT is activated.

Filepath

All of the tools are stand-alone and don't need any outside code to operate. The Filepath is the list of directories needed to compile and hence is always simply ".", the current directory. Several tools use copies of code that originates outside their directory, elsewhere in the CESM distribution (either csm_share code or CLM source code).

Srcfiles

The Srcfiles lists the filenames of the source code to use when building the tool.

Makefile

The Makefile is the custom GNU Makefile for this particular tool. It will customize the EXENAME and the optimization settings for this particular tool.

Makefile.common

The Makefile.common is the copy of the general GNU Makefile for all the CLM tools. This file should be identical between the different tools. This file has different sections of compiler options for different Operating Systems and compilers.

mkDepends

The mkDepends is the copy of the perl script used by the Makefile.common to figure out the dependencies between the source files so that it can compile in the necessary order. This file should be identical between the different tools.

EXEDIR

The cprnc tool uses this variable to set the location of where the executable will be built. The default is the current directory.

VPATH

The cprnc tool uses this variable to set the colon delimited pathnames of where the source code exists. The default is the current directory.
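Since VPATH is a colon delimited list, it can be convenient to assemble it from a list of directories. The sketch below does that and also sets EXEDIR to its documented default when it is not already set. The two directory names are purely hypothetical placeholders.

```shell
# Hypothetical source directories for the cprnc build.
SRC_DIRS="/path/to/cprnc /path/to/extra/src"

# Join the directories with ":" to form the VPATH search list.
VPATH=""
for d in $SRC_DIRS; do
  VPATH="${VPATH:+$VPATH:}$d"
done
export VPATH

# Build the executable in the current directory unless told otherwise.
export EXEDIR="${EXEDIR:-.}"

echo "VPATH=$VPATH"
echo "EXEDIR=$EXEDIR"
```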

Note: There are several files that are copies of the original files from either models/lnd/clm/src/util_share, models/csm_share/shr, or copies from other tool directories. By having copies the tools can all be made stand-alone, but any changes to the originals will have to be put into the tool directories as well.

The README.filecopies (which can be found in models/lnd/clm/tools) is repeated here.


models/lnd/clm/tools/README.filecopies			      Jun/04/2013

There are several files that are copies of the original files from
either models/lnd/clm/src/main, models/csm_share/shr,
models/csm_share/unit_testers, or copies from other tool
directories. By having copies the tools can all be made stand-alone,
but any changes to the originals will have to be put into the tool
directories as well.

I. Files that are IDENTICAL:

   1. csm_share files copied that should be identical to models/csm_share/shr:

       shr_const_mod.F90
       shr_log_mod.F90
       shr_timer_mod.F90
       
   2. csm_share files copied that should be identical to models/csm_share/unit_testers:

       test_mod.F90

   3. clm/src files copied that should be identical to models/lnd/clm/src/util_share:

       nanMod.F90

II. Files with differences

   1. csm_share files copied with differences:

       shr_kind_mod.F90 --- SHR_KIND_CXX is new
       shr_sys_mod.F90 ---- Remove mpi abort and reference to shr_mpi_mod.F90.
       shr_infnan_mod.F90 - Earlier version
       shr_string_mod.F90 - Earlier version
       shr_file_mod.F90 --- mkprocdata_map version is stripped down
       clm_varctl.F90 ----- Earlier version

   2. clm/src files with differences:

       fileutils.F90 --- Remove use of masterproc and spmdMod and endrun in abortutils.

   4. Files in mkmapgrids

      domainMod.F90 ---- Highly customized based off an earlier version of clm code.
                         Remove use of abortutils, spmdMod. clm version uses latlon
                         this version uses domain in names. Distributed memory
                         parallelism is removed.

   5. Files in mksurfdata_map

       mkvarpar.F90 --- clm4_0 and clm4_5 versions are different and different from main clm versions.
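
Because the files listed under "Files that are IDENTICAL" are expected to match their originals, a quick comparison can confirm that a copy has not drifted. The following is a minimal sketch using the standard cmp utility; the function name check_copy and the TOOL_DIR variable in the usage comment are hypothetical and not part of the CLM distribution.

```shell
# check_copy ORIGINAL COPY
# Reports whether COPY is byte-for-byte identical to ORIGINAL.
# Returns 0 when identical, 1 otherwise.
check_copy() {
  if cmp -s "$1" "$2"; then
    echo "identical: $2"
  else
    echo "DIFFERS:   $2"
    return 1
  fi
}

# Example usage from the top level of the distribution, with TOOL_DIR
# set to the tool directory that holds the copies (hypothetical):
#
#   for f in shr_const_mod.F90 shr_log_mod.F90 shr_timer_mod.F90; do
#     check_copy "models/csm_share/shr/$f" "$TOOL_DIR/$f"
#   done
```

Files listed under "Files with differences" are expected to be reported as differing, so this check is only meaningful for the first group.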