Chapter 2. Using the CLM tools to create your own input datasets

Table of Contents
Common environment variables and options used in building the FORTRAN tools
General information on running the FORTRAN tools
The File Creation Process
Using the cprnc tool to compare two history files
Using interpinic to interpolate initial conditions to different resolutions
Using mkgriddata to create grid datasets
Using mkdatadomain to create domain datasets for DATM or docn from CLM grid datasets
Using mksurfdata to create surface datasets from grid datasets
Using NCL scripts ndepregrid.ncl and aerdepregrid.ncl to interpolate aerosol deposition datasets
How to Customize Datasets for particular Observational Sites
Conclusion of tools description

Several tools provided with CLM allow you to create your own input datasets at resolutions you choose, to interpolate initial conditions to a different resolution, or to compare CLM history files between different cases. The tools are all available in the models/lnd/clm/tools directory. Most of the tools are stand-alone FORTRAN programs in their own directories, but there is also a suite of NCL scripts in the ncl_scripts directory. Some of the NCL scripts are very specialized and not meant for general use, so we won't document them here; they are still documented in the scripts themselves and in the README file in the tools directory. The generally important scripts and programs are:

  1. cprnc to compare NetCDF files with a time axis.

  2. interpinic to interpolate initial condition files.

  3. mkgriddata to create grid datasets.

  4. mkdatadomain to create domain files from grid datasets used by DATM or docn.

  5. mksurfdata to create surface datasets from grid datasets.

  6. ncl_scripts/getregional_datasets.pl script to extract a region or a single-point from global input datasets. See the single-point chapter for more information on this.

  7. ncl_scripts/ndepregrid.ncl to interpolate the Nitrogen deposition datasets to a new resolution.

  8. ncl_scripts/aerdepregrid.ncl to interpolate the Aerosol deposition datasets to a new resolution.

In the sections to come we describe in detail how to use each of these tools in turn. First, however, we discuss the common environment variables and options that are used by all of the FORTRAN tools. Second, we outline the entire file creation process for all input files needed by CLM for a new resolution, and then we turn to each tool. In the last section we discuss how to customize files for particular observational sites.

Common environment variables and options used in building the FORTRAN tools

The FORTRAN tools all have similar Makefiles and similar build options. All of the Makefiles use GNU Make extensions and thus require GNU make. They also auto-detect the type of platform you are on, using "uname -s", and set the compiler, compiler flags and such accordingly. Environment variables can be set for the things that must be customized. All the tools use NetCDF and hence require the paths to the NetCDF libraries and include files. On some platforms (such as Linux) multiple compilers can be used, and hence there are environment variables that change the FORTRAN and/or "C" compilers used. The tools other than cprnc also allow finer control: you can add compiler flags of your choosing for both FORTRAN and "C", pick the compiler and linker, and add linker options. The tools other than cprnc also let you control optimization with the OPT flag so that the tool will run faster (optimization is off by default, except for the mksurfdata and interpinic programs, where it is on by default). To get even faster performance, the interpinic, mksurfdata, and mkgriddata programs also allow you to use the SMP option to turn on shared memory parallelism across multiple processors. When SMP=TRUE you set the number of threads used by the program with the OMP_NUM_THREADS environment variable.

Options used by all: cprnc, interpinic, mkdatadomain, mkgriddata, and mksurfdata

LIB_NETCDF -- sets the location of the NetCDF library.
INC_NETCDF -- sets the location of the NetCDF include files.
USER_FC -- sets the name of the FORTRAN compiler.

Options used by: interpinic, mkdatadomain, mkgriddata, and mksurfdata

MOD_NETCDF -- sets the location of the NetCDF FORTRAN module.
USER_LINKER -- sets the name of the linker to use.
USER_CPPDEFS -- adds any CPP defines to use.
USER_CFLAGS -- adds any "C" compiler flags to use.
USER_FFLAGS -- adds any FORTRAN compiler flags to use.
USER_LDFLAGS -- adds any linker flags to use.
USER_CC -- sets the name of the "C" compiler to use.
OPT -- set to TRUE to compile the code optimized (TRUE or FALSE)

Options used by: interpinic, mkgriddata, and mksurfdata:

SMP -- set to TRUE to turn on shared memory parallelism (i.e. OpenMP) (TRUE or FALSE)
Filepath -- list of directories to build source code from.
Srcfiles -- list of source code filenames to build executable from.

Options used only by cprnc:

EXEDIR -- sets the location where the executable will be built.
VPATH -- colon delimited path list to find the source files.

More details on each environment variable.

LIB_NETCDF

This variable sets the path to the NetCDF library file (libnetcdf.a). If not set it defaults to /usr/local/lib. In order to use the tools you need to build the NetCDF library and be able to link to it. In order to build the model with a particular compiler you may have to compile the NetCDF library with the same compiler (or at least a compatible one).

INC_NETCDF

This variable sets the path to the NetCDF include directory (in order to find the include file netcdf.inc). If not set it defaults to /usr/local/include.

MOD_NETCDF

This variable sets the path to the NetCDF module directory (in order to find the NetCDF FORTRAN-90 module file when NetCDF is used with a FORTRAN-90 use statement). When not set it defaults to the LIB_NETCDF value.
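As a minimal sketch of the three NetCDF settings above, the following sets them in a POSIX shell before building a tool. The NETCDF_PREFIX variable is a hypothetical helper introduced only for this illustration, and /usr/local simply mirrors the documented defaults; point it at wherever NetCDF is actually installed on your system.

```shell
# Hypothetical install prefix (an assumption); change it to match your system.
NETCDF_PREFIX=/usr/local

export LIB_NETCDF="$NETCDF_PREFIX/lib"       # directory expected to hold libnetcdf.a
export INC_NETCDF="$NETCDF_PREFIX/include"   # directory expected to hold netcdf.inc
export MOD_NETCDF="$LIB_NETCDF"              # same value the tools default to when unset

# Warn early if the expected files are missing, rather than failing mid-build.
[ -f "$LIB_NETCDF/libnetcdf.a" ] || echo "warning: libnetcdf.a not found in $LIB_NETCDF"
[ -f "$INC_NETCDF/netcdf.inc" ] || echo "warning: netcdf.inc not found in $INC_NETCDF"
```

The checks only print warnings, so the variables stay set even when the files are not found; a site whose NetCDF library is named or laid out differently would need to adjust them.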

USER_FC

This variable sets the command name of the FORTRAN-90 compiler to use when compiling the tool. The default compiler depends on the platform. On some platforms, for example AIX, this variable is NOT used.

USER_LINKER

This variable sets the command name of the linker to use when linking the object files from the compiler together to build the executable. By default this is set to the value of the FORTRAN-90 compiler used to compile the source code.

USER_CPPDEFS

This variable adds additional optional values to define for the C preprocessor. Normally, there is no reason to do this as there are very few CPP tokens in the CLM tools. However, if you modify the tools there may be a reason to define new CPP tokens.

USER_CC

This variable sets the command name of the "C" compiler to use when compiling the tool. The default compiler depends on the platform. On some platforms, for example AIX, this variable is NOT used.

USER_CFLAGS

This variable adds additional compiler options for the "C" compiler to use when compiling the tool. By default the compiler options are picked according to the platform and compiler that will be used.

USER_FFLAGS

This variable adds additional compiler options for the FORTRAN-90 compiler to use when compiling the tool. By default the compiler options are picked according to the platform and compiler that will be used.

USER_LDFLAGS

This variable adds additional options to the linker that will be used when linking the object files into the executable. By default the linker options are picked according to the platform and compiler that is used.
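The compiler, linker and flag variables above might be combined as in the following sketch. The compiler names gfortran and gcc and the -g flag are assumptions chosen for illustration, not values the tools require, and MAKE_CMD is a hypothetical helper variable. Keep in mind that cprnc only honors USER_FC from this group, and that some platforms (such as AIX) ignore the compiler variables.

```shell
# Assumed compiler choices for a Linux machine; substitute your own.
export USER_FC=gfortran          # FORTRAN-90 compiler command (assumption)
export USER_CC=gcc               # "C" compiler command (assumption)
export USER_LINKER="$USER_FC"    # mirrors the documented default: link with the FORTRAN compiler

# Extra flags are added to the platform defaults, so list only what you want on top of them.
export USER_FFLAGS="-g"          # keep debugging symbols in the FORTRAN objects
export USER_CFLAGS="-g"          # keep debugging symbols in the "C" objects
export USER_LDFLAGS=""           # no additional linker options in this sketch

# GNU make is required; it is often installed as gmake, and as plain make on Linux.
if command -v gmake >/dev/null 2>&1; then
  MAKE_CMD=gmake
else
  MAKE_CMD=make
fi
echo "Build with: $MAKE_CMD (run from the tool's own directory)"
```

The block only prints the make command instead of running it, so it is safe to try outside a tool directory.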

SMP

This variable flags whether shared memory parallelism (using OpenMP) should be used when compiling the tool. It can be set to either TRUE or FALSE; by default it is set to FALSE, so shared memory parallelism is NOT used. When it is set to TRUE you can set the number of threads with the OMP_NUM_THREADS environment variable. Normally, the most you would set this to is the number of on-node CPU processors. Turning this on should make the tool run much faster.

Caution

Note that, depending on the compiler, answers may be different when SMP is activated.
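One way to follow the advice above about thread counts is sketched below. It assumes a system where getconf reports the number of online processors (true of most Linux and similar systems, but not guaranteed everywhere); NCPUS is a hypothetical helper variable. SMP matters when the tool is built, while OMP_NUM_THREADS matters when the resulting program is run.

```shell
# Build-time switch: ask the Makefile to compile with OpenMP.
export SMP=TRUE

# Run-time setting: use the number of online CPUs as the thread count.
# getconf _NPROCESSORS_ONLN is an assumption about the platform; fall back to 1.
NCPUS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
case "$NCPUS" in
  ''|*[!0-9]*) NCPUS=1 ;;   # anything that is not a plain number becomes 1
esac
export OMP_NUM_THREADS="$NCPUS"

echo "SMP=$SMP OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

On a shared machine you may prefer a smaller value than the full processor count.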

OPT

This variable flags whether compiler optimization should be used when compiling the tool. It can be set to either TRUE or FALSE; by default it is set to FALSE for mkdatadomain and TRUE for mksurfdata and interpinic. Turning this on should make the tool run much faster.

Caution

Note that you should expect answers to be different when OPT is activated.

Filepath

All of the tools are stand-alone and don't need any outside code to operate. The Filepath is the list of directories needed to compile, and hence is always simply "." (the current directory). Several tools use code that originates elsewhere in the CESM distribution (either csm_share code or CLM source code), but they build from copies kept in their own directories.

Srcfiles

The Srcfiles lists the filenames of the source code to use when building the tool.

EXEDIR

The cprnc tool uses this variable to set the location where the executable will be built. The default is the current directory.

VPATH

The cprnc tool uses this variable to set the colon-delimited list of paths where the source code exists. The default is the current directory.

Note: There are several files that are copies of the original files from either models/lnd/clm/src/main, models/csm_share/shr, or copies from other tool directories. By having copies the tools can all be made stand-alone, but any changes to the originals will have to be put into the tool directories as well.

The README.filecopies (which can be found in models/lnd/clm/tools) is repeated here.


models/lnd/clm/tools/README.filecopies			      May/26/2011

There are several files that are copies of the original files from either
models/lnd/clm/src/main, models/csm_share/shr, or copies from other tool
directories. By having copies the tools can all be made stand-alone, but
any changes to the originals will have to be put into the tool directories
as well.

I. Files that are IDENTICAL:

   1. csm_share files copied that should be identical to models/csm_share/shr:

       shr_kind_mod.F90
       shr_const_mod.F90
       shr_log_mod.F90
       shr_timer_mod.F90
       shr_string_mod.F90
       shr_file_mod.F90

   2. clm/src files copied that should be identical to models/lnd/clm/src/main:

       clm_varctl.F90
       nanMod.F90

   3. Files shared between mkgriddata and mksurfdata that are identical:
      (these all came from a much older version of clm)

       ncdioMod.F90
       areaMod.F90
       mkvarpar.F90

II. Files with differences

   1. csm_share files copied with differences:

       shr_sys_mod.F90 - Remove mpi abort and reference to shr_mpi_mod.F90.

   2. clm/src files with differences:

       fileutils.F90 --- Remove use of masterproc and spmdMod and endrun in abortutils.

   3. Files shared between mkgriddata and mksurfdata different from models/lnd/clm/src:

      domainMod.F90 ---- Highly customized based off an earlier version of clm code.
                         Remove use of abortutils, spmdMod. clm version uses latlon
                         this version uses domain in names. Distributed memory
                         parallelism is removed.
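Because changes to the originals have to be carried into the tool directories by hand, it can help to check that the files listed as IDENTICAL really still are. The following is a minimal sketch of such a check; check_copies is a hypothetical helper written for this illustration and is not part of the CLM distribution.

```shell
# check_copies ORIG_DIR TOOL_DIR FILE...
# Prints each FILE whose copy in TOOL_DIR differs from, or is missing relative to,
# the one in ORIG_DIR. Returns 0 if every copy matches and 1 otherwise.
check_copies() {
  cc_orig_dir=$1
  cc_tool_dir=$2
  shift 2
  cc_status=0
  for cc_file in "$@"; do
    # cmp -s is silent and exits non-zero on any difference or missing file
    if ! cmp -s "$cc_orig_dir/$cc_file" "$cc_tool_dir/$cc_file"; then
      echo "DIFFERS: $cc_tool_dir/$cc_file"
      cc_status=1
    fi
  done
  return $cc_status
}

# Example call, from the top of the source tree. The mksurfdata directory is an
# assumption for illustration; use whichever tool directory holds the copies.
# check_copies models/csm_share/shr models/lnd/clm/tools/mksurfdata \
#     shr_kind_mod.F90 shr_const_mod.F90 shr_log_mod.F90
```

Files listed in section II are expected to differ, so only the files from section I are worth passing to a check like this.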