Chapter 2. Using the CLM tools to create your own input datasets

Table of Contents
Common environment variables and options used in building the FORTRAN tools
General information on running the FORTRAN tools
Using NCL scripts
The File Creation Process
Using the cprnc tool to compare two history files
Using interpinic to interpolate initial conditions to different resolutions
Creating an output SCRIP grid file at a resolution to run the model on
Creating mapping files that mksurfdata_map will use
Creating a domain file for CLM and DATM
Creating a set of regional datasets from existing global datasets
Using mksurfdata_map to create surface datasets from grid datasets
Converting unstructured grid output to gridded datasets for post processing
How to Customize Datasets for particular Observational Sites
Conclusion of tools description

CLM provides several tools that allow you to create your own input datasets at resolutions you choose, to interpolate initial conditions to a different resolution, or to compare CLM history files between different cases. The tools are all available in the models/lnd/clm/tools directory. Most of the tools are stand-alone FORTRAN programs, each in its own directory, but there is also a suite of NCL scripts in the ncl_scripts directory, and some of the tools are scripts that may also call the ESMF regridding program. Some of the NCL scripts are very specialized and not meant for general use, so we won't document them here; they are still documented in the scripts themselves and in the README file in the tools directory. The generally important scripts and programs are:

  1. cprnc to compare NetCDF files with a time axis.

  2. mkmapgrids to create SCRIP grid data files from old CLM format grid files that can then be used to create new CLM datasets.

  3. mkmapdata to create SCRIP mapping data files from SCRIP grid files (uses ESMF).

  4. gen_domain to create a domain file for datm from a mapping file. The domain file is then used by BOTH datm AND CLM to define the grid and land-mask.

  5. mksurfdata_map to create surface datasets from grid datasets.

  6. interpinic to interpolate initial condition files.

  7. ncl_scripts/getregional_datasets.pl script to extract a region or a single-point from global input datasets. See the single-point chapter for more information on this.

  8. mkprocdata_map to interpolate output unstructured grids (such as the CAM HOMME dy-core "ne" grids like ne30np4) into a 2D regular lat/long grid format that can be plotted easily.

In the sections that follow we describe in detail how to use each of these tools in turn. First, however, we discuss the common environment variables and options that are used by all of the FORTRAN tools. Second, we outline the entire file creation process for all input files needed by CLM for a new resolution, and then we turn to each tool. In the last section we discuss how to customize files for particular observational sites.

Common environment variables and options used in building the FORTRAN tools

The FORTRAN tools all have similar Makefiles and similar build options. All of the Makefiles use GNU Make extensions and thus require GNU make. They also auto-detect the type of platform you are on, using "uname -s", and set the compiler, compiler flags, and so on accordingly. Environment variables let you set the things that must be customized. All the tools use NetCDF and hence require the path to the NetCDF libraries and include files. On some platforms (such as Linux) multiple compilers can be used, and hence there are environment variables that change the FORTRAN and/or "C" compilers used.

The tools other than cprnc also allow finer control: you can add compiler flags of your choosing for both FORTRAN and "C", pick the compiler and linker, and add linker options. These tools also let you control optimization with the OPT flag so that the tool will run faster (optimization is off by default for mkmapgrids, but on by default for mksurfdata_map, mkprocdata_map, and interpinic). To get even faster performance, the interpinic program also allows you to use the SMP flag to turn on shared-memory parallelism across multiple processors. When SMP=TRUE you set the number of threads used by the program with the OMP_NUM_THREADS environment variable.
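Because the Makefiles depend on GNU make and on the output of "uname -s", it can help to check both before building. The following is a minimal sketch, not part of the CLM distribution: find_gnu_make is a hypothetical helper, and the assumption that GNU make is installed as either gmake or make may not hold on every system.

```shell
#!/bin/sh
# Sketch only: find_gnu_make is a hypothetical helper, not a CLM tool.
# It prints the name of the first command that identifies itself as GNU make.
find_gnu_make() {
    for candidate in gmake make; do
        # GNU make reports "GNU Make <version>" on the first line of --version.
        if command -v "$candidate" >/dev/null 2>&1 &&
           "$candidate" --version 2>/dev/null | head -n 1 | grep -q "GNU Make"; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# The tool Makefiles key their platform detection on this same string.
platform=$(uname -s)
echo "Platform reported by uname -s: $platform"

if gnu_make=$(find_gnu_make); then
    echo "GNU make found: $gnu_make"
else
    echo "No GNU make found on PATH; install it before building the tools." >&2
fi
```

If a GNU make is found, that command name is the one to run in each tool's source directory.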

Options used by all: cprnc, interpinic, and mksurfdata_map

LIB_NETCDF -- sets the location of the NetCDF library.
INC_NETCDF -- sets the location of the NetCDF include files.
USER_FC -- sets the name of the FORTRAN compiler.

Options used by: interpinic, mkprocdata_map, mkmapgrids, and mksurfdata_map

MOD_NETCDF -- sets the location of the NetCDF FORTRAN module.
USER_LINKER -- sets the name of the linker to use.
USER_CPPDEFS -- adds any CPP defines to use.
USER_CFLAGS -- adds any "C" compiler flags to use.
USER_FFLAGS -- adds any FORTRAN compiler flags to use.
USER_LDFLAGS -- adds any linker flags to use.
USER_CC -- sets the name of the "C" compiler to use.
OPT -- set to TRUE to compile the code optimized (TRUE or FALSE)
SMP -- set to TRUE to turn on shared memory parallelism (i.e. OpenMP) (TRUE or FALSE)
Filepath -- list of directories to build source code from.
Srcfiles -- list of source code filenames to build executable from.
Makefile -- customized makefile options for this particular tool.
Makefile.common -- General tool Makefile that should be the same between all tools.

Options used only by cprnc:

EXEDIR -- sets the location where the executable will be built.
VPATH -- colon delimited path list to find the source files.

More details on each environment variable.

LIB_NETCDF

This variable sets the path to the NetCDF library file (libnetcdf.a). If not set it defaults to /usr/local/lib. In order to use the tools you need to build the NetCDF library and be able to link to it. In order to build the model with a particular compiler you may have to compile the NetCDF library with the same compiler (or at least a compatible one).

INC_NETCDF

This variable sets the path to the NetCDF include directory (in order to find the include file netcdf.inc). If not set, it defaults to /usr/local/include.

MOD_NETCDF

This variable sets the path to the NetCDF module directory (in order to find the NetCDF FORTRAN-90 module file when NetCDF is used with a FORTRAN-90 use statement). When not set, it defaults to the LIB_NETCDF value.
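The defaulting rules for the three NetCDF variables can be mirrored in the shell before a build, which makes it easy to see what the Makefiles are expected to pick up. This is a sketch under the defaults documented above; resolve_netcdf_paths is a hypothetical helper name, and the shell's ${VAR:-default} form treats an empty value the same as an unset one, which may differ from how the Makefiles behave.

```shell
#!/bin/sh
# Sketch only: resolve_netcdf_paths is a hypothetical helper, not a CLM tool.
resolve_netcdf_paths() {
    # Keep any value already set by the caller; otherwise use the documented default.
    LIB_NETCDF="${LIB_NETCDF:-/usr/local/lib}"
    INC_NETCDF="${INC_NETCDF:-/usr/local/include}"
    # MOD_NETCDF falls back to whatever LIB_NETCDF resolved to.
    MOD_NETCDF="${MOD_NETCDF:-$LIB_NETCDF}"
    export LIB_NETCDF INC_NETCDF MOD_NETCDF
}

resolve_netcdf_paths
echo "LIB_NETCDF=$LIB_NETCDF"
echo "INC_NETCDF=$INC_NETCDF"
echo "MOD_NETCDF=$MOD_NETCDF"
```

Setting only LIB_NETCDF before calling the helper changes MOD_NETCDF as well, which matches the fallback described for MOD_NETCDF.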

USER_FC

This variable sets the command name of the FORTRAN-90 compiler to use when compiling the tool. The default compiler depends on the platform. Note that this variable is not honored on every platform: for example, on the AIX platform it is NOT used.

USER_LINKER

This variable sets the command name of the linker to use when linking the object files from the compiler together to build the executable. By default this is set to the value of the FORTRAN-90 compiler used to compile the source code.

USER_CPPDEFS

This variable adds additional optional values to define for the C preprocessor. Normally, there is no reason to do this as there are very few CPP tokens in the CLM tools. However, if you modify the tools there may be a reason to define new CPP tokens.

USER_CC

This variable sets the command name of the "C" compiler to use when compiling the tool. The default compiler depends on the platform. Note that this variable is not honored on every platform: for example, on the AIX platform it is NOT used.

USER_CFLAGS

This variable adds additional compiler options for the "C" compiler to use when compiling the tool. By default the compiler options are picked according to the platform and compiler that will be used.

USER_FFLAGS

This variable adds additional compiler options for the FORTRAN-90 compiler to use when compiling the tool. By default the compiler options are picked according to the platform and compiler that will be used.

USER_LDFLAGS

This variable adds additional options to the linker that will be used when linking the object files into the executable. By default the linker options are picked according to the platform and compiler that is used.
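As an illustration of how the compiler and flag variables fit together, the sketch below exports a set of overrides before a build. The compiler names (gfortran, gcc) and the -g flag are assumptions chosen for illustration only; whether the tool Makefiles support a given compiler on your platform is something to confirm in Makefile.common.

```shell
#!/bin/sh
# Sketch only: the values below are illustrative assumptions, not CLM defaults.
USER_FC=gfortran        # FORTRAN-90 compiler command (assumed to be installed)
USER_CC=gcc             # "C" compiler command (assumed to be installed)
USER_FFLAGS="-g"        # extra FORTRAN flag: keep debugging symbols
USER_CFLAGS="-g"        # extra "C" flag: keep debugging symbols
export USER_FC USER_CC USER_FFLAGS USER_CFLAGS

# USER_LINKER is left unset so that it defaults to the FORTRAN-90 compiler.
echo "FORTRAN compiler: $USER_FC $USER_FFLAGS"
echo "C compiler:       $USER_CC $USER_CFLAGS"

# With the variables exported, GNU make would then be run in the tool's
# source directory to perform the build.
```

Exporting the variables in the shell, rather than editing the Makefiles, keeps Makefile.common identical between the tools.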

SMP

This variable flags whether shared memory parallelism (using OpenMP) should be used when compiling the tool. It can be set to either TRUE or FALSE. By default it is set to FALSE, so shared memory parallelism is NOT used. When it is set to TRUE you can set the number of threads by using the OMP_NUM_THREADS environment variable. Normally, the largest useful value for OMP_NUM_THREADS is the number of on-node CPU processors. Turning SMP on should make the tool run much faster.

Caution

Note that, depending on the compiler, answers may be different when SMP is activated.
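The advice to keep the thread count at or below the number of on-node processors can be sketched as follows. cap_threads is a hypothetical helper, the requested count of 8 is an arbitrary example, and the use of "getconf _NPROCESSORS_ONLN" to count processors is an assumption that holds on Linux and macOS but not necessarily elsewhere. The setting only matters for a tool that was built with SMP=TRUE.

```shell
#!/bin/sh
# Sketch only: cap_threads is a hypothetical helper, not a CLM tool.
# Usage: cap_threads REQUESTED AVAILABLE -> prints the smaller of the two.
cap_threads() {
    requested=$1
    available=$2
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
}

# Count online processors; fall back to 1 if the query is unsupported.
ncpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
case "$ncpus" in
    ''|*[!0-9]*) ncpus=1 ;;   # guard against non-numeric answers
esac

# Ask for 8 threads (arbitrary example) but never exceed the processor count.
OMP_NUM_THREADS=$(cap_threads 8 "$ncpus")
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS (of $ncpus processors)"
```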

OPT

This variable flags whether compiler optimization should be used when compiling the tool. It can be set to either TRUE or FALSE. By default it is set to FALSE for mkmapgrids and TRUE for mksurfdata_map, mkprocdata_map and interpinic. Turning this on should make the tool run much faster.

Caution

Note that you should expect answers to be different when OPT is activated.

Filepath

The Filepath lists the directories that contain the source code to compile. Because all of the tools are stand-alone and don't need any outside code to operate, it is always simply ".", the current directory. Several tools contain copies of code that originates elsewhere in the CESM distribution (either csm_share code or CLM source code).

Srcfiles

The Srcfiles lists the filenames of the source code to use when building the tool.

Makefile

The Makefile is the custom GNU Makefile for this particular tool. It will customize the EXENAME and the optimization settings for this particular tool.

Makefile.common

The Makefile.common is the copy of the general GNU Makefile for all the CLM tools. This file should be identical between the different tools. This file has different sections of compiler options for different Operating Systems and compilers.

EXEDIR

The cprnc tool uses this variable to set the location where the executable will be built. The default is the current directory.

VPATH

The cprnc tool uses this variable to set the colon-delimited list of pathnames where the source code exists. The default is the current directory.

Note: There are several files that are copies of the original files from either models/lnd/clm/src/main, models/csm_share/shr, or copies from other tool directories. By having copies the tools can all be made stand-alone, but any changes to the originals will have to be put into the tool directories as well.
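Since changes to the originals must be propagated by hand, a quick comparison can show which copies have drifted. The sketch below is not part of the CLM distribution: check_copies is a hypothetical helper, TOOL_SRC is a hypothetical variable that you would set to one tool's source directory, and the file names are the csm_share files that README.filecopies lists as identical.

```shell
#!/bin/sh
# Sketch only: check_copies is a hypothetical helper, not a CLM tool.
# Usage: check_copies ORIG_DIR TOOL_DIR FILE...
# Returns non-zero if any copy present in TOOL_DIR differs from ORIG_DIR.
check_copies() {
    orig_dir=$1
    tool_dir=$2
    shift 2
    status=0
    for f in "$@"; do
        # Skip files this particular tool does not carry a copy of.
        [ -f "$tool_dir/$f" ] || continue
        if cmp -s "$orig_dir/$f" "$tool_dir/$f"; then
            echo "identical: $f"
        else
            echo "differs: $f"
            status=1
        fi
    done
    return $status
}

# Example call from the top of a CESM checkout; skipped when TOOL_SRC
# (hypothetical variable) is unset or the tree is absent.
tool_src=${TOOL_SRC:-}
if [ -n "$tool_src" ] && [ -d models/csm_share/shr ]; then
    check_copies models/csm_share/shr "$tool_src" \
        shr_kind_mod.F90 shr_const_mod.F90 shr_log_mod.F90 shr_file_mod.F90 ||
        echo "At least one copy needs to be refreshed." >&2
fi
```

Files that the README lists as intentionally different, such as shr_sys_mod.F90, should be left out of the file list, since they are expected to differ.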

The README.filecopies (which can be found in models/lnd/clm/tools) is repeated here.


models/lnd/clm/tools/README.filecopies			      Oct/13/2012

There are several files that are copies of the original files from
either models/lnd/clm/src/main, models/csm_share/shr,
models/csm_share/unit_testers, or copies from other tool
directories. By having copies the tools can all be made stand-alone,
but any changes to the originals will have to be put into the tool
directories as well.

I. Files that are IDENTICAL:

   1. csm_share files copied that should be identical to models/csm_share/shr:

       shr_kind_mod.F90
       shr_const_mod.F90
       shr_log_mod.F90
       shr_file_mod.F90
       
   2. csm_share files copied that should be identical to models/csm_share/unit_testers:

       test_mod.F90

II. Files with differences

   1. csm_share files copied with differences:

       shr_sys_mod.F90 - Remove mpi abort and reference to shr_mpi_mod.F90.

   2. clm/src files with differences:

       fileutils.F90 --- Remove use of masterproc and spmdMod and endrun in abortutils.

   3. Files in mkgriddata (different from mkmapgrids)

      domainMod.F90 ---- Highly customized based off an earlier version of clm code.
                         Remove use of abortutils, spmdMod. clm version uses latlon
                         this version uses domain in names. Distributed memory
                         parallelism is removed.

   4. Files in mkmapgrids (different from mkgriddata)

      domainMod.F90 ---- Highly customized based off an earlier version of clm code.
                         Remove use of abortutils, spmdMod. clm version uses latlon
                         this version uses domain in names. Distributed memory
                         parallelism is removed.

   5. Files in mksurfdata_map

       mkvarpar.F90
       clm_varctl.F90
       clm_varpar.F90