Creating your own single-point/regional surface datasets

The following example sets up a case using CLM_USRDAT_NAME, with the data files named according to the CLM_USRDAT_NAME convention. The repository includes example data files of this kind for a region over Alaska (a subset of the global f19 grid).

Example 5-4. Example of using CLM_USRDAT_NAME to run a simulation using user datasets for a specific region over Alaska


> cd scripts
> ./create_newcase -case my_userdataset_test -res CLM_USRDAT -compset ICRUCLM45 \
-mach yellowstone_intel
> cd my_userdataset_test/
> set GRIDNAME=13x12pt_f19_alaskaUSA
> set LMASK=gx1v6
> ./xmlchange CLM_USRDAT_NAME=$GRIDNAME,CLM_BLDNML_OPTS="-mask $LMASK"
> ./xmlchange ATM_DOMAIN_FILE=domain.lnd.${GRIDNAME}_$LMASK.nc
> ./xmlchange LND_DOMAIN_FILE=domain.lnd.${GRIDNAME}_$LMASK.nc
# Make sure the file exists in your $CSMDATA or else use svn to download it there
> ls $CSMDATA/lnd/clm2/surfdata_map/surfdata_${GRIDNAME}_simyr2000.nc
# If it doesn't exist, uncomment the following...
#> setenv SVN_INP_URL https://svn-ccsm-inputdata.cgd.ucar.edu/trunk/inputdata/
#> svn export $SVN_INP_URL/lnd/clm2/surfdata_map/surfdata_${GRIDNAME}_simyr2000.nc \
#$CSMDATA/lnd/clm2/surfdata_map/surfdata_${GRIDNAME}_simyr2000.nc
> ./cesm_setup
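The file names in Example 5-4 all follow from the grid name and land mask. As a minimal sketch of that convention (written for POSIX sh rather than the csh used above; the variable names SIM_YEAR, DOMAIN_FILE and SURFDATA_FILE are illustrative assumptions, not names used by the CESM scripts), the expected names can be printed before you rename or download anything:

```shell
#!/bin/sh
# Sketch only: print the file names that the CLM_USRDAT_NAME convention
# implies for the grid used in Example 5-4.
GRIDNAME=13x12pt_f19_alaskaUSA   # grid name from Example 5-4
LMASK=gx1v6                      # land mask from Example 5-4
SIM_YEAR=2000                    # simulation year in the surface dataset name

# Illustrative variable names; the patterns are taken from Example 5-4.
DOMAIN_FILE="domain.lnd.${GRIDNAME}_${LMASK}.nc"
SURFDATA_FILE="surfdata_${GRIDNAME}_simyr${SIM_YEAR}.nc"

echo "domain file:     $DOMAIN_FILE"
echo "surface dataset: lnd/clm2/surfdata_map/$SURFDATA_FILE"
```

Note that neither name contains a creation date, which is why files produced by the tools have to be renamed before they can be used this way.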

To run with your own datasets, first create the domain and surface datasets using the process outlined in the section "The File Creation Process" in Chapter 2. An example of the process is shown below.

Example 5-5. Example of creating a surface dataset for a single point


# set the GRIDNAME and creation date that will be used later
> setenv GRIDNAME 1x1_boulderCO
> setenv CDATE    `date +%y%m%d`
# Create the SCRIP grid file for the location and create a unity mapping file for it.
> cd models/lnd/clm/tools/shared/mkmapdata
> ./mknoocnmap.pl -p 40,255 -n $GRIDNAME
# Set pointer to MAPFILE just created that will be used later
> setenv MAPFILE `pwd`/map_${GRIDNAME}_noocean_to_${GRIDNAME}_nomask_aave_da_${CDATE}.nc
# create the mapping files needed by mksurfdata_map.
> cd ../../shared/mkmapdata
> setenv GRIDFILE ../mkmapgrids/SCRIPgrid_${GRIDNAME}_nomask_${CDATE}.nc
> ./mkmapdata.sh -r $GRIDNAME -f $GRIDFILE -t regional
# create the domain file
> cd ../../../../tools/mapping/gen_domain_files/src
> ../../../scripts/ccsm_utils/Machines/configure -mach yellowstone -compiler intel
> gmake
> cd ..
> setenv OCNDOM domain.ocn_noocean.nc
> setenv ATMDOM domain.lnd.${GRIDNAME}_noocean.nc
> ./gen_domain -m $MAPFILE -o $OCNDOM -l $ATMDOM
# Save the location where the domain file was created 
> setenv GENDOM_PATH `pwd`
# Finally create the surface dataset
> cd ../../../../lnd/clm/tools/clm4_5/mksurfdata_map/src
> gmake
> cd ..
> ./mksurfdata.pl -r usrspec -usr_gname $GRIDNAME -usr_gdate $CDATE
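Each step in Example 5-5 depends on files written by the step before it, so it can help to confirm that a file exists and is non-empty before moving on. The following is a minimal sketch in POSIX sh (not the csh used above). The function report_missing is a hypothetical helper introduced here for illustration and is not part of the CESM or CLM tools.

```shell
#!/bin/sh
# report_missing is a hypothetical helper, not part of the CESM/CLM tools.
# It prints each argument that is not an existing, non-empty regular file
# and returns non-zero if at least one such argument was found.
report_missing() {
    rc=0
    for f in "$@"; do
        if [ ! -f "$f" ] || [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For example, calling report_missing "$MAPFILE" after the mknoocnmap.pl step would flag a mapping file that was not written, assuming MAPFILE is set as in Example 5-5.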

The next step is to create a case that points to the files you created above. We will still use the CLM_USRDAT_NAME option as a way to set up a case without having to add the grid to the scripts.

Example 5-6. Example of setting up a case from the single-point surface dataset just created


# First set up an environment variable that points to the top of the CESM directory.
> setenv CESMROOT <directory-of-path-to-main-cesm-directory>
# Next make sure you have an inputdata location that you can write to.
# You only need to do this step once.
> setenv MYCSMDATA $HOME/inputdata     # Set env var for the directory for input data
> $CESMROOT/scripts/link_dirtree $CSMDATA $MYCSMDATA
# Copy the file you created above to your new $MYCSMDATA location following the
# CLM_USRDAT_NAME naming convention (leave off the creation date)
> cp $CESMROOT/models/lnd/clm/tools/clm4_5/mksurfdata_map/surfdata_${GRIDNAME}_simyr1850_$CDATE.nc \
$MYCSMDATA/lnd/clm2/surfdata_map/surfdata_${GRIDNAME}_simyr1850.nc
> cd $CESMROOT/scripts
> ./create_newcase -case my_usernldatasets_test -res CLM_USRDAT -compset I1850CRUCLM45BGC \
-mach yellowstone_intel
> cd my_usernldatasets_test
> ./xmlchange DIN_LOC_ROOT=$MYCSMDATA
# Set the path to the location of gen_domain set in the creation step above
> ./xmlchange ATM_DOMAIN_PATH=$GENDOM_PATH,LND_DOMAIN_PATH=$GENDOM_PATH
> ./xmlchange ATM_DOMAIN_FILE=$ATMDOM,LND_DOMAIN_FILE=$ATMDOM
> ./xmlchange CLM_USRDAT_NAME=$GRIDNAME
> ./cesm_setup

Note: In this and previous versions of the model we have recommended using CLM_USRDAT_NAME as a way to identify your own datasets without having to enter them into the XML database. This has two downsides. First, you cannot include creation dates in your filenames, so you cannot keep track of different versions by date, and you must rename the files after you create them with mksurfdata.pl. Second, you have to use link_dirtree in order to place the files in a location outside of the usual DIN_LOC_ROOT (assuming you do not have write access to add new files to the standard location on the machine you are using). Now that user_nl files are supported for all model components, and the same domain files are read by both CLM and DATM and set using the envxml variables ATM_DOMAIN_PATH, ATM_DOMAIN_FILE, LND_DOMAIN_PATH, and LND_DOMAIN_FILE, you can use this mechanism (user_nl_clm, user_nl_datm, and those envxml variables) to point to your datasets in any location. In the future we will deprecate CLM_USRDAT_NAME and recommend user_nl_clm, user_nl_datm, and the DOMAIN envxml variables instead.
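As a rough illustration of the user_nl_clm route, the surface dataset can be named directly through the CLM namelist variable fsurdat, which lets the file keep its creation date and stay where it was created. The sketch below is in POSIX sh (not the csh used in the examples). The function set_fsurdat is a hypothetical helper introduced here and is not part of the CESM scripts; it only appends one line to a namelist file.

```shell
#!/bin/sh
# set_fsurdat is a hypothetical helper, not part of the CESM scripts.
# It appends an fsurdat setting (the CLM namelist variable for the surface
# dataset) to the given user_nl_clm file.
#   $1: path to the user_nl_clm file in the case directory
#   $2: full path to the surface dataset
set_fsurdat() {
    nl_file=$1
    surfdata_path=$2
    printf "fsurdat = '%s'\n" "$surfdata_path" >> "$nl_file"
}
```

Run from the case directory, a call such as set_fsurdat user_nl_clm "$CESMROOT/models/lnd/clm/tools/clm4_5/mksurfdata_map/surfdata_${GRIDNAME}_simyr1850_${CDATE}.nc" would point CLM at the dated file from Example 5-5, assuming those variables are still set. The domain files would still be set with xmlchange through ATM_DOMAIN_PATH, ATM_DOMAIN_FILE, LND_DOMAIN_PATH, and LND_DOMAIN_FILE, as in Example 5-6.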