EXPERTS: How do I add a new CESM model component?

The following provides a very general overview of what you need to do to add a new atm, lnd, ocn, ice, glc or rof component to CESM.

  1. You must support init, run, and final top level interfaces.

  2. You must "send" the component grid and decomposition at initialization.

  3. You must pack and unpack coupling fields to/from interface datatypes.

  4. You must integrate forward a fixed amount of time and confirm that your model is in sync with the driver clock.

  5. You must provide/use "expected" scalar information as needed. - provide present/prognostic flags at initialization - provide "nextsw_cday" if atm component, use nextsw_cday if surface model - use mpicom - use stop and restart information - use inst_name, inst_index, inst_suffix (for cesm[version]) - use other infodata information as needed (ie. starttype, case_name, configuration settings like aqua_planet, orbital settings)

  6. Use I/O unit manager in CESM

  7. You must meet filename conventions for history, restart, and log files

There are some component to component variations in the interfaces and in the provide/use of scalar data, so it's best to follow another component of the same flavor. The top level interface will be a fortran file called ***_comp_mct.F90 where *** is atm, ocn, ice, or lnd. That file exists in all components. Below is a generic summary of what is going on using the atm component as an example. Other components are very close.

  1. You must support init, run, and final top level interfaces. The interfaces must follow the naming convention and argument types exactly. These interfaces must be in a file called atm_comp_mct.F90 and the module must be called atm_comp_mod. The driver will access the component model only through the init, run, and final interfaces.

    
      module atm_comp_mct
    
          public :: atm_init_mct
          public :: atm_run_mct
          public :: atm_final_mct
    
          subroutine atm_init_mct( EClock, cdata_a, x2a_a, a2x_a, NLFilename )
          type(ESMF_Clock),intent(in)                 :: EClock
          type(seq_cdata), intent(inout)              :: cdata_a
          type(mct_aVect), intent(inout)              :: x2a_a
          type(mct_aVect), intent(inout)              :: a2x_a   
          character(len=*), optional,   intent(IN)    :: NLFilename ! Namelist filename
    
          subroutine atm_run_mct( EClock, cdata_a, x2a_a, a2x_a)
          type(ESMF_Clock)            ,intent(in)    :: EClock
          type(seq_cdata)             ,intent(inout) :: cdata_a
          type(mct_aVect)             ,intent(inout) :: x2a_a
          type(mct_aVect)             ,intent(inout) :: a2x_a
    
          subroutine atm_final_mct( )
    

  2. You must "send" the component grid and decomp at initialization The cdata datatype contains data for a grid and decomp. The decomp is an mct gsmap and the grid is an general grid. To access these data type from the init method, do the following.

    
      type(mct_gsMap), pointer   :: gsMap_atm
          type(mct_gGrid), pointer   :: dom_a
    
          call seq_cdata_setptrs(cdata_a, gsMap=gsMap_atm, dom=dom_a)
    
          ! call an mct_gsmap_init method and specify the global index
          !   of each local gridcell
          ! call an mct_gGrid_init method and fill the lon/lat/area/mask/frac
          !   arrays
    

  3. You must pack and unpack coupling fields to/from interface datatypes The fields coupling datatypes are mct attribute vectors. x2a contains the coupler->atm fields. a2x contains the atm->coupler fields. This datatype must be initialized by the component in the init method. To do that use the fields list provided by seq_flds_mod and initialize the gsmap first. lsize below is the local number of gridcells on the processor. the mct_aVect_init calls below allocate arrays in the attribute vector to store the appropriate number of fields of appropriate local size.

    
      use seq_flds_mod
    
          lsize = mct_gsMap_lsize(gsMap_atm, mpicom_atm)
          call mct_aVect_init(a2x_a, rList=seq_flds_a2x_fields, lsize=lsize)
          call mct_aVect_zero(a2x_a)
          call mct_aVect_init(x2a_a, rList=seq_flds_x2a_fields, lsize=lsize)
          call mct_aVect_zero(x2a_a)
    
    To pack the data, it's easiest just to write directly into the arrays inside the attribute vector in the following manner,
    
      integer :: index_a2x_Sa_pslv         ! sea level atm pressure
      
          index_a2x_Sa_pslv  = mct_avect_indexra(a2x_a,'Sa_pslv')
    
          do i=is,ie
             a2x_a%rAttr(index_a2x_Sa_pslv ,i) = psl(i)
             a2x_a%rAttr(index_a2x_Sa_z    ,i) = zbot(i)
             a2x_a%rAttr(index_a2x_Sa_u    ,i) = ubot(i)
             a2x_a%rAttr(index_a2x_Sa_v    ,i) = vbot(i)
          enddo
    
    To unpack, basically do the same thing in the opposite direction.
    
      integer :: index_x2a_Sx_t            ! surface temperature
      
          index_x2a_Sx_t  = mct_avect_indexra(x2a_a,'Sx_t')
    
          do i=is,ie
             ts(i) = x2a_a%rAttr(index_x2a_Sx_t ,i)
          enddo
    

    The attribute vectors store only the "local" data and you basically just need to copy data from the model datatype to the coupling MCT (or ESMF) datatype.

  4. You must integrate forward a fixed amount of time and confirm that your model is in sync with the driver clock. You can access clock information from the EClock passed through the coupling interfaces and the EClockGetData method. The approach to sync the model and driver clock is very model specific. In some cases, a model may just get the current time from the EClock and use it. That probably only happens for models that don't advance in time (like maybe a data model). In other cases, model clocks may be initialized based on the EClock data at initialization and then the model time and driver time are regularly compared for consistency. Another approach is to use the "dt" provided by the EClock to advance a model "dt" seconds. Some examples are provided.

    
      type(ESMF_Clock)            ,intent(in)    :: EClock
    
          ! EClock initialization data
          call seq_timemgr_EClockGetData(EClock, &
             start_ymd=start_ymd, start_tod=start_tod, &
             ref_ymd=ref_ymd, ref_tod=ref_tod,         &
             stop_ymd=stop_ymd, stop_tod=stop_tod,     &
             calendar=calendar )
    
          ! EClock dt
          call seq_timemgr_EClockGetData(Eclock,dtime=atm_cpl_dt)
    
          ! EClock current time
          call seq_timemgr_EClockGetData(EClock,curr_ymd=ymd_sync,curr_tod=tod_sync, &
             curr_yr=yr_sync,curr_mon=mon_sync,curr_day=day_sync)
    
          ! Check synchronization
          ymd=model_ymd
          tod=model_tod
          if (.not. seq_timemgr_EClockDateInSync( EClock, ymd, tod)) then
             write(*,*) "Clocks not in sync"
             call shr_sys_abort()
          endif
    

  5. You must provide/use "expected" scalar information as needed. Each component varies a bit wrt what's provided and used.

    • Provide present/prognostic flags at initialization. The logical flag present means the component provides data. The logical flag prognostic means the component uses data. Stub models set both to false. Data models generally set present to true and prognostic to false, except in cases where a data model needs some coupling data for some internal computations (e.g. DOCN-SOM). Active models generally set both flags to true.

      
        call seq_infodata_PutData(infodata, atm_present=.true.)
              call seq_infodata_PutData(infodata, atm_prognostic=.true.)
      

    • Provide "nextsw_cday" if you are adding an atm component. Use nextsw_cday if you are adding a surface model. The real variable nextsw_cday is the time of the next atm radiation calculation if it occurs at the next coupling period. If there is no radiation calculation on the next timestep, this should be set to -1. This allows surface albedos and the atm radiation calculation to stay synced up. The surface models use the nextsw_day field and compute albedos based on that time.

      
        call seq_infodata_PutData( infodata, nextsw_cday=nextsw_cday )
      

    • Use the mpi communicator, mpicom, which is provided by the driver to the component. This must be used for all internal model communication. The data is provided in the cdata datatype.

      
        call seq_cdata_setptrs(cdata_a, mpicom=mpicom_atm)
      

    • Use stop and restart information provided by the driver. The driver will tell the component if this is the last coupling period and/or if a restart is required at the end of this coupling period. The component should listen to both these flags.

      
        stop_now = seq_timemgr_StopAlarmIsOn(EClock)
              restart_now = seq_timemgr_RestartAlarmIsOn(EClock)
      

    • Use inst_name, inst_index, inst_suffix needed for for the multiple instance capability. As a starting point, the following provides standard code that can be added:

    • 
        integer(IN)   :: COMPID                ! mct comp id
              integer       :: inst_index            ! number of current instance (ie. 1)
              character(len=16) :: inst_name         ! fullname of current instance (ie. "lnd_0001")
              character(len=16) :: inst_suffix       ! char string associated with instance 
      
              call seq_cdata_setptrs(cdata, ID=COMPID)
              inst_name   = seq_comm_name(COMPID)
              inst_index  = seq_comm_inst(COMPID)
              inst_suffix = seq_comm_suffix(COMPID)
      

    • Use other infodata information as needed (ie. starttype, case_name, configuration settings like aqua_planet, orbital settings)

  6. Use the I/O unit manager in CESM. To avoid conflicts in I/O unit numbers between components, models should call the shr_file_getUnit and shr_file_freeUnit methods to acquire and release available unit numbers.

    
       nunit = shr_file_getUnit()
           open(nunit,file='xyz')
           read(nunit,*) xyz
           close(nunit)
           call shr_file_freeUnit(nunit)
    

  7. Meet filename conventions for history, restart, and log files. There are specific filename conventions for CESM. In particular, all history files should be CF compliant netcdf. This format is also recommended for restart files. The format is something like

    
     $CASE.atm.ha.2001-01.nc         ! history
         $CASE.atm.r.2001-01-00000.nc    ! restart
    
    The log files are set by a shared method. In particular, models should do something like
    
      !--- open log file ---
          if (my_task == master_task) then
             logUnit = shr_file_getUnit()
             call shr_file_setIO('atm_modelio.nml',logUnit)
          else
             logUnit = 6
          endif
    
    That logunit value should then be used by all processors in the model to write "stdout" messages. shr_file_setIO associates the logUnit number with a unique log filename for the case. unit 6 is used on non-root processors and information written from those processors goes to a stdout file which is machine dependent.