EXPERTS: How do I carry out data assimilation using CAM and DART?

Ensemble Kalman filter data assimilation (DA) can now be conducted within the software framework of CESM. This form of DA uses the multi-instance capability of CESM, in which CESM advances an ensemble of model states of one or more CESM components to a common forecast time at which observations are available. The ensemble of forecast model states is then passed to the Data Assimilation Research Testbed (DART), where each state is adjusted toward the observations that are available at that time. For details of this process, see an introduction in BAMS (2009) and/or the DART home page. DART then passes the ensemble of adjusted model states back to CESM to be used as initial conditions for the next forecast.
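The forecast/adjust cycle described above can be illustrated with a toy example. This is only a minimal sketch for a single, directly observed scalar with Gaussian errors; it is not DART's code or interface. The names adjust_ensemble, toy_forecast, and cycle are hypothetical, and toy_forecast is a stand-in for the CESM model advance.

```python
from statistics import fmean, variance


def adjust_ensemble(prior, obs, obs_var):
    """Adjust a scalar ensemble toward one observation (illustrative only).

    Combines the Gaussian implied by the prior ensemble with a Gaussian
    observation error, then shifts and contracts the members so that the
    ensemble has the resulting posterior mean and variance.
    """
    prior_mean = fmean(prior)
    prior_var = variance(prior)  # sample variance (divides by n - 1)
    if prior_var == 0.0:
        return list(prior)  # no spread: the ensemble cannot be adjusted
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    scale = (post_var / prior_var) ** 0.5
    return [post_mean + scale * (x - prior_mean) for x in prior]


def toy_forecast(x):
    """Hypothetical stand-in for advancing one ensemble member."""
    return 0.9 * x + 1.0


def cycle(ensemble, observations, obs_var):
    """Alternate forecast and adjustment steps, one per observation time."""
    for obs in observations:
        ensemble = [toy_forecast(x) for x in ensemble]       # model advance
        ensemble = adjust_ensemble(ensemble, obs, obs_var)   # assimilation
    return ensemble
```

Each adjusted ensemble becomes the initial condition for the next forecast, which mirrors the hand-off between CESM and DART. A real assimilation also has to map model states to observation locations and update unobserved variables, which this sketch omits.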

The references above describe the many uses of ensemble data assimilation.

This use case outlines assimilation for a CAM (F comp set) build only. Assimilation is possible with the ocean component (B comp sets), and experimental assimilations with the land component (I comp sets) have been conducted. Additional use case descriptions will be added to cover those and any future evolution of the CESM+DART software. This use case assumes that the user is familiar with setting up and using CESM, and is willing to learn how to set up and use DART in the CESM context. There is no simple example which users can grab and run, because understanding what is being run is crucial to success and there are many choices to be made.

The major steps of assimilating observations into CAM follow.

  1. Download DART. DART relieves researchers of the need to develop data assimilation capabilities, but familiarity with data assimilation and the DART facility is required in order to use it productively. That familiarity can be gained through the DART tutorial.

  2. Build the DART executables for a simple model to check that DART has been installed correctly.

  3. Build the DART executables for CAM, following a procedure similar to step 2.

  4. The script .../DART/models/cam/CESM_setup.csh builds a CAM which combines the user's desired features and DART's required features. The characteristics of the CAM and assimilation set in CESM_setup.csh are:

    • locations of the build, run, and archive directories,

    • features of the $CASE to be built,

    • locations of input files, including the initial ensemble of CAM (and CLM and CICE) states,

    • date and timing characteristics of the assimilation,

    • machine and resource characteristics.

      
    - Copy CESM_setup.csh to the directory where the user wants to build CAM.
    - Edit that CESM_setup.csh to set most of the assimilation parameters.
    - Run CESM_setup.csh.

  5. Set the rest of the assimilation parameters:

    
    - cd to $CASE.
    - Edit input.nml to set other characteristics of the assimilation. For details see the online help pages or the HTML documentation in the user's $DART/filter/filter.html#GettingStarted.
    - Edit assimilate.csh to set the location of the observations to be assimilated. Sets of real observations are available for use, or synthetic observations can be created using the user's model.

  6. Submit the job using $CASE.submit in the $CASEROOT directory.

  7. Output from the assimilation is handled by the CESM archiver(s), which have been modified to handle DART output. Output appears in a new short-term archive directory .../archive/.../dart/hist. The three files created at each assimilation time are:

    • Prior_Diag.YYYY-MM-DD-SSSSS.nc: the ensemble mean, spread, members (optionally), and 'inflation' fields from before the assimilation (at the end of the forecast).

    • Posterior_Diag.YYYY-MM-DD-SSSSS.nc: same as Prior, but from after the assimilation.

    • obs_seq.YYYY-MM-DD-SSSSS.final: the actual observations assimilated and the ensemble members' estimates of those observations.

    
    The obs_seq.final files are usually processed by the obs_diag program in DART (.../DART/diagnostics/threed_sphere/obs_diag.f90), and the resulting NetCDF files are usually processed with Matlab scripts included in DART (or similar). Little knowledge of Matlab is needed to use them. The Prior and Posterior files can be examined with any NetCDF viewing tool.
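When post-processing many assimilation times, it can help to generate the expected output file names from a date. The sketch below assumes that SSSSS is the number of seconds since midnight, zero-padded to five digits; that reading of the pattern is an assumption here and should be checked against the files actually present in the archive directory. The helper names dart_timestamp and dart_output_names are hypothetical and are not part of DART or CESM.

```python
from datetime import datetime


def dart_timestamp(when):
    """Format a datetime as YYYY-MM-DD-SSSSS (assumed: seconds since midnight)."""
    seconds = when.hour * 3600 + when.minute * 60 + when.second
    return f"{when:%Y-%m-%d}-{seconds:05d}"


def dart_output_names(when):
    """Return the three file names expected for one assimilation time."""
    stamp = dart_timestamp(when)
    return [
        f"Prior_Diag.{stamp}.nc",
        f"Posterior_Diag.{stamp}.nc",
        f"obs_seq.{stamp}.final",
    ]
```

For example, dart_output_names(datetime(2008, 11, 1, 6)) yields names stamped 2008-11-01-21600, since 06:00 is 21600 seconds after midnight. The date itself is arbitrary and chosen only for illustration.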