6 Coding Conventions

It is recommended that all CCSM components follow a coding convention. The goal is to create code with a consistent look and feel so that it is easier to read and maintain. To date, no conventions have been specified which apply across all CCSM components. Conventions have been defined for the atmospheric model and are included below.

6.1 Fortran Coding Standard for the Community Atmospheric Model

This section defines a set of specifications, rules, and recommendations for the coding of the Community Atmospheric Model (CAM). The purpose is to provide a framework that enables users to easily understand or modify the code, or to port it to new computational environments. In addition, it is hoped that adherence to these guidelines will facilitate the exchange and incorporation of new packages and parameterizations into the model. Other works which influenced the development of this standard are "Report on Column Physics Standards" (http://nsipp.gsfc.nasa.gov/infra/) and "European Standards For Writing and Documenting Exchangeable Fortran 90 Code" (http://nsipp.gsfc.nasa.gov/infra/eurorules.html).

6.1.1 Style Rules

Preprocessor: Where the use of a language preprocessor is required, it will be the C preprocessor (cpp). cpp is available on any UNIX platform, and many Fortran compilers can run cpp automatically as part of the compilation process. All cpp tokens will be uppercase to distinguish them from Fortran code, which will be in lower case.
F90 Standard: CCM4 will adhere to the Fortran 90 language standard. The purpose is to enhance portability and to allow use of the many desirable new features of the language. If there is good reason to violate this rule and include Fortran code which is not compliant with the f90 standard, an alternate set of f90-compliant code must be provided. This is normally done through use of a C-preprocessor ifdef construct.
Free-Form Source: Free-form source will be used. The f90 standard allows lines of up to 132 characters, but a self-imposed limit of 90 should enhance readability and make life easier for those who have bad eyesight, who wish to make overheads of source code, or who print source files with two columns per page. The world will not come to an end if someone extends a line of code to column 91, but multi-line comments that extend to column 100, for example, would be unacceptable.
Loops: Loops should be structured with the do-end do construct as opposed to numbered loops.
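As an illustration of the loop rule, the following is a minimal sketch of a structured loop, with the numbered form it replaces shown in a comment. The module name loop_demo and the function sum_to_n are hypothetical and are not part of the model code.

```fortran
module loop_demo
   implicit none
   private
   public :: sum_to_n

contains

   ! Sum the integers 1..n with a structured do-end do loop.
   ! The numbered form that this replaces would read:
   !       do 10 i = 1, n
   !          total = total + i
   !    10 continue
   function sum_to_n (n) result (total)
      integer, intent(in) :: n       ! Upper limit of the sum
      integer :: total               ! Sum of 1..n
      integer :: i                   ! Loop index

      total = 0
      do i = 1, n
         total = total + i
      end do
   end function sum_to_n

end module loop_demo
```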
Argument Comments: Input arguments and local variables will be declared one per line, with a comment field consisting of a "!" character followed by the comment text, all on the same line as the declaration. Multiple comment lines describing a single variable are acceptable when necessary. Variables of a like function may be grouped together on a single line. For example:
```
            integer :: i,j,k ! Spatial indices
```
Continuation Lines: Continuation lines are acceptable on multi-dimensional array declarations which take up many columns. For example:
```
            real(r8), dimension(plond,plev), intent(in) :: & 
            array1, &! array1 is blah blah blah 
            array2   ! array2 is blah blah blah
```
Note that the f90 standard defines a limit of 39 continuation lines.
Code lines which are continuation lines of assignment statements must begin to the right of the column of the assignment operator. Similarly, continuation lines of subroutine calls and dummy argument lists of subroutine declarations must have the arguments aligned to the right of the "(" character. Examples of each of these constructs are:
```
            a = b + c*d + ... + & 
                h*g + e*f

            call sub76 (x, y, z, w, a, & 
                        b, c, d, e) 

            subroutine sub76 (x, y, z, w, a, & 
                              b, c, d, e)
```
Indentation: Code within loops and if-blocks will be indented 3 characters for readability.
Argument List Format: Routines with large argument lists will contain 5 variables per line. This applies both to the calling routine and to the dummy argument list in the routine being called. The purpose is to simplify matching up the arguments between caller and callee. In rare instances in which 5 variables will not fit on a single line, a number smaller than 5 may be used, but the per-line number must remain consistent between caller and callee. An example is:
```
            call linemsbc (u3(i1,1,1,j,n3m1), v3(i1,1,1,j,n3m1), t3(i1,1,1,j,n3m1), & 
                           q3(i1,1,1,j,n3m1), qfcst(i1,1,m,j), xxx)

            subroutine linemsbc (u, v, t, & 
                                 q, qfcst, xxx)
```
Commenting style: Short comments may be included on the same line as executable code using the "!" character followed by the description. More in-depth comments should be written in the form:
```
      ! 
      ! Describe what is going on 
      !
```
Key features of this style are: 1) the comment starts with a "!" in column 1; 2) the text starts in column 3; and 3) the text is offset above and below by a blank comment line. The blank comments could just as well be completely blank lines (i.e. no "!") if the developer prefers.
Use of the operators <, >, <=, >=, ==, /= is recommended instead of their older counterparts .lt., .gt., .le., .ge., .eq., and .ne. The motivation is readability.
Case: Code will be written in lower case. This convention cleanly segregates code from C preprocessor tokens, since the convention has been established that such tokens are all uppercase.
File Format: Embedding multiple routines within a single file and/or module is allowed, and in fact encouraged, if any of three conditions hold. First, if routine B is called by routine A and only by routine A, then the two routines may be included in the same file. This construct has the advantage that inlining B into A is often much easier for compilers if both A and B are in the same file. Practical experience with many compilers has shown that inlining when A and B are in different files is often too complicated for most people to consider worth investigating.
The second condition in which it is desirable to put multiple routines in a single file is when they are "CONTAIN"ed in a module for the purpose of providing an implicit interface block. This type of construct is strongly encouraged, as it allows the compiler to perform argument consistency checking across routine boundaries. An example is:
```
      file 1:

            subroutine driver 
            use mod1 
            real :: x, y 
            ... 
            call sub1(x,y) 
            call sub2(y) 
            return 
            end subroutine

      file 2:

            module mod1 
            private 
            real :: var, var2 
            public sub1, sub2 

            contains 
            subroutine sub1(a,b) 
            ... 
            return 
            end subroutine 

            subroutine sub2(a) 
            ... 
            return 
            end subroutine 

            end module
```
The number, type, and dimensionality of the arguments passed to sub1 and sub2 are automatically checked by the compiler.
The final reason to store multiple routines and their data in a single module is that the scope of the data defined in the module can be limited to only the routines which are also in the module. This is accomplished with the "private" clause.
If none of the above conditions hold, it is not acceptable to simply glue together a bunch of functions or subroutines in a single file.
Module Names: Modules MUST be named the same as the file in which they reside. The reason to enforce this as a hard rule is that the dependency rules used by "make" programs are based on file names. For example, if routine A "USE"s module B, then "make" must be told of the dependency relation which requires B to be compiled before A. If one can assume that module B resides in a file named B, which compiles to B.o, building a tool to generate this dependency rule (e.g. A.o: B.o) is quite simple. Put another way, it is difficult (to say nothing of CPU-intensive) to search an entire source tree to find the file in which module B resides for each routine or module which "USE"s B.
Note that by implication multiple modules are not allowed in a single file.
Common blocks are superseded by modules in Fortran 90, and their continued use in the CCM is strongly discouraged. Modules are a better way to declare static data. Among the advantages of modules are the ability to freely mix data of various types and the ability to limit access to contained variables through use of the ONLY and PRIVATE clauses.
Array Syntax: The use of array syntax is not encouraged. Though compact and concise, many compilers have trouble generating efficient code from source written in this notation. A good general rule is that little or no performance penalty will result from writing (or rewriting) loops containing only a few statements in array syntax. Long, complicated loops containing much work are best coded as explicitly indexed loops.
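To make the distinction concrete, here is a small sketch of the same short operation written both ways. The module name array_syntax_demo and its routines are hypothetical, and the kind parameter r8 is defined locally only so the example stands alone.

```fortran
module array_syntax_demo
   implicit none
   private
   public :: r8, scale_array_syntax, scale_explicit

   integer, parameter :: r8 = selected_real_kind(12)   ! Local stand-in for r8

contains

   ! Short operation: array syntax is compact and unlikely to cost performance
   subroutine scale_array_syntax (n, factor, a)
      integer, intent(in) :: n                 ! Array length
      real(r8), intent(in) :: factor           ! Scale factor
      real(r8), intent(inout) :: a(n)          ! Array scaled in place

      a(:) = factor*a(:)
   end subroutine scale_array_syntax

   ! The same operation as an explicitly indexed loop, the form preferred
   ! once a loop body grows long or complicated
   subroutine scale_explicit (n, factor, a)
      integer, intent(in) :: n                 ! Array length
      real(r8), intent(in) :: factor           ! Scale factor
      real(r8), intent(inout) :: a(n)          ! Array scaled in place
      integer :: i                             ! Loop index

      do i = 1, n
         a(i) = factor*a(i)
      end do
   end subroutine scale_explicit

end module array_syntax_demo
```

Both routines produce the same result; the choice between them is one of readability and compiler efficiency, not of correctness.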

6.1.2 Content Rules

Implicit None: All subroutines and functions will include an "implicit none" statement. Thus all variables must be explicitly typed.

Prologues: Each function, subroutine, or module will include a prologue instrumented for use with the ProTeX auto-documentation script (http://dao.gsfc.nasa.gov/software/protex). The purpose is to describe what the code does, possibly referring to external documentation. The prologue formats for functions and subroutines, modules, and header files are shown below. In addition to the keywords in these templates, ProTeX also recognizes the following:

	!BUGS:
	!SEE ALSO:
	!SYSTEM ROUTINES:
	!FILES USED:
	!REMARKS:
	!TO DO:
	!CALLING SEQUENCE:
	!CALLED FROM:
	!LOCAL VARIABLES:

These keywords may be used at the developer's discretion.

Prologue for Functions and Subroutines

If the function or subroutine is included in a module, the keyword !IROUTINE should be used instead of !ROUTINE.

      !----------------------------------------------------------------------- 
      ! BOP
      !
      ! !ROUTINE:  <Function name>  (!IROUTINE if the function is in a module)
      ! 
      ! !INTERFACE:  
                  function <name> (<arguments>)          
      
      ! !USES:
                  use <module>

      ! !RETURN VALUE:
                  implicit none
                  <type> :: <name>               ! <Return value description>

      ! !PARAMETERS:   
                  <type, intent> :: <parameter>  ! <Parameter description>              

      ! !DESCRIPTION: 
      ! <Describe the function of the routine and algorithm(s) used in 
      ! the routine.  Include any applicable external references.> 
      ! 
      ! !REVISION HISTORY:
      ! YY.MM.DD  <Name> <Description of activity> 
      ! 
      ! EOP
      !-----------------------------------------------------------------------
      ! $Id: code_conv_cam.tex,v 1.3 2001/06/19 21:44:14 kauff Exp $
      ! $Author: kauff $
      !-----------------------------------------------------------------------

Prologue for a Module

      !----------------------------------------------------------------------- 
      ! BOP
      !
      ! !MODULE:  <Module name>
      ! 
      ! !USES:
           	  use <module>
      ! !PUBLIC TYPES:
                  implicit none
                  [save]
 
                  <type declaration>
      
      ! !PUBLIC MEMBER FUNCTIONS:
      !           <function>                     ! Description      
      !
      ! !PUBLIC DATA MEMBERS:
                  <type> :: <variable>           ! Variable description

      ! !DESCRIPTION: 
      ! <Describe the function of the module.> 
      ! 
      ! !REVISION HISTORY:
      ! YY.MM.DD  <Name> <Description of activity> 
      ! 
      ! EOP
      !-----------------------------------------------------------------------
      ! $Id: code_conv_cam.tex,v 1.3 2001/06/19 21:44:14 kauff Exp $
      ! $Author: kauff $
      !-----------------------------------------------------------------------

Prologue for a Header File

      !----------------------------------------------------------------------- 
      ! BOP
      !
      ! !INCLUDE:  <Header file name>
      ! 
      ! !DEFINED PARAMETERS:
                  <type> :: <parameter>          ! Parameter description

      ! !DESCRIPTION: 
      ! <Describe the contents of the header file.> 
      ! 
      ! !REVISION HISTORY:
      ! YY.MM.DD  <Name> <Description of activity> 
      ! 
      ! EOP
      !-----------------------------------------------------------------------
      ! $Id: code_conv_cam.tex,v 1.3 2001/06/19 21:44:14 kauff Exp $
      ! $Author: kauff $
      !-----------------------------------------------------------------------

I/O Error Conditions: I/O statements which need to check an error condition will use the "iostat=<integer variable>" construct instead of the outmoded end= and err=. Note that a 0 value means success, a positive value means an error has occurred, and a negative value means the end of record or end of file was encountered.
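A minimal sketch of the iostat construct follows. It reads one integer per record from an already-open unit and distinguishes the three cases described above. The module name iostat_demo and the routine count_values are hypothetical.

```fortran
module iostat_demo
   implicit none
   private
   public :: count_values

contains

   ! Count the integers readable from an already-open unit, one per record
   subroutine count_values (iunit, nvals, ierr)
      integer, intent(in) :: iunit     ! Unit number of an open formatted file
      integer, intent(out) :: nvals    ! Number of values read successfully
      integer, intent(out) :: ierr     ! 0 = clean end of file, >0 = read error
      integer :: ios                   ! Status returned by the read statement
      integer :: val                   ! Value just read (discarded)

      nvals = 0
      ierr = 0
      do
         read (iunit, *, iostat=ios) val
         if (ios < 0) exit             ! Negative: end of file reached
         if (ios > 0) then             ! Positive: an error occurred
            ierr = ios
            exit
         end if
         nvals = nvals + 1             ! Zero: the read succeeded
      end do
   end subroutine count_values

end module iostat_demo
```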

Intent: All dummy arguments must include the INTENT clause in their declaration. This is extremely valuable to someone reading the code, and can be checked by compilers. An example is:

            subroutine sub1 (x, y, z) 
            implicit none 
            real(r8), intent(in) :: x 
            real(r8), intent(out) :: y 
            real(r8), intent(inout) :: z 

            y = x 
            z = z + x 

            return 
            end

6.1.3 Package Coding Rules

The term "package" in the following rules refers to a routine or group of routines which takes a well-defined set of input and produces a well-defined set of output. A package can be large, such as a dynamics package, which computes large scale advection for a single timestep. It can also be relatively small, such as a parameterization to compute the effects of gravity wave drag.

Self-containment: A package should refer only to its own modules and subprograms and to those intrinsic functions included in the Fortran 90 standard. This is crucial to attaining plug-compatibility. An exception to the rule might occur when a given computation needs to be done in a consistent manner throughout the model. Thus, for example, a package which requires saturation vapor pressure would be allowed to call a generic routine used elsewhere in the main model code to compute this quantity.
When exceptions to the above rule apply (i.e. routines are required by a package which are not f90 intrinsics or part of the package itself), the required routines which violate the rule must be specified within the package.
Single entry point: A package shall provide separate setup and running procedures, each with a single entry point. All initialization of time-invariant data must be done in the setup procedure and these data must not be altered by the running procedure. This distinction is important when the code is being run in a multitasked environment. For example, constructs of the following form will not work when they are multitasked:
```
            subroutine sub 
            logical first/.true./ 
            if (first) then 
               first = .false. 
               <set time-invariant values> 
            end if
```
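By contrast, a package that separates setup from running might be sketched as follows. This is only an illustration under assumed names: the module gw_drag_sketch, its routines gw_setup and gw_run, and the coefficient coef are hypothetical, and the tendency formula is a placeholder rather than a real parameterization.

```fortran
module gw_drag_sketch
   implicit none
   private
   save
   public :: gw_setup, gw_run

   integer, parameter, public :: r8 = selected_real_kind(12)

   real(r8) :: coef              ! Time-invariant data, set only in gw_setup

contains

   ! Setup entry point: called once, before any multitasked region
   subroutine gw_setup (coef_in)
      real(r8), intent(in) :: coef_in     ! Value for the drag coefficient

      coef = coef_in
   end subroutine gw_setup

   ! Running entry point: reads coef but never alters it
   subroutine gw_run (n, u, tend)
      integer, intent(in) :: n            ! Number of points
      real(r8), intent(in) :: u(n)        ! Input wind
      real(r8), intent(out) :: tend(n)    ! Placeholder tendency
      integer :: i                        ! Loop index

      do i = 1, n
         tend(i) = -coef*u(i)
      end do
   end subroutine gw_run

end module gw_drag_sketch
```

Because gw_run only reads the module data, several tasks can call it concurrently without the race on a "first" flag that the construct above suffers from.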
Communication: All communication with the package will be through the argument list or namelist input. The point behind this rule is that packages should not have to know details of the surrounding model data structures, or the names of variables outside of the package. A notable exception to this rule is model resolution parameters. The reason for the exception is to allow compile-time array sizing inside the package. This is often important for efficiency.
Precision: Parameterizations should not rely on vendor-supplied flags to supply a default floating point precision or integer size. The f90 "kind" feature should be used instead. For example, in CCM4, all routines and modules USE a module named "precision" which defines:
```
            integer, parameter :: r8 = selected_real_kind(12) 
            integer, parameter :: i8 = selected_int_kind(13)
```
Thus, any variable declared real(r8) will be of sufficient size to maintain 12 decimal digits in its mantissa. Likewise, integer variables declared integer(i8) will be able to represent an integer of at least 13 decimal digits. Note that the names r8 and i8 defined above are meant to reflect the size in bytes of variables which are subsequently defined with that "kind" value.
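A sketch of what such a module could look like is given below. Only the two parameter definitions come from the text above; the helper function kinds_available is a hypothetical addition that relies on the fact that the selected_real_kind and selected_int_kind intrinsics return a negative value when no kind on the platform satisfies the request.

```fortran
module precision
   implicit none
   private
   public :: r8, i8, kinds_available

   integer, parameter :: r8 = selected_real_kind(12)   ! Real: >= 12 decimal digits
   integer, parameter :: i8 = selected_int_kind(13)    ! Integer: >= 13 decimal digits

contains

   ! Hypothetical helper: report whether both requested kinds exist here
   function kinds_available () result (ok)
      logical :: ok                   ! True if both kinds are supported

      ok = (r8 > 0) .and. (i8 > 0)
   end function kinds_available

end module precision
```

A setup routine could call kinds_available once and terminate the run if it returns false, rather than discovering the problem through a compile failure on an unfamiliar platform.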
Bounds checking: All parameterizations must be able to run when a compile-time and/or run-time array bounds checking option is enabled. Thus, constructs of the following form are disallowed:
```
            real(r8) :: arr(1)
```
where "arr" is an input argument into which the user wishes to index beyond 1. Use of the (*) construct in array dimensioning to circumvent this problem is forbidden because it effectively disables array bounds checking.
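One way to stay compatible with bounds checking is to pass the array extent explicitly, as in the following sketch. The module name bounds_demo and the function array_sum are hypothetical; r8 is defined locally only so the example stands alone.

```fortran
module bounds_demo
   implicit none
   private
   public :: r8, array_sum

   integer, parameter :: r8 = selected_real_kind(12)   ! Local stand-in for r8

contains

   ! The disallowed forms would declare arr(1) or arr(*) and index past the
   ! declared bound. Passing the extent lets bounds checking see the true size.
   function array_sum (n, arr) result (total)
      integer, intent(in) :: n            ! Number of elements in arr
      real(r8), intent(in) :: arr(n)      ! Explicit-shape dummy argument
      real(r8) :: total                   ! Sum of the n elements
      integer :: i                        ! Loop index

      total = 0.0_r8
      do i = 1, n
         total = total + arr(i)
      end do
   end function array_sum

end module bounds_demo
```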
Error conditions: When an error condition occurs inside a package, a message describing what went wrong will be printed. The name of the routine in which the error occurred must be included. It is acceptable to terminate execution within a package, but the developer may instead wish to return an error flag through the argument list. If the user wishes to terminate execution within the package, the generic CCM termination routine "endrun" should be called instead of issuing a Fortran "stop". Otherwise a message-passing version of the model could hang. Note that this is an exception to the package coding rule that "A package should refer only to its own modules and subprograms and to those intrinsic functions included in the Fortran 90 standard".
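The error-flag alternative might be sketched as follows. The module err_demo, the routine check_temperature, and the meaning of the flag values are all hypothetical; the point is only that the message names the routine and that the caller decides whether to terminate.

```fortran
module err_demo
   implicit none
   private
   public :: r8, check_temperature

   integer, parameter :: r8 = selected_real_kind(12)   ! Local stand-in for r8

contains

   ! Validate an input and report failure through the argument list
   subroutine check_temperature (t, ierr)
      real(r8), intent(in) :: t        ! Temperature (K)
      integer, intent(out) :: ierr     ! 0 = ok, 1 = invalid temperature

      ierr = 0
      if (t <= 0.0_r8) then
         write (*,*) 'check_temperature: non-positive temperature, t = ', t
         ierr = 1
      end if
   end subroutine check_temperature

end module err_demo
```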
Inter-procedural code analysis: Use of a tool to diagnose problems such as array size mismatches, type mismatches, variables which are defined but not used, etc. is strongly encouraged. Flint is one such tool which has proved valuable in this regard. It is not a strict rule that all CCM4 code and packages must be "flint-free", but the developer must be able to provide adequate explanation for why a given coding construct should be retained even though it elicits a complaint from flint. If too many complaints are issued, the diagnostic value of the tool diminishes toward zero.
Memory management: The use of dynamic memory allocation is not discouraged because we realize that there are many situations in which run-time array sizing is desirable. However, this type of memory allocation can cause performance problems on some machines, and some debuggers get confused when trying to diagnose the contents of such variables. Therefore, dynamic memory allocation is allowed only "when necessary". The ability to run a code at a different spatial resolution without recompiling is not considered to be an adequate reason to use dynamically allocated arrays.
The preferable mechanism for dynamic memory allocation is automatic arrays, as opposed to ALLOCATABLE or POINTER arrays for which memory must be explicitly allocated and deallocated. An example of an automatic array is:
```
            subroutine sub(n) 
            real :: a(n) 
            ... 
            return 
            end
```
The same routine using an allocatable array would look like:
```
            subroutine sub(n) 
            real, allocatable :: a(:) 
            allocate(a(n)) 
            ... 
            deallocate(a) 
            return 
            end
```
Constants and magic numbers: Magic numbers should be avoided. Physical constants (e.g. pi, gas constants) must NEVER be hardwired into the executable portion of a code. Instead, a mnemonically named variable or parameter should be set to the appropriate value, probably in the setup routine for the package. We realize that many parameterizations rely on empirically derived constants or fudge factors, which are not easy to name. In these cases it is not forbidden to leave such factors coded as magic numbers buried in executable code, but comments should be included referring to the source of the empirical formula.
Hard-coded numbers should never be passed through argument lists. One good reason for this rule is that a compiler flag which defines a default precision for constants cannot be guaranteed to exist. Fortran 90 allows specification of the precision of constants through the "_" compile-time operator (e.g. 3.14_r8 or 365_i8). So if you insist on passing a constant through an argument list, you must also include a precision specification. If this is not done, a called routine which declares the resulting dummy argument as, say, real(r8) or 8 bytes, will produce erroneous results if the default floating point precision is 4 byte reals.
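The following sketch illustrates both points: constants are given mnemonic names, and a named parameter rather than a bare literal is passed through the argument list. The module const_demo and its contents are hypothetical, and the values are chosen only for illustration.

```fortran
module const_demo
   implicit none
   private
   public :: r8, days_per_year, seconds_in_days, year_in_seconds

   integer, parameter :: r8 = selected_real_kind(12)       ! Local stand-in for r8

   real(r8), parameter :: days_per_year = 365.0_r8         ! Length of model year
   real(r8), parameter :: seconds_per_day = 86400.0_r8     ! Length of day (s)

contains

   ! Convert a number of days to seconds
   function seconds_in_days (ndays) result (secs)
      real(r8), intent(in) :: ndays    ! Number of days
      real(r8) :: secs                 ! Equivalent number of seconds

      secs = ndays*seconds_per_day
   end function seconds_in_days

   ! Length of the model year in seconds
   function year_in_seconds () result (secs)
      real(r8) :: secs                 ! Seconds in one model year

      secs = seconds_in_days(days_per_year)   ! Named parameter, not a bare literal
   end function year_in_seconds

end module const_demo
```

If a literal must be passed directly, it should carry the kind suffix, as in seconds_in_days(365.0_r8), and never the unsuffixed form 365.0.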
