

User's Guide to NCAR CAM2.0


1. Introduction


1.3 Troubleshooting Guide

This section presents information that should help with some common problems users encounter when running CAM.

1.3.1 Platforms CAM2.0 ported to

CAM2.0 is fully ported and supported on the IBM-SP, SGI-Origin, Solaris, and Compaq-alpha-cluster platforms, and on Linux-PC with any of three compiler combinations: the Portland Group FORTRAN-90 and "C" compilers, the Lahey compiler, or the Portland Group FORTRAN-90 compiler with the GNU "C" compiler.

1.3.2 Known problems

1.3.3 General

When a model run fails, the operating system or CAM itself often provides an error message that helps identify the source of the problem and may point the way to a solution. The first step in troubleshooting a failed model run is to check the basics. Look at the logs for error messages. Make sure the model executable is up to date with any source code changes. If you are unsure of the state of any code, rebuild the model cleanly (i.e., issue a "gmake clean" before rerunning the script). Ask yourself what has changed since the last successful run.
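The step of looking through the logs for error messages can be sketched in Python. This is an illustrative sketch, not a tool shipped with CAM: the function name "find_error_lines" is hypothetical, and the search patterns are assumptions, since the exact strings printed by CAM and the operating system vary by platform.

```python
import re
import sys

# Assumed patterns only: adjust them to the messages your platform really prints.
ERROR_PATTERN = re.compile(r"error|abort|segmentation fault|core dumped", re.IGNORECASE)


def find_error_lines(log_text):
    """Return (line_number, line) pairs for lines that look like error messages."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ERROR_PATTERN.search(line)
    ]


if __name__ == "__main__":
    # Usage: python scan_log.py <path to a model log file>
    with open(sys.argv[1]) as log_file:
        for number, line in find_error_lines(log_file.read()):
            print(f"{number}: {line}")
```

A match only flags a line worth reading; it does not diagnose the failure.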

At other times CAM may fail for no obvious reason, or the error message returned may be cryptic or misleading. In our experience, the majority of such failures can be attributed to an incorrect allocation of hardware and/or software resources (e.g., the user sets $OMP_NUM_THREADS to a value inconsistent with the number of physical CPUs per node). Often the default settings for machine resources are insufficient. Most often, an incorrect setting for the per-thread stack size causes the model to fail with a segmentation fault, allocation error, or stack pointer error. The default setting for this resource is usually too low and must be raised by setting the appropriate environment variables. Values in the range of 40-70 Mbytes seem to work well on most architectures. As a simple troubleshooting step, the user may try adjusting this resource, or the process stack size, for their particular application. Suggested runtime resource settings affecting the process and/or thread stack sizes are listed in section 1.3.4.
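The consistency checks described in the paragraph above can be sketched as a short Python script. This is an illustration only; it is not part of the CAM distribution and is not the per-platform list of settings. The function names "check_resources" and "current_settings" are hypothetical, "inconsistent" is interpreted here as more threads than CPUs, and the 40 Mbyte floor is taken from the range quoted above. The script reads the process stack limit only; the per-thread stack size is controlled by platform-specific environment variables that this sketch does not inspect.

```python
import os

MB = 1024 * 1024
STACK_MIN_MB = 40  # lower end of the 40-70 Mbyte range quoted in the text


def check_resources(omp_threads, cpus_per_node, stack_bytes):
    """Return warnings for settings that contradict the guidance above.

    omp_threads may be None (variable not set); stack_bytes may be None (unlimited).
    """
    warnings = []
    if omp_threads is not None and omp_threads > cpus_per_node:
        warnings.append(
            f"OMP_NUM_THREADS={omp_threads} exceeds {cpus_per_node} CPUs per node"
        )
    if stack_bytes is not None and stack_bytes < STACK_MIN_MB * MB:
        warnings.append(
            f"process stack limit of {stack_bytes // MB} Mbytes is below {STACK_MIN_MB} Mbytes"
        )
    return warnings


def current_settings():
    """Read the values from the running environment (Unix-like systems only)."""
    import resource  # not available on every platform, so imported lazily

    value = os.environ.get("OMP_NUM_THREADS")
    threads = int(value) if value else None
    # os.cpu_count() reports logical CPUs, which may differ from physical CPUs.
    cpus = os.cpu_count() or 1
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    stack = None if soft == resource.RLIM_INFINITY else soft
    return threads, cpus, stack


if __name__ == "__main__":
    for message in check_resources(*current_settings()):
        print("warning:", message)
```

Running the script on a node before launching the model prints one warning per questionable setting, and prints nothing when both checks pass.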

1.3.4 How to increase the stacksize on different platforms

1.3.5 General problems on different platforms

Most distributed-memory platforms also provide runtime settings that let a user override the multiprocessing defaults and customize the machine parallelism for a particular application. An incorrect configuration of the machine parallelism can adversely affect CAM performance. The run scripts provided in the distribution create an executable that runs in hybrid mode on distributed architectures, using MPI for communication between nodes and OpenMP directives for multitasking within a node. When running in hybrid mode, the user should set the number of MPI tasks per node to 1; thread-based OpenMP multitasking will then utilize all processors on the node. If the user makes the appropriate changes to the Makefile to disable OpenMP and use only MPI, the number of MPI tasks per node should be set equal to the number of physical processors per node.
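The task layout rule in the paragraph above amounts to simple arithmetic, sketched here in Python. The function names "tasks_and_threads" and "total_mpi_tasks" are hypothetical and are not part of the CAM run scripts; representing the MPI-only case as one thread per task is an assumption made for illustration.

```python
def tasks_and_threads(cpus_per_node, hybrid=True):
    """Return (MPI tasks per node, OpenMP threads per task)."""
    if cpus_per_node < 1:
        raise ValueError("cpus_per_node must be at least 1")
    if hybrid:
        # Hybrid mode: one MPI task per node, OpenMP threads fill the node.
        return 1, cpus_per_node
    # MPI only (OpenMP disabled): one MPI task per physical processor.
    return cpus_per_node, 1


def total_mpi_tasks(nodes, cpus_per_node, hybrid=True):
    """Total number of MPI tasks to request for a run on the given nodes."""
    tasks_per_node, _threads = tasks_and_threads(cpus_per_node, hybrid)
    return nodes * tasks_per_node
```

For example, on 8 nodes with 4 processors each, the hybrid layout calls for 8 MPI tasks with 4 threads apiece, while the MPI-only layout calls for 32 MPI tasks. Either way the run uses all 32 processors.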

Beyond the configuration of machine resources, we have identified the following problems that are often encountered when building and running CAM on the machines here at NCAR.


Sub Sections


    1.3.1 Platforms CAM2.0 ported to

    1.3.2 Known problems

    1.3.3 General

    1.3.4 How to increase the stacksize on different platforms

    1.3.5 General problems on different platforms



Questions on these pages can be sent to erik@ucar.edu.


$Name: $ $Revision: 1.1 $ $Date: 2004/06/08 02:57:20 $ $Author: jmccaa $