Matthew Thompson edited this page May 17, 2023 · 40 revisions

If you need additional help after reading this document, please contact the SI Team at siteam at gmao.gsfc.nasa.gov


This document provides an overview of the GEOS GCM by describing the basic structure of the code, presenting the supported computing environments where it is used, and explaining how to obtain, compile, and run it.

It also points users of the GEOS GCM to further documentation designed to help them.

1 Generic Information

1.1 Structure of GEOS

The GEOS GCM is made up of a variety of gridded components linked together by an infrastructure layer called MAPL, which is based on ESMF. The model is built from a "fixture", a base repository that contains the CMake and control files needed to build the model.

1.2 mepo

The GEOS GCM uses a Python utility called mepo to manage multiple git repositories instead of using other technologies like Git submodules. mepo uses a YAML file that provides a list of components (and their versions) that are required for a particular configuration of GEOS GCM.

To learn more about mepo, consult the following links:

1.3 Fixtures

A "fixture" is what we call the "base" of the GEOS GCM. This repository, GEOSgcm, is the "fixture" for the GEOS GCM. If you clone it with git, you'll see that it is very light and contains no real source code. Instead, it is mainly CMake code plus one important file: components.yaml. This YAML file controls how our model is laid out and what it consists of.

1.4 Components

The main components of GEOS are other repositories in the GEOS-ESM GitHub organization. Examples include:

These components are laid out in the source tree according to the components.yaml file in the main fixture. In this file you'll see entries like:

GEOSgcm_GridComp:
  local: ./src/Components/@GEOSgcm_GridComp
  remote: ../GEOSgcm_GridComp.git
  tag: v1.17.3
  sparse: ./config/GEOSgcm_GridComp.sparse
  develop: develop

FVdycoreCubed_GridComp:
  local: ./src/Components/@GEOSgcm_GridComp/GEOSagcm_GridComp/GEOSsuperdyn_GridComp/@FVdycoreCubed_GridComp
  remote: ../FVdycoreCubed_GridComp.git
  tag: v1.12.1
  develop: develop

The first entry tells mepo to clone version v1.17.3 (tag) of GEOSgcm_GridComp from the ../GEOSgcm_GridComp.git repository (remote, where ../ means the URL is relative to the fixture's URL) and put it on disk at ./src/Components/@GEOSgcm_GridComp (local). For more information about mepo, see the above links or contact the GMAO SI Team.
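
As an illustration of how a relative remote resolves, the sketch below works out the full URL for GEOSgcm_GridComp. It assumes the fixture was cloned from https://github.com/GEOS-ESM/GEOSgcm.git (the HTTPS URL used later on this page). The variable names are ours, and this only mimics the path arithmetic; it is not mepo's own code.

```sh
# Assumed URL the fixture was cloned from.
fixture_url="https://github.com/GEOS-ESM/GEOSgcm.git"
# The "remote" value from components.yaml.
relative_remote="../GEOSgcm_GridComp.git"

# "../" steps up from the fixture repository to the organization that holds it,
# so drop the last path element of the fixture URL and the leading "../".
resolved_remote="${fixture_url%/*}/${relative_remote#../}"
echo "$resolved_remote"
```

This prints https://github.com/GEOS-ESM/GEOSgcm_GridComp.git. The same string manipulation works on the SSH form of the fixture URL.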

2 Computing Centers

Most users of GEOS will build and run GEOS on NASA Supercomputing resources. This is mainly due to both the availability of the libraries needed to build and run GEOS as well as the boundary conditions, emissions, etc. needed to run the model.

In general, GEOS is supported at the NASA Center for Climate Simulation (NCCS) and the NASA Advanced Supercomputing (NAS). The following links point to each center's documentation on using its systems:

Containerized GEOS

GEOS currently builds Docker containers with each release, but this is not yet fully supported and remains at a testing/experimental stage. If you need information about this, please contact the SI Team.

3 Working with the GEOS GCM

The instructions to obtain, build, and run the GEOS GCM can be found in the main README for the fixture, which is often the canonical place to find them, but we also provide them here.

3.1 Preliminary Steps

3.1.1 Shell configuration

We recommend that users configure their shell startup files as shown below. For users of bash, in .bashrc:

umask 0022
ulimit -s unlimited

# Run things in this if-block only if we're in an interactive shell
if [[ $- == *i* ]]
then

   # Only put module use or other module commands here

   # The below lines are to get the SI Team maintained modulefiles. 
   # Remove #NCCS if at NCCS and #NAS if at NAS
   #NCCS module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
   #NAS module use -a /nobackup/gmao_SIteam/modulefiles
   module load GEOSenv

   ...
fi

and for users of csh or tcsh we recommend adding to .cshrc or .tcshrc:

umask 0022
limit stacksize unlimited

# Run things in this if-block only if we are in an interactive shell
if ($?prompt) then

   # Only put module use or other module commands here

   # The below lines are to get the SI Team maintained modulefiles. 
   # Remove #NCCS if at NCCS and #NAS if at NAS
   #NCCS module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
   #NAS module use -a /nobackup/gmao_SIteam/modulefiles
   module load GEOSenv

   ...
endif

In the code above, we run umask 0022 and the limit/ulimit calls unconditionally, which is safe. The umask 0022 call makes new files and directories readable by everyone by default (and writable only by you). The limit/ulimit calls set the stack size to unlimited, which GEOS needs.
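
To see what umask 0022 does in practice, the sketch below creates a file and a directory in a temporary location and shows their permission bits. The names workdir, example_file, and example_dir are illustrative only.

```sh
# Remember the current umask so it can be restored afterwards.
old_umask=$(umask)
umask 0022

# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
touch "$workdir/example_file"
mkdir "$workdir/example_dir"

# Keep only the ten-character permission string from the long listing.
file_mode=$(ls -ld "$workdir/example_file" | cut -c1-10)
dir_mode=$(ls -ld "$workdir/example_dir" | cut -c1-10)
echo "file:      $file_mode"
echo "directory: $dir_mode"

rm -rf "$workdir"
umask "$old_umask"
```

With umask 0022 the file should show -rw-r--r-- and the directory drwxr-xr-x, that is, readable by all but writable only by the owner.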

Finally, you'll see an if-block. This block is for things you'd like in bash or tcsh but that should only be done in an interactive shell. You never want to run any module commands in a non-interactive shell, as they can have bad side effects on scripts, other commands, etc. We've added some lines to this block for loading our SI Team maintained modulefiles and a GEOSenv metamodule.
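
The bash test in the if-block relies on the special parameter $-, which contains the letter i only in interactive shells. Below is a small sketch of the same check, written with case so that it also runs under plain sh (shell_kind is an illustrative name):

```sh
# $- holds the option flags of the current shell; "i" means interactive.
case $- in
  *i*) shell_kind="interactive" ;;
  *)   shell_kind="non-interactive" ;;
esac
echo "This shell is $shell_kind"
```

Pasted at a prompt, this reports an interactive shell. Run from a script or batch job, it reports a non-interactive one, which is exactly the case where module commands should be skipped.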

3.1.3 Load Build Modules (at NASA or other systems)

In your .bashrc or .tcshrc or other rc file add a line in the interactive-only section (as shown above):

NCCS
module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
NAS
module use -a /nobackup/gmao_SIteam/modulefiles
GMAO Desktops

On the GMAO desktops, the SI Team modulefiles should automatically appear when you run module avail, but if they do not, they are in:

module use -a /ford1/share/gmao_SIteam/modulefiles

You can also run this in any interactive window you have, or just re-source your rc file. This gives you access to the modulefiles needed to correctly check out and build the model.

GEOSenv

At NASA centers, we maintain a module, GEOSenv, which provides more recent git and CMake modules as well as access to mepo. You can get this by loading the GEOSenv module:

module load GEOSenv

Again, as above, you should only add this to .bashrc or .tcshrc in the interactive-only block. Running module load commands in shell startup files can have adverse effects otherwise.
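
After loading GEOSenv, a quick way to confirm that git, CMake, and mepo are visible is to look for them on your PATH. The following is a minimal sketch; the loop and variable names are ours, and the executable names git, cmake, and mepo are assumed to be what GEOSenv provides.

```sh
checked=0
missing=""
for tool in git cmake mepo; do
  checked=$((checked + 1))
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool ($(command -v "$tool"))"
  else
    echo "missing: $tool"
    missing="$missing $tool"
  fi
done

# Summarize anything that could not be located.
if [ -n "$missing" ]; then
  echo "Not on PATH:$missing (is GEOSenv loaded?)"
fi
```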

3.2 Obtaining (Cloning) the Model

On GitHub, there are three ways to clone the model: SSH, HTTPS, or GitHub CLI. The first two are "git protocols" which determine how git communicates with GitHub (either through https or ssh). The latter is a CLI that uses either ssh or https protocol underneath.

For developers of GEOS, the SSH git protocol is recommended as it can avoid some issues if two-factor authentication (2FA) is enabled on GitHub.

Obtaining a GitHub Account

If you do not yet have a GitHub account and wish to develop GEOS, you will need to sign up for one. We recommend a username that "maps" to you if possible. For example, if your NASA AUID is jdoe try for jdoe at GitHub. But any username will work, and we highly recommend you add your full name to your profile so that others can easily find you on GitHub.

Note that GEOS-ESM requires users to have two-factor authentication enabled on GitHub for security reasons.

Being added to GEOS-ESM (NASA only)

If you are a NASA employee, you will also need to be added to the GEOS-ESM team so you can have write permissions to the GEOS-ESM repos. To obtain this, please send an email to the SI Team at siteam at gmao.gsfc.nasa.gov with your name and GitHub username.

Note that at the moment non-NASA contributors to GEOS should use forks to contribute.

Git Configuration

Before you use git to make changes to the model, you should make sure git itself is set up to attribute your commits correctly. Please run:

git config --global user.name "First Last"
git config --global user.email "email@domain.com"

Note, the email address you set above should be an address your GitHub account knows. You can follow this page to add emails to your GitHub account.

SSH

To clone the GEOSgcm code using the SSH url (starts with git@github.com), issue the command:

git clone -b vX.Y.Z git@github.com:GEOS-ESM/GEOSgcm.git

where vX.Y.Z is a tag from a GEOSgcm release. Note if you don't use -b, you will get the main branch and that can change from day-to-day.

Permission denied (publickey)

If this is your first time using GitHub with any SSH URL, you might get this error:

Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

If you do see this, you need to upload an SSH key to your GitHub account. This needs to be done for every machine from which you want to use the SSH URL.

Permission denied (publickey)...but it was working before (NAS)

Issues have been seen when using SSH access to GitHub with the update at NAS to TOSS4. There are two possible solutions for this. One replaces your SSH key and the other adds a new key and uses it. The former is probably the easiest method, but if you have concerns that the RSA key might be used elsewhere, adding a new key might be safer.

Replacing SSH Key

This section assumes that you are using an RSA key to interact with GitHub from NAS. The steps are as follows (assuming the key used on GitHub is id_rsa.pub):

  1. Go to the GitHub SSH Keys page and remove the NAS RSA key.
  2. At NAS, replace the old RSA key (below we remove the old key, but if you want you can rename it):
cd $HOME/.ssh
rm id_rsa.pub id_rsa
ssh-keygen -t rsa -b 4096
  3. Upload the new id_rsa.pub to GitHub.

Create and Use New RSA Key

In this method, we will make a new RSA key and use that.

  1. Create a new RSA key
cd $HOME/.ssh
ssh-keygen -t rsa -b 4096 -f id_rsa_new
  2. Upload this new RSA key to GitHub
  3. Edit .ssh/config to tell SSH to use the new key for GitHub. For this, create or add to a section in .ssh/config like:
Host github.com
   ForwardX11 no
   IdentityFile ~/.ssh/id_rsa_new
   IdentitiesOnly yes

This is necessary because, by default, SSH will always offer id_rsa first if it exists, and if that key is still registered at GitHub, the bad behavior will persist.
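
To check which key SSH will actually offer to GitHub without opening a connection, you can ask it to print its effective configuration. The following is a minimal sketch, assuming an OpenSSH client recent enough to support the -G option; ssh_check is only an illustrative variable.

```sh
if command -v ssh >/dev/null 2>&1; then
  ssh_check="ran"
  # -G prints the configuration ssh would use for this host, then exits.
  ssh -G github.com | grep -i -E '^(identityfile|identitiesonly) ' \
    || echo "no identity settings reported"
else
  ssh_check="skipped"
  echo "ssh not found on PATH"
fi
```

The identityfile line should name the key from your Host github.com section, and identitiesonly should read yes.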

HTTPS

To clone the model through HTTPS you run:

git clone -b vX.Y.Z https://github.com/GEOS-ESM/GEOSgcm.git

where vX.Y.Z is a tag from a GEOSgcm release. Note if you don't use -b, you will get the main branch and that can change from day-to-day.

Note that if you use the HTTPS URL and have 2FA set up on GitHub, you will need to use a personal access token as your password.

GitHub CLI

You can also use the GitHub CLI with:

gh repo clone GEOS-ESM/GEOSgcm -- -b vX.Y.Z

where vX.Y.Z is a tag from a GEOSgcm release. Note if you don't use -b, you will get the main branch and that can change from day-to-day.

Note that when you first use gh, it will ask what your preferred git protocol is (https or ssh) to use "underneath". The caveats above will apply to whichever you choose.

Regardless of the cloning method you use, you will get the directory GEOSgcm/.

3.3 Building the Code

To build the model, you first need to go to the GEOSgcm/ directory:

cd GEOSgcm

3.3.1 Single Step Building of the Model

From the head node, run the parallel_build.csh script:

./parallel_build.csh

Doing so will check out all the external repositories of the model and build it. When done, the resulting model build will be found in build/ and the installation will be found in install/ with setup scripts like gcm_setup and fvsetup in install/bin.

Develop Version of GEOS GCM

parallel_build.csh provides a special flag for checking out the development branches of GEOSgcm_GridComp, GEOSgcm_App, GMAO_Shared, and GEOS_Util. If you run:

./parallel_build.csh -develop

then mepo will run:

mepo develop GEOSgcm_GridComp GEOSgcm_App GMAO_Shared GEOS_Util

Debug Version of GEOS GCM

To obtain a debug version, you can run

./parallel_build.csh -debug

which will build with debugging flags. This will build in build-Debug/ and install into install-Debug/.

Do not create and install source tarfile with parallel_build

Note that running with parallel_build.csh will create and install a tarfile of the source code at build time. If you wish to avoid this, run parallel_build.csh with the -no-tar option:

./parallel_build.csh -no-tar

Passing additional CMake options to parallel_build.csh

While parallel_build.csh has many options, it does not cover every CMake option available in GEOSgcm. If you wish to pass additional CMake options to parallel_build.csh, you can do so by using -- and then the CMake options. Note that anything after the -- will be interpreted as a CMake option, which could lead to build issues if you are not careful.

For example, if you want to build a develop Debug build on Cascade Lake while turning on StratChem reduced mechanism and the CODATA 2018 options:

./parallel_build.csh -develop -debug -cas -- -DSTRATCHEM_REDUCED_MECHANISM=ON -DUSE_CODATA_2018_CONSTANTS=ON

As noted above, all the "regular" parallel_build.csh options must be listed before the -- flag.
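
To make the role of -- concrete, here is a sketch of how a wrapper could divide its arguments at the separator. It is not the implementation inside parallel_build.csh (which is a csh script); split_options, script_opts, and cmake_opts are hypothetical names, and the simple string accumulation assumes options that contain no spaces.

```sh
split_options() {
  script_opts=""
  cmake_opts=""
  seen_separator=no
  for arg in "$@"; do
    if [ "$seen_separator" = no ] && [ "$arg" = "--" ]; then
      # The first "--" switches from script options to CMake options.
      seen_separator=yes
    elif [ "$seen_separator" = no ]; then
      script_opts="$script_opts $arg"
    else
      cmake_opts="$cmake_opts $arg"
    fi
  done
}

split_options -develop -debug -cas -- -DSTRATCHEM_REDUCED_MECHANISM=ON -DUSE_CODATA_2018_CONSTANTS=ON
echo "script options:$script_opts"
echo "CMake options: $cmake_opts"
```

Everything before the separator lands in the first group and everything after it in the second, which is why a regular option placed after -- would be handed to CMake instead.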

3.3.2 Multiple Steps for Building the Model

The steps detailed below are essentially those that parallel_build.csh performs for you. Either method should yield identical builds.

Mepo

The GEOS GCM is comprised of a set of sub-repositories. These are managed by a tool called mepo. To clone all the sub-repos, you can run mepo clone inside the fixture:

cd GEOSgcm
mepo clone

This command initializes the multi-repository and clones and assembles all the sub-repositories according to components.yaml

Checking out develop branches of GEOSgcm_GridComp, GEOSgcm_App, GMAO_Shared, and GEOS_Util

To get development branches of GEOSgcm_GridComp, GEOSgcm_App, GMAO_Shared, and GEOS_Util (a la the -develop flag for parallel_build.csh), one needs to run the equivalent mepo command. Because mepo itself knows (via components.yaml) what the development branch of each sub-repository is, the equivalent of -develop is:

mepo develop GEOSgcm_GridComp GEOSgcm_App GMAO_Shared GEOS_Util

This must be done after mepo clone as it is running a git command in each sub-repository.

Load Compiler, MPI Stack, and Baselibs

On tcsh:

source @env/g5_modules

or on bash:

source @env/g5_modules.sh

Create Build Directory

We currently do not allow in-source builds of GEOSgcm. So we must make a directory:

mkdir build

The advantage of this is that you can build both a Debug and Release version with the same clone if desired.

Run CMake

CMake generates the Makefiles needed to build the model.

cd build
cmake .. -DBASEDIR=$BASEDIR/Linux -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_INSTALL_PREFIX=../install

This will install to a directory parallel to your build directory. If you prefer to install elsewhere change the path in:

-DCMAKE_INSTALL_PREFIX=<path>

and CMake will install there.
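
The cmake command above expands $BASEDIR, so it is worth checking that the variable is set before configuring. The sketch below assumes that sourcing @env/g5_modules (or g5_modules.sh) is what defines BASEDIR, which this page implies but does not state; basedir_status is just an illustrative variable.

```sh
if [ -n "${BASEDIR:-}" ] && [ -d "$BASEDIR/Linux" ]; then
  basedir_status="ok"
  echo "Using Baselibs from $BASEDIR/Linux"
else
  basedir_status="missing"
  echo "BASEDIR is unset or \$BASEDIR/Linux does not exist; source @env/g5_modules first" >&2
fi
```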

Debug Version of GEOS GCM

To obtain a debug version, you should add:

-DCMAKE_BUILD_TYPE=Debug

which will build with debugging flags.
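
Because builds are out-of-source, a Release and a Debug configuration can sit side by side in one clone. The sketch below only prints the commands so they can be reviewed first. The names build-Debug and install-Debug follow the parallel_build.csh convention mentioned earlier; build-Release, install-Release, and the explicit -DCMAKE_BUILD_TYPE=Release are our assumptions for illustration.

```sh
# Print (do not run) the configure command for one build type.
configure_command() {
  echo "cmake .. -DBASEDIR=\$BASEDIR/Linux -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_BUILD_TYPE=$1 -DCMAKE_INSTALL_PREFIX=../install-$1"
}

for build_type in Release Debug; do
  echo "mkdir -p build-$build_type"
  echo "cd build-$build_type"
  configure_command "$build_type"
  echo "cd .."
done
```

Each build type gets its own build and install directory, so switching between them does not force a full rebuild.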

Create and install source tarfile

Note that running with parallel_build.csh will create and install a tarfile of the source code at build time. But if CMake is run by hand, this is not the default action (as many who build with CMake by hand are developers and not often running experiments). In order to enable this at install time, add:

-DINSTALL_SOURCE_TARFILE=ON

to your CMake command.

Build and Install with Make

make -jN install

where N is the number of parallel processes. On discover head nodes, this should be no higher than 2 due to limits on the head nodes. On a compute node, you can set N as high as you like, though 8-12 is about the limit of parallelism in our model's make system.
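
One way to pick N on a compute node is to take the number of online cores and cap it at 12, the upper end of the useful range quoted above. This is a sketch, not part of the GEOS build system; getconf _NPROCESSORS_ONLN is assumed to be available (it is on typical Linux systems), and the command is printed rather than executed. On a head node, simply use 2.

```sh
max_jobs=12

# Ask the system for the number of online cores; fall back to 2 if unknown.
ncores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)
case $ncores in
  ''|*[!0-9]*) ncores=2 ;;
esac

if [ "$ncores" -gt "$max_jobs" ]; then
  njobs=$max_jobs
else
  njobs=$ncores
fi
echo "make -j$njobs install"
```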

3.4 Running the Model

Once the model has built successfully, you will have an install/ directory in your checkout. To run gcm_setup go to the install/bin/ directory and run it there:

cd install/bin
./gcm_setup

You can find more information to run the model in either atmosphere/data ocean mode (aka AMIP):

or coupled atmosphere/ocean mode: