
HISTORY Samplers

Yonggang Yu edited this page Jan 18, 2025 · 26 revisions

$\textcolor{red}{\textbf{Introduction}}$

The history sampler is a tool to sample (or regrid) simulated GEOS fields onto time-dependent locations. The sampler runs on the fly with GEOS, generating highly accurate results. At each model time step (heartbeat dt), the observation locations whose times fall within that interval receive field values from the model through the fast ESMF regridding tools. This is similar to computing HofX (GEOVALS) in real time.

Examples of time-varying grids include aircraft trajectories, satellite observation locations typically used by NWP (e.g., SONDES, AMSUA_METOP-B, IASI_METOP-B, CRIS-FSR_N20, GPS), and satellite swaths.
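The per-heartbeat selection described above can be pictured with a small sketch. It is only an illustration of the idea, not the MAPL implementation: the function name `obs_in_heartbeat`, the sample times, the 450-second heartbeat, and the half-open interval convention are all assumptions made here.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def obs_in_heartbeat(obs_times, t_start, dt):
    """Return indices of observations with t_start <= time < t_start + dt.

    obs_times must be sorted in ascending order. The half-open interval is
    an assumption of this sketch, chosen so that no observation is counted
    in two consecutive heartbeats.
    """
    lo = bisect_left(obs_times, t_start)
    hi = bisect_left(obs_times, t_start + dt)
    return list(range(lo, hi))


# Hypothetical observation times and a hypothetical 450-second heartbeat.
obs_times = [
    datetime(2019, 7, 31, 21, 0, 0),
    datetime(2019, 7, 31, 21, 5, 0),
    datetime(2019, 7, 31, 21, 7, 30),
    datetime(2019, 7, 31, 21, 12, 0),
]
heartbeat = timedelta(seconds=450)
step_start = datetime(2019, 7, 31, 21, 0, 0)

selected = obs_in_heartbeat(obs_times, step_start, heartbeat)
print(selected)
```

Only the selected observations would be passed to the regridding step for that heartbeat; the remaining ones are picked up by later time steps.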

The history sampler provides five sampling options:

  • Station Sampler (time independent, for Aeronet and NOAA GHCND)
  • Trajectory Sampler (time dependent, for JEDI IODA files, CAMP2Ex, FIREX-AQ, etc.)
  • Swath Sampler (time dependent, for satellite SWATH, e.g., ATMS)
  • Geostationary Sampler (time independent, for ABI from GOES-R series)
  • Mask Sampler (time independent, for GOES-R series)

In this document, we describe the syntax used in the MAPL HISTORY.rc file to activate the history sampler when running the GEOS model.

$\textcolor{red}{\textbf{Types of samplers}}$

$\textcolor{blue}{\textbf{Station sampler}}$

The station sampler produces geophysical variables at a set of time-independent geospatial coordinates corresponding to fixed ground stations (for instance, NASA AERONET or NOAA GHCNd land surface stations).

$\textcolor{green}{\textbf{Station sampler: list of stations}}$

The user needs to create a CSV file listing all the stations of interest. Each row should have at least the following information:

  • station name
  • station latitude
  • station longitude

The user may specify other parameters (such as the station ID) to further describe a station, as long as every line has the same number of columns. Currently, the code supports files with any of the following line contents:

station_id, station_name, station_longitude, station_latitude
station_name, station_id, station_longitude, station_latitude
station_name, station_longitude, station_latitude
station_name, station_latitude, station_longitude

Note

Since the most important parameters are the station name and its position, the source code will be refactored in the future so that the station file could include any number of columns as long as the key parameters are present in a consistent order.

Here is a sample station file:

List of stations from AERONET
name,lon,lat                                                                                                
Anchorage,-149.9,61.2
Atlanta,-84.4,33.7
Greenbelt,-76.9,39.1
Bismarck,-100.8,46.8

It obeys the line formatting:

station_name, station_longitude, station_latitude
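Because both longitude-first and latitude-first orderings are accepted, it can be worth checking a station file before a run. The following is a minimal sketch of such a check, not part of MAPL: the function `read_stations` and its `order` parameter are hypothetical names introduced here, and the latitude range test is just one simple way to catch swapped columns.

```python
import csv
import io


def read_stations(stream, skip_lines=0, order=("name", "lon", "lat")):
    """Parse station rows into dicts, skipping `skip_lines` header lines.

    `order` names the columns of each row; it is a parameter of this sketch,
    not something the sampler reads from the file.
    """
    for _ in range(skip_lines):
        next(stream)
    stations = []
    for row in csv.reader(stream):
        if not row:
            continue  # ignore blank lines
        if len(row) != len(order):
            raise ValueError(f"expected {len(order)} columns, got {row}")
        rec = dict(zip(order, (cell.strip() for cell in row)))
        rec["lon"], rec["lat"] = float(rec["lon"]), float(rec["lat"])
        if not -90.0 <= rec["lat"] <= 90.0:
            raise ValueError(f"latitude out of range: {row}")
        stations.append(rec)
    return stations


# Same layout as the sample file above: two header lines, then name,lon,lat.
sample = io.StringIO(
    "List of stations from AERONET\n"
    "name,lon,lat\n"
    "Anchorage,-149.9,61.2\n"
    "Atlanta,-84.4,33.7\n"
)
stations = read_stations(sample, skip_lines=2)
print(stations[0])
```

The `skip_lines=2` argument plays the same role as the `station_skip_line` setting in HISTORY.rc for this sample file, which has two header lines.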

$\textcolor{green}{\textbf{Station sampler: settings in HISTORY.rc}}$

The HISTORY.rc file settings for the station sampler follow the same syntax as described in the MAPL History Component document. However, specific parameters are required to be able to exercise the station sampler:

  • sampler_spec: A string that needs to be set to 'station' to select a station sampler collection.
  • station_id_file: Full path to the file containing the list of stations and their locations (latitude and longitude in degrees).
  • station_skip_line: An integer specifying the number of lines to skip at the top of the station file.
  • regrid_method: A string specifying the regridding method (for instance 'BILINEAR', 'CONSERVATIVE') to be used to interpolate the model fields at the different stations.
  COLLECTIONS:                            
  Aeronet                                 
  ::                                                                                                                   
                                          
  Aeronet.sampler_spec: 'station'         
  Aeronet.station_id_file:   FULL_PATH/my_station_file.csv
  Aeronet.station_skip_line:  2           
  Aeronet.template: %y4%m2%d2_%h2%n2.nc4
  Aeronet.format: 'CFIO'                  
  Aeronet.frequency: 001000,  
  Aeronet.duration:  240000,   
  Aeronet.regrid_method:     'BILINEAR' ,
  Aeronet.fields: 'PHIS'       , 'AGCM'       , 'phis'       ,
                  'TROPT'      , 'AGCM'       ,    
                  'TS'         , 'SURFACE'    , 'ts'         , 
                  'TSOIL1'     , 'SURFACE'    ,   
                  'PS'         , 'DYN'        , 'ps'         ,    
                  'Q'          , 'MOIST'      , 'sphu'       ,
::
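The `frequency` and `duration` values above are written in hhmmss form. The sketch below converts them to time intervals, assuming (as in regular MAPL History collections, to the best of our understanding) that `frequency` is the interval between output records and `duration` is the time span covered by each file. The helper `hhmmss_to_timedelta` is a hypothetical name used only for this illustration.

```python
from datetime import timedelta


def hhmmss_to_timedelta(value):
    """Convert an integer or string in hhmmss form to a timedelta."""
    text = str(value).strip().rstrip(",").zfill(6)
    hours, minutes, seconds = int(text[:-4]), int(text[-4:-2]), int(text[-2:])
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


frequency = hhmmss_to_timedelta("001000")   # Aeronet.frequency
duration = hhmmss_to_timedelta("240000")    # Aeronet.duration

# Assumed interpretation: one record every `frequency`, one file every `duration`.
records_per_file = duration // frequency
print(frequency, duration, records_per_file)
```

Under that reading, the sample collection would write one record every 10 minutes and start a new file every 24 hours.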

$\textcolor{blue}{\textbf{Trajectory sampler}}$

The trajectory sampler produces geophysical variables at time-dependent geospatial points along a defined path or trajectory through the atmosphere (corresponding to tracks of aircraft, balloons, ships, or nadir-viewing spaceborne assets). The goal is to provide a snapshot of atmospheric conditions as an object would experience them while moving along that path.

To exercise the trajectory sampler, the user needs to provide at least the following information in the HISTORY.rc file:

  • A list of names of the trajectories to be considered for outputs.
  • The date/time range to produce outputs along trajectories.
    • The range is specified through two parameters (beginning and end) in the format YYYY-MM-DDThh:mm:ss.
    • The experiment needs to start within that range, otherwise the code will abort.
    • The outputs along a trajectory will only be written within the range, though the simulation may proceed beyond it.
  • The frequency of the outputs.
  • For each trajectory:
    • The full path to a netCDF file template.
      • The code will use the template to point to the actual netCDF file.
      • The netCDF file contains a list of specific geolocated points that the code will use for the trajectory sampler.
    • The list of fields to produce along the trajectory. The list is unique to the trajectory.

$\textcolor{green}{\textbf{Trajectory sampler: settings in HISTORY.rc}}$

To be able to use the trajectory sampler, it is important to set the following parameters in the HISTORY.rc file in the appropriate collection:

  • sampler_spec: A string that needs to be set to 'trajectory' to select a trajectory sampler collection.
  • ObsPlatforms: list of names (two consecutive names separated by a comma) of the different observation trajectories we want to produce outputs along.
  • obs_file_begin: date/time (in the format YYYY-MM-DDThh:mm:ss) for the beginning of the observation file. If not provided, the code will use the current date/time and will verify that a trajectory file exists on that specific date/time.
  • obs_file_interval: required parameter (in the format: yymmdd hhmmss) providing the date/time interval between two consecutive observation files.
  • obs_file_end: date/time (in the format YYYY-MM-DDThh:mm:ss) for the end of the observation file. If not provided, the code will use the current date/time plus 14 days.
  • Epoch: integer determining the output frequency in hours/minutes/seconds (in the format: hhmmss).
  • regrid_method: A string specifying the regridding method (for instance 'BILINEAR', 'CONSERVATIVE') to be used to interpolate the model fields to the trajectory points.
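To see how `obs_file_begin`, `obs_file_interval`, and `obs_file_end` fit together, the following sketch lists the nominal times of the observation files implied by the three settings. It is an illustration under stated assumptions, not the MAPL logic: the helper names are hypothetical, the end time is treated as inclusive, and intervals with a nonzero year or month part are rejected rather than handled.

```python
from datetime import datetime, timedelta


def parse_interval(text):
    """Parse 'yymmdd hhmmss' into a timedelta (days and below only)."""
    ymd, hms = text.strip("'\" ").split()
    if int(ymd[:-2]) != 0:
        raise ValueError("month/year intervals are not handled in this sketch")
    return timedelta(
        days=int(ymd[-2:]),
        hours=int(hms[:-4]),
        minutes=int(hms[-4:-2]),
        seconds=int(hms[-2:]),
    )


def obs_file_times(begin, end, interval):
    """Yield the nominal time of each observation file in [begin, end]."""
    t = datetime.fromisoformat(begin)
    stop = datetime.fromisoformat(end)
    step = parse_interval(interval)
    if step <= timedelta(0):
        raise ValueError("interval must be positive")
    while t <= stop:
        yield t
        t += step


# Hypothetical one-day window with a 6-hour interval between files.
times = list(
    obs_file_times("2019-07-31T21:00:00", "2019-08-01T21:00:00", "'000000 060000'")
)
print([t.isoformat() for t in times])
```

Each of these times would then be substituted into the platform's file name template to locate the corresponding observation file.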

The fields to write out are not listed at the level of the trajectory collection. Instead, each trajectory listed in the ObsPlatforms parameter needs its own additional settings, which define at least the observation trajectory file template and the list of fields to produce along that trajectory. Assuming that obs_traj is one of the values included in ObsPlatforms, here is a template setting for the corresponding trajectory:

PLATFORM.obs_traj::
  IODA_SCHEMA::
    index_name_x:     Location
    var_name_lon:     MetaData/longitude
    var_name_lat:     MetaData/latitude
    var_name_time:    MetaData/dateTime
    file_name_template:  FULL_PATH/obs_traj.%y4%m2%d2T%h2%n2%S2Z.nc4
  :: 
  GEOVALS_SCHEMA::
    geovals_fields::
      'PHIS'       , 'AGCM'       , 'phis'       ,
      'TROPT'      , 'AGCM'       ,    
      'TS'         , 'SURFACE'    , 'ts'         , 
      'TSOIL1'     , 'SURFACE'    ,   
      'PS'         , 'DYN'        , 'ps'         ,    
      'Q'          , 'MOIST'      , 'sphu'       ,
    ::
  ::
::
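The `file_name_template` entry is resolved to an actual file name by substituting date and time tokens. The sketch below shows the idea, assuming the tokens mean `%y4` year, `%m2` month, `%d2` day, `%h2` hour, `%n2` minute, and `%S2` second; the function `expand_template` is a hypothetical helper written for this illustration and `FULL_PATH` is left as the placeholder used above.

```python
from datetime import datetime


def expand_template(template, when):
    """Replace date/time tokens in `template` with values from `when`.

    Token meanings are assumptions of this sketch: %y4 year, %m2 month,
    %d2 day, %h2 hour, %n2 minute, %S2 second.
    """
    values = {
        "%y4": f"{when.year:04d}",
        "%m2": f"{when.month:02d}",
        "%d2": f"{when.day:02d}",
        "%h2": f"{when.hour:02d}",
        "%n2": f"{when.minute:02d}",
        "%S2": f"{when.second:02d}",
    }
    for token, text in values.items():
        template = template.replace(token, text)
    return template


name = expand_template(
    "FULL_PATH/obs_traj.%y4%m2%d2T%h2%n2%S2Z.nc4",
    datetime(2019, 7, 31, 21, 0, 0),
)
print(name)
```

Checking that the expanded names exist on disk before launching a run is a cheap way to catch a mismatch between the template and the observation archive.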

Here is a sample HISTORY.rc file:

  COLLECTIONS:                            
  'jedi'                                 
  ::                                                                                                                   
 
  jedi.sampler_spec:        trajectory                                                           
  jedi.ObsPlatforms:         aircraft atms_npp
  jedi.template:             '%y4%m2%d2_%h2%n2z.nc4',
  jedi.format:               'CFIO',
  jedi.obs_file_begin:       2019-07-31T21:00:00
  jedi.obs_file_interval:    '000000 060000'   
  jedi.obs_file_end:         2019-11-01T00:00:00
  jedi.Epoch:                060000          
  jedi.regrid_method:        'BILINEAR' ,
::  

#______ Format below does not obey the normal HISTORY.rc settings ____

DEFINE_OBS_PLATFORM::                                        

PLATFORM.aircraft::
  IODA_SCHEMA::
    index_name_x:     Location
    var_name_lon:     MetaData/longitude
    var_name_lat:     MetaData/latitude
    var_name_time:    MetaData/dateTime
    file_name_template:  /discover/nobackup/projects/gmao/aist-nr/data/ioda_reshuffle/%y4%m2%d2/geos_atmosphere/aircraft.%y4%m2%d2T%h2%n2%S2Z.nc4
  ::      
  GEOVALS_SCHEMA::
    geovals_fields::
      'PHIS'       , 'AGCM'       , 'phis'       ,
      'TROPT'      , 'AGCM'       ,    
      'TS'         , 'SURFACE'    , 'ts'         , 
    ::    
  ::      
:: 

PLATFORM.atms_npp::
  IODA_SCHEMA::
    index_name_x:     Location
    var_name_lon:     MetaData/longitude
    var_name_lat:     MetaData/latitude
    var_name_time:    MetaData/dateTime
    file_name_template:  /discover/nobackup/projects/gmao/aist-nr/data/ioda_reshuffle/%y4%m2%d2/geos_atmosphere/atms_npp.%y4%m2%d2T%h2%n2%S2Z.nc4
  ::      
  GEOVALS_SCHEMA::
    geovals_fields::
      'TSOIL1'     , 'SURFACE'    ,   
      'PS'         , 'DYN'        , 'ps'         ,    
      'Q'          , 'MOIST'      , 'sphu'       ,
    ::    
  ::      
:: 

$\textcolor{blue}{\textbf{Swath sampler}}$

The swath sampler produces geophysical variables at time-dependent geospatial coordinates corresponding to the two-dimensional swath of an orbiting instrument. Swaths are typically represented by logically rectangular curvilinear grids that may have higher or lower resolution than the nature run (NR). When the swath has lower resolution than the NR, conservative regridding is performed. However, when the observing system has a much higher resolution than the NR, it may be more advantageous to use the masked sampler and perform any necessary interpolation offline.

$\textcolor{blue}{\textbf{Masked sampler}}$

The masked sampler is used when the observing system has a much higher resolution than the NR. In this case, gridded geophysical variables are masked so that values are preserved at the grid points visited by the satellite, possibly with an added “halo” to aid offline interpolation, while all other grid points receive a constant undefined value. These gridded fields can be output efficiently using the internal compression algorithms available in most modern formats (e.g., NetCDF-4, HDF-5), or alternatively using a sparse storage scheme.
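The masking idea can be illustrated on a small grid. This is a conceptual sketch only, not the MAPL implementation: the function `mask_with_halo`, the square halo shape, the `UNDEF` value of 1.0e15, and the absence of longitude wrap-around are all assumptions made here for simplicity.

```python
UNDEF = 1.0e15  # stand-in for the constant undefined value; an assumption


def mask_with_halo(field, visited, halo=1):
    """Keep field values within `halo` cells of any visited cell.

    field   : 2-D list of floats indexed [j][i]
    visited : iterable of (j, i) grid indices touched by the satellite
    All other grid points are set to UNDEF. No periodic wrap is applied.
    """
    nj, ni = len(field), len(field[0])
    keep = set()
    for j, i in visited:
        for dj in range(-halo, halo + 1):
            for di in range(-halo, halo + 1):
                jj, ii = j + dj, i + di
                if 0 <= jj < nj and 0 <= ii < ni:
                    keep.add((jj, ii))
    return [
        [field[j][i] if (j, i) in keep else UNDEF for i in range(ni)]
        for j in range(nj)
    ]


# Hypothetical 4 x 5 field with a single visited grid point.
field = [[float(10 * j + i) for i in range(5)] for j in range(4)]
masked = mask_with_halo(field, visited=[(1, 1)], halo=1)
kept = sum(value != UNDEF for row in masked for value in row)
print(kept)
```

Since most of the masked field holds a single repeated value, it compresses well, which is the property the paragraph above relies on.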
