Output cubed-sphere grid in a single file instead of individual tiles. #827

Closed · aerorahul opened this issue Sep 24, 2021 · 21 comments · Fixed by #1011 or NOAA-EMC/fv3atm#466
Labels: enhancement (New feature or request)

@aerorahul (Contributor) commented Sep 24, 2021

Description

Currently, when output_grid = 'cubed_sphere_grid' in model_configure, the forecast output is produced as individual tile files such as atmf000.tile1.nc, atmf000.tile2.nc, and so on.

The JEDI system would like to read this output for backgrounds instead of the restarts because:

  1. writing out restarts hourly across the assimilation window is prohibitively expensive and slow;
  2. combining the tiles into a single netCDF file with a tile dimension will make storage, handling, and I/O in JEDI much simpler;
  3. it will consolidate the reading of cubed-sphere output across different implementations of FV3 models, e.g. GEOS;
  4. it will allow the use of visualization tools such as Panoply without the need for interpolation from the cubed sphere to a Gaussian grid. Panoply is used extensively for GEOS and at JEDI for interactive visualization of native-grid output (both FV3 and MOM6);
  5. it will reduce the data duplicated across the different tile files, e.g. ak, bk, etc.

Solution

Provide an option to produce cubed-sphere forecast output in a single file with a tile dimension, instead of individual tile files.
The output would then be a single atmf000.nc and sfcf000.nc file containing all tiles.
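For illustration, here is a minimal sketch (not part of this issue) of how such a single-file history could be read in Python with xarray. The dimension names (tile, pfull, grid_yt, grid_xt) and the variable name "tmp" are assumptions based on typical GFS netCDF history conventions, not something specified here:

import xarray as xr

# one file instead of atmf000.tile1.nc ... atmf000.tile6.nc
ds = xr.open_dataset("atmf000.nc")
print(ds.sizes)                            # expect a tile dimension of size 6

# pick a single tile and model level for a quick look at the native-grid field
t_tile3 = ds["tmp"].isel(tile=2, pfull=0)  # "tmp" assumed; tile index is 0-based
print(t_tile3.dims, t_tile3.shape)         # remaining dims, e.g. ('grid_yt', 'grid_xt')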

Alternatives

Anything offline is a possibility, but that would add a workflow step to the cycled DA application.

Tagging @danholdaway to provide a sample output of a single file containing all tiles from GEOS, with Panoply attributes.

aerorahul added the enhancement (New feature or request) label on Sep 24, 2021
@danholdaway commented:

Here is a header dump showing how the file needs to be structured to be read by JEDI & Panoply:

[Screenshot: ncdump header of the GEOS single-file cubed-sphere output]

@aerorahul mentioned Panoply; it is a phenomenal utility that can be used to make plots directly from the cubed-sphere fields. Here's what that interface looks like:

[Screenshots: Panoply interface plotting cubed-sphere fields on the native grid]

I've put an example file here: https://drive.google.com/file/d/1JPN_Far-vGNeM8Z1nheH1KTz1WzCtR1E/view?usp=sharing (requires a NOAA account to access) in case you want to look at it further or play around with Panoply.

@junwang-noaa (Collaborator) commented:

@aerorahul @danholdaway Dusan added a capability for ufs-weather-model to output one netcdf file with 6 tiles. Please let us know if you want to try some sample files.

@aerorahul (Contributor, Author) commented:

> @aerorahul @danholdaway Dusan added a capability for ufs-weather-model to output one netcdf file with 6 tiles. Please let us know if you want to try some sample files.

Adding @CoryMartin-NOAA as he will likely try using these.

@CoryMartin-NOAA commented:

As the variable and dimension names, etc. are different between UFS and GEOS, this will require code changes in FV3-JEDI before being able to properly test. But thank you, @DusanJovic-NOAA , for adding these, they will be very useful going forward and likely what we will want to use in future GDAS implementations.

@aerorahul (Contributor, Author) commented:

@CoryMartin-NOAA

> As the variable and dimension names, etc. are different between UFS and GEOS, this will require code changes in FV3-JEDI before being able to properly test. But thank you, @DusanJovic-NOAA , for adding these, they will be very useful going forward and likely what we will want to use in future GDAS implementations.

If the naming of variables and dimensions is the only difference, we should be able to refactor the existing code to handle that distinction rather than write new, largely similar code.

Something we should work on with @danholdaway.

@CoryMartin-NOAA commented:

@aerorahul agreed, I have already talked to @danholdaway briefly about this and we think we can refactor the GEOS I/O in FV3-JEDI to be more generic to handle UFS and GEOS netCDF history files. There can be options for 'history' and 'FMS RESTART' instead of 'GEOS' and 'GFS' as it currently is. I will take the lead on this refactoring post-AMS annual meeting.

@junwang-noaa (Collaborator) commented:

@aerorahul @CoryMartin-NOAA The output variable names are controlled by diag_table; you can change the output name (the third column) to match the GEOS names. We will need to check the impact of dimension names, though.
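To make the rename concrete, here is a hypothetical diag_table field entry (columns: module, field, output name, file, then time-averaging and packing settings); only the third column changes, and the specific names are illustrative rather than taken from this issue:

"gfs_dyn", "ucomp", "ugrd", "fv3_history", "all", .false., "none", 2

becomes, with a GEOS-style output name,

"gfs_dyn", "ucomp", "ua", "fv3_history", "all", .false., "none", 2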

@jswhit2 (Collaborator) commented Jan 14, 2022:

Are we talking about the 'cubed_sphere_grid' option for 'output_grid' in model_configure? If that's going to be used by JEDI we should probably add a 'netcdf_parallel' option, and compression. We may also need to add the option for the model to read native-grid increments (instead of just the Gaussian grid).

EDIT: regarding parallel write, I see from the comments in module_wrt_grid_comp.F90 that this is already supported

!***  The ESMF field bundle write uses parallel write, so if output grid
!***  is cubed sphere grid, the  six tiles file will be written out at
!***  same time.

@aerorahul (Contributor, Author) commented:

@jswhit2
This is for the cubed_sphere_grid option for output_grid in model_configure.
The idea is to then read the "native grid" history instead of FV3 restarts in FV3-JEDI.
Along the same lines, the idea is to leverage as much as possible the existing code in FV3-JEDI that reads "native grid" history from GEOS output. The exceptions noted are the naming of variables, metadata, etc., but the hope is that these can be configurable options rather than separate code.

Re. reading native-grid increments (instead of just the Gaussian grid): yes! We need that capability for initializing the model with IAU. FV3-JEDI (and JEDI in general) needs an update to be able to write out increments (instead of the analysis) after the solve.

@DusanJovic-NOAA (Collaborator) commented:

> Are we talking about the 'cubed_sphere_grid' option for 'output_grid' in model_configure? If that's going to be used by JEDI we should probably add a 'netcdf_parallel' option, and compression. We may also need to add the option for the model to read native-grid increments (instead of just the Gaussian grid).
>
> EDIT: regarding parallel write, I see from the comments in module_wrt_grid_comp.F90 that this is already supported
>
> !***  The ESMF field bundle write uses parallel write, so if output grid
> !***  is cubed sphere grid, the  six tiles file will be written out at
> !***  same time.

Yes. For example, these model_configure options:

num_files:               2
filename_base:           'atm' 'sfc'
output_grid:             cubed_sphere_grid
output_file:             'netcdf_parallel' 'netcdf_parallel'
ideflate:                1
nbits:                   14
ichunk2d:                96
jchunk2d:                96
ichunk3d:                96
jchunk3d:                96
kchunk3d:                127

will create a sequence of atmf???.nc and sfcf???.nc files, each containing all 6 tiles, written using parallel netCDF and compression.
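As a quick way to confirm those settings took effect, a short Python check (not from this thread) using the netCDF4 library is sketched below; the file name, variable name, and dimension names are assumptions based on typical GFS history conventions:

from netCDF4 import Dataset

with Dataset("atmf000.nc") as nc:
    print(nc.dimensions)        # expect a tile-like dimension of size 6
    v = nc.variables["tmp"]     # "tmp" assumed from typical GFS history names
    print(v.dimensions)         # e.g. (time, tile, pfull, grid_yt, grid_xt)
    print(v.filters())          # deflate settings corresponding to ideflate
    print(v.chunking())         # chunk sizes corresponding to ichunk*/jchunk*/kchunk3d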

@junwang-noaa (Collaborator) commented:

@aerorahul The variable names are not an issue as they are controlled by the run time output configuration file diag_table.

@DusanJovic-NOAA Do you have any sample file from the model_configuration above?

@CoryMartin-NOAA commented:

@junwang-noaa presumably offline post/BUFR soundings, etc. expect the current variable names, so I think we would still want JEDI to be consistent with the rest of UFS and not change variable names in the diag_table to match GEOS. There are other things, like the time variable/units, that differ between UFS and GEOS and will require FV3-JEDI changes.
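One of those differences can be inspected directly; a small illustrative snippet (not from this thread) that prints the time variable's units and calendar from a UFS history file, assuming the variable is named "time":

from netCDF4 import Dataset, num2date

with Dataset("atmf000.nc") as nc:
    t = nc.variables["time"]
    print(t.units, getattr(t, "calendar", "standard"))
    print(num2date(t[:], t.units))   # decode to datetimes regardless of the units string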

@DusanJovic-NOAA (Collaborator) commented:

> @aerorahul The variable names are not an issue as they are controlled by the run time output configuration file diag_table.
>
> @DusanJovic-NOAA Do you have any sample file from the model_configuration above?

See:
/scratch2/NCEPDEV/fv3-cam/Dusan.Jovic/cubed_sphere_grid_single_netcdf
and:
/scratch2/NCEPDEV/fv3-cam/Dusan.Jovic/cubed_sphere_grid_single_netcdf/compressed

These files correspond with:
/scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/develop-20220112/INTEL/control_CubedSphereGrid_debug

@danholdaway commented:

The FV3-JEDI IO code makes no reference to specific field names, so there wouldn't be any benefit to picking something different from what is already used for GFS. We can make the dimension names configurable instead of hardcoded to what GEOS uses. @CoryMartin-NOAA, when you're ready to start refactoring the fv3-jedi code, let's have a quick chat; there are some things about that code that are a little too complex, and we can simplify them somewhat.

FYI @jswhit2, the increments would normally be at a lower cubed-sphere resolution, so some thought is needed about how best to ingest them. You could either interpolate/remap to the native resolution in an extra workflow step and then have just a simple read/IAU, or you would need to be able to create another low-resolution cubed-sphere grid within the increment read routine of UFS and interpolate/remap there before IAU.
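For the first option (remap before the model reads the increment), a crude per-tile sketch in Python is shown below. This is purely illustrative: it interpolates in index space with scipy rather than doing a proper cubed-sphere remap (e.g. with ESMF or fregrid), and none of the names or resolutions come from this thread:

import numpy as np
from scipy.ndimage import zoom

def upscale_tile(inc_tile_lo: np.ndarray, n_hi: int) -> np.ndarray:
    """Interpolate one (ny, nx) increment tile to an (n_hi, n_hi) grid."""
    factor = n_hi / inc_tile_lo.shape[0]
    return zoom(inc_tile_lo, factor, order=1)    # bilinear interpolation in index space

# e.g. a C96 increment tile brought to C384 before applying IAU
inc_lo = np.random.rand(96, 96)
inc_hi = upscale_tile(inc_lo, 384)
print(inc_hi.shape)                              # (384, 384)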

@yangfanglin (Collaborator) commented Jan 14, 2022 via email.

@jswhit2 (Collaborator) commented Jan 14, 2022:

Looping @pjpegion into this discussion, since he wrote the increment reading/interpolation code for FV3

@junwang-noaa (Collaborator) commented:

Maybe I misunderstood the question. I am suggesting name changes in diag_table as a quick test to confirm the new file works in FV3-JEDI; it's just for this PR.

@CoryMartin-NOAA commented Jan 14, 2022:

@junwang-noaa got it. Yeah, that won't really work because, as @danholdaway said, the variables themselves aren't so much the problem; it's the dimension names and the time variable/attributes (I believe).

@DusanJovic-NOAA (Collaborator) commented:

Expecting that UFS and GEOS create history output files with identical metadata is probably unrealistic, hence JEDI will need to be able to handle them differently, at least at some level.

@junwang-noaa (Collaborator) commented:

@CoryMartin-NOAA If more work needs to be done on the JEDI side, we plan to get this PR committed. We can make further changes on the model side if required once JEDI starts testing those files.

@CoryMartin-NOAA commented:

@junwang-noaa that works for me. I'll be sure to let you all know if I run into any issues and if anything needs to be modified. A quick spot check suggests this file format is the same as what we now use for GSI, just with an extra dimension for tiles (and a different grid, of course).
