Add metadata to MPAS Meshes #507

Merged
merged 11 commits into from
Apr 13, 2020

Conversation

xylar
Collaborator

@xylar xylar commented Apr 3, 2020

This metadata is formatted according to the specs of @proteanplanet, see examples below.

To do this, each case with an e3sm_coupling step has been given its own config file that contains either the mesh metadata or ways to construct it from other keys or autodetect it from mesh fields.

Config files:

  • QU240
  • QU240wISC
  • EC60to30
  • EC60to30wISC
  • CUSP12
  • CUSP8
  • SO60to10wISC

@xylar xylar added the Ocean, COMPASS, in progress, and For discussion (PRs and Issues that are open for discussion and feedback) labels Apr 3, 2020
@xylar
Collaborator Author

xylar commented Apr 3, 2020

Example from the QU240wISC test case:

$ ncdump -h mesh.nc
...
		:MPAS\ Mesh\ Short\ Name = "QU.wISC.E3SMv2.240km.64L.rev001" ;
		:MPAS\ Mesh\ Long\ Name = "Quasi-uniform mesh Ice Shelf Cavities for E3SM Version 2, 240km resolution, 64 vertical levels" ;
		:MPAS\ Mesh\ Description = "MPAS quasi-uniform mesh at 240-km resolution includes cavities under the ice shelves in the Antarctic" ;
		:MPAS\ Mesh\ E3SM\ Version = 2 ;
		:MPAS\ Mesh\ QU.wISC\ Version = "001" ;
		:MPAS\ Mesh\ QU.wISC\ Version\ Author = "Micky Mouse" ;
		:MPAS\ Mesh\ QU.wISC\ Version\ Author\ Institution = "Disneyland" ;
		:MPAS\ Mesh\ QU.wISC\ Version\ Creation\ Date = "04/03/2020 16:21:27" ;
		:MPAS\ Mesh\ QU.wISC\ Minimum\ Resolution\ \(km\) = 240. ;
		:MPAS\ Mesh\ QU.wISC\ Maximum\ Resolution\ \(km\) = 240. ;
		:MPAS\ Mesh\ QU.wISC\ Maximum\ Depth\ \(m\) = 6000. ;
		:MPAS\ Mesh\ QU.wISC\ Number\ of\ Levels = 64. ;
		:MPAS\ Mesh\ Ice\ Shelf\ Cavities = "ON" ;
		:MPAS\ Mesh\ Runoff\ Description = "<<<Spreading function described here>>>" ;
		:MPAS\ Mesh\ COMPASS\ Version = "0.1.2" ;
		:MPAS\ Mesh\ JIGSAW\ Version = "0.9.12" ;
		:MPAS\ Mesh\ JIGSAW-Python\ Version = "0.2.1" ;
		:MPAS\ Mesh\ MPAS-Tools\ Version = "0.0.9" ;
		:MPAS\ Mesh\ NCO\ Version = "4.9.2" ;
		:MPAS\ Mesh\ ESMF\ Version = "8.0.0" ;
		:MPAS\ Mesh\ Geometric\ Features\ Version = "0.1.6" ;
		:MPAS\ Mesh\ Metis\ Version = "5.1.0" ;
		:MPAS\ Mesh\ pyremap\ Version = "0.0.5" ;

@xylar
Collaborator Author

xylar commented Apr 3, 2020

@proteanplanet, the spaces in the attribute names are kind of non-standard and don't format well in ncdump. Do you want to use underscores or something else? Or are you okay with the escape characters? (The same is true of parentheses, like in the units.)

@proteanplanet

Hi @xylar, Underscores are fine and I agree that formatting with ncdump is important. With regard to Author and Institution, the only way that's ever worked in a semi-automatic way for me is with environment variables.
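
As a rough illustration of the renaming discussed above, the sketch below converts an attribute name with spaces and parenthesized units into an underscore-separated one that `ncdump` can print without escape characters. The helper name `sanitize_attribute_name` is hypothetical and is not taken from the PR; periods and hyphens are deliberately left untouched.

```python
import re


def sanitize_attribute_name(name):
    """Return an attribute name without spaces or parentheses.

    Hypothetical helper for illustration only, not the PR's code.
    """
    # Drop parentheses, e.g. around units such as "(km)"
    name = re.sub(r"[()]", "", name)
    # Replace each run of spaces with a single underscore
    return re.sub(r" +", "_", name)


print(sanitize_attribute_name("MPAS Mesh Short Name"))
print(sanitize_attribute_name("MPAS Mesh QU.wISC Maximum Depth (m)"))
```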

@proteanplanet

@xylar : Another thought that occurred to me on the Author and Institution problem. Instead of using "Institution" as a field, we could instead use email address, which usually is reflective of the institution anyway. Both name and email address can be grabbed from .gitconfig. Anyhow, just an idea.

@xylar
Collaborator Author

xylar commented Apr 3, 2020

@proteanplanet, that's an idea. I was thinking of the .gitconfig as well. I use my personal email for GitHub so I'd need to override that for myself but that's not a big deal.

I went ahead and added command-line options for filling in author and institution but that's not a complete solution yet.

@proteanplanet

I am pasting Jon's comment here from confluence: "We may need to revisit the naming convention – E3SM tests use periods to separate fields in a test name, meaning

create_test SMS.T62_oEC60to30v3.GMPAS-IAF.anvil_intel

has the test type (SMS) separated from the grid resolution (T62_oEC60to30v3) and compset (GMPAS-IAF), etc by periods. So the naming convention as it stands would cause problems. Or we at least need a “short” naming convention that would work with these limitations."

Do we need to consider this in the current PR?
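
To make the concern concrete, here is a minimal sketch of what happens when a period-delimited test name is split into fields. The second test name, which embeds a dotted mesh name, is a made-up example and not a real E3SM test; how E3SM actually parses test names is not shown in this thread and is only approximated here by a plain split on periods.

```python
def split_test_name(test_name):
    """Split a period-delimited test name into its fields."""
    return test_name.split(".")


# The example from the quoted comment: test type, grid, compset, etc.
fields = split_test_name("SMS.T62_oEC60to30v3.GMPAS-IAF.anvil_intel")
print(len(fields), fields)

# Hypothetical test name using a dotted mesh short name: the mesh name
# itself is broken into several fields, so the grid can no longer be
# recovered as a single period-delimited field.
dotted = split_test_name(
    "SMS.T62_oQU.wISC.E3SMv2.240km.64L.rev001.GMPAS-IAF.anvil_intel")
print(len(dotted), dotted)
```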

@vanroekel
Contributor

Just so I understand, on mesh creation this metadata gets created on its own? This is a ton of metadata to expect someone to have to fill in on creation by hand.

@xylar if using the gitconfig to fill email address, can you share how you would swap your personal email for lanl? I would need to do the same.

Also, thinking more about the name for meshes, I'd like to propose a change. I think the E3SMv2 is in an odd place. I'd prefer something like

CUSP.12kmto60km.L64.E3SMv2.rev001

@xylar
Collaborator Author

xylar commented Apr 4, 2020

@proteanplanet

> Do we need to consider this in the current PR?

Yes, absolutely! That's why this is a discussion. What would the proposed change be? Underscores instead of periods?

@vanroekel

> Just so I understand, on mesh creation this metadata gets created on its own? This is a ton of metadata to expect someone to have to fill in on creation by hand.

First, this doesn't happen at mesh creation but rather at the e3sm_coupling stage. We could switch this to happening at an earlier stage (culled_mesh or initial_state) but it seems like we don't need this metadata for a typical standalone configuration, and the e3sm_coupling stage (essentially the last step after spin-up of an initial condition) is good enough.

This also gives us the option of having slightly different metadata for the sea-ice initial condition if desired.

Most of the metadata comes from a config file that the user would not need to alter (it is specific to the test case). Other parts come from the mesh itself, or the conda environment, and are automatic. Only the author and institution are not automatic at the moment but the git idea is a good one for solving that.

> @xylar if using the gitconfig to fill email address, can you share how you would swap your personal email for lanl? I would need to do the same.

My plan would be to have a config option for email address with a default value of autodetect. You could fill in your email before running the e3sm_coupling step in config_E3SM_coupling_files.ini. In other words, you would need to remember to change this config file each time you are creating E3SM coupling files.
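
A minimal sketch of how such an `autodetect` default could be resolved, assuming the fallback is a call to `git config --get`. The function names `git_config_value` and `resolve_option` are hypothetical and are not taken from the PR.

```python
import subprocess


def git_config_value(key):
    """Look up a key such as user.name or user.email with `git config`."""
    result = subprocess.run(["git", "config", "--get", key],
                            capture_output=True, text=True, check=False)
    return result.stdout.strip()


def resolve_option(value, git_key, lookup=git_config_value):
    """Return the config value, or ask git if it is set to 'autodetect'."""
    if value.strip().lower() != "autodetect":
        # An explicit entry in the config file always wins
        return value.strip()
    detected = lookup(git_key)
    if not detected:
        raise ValueError(
            f"Could not detect {git_key}; set it in the config file.")
    return detected


# An explicit value is returned unchanged, without calling git
print(resolve_option("someone@example.org", "user.email"))
```

With this arrangement, overriding a personal address only requires replacing `autodetect` with the desired address in the config file.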

> Also, thinking more about the name for meshes, I'd like to propose a change. I think the E3SMv2 is in an odd place. I'd prefer something like CUSP.12kmto60km.L64.E3SMv2.rev001

Yes, I think that makes more sense to me, too. It's more similar to what we've done so far and it puts the more relevant info earlier.

Did you want km twice in 12kmto60km? That wasn't @proteanplanet's suggestion and I prefer to avoid the redundancy if possible.

@xylar
Collaborator Author

xylar commented Apr 4, 2020

Here's the updated version of the metadata:

		:MPAS_Mesh_Short_Name = "QU_wISC_240km_64L_E3SMv2_rev001" ;
		:MPAS_Mesh_Long_Name = "Quasi-uniform mesh Ice Shelf Cavities for E3SM Version 2, 240km resolution, 64 vertical levels" ;
		:MPAS_Mesh_Description = "MPAS quasi-uniform mesh at 240-km resolution includes cavities under the ice shelves in the Antarctic" ;
		:MPAS_Mesh_E3SM_Version = 2 ;
		:MPAS_Mesh_QU_wISC_Version = "001" ;
		:MPAS_Mesh_QU_wISC_Version_Author = "Xylar Asay-Davis" ;
		:MPAS_Mesh_QU_wISC_Version_Author_E-mail = "xylarstorm@gmail.com" ;
		:MPAS_Mesh_QU_wISC_Version_Creation_Date = "04/04/2020 10:53:30" ;
		:MPAS_Mesh_QU_wISC_Minimum_Resolution_km = 240. ;
		:MPAS_Mesh_QU_wISC_Maximum_Resolution_km = 240. ;
		:MPAS_Mesh_QU_wISC_Maximum_Depth_m = 6000. ;
		:MPAS_Mesh_QU_wISC_Number_of_Levels = 64. ;
		:MPAS_Mesh_Ice_Shelf_Cavities = "ON" ;
		:MPAS_Mesh_Runoff_Description = "<<<Spreading function described here>>>" ;
		:MPAS_Mesh_COMPASS_Version = "0.1.2" ;
		:MPAS_Mesh_JIGSAW_Version = "0.9.12" ;
		:MPAS_Mesh_JIGSAW-Python_Version = "0.2.1" ;
		:MPAS_Mesh_MPAS-Tools_Version = "0.0.9" ;
		:MPAS_Mesh_NCO_Version = "4.9.2" ;
		:MPAS_Mesh_ESMF_Version = "8.0.0" ;
		:MPAS_Mesh_geometric_features_Version = "0.1.6" ;
		:MPAS_Mesh_Metis_Version = "5.1.0" ;
		:MPAS_Mesh_pyremap_Version = "0.0.5" ;

This is without me replacing my personal email address (bots, knock yourselves out!). It uses my work email address if I put it into the config file.

@xylar xylar requested a review from vanroekel April 4, 2020 08:58
Comment on lines +28 to +30
# The following options are detected from .gitconfig if not explicitly entered
author = autodetect
email = autodetect
Collaborator Author

@vanroekel, a copy of this file will end up in the e3sm_coupling directory. You could edit the email option here and put in your LANL email address.

Comment on lines 10 to 26
[mesh]
short_name = ${prefix}_${min_res}to${max_res}km_${levels}L_E3SMv${e3sm_version}_rev${mesh_version}
prefix = EC_wISC
long_name = Eddy Closure mesh with Ice Shelf Cavities for E3SM Version
            ${e3sm_version}, ${min_res}-${max_res}km resolution,
            ${levels} vertical levels
description = MPAS Eddy Closure mesh with enhanced resolution around the equator
              (30 km) and poles (35 km) with 60-km resolution at mid latitudes
              and includes cavities under the ice shelves in the Antarctic.
e3sm_version = 2
mesh_version = 001
creation_date = autodetect
min_res = 30
max_res = 60
max_depth = autodetect
levels = autodetect
runoff_description = <<<Spreading function described here>>>
Collaborator Author

@vanroekel, This is where a lot of the metadata comes from. The various ${blah} entries come from other config options. (This uses the so-called ExtendedInterpolation for config files.) Anything with autodetect will get looked up somehow or other.
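
For readers unfamiliar with `ExtendedInterpolation`, the following sketch shows how Python's standard `configparser` expands `${...}` references to other options in the same section. The config text is a trimmed copy of the `[mesh]` section; `levels = 64` is an assumed value standing in for `autodetect`, which the PR resolves by other means not shown here.

```python
from configparser import ConfigParser, ExtendedInterpolation

# Trimmed [mesh] section; levels = 64 is an assumption for illustration
CONFIG_TEXT = """
[mesh]
short_name = ${prefix}_${min_res}to${max_res}km_${levels}L_E3SMv${e3sm_version}_rev${mesh_version}
prefix = EC_wISC
e3sm_version = 2
mesh_version = 001
min_res = 30
max_res = 60
levels = 64
"""

config = ConfigParser(interpolation=ExtendedInterpolation())
config.read_string(CONFIG_TEXT)

# Each ${option} is replaced by the value of that option in [mesh]
short_name = config.get("mesh", "short_name")
print(short_name)  # EC_wISC_30to60km_64L_E3SMv2_rev001
```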

@vanroekel
Contributor

@xylar, thanks for the explanations. Regarding the tag, that was a mistake; I don't think we should have km twice in there. 12to60km is much better.

@xylar
Collaborator Author

xylar commented Apr 5, 2020

Testing

I tested a merge of this PR, #506, #508, #510 and #511 together through spin-up for the following test cases:

  • EC60to30
  • EC60to30wISC
  • SO60to10wISC

@jonbob
Contributor

jonbob commented Apr 6, 2020

@xylar - I think we need a really short name as well. As useful as it is to have this much information embedded in even the short name, the reality may be that we can't afford the short name that gets used by E3SM to be that long. The resolution for any coupled case has three grid specifications now that we have tri-grids, and I don't believe it will be workable if each of those grids requires 30+ characters to define it (like QU_wISC_240km_64L_E3SMv2_rev001). So we also need a shorthand notation -- my first recommendation is dropping the underscores completely, since they are typically used to separate the meshes for the different components in an E3SM resolution definition (i.e. T62_oEC60to30v3). Can we use the QU240 name as an example, and see exactly what information is necessary for an E3SM name?

  1. QU_wISC_240km_64L_E3SMv2_rev001
  2. QUwISC240km64LE3SMv2rev001 -- dropping underscores
  3. QUwISC240km64LE3SMv2r01 -- replacing rev001 with r01 (are we really likely to have more than 99 revisions of any grid?)
  4. QUwISC240km64LE2r01 -- replacing E3SMv2 with E2
  5. QUwISC24064LE2r01 -- getting rid of km (do we ever use anything else?)
  6. QU240wISC64LE2r01 -- reordered to keep some of the numbers separated

This is not great, but has gone from 31 characters to 17 and still has all the information? Anyone else?
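
The character counts for the six variants above can be checked with a few string replacements. This is only a sketch that reproduces the listed names; the last step is a manual reordering, so it is entered by hand.

```python
# Step 1: the underscore-separated short name from the list above
steps = ["QU_wISC_240km_64L_E3SMv2_rev001"]
steps.append(steps[-1].replace("_", ""))          # 2. drop underscores
steps.append(steps[-1].replace("rev001", "r01"))  # 3. rev001 -> r01
steps.append(steps[-1].replace("E3SMv2", "E2"))   # 4. E3SMv2 -> E2
steps.append(steps[-1].replace("km", ""))         # 5. drop km
steps.append("QU240wISC64LE2r01")                 # 6. reordered by hand

for number, name in enumerate(steps, start=1):
    print(f"{number}. {name} ({len(name)} characters)")
```

The lengths come out as 31, 26, 23, 19, 17 and 17 characters, which matches the reduction from 31 to 17 mentioned above.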

@xylar
Collaborator Author

xylar commented Apr 6, 2020

The verbose names were at @maltrud's and @proteanplanet's request, so they'll need to weigh in. I don't see a value in a short name distinct from the "really short name". When would we use the short name? @maltrud and @proteanplanet, does this essentially leave us back where we started with mesh names, with the exception that we now include the number of vertical layers and the E3SM version?

The main problem with 5 and 6 above is the numbers running together. For a case without ice shelves, we would have QU24064LE2r01. However, the simple solution is to move the L in front of the number of levels: QU240L64E2r01, or with ice shelves, QUwISC240L64E2r01. I don't have a strong preference but it seems odd to specify "with ice-shelf cavities" (wISC) in the middle of specifying the resolution (240L64) so I'd keep it toward the beginning.
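
A small sketch of the identifier-first layout proposed here, in which each number is preceded by its label (`L` for levels, `E` for the E3SM version, `r` for the revision). The function `really_short_name` and its parameters are hypothetical and only mirror the two examples in this comment; the convention finally adopted may differ.

```python
def really_short_name(prefix, resolution, levels, e3sm_version, revision,
                      ice_shelf_cavities=False):
    """Build a compact mesh name with wISC kept next to the prefix."""
    isc = "wISC" if ice_shelf_cavities else ""
    # Revision is zero-padded to two digits, e.g. 1 -> "r01"
    return f"{prefix}{isc}{resolution}L{levels}E{e3sm_version}r{revision:02d}"


print(really_short_name("QU", "240", 64, 2, 1))
print(really_short_name("QU", "240", 64, 2, 1, ice_shelf_cavities=True))
```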

@jonbob and @vanroekel, does anyone outside of the COSIM team need to weigh in on these names? Anyone else from infrastructure, water cycle, etc.?

@jonbob
Contributor

jonbob commented Apr 6, 2020

@xylar - I agree that L64 is more consistent with E2 and r01, that is, the identifier first and the value second. And it has the benefit of keeping the numbers distinct.

I think we are squeezed between keeping the names short and wanting as much information as possible. I think that's where the metadata you've added is especially critical. We'll let others weigh in about how much has to be in the actual name...

Contributor

@milenaveneziani milenaveneziani left a comment

This is very nice, @xylar.

@proteanplanet proteanplanet left a comment

Thanks @xylar. I think this is well done.

xylar added 9 commits April 10, 2020 20:49
Each test case with an E3SM coupling step now needs a default and
a custom config file for that test case.  The custom config file
will be used to point to the correct initial condition, give the
full mesh name and other metadata, etc.

For now, only the 2 EC60to30 test cases have these custom config
files, but the remaining test cases with E3SM coupling steps will
have these added later on.
This metadata is formatted according to the specs of @proteanplanet
Switch from institution to email address, which can be parsed from
git config (along with the name).

Switch metadata from having spaces to underscores.

Switch mesh short names from dot separation to underscores.
For example, `ECwISC60to30kmL64E3SMv2r01`
short_name is now what E3SM will hopefully use in test and case
names while long_name is what is getting put in mapping, initial
condition, etc.
Also, switch back from wI to wISC even in the short name
@xylar xylar force-pushed the add_mpas_mesh_metadata branch from d072f56 to adf34f8 Compare April 10, 2020 19:12
@mark-petersen
Contributor

Tested e3sm_coupling with 1 and 36 cores. Appears to work with QU240, QU240wISC, EC60to30. Thanks @xylar!

        raise ValueError("mesh name not found in path. Please specify "
                         "the mesh_name in config_E3SM_coupling_files.ini.")
    else:
        print("- mesh name specified in config file: ", mesh_name)
Contributor

It would be nice to see both the long and short names here.

Collaborator Author

Okay, I can add this.

Collaborator Author

Fixed.

@mark-petersen
Contributor

It looks like the long name is used everywhere in E3SM file names, not the short name. Was that intentional? I like the short name better.

@xylar
Collaborator Author

xylar commented Apr 10, 2020

> It looks like the long name is used everywhere in E3SM file names, not the short name. Was that intentional? I like the short name better.

That was the consensus as I understood it. Long names where possible but short names in mapping files (and eventually in test and case names) where the long name is too long to be practical.

#507 (comment)

@mark-petersen, I'm reluctant to revisit this after a 50+ message conversation but if you feel strongly, we need to have you and @maltrud come to an agreement on this.

@mark-petersen
Contributor

OK, thanks. I see now. Thanks for pointing that out. I'll go with the decision above.

@mark-petersen mark-petersen self-requested a review April 13, 2020 14:16
Contributor

@mark-petersen mark-petersen left a comment

Thanks, everyone, for your work on this. This naming convention will really clarify our meshes.

@mark-petersen mark-petersen merged commit 25cbc8c into MPAS-Dev:ocean/develop Apr 13, 2020
@xylar xylar deleted the add_mpas_mesh_metadata branch April 13, 2020 15:20
@matthewhoffman matthewhoffman added the COMPASS, For discussion (PRs and Issues that are open for discussion and feedback), in progress, and Ocean labels Mar 17, 2021
@xylar xylar removed the in progress label Feb 18, 2022
Labels
COMPASS For discussion PRs and Issues that are open for discussion and feedback Ocean
8 participants