Add NetCDF C version of mesh conversion tools #514

Merged
xylar merged 14 commits into MPAS-Dev:master from add-netcdf-c-mesh-conversion-tools
Aug 7, 2023

xylar
Collaborator

@xylar xylar commented Jul 4, 2023

This merge adds a new version of the mesh conversion tools that uses the standard NetCDF C library, rather than the legacy NetCDF C++ library.

Most of the work was done by @dengwirda, so I can only take credit for the conda-package plumbing and some minor debugging and clean-up.

We have encountered significant problems with the legacy NetCDF C++ library. Foremost is that it does not perform well on large meshes (over several million cells). The library is not being developed further and even the successor NetCDF C++ library has not had a release in about 4 years and does not appear to be in active use by many projects.

I have chosen to leave the existing mesh conversion tools unchanged for now. The new tools are in their own mesh_conversion_tools_netcdf_c directory. We could eventually decide to remove mesh_conversion_tools and adopt mesh_conversion_tools_netcdf_c in its place, but I have not done that so far.

@dengwirda has ported the mesh converter and cell culler but not the mask creator. Anyone interested could port it, but our group uses other tools for mask creation, so it is not a priority for us. This is one of the reasons for maintaining mesh_conversion_tools as it is.

I have changed the conda package to use mesh_conversion_tools_netcdf_c. Since the mask creator is not available, I have changed the mpas_tools.mesh.conversion.mask() function to use the Python masking capabilities that are part of the conda package.

Finally, this merge includes changes to how mpas_tools.mesh.creation.build_mesh() works: it now uses subprocess calls rather than the convert() and cull() wrapper functions, since the wrapper functions have caused out-of-memory errors for large meshes (and provide no added value in this situation). Note: if desired, I could break this change into its own PR.

These work well even for very large meshes.

The mask creator has not yet been ported and is not included.
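
As a rough illustration of the subprocess approach described above, the sketch below builds the command lines for the two ported tools and runs them as separate processes. It is not the actual build_mesh() code: the helper names (`converter_args`, `culler_args`, `run_tool`) and the file names in the usage comment are hypothetical, and only the executable names and the `-i` flag are taken from this PR.

```python
import subprocess


def converter_args(in_mesh, out_mesh):
    """Command line for the mesh converter: input file, then output file."""
    return ['MpasMeshConverter.x', in_mesh, out_mesh]


def culler_args(in_mesh, out_mesh, inverse_mask=None):
    """Command line for the cell culler, optionally with an inverse mask."""
    args = ['MpasCellCuller.x', in_mesh, out_mesh]
    if inverse_mask is not None:
        args.extend(['-i', inverse_mask])
    return args


def run_tool(args):
    """Run a tool in its own process; raises CalledProcessError on failure."""
    subprocess.check_call(args)


# Usage (assumes the tools are on PATH and the input file exists):
#   run_tool(converter_args('mesh_triangles.nc', 'base_mesh.nc'))
#   run_tool(culler_args('base_mesh.nc', 'culled_mesh.nc',
#                        inverse_mask='mask.nc'))
```

Because the tools read and write files themselves, the mesh never has to be held in the Python process.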
@xylar
Collaborator Author

xylar commented Jul 4, 2023

@mgduda, I want to make sure you're okay with this addition. I realize it could be disruptive or cause confusion for your group to have mesh_conversion_tools and mesh_conversion_tools_netcdf_c. But @dengwirda and I certainly hope the tools will be useful to everyone, not just us.

Please give me your feedback as soon as you're able because we'd like to get this merged in and into use if you're okay with it.

@xylar
Collaborator Author

xylar commented Jul 4, 2023

@mark-petersen and @matthewhoffman, I'm making sure you're aware of this change, since it could affect MPAS-Ocean and MALI. (I'm not sure who to ping for MPAS-Seaice at this point. @darincomeau, is that you?)

@xylar
Collaborator Author

xylar commented Jul 4, 2023

@dengwirda, I'd like your feedback on whether the version of the code here looks good to you. I made only minor bug fixes, so it should basically be what you gave me.

Please also let me know if it works for you. You can install it yourself into a conda environment with:

conda install -c conda-forge/label/mpas_tools_dev mpas_tools=0.22.0rc1

@xylar
Collaborator Author

xylar commented Jul 4, 2023

Testing

I tested this with the compass pr test suite using the current compass main branch on Chrysalis. All tests passed and were bit-for-bit with the version of MPAS-Tools currently used in compass (0.19.0), which is very exciting!

I also plan to test this on the RRS6to18 mesh in MPAS-Dev/compass#576, which failed miserably with the existing mesh_conversion_tools after more than 2 days of execution.

@xylar
Collaborator Author

xylar commented Jul 18, 2023

@mgduda, @mark-petersen, @matthewhoffman and @dengwirda, just a reminder that I'm looking for feedback on this PR.

@dengwirda
Contributor

Thanks @xylar, I've had success on pm-cpu running the following:

conda create -n netcdf_c_tester -c conda-forge/label/mpas_tools_dev mpas_tools=0.22.0rc2
conda activate netcdf_c_tester
python3 test.py

where test.py is:

import time
import xarray

from mpas_tools.mesh.conversion import convert, cull
from mpas_tools.io import write_netcdf
from mpas_tools.mesh.creation import jigsaw_to_netcdf

if (__name__ == "__main__"):

    ttic = time.time()
    print("Forming mesh_triangles.nc")
    jigsaw_to_netcdf(
        msh_filename="mesh.msh",
        on_sphere=True,
        sphere_radius=6371220.,
        output_name="mesh_triangles.nc")
    ttoc = time.time()
    print("jgsw-to-mpas:", ttoc - ttic)

    ttic = time.time()
    print("Forming base_mesh.nc")
    # Convert the triangle mesh, then write the result to base_mesh.nc
    ds_base = convert(
        xarray.open_dataset("mesh_triangles.nc"),
        graphInfoFileName="graph.info")
    write_netcdf(ds_base, fileName="base_mesh.nc", format="NETCDF4")
    ttoc = time.time()
    print("mpas-convert:", ttoc - ttic)

With an 18M cell mesh the output is:

Forming mesh_triangles.nc
jgsw-to-mpas: 1331.344482421875
Forming base_mesh.nc
mpas-convert: 623.2475054264069

and I suspect the non-vectorised call to build the circumcentres is a place where a lot of the step 1 time is being spent.
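
If the circumcentre step is indeed the bottleneck, one possible vectorised form is sketched below with NumPy. This is not the code in jigsaw_to_netcdf; the function name and array layout are assumptions. It relies on the fact that, for three points on a sphere centred at the origin, the normal to the plane through them is equidistant from all three and so points at the spherical circumcentre.

```python
import numpy as np


def spherical_circumcenters(a, b, c, radius=1.0):
    """Circumcentres of many spherical triangles in one vectorised pass.

    a, b, c: arrays of shape (n, 3) holding the Cartesian coordinates of
    the three vertices of n triangles on a sphere centred at the origin.
    """
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    # Normal to each triangle's plane (same dot product with a, b and c)
    normal = np.cross(b - a, c - a)
    normal /= np.linalg.norm(normal, axis=1, keepdims=True)
    # Flip normals that point away from the triangle
    sign = np.sign(np.einsum('ij,ij->i', normal, a))
    return radius * sign[:, np.newaxis] * normal
```

Degenerate triangles (collinear vertices, or vertices on a great circle) are not handled here and would need separate treatment.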

xylar added 2 commits July 29, 2023 11:18
We want to be able to call `mpas_tools.io.logging.check_call()`
with `logger=None` so we can handle cases with and without a
logger without any special treatment.
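
A minimal sketch of how a check_call()-style helper can accept `logger=None`, so callers do not need separate code paths. This is not the mpas_tools implementation; the name `check_call_sketch` and its exact behaviour are assumptions for illustration.

```python
import subprocess


def check_call_sketch(args, logger=None):
    """Run a command, sending its output to `logger` if one is given.

    With logger=None the output is printed instead, so the same call
    works with and without a logger.
    """
    emit = print if logger is None else logger.info
    emit('Running: ' + ' '.join(args))
    result = subprocess.run(args, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    if result.stdout:
        emit(result.stdout.rstrip())
    if result.returncode != 0:
        raise subprocess.CalledProcessError(result.returncode, args,
                                            output=result.stdout)
    return result.stdout
```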
@xylar xylar force-pushed the add-netcdf-c-mesh-conversion-tools branch from cb2f235 to 57d374b Compare July 29, 2023 09:29
@xylar
Collaborator Author

xylar commented Jul 29, 2023

@mgduda, @mark-petersen, @matthewhoffman, another request for you to have a look at this please.

xylar added 6 commits July 29, 2023 12:16
The `cullCell` field doesn't necessarily exist and we want to
not raise an exception if it doesn't.
For now, we don't have an alternative because the mask creator
has not yet been rewritten to support NetCDF-C.

This merge also switches all 3 functions in the mesh.creation
module to use the logging.check_call() function, rather than
an equivalent private function.
The calls to `convert()` and `cull()` wrappers have caused
trouble for big meshes.
The `cull()` wrapper is more trouble in this case.
@xylar xylar force-pushed the add-netcdf-c-mesh-conversion-tools branch from 57d374b to 325d225 Compare July 29, 2023 10:16
Collaborator

@mark-petersen mark-petersen left a comment

Thanks for all your work on this! If the new version passes the PR suite bfb, then it is certainly working. I've looked over the code. I am mostly trying to understand it, and I see that this code will be helpful in learning c++ for Omega. It is strange that the c version of the netcdf libraries are supported and the c++ are not - I would have expected the opposite.

@xylar
Collaborator Author

xylar commented Aug 1, 2023

Thanks @mark-petersen!

@xylar
Collaborator Author

xylar commented Aug 1, 2023

It is strange that the c version of the netcdf libraries are supported and the c++ are not - I would have expected the opposite.

There are several things to say about this. First, both the NetCDF Fortran and C++ libraries are based on (wrappers around?) the NetCDF C library. So the C library gets a lot more attention than the others. The C++ library hasn't been updated in years.

To add to the confusion, there was a major rewrite of the NetCDF C++ API a while back, and our tools use the NetCDF C++ library from before the rewrite, so they are using the "legacy" NetCDF C++ library. That has made a bad situation worse.

The move to the NetCDF C library in @dengwirda's version of the tools means we're using the library that gets the most maintenance, so I think that's unquestionably the right direction to go.

@@ -102,7 +102,7 @@ test:
   - translate_planar_grid -f 'periodic_mesh_10x20_1km.nc' -d 'periodic_mesh_20x40_1km.nc'
   - MpasMeshConverter.x mesh_tools/mesh_conversion_tools/test/mesh.QU.1920km.151026.nc mesh.nc
   - MpasCellCuller.x mesh.nc culled_mesh.nc -m mesh_tools/mesh_conversion_tools/test/land_mask_final.nc
-  - MpasMaskCreator.x mesh.nc arctic_mask.nc -f mesh_tools/mesh_conversion_tools/test/Arctic_Ocean.geojson
+  # - MpasMaskCreator.x mesh.nc arctic_mask.nc -f mesh_tools/mesh_conversion_tools/test/Arctic_Ocean.geojson
Member

Is this intentional?

Collaborator Author

Yes, as I said, this tool has been removed from the conda package.

Member

@matthewhoffman matthewhoffman left a comment

I looked this over to get a sense of the changes, but I don't feel I'm in a position for a critical review. The argument for doing this is convincing, and if the new version works as expected, then I'm happy to approve the PR.

Comment on lines -2152 to +2157
-    ds_culled = cull(dsIn=ds_mesh, dsInverse=ds_mask, logger=logger,
-                     dir=temp_dir)
-    write_netcdf(ds_culled, out_mesh_filename)
+    args = ['MpasCellCuller.x',
+            mesh_filename,
+            out_mesh_filename,
+            '-i', f'{temp_dir}/mask.nc']
+    check_call(args=args, logger=logger)
Collaborator Author

@matthewhoffman, I saw a question about this change (that maybe got deleted).

In general, I'm moving away from the wrapper function calls in any case where we want to read in and write out files anyway. The wrapper functions have to write out temporary files before calling the tools under the hood, so there's an unnecessary extra level of reading in and writing back out files with the wrapper functions if they already exist.
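
To make the extra I/O concrete, here is a sketch of what a file-based wrapper typically has to do compared with a direct call. It is not the mpas_tools wrapper code: `wrapped_tool` and `direct_tool` are hypothetical, and a small Python one-liner that copies a file stands in for the real command-line tool.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for a command-line tool: copies its input file to its output file
COPY_TOOL = [sys.executable, '-c',
             'import shutil, sys; shutil.copyfile(sys.argv[1], sys.argv[2])']


def wrapped_tool(data):
    """In-memory wrapper: write a temp file, run the tool, read the result."""
    with tempfile.TemporaryDirectory() as temp_dir:
        in_path = os.path.join(temp_dir, 'in.nc')
        out_path = os.path.join(temp_dir, 'out.nc')
        with open(in_path, 'wb') as f:      # extra write before the tool runs
            f.write(data)
        subprocess.check_call(COPY_TOOL + [in_path, out_path])
        with open(out_path, 'rb') as f:     # extra read after the tool runs
            return f.read()


def direct_tool(in_path, out_path):
    """Direct call: the files are already on disk, so just run the tool."""
    subprocess.check_call(COPY_TOOL + [in_path, out_path])
```

When the caller would write the wrapper's result straight back to disk anyway, the direct form avoids one full write and one full read of the mesh, and never holds it in memory.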

@xylar
Collaborator Author

xylar commented Aug 3, 2023

@mgduda, I'm assuming you may be away. I am going to merge this on Monday unless I hear from you.

@xylar xylar force-pushed the add-netcdf-c-mesh-conversion-tools branch 3 times, most recently from 871ec07 to 781726e Compare August 4, 2023 12:57
We need one per optional variable or attribute.  Otherwise all
subsequent variables or attributes get skipped when the first is
not present.
@xylar xylar force-pushed the add-netcdf-c-mesh-conversion-tools branch from 781726e to 19ed985 Compare August 4, 2023 13:09
xylar added 2 commits August 4, 2023 15:13
Remove limit on number of arguments (since the code handles
unlimited numbers of masks)
@xylar xylar force-pushed the add-netcdf-c-mesh-conversion-tools branch from 19ed985 to 3e28c22 Compare August 4, 2023 13:13
@xylar xylar mentioned this pull request Aug 4, 2023
@xylar
Collaborator Author

xylar commented Aug 4, 2023

More testing

With my recent fixes to the cell culler, I am re-testing with the following meshes. A check mark means the mesh produced by compass is bit-for-bit (BFB) identical to the one produced with the 0.21.0 release.

  • ocean/global_ocean/QU240/mesh
  • ocean/global_ocean/Icos240/mesh
  • ocean/global_ocean/QUwISC240/mesh
  • ocean/global_ocean/QU/mesh
  • ocean/global_ocean/Icos/mesh
  • ocean/global_ocean/QUwISC/mesh
  • ocean/global_ocean/IcoswISC/mesh
  • ocean/global_ocean/EC30to60/mesh
  • ocean/global_ocean/ECwISC30to60/mesh

@xylar xylar removed the request for review from mgduda August 7, 2023 10:05
@xylar xylar merged commit fb9c542 into MPAS-Dev:master Aug 7, 2023
@xylar xylar deleted the add-netcdf-c-mesh-conversion-tools branch August 7, 2023 10:05
4 participants