
parallelization for mmf groundwater scheme #139

Open
wants to merge 1 commit into base: develop

Conversation

CharlesZheZhang
Collaborator

Adding parallelization capability for the MMF groundwater scheme.
Code modifications were done by Gonzalo (gonzalo.miguez@usc.es) and Prasanth (prasanth@iisertvm.ac.in),
and tested and implemented by Zhe (zhezhang@ucar.edu).

Two files are changed under the noahmp directory:
/src/GroundWaterMmfMod.F90
/drivers/hrldas/NoahmpGroundwaterInitMod.F90

Essentially, mpp_land is used in the LATERALFLOW subroutine and the groundwater init subroutine:
https://github.com/CharlesZheZhang/noahmp/blob/30d661ab921372103dbd73da44a23d373cbb2618/drivers/hrldas/NoahmpGroundwaterInitMod.F90#L22
These lines are moved under subroutine NoahmpGroundwaterInitMain(grid, NoahmpIO).

Also, all distributed tiles now communicate when checking for equilibrium within the 500-iteration loop of LATERALFLOW calls:
https://github.com/CharlesZheZhang/noahmp/blob/30d661ab921372103dbd73da44a23d373cbb2618/drivers/hrldas/NoahmpGroundwaterInitMod.F90#L123
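
For illustration, here is a minimal sketch of that cross-tile convergence check, assuming the variable and routine names that appear in the diff excerpts quoted later in this thread (ncount, rcount, sum_real1) and assuming integer niter, ncount and a single-precision rcount are declared; the exact loop structure in NoahmpGroundwaterInitMod.F90 may differ:

do niter = 1, 500
   ! call LATERALFLOW(...) here: per-tile lateral flow update of the water table
   ! after the update, set ncount = number of local cells still adjusting toward equilibrium
#ifdef MPP_LAND
   rcount = float(ncount)    ! sum_real1 operates on a single-precision scalar
   call sum_real1(rcount)    ! reduce the non-converged count across all MPI tasks
   ncount = nint(rcount)
#endif
   if (ncount == 0) exit     ! every tile has reached equilibrium
end do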

@cenlinhe
Collaborator

cenlinhe commented Aug 16, 2024

Thank you, Zhe, for testing.

  1. Are the results from this parallelized run with multiple CPUs the same as the results from a single-CPU run?
  2. How much faster is the parallelized run compared to the single-core run? (to make sure the parallelization code is indeed working)

@CharlesZheZhang
Collaborator Author

Thank you, Zhe, for testing.

  1. Are the results from this parallelized run with multiple CPUs the same as the results from a single-CPU run?
  2. How much faster is the parallelized run compared to the single-core run? (to make sure the parallelization code is indeed working)

The results are identical: I ran ncdiff and the differences between the two runs are 0.
For speed, it depends on the domain size and how many CPUs are used.
For a 600 x 600 domain with 16 CPUs, the MPI run time is 3/8 of the single-CPU run time.

@barlage
Collaborator

barlage commented Aug 16, 2024

@CharlesZheZhang I suggest using something like nccmp to do the comparison; I think this is a more robust check, and it is standard in regression testing. With options like:

nccmp -dsSqf file1.nc file2.nc

it should report: Files "file1.nc" and "file2.nc" are identical.

If there are any data differences, it will report stats on them.

#endif
#ifdef MPP_LAND
rcount=float(ncount)
call sum_real1(rcount)
Collaborator

Is there any issue here with using different precision, since sum_real1 is a single-precision function?

Collaborator

why not just write a new sum_integer function?
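
A minimal sketch of what such a helper could look like, assuming an MPI_Allreduce over the model communicator (MPI_COMM_WORLD and the name sum_integer1 are illustrative assumptions; module_mpp_land may use its own communicator):

subroutine sum_integer1(icount)
   use mpi
   implicit none
   integer, intent(inout) :: icount
   integer :: itmp, ierr
   ! sum the local integer count across all tasks, avoiding the round trip through real(4)
   call MPI_Allreduce(icount, itmp, 1, MPI_INTEGER, MPI_SUM, MPI_COMM_WORLD, ierr)
   icount = itmp
end subroutine sum_integer1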

@@ -225,8 +225,7 @@ subroutine LATERALFLOW (NoahmpIO, ISLTYP,WTD,QLAT,FDEPTH,TOPO,LANDMASK,DELTAT,A
! USE NOAHMP_TABLES, ONLY : DKSAT_TABLE

#ifdef MPP_LAND
! MPP_LAND only for HRLDAS Noah-MP/WRF-Hydro - Prasanth Valayamkunnath (06/10/2022)
use module_mpp_land, only: mpp_land_com_real, mpp_land_com_integer, global_nx, global_ny, my_id
use module_mpp_land !, only: mpp_land_lr_com, mpp_land_ub_com, mpp_land_sync, global_nx, global_ny, my_id
Collaborator

remove the comment if not needed

jtsh=max(jts,jds+1)
jteh=min(jte,jde-2)
jteh=min(jte,jde-1)
Collaborator

these don't change answers for non-parallel runs?

itsh=its
iteh=ite
jtsh=jts
jteh=jte
Collaborator

these don't change answers for non-parallel runs?

Collaborator

good question. I am also curious about this.

Collaborator

In the LIS group, we actually did something similar with these indices when adding the MMF parallelization into Noah-MP-4.0.1.

See this code change:
https://github.com/NASA-LIS/LISF/pull/1418/files#diff-7737fa95c763c3f4c44635b9bf558d8a063138b94eb0e208fc2209d2ed65ef3e

Scroll down to Lines 329 to 338 in the new code.

We haven't merged this PR yet, but we will be doing so soon.

@CharlesZheZhang
Collaborator Author

Thanks @barlage, good suggestion.
I used the nccmp tool, and it reports that the outputs from the single and MPI runs are identical:
Files "output_single/1980010501.LDASOUT_DOMAIN1" and "output_mpi/1980010501.LDASOUT_DOMAIN1" are identical.

@barlage
Collaborator

barlage commented Aug 16, 2024

OK, maybe I misunderstood something here. Some of my comments are based on whether the new code reproduces the old code, i.e., are these non-answer-changing modifications? So, there should probably be four tests:

  1. old serial compared to old mpi (presumably this was the same, but good to check)
  2. new serial compared to new mpi (this seems to be what you checked already)
  3. old serial compared to new serial (answer changing for non-mpi?)
  4. old mpi compared to new mpi (answer changing for mpi?)

@cenlinhe
Collaborator

OK, maybe I misunderstood something here. Some of my comments are based on whether the new code reproduces the old code, i.e., are these non-answer-changing modifications? So, there should probably be four tests:

  1. old serial compared to old mpi (presumably this was the same, but good to check)
  2. new serial compared to new mpi (this seems to be what you checked already)
  3. old serial compared to new serial (answer changing for non-mpi?)
  4. old mpi compared to new mpi (answer changing for mpi?)

There is no old MPI run because originally MMF in HRLDAS could not be run in MPI mode.

@cenlinhe
Collaborator

I agree that you should also compare the old code run in serial against the new code run in serial.

@barlage
Collaborator

barlage commented Aug 16, 2024

@cenlinhe OK, that's what I thought, but that makes me a little confused about why there is MPP_LAND in the current code. For example, here.

@barlage
Collaborator

barlage commented Aug 16, 2024

Given the removal of DM_PARALLEL, do there also need to be WRF and MPAS tests?

@cenlinhe
Collaborator

@cenlinhe OK, that's what I thought, but that makes me a little confused about why there is MPP_LAND in the current code. For example, here.

This was added by Prasanth recently, but it does not work correctly. That is why we asked Gonzalo to fix the issue based on Prasanth's version.

@cenlinhe
Collaborator

Given the removal of DM_PARALLEL, do there also need to be WRF and MPAS tests?

Good point, I hadn't thought about this. Maybe we should use both DM_PARALLEL and MPP_LAND if we do not want to have two MMF code versions in HRLDAS and WRF/MPAS? I am not even sure whether the current MPAS can run with MMF in parallel, given the unstructured grids (I am not familiar with the parallelization of MPAS).

@barlage
Collaborator

barlage commented Aug 16, 2024

Given the removal of DM_PARALLEL, do there also need to be WRF and MPAS tests?

Good point, I hadn't thought about this. Maybe we should use both DM_PARALLEL and MPP_LAND if we do not want to have two MMF code versions in HRLDAS and WRF/MPAS? I am not even sure whether the current MPAS can run with MMF in parallel, given the unstructured grids (I am not familiar with the parallelization of MPAS).

OK, I assume this probably doesn't work in MPAS yet. The use of HRLDAS-specific (or any parent-system-specific) code in the noahmp code is generally not a good idea. This MPP_LAND directive should probably be changed to something like MPI_HRLDAS or DM_PARALLEL_HRLDAS, or something more clearly associated with HRLDAS.

Now I'm questioning whether these changes should be in the general noahmp code. First, I really hate #ifdef directives since they make the code difficult to read/understand. Second, how will this parallelization fit into other parent models? Thinking of how this fits into @dmocko's comment, do additional parent models just keep adding more ifdefs?

I know you are just trying to get parallel code committed, but there are some bigger questions here for these non-column model processes.
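
As a purely hypothetical illustration of the renaming suggestion (the current diff uses MPP_LAND; DM_PARALLEL_HRLDAS is not an existing flag), the guard around the HRLDAS-specific import could read something like:

! hypothetical guard name, shown only to illustrate the suggestion above
#ifdef DM_PARALLEL_HRLDAS
   use module_mpp_land, only: mpp_land_com_real, mpp_land_com_integer, global_nx, global_ny, my_id
#endif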

@CharlesZheZhang
Collaborator Author

Given the removal of DM_PARALLEL, do there also need to be WRF and MPAS tests?

Good point, I hadn't thought about this. Maybe we should use both DM_PARALLEL and MPP_LAND if we do not want to have two MMF code versions in HRLDAS and WRF/MPAS? I am not even sure whether the current MPAS can run with MMF in parallel, given the unstructured grids (I am not familiar with the parallelization of MPAS).

OK, I assume this probably doesn't work in MPAS yet. The use of HRLDAS-specific (or any parent-system-specific) code in the noahmp code is generally not a good idea. This MPP_LAND directive should probably be changed to something like MPI_HRLDAS or DM_PARALLEL_HRLDAS, or something more clearly associated with HRLDAS.

Now I'm questioning whether these changes should be in the general noahmp code. First, I really hate #ifdef directives since they make the code difficult to read/understand. Second, how will this parallelization fit into other parent models? Thinking of how this fits into @dmocko's comment, do additional parent models just keep adding more ifdefs?

I know you are just trying to get parallel code committed, but there are some bigger questions here for these non-column model processes.

Hi Mike and Cenlin,
I will do the single and MPI runs to compare with the old code.
MPAS-A can run with Noah-MP in v8.2.1, but I haven't run it with MMF groundwater yet.
I will let you know soon with some answers.

@cenlinhe
Collaborator

Yes, we need to find a way to make sure these changes work with all different parent models.

@CharlesZheZhang
Collaborator Author

OK, maybe I misunderstood something here. Some of my comments are based on whether the new code reproduces the old code, i.e., are these non-answer-changing modifications? So, there should probably be four tests:

  1. old serial compared to old mpi (presumably this was the same, but good to check)
  2. new serial compared to new mpi (this seems to be what you checked already)
  3. old serial compared to new serial (answer changing for non-mpi?)
  4. old mpi compared to new mpi (answer changing for mpi?)

Hi Mike and Cenlin, thank you for your suggestions. I have conducted some tests with different hrldas/noahmp versions, in both single and MPI runs (v4.5, master, and the new MPI code in this commit).

The short summary of the tests: the master code has very minor differences from v4.5, only on the upper domain boundary, and the new MPI code is identical to the master code in both the single run and the MPI run. The MPI run with 16 CPUs is much faster, only 3/8 of the single-CPU run time.

Three versions are used:
1: hrldas/noahmp v4.5 (this is the pre-refactor code, consistent with WRFv4.5; no MPI capability in hrldas)
2: hrldas/noahmp master (this is the current version; MPI code implemented by Prasanth, but it cannot get past the groundwater init subroutine)
3: hrldas/noahmp new MPI (this is the new MPI version implemented by Gonzalo this summer)

Four simulations are run:
1: v4.5 single run
2: Master single run
3: New MPI code, run with a single CPU
4: New MPI code, run with 16-CPU parallelization

The v4.5 single-CPU run shows minor differences from the master version and the new MPI code; the differences are all on the upper boundary of the domain. Between the master version and the new MPI version, the results are identical.

Please see my notes in the shared Google Drive:
https://docs.google.com/document/d/1BB8V6ncdxU5XfGrQc48MyjkSYoTdO-9uZz8RHSMLg9o/edit?usp=sharing

@cenlinhe
Collaborator

Thank you, Zhe! This means the MPI fix implemented by Gonzalo is correct. We will then need to think about an effective way to handle the MPI flag so that it is compatible with different host models.

@barlage
Collaborator

barlage commented Aug 20, 2024

@CharlesZheZhang Are the differences between versions 1 and 2 expected? This could indicate that there is a problem in the new implementation (or that there was a problem in the old code).
