Use PNETCDF to couple WRF-SFIRE with mass-consistent solver #18

Closed
janmandel opened this issue May 31, 2020 · 21 comments
Labels: duplicate (This issue or pull request already exists)

Comments


janmandel commented May 31, 2020

For the PREEVENTS project.
Comments on the WRF-SFIRE edits go here; for the other side of the coupling see UtahEFD/QES-Winds#2.


janmandel commented Jul 2, 2020


janmandel commented Jul 6, 2020

CHPC has installed pnetcdf/1.11.2; module load pnetcdf sets PNETCDF_INCDIR and PNETCDF_LIBDIR, apparently built with intel18.
On Cheyenne, module load pnetcdf also sets PNETCDF and reloads on module swap intel gnu.
https://cug.org/proceedings/cug2016_proceedings/includes/files/pap123s2-file1.pdf
Build with pnetcdf: /glade/work/jmandel/WRF-SFIRE-pnetcdf
Run with pnetcdf: /glade/p/univ/ucud0004/jmandel/em_rxcadre_pnetcdf


janmandel commented Jul 14, 2020

On CHPC, getting nf90mpi_open: wrf.nc error -128, "NetCDF: Attempt to use feature that was not turned on when netCDF was built." It seems to happen only with the wrfout files created by WRF. With wrf.nc -> testfile.nc from https://github.com/janmandel/pnetcdf-tests this does not happen. Maybe some version issue, like conda-forge/libnetcdf-feedstock#42 or Unidata/netcdf4-python#713.
tools/nc4-test.exe can write netCDF-4 compressed files and sets NC_NETCDF4 in nc_create. Its output used as wrf.nc -> nc4_test.nc gives the same error as the wrfout. Maybe pnetcdf on CHPC was compiled without compression? See wrf-model#583 for more on the compression and classic-format issue.
nf-config shows what was compiled into the NetCDF library
Finally, on CHPC:
$ ncdump -k nc4_test.nc
netCDF-4
(created by WRF, cannot be opened by pnetcdf)
$ ncdump -k testfile.nc
cdf5
(created by pnetcdf tests, can be opened by pnetcdf)
/uufs/chpc.utah.edu/sys/installdir/pnetcdf/1.11.2i18
/uufs/chpc.utah.edu/sys/installdir/netcdf-c/4.4.1.1i18-c7
/uufs/chpc.utah.edu/sys/installdir/netcdf-f/4.4.4i18-c7
Here is why:
$ pnetcdf-config --netcdf4
disabled
Fix: compile with setenv NETCDF_classic 1
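
For reference, a minimal Fortran sketch (not part of WRF-SFIRE; the program name is made up) that opens wrf.nc through the PnetCDF Fortran 90 API the way the coupling code does with nf90mpi_open, and prints the library's error string. On a netCDF-4/HDF5 wrfout this fails with the "feature was not turned on" error quoted above, while a classic/CDF-5 file opens cleanly:

program try_open
  use mpi
  use pnetcdf
  implicit none
  integer :: ierr, ncid, status

  call MPI_Init(ierr)
  ! Open the file read-only through PnetCDF.
  status = nf90mpi_open(MPI_COMM_WORLD, 'wrf.nc', NF90_NOWRITE, MPI_INFO_NULL, ncid)
  if (status /= NF90_NOERR) then
     ! PnetCDF handles only the classic formats (CDF-1/2/5), so a
     ! netCDF-4/HDF5 file produces an error here.
     print *, 'nf90mpi_open: wrf.nc ', trim(nf90mpi_strerror(status))
  else
     print *, 'wrf.nc opened: classic/CDF-5 format'
     status = nf90mpi_close(ncid)
  end if
  call MPI_Finalize(ierr)
end program try_open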


janmandel commented Jul 14, 2020

From https://www.unidata.ucar.edu/software/netcdf/docs/parallel_io.html
NetCDF-4 provides parallel file access to both classic and netCDF-4/HDF5 files. The parallel I/O to netCDF-4 files is achieved through the HDF5 library while the parallel I/O to classic files is through PnetCDF. A few functions have been added to the netCDF C API to handle parallel I/O. You must build netCDF-4 properly to take advantage of parallel features (see Building with Parallel I/O Support).
The nc_open_par() and nc_create_par() functions are used to create/open a netCDF file with parallel access.

From https://en.wikipedia.org/wiki/NetCDF#Parallel-NetCDF
An extension of netCDF for parallel computing called Parallel-NetCDF (or PnetCDF) has been developed by Argonne National Laboratory and Northwestern University.[25] This is built upon MPI-IO, the I/O extension to MPI communications. Using the high-level netCDF data structures, the Parallel-NetCDF libraries can make use of optimizations to efficiently distribute the file read and write applications between multiple processors. The Parallel-NetCDF package can read/write only classic and 64-bit offset formats. Parallel-NetCDF cannot read or write the HDF5-based format available with netCDF-4.0. The Parallel-NetCDF package uses different, but similar APIs in Fortran and C. Parallel I/O in the Unidata netCDF library has been supported since release 4.0, for HDF5 data files. Since version 4.1.1 the Unidata NetCDF C library supports parallel I/O to classic and 64-bit offset files using the Parallel-NetCDF library, but with the NetCDF API.

From https://parallel-netcdf.github.io
NetCDF started to support parallel I/O from version 4, whose parallel I/O feature was at first built on top of parallel HDF5. Thus, the file format required by NetCDF-4 parallel I/O operations was restricted to HDF5 format. Starting from the release of 4.1, NetCDF has also included a dispatcher that enables parallel I/O operations on files in classic formats (CDF-1 and 2) through PnetCDF. Official support for the CDF-5 format started in the release of NetCDF 4.4.0.
Note NetCDF now can be built with PnetCDF as its sole parallel I/O mechanism by using command-line option "--disable-netcdf-4 --enable-pnetcdf". Certainly, NetCDF can also be built with both PnetCDF and Parallel HDF5 enabled. In this case, a NetCDF program can choose either PnetCDF or Parallel HDF5 to carry out the parallel I/O by adding NC_MPIIO or NC_NETCDF4 respectively to the file open/create flag argument when calling API nc_create_par or nc_open_par. When using PnetCDF underneath, the files must be in the classic formats (CDF-1/2/5). Similarly for HDF5, the files must be in the HDF5 format (aka NetCDF-4 format). A few NetCDF-4 example programs are available that shows parallel I/O operations through PnetCDF and HDF5.

See also Parallel I/O and Portable Data Formats: PnetCDF and NetCDF4
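
As a concrete illustration of the "classic formats only" point (a standalone sketch, not WRF code; the file, dimension, and variable names are made up), creating a file through PnetCDF with the NF90_64BIT_DATA flag yields a CDF-5 file, the same kind that ncdump -k reports as cdf5 above; there is no way to ask PnetCDF for the netCDF-4/HDF5 format:

program make_cdf5
  use mpi
  use pnetcdf
  implicit none
  integer :: ierr, status, ncid, dimid, varid
  integer(kind=MPI_OFFSET_KIND) :: nx

  call MPI_Init(ierr)
  nx = 10
  ! NF90_64BIT_DATA selects CDF-5; NF90_64BIT_OFFSET would give CDF-2,
  ! and the default is CDF-1.
  status = nf90mpi_create(MPI_COMM_WORLD, 'cdf5_demo.nc', &
                          ior(NF90_CLOBBER, NF90_64BIT_DATA), MPI_INFO_NULL, ncid)
  status = nf90mpi_def_dim(ncid, 'x', nx, dimid)
  status = nf90mpi_def_var(ncid, 'u', NF90_FLOAT, (/ dimid /), varid)
  status = nf90mpi_enddef(ncid)
  status = nf90mpi_close(ncid)
  call MPI_Finalize(ierr)
end program make_cdf5

ncdump -k cdf5_demo.nc should then report cdf5, the same as testfile.nc above.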


janmandel commented Jul 16, 2020

From https://stackoverflow.com/questions/59506059/parallel-read-write-of-netcdf-file-using-fortran-and-mpi: The pnetcdf package (sometimes called parallel-netcdf) is an independent library, a totally separate implementation of netCDF ... high-performance parallel I/O library for accessing Unidata's NetCDF, files in classic formats ... pnetcdf has a netCDF-like API, but the function names are different. If pnetcdf is used in stand-alone mode (i.e. without the Unidata netCDF libraries) then user code must be written in the pnetcdf API. This code will not run using the Unidata netCDF library, it will run with pnetcdf only.... Also, pnetcdf can only be used with netCDF classic formats. It cannot read/write HDF5 files.

I/O on few processors (<10)

Using Unidata's netcdf-c/netcdf-fortran libraries would be simplest. Build pnetcdf, HDF5, then netcdf-c, then netcdf-fortran, all with MPI compilers. Make sure you specify --enable-parallel when building HDF5. (Not necessary with netcdf-c, netcdf-fortran, they will automatically detect parallel features of the HDF5 build).

Once built, the netcdf C and Fortran APIs can do parallel I/O on any netCDF file. (And also on almost all HDF5 files.) Use nc_open_par()/nc_create_par() to get parallel I/O.

I/O on some processors (10 - 1000)

Use of pnetcdf may be simplest and give best performance for classic format files. It has a slightly different API and will not work for HDF5 files.
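
For the first route (letting Unidata's netcdf-fortran library do the parallel I/O), the Fortran counterpart of nc_open_par is the optional comm and info arguments of nf90_open. A minimal sketch, assuming netcdf-c/netcdf-fortran were built with parallel support as described above (file and variable names are only examples):

program par_open
  use mpi
  use netcdf
  implicit none
  integer :: ierr, status, ncid, varid

  call MPI_Init(ierr)
  ! Passing comm and info turns this into a parallel open; with a parallel
  ! build it works on classic files (via PnetCDF underneath) and on
  ! netCDF-4/HDF5 files (via parallel HDF5).
  status = nf90_open('wrfout.nc', NF90_NOWRITE, ncid, &
                     comm=MPI_COMM_WORLD, info=MPI_INFO_NULL)
  status = nf90_inq_varid(ncid, 'U', varid)
  ! ... each rank would then read its own start/count slab of U ...
  status = nf90_close(ncid)
  call MPI_Finalize(ierr)
end program par_open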


janmandel commented Jul 16, 2020

From http://manpages.ubuntu.com/manpages/xenial/man3/pnetcdf_f90.3.html

function nf90mpi_put_var(ncid, varid, values, start, stride, imap)
          integer, intent(in) :: ncid, varid
          <<whatever>>, intent(in) :: values
          integer, dimension(:), optional, intent(in) :: start
          integer, dimension(:), optional, intent(in) ::  stride
          integer, dimension(:), optional, intent(in) ::  imap
          integer :: nf90mpi_put_var

          Writes a value or values to a netCDF variable.  The netCDF dataset must be open and
          in  data  mode.   values  contains  the value(s) that will be written to the netCDF
          variable identified by ncid and varid; it may be a scalar or an array and  must  be
          of     type    character,    integer(kind=OneByteInt),    integer(kind=TwoByteInt),
          integer(kind=FourByteInt), integer(kind=EightByteInt), real(kind=FourByteReal),  or
          real(kind=EightByteReal).   All  values  are  converted to the external type of the
          netCDF variable, if possible; otherwise, an nf90_erange  error  is  returned.   The
          optional  argument  start  specifies  the starting index in the netCDF variable for
          writing for each dimension of the netCDF variable.  The  optional  argument  stride
          specifies  the  sampling stride (the interval between accessed values in the netCDF
          variable)  for  each  dimension  of  the  netCDF  variable  (see  COMMON   ARGUMENT
          DESCRIPTIONS   below).    The   optional  argument  imap  specifies  the  in-memory
          arrangement of the data values (see COMMON ARGUMENT DESCRIPTIONS below).

   integer(kind=MPI_OFFSET) start
          specifies the starting point for accessing a netCDF variable's data values in terms
          of  the indicial coordinates of the corner of the array section.  The indices start
          at 1; thus, the first data value of a variable is (1, 1, ..., 1).  The size of  the
          vector  shall  be  at  least  the  rank  of  the associated netCDF variable and its
          elements shall correspond, in order, to the variable's dimensions.

   integer(kind=MPI_OFFSET) stride
          specifies the sampling interval along each dimension of the netCDF variable.    The
          elements  of  the  stride  vector  correspond,  in  order, to the netCDF variable's
          dimensions (stride(1)) gives the sampling interval along the most  rapidly  varying
          dimension  of  the  netCDF  variable).   Sampling  intervals are specified in type-
          independent units of elements (a value of 1 selects  consecutive  elements  of  the
          netCDF variable along the corresponding dimension, a value of 2 selects every other
          element, etc.).

   integer(kind=MPI_OFFSET) imap
          specifies the mapping between the dimensions of a netCDF variable and the in-memory
          structure  of  the  internal  data array.  The elements of the index mapping vector
          correspond, in order, to the netCDF variable's dimensions (imap gives the  distance
          between  elements  of  the internal array corresponding to the most rapidly varying
          dimension of the netCDF variable).  Distances between  elements  are  specified  in
          type-independent  units  of  elements  (the distance between internal elements that
          occupy adjacent memory locations is 1 and  not  the  element's  byte-length  as  in
          netCDF 2).
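
To make the argument kinds above concrete, here is a small self-contained sketch (file, dimension, and variable names are invented for illustration) in which each MPI rank writes one row of a 2-D variable with the collective nf90mpi_put_var_all, passing start and count as integer(kind=MPI_OFFSET_KIND) vectors. The independent-mode nf90mpi_put_var documented above takes the same start argument, but must be bracketed by nf90mpi_begin_indep_data / nf90mpi_end_indep_data.

program put_rows
  use mpi
  use pnetcdf
  implicit none
  integer, parameter :: nx = 4
  integer :: ierr, rank, nprocs, status, ncid, varid, xdim, ydim
  real :: row(nx, 1)
  integer(kind=MPI_OFFSET_KIND) :: gnx, gny, start(2), count(2)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  gnx = nx
  gny = nprocs
  status = nf90mpi_create(MPI_COMM_WORLD, 'rows.nc', &
                          ior(NF90_CLOBBER, NF90_64BIT_DATA), MPI_INFO_NULL, ncid)
  status = nf90mpi_def_dim(ncid, 'x', gnx, xdim)
  status = nf90mpi_def_dim(ncid, 'y', gny, ydim)
  status = nf90mpi_def_var(ncid, 'field', NF90_FLOAT, (/ xdim, ydim /), varid)
  status = nf90mpi_enddef(ncid)

  ! Indices start at 1 (see start above); rank r writes row r+1.
  ! start and count must be integer(kind=MPI_OFFSET_KIND).
  row = real(rank)
  start = (/ 1_MPI_OFFSET_KIND, int(rank + 1, MPI_OFFSET_KIND) /)
  count = (/ int(nx, MPI_OFFSET_KIND), 1_MPI_OFFSET_KIND /)

  ! Collective data-mode write: every rank in the communicator calls it.
  status = nf90mpi_put_var_all(ncid, varid, row, start, count)
  if (status /= NF90_NOERR) print *, trim(nf90mpi_strerror(status))

  status = nf90mpi_close(ncid)
  call MPI_Finalize(ierr)
end program put_rows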


janmandel commented Jul 16, 2020

Ready for testing.

To build

git checkout develop-18
module load pnetcdf
setenv PNETCDF /uufs/chpc.utah.edu/sys/installdir/pnetcdf/1.11.2i18
setenv NETCDF_classic 1
./configure -d # select option 15 and 1
vi configure.wrf # add to INCLUDE_MODULES the line -I$(PNETCDFPATH)/include

To run

run ./real.exe and ./wrf.exe (this first run of wrf.exe will produce an expected error)
ln -s wrfout_file_created wrf.nc
run wrf.exe using the qsub submission file
It then rewrites U, V, PH in frame 1 of wrf.nc at every timestep (see the sketch below).
Test build is in /uufs/chpc.utah.edu/common/home/kochanski-group4/jmandel/WRF-SFIRE-pnetcdf
Test run is in test/em_fire/hill
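
For orientation, the frame-1 rewrite looks conceptually like the sketch below. This is a hypothetical illustration, not the actual WRF-SFIRE routine; it assumes the usual ordering of U in the file (west_east_stag, south_north, bottom_top, Time), with each task writing its own patch. V and PH would follow the same pattern with their respective staggered dimensions.

subroutine rewrite_u_frame1(u_patch, ips, jps, kps, npx, npy, npz)
  ! Hypothetical sketch: write one task's patch of U into frame 1 of wrf.nc.
  use mpi
  use pnetcdf
  implicit none
  integer, intent(in) :: ips, jps, kps           ! patch start in global indices
  integer, intent(in) :: npx, npy, npz           ! patch extents
  real, intent(in) :: u_patch(npx, npy, npz, 1)  ! single time level
  integer :: status, ncid, varid
  integer(kind=MPI_OFFSET_KIND) :: start(4), count(4)

  status = nf90mpi_open(MPI_COMM_WORLD, 'wrf.nc', NF90_WRITE, MPI_INFO_NULL, ncid)
  status = nf90mpi_inq_varid(ncid, 'U', varid)

  ! Time (the last, unlimited dimension) is pinned to frame 1.
  start = (/ int(ips, MPI_OFFSET_KIND), int(jps, MPI_OFFSET_KIND), &
             int(kps, MPI_OFFSET_KIND), 1_MPI_OFFSET_KIND /)
  count = (/ int(npx, MPI_OFFSET_KIND), int(npy, MPI_OFFSET_KIND), &
             int(npz, MPI_OFFSET_KIND), 1_MPI_OFFSET_KIND /)

  status = nf90mpi_put_var_all(ncid, varid, u_patch, start, count)
  if (status /= NF90_NOERR) print *, 'U write: ', trim(nf90mpi_strerror(status))
  status = nf90mpi_close(ncid)
end subroutine rewrite_u_frame1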

Testing needed

  • are U, V, PH reasonable? no jumps, and changing every time step
  • if wrfout is written in the same timestep, the values should be close to (identical with) u_2, v_2, ph_2
  • can CUDA QES-Winds run from this file?

janmandel added a commit that referenced this issue Feb 1, 2021

janmandel commented Feb 1, 2021

On Cheyenne (from Angel, for reference):

module unload gdal
module load netcdf
module load pnetcdf
setenv NETCDF_classic 1

or in sh:
export NETCDF_classic=1

janmandel added a commit that referenced this issue Jun 3, 2021
janmandel changed the title from "Couple with QES-Winds" to "Use PNETCDF to couple with QES-Winds" on Jun 3, 2021
janmandel added a commit that referenced this issue Jun 3, 2021
Conflicts:
	phys/module_fr_sfire_driver.F

Clean merge, different lines added. #36 #18
janmandel added a commit that referenced this issue Jun 3, 2021
…cdf code #18 #36

NOTE: does not support OpenMP; operates on the whole patch inside the parallel do master loop
janmandel added a commit that referenced this issue Jun 23, 2021
janmandel added a commit that referenced this issue Jul 2, 2021
commenting out scalar pnetcdf write that does not work
janmandel added a commit that referenced this issue Jul 23, 2021
also fixing end subroutine pnetcdf_read_int
janmandel added a commit that referenced this issue Aug 5, 2021

janmandel commented Sep 4, 2021

Vertically staggered variables are "half level"; the top is decreased by one. We need to make the function get_chsum and its call for fmw compatible with that. See 75c25c7
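
Illustratively (the real get_chsum and its call for fmw live in the WRF-SFIRE source, see 75c25c7; this sketch only shows the index adjustment), a patch checksum that follows the rule above, dropping the top vertical index by one for half-level fields, could look like:

real function patch_checksum(a, ids, ide, kds, kde, jds, jde, half_level)
  ! Illustrative only, not the WRF-SFIRE get_chsum.  The array is assumed
  ! dimensioned with the full (staggered) vertical extent, as in WRF memory
  ! dimensions, and stored in WRF (i,k,j) order.
  implicit none
  integer, intent(in) :: ids, ide, kds, kde, jds, jde
  real, intent(in) :: a(ids:ide, kds:kde, jds:jde)
  logical, intent(in) :: half_level   ! per the note above
  integer :: kte

  ! Half-level fields use one fewer vertical index at the top.
  kte = kde
  if (half_level) kte = kde - 1

  patch_checksum = sum(a(ids:ide, kds:kte, jds:jde))
end function patch_checksum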

janmandel added a commit that referenced this issue Sep 5, 2021
janmandel added a commit that referenced this issue Sep 5, 2021
…t.exe from openwfm/wrf-fire-matlab at 74cb95a69fb88579da78bb57ef60df403007838f

WRF-SFIRE #18  openwfm/wrf-fire-matlab#4
janmandel added a commit to openwfm/wrf-fire-matlab that referenced this issue Sep 6, 2021
moved function read_initial_wind to new module_wrfout.f90
openwfm/WRF-SFIRE#18  UtahEFD/QES-Winds#2 #4
janmandel added a commit to openwfm/wrf-fire-matlab that referenced this issue Feb 17, 2022
@willemsn

Got the ping-pong working again. Code to do this is in the pw-wrfqes-wbmerge branch of qesWinds.

This should test the ping-pong'ing a bit. I need to fix some initialization issues in qes for setting up the correct time series, and we should be in better shape.

janmandel added a commit to openwfm/wrf-fire-matlab that referenced this issue Feb 25, 2022
@janmandel

@willemsn
I am trying the recipe above from Jan 14, but cmake first did not like the - at the end of a line and then could not find pnetcdf. module load pnetcdf is not found, and module load pnetcdf/1.11.2 requires intel and impi, which disable gcc.


willemsn commented Mar 15, 2022

Hi Jan,

QES does not use PNETCDF; our build will not need PNETCDF. For WRF-SFIRE though, I build it with this module list:

#!/bin/sh
module load geotiff/1.4.0
module load intel/2018.1.163
module load hdf5/1.8.19
module load netcdf-c/4.4.1.1
module load impi/2018.1.163
module load netcdf-f/4.4.4
module load pnetcdf
setenv PNETCDF /uufs/chpc.utah.edu/sys/installdir/pnetcdf/1.11.2i18
setenv NETCDF_classic 1
setenv NETCDF /uufs/chpc.utah.edu/sys/installdir/netcdf-f/4.4.4i18-c7

With QES, I use this module list:

#!/bin/sh
module load cuda/10.2
module load gcc/8.1.0
module load cmake/3.15.3
module load gdal/3.0.1
module load boost/1.69.0

janmandel added a commit to openwfm/wrf-fire-matlab that referenced this issue Mar 16, 2022
janmandel added a commit to openwfm/wrf-fire-matlab that referenced this issue Mar 16, 2022
wait for the first frame times out when compiling with -O3; all is good with -g -C but slow
openwfm/WRF-SFIRE#18 #4
janmandel added a commit to openwfm/wrf-fire-matlab that referenced this issue Mar 17, 2022
janmandel added a commit to openwfm/wrf-fire-matlab that referenced this issue Mar 23, 2022
janmandel added a commit that referenced this issue May 4, 2022
janmandel added a commit that referenced this issue May 6, 2022
@janmandel

Works fine on CHPC with the module parallel-netcdf, which points to a build of pnetcdf with the Intel oneAPI compiler.

@willemsn

I've got it compiling and running. It doesn't seem to be "syncing" and waiting for QES. Is there something that needs to be turned on that I forgot about?

janmandel added a commit that referenced this issue Aug 7, 2022
1) ncarenv/1.3   2) intel/19.1.1   3) ncarcompilers/0.5.0   4) mpt/2.25   5) netcdf/4.8.1   6) pnetcdf/1.12.3
#18 openwfm/wrf-fire-matlab#4 7e837847cba1c35771cf0b15f6a5f05c0e326531
janmandel added a commit that referenced this issue Aug 18, 2022
modules loaded: openmpi/4.1.1  netcdf-fortran/4.5.3
intel-oneapi-compilers/2021.4.0 netcdf-c/4.8.1 parallel-netcdf/1.12.2
#18 UtahEFD/QES-Winds#2 openwfm/wrf-fire-matlab#4

janmandel commented Aug 19, 2022

On CHPC use:

 module purge
 module load intel-oneapi-compilers/2021.4.0 openmpi/4.1.1
 module load netcdf-c/4.8.1  netcdf-fortran/4.5.3
 module load parallel-netcdf/1.12.2
 setenv PNETCDF $PARALLEL_NETCDF_ROOT
 setenv NETCDF_classic 1

and setenv NETCDF to a directory whose include and lib subdirectories are populated by soft links to the files in the same subdirectories of both $NETCDF_C_ROOT and $NETCDF_FORTRAN_ROOT.

This can be done by

 source  /uufs/chpc.utah.edu/common/home/u6015690/lib/intel-2021.4.0.tcsh

before the ./compile and in the slurm script.

Note: parallel-netcdf/1.12.2 seems to be bound to openmpi/4.1.1 built using the intel-oneapi-compilers/2021.4.0 compiler. Other combinations of dependencies listed by module spider parallel-netcdf/1.12.2 may link but will crash at runtime.

Also, parallel-netcdf will crash when opening a file that is a soft link to a file in /scratch/general/lustre. The code works fine when everything is done on lustre.

Fergui added the duplicate label on Jun 25, 2023

Fergui commented Jun 25, 2023

Continued in #5

Fergui closed this as completed on Jun 25, 2023
Aurel31 added a commit that referenced this issue Jul 31, 2024
New feature: Add Balbi ros model, crown fire model, multilayer heat flux scheme, and StarFIRE submodule