gfs_bufr fails to find a NetCDF library on WCOSS #53

Closed
WalterKolczynski-NOAA opened this issue Mar 13, 2024 · 2 comments · Fixed by #55

@WalterKolczynski-NOAA
Contributor

gfs_bufr is now failing on WCOSS at execution time due to failure to find one of the NetCDF libraries:

+ gfs_bufr.sh[97]: mpiexec -l -n 40 --depth=8 --cpu-bind depth /lfs/h2/emc/global/save/walter.kolczynski/global-workflow/fix_gempak/exec/gfs_bufr.x
nid001058.cactus.wcoss2.ncep.noaa.gov 0: /lfs/h2/emc/global/save/walter.kolczynski/global-workflow/fix_gempak/exec/gfs_bufr.x: error while loading shared libraries: libnetcdf.so.7: cannot open shared object file: No such file or directory

ldd shows the problem:

+ gfs_bufr.sh[96]: ldd /lfs/h2/emc/global/save/walter.kolczynski/global-workflow/fix_gempak/exec/gfs_bufr.x
    linux-vdso.so.1 (0x00007ffe957a5000)
    libnetcdff.so.7 => /apps/prod/hpc-stack/intel-19.1.3.304/cray-mpich-8.1.4/netcdf/4.7.4/lib/libnetcdff.so.7 (0x0000154163833000)
    libnetcdf.so.18 => /apps/prod/hpc-stack/intel-19.1.3.304/cray-mpich-8.1.4/netcdf/4.7.4/lib/libnetcdf.so.18 (0x00001541634e5000)
    libnetcdf.so.7 => not found
    libiomp5.so => /pe/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libiomp5.so (0x00001541630c3000)
    libmpifort_intel.so.12 => /opt/cray/pe/lib64/libmpifort_intel.so.12 (0x0000154162e24000)
    libimf.so => /pe/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libimf.so (0x00001541627a1000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x000015416277d000)
    libm.so.6 => /lib64/libm.so.6 (0x0000154162630000)
    libdl.so.2 => /lib64/libdl.so.2 (0x000015416262b000)
    libc.so.6 => /lib64/libc.so.6 (0x0000154162436000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000154162217000)
    libhdf5_hl.so.100 => /apps/prod/hpc-stack/intel-19.1.3.304/cray-mpich-8.1.4/hdf5/1.10.6/lib/libhdf5_hl.so.100 (0x0000154161fee000)
    libhdf5.so.103 => /apps/prod/hpc-stack/intel-19.1.3.304/cray-mpich-8.1.4/hdf5/1.10.6/lib/libhdf5.so.103 (0x0000154161902000)
    libifport.so.5 => /pe/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libifport.so.5 (0x00001541616d2000)
    libifcoremt.so.5 => /pe/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libifcoremt.so.5 (0x0000154161534000)
    libsvml.so => /pe/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libsvml.so (0x000015415f9ea000)
    libintlc.so.5 => /pe/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x000015415f772000)
    libmpi_intel.so.12 => /opt/cray/pe/lib64/libmpi_intel.so.12 (0x000015415cb54000)
    libirng.so => /pe/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libirng.so (0x000015415c7e9000)
    /lib64/ld-linux-x86-64.so.2 (0x0000154163cc6000)
    libifcore.so.5 => /pe/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libifcore.so.5 (0x000015415c681000)
    libfabric.so.1 => /opt/cray/libfabric/1.11.0.0./lib64/libfabric.so.1 (0x000015415c3d6000)
    libatomic.so.1 => /usr/lib64/libatomic.so.1 (0x000015415c1cd000)
    librt.so.1 => /lib64/librt.so.1 (0x000015415c1c3000)
    libpmi.so.0 => /opt/cray/pe/lib64/libpmi.so.0 (0x000015415bfc1000)
    libpmi2.so.0 => /opt/cray/pe/lib64/libpmi2.so.0 (0x000015415bd89000)
    librdmacm.so.1 => /usr/lib64/librdmacm.so.1 (0x000015415bb69000)  
    libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x000015415b949000)
    libpals.so.0 => /opt/cray/pe/lib64/libpals.so.0 (0x000015415b744000)
    libnl-3.so.200 => /usr/lib64/libnl-3.so.200 (0x000015415b522000)
    libnl-route-3.so.200 => /usr/lib64/libnl-route-3.so.200 (0x000015415b2ac000)

libnetcdf.so.7 is not present in the NetCDF library directory, which only provides libnetcdf.so.18:

WCOSS2 (BACKUPSYS) sorc> ls /apps/prod/hpc-stack/intel-19.1.3.304/cray-mpich-8.1.4/netcdf/4.7.4/lib/ -l
total 6.4M
-rwxr-xr-x 1 hpc-adm hpc-adm  1.5K Oct 17  2021 libh5bzip2.la
-rwxr-xr-x 1 hpc-adm hpc-adm   96K Oct 17  2021 libh5bzip2.so
-rw-r--r-- 1 hpc-adm hpc-adm  1.8M Oct 17  2021 libnetcdf.a
-rw-r--r-- 1 hpc-adm hpc-adm  838K Oct 17  2021 libnetcdf_c++4.a
-rwxr-xr-x 1 hpc-adm hpc-adm  1.5K Oct 17  2021 libnetcdf_c++4.la
lrwxrwxrwx 1 hpc-adm hpc-adm    23 Oct 17  2021 libnetcdf_c++4.so -> libnetcdf_c++4.so.1.1.0
lrwxrwxrwx 1 hpc-adm hpc-adm    23 Oct 17  2021 libnetcdf_c++4.so.1 -> libnetcdf_c++4.so.1.1.0
-rwxr-xr-x 1 hpc-adm hpc-adm  466K Oct 17  2021 libnetcdf_c++4.so.1.1.0
-rw-r--r-- 1 hpc-adm hpc-adm 1019K Oct 17  2021 libnetcdff.a
-rwxr-xr-x 1 hpc-adm hpc-adm  1.5K Oct 17  2021 libnetcdff.la
-rw-r--r-- 1 hpc-adm hpc-adm  1.4K Oct 17  2021 libnetcdff.settings
lrwxrwxrwx 1 hpc-adm hpc-adm    19 Oct 17  2021 libnetcdff.so -> libnetcdff.so.7.0.0
lrwxrwxrwx 1 hpc-adm hpc-adm    19 Oct 17  2021 libnetcdff.so.7 -> libnetcdff.so.7.0.0
-rwxr-xr-x 1 hpc-adm hpc-adm  823K Oct 17  2021 libnetcdff.so.7.0.0
-rwxr-xr-x 1 hpc-adm hpc-adm  1.3K Oct 17  2021 libnetcdf.la
-rw-r--r-- 1 hpc-adm hpc-adm  1.4K Oct 17  2021 libnetcdf.settings
lrwxrwxrwx 1 hpc-adm hpc-adm    19 Oct 17  2021 libnetcdf.so -> libnetcdf.so.18.0.0
lrwxrwxrwx 1 hpc-adm hpc-adm    19 Oct 17  2021 libnetcdf.so.18 -> libnetcdf.so.18.0.0
-rwxr-xr-x 1 hpc-adm hpc-adm  1.4M Oct 17  2021 libnetcdf.so.18.0.0
drwxr-xr-x 2 hpc-adm hpc-adm  4.0K Oct 17  2021 pkgconfig
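The names `libnetcdff.so.7` (Fortran) and `libnetcdf.so.7` (C) differ by a single character, so it helps to check the listing mechanically. A small sketch, assuming the file names have been collected into a list (the helper `major_versions` is hypothetical; the names are taken from the listing above):

```python
import re


def major_versions(filenames, stem):
    """Return the sorted major versions N for which '<stem>.so.N' exists."""
    pattern = re.compile(r"^" + re.escape(stem) + r"\.so\.(\d+)$")
    versions = set()
    for name in filenames:
        match = pattern.match(name)
        if match:
            versions.add(int(match.group(1)))
    return sorted(versions)


# Shared-object entries from the directory listing in this issue.
listing = [
    "libnetcdf_c++4.so", "libnetcdf_c++4.so.1", "libnetcdf_c++4.so.1.1.0",
    "libnetcdff.so", "libnetcdff.so.7", "libnetcdff.so.7.0.0",
    "libnetcdf.so", "libnetcdf.so.18", "libnetcdf.so.18.0.0",
]

print(major_versions(listing, "libnetcdf"))   # [18]
print(major_versions(listing, "libnetcdff"))  # [7]
```

On a live system the list could come from `os.listdir()` on the library directory. Here the only major version 7 library is the Fortran one; the C library is present only as version 18.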

Not sure if this is related to the recent PRs (though it seems likely) or is just a coincidence.

WalterKolczynski-NOAA added the bug label Mar 13, 2024
WalterKolczynski-NOAA changed the title from "gfs_bufr fails to find NetCDF library on WCOSS" to "gfs_bufr fails to find a NetCDF library on WCOSS" Mar 13, 2024
@WalterKolczynski-NOAA
Contributor Author

Isolated to PR #50

@WalterKolczynski-NOAA
Contributor Author

Isolated to the gempak module

WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/gfs-utils that referenced this issue Mar 18, 2024
Turns off the loading of the gempak module on all machines, disabling
rdbfmsua. The gempak module creates conflicts with the bufr executable.
An alternative solution will need to be found to build both. Issue NOAA-EMC#54
has been opened to reenable rdbfmsua.

Resolves NOAA-EMC#53
Refs NOAA-EMC#54
This was referenced Mar 18, 2024
aerorahul pushed a commit that referenced this issue Mar 18, 2024
Turns off the loading of the gempak module on all machines, disabling
rdbfmsua. The gempak module creates conflicts with the bufr executable.
An alternative solution will need to be found to build both. Issue #54
has been opened to reenable rdbfmsua.

Resolves #53
Refs #54