Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add settings for compy, for standalone homme and coupled #3117

Merged
merged 9 commits into from
Oct 30, 2019

Conversation

oksanaguba
Copy link
Contributor

@oksanaguba oksanaguba commented Aug 9, 2019

Adding cmake cache files for compy intel and pgi. Fixes for cxx building and for a linker for building with cime.

Fixes #3225 .

[bfb] for the machines we run on.

Oksana Guba added 2 commits August 9, 2019 11:48
… could not read homme nl file from stream, intel18 build is ok.
@oksanaguba oksanaguba self-assigned this Aug 9, 2019
@oksanaguba oksanaguba added HOMME standalone issues with the standalone HOMME code that dont impact E3SM BFB PR leaves answers BFB labels Aug 9, 2019
@oksanaguba oksanaguba changed the title Add homme copy cmake file Add homme compy cmake file Aug 9, 2019
…b on 24 skybridge ranks (75 sec) for intel18. i assume no more binding options are needed.
@oksanaguba oksanaguba requested a review from mt5555 August 15, 2019 15:44
@mt5555
Copy link
Contributor

mt5555 commented Aug 15, 2019

why does intel18 need its own machine file? Why cant the intel19 machine file work with intel18 if the user loads intel18 and the appropriate netcdf. I think we should get the netcdf path from the environment (like the intel19 machine file) instead of hardcoding it (like in the intel18 machine file)

My only hard request would be to remove the ADD_LINKER_FLAGS. Its cmake's find_netcdf's job to set that variable.

@oksanaguba
Copy link
Contributor Author

By some reason NETCDF_ROOT is not set in environment for intel18. About linker flags -- I do not know exactly how it works in homme or cmake, but libs were not listed in link.txt , so, I added a hack. What is a proper way?

@oksanaguba
Copy link
Contributor Author

For linker flags, there are other files that have it:

[guba538@compy-e machineFiles]$ grep ADD_LINKER *
bebop.cmake:  SET (ADD_LINKER_FLAGS " ${NFCONFIG_OUTPUT} " CACHE STRING "")
blues.cmake:  SET (ADD_LINKER_FLAGS " ${NFCONFIG_OUTPUT} " CACHE STRING "")
cori-haswell.cmake:  SET (ADD_LINKER_FLAGS " ${NFCONFIG_OUTPUT} " CACHE STRING "")
cori-knl.cmake:#  SET (ADD_LINKER_FLAGS " ${NFCONFIG_OUTPUT} " CACHE STRING "")
theta.cmake:  SET (ADD_LINKER_FLAGS " ${NFCONFIG_OUTPUT} " CACHE STRING "")

@mt5555
Copy link
Contributor

mt5555 commented Aug 16, 2019

some notes on what I think is the best cmake style approach:

anvil: set NetCDF_Fortran_PATH and NetCDF_C_PATH based on output of nf-config and nc-config (needed when fortran and c libraries are in different locations)

cori-knl: set NETCDF_DIR based on environment variables

If a module sets an environment variable, that results in simpler cmake code. But if the modules do not set an appropriate variable, then we assume they set the path and we use "nc-config --prefix". (or, if fortran and c libraries are in different locations, nc-config and nf-config must be used). If a module sets neither environment or path, then more work will be needed to figure out how the module expects the build system to find the library.

theta, bebop, blues: shouldn't need to use ADD_LINKER_FLAGS - something is wrong

@oksanaguba oksanaguba changed the title Add homme compy cmake file [wip] Add homme compy cmake file Sep 1, 2019
@oksanaguba
Copy link
Contributor Author

oksanaguba commented Sep 3, 2019

With heavy input from @ambrad :

It seems that compy netcdff library is built without its C netcdf dependence.
First, there is this code in components/homme/utils/cime/src/externals/pio2/cmake/FindNetCDF.cmake

            elseif (NCDFcomp STREQUAL Fortran AND NOT NetCDF_Fortran_IS_SHARED)

                # DEPENDENCY: NetCDF
                set (orig_comp ${NCDFcomp})
                set (orig_comps ${NetCDF_FIND_VALID_COMPONENTS})
                find_package (NetCDF COMPONENTS C)
                set (NetCDF_FIND_VALID_COMPONENTS ${orig_comps})
                set (NCDFcomp ${orig_comp})
                if (NetCDF_C_FOUND)
                    list (APPEND NetCDF_Fortran_INCLUDE_DIRS ${NetCDF_C_INCLUDE_DIRS})
                    list (APPEND NetCDF_Fortran_LIBRARIES ${NetCDF_C_LIBRARIES})
                endif ()

            endif ()

which says to append C netcdf libs if netcdf F library is not shared. On compy, the library is shared,
#compy
/share/apps/netcdf/4.6.3/intel/18.0.0/lib/libnetcdff.so
and so is on anvil
#anvil
/blues/gpfs/home/software/spack-0.9.1/opt/spack/linux-centos6-x86_64/intel-17.0.0/netcdf-4.4.1-gpk22cidfgknxbc6wjuimdkqifhfhg2j/lib/libnetcdff.so

but on anvil its library is built with netcdf C dependency:

ldd /blues/gpfs/home/software/spack-0.9.1/opt/spack/linux-centos6-x86_64/intel-17.0.0/netcdf-4.4.1-gpk22cidfgknxbc6wjuimdkqifhfhg2j/lib/libnetcdff.so | grep netcdf
	libnetcdf.so.11 => /blues/gpfs/home/software/spack-0.9.1/opt/spack/linux-centos6-x86_64/intel-17.0.0/netcdf-4.4.1-gpk22cidfgknxbc6wjuimdkqifhfhg2j/lib/libnetcdf.so.11 (0x00002b12305b2000)

but on compy it is not:

ldd /share/apps/netcdf/4.6.3/intel/18.0.0/lib/libnetcdff.so | grep netcdf 

@PeterCaldwell
Copy link
Contributor

Pinging @AaronDonahue , who was lamenting C-less netcdf libraries the other day as well.

@oksanaguba
Copy link
Contributor Author

oksanaguba commented Sep 3, 2019

Adding comment on why using pio2 cmake file above:
When cmake was run with --trace, output contained

...
/qfs/people/guba538/acme-master/components/homme/utils/cime/src/externals/pio1/pio/CMakeLists.txt(33):  FIND_PACKAGE(NetCDF 4.3.3 COMPONENTS C Fortran )
/qfs/people/guba538/acme-master/components/homme/utils/cime/src/externals/pio2/cmake/FindNetCDF.cmake(18):  include(LibFind )
/qfs/people/guba538/acme-master/components/homme/utils/cime/src/externals/pio2/cmake/FindNetCDF.cmake(19):  include(LibCheck )
/qfs/people/guba538/acme-master/components/homme/utils/cime/src/externals/pio2/cmake/FindNetCDF.cmake(22):  define_package_component(NetCDF DEFAULT COMPONENT C INCLUDE_NAMES netcdf.h LIBRARY_NAMES netcdf )
...

Also, in homme cmake list:
SET (CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/utils/cime/src/externals/pio2/cmake" ${CMAKE_MODULE_PATH})

@oksanaguba
Copy link
Contributor Author

The same issue is with intel19 version:

[guba538@compy-e lib]$ ldd libnetcdff.so | grep netcdf
[guba538@compy-e lib]$ pwd
/share/apps/netcdf/4.6.3/intel/19.0.3/lib
[guba538@compy-e lib]$ ldd libnetcdff.so | grep netcdf
[guba538@compy-e lib]$ 

Oksana Guba added 2 commits October 8, 2019 14:09
…me: added a compy-pgi cache file, removed add_folders from homme cmake build, added stdc++ lib for cime and switched linker there to F linker. eam_theta suite ran after that, not sure what to do about standalone suite which looks for compy.cmake (while there are compy-intel and compy-pgi)
@oksanaguba
Copy link
Contributor Author

I had to merge master into this branch to have most recent cime for compy.

The last commit was done with heavy input from @ambrad .

With PGI standalone homme produces a bunch of warnings in cmake

  The dependency target "blas" of target "theta-l-nlev30" does not exist.
Call Stack (most recent call first):
  test_execs/theta-l-nlev30/CMakeLists.txt:6 (createTestExec)
This warning is for project developers.  Use -Wno-dev to suppress it.

Homme has two switches, HOMME_USE_MKL and HOMME_FIND_BLASLAPACK . If neither is set to ON, then cmake tries to use system blas/lapack. At the moment, system's blas/lapack are not supported on a compute node. When homme is built on login node, there are libs in /usr/lib64/ , but on allocated node (ssh to it, sallow is not enough), the folder does not have blas lib.

I am leaving warnings alone because it is likely that compy configuration will change, instead of introducing more changes into homme cmake.

Homme standalone tests are running now.

@oksanaguba
Copy link
Contributor Author

Not true that eam_theta suite passed. 3 tests failed: thetahy_sl, both ERS tests. For SL and ERS with thetahy, both fails are in init, but there is nbo meaningful message, log file is cut somewhere in hybrid coordinates output. For ERS with thetanh, restart and no-restart files do not pass diffs in 189 fields.

@oksanaguba
Copy link
Contributor Author

For the new intel file, 'make baseline' was successful.

1)For intel, 'make baseline' runs many times faster that for phi.
2)For standalone homme and pgi, crash is in SL after

component   6 length=20800 displacement= 113664
component   7 length= 1248 displacement= 280064
 creating sorted ghost cell neigbor map...
 checking ghost cell neighbor buffer sorting...
 passed.
...
SEGFAULT...

Traceback is on but there is no traceback info in output.

I suggest we track failing tests on compy/pgi with a separate PR.

@oksanaguba oksanaguba changed the title [wip] Add homme compy cmake file Add settings for compy, for standalone homme and coupled Oct 9, 2019
@oksanaguba
Copy link
Contributor Author

oksanaguba commented Oct 9, 2019

This is almost ready -- I need to figure out compy-intel, compy-pgi for jenkins runs. Otherwise, @mt5555 would you please review? Not sure how urgent it is.

@rljacob
Copy link
Member

rljacob commented Oct 14, 2019

@mt5555 please re-review. We need this merged to next so we can get compy pgi tests working.

@oksanaguba
Copy link
Contributor Author

I renamed compy-pgi.cmake to compy.cmake assuming nightlies will only be run with pgi.
One last issue here is that pgi is much slower that intel, not sure whether to spend time on this.

@rljacob rljacob added the HOMME label Oct 22, 2019
oksanaguba added a commit that referenced this pull request Oct 30, 2019
…3117)

Adding cmake cache files for compy intel and pgi.
Fixes for cxx building and for a linker for building with cime.

Fixes #3225 .

[bfb] for the machines we run on.
@oksanaguba
Copy link
Contributor Author

merged to next

@mt5555
Copy link
Contributor

mt5555 commented Oct 30, 2019

One thing I just noticed: this PR makes the default (compy.cmake = compy-pgi.cmake) for HOMME PGI. But I thought for compy we recently switched back to intel as the default?

@oksanaguba
Copy link
Contributor Author

Nightlies run with pgi (today's cdash) and cime picks ${machine}.cmake or something like that as cache file. So, I renamed the file to make it work with nightlies.

oksanaguba added a commit that referenced this pull request Oct 30, 2019
#3117)

Adding cmake cache files for compy intel and pgi.
Fixes for cxx building and for a linker for building with cime.

Fixes #3225 .

[bfb] for the machines we run on.
@oksanaguba oksanaguba merged commit 4dd7a25 into master Oct 30, 2019
@oksanaguba oksanaguba deleted the oksanaguba/homme/compy-cmake-file branch October 30, 2019 20:02
rljacob pushed a commit that referenced this pull request Apr 21, 2021
Maint 5.6 merge
maint-5.6 branch with merge conflicts resolved

Test suite: scripts_regression_tests.py
Test baseline:
Test namelist changes:
Test status: [bit for bit, roundoff, climate changing]

Fixes

User interface changes?:

Update gh-pages html (Y/N)?:

Code review:
rljacob pushed a commit that referenced this pull request May 6, 2021
Maint 5.6 merge
maint-5.6 branch with merge conflicts resolved

Test suite: scripts_regression_tests.py
Test baseline:
Test namelist changes:
Test status: [bit for bit, roundoff, climate changing]

Fixes

User interface changes?:

Update gh-pages html (Y/N)?:

Code review:
jgfouca pushed a commit that referenced this pull request Nov 21, 2024
…process-pr

Fetch changed files from PR in chunks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BFB PR leaves answers BFB HOMME standalone issues with the standalone HOMME code that dont impact E3SM HOMME
Projects
None yet
Development

Successfully merging this pull request may close these issues.

This print statement needs to be removed.
4 participants