CMake failing for gnugpu on Perlmutter-GPU #56

Closed
xylar opened this issue Jan 29, 2024 · 4 comments
Labels
bug (Something isn't working), CMake (CMake-related issues)

Comments

xylar commented Jan 29, 2024

When I run:

module load cmake

mkdir -p build_omega/build_pm-gpu_gnugpu
cd build_omega/build_pm-gpu_gnugpu

export METIS_ROOT=/pscratch/sd/x/xylar/spack_pm-gpu_test//dev_polaris_0_3_0_gnugpu_mpich/var/spack/environments/dev_polaris_0_3_0_gnugpu_mpich/.spack-env/view
export PARMETIS_ROOT=/pscratch/sd/x/xylar/spack_pm-gpu_test//dev_polaris_0_3_0_gnugpu_mpich/var/spack/environments/dev_polaris_0_3_0_gnugpu_mpich/.spack-env/view

cmake \
   -DOMEGA_BUILD_TYPE=Release \
   -DOMEGA_CIME_COMPILER=gnugpu \
   -DOMEGA_CIME_MACHINE=pm-gpu \
   -DOMEGA_METIS_ROOT=${METIS_ROOT} \
   -DOMEGA_PARMETIS_ROOT=${PARMETIS_ROOT} \
   -DOMEGA_BUILD_TEST=ON \
   -S /global/u2/x/xylar/e3sm_work/polaris/add-omega-ctest-util/e3sm_submodules/Omega/components/omega/ \
   -B . 

I'm seeing:

...
-- Configuring done
CMake Error: Error required internal CMake variable not set, cmake may not be built correctly.
Missing variable is:
_CMAKE_CUDA_WHOLE_FLAG
CMake Error: Error required internal CMake variable not set, cmake may not be built correctly.
Missing variable is:
CMAKE_CUDA_COMPILE_OBJECT
-- Generating done
CMake Generate step failed.  Build files cannot be regenerated correctly.

xylar added the bug and CMake labels on Jan 29, 2024

xylar commented Jul 10, 2024

I'm now able to build CTests but I'm seeing:

no CUDA-capable device is detected

over and over, specifically:

0: terminate called after throwing an instance of 'std::runtime_error'
0:   what():  cudaGetDeviceCount(&m_cudaDevCount) error( cudaErrorNoDevice): no CUDA-capable device is detected /global/u2/x/xylar/e3sm_work/polaris/main/e3sm_submodules/omega/develop/externals/ekat/extern/kokkos/core/src/Cuda/Kokkos_Cuda_Instance.cpp:275

xylar commented Jul 10, 2024

I'm re-testing this now (with --gpus=4).

xylar commented Jul 10, 2024

Yep, with --gpus=4, all the tests pass for me now!
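
The `cudaErrorNoDevice` failure above is what Kokkos reports when `cudaGetDeviceCount` finds no device, which is consistent with running the tests in a job step that has no GPUs allocated. A minimal sketch of a pre-flight check follows. It assumes `--gpus=4` is being passed to Slurm (`salloc` or `srun`) and that `nvidia-smi` is on the PATH of the compute node; the function name `count_visible_gpus` is hypothetical and not part of Omega or Polaris.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the GPUs visible to the current shell or job step.
# Assumes nvidia-smi is available; a missing tool is treated as zero GPUs.
count_visible_gpus() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # "nvidia-smi -L" prints one line per visible device, each starting with "GPU ".
        nvidia-smi -L 2>/dev/null | grep -c '^GPU '
    else
        echo 0
    fi
}

ngpus=$(count_visible_gpus)
if [ "$ngpus" -eq 0 ]; then
    # Without an allocated GPU, the Kokkos CUDA backend will fail at initialization.
    echo "No GPUs visible; request them from Slurm (for example with --gpus=4) before running ctest." >&2
else
    echo "Found $ngpus GPU(s); running the tests should be possible."
fi
```

Running this inside the allocation before `ctest` should make a missing GPU request obvious without wading through repeated `std::runtime_error` output.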

xylar commented Jul 11, 2024

I think this is fixed, so closing.

xylar closed this as completed on Jul 11, 2024