coral: update test command for CUDA-enabled hwloc
Problem: Right now the documentation instructs the user to run `lstopo |
grep -i coproc`, which is an OK test, but it is possible:

1. for the right binary to be in the user's PATH but the wrong library
to be found by ld
2. for the wrong binary to be in the user's PATH but the right library to be
found by ld.

The latter is the case on Lassen where the module file only updates the
LD_LIBRARY_PATH.

Solution: Have the user test the intended behavior directly with
`flux start flux resource list`.

Closes flux-framework#71
SteVwonder committed Jan 6, 2021
1 parent 4f41727 commit 9041684
Showing 1 changed file with 12 additions and 7 deletions.
19 changes: 12 additions & 7 deletions coral.rst
@@ -88,17 +88,22 @@ On all systems, Flux relies on hwloc to auto-detect the on-node resources
 available for scheduling. The hwloc that Flux is linked against must be
 configured with ``--enable-cuda`` for Flux to be able to detect Nvidia GPUs.
 
-You can test to see if your system default hwloc is CUDA-enabled with:
+If running on an LLNL CORAL system, you can load a CUDA-enabled hwloc with:
 
 .. code-block:: sh
 
-  lstopo | grep CoProc
-
-If no output is produced, then your hwloc is not CUDA-enabled.
+  module load hwloc/1.11.10-cuda
 
-If running on an LLNL CORAL system, you can load a CUDA-enabled hwloc with:
+You can test to see if the hwloc that Flux is linked against is CUDA-enabled by
+running:
 
-.. code-block:: sh
+.. code-block:: terminal
 
-  module load hwloc/1.11.10-cuda
+  $ flux start flux resource list
+       STATE NNODES   NCORES    NGPUS
+        free      1       40        4
+   allocated      0        0        0
+        down      0        0        0
 
+If the number of free GPUs is 0, then the hwloc that Flux is linked against is
+not CUDA-enabled.
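
For scripted checks, the "free GPUs is 0" test can be automated. The following is a minimal sketch rather than part of this commit: the helper name ``free_gpu_count`` is hypothetical, and it assumes the listing has exactly the whitespace-separated ``STATE NNODES NCORES NGPUS`` layout shown in the sample output. Other Flux versions may print different columns, in which case the parsing would need adjusting.

```python
def free_gpu_count(listing: str) -> int:
    """Return the NGPUS value on the 'free' row of a resource listing.

    Assumes the whitespace-separated table layout shown in the sample
    output (STATE NNODES NCORES NGPUS); this layout is an assumption.
    """
    # Split every non-empty line into whitespace-separated cells.
    table = [line.split() for line in listing.splitlines() if line.strip()]
    header, rows = table[0], table[1:]
    state_col = header.index("STATE")
    gpu_col = header.index("NGPUS")
    for row in rows:
        if row[state_col] == "free":
            return int(row[gpu_col])
    raise ValueError("no 'free' row found in listing")


# Sample text copied from the documented output above.
SAMPLE = """\
     STATE NNODES   NCORES    NGPUS
      free      1       40        4
 allocated      0        0        0
      down      0        0        0
"""

if free_gpu_count(SAMPLE) == 0:
    print("hwloc linked against Flux does not appear to be CUDA-enabled")
else:
    print("free GPUs detected:", free_gpu_count(SAMPLE))
```

In practice the listing text would come from capturing the output of ``flux start flux resource list``. The sketch parses a string instead so that it can be run without a Flux installation.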
