Make "Scheduling GPUs" its own (sub-)section #77

Open
SteVwonder opened this issue Nov 18, 2020 · 2 comments


@SteVwonder
Member

This should include details about hwloc needing to be compiled against NVML and OpenCL, the check for whether GPU detection in Flux worked (#71), and the fact that CUDA_DEVICE_ORDER should be set when launching Flux with a system launcher like srun or jsrun.
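For the CUDA_DEVICE_ORDER point, a minimal sketch of what the docs could show. The value PCI_BUS_ID is an assumption on my part (it is one of the two values CUDA accepts, the other being FASTEST_FIRST); this issue does not say which value to use. The srun line is only illustrative and is left commented out.

```shell
# Assumption: PCI_BUS_ID is the intended value; the issue text does not specify one.
# Exporting it before launching means the launcher's child processes inherit it.
export CUDA_DEVICE_ORDER=PCI_BUS_ID

# Illustrative launch under a system launcher (adjust options for your site):
#   srun -N1 flux start flux resource list
```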

@SteVwonder
Member Author

Instructions for loading GPU-enabled Flux on TOSS systems:

module use /opt/modules/modulefiles
module use /usr/global/tools/flux/toss_3_x86_64_ib/modulefiles
module load flux/0.19.0-cuda

Confirm that Flux is detecting GPUs and marking them as schedulable:

❯ flux start flux resource list
     STATE NNODES   NCORES    NGPUS
      free      1       16        2
 allocated      0        0        0
      down      0        0        0
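If this check ends up in a script rather than being read by eye, something like the following might work. It is a rough sketch that assumes the table layout shown above (a header row starting with STATE and containing an NGPUS column); free_gpus is a hypothetical helper name, not part of Flux.

```shell
# Hypothetical helper: reads `flux resource list` output on stdin and prints
# the NGPUS value from the "free" row. Assumes the header/row layout above.
free_gpus() {
    awk '
        # Find which column holds NGPUS by scanning the header row.
        $1 == "STATE" { for (i = 1; i <= NF; i++) if ($i == "NGPUS") col = i; next }
        # Print that column for the "free" row, then stop.
        $1 == "free" && col { print $col; exit }
    '
}

# Usage (needs a GPU-enabled Flux install, so left commented out):
#   flux start flux resource list | free_gpus
```

Looking the column up by header name rather than by position keeps the helper working if other columns are added to the table.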

@SteVwonder
Member Author

It should also cover the fact that if you don't have Fluxion loaded, hwloc may be detecting and reporting GPUs but sched-simple is ignoring them.

To test whether hwloc is detecting them: flux start flux hwloc topology | grep CoProc. To test that Fluxion is loaded: flux module list | grep fluxion (you should see both a qmanager and a resource module). If both checks pass, flux resource list should list the GPUs.
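The two checks could be wrapped up roughly as below. This is a sketch only: hwloc_sees_gpu and fluxion_loaded are hypothetical helper names, and they simply apply the same grep patterns mentioned in this comment to whatever is piped in.

```shell
# Hypothetical helper: succeeds if the hwloc topology on stdin mentions CoProc.
hwloc_sees_gpu() {
    grep -q CoProc
}

# Hypothetical helper: succeeds if the module listing on stdin contains fluxion
# entries for both a qmanager and a resource module.
fluxion_loaded() {
    mods=$(grep fluxion)
    printf '%s\n' "$mods" | grep -q qmanager &&
        printf '%s\n' "$mods" | grep -q resource
}

# Usage (needs a running Flux instance, so left commented out):
#   flux start flux hwloc topology | hwloc_sees_gpu || echo "hwloc reports no CoProc devices"
#   flux module list | fluxion_loaded || echo "Fluxion is not loaded"
```

If the first check passes but the second fails, that matches the sched-simple case described above: hwloc reports the GPUs but the scheduler ignores them.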
