Check device count before running dist tests (#2799)
* Check device count before running dist tests

* fixing format for "Check device count before running dist tests"

* Check device count against max world size

* Check GPU count before launching dist tests

* double-check GPU actually exists

---------

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
4 people authored Feb 23, 2023
1 parent 859d7c9 commit 7e77cf7
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions tests/unit/common.py
@@ -108,6 +108,10 @@ def _get_fixture_kwargs(self, request, func):
         return fixture_kwargs

     def _launch_procs(self, num_procs):
+        if torch.cuda.is_available() and torch.cuda.device_count() < num_procs:
+            pytest.skip(
+                f"Skipping test because not enough GPUs are available: {num_procs} required, {torch.cuda.device_count()} available"
+            )
         mp.set_start_method('forkserver', force=True)
         skip_msg = mp.Queue()  # Allows forked processes to share pytest.skip reason
         processes = []
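
The condition added above can be read as a small decision rule. The following is a minimal sketch of that rule only, not the project's code: the helper name `gpu_skip_reason` and its parameters are hypothetical, and the CUDA state is passed in as plain arguments (standing in for `torch.cuda.is_available()` and `torch.cuda.device_count()`) so the logic can be exercised without a GPU or a torch install.

```python
from typing import Optional


def gpu_skip_reason(num_procs: int, cuda_available: bool, device_count: int) -> Optional[str]:
    """Return a skip message if the test should be skipped, else None.

    Hypothetical helper mirroring the condition in the diff:
    skip only when CUDA is available AND there are fewer devices
    than the number of processes the test wants to launch.
    """
    if cuda_available and device_count < num_procs:
        return (
            "Skipping test because not enough GPUs are available: "
            f"{num_procs} required, {device_count} available"
        )
    # Either enough GPUs exist, or CUDA is unavailable; in both cases
    # this check alone does not request a skip.
    return None


# A 4-process test on a 2-GPU machine would be skipped.
print(gpu_skip_reason(4, cuda_available=True, device_count=2))
# A 2-process test on the same machine would proceed.
print(gpu_skip_reason(2, cuda_available=True, device_count=2))
# Without CUDA the check does not trigger, whatever num_procs is.
print(gpu_skip_reason(4, cuda_available=False, device_count=0))
```

Note that, as written in the diff, the check is guarded by `torch.cuda.is_available()`, so on a machine without CUDA it never skips; whether such tests run or are skipped there is decided elsewhere.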