
[TESTS] Refactor tests to run on either the GPU or CPU #6331

Merged 1 commit on Sep 2, 2020

Conversation

tkonolige (Contributor):

Much of the time spent in testing is duplicated work between CPU and GPU test nodes. The main reason is that there is no way to control which TVM devices are enabled at runtime, so tests that use LLVM will run on both GPU and CPU nodes.

This patch adds an environment variable, `TVM_TEST_DEVICES`, which controls which TVM devices should be used by tests. Devices not in `TVM_TEST_DEVICES` can still be used, so tests must be careful to check that the desired device is enabled with `tvm.testing.device_enabled` or by enumerating all devices with `tvm.testing.enabled_devices`. All tests have been retrofitted with these checks.

This patch also provides the decorator `@tvm.testing.gpu` to mark a test as possibly using the GPU. Tests that require the GPU can use `@tvm.testing.requires_gpu`. Tests without these flags will not be run on GPU nodes.
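The mechanism can be sketched without TVM itself. The following is a minimal, hypothetical reimplementation of the `device_enabled` / `enabled_devices` idea that honors a `TVM_TEST_DEVICES`-style variable; the device list, the `;` separator, and the fallback behavior are illustrative assumptions, not TVM's actual implementation:

```python
import os

# Illustrative stand-in for whatever devices TVM was actually built with.
ALL_DEVICES = ("llvm", "cuda", "opencl")

def enabled_devices():
    """Devices tests may use: the TVM_TEST_DEVICES list if set, else all built devices."""
    env = os.environ.get("TVM_TEST_DEVICES", "")
    if env.strip():
        return [d for d in env.split(";") if d]
    return list(ALL_DEVICES)

def device_enabled(device):
    """A test should exercise `device` only if this returns True."""
    return device in enabled_devices()

# A retrofitted test guards each device it touches:
def test_add():
    for dev in ["llvm", "cuda"]:
        if not device_enabled(dev):
            continue  # device excluded on this test node; skip its work
        # ... build and run the schedule for `dev` ...
```

With this scheme a CPU node can export `TVM_TEST_DEVICES=llvm` and the CUDA branch of every guarded test becomes a no-op, which is the duplication the patch removes.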

@tkonolige force-pushed the test_markers branch 3 times, most recently from e64a748 to 324066a (August 24, 2020 19:09)
Resolved review threads: python/tvm/testing.py (4), python/tvm/_ffi/runtime_ctypes.py, tests/python/topi/python/test_topi_conv2d_nhwc_winograd.py, tests/scripts/setup-pytest-env.sh, Jenkinsfile
tqchen (Member) commented Aug 24, 2020:

It might also be helpful to send a quick RFC given that the change of test infra will affect quite a lot of people

tkonolige (Contributor, Author):

I agree that an RFC could be useful, but maybe I could just add information to the docs instead?

zhiics (Member) left a comment:

A few nitpicks

Resolved review threads: tests/python/contrib/test_random.py, tests/python/relay/dyn/test_dynamic_op_level3.py
kparzysz-quic (Contributor) left a comment:

Please keep in mind that eventually the mapping target_name -> device will no longer be sufficient: there can be several devices with different characteristics that all fall under the same general target (e.g. gpu), and any individual test should be able to specify required device properties.
The use of strings to pass targets around may be phased out at some point, so please document somewhere the updates this scheme will need when that happens.
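One hypothetical shape for that transition, sketched purely for illustration (none of these names exist in TVM): a device descriptor that carries a general kind plus a property set, so a test can require "a gpu with fp16 support" instead of matching the string "cuda":

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class DeviceSpec:
    """Illustrative descriptor: several concrete devices may share one general kind."""
    kind: str                                # general target family, e.g. "gpu" or "cpu"
    target: str                              # concrete target string, e.g. "cuda"
    properties: frozenset = field(default_factory=frozenset)

    def satisfies(self, kind, required=()):
        """True if this device is of `kind` and offers every required property."""
        return self.kind == kind and set(required) <= self.properties

# Two different devices that both fall under the general "gpu" kind:
dev_a = DeviceSpec("gpu", "cuda", frozenset({"fp16"}))
dev_b = DeviceSpec("gpu", "cuda", frozenset())

# A test asks for capabilities, not for a target string:
assert dev_a.satisfies("gpu", ["fp16"])
assert not dev_b.satisfies("gpu", ["fp16"])
```

Under a scheme like this, the string-keyed enable/disable logic in this patch would become a filter over descriptors rather than a membership test on target names.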

Resolved review threads: python/tvm/testing.py (2)
tkonolige (Contributor, Author):

@kparzysz-quic I agree that using strings for targets here is pretty awkward. Is there a better way to check which device a target requires? I see that #6315 makes targets more structured, but I don't see a way to take a target string and get the associated device.

kparzysz-quic (Contributor):

Currently there is no good way to switch away from str here, but the question is how much work it would take to allow, for example, two different 'cuda' devices to be available. I haven't analyzed the code well enough to tell, but if you could outline somewhere what changes would be necessary, it would help the future transition.

tkonolige (Contributor, Author):

@tqchen I think I've addressed all issues. Do I need to do something to have Jenkins use the new Jenkinsfile?

tqchen (Member) commented Aug 26, 2020:

@tkonolige Because CI uses the old Jenkinsfile (before this PR gets merged). Please send another PR first that adds a gpuonly_test redirecting to the normal integration test; then we can tweak the scripts once that PR gets merged.

tkonolige (Contributor, Author):

@tqchen I'm getting a couple of errors with CUDA initialization failing. I'm not really sure of the cause, but it seems like it might have to do with forking.

@tkonolige force-pushed the test_markers branch 10 times, most recently from a5abf4f to f18f71b (September 1, 2020 17:47)
@tkonolige force-pushed the test_markers branch 2 times, most recently from ab0b4bc to bd404c6 (September 2, 2020 00:01)
tkonolige (Contributor, Author):

@tqchen All tests are passing now... can we merge?

tqchen merged commit b235e59 into apache:master on Sep 2, 2020
tqchen (Member) commented Sep 2, 2020:

Thanks @tkonolige @zhiics @kparzysz-quic !

kevinthesun pushed a commit to kevinthesun/tvm that referenced this pull request Sep 17, 2020
kevinthesun pushed a commit to kevinthesun/tvm that referenced this pull request Sep 18, 2020
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Sep 18, 2020
4 participants