-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restore "pytest.mark.gpu" for RELAX tests #16741
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for finding and fixing this issue. The tvm.testing.requires_*
marks all provide the pytest.mark.gpu
flag, but they aren't yet used in all cases.
|
||
pytestmark = [cudnn_enabled] | ||
pytestmark = [*tvm.testing.requires_cudnn.marks()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: The tvm.testing.requires_*
marks can be used as marks directly, rather than needing to be explicitly expanded with foo.marks()
. In addition, the pytestmark
can be a single marker, rather than a list. Between these two, this can cleaned up as:
pytestmark = tvm.testing.requires_cudnn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that it's possible. Global object pytestmark
should has type Mark
or List[Mark]
but our TVM markers has type tvm.testing.utils.Feature
. So direct assignment, like you suggest above, will lead to type check error:
TypeError: got <tvm.testing.utils.Feature object at 0x7ff96f507340> instead of Mark
Moreover, the idea of this change was to utilise hierarchical structure of tvm testing features. pytest.mark.gpu
mark is a root of this hierarchy. So I have to extract all markers from current feature and all parents. That is exactly method marks()
do.
But you are right this line can be slightly simplified by removing explicit list generator, like:
pytestmark = tvm.testing.requires_cudnn.marks()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I thought I had made Feature
be a subclass of Mark
, but I guess I never did. In that case, I like your simplification.
@@ -36,12 +36,8 @@ def reset_seed(): | |||
|
|||
has_cudnn = tvm.get_global_func("relax.ext.cudnn", True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need has_cudnn
now that it isn't used in pytest.mark.skipif
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You re absolutely right. Will remove it.
Missed pytest.mark.gpu prevents tests from launch in CI. Signed-off-by: Alexander Peskov <alexander.peskov@deelvin.com>
695d95f
to
91e8412
Compare
Hi @Lunderberg thank you for reviewing this PR.
Yes I know this issue. And I absolutely agree that this PR contains logically incorrect test marks. Some tests are marked like "require_XXX" but in fact they don't. But this incorrectness was introduced not by me. A lot of test from files touched by me verify pattern matching functionality. Pattern matching is pure python code and doesn't require any cmake enabled features like specific code generator or runtime support. But they still has marker like "requires_tensorrt_codegen". So I changed one incorrect marker by another. The reason of that was in a need to enable tests in CI. I saw 4 way to do that:
I selected the fourth way as easiest. And I realise that it's imperfect solution. If you have any other ideas how to enable all test in CI and keep testing code correct please point that. |
Thank you on the explanation of the different options. I agree that changing all markers in all locations is far too heavy of a lift for any one PR, and should be avoided. For long-term solutions, I think it would make sense to have the "cpu" and "gpu" indicate which environment should be used, and not which functionality is being tested. That way, we could distinguish between a test that compiles TensorRT (doesn't require a GPU) and a test that executes TensorRT (does require a GPU). The former would run in the |
(Aside from all my pontificating, I think this PR is ready to merge once the CI passes.) I took a quick glance at the failing cases, and it looks like this line should be changed from
So the test worked when it was added, should have been updated in subsequent PRs, but due to the missing pytest marks, was never noticed until this PR. |
Signed-off-by: Alexander Peskov <alexander.peskov@deelvin.com>
Signed-off-by: Alexander Peskov <alexander.peskov@deelvin.com>
Hi, team! |
@Lunderberg could you please merge this one if you have no objections. |
Looks like it is still making its way through CI, but I can merge it after the CI passes. |
Unfortunately, relax GPU tests are excluded from CI right now.
The testing script uses flag "-m gpu" to run gpu tests, but actual tests have no such mark.