Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for CI testing #124

Merged
merged 58 commits into from
Dec 8, 2024
Merged

Add support for CI testing #124

merged 58 commits into from
Dec 8, 2024

Conversation

sandeepd-nv
Copy link
Member

No description provided.

@sandeepd-nv sandeepd-nv added the CI/CD CI/CD infrastructure label Sep 23, 2024
@sandeepd-nv sandeepd-nv self-assigned this Sep 23, 2024
@sandeepd-nv
Copy link
Member Author

Blocked on #128.

@leofang
Copy link
Member

leofang commented Oct 9, 2024

Blocked on #128.

See #128 (comment), thx!

@leofang leofang added the P0 High priority - Must do! label Dec 4, 2024
Copy link

copy-pr-bot bot commented Dec 7, 2024

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@leofang
Copy link
Member

leofang commented Dec 7, 2024

/ok to test

@leofang
Copy link
Member

leofang commented Dec 8, 2024

/ok to test

@leofang
Copy link
Member

leofang commented Dec 8, 2024

/ok to test

@leofang
Copy link
Member

leofang commented Dec 8, 2024

/ok to test

@leofang
Copy link
Member

leofang commented Dec 8, 2024

/ok to test

1 similar comment
@leofang
Copy link
Member

leofang commented Dec 8, 2024

/ok to test

@leofang leofang closed this Dec 8, 2024
@leofang leofang reopened this Dec 8, 2024
@leofang
Copy link
Member

leofang commented Dec 8, 2024

/ok to test

@leofang leofang marked this pull request as ready for review December 8, 2024 23:09
@leofang leofang added this to the cuda.core beta 2 milestone Dec 8, 2024
@leofang leofang added test Addition or improved tests cuda.bindings Everything related to the cuda.bindings module cuda.core Everything related to the cuda.core module labels Dec 8, 2024
@leofang
Copy link
Member

leofang commented Dec 8, 2024

Here are some updates since Friday:

  1. it makes the CI logs hard to browse (everything from the same workflow is collapsed inside the same title)

It turns out that this is a misunderstanding (of mine), sorry! It's the other way around: It's the composite actions that do this, not reusable workflows, see, e.g. https://docs.github.com/en/actions/sharing-automations/avoiding-duplication#comparison-of-reusable-workflows-and-composite-actions. So I refactored in the opposite (and wrong) direction.

Let us do this (change all composite actions to reusable workflows) in a follow-up PR since the CI is now working and there's no reason to delay.

  1. just discovered: environment variables cannot be passed across workflows (https://github.com/orgs/community/discussions/26671), which therefore requires both build and test workflows to do the same setup, and it requires complex logic to manage

This is another misunderstanding (of mine), sorry (again)! For both cases (distinct jobs vs distinct workflows), it require some handling of input/output. There's no way for sharing the env vars in either case.

We are overcomplicating the CI for a simple project and bending backward

As part of this I removed all CI scripts in commit 7b074f0. It is the best that we focus on testing pip-based workflows for now, and add conda next (which would be treated differently). Mixing-and-matching is not ideal.

Comment on lines +24 to +26
@pytest.fixture(scope="session", autouse=True)
def always_init_cuda():
handle_return(driver.cuInit(0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, @ksimpson-work the CI was able to catch this issue: Depending on how the tests are run, it could be possible that a test ends without CUDA even initialized. So we must ensure it ourselves by the test start time.

Comment on lines +39 to +42
ctx = handle_return(driver.cuCtxGetCurrent())
if int(ctx) == 0:
# no active context, do nothing
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, @ksimpson-work another issue caught by the CI (and also back in #261): A test could end early without a CUDA context set current, so we need to detect this at the test teardown time.

@@ -10,7 +10,6 @@
import os
import sys

import cupy as cp
Copy link
Member

@leofang leofang Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I treat CuPy as an optional test dependency, so any reference to CuPy in this file should be removed. (We're not using too much memory during tests anyway.)

Comment on lines +16 to +21
def can_load_generated_ptx():
_, driver_ver = cuda.cuDriverGetVersion()
_, nvrtc_major, nvrtc_minor = nvrtc.nvrtcVersion()
if nvrtc_major * 1000 + nvrtc_minor * 10 > driver_ver:
return False
return True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @ksimpson-work this is akin to this snippet that I added to CuPy in the past (PTX might not be loadable/JIT'able if it's newer than the driver):
https://github.com/cupy/cupy/blob/8eb16ac910e85c119a20f68a69de9a2e6034069c/tests/cupy_tests/core_tests/test_raw.py#L557-L568

@leofang leofang merged commit 0723d62 into NVIDIA:main Dec 8, 2024
46 of 48 checks passed
@leofang
Copy link
Member

leofang commented Dec 8, 2024

Thanks for help, @sandeepd-nv!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI/CD CI/CD infrastructure cuda.bindings Everything related to the cuda.bindings module cuda.core Everything related to the cuda.core module P0 High priority - Must do! test Addition or improved tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add tests to github actions
2 participants