Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement judicious skipping of CI jobs #94

Open
vyasr opened this issue Aug 21, 2024 · 13 comments
Open

Implement judicious skipping of CI jobs #94

vyasr opened this issue Aug 21, 2024 · 13 comments
Assignees

Comments

@vyasr
Copy link
Contributor

vyasr commented Aug 21, 2024

Currently any change to any RAPIDS repo triggers a complete run of the entire build and test suite of CI jobs. This is very expensive, and is often unnecessary. RAPIDS libraries are typically structured such that they have a clear, linear dependency chain between different components. Some examples:

  • Changes to Python code should have no effect whatsoever on C++ testing
  • For repositories that publish multiple Python packages, there is typically a linear dependency between at least some of these such that changes to downstream packages has no effect on the tests of upstream package (e.g. changes to cugraph would not affect pylibcugraph tests).
  • Changes to documentation should only require documentation rebuilds, no test runs.

We can reduce the number of unnecessary jobs that we run in CI by more judiciously skipping jobs that are unnecessary. The simplest approach to do this is by simply checking what files have changed. We have previously implemented a form of this in cudf for both cudf.pandas and now for cudf-polars. To do this, I would propose the following steps:

  • Generalizing the above logic for detecting changes into a shared workflow
  • Enable filtering jobs using the above shared workflow
  • Generalize the pr-builder job to allow some jobs to be skipped under appropriate circumstances.
@KyleFromNVIDIA
Copy link
Contributor

pytest-incremental may help us with skipping unneeded Python tests:

https://pypi.org/project/pytest-incremental/

It hasn't been updated in a while though, it may need some work.

KyleFromNVIDIA added a commit to KyleFromNVIDIA/cudf that referenced this issue Aug 22, 2024
Only run tests based on things that have actually changed. For
example, if only Python files have changed, we don't need to run
the C++ tests.

Contributes to rapidsai/build-planning#94
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue Aug 27, 2024
Only run tests based on things that have actually changed. For example, if only Python files have changed, we don't need to run the C++ tests.

Contributes to rapidsai/build-planning#94

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Robert Maynard (https://github.com/robertmaynard)

URL: #16642
KyleFromNVIDIA added a commit to KyleFromNVIDIA/cugraph that referenced this issue Aug 27, 2024
Only run tests based on things that have actually changed. For
example, if only Python files have changed, we don't need to run
the C++ tests.

Contributes to rapidsai/build-planning#94
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue Sep 26, 2024
copy-pr-bot bot pushed a commit to rapidsai/cudf that referenced this issue Sep 28, 2024
rapids-bot bot pushed a commit to rapidsai/cugraph that referenced this issue Oct 1, 2024
Only run tests based on things that have actually changed. For example, if only Python files have changed, we don't need to run the C++ tests.

Contributes to rapidsai/build-planning#94

Depends on rapidsai/shared-workflows#239

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Jake Awe (https://github.com/AyodeAwe)

URL: #4634
@vyasr
Copy link
Contributor Author

vyasr commented Oct 1, 2024

@KyleFromNVIDIA now that we have a common shared workflow and about a month of data on how pruning the jobs works, do we want to roll this feature out to the rest of RAPIDS and then close this issue?

KyleFromNVIDIA added a commit to KyleFromNVIDIA/cuml that referenced this issue Oct 3, 2024
KyleFromNVIDIA added a commit to KyleFromNVIDIA/raft that referenced this issue Oct 3, 2024
rapids-bot bot pushed a commit to rapidsai/cuml that referenced this issue Oct 3, 2024
KyleFromNVIDIA added a commit to KyleFromNVIDIA/kvikio that referenced this issue Oct 3, 2024
KyleFromNVIDIA added a commit to KyleFromNVIDIA/cuvs that referenced this issue Oct 3, 2024
rapids-bot bot pushed a commit to rapidsai/raft that referenced this issue Oct 4, 2024
rapids-bot bot pushed a commit to rapidsai/cuvs that referenced this issue Oct 4, 2024
KyleFromNVIDIA added a commit to KyleFromNVIDIA/ucxx that referenced this issue Oct 4, 2024
KyleFromNVIDIA added a commit to KyleFromNVIDIA/cuxfilter that referenced this issue Oct 4, 2024
KyleFromNVIDIA added a commit to KyleFromNVIDIA/rmm that referenced this issue Oct 4, 2024
KyleFromNVIDIA added a commit to KyleFromNVIDIA/wholegraph that referenced this issue Oct 4, 2024
rapids-bot bot pushed a commit to rapidsai/rmm that referenced this issue Oct 4, 2024
rapids-bot bot pushed a commit to rapidsai/cuxfilter that referenced this issue Oct 4, 2024
rapids-bot bot pushed a commit to rapidsai/wholegraph that referenced this issue Oct 4, 2024
rapids-bot bot pushed a commit to rapidsai/ucxx that referenced this issue Oct 4, 2024
rapids-bot bot pushed a commit to rapidsai/kvikio that referenced this issue Oct 7, 2024
@jakirkham
Copy link
Member

In some cases we want to run all CI. Is there a good way to bypass this selection behavior?

@KyleFromNVIDIA
Copy link
Contributor

Please describe a scenario in which you want to run a CI job even though the relevant files haven't changed. We can look at ways to force it to run.

In the meantime, as a workaround, you can add a comment to a relevant file to force CI to run, then remove it again once CI has passed.

@KyleFromNVIDIA
Copy link
Contributor

We could create a new label called "Force CI Run". Then the condition could be something like this:

if: fromJSON(needs.changed-files.outputs.changed_file_groups).test_cpp || fromJSON(needs.pr-info.outputs.pr-info).labels.*.name == "Force CI Run"

Or we could even modify the changed-files workflow with a new output to give us the "force CI run" status:

if: fromJSON(needs.changed-files.outputs.changed_file_groups).test_cpp || needs.changed-files.outputs.force_ci_run

If the label is initially not provided, and is added later, we can then trigger a new workflow run with the label applied by:

git commit --allow-empty -m 'Re-run Ci'
git push

@jameslamb
Copy link
Member

My perspective: if the main goal is just to support PRs like rapidsai/pynvjitlink#107 of the form "I just want to see if a CI issue is unrelated to my code changes", I'd prefer just adding a comment to a central file (like a C++ source file) instead of adding complexity to the changed-files mechanism.

@KyleFromNVIDIA
Copy link
Contributor

I've opened rapidsai/shared-workflows#249 and rapidsai/cudf#17064 to test out the approach outlined at #94 (comment).

@KyleFromNVIDIA
Copy link
Contributor

The above POC works, but if @jameslamb is correct then it may be better to just modify a C++ file with a comment to force the run. @jakirkham WDYT?

@vyasr
Copy link
Contributor Author

vyasr commented Oct 11, 2024

I made the same assumption as James regarding the purpose of the request and come to the same conclusion. I would prefer to discourage using CI as a way to "just run tests", which is one possible use case. If we are specifically testing the behavior of CI (e.g. CI fails but locally tests pass), then adding a trivial change is not a very high barrier. Alternatively, instead of using PR CI you can use either build or test workflows. Those both have workflow triggers available so they can be run directly from the Actions tab, no PR needed. A special label feels like extra machinery that we don't need.

@jameslamb
Copy link
Member

@KyleFromNVIDIA what is left for this? It's difficult for me to tell which repos we still haven't made these changes for.

@KyleFromNVIDIA
Copy link
Contributor

Looks like cuspatial has not yet been done. I'll go work on that now. I'm not sure if there's anything else. The way to tell is to look at .github/workflows/pr.yaml and see if there is a changed-files workflow.

@jameslamb
Copy link
Member

jameslamb commented Oct 23, 2024

The way to tell is to look at .github/workflows/pr.yaml and see if there is a changed-files workflow.

Ok, I was hoping you had a list of what remained and it just hadn't made it into this issue yet. I'm not planning to go through all the repos looking for that workflow, I trust your estimation of what's left.

rapids-bot bot pushed a commit to rapidsai/cuspatial that referenced this issue Oct 23, 2024
@jameslamb
Copy link
Member

@KyleFromNVIDIA could you please do one more check like the one you mentioned in #94 (comment), and then close this issue if you think it's done?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants