@jameslamb
Member

@jameslamb jameslamb commented Sep 22, 2025

Description

Replaces #359 (my more-complicated earlier attempt at this)

This project runs nightly builds and tests on a cron schedule:

name: Trigger Nightly cuOpt Pipeline
on:
  workflow_dispatch:
  schedule:
    - cron: "0 5 * * *" # 5am UTC / 1am EST

Tests need to wait for builds to finish, and that's currently done with some shell scripts that hit the GitHub API, using a mix of sleep and polling.

This has sometimes resulted in nightly failures (network errors, timeouts, etc.). This PR proposes reducing the risk of such failures by moving that logic into GitHub Actions configuration directly, specifically:

  • making build.yaml trigger test.yaml with the GitHub CLI only after all package builds and publishing have finished
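
A minimal sketch of that pattern, for illustration only. The job names, the placeholder `echo` steps, and the choice of `--ref` are assumptions, not copied from this PR's diff. It also assumes test.yaml declares an `on: workflow_dispatch` trigger, which `gh workflow run` requires.

```yaml
# build.yaml (sketch; job names and steps are hypothetical)
name: build
on:
  workflow_dispatch:

jobs:
  build-packages:
    runs-on: ubuntu-latest
    steps:
      - run: echo "build conda packages and wheels here"

  publish-packages:
    needs: build-packages
    runs-on: ubuntu-latest
    steps:
      - run: echo "upload packages here"

  build-images:
    # Image builds run on their own; tests do not wait for them.
    needs: build-packages
    runs-on: ubuntu-latest
    steps:
      - run: echo "build docker images here"

  trigger-tests:
    # Starts only after every job listed in `needs` has succeeded.
    needs: publish-packages
    runs-on: ubuntu-latest
    permissions:
      actions: write  # lets the workflow token dispatch another workflow
    env:
      GH_TOKEN: ${{ github.token }}  # the GitHub CLI reads its token from GH_TOKEN
    steps:
      - run: gh workflow run test.yaml --repo "${{ github.repository }}" --ref "${{ github.ref_name }}"
```

Because `trigger-tests` lists only `publish-packages` under `needs`, the scheduler starts it as soon as publishing succeeds, with no sleep-and-poll loop against the GitHub API.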

Issue

Contributes to #122

Notes for Reviewers

How I tested this

I manually triggered this run of the "Trigger Nightly cuOpt Pipeline": https://github.com/NVIDIA/cuopt/actions/runs/17935159871

Which triggered this build run: https://github.com/NVIDIA/cuopt/actions/runs/17935161536

Which triggered this test run: https://github.com/NVIDIA/cuopt/actions/runs/17936474025

Things look ok to me!

The test run was not triggered until after all the relevant package builds and uploads were done, and BEFORE the docker image builds were done (as intended, so that tests are not delayed waiting on them).

There are some test failures from artifact-downloading, like this:

[rapids-github-run-id] Querying the GitHub API to determine relevant run of 'build.yaml'.
Downloading and decompressing cuopt_wheel_python_cuopt_server_cu12_py312_x86_64 from Run ID 17936253863 into /tmp/tmp.pqrBXIhMlP

But I think they'll be fixed by merging #409

And the naming changes for the image builds look good 😁


@jameslamb jameslamb added the "do not merge" label Sep 22, 2025
@copy-pr-bot

copy-pr-bot bot commented Sep 22, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@jameslamb
Member Author

/ok to test

@jameslamb jameslamb added the "non-breaking" and "improvement" labels and removed the "do not merge" label Sep 23, 2025
@jameslamb jameslamb changed the title from "WIP: avoid triggering nightly tests until builds are complete" to "avoid triggering nightly tests until builds are complete" Sep 23, 2025
@jameslamb jameslamb marked this pull request as ready for review September 23, 2025 05:29
@jameslamb jameslamb requested a review from a team as a code owner September 23, 2025 05:29
@rgsl888prabhu
Collaborator

If you still see the issue in doc, can you please update the link to https://docs.nvidia.com/ngc/latest/ngc-private-registry-user-guide.html#generating-a-personal-api-key

@jameslamb
Member Author

Ok yep will do!

And I think this new error:

ImportError: /opt/conda/envs/docs/lib/python3.13/site-packages/pylibcudf/../../../libcudf.so: undefined symbol: _ZN3rmm16cuda_stream_poolC1Em

(docs-build link)

Is a result of rapidsai/rmm#2036

It should be fixed by other RAPIDS packages being rebuilt, which @bdice triggered here: https://github.com/rapidsai/workflows/actions/runs/17949845053

@jameslamb
Member Author

> It should be fixed by other RAPIDS packages being rebuilt

Looks like that was not enough, probably for the reasons being discussed in rapidsai/build-planning#218

This should hopefully be resolved later today when the RAPIDS Ops team deletes some nightly packages to allow new ones to be published.

@jameslamb jameslamb requested a review from a team as a code owner September 24, 2025 03:27
@jameslamb jameslamb requested a review from Iroy30 September 24, 2025 03:27
@jameslamb
Member Author

> If you still see the issue in doc, can you please update the link to https://docs.nvidia.com/ngc/latest/ngc-private-registry-user-guide.html#generating-a-personal-api-key

This did fail again 😭

https://github.com/NVIDIA/cuopt/actions/runs/17962030366/job/51090641582?pr=408

Updated those links in the way you suggested: 386dad2

@jameslamb
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 0c69099 into branch-25.10 Sep 24, 2025
173 of 174 checks passed
@jameslamb jameslamb deleted the trigger-nightly-tests-after-builds branch September 24, 2025 15:12