Conversation

@kaatish
Contributor

@kaatish kaatish commented Jul 26, 2025

This PR removes CUDA graph capture from load-balanced bounds strengthening to address intermittent crashes in the constructor. The CUDA graph is instead built manually, node by node. Fixes #219

@copy-pr-bot

copy-pr-bot bot commented Jul 26, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@kaatish kaatish added the bug and non-breaking labels Jul 26, 2025
@rgsl888prabhu rgsl888prabhu added this to the 25.08 milestone Jul 29, 2025
@tmckayus
Contributor

This one is critical for 25.08

@tmckayus
Contributor

/ok to test 8596529

@rgsl888prabhu
Collaborator

/ok to test 10f87ea

@rgsl888prabhu
Collaborator

@kaatish Should we merge this PR?

@kaatish kaatish marked this pull request as ready for review July 30, 2025 20:59
@kaatish kaatish requested review from a team as code owners July 30, 2025 20:59
@kaatish
Contributor Author

kaatish commented Jul 30, 2025

/ok to test 2bb8de9

@kaatish kaatish changed the title from "Remove cuda graphs from load balanced bounds presolve" to "Manual cuda graph creation in load balanced bounds presolve" Jul 30, 2025
@kaatish
Contributor Author

kaatish commented Jul 30, 2025

/ok to test 18b7202

@kaatish
Contributor Author

kaatish commented Jul 30, 2025

/ok to test 5502c8d

Collaborator

@rgsl888prabhu rgsl888prabhu left a comment

minor suggestion, rest looks good

{
using f_t2 = typename type_2<f_t>::type;
cudaGraph_t cnst_slack_graph;
cudaGraphCreate(&cnst_slack_graph, 0);
Contributor

Are we recreating the graph each time?

Contributor Author

cudaGraphCreate only creates an empty graph structure that we then add graph nodes to. Once the structure is complete, we instantiate the executable graph with a call like:
cudaGraphInstantiate(&upd_bnd_exec, upd_graph, NULL, NULL, 0);
This is done once per graph, in the setup function inside the constructor, by calling either create_bounds_update_graph() or create_constraint_slack_graph().
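
For readers unfamiliar with the pattern, here is a minimal sketch of "build the graph manually, instantiate once, launch many times". It is not the code from this PR: `scale_kernel`, `scale_graph_t`, `create_scale_graph` and `run_scale_graph` are hypothetical names made up for illustration, and error checking is mostly omitted for brevity. It assumes a CUDA toolkit where the five-argument C++ form of cudaGraphInstantiate (the one quoted above) is available.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel standing in for one step of the real presolve work.
__global__ void scale_kernel(float* data, float factor, int n)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) { data[i] *= factor; }
}

// Hypothetical holder for the graph structure and its executable instance.
struct scale_graph_t {
  cudaGraph_t graph    = nullptr;
  cudaGraphExec_t exec = nullptr;
};

// Build the graph structure node by node (no stream capture),
// then instantiate it exactly once.
cudaError_t create_scale_graph(scale_graph_t& g, float* d_data, float factor, int n)
{
  cudaError_t err = cudaGraphCreate(&g.graph, 0);  // empty structure only
  if (err != cudaSuccess) { return err; }

  // Kernel argument values are copied when the node is added,
  // so these locals do not need to outlive this function.
  void* args[] = {&d_data, &factor, &n};
  cudaKernelNodeParams params = {};
  params.func           = (void*)scale_kernel;
  params.gridDim        = dim3((n + 255) / 256);
  params.blockDim       = dim3(256);
  params.sharedMemBytes = 0;
  params.kernelParams   = args;
  params.extra          = nullptr;

  cudaGraphNode_t node;
  err = cudaGraphAddKernelNode(&node, g.graph, nullptr, 0, &params);
  if (err != cudaSuccess) { return err; }

  // Same call shape as in the comment above.
  return cudaGraphInstantiate(&g.exec, g.graph, NULL, NULL, 0);
}

// Instantiate once, launch `iters` times, and copy the result back to the host.
std::vector<float> run_scale_graph(int n, float factor, int iters)
{
  std::vector<float> host(n, 1.0f);
  float* d_data = nullptr;
  cudaMalloc(&d_data, n * sizeof(float));
  cudaMemcpy(d_data, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

  cudaStream_t stream;
  cudaStreamCreate(&stream);

  scale_graph_t g;
  create_scale_graph(g, d_data, factor, n);  // done once
  for (int i = 0; i < iters; ++i) {
    cudaGraphLaunch(g.exec, stream);         // reused on every iteration
  }
  cudaStreamSynchronize(stream);

  cudaMemcpy(host.data(), d_data, n * sizeof(float), cudaMemcpyDeviceToHost);

  cudaGraphExecDestroy(g.exec);
  cudaGraphDestroy(g.graph);
  cudaStreamDestroy(stream);
  cudaFree(d_data);
  return host;
}
```

Calling `run_scale_graph(8, 2.0f, 3)` from a host `main` multiplies each element by the factor once per launch. The point of the sketch is that cudaGraphCreate and cudaGraphInstantiate run once, while only cudaGraphLaunch sits in the loop.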

@kaatish
Contributor Author

kaatish commented Jul 31, 2025

/ok to test e6afa2e

@kaatish
Contributor Author

kaatish commented Jul 31, 2025

/merge

@rapids-bot rapids-bot bot merged commit c3ecbb8 into NVIDIA:branch-25.08 Jul 31, 2025
143 of 144 checks passed

Labels

bug, non-breaking

Development

Successfully merging this pull request may close these issues.

[BUG] Load balanced presolve constructor crashes intermittently

4 participants