Fix out-of-bound access in `compute_max_variable_violation`. #146

legrosbuffle · 2025-06-26T12:28:55Z

When [there are free variables](https://github.com/NVIDIA/cuopt/blob/63fbb6b22c8949798fffe8cb34ace85ad203f2bb/cpp/src/mip/problem/problem.cu#L1234), the lower/upper bounds and assignment arrays are not necessarily the same size as those of the original problem. In that case, computing the max violation accesses out-of-bounds data.

This is already covered by `bound_standardization_test`.

I'm not quite sure whether `assert(problem_ptr->variable_lower_bounds.size() >= num_variables);` should actually be an equality (i.e., how/if the mapping from variables to bounds is handled). But at least this is strictly better than out-of-bounds accesses.
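To illustrate the failure mode described above, here is a minimal host-side sketch, not the cuOpt implementation. The function name `max_variable_violation` and the use of `std::vector` are assumptions made for illustration; the real code runs on device data. It only visits indices that are valid in all three arrays, so a size mismatch cannot cause an out-of-bounds read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for the device code: returns the largest
// amount by which any assignment value falls outside [lower, upper].
double max_variable_violation(const std::vector<double>& assignment,
                              const std::vector<double>& lower,
                              const std::vector<double>& upper)
{
  // The two bound arrays are assumed to describe the same set of variables.
  assert(lower.size() == upper.size());
  // Only visit indices that are valid in all three arrays.
  const std::size_t n = std::min(assignment.size(), lower.size());
  double max_violation = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    const double below = lower[i] - assignment[i];  // > 0 if under the lower bound
    const double above = assignment[i] - upper[i];  // > 0 if over the upper bound
    max_violation = std::max({max_violation, below, above});
  }
  return max_violation;
}
```

Note that truncating to the common size only prevents the invalid read. It assumes that index `i` refers to the same variable in every array, which may not hold once variables have been added or removed.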

copy-pr-bot · 2025-06-26T12:29:00Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

legrosbuffle · 2025-06-26T13:28:29Z

> I'm not quite sure whether `assert(problem_ptr->variable_lower_bounds.size() >= num_variables);` should actually be an equality (i.e., how/if the mapping from variables to bounds is handled). But at least this is strictly better than out-of-bounds accesses.

Actually, I'm seeing the same pattern of OOB accesses in other places, so I suspect there is a larger pattern of broken invariants in the model. I'm opening a bug to discuss that.

akifcorduk

What's happening here is actually the opposite. `assignment` is resized to the original problem's size before we return to the user, but the internal `problem_ptr` still has the sizes of the modified problem.
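A small sketch of why the modified and original problems can differ in size, using the textbook transformation that splits a free variable x into x = x_plus - x_minus. This is an assumption for illustration: cuOpt's actual standardization may work differently, and the names `Bounds`, `split_free_variables`, and `to_original` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

constexpr double kInf = std::numeric_limits<double>::infinity();

struct Bounds {
  std::vector<double> lower;
  std::vector<double> upper;
};

// Splits every free variable x (bounds -inf, +inf) into x = x_plus - x_minus.
// x_plus reuses the original index; x_minus is appended at the end.
// minus_index[i] is the index of x_minus for original variable i, or -1.
Bounds split_free_variables(const Bounds& original, std::vector<int>& minus_index)
{
  assert(original.lower.size() == original.upper.size());
  Bounds modified = original;
  minus_index.assign(original.lower.size(), -1);
  for (std::size_t i = 0; i < original.lower.size(); ++i) {
    const bool is_free = original.lower[i] == -kInf && original.upper[i] == kInf;
    if (!is_free) { continue; }
    modified.lower[i] = 0.0;  // x_plus lies in [0, +inf)
    minus_index[i]    = static_cast<int>(modified.lower.size());
    modified.lower.push_back(0.0);   // x_minus lies in [0, +inf)
    modified.upper.push_back(kInf);
  }
  return modified;
}

// Maps an assignment of the modified problem back to the original variables.
std::vector<double> to_original(const std::vector<double>& modified_assignment,
                                const std::vector<int>& minus_index)
{
  std::vector<double> result(minus_index.size());
  for (std::size_t i = 0; i < minus_index.size(); ++i) {
    result[i] = modified_assignment[i];
    if (minus_index[i] >= 0) { result[i] -= modified_assignment[minus_index[i]]; }
  }
  return result;
}
```

In this sketch, a problem with two variables, one of them free, becomes a modified problem with three variables, while the assignment returned to the user has two entries again. Pairing that assignment with the modified problem's bound arrays therefore mixes two different index spaces.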

akifcorduk · 2025-07-03T06:40:08Z

cpp/src/mip/solution/solution.cu

 f_t solution_t<i_t, f_t>::compute_max_variable_violation()
 {
+  const auto num_variables = view().assignment.size();
+  assert(problem_ptr->variable_lower_bounds.size() >= num_variables);


The standard is to use `cuopt_assert` to convey the error message. It also gives us control to enable and disable asserts more easily.

I also don't think any size assertion can be inferred here. The size might be greater or smaller, because we might eliminate some variables in presolve, or add some variables because of free variables.
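For readers unfamiliar with the idea of a message-carrying assert that can be switched on and off, here is a generic sketch. It is not the definition of `cuopt_assert`; the macro `SKETCH_ASSERT`, the switch `SKETCH_ENABLE_ASSERTS`, and the helper `check_same_size` are hypothetical names, and throwing an exception is just one possible failure behavior.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical switch; a real project would typically set this from the build system.
#ifndef SKETCH_ENABLE_ASSERTS
#define SKETCH_ENABLE_ASSERTS 1
#endif

#if SKETCH_ENABLE_ASSERTS
// Throws with a descriptive message when the condition does not hold.
#define SKETCH_ASSERT(cond, msg)                                            \
  do {                                                                      \
    if (!(cond)) {                                                          \
      throw std::logic_error(std::string("assertion failed: ") + (msg));    \
    }                                                                       \
  } while (0)
#else
// Expands to an empty statement when asserts are disabled;
// the condition is not evaluated.
#define SKETCH_ASSERT(cond, msg) \
  do {                           \
  } while (0)
#endif

// Example check: two arrays must have the same size before they are indexed in lockstep.
void check_same_size(std::size_t assignment_size, std::size_t bounds_size)
{
  SKETCH_ASSERT(assignment_size == bounds_size, "assignment and bounds sizes differ");
}
```

As the comment above points out, such an equality (or inequality) only makes sense where both arrays are known to refer to the same problem; between the original and the modified problem neither direction is guaranteed.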

rgsl888prabhu · 2025-07-22T15:53:01Z

/ok to test 4cb0d22

rgsl888prabhu · 2025-07-29T14:28:12Z

/ok to test 11eadc5

tmckayus · 2025-07-29T19:01:25Z

Looks like this is breaking on the assert; perhaps @akifcorduk's comment still holds.
conda_cpp_tests is failing consistently with this:

after trivial presolve updated 233 constraints 2009 variables. Objective offset 0.000000
ELIM_VAR_REMAP_TEST: /tmp/conda-bld-output/bld/rattler-build_libmps-parser/work/cpp/src/mip/solution/solution.cu:545: f_t cuopt::linear_programming::detail::solution_t<i_t, f_t>::compute_max_variable_violation() [with i_t = int; f_t = double]: Assertion `problem_ptr->variable_lower_bounds.size() >= num_variables' failed.

rgsl888prabhu · 2025-07-31T15:10:00Z

@akifcorduk what's the next course of action here?

tmckayus · 2025-07-31T16:20:12Z

Moving to 25.10 milestone

akifcorduk · 2025-08-06T12:13:37Z

I couldn't reproduce this issue. It might have been fixed by one of the PRs or this is specific to the custom environment that the OP is using.

anandhkb · 2025-08-08T03:09:58Z

@akifcorduk Could this have already been fixed by one of the merged PRs?

legrosbuffle · 2025-08-18T12:42:46Z

> @akifcorduk Could this have already been fixed by one of the merged PRs?

I just checked after updating (at f298994), the invalid memory accesses are still happening.

legrosbuffle · 2025-08-18T13:39:25Z

What's expected of me here?

rgsl888prabhu · 2025-08-18T13:55:57Z

> What's expected of me here?

Nah, I just assigned it to you since you are the owner and added awaiting response for @akifcorduk to get back to your question.

legrosbuffle · 2025-08-18T14:05:17Z

> > What's expected of me here?
>
> Nah, I just assigned it to you since you are the owner and added awaiting response for @akifcorduk to get back to your question.

Ah ok, thanks.

akifcorduk · 2025-08-19T16:12:12Z

@legrosbuffle could you give me the instructions to reproduce this issue? Data, machine, compiler, settings etc.

legrosbuffle · 2025-08-21T06:00:51Z

> @legrosbuffle could you give me the instructions to reproduce this issue? Data, machine, compiler, settings etc.

I thought you had a repro here: https://github.com/NVIDIA/cuopt/actions/runs/16776602702/job/47507156031?pr=258#step:10:2580 ?

For the data, this simply happens in several of the existing unit tests (see the bug for details: #150).

Compiler & settings: Unfortunately we're using a custom toolchain (based on clang + libc++), and the only machines with GPUs I have access to require that toolchain, so I can't give you a usable command line for a repro. But I can reproduce in two different ways: with asserts on (which triggers bounds-checking errors in the span), or when running with address sanitizer on. Note that this is not the first time that I'm triggering asserts that you don't seem to be able to reproduce (rapidsai/raft#2732 (comment)).

rgsl888prabhu · 2025-08-22T15:14:12Z

/ok to test 35450be

rgsl888prabhu · 2025-08-22T15:15:00Z

@akifcorduk @legrosbuffle I fixed a merge conflict in the last commit; please revert if there are any mistakes.

legrosbuffle · 2025-08-25T13:53:27Z

I can't reproduce the issue in 25.10. Instead, the code fails later on a different OOB access. The bug in the latter is more obvious, and the fix is here: #346

legrosbuffle requested a review from a team as a code owner June 26, 2025 12:28

legrosbuffle requested review from Kh4ster and akifcorduk June 26, 2025 12:28

legrosbuffle mentioned this pull request Jun 26, 2025

[BUG] Lots of array out-of-bound accesses #150

Closed

anandhkb added this to the 25.08 milestone Jul 1, 2025

akifcorduk reviewed Jul 3, 2025

akifcorduk added the bug and non-breaking labels Jul 4, 2025

rgsl888prabhu added 2 commits July 17, 2025 11:45

Merge branch 'branch-25.08' into fix-oob-solution

bb772cc

Merge branch 'branch-25.08' into fix-oob-solution

4cb0d22

Merge branch 'branch-25.08' into fix-oob-solution

11eadc5

tmckayus modified the milestones: 25.08, 25.10 Jul 31, 2025

rgsl888prabhu assigned akifcorduk Aug 18, 2025

rgsl888prabhu added the awaiting response label Aug 18, 2025

rgsl888prabhu assigned legrosbuffle and unassigned akifcorduk Aug 18, 2025

legrosbuffle removed their assignment Aug 18, 2025

rgsl888prabhu requested a review from akifcorduk August 18, 2025 14:00

rgsl888prabhu changed the base branch from branch-25.08 to branch-25.10 August 22, 2025 15:09

Merge branch 'branch-25.10' into fix-oob-solution

35450be

legrosbuffle closed this Aug 25, 2025
