@akifcorduk
Contributor

This PR fixes a log level and a compilation error that occurs in non-CI environments.

@akifcorduk akifcorduk added this to the 25.10 milestone Oct 6, 2025
@akifcorduk akifcorduk requested a review from a team as a code owner October 6, 2025 16:55
@akifcorduk akifcorduk added bug Something isn't working non-breaking Introduces a non-breaking change labels Oct 6, 2025
Contributor

@hlinsen hlinsen left a comment

Thanks Akif. Do you think we could also disable postsolve logs or change the log level https://github.com/NVIDIA/cuopt/blob/branch-25.10/cpp/src/mip/presolve/third_party_presolve.cpp#L311?

@akifcorduk
Contributor Author

@hlinsen IMO, we can keep it. The user would know why we are returning a slightly infeasible solution. But I think we can get @chris-maes's opinion too.
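
To make the two options in this exchange concrete (disabling the postsolve logs versus changing their level), here is a minimal sketch of threshold-based log filtering. It is not cuOpt's logger: `simple_logger`, `log_level`, and the placeholder messages in the usage note are hypothetical names chosen for illustration only.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical severity levels, ordered from most to least verbose.
enum class log_level { debug = 0, info = 1, warn = 2, error = 3 };

// Hypothetical minimal logger: a message is written only when its level
// is at or above the configured threshold.
class simple_logger {
 public:
  simple_logger(log_level threshold, std::ostream& out) : threshold_(threshold), out_(out) {}

  // Returns true if the message passed the threshold and was written.
  bool log(log_level level, const std::string& message)
  {
    if (level < threshold_) { return false; }  // filtered out, nothing printed
    out_ << message << '\n';
    return true;
  }

 private:
  log_level threshold_;
  std::ostream& out_;
};
```

With a threshold of `log_level::info`, a call such as `logger.log(log_level::info, "placeholder postsolve message")` is printed, while the same message demoted to `log_level::debug` is dropped. Keeping a message at the default-visible level is what lets a user see why a slightly infeasible solution was returned; demoting it hides that explanation unless verbose logging is enabled.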

@rgsl888prabhu
Collaborator

@akifcorduk I'm not sure why, or even where, it is failing, but this test run has failed several times; I think it has been run at least 4 times. Please take a look at the earlier runs to see if you can find any clue: https://github.com/NVIDIA/cuopt/actions/runs/18288323659/job/52087939557?pr=450

At the top you can see the attempts, where you can switch to previous runs.

@nguidotti
Contributor

nguidotti commented Oct 7, 2025

Some test runs return a SIGSEGV, but the failing location differs from attempt to attempt...

Latest:

 [ RUN      ] level0_ges/double_test_vrp.GES_VRP/0
double free or corruption (out)
timeout: the monitored command dumped core
ci/test_cpp.sh: line 61:  1634 Aborted                 timeout 20m "${gt}" --gtest_output=xml:"{RAPIDS_TESTS_DIR}"

3rd attempt:

 [ RUN      ] pdlp_class.test_max_with_offset
cuOpt version: 25.10.0, git hash: 2fae644, host arch: x86_64, device archs: 70-real,75-real,80-real,86-real,90a-real,100f-real,120a-real,120
CPU: AMD EPYC 9554 64-Core Processor, threads (physical/logical): 12/12, RAM: 42.45 GiB
CUDA 12.2, device: NVIDIA L4 (ID 0), VRAM: 21.96 GiB
CUDA device UUID: ffffff984a044b-0337-ffffff80ffffffc5

Third-party presolve is disabled, skipping
Solving a problem with 1 constraints 3 variables (0 integers) and 3 nonzeros
Objective offset 4.000000 scaling_factor -1.000000
   Iter    Primal Obj.      Dual Obj.    Gap        Primal Res.  Dual Res.   Time
      0 -4.00000000e+00 -4.00000000e+00  0.00e+00   0.00e+00     3.74e+00   0.003s
    210 -7.25214461e-05 -7.23309242e-05  1.91e-07   0.00e+00     1.23e-04   0.012s
LP Solver status:                Optimal
Primal objective:                -7.25214461e-05
Dual objective:                  -7.23309242e-05
Duality gap (abs/rel):           +1.91e-07 / +1.90e-07
Primal infeasibility (abs/rel):  +0.00e+00 / +0.00e+00
Dual infeasibility (abs/rel):    +1.23e-04 / +2.60e-05
PDLP finished
Status: Optimal   Objective: -7.25214461e-05  Iterations: 210  Time: 0.012s, Total time 0.013s
[       OK ] pdlp_class.test_max_with_offset (17 ms)
[ RUN      ] pdlp_class.test_lp_no_constraints
cuOpt version: 25.10.0, git hash: 2fae644, host arch: x86_64, device archs: 70-real,75-real,80-real,86-real,90a-real,100f-real,120a-real,120
CPU: AMD EPYC 9554 64-Core Processor, threads (physical/logical): 12/12, RAM: 42.45 GiB
CUDA 12.2, device: NVIDIA L4 (ID 0), VRAM: 21.96 GiB
CUDA device UUID: ffffff984a044b-0337-ffffff80ffffffc5

Third-party presolve is disabled, skipping
Solving a problem with 0 constraints 1 variables (0 integers) and 0 nonzeros
Objective offset -0.000000 scaling_factor -1.000000
Running concurrent

No constraints in the problem: PDLP can't be run, use Dual Simplex instead.
PDLP finished
Status: A numerical error was encountered.   Objective: nan  Iterations: -1  Time: 0.000s, Total time 0.001s
Handling removed variables 1
Dual simplex finished in 0.00 seconds, total time 0.00
terminate called after throwing an instance of 'thrust::system::system_error'
  what():  parallel_for failed: cudaErrorInvalidConfiguration: invalid configuration argument
timeout: the monitored command dumped core
ci/test_cpp.sh: line 61:  1595 Aborted                 timeout 20m "${gt}" --gtest_output=xml:"{RAPIDS_TESTS_DIR}"

akifcorduk and others added 3 commits October 7, 2025 04:19
This PR fixes the delayed termination of the branch-and-bound algorithm after reaching the time limit.

Authors:
  - Nicolas L. Guidotti (https://github.com/nguidotti)

Approvers:
  - Chris Maes (https://github.com/chris-maes)

URL: NVIDIA#451
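
For readers unfamiliar with the issue named in that commit message, the sketch below shows one generic way a tree search can stop promptly at a time limit: the deadline is checked before every node is processed. This is an illustration under stated assumptions, not the change made in that PR; `run_search`, `bnb_status`, `bnb_result`, and the `process_node` callback are hypothetical names.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <queue>

// Hypothetical outcome of the search loop.
enum class bnb_status { finished, time_limit };

struct bnb_result {
  bnb_status status;
  std::size_t nodes_processed;
};

// Hypothetical search loop. process_node receives one open node and may push
// child nodes onto the queue. The deadline is checked before each node, so the
// loop overshoots the limit by at most the cost of a single node.
inline bnb_result run_search(std::queue<int>& open_nodes,
                             std::chrono::milliseconds time_limit,
                             const std::function<void(int, std::queue<int>&)>& process_node)
{
  using clock = std::chrono::steady_clock;  // monotonic, unaffected by wall-clock changes
  const auto deadline = clock::now() + time_limit;
  std::size_t processed = 0;

  while (!open_nodes.empty()) {
    if (clock::now() >= deadline) { return {bnb_status::time_limit, processed}; }
    const int node = open_nodes.front();
    open_nodes.pop();
    process_node(node, open_nodes);
    ++processed;
  }
  return {bnb_status::finished, processed};
}
```

If the deadline were checked only after a batch of nodes, or only in an outer phase of the solver, termination could lag behind the limit by the duration of that batch; checking per node trades a cheap clock read for a tighter bound on the overshoot.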
@akifcorduk
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit cfd982f into NVIDIA:branch-25.10 Oct 7, 2025
88 checks passed
