Skip to content

Conversation

@nguidotti
Copy link
Contributor

@nguidotti nguidotti commented Dec 15, 2025

This PR introduces the following changes:

  • Impose a node, iteration and backtrack limit to force the exploration of different branches of the tree.
  • Unified the diving and best-first heap into a single object (node_queue) to allow the sharing of information between them. This also greatly reduces memory consumption (33GB vs 48GB for neos-848589 after 250s) since the lower and upper variable no longer needs to be stored for diving.
  • Order the node in the diving heap based on the pseudocost estimate [1, Section 6.4].

The performance remains the same as the main branch: 226 feasible solutions with ~12.8% primal gap on a GH200 for the MIPLIB2017 dataset.

This PR was split from #697 to be easier to review.

Reference:

[1] T. Achterberg, “Constraint Integer Programming,” PhD, Technischen Universität Berlin,
Berlin, 2007. doi: 10.14279/depositonce-1634.

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

Summary by CodeRabbit

  • Bug Fixes

    • Improved iteration-limit handling across solver threads to prevent runaway solves and surface iteration-limit outcomes.
    • Switched to consistent debug-style logging for clearer diagnostics.
  • Refactor

    • Replaced multiple legacy heap/queue mechanisms with a unified, thread-safe node-scheduling queue to reduce contention and simplify best-first/diving workflows.
    • Removed the specialized diving queue and related mutexed structures.
  • New Features

    • Nodes now carry an objective estimate to better guide branching.
    • Per-call statistics capture timing, node counts, and iteration metrics for diagnostics.
  • API

    • Added an objective-estimate helper to compute branch scoring and expose estimate results.

✏️ Tip: You can customize this high-level summary in your review settings.

@nguidotti nguidotti added this to the 26.02 milestone Dec 15, 2025
@nguidotti nguidotti self-assigned this Dec 15, 2025
@nguidotti nguidotti requested a review from a team as a code owner December 15, 2025 17:36
@nguidotti nguidotti added non-breaking Introduces a non-breaking change improvement Improves an existing functionality labels Dec 15, 2025
@nguidotti nguidotti added the mip label Dec 15, 2025
@nguidotti nguidotti changed the title Node queue Unified Node Queue + Diving Node and Iteration Limits Dec 15, 2025
@nguidotti nguidotti requested a review from chris-maes December 15, 2025 17:37
@coderabbitai
Copy link

coderabbitai bot commented Dec 15, 2025

📝 Walkthrough

Walkthrough

Replaces heap/diving-queue node management with a thread-safe dual-heap node_queue, introduces bnb_stats_t and iteration-limit handling, adds per-node objective_estimate and a variable-selection API that returns it, removes diving_queue, and updates logging/locking and pseudo-costs with an obj_estimate API.

Changes

Cohort / File(s) Summary
Node queue & heap utility
cpp/src/dual_simplex/node_queue.hpp
New header: heap_t generic heap and node_queue_t implementing synchronized best-first (lower_bound) and diving (score) heaps, thread-safe push/pop, lower-bound query, and size accessors.
Branch-and-bound core
cpp/src/dual_simplex/branch_and_bound.hpp, cpp/src/dual_simplex/branch_and_bound.cpp
Replace heap/diving-queue with node_queue_; add bnb_stats_t and use it for exploration_stats_; rename/replace explore_subtreeplunge_from and add dive_from; solve_node(...) signature adds bnb_stats_t& stats; switch to variable_selection_and_obj_estimate(...) storing node->objective_estimate; propagate ITERATION_LIMIT; replace direct heap ops with node_queue APIs; aggregate lower bounds via node_queue.
Diving queue removal
cpp/src/dual_simplex/diving_queue.hpp
Deleted file: removes diving_root_t and diving_queue_t min-heap and its public API (push/emplace/pop/size/top/clear).
Node data changes
cpp/src/dual_simplex/mip_node.hpp
Add public objective_estimate (initialized to +inf), propagate from parent constructors and in detach_copy(); remove old node_compare_t.
Variable selection / pseudo-costs
cpp/src/dual_simplex/pseudo_costs.hpp, cpp/src/dual_simplex/pseudo_costs.cpp
Add obj_estimate(...) API computing an objective estimate from pseudo-costs; replace manual mutex lock/unlock with std::lock_guard<omp_mutex_t>; convert some printf logs to log.debug; integrate new estimate into selection flow.
Bounds strengthening logging
cpp/src/dual_simplex/bounds_strengthening.cpp
Replace printf logging calls with log.debug for infeasibility and post-update checks; no behavioral changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.32% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the three main changes: introducing a unified node queue, diving node ordering, and iteration limits—matching the stated PR objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

🧹 Recent nitpick comments
cpp/src/dual_simplex/pseudo_costs.cpp (2)

317-360: Review the epsilon-clamping approach in the objective estimate formula.

The implementation uses std::min(std::max(pseudo_cost_down * f_down, eps), std::max(pseudo_cost_up * f_up, eps)), which applies the epsilon floor to each term before taking the minimum. This differs from Achterberg's formula which takes min(Ψ_j^- * f_j^-, Ψ_j^+ * f_j^+) directly.

If a pseudo-cost product is very small (e.g., 1e-8), flooring it to eps (1e-6) before the min operation could overestimate the contribution. Consider whether this behavior is intentional for numerical stability or should be adjusted:

-    estimate +=
-      std::min(std::max(pseudo_cost_down * f_down, eps), std::max(pseudo_cost_up * f_up, eps));
+    f_t down_term = pseudo_cost_down * f_down;
+    f_t up_term = pseudo_cost_up * f_up;
+    f_t contribution = std::max(std::min(down_term, up_term), eps);
+    estimate += contribution;

Additionally, consider adding a check for non-finite values to prevent NaN/inf from corrupting the estimate:

if (!std::isfinite(contribution)) { contribution = eps; }

335-355: Consider extracting common pseudo-cost lookup logic.

The pseudo-cost retrieval pattern (lines 340-350) duplicates the logic in variable_selection (lines 282-292). A small helper could reduce this duplication:

std::pair<f_t, f_t> get_pseudo_costs(i_t j, f_t pseudo_cost_down_avg, f_t pseudo_cost_up_avg) const {
  f_t down = pseudo_cost_num_down[j] != 0 
             ? pseudo_cost_sum_down[j] / pseudo_cost_num_down[j] 
             : pseudo_cost_down_avg;
  f_t up = pseudo_cost_num_up[j] != 0 
           ? pseudo_cost_sum_up[j] / pseudo_cost_num_up[j] 
           : pseudo_cost_up_avg;
  return {down, up};
}
cpp/src/dual_simplex/node_queue.hpp (2)

116-137: Inconsistent locking pattern between pop methods and other methods.

pop_best_first() and pop_diving() do not acquire the mutex internally, requiring callers to use lock()/unlock() explicitly. However, push(), get_lower_bound(), and the size methods acquire the lock internally. This inconsistency is error-prone.

The current design requires careful coordination:

node_queue_.lock();
auto node = node_queue_.pop_best_first();
// ... use node ...
node_queue_.unlock();

While the comment explains the rationale (need to increment active_subtree atomically with pop), consider either:

  1. Documenting this requirement prominently in the class
  2. Providing a combined pop_best_first_locked() that returns while holding the lock, with an RAII guard

The pop_diving() loop correctly skips entries nullified by pop_best_first(), which is good.


36-41: Minor: Use std::forward<Args>(args)... without extra &&.

The emplace method has a minor forwarding issue:

   template <typename... Args>
   void emplace(Args&&... args)
   {
-    buffer.emplace_back(std::forward<Args&&>(args)...);
+    buffer.emplace_back(std::forward<Args>(args)...);
     std::push_heap(buffer.begin(), buffer.end(), comp);
   }

std::forward<Args> is the correct pattern; std::forward<Args&&> is redundant.

cpp/src/dual_simplex/branch_and_bound.cpp (2)

1088-1089: Consider making diving limits configurable.

The diving_node_limit = 500 and diving_backtrack = 5 are hard-coded. While the PR author noted this is temporary pending PR #697, consider at minimum extracting these as named constants at namespace/class scope for visibility.


1123-1127: Simplify dive_stats initialization.

nodes_unexplored is initialized to 0 and never used in the diving loop. Consider removing unused fields or documenting why they're present:

       bnb_stats_t<i_t, f_t> dive_stats;
       dive_stats.total_lp_iters      = 0;
       dive_stats.total_lp_solve_time = 0;
       dive_stats.nodes_explored      = 0;
-      dive_stats.nodes_unexplored    = 0;

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a36bf03 and 640d378.

📒 Files selected for processing (5)
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/node_queue.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...

Files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/node_queue.hpp
**/*.{h,hpp,py}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Verify C API does not break ABI stability (no struct layout changes, field reordering); maintain backward compatibility in Python and server APIs with deprecation warnings

Files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/node_queue.hpp
**/*.{cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state

Files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/node_queue.hpp
**/*.{cu,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/node_queue.hpp
🧠 Learnings (21)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/node_queue.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/node_queue.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
📚 Learning: 2025-12-04T04:11:12.640Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 500
File: cpp/src/dual_simplex/scaling.cpp:68-76
Timestamp: 2025-12-04T04:11:12.640Z
Learning: In the cuOPT dual simplex solver, CSR/CSC matrices (including the quadratic objective matrix Q) are required to have valid dimensions and indices by construction. Runtime bounds checking in performance-critical paths like matrix scaling is avoided to prevent slowdowns. Validation is performed via debug-only check_matrix() calls wrapped in `#ifdef` CHECK_MATRIX.

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-12-04T20:09:09.264Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 602
File: cpp/src/linear_programming/solve.cu:732-742
Timestamp: 2025-12-04T20:09:09.264Z
Learning: In cpp/src/linear_programming/solve.cu, the barrier solver does not currently return INFEASIBLE or UNBOUNDED status. It only returns OPTIMAL, TIME_LIMIT, NUMERICAL_ISSUES, or CONCURRENT_LIMIT.

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
📚 Learning: 2026-01-14T00:38:33.700Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 746
File: cpp/src/dual_simplex/barrier.cu:549-552
Timestamp: 2026-01-14T00:38:33.700Z
Learning: In cpp/src/dual_simplex/barrier.cu's form_adat() method within the barrier solver, the raft::copy from d_original_A_values to device_AD.x is necessary and must not be removed. This copy restores the original matrix A values before they are scaled by the diagonal matrix D (via d_inv_diag_prime). Since D changes between iterations and the scaling mutates device_AD.x, copying the original values each iteration is required for correctness.

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Ensure test isolation: prevent GPU state, cached memory, and global variables from leaking between test cases; verify each test independently initializes its environment

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
🧬 Code graph analysis (2)
cpp/src/dual_simplex/branch_and_bound.cpp (3)
cpp/src/dual_simplex/branch_and_bound.hpp (6)
  • node_ptr (252-264)
  • node_ptr (267-267)
  • node (212-215)
  • task_id (220-227)
  • task_id (231-233)
  • start_node (237-245)
cpp/src/dual_simplex/mip_node.hpp (6)
  • node_ptr (277-283)
  • node_ptr (277-277)
  • log (329-337)
  • log (329-332)
  • log (339-354)
  • log (339-344)
cpp/src/dual_simplex/node_queue.hpp (4)
  • node (24-28)
  • node (24-24)
  • node (30-34)
  • node (30-30)
cpp/src/dual_simplex/pseudo_costs.cpp (1)
cpp/src/dual_simplex/pseudo_costs.hpp (4)
  • fractional (46-49)
  • fractional (51-53)
  • fractional (55-56)
  • num_initialized_down (41-44)
🪛 Clang (14.0.6)
cpp/src/dual_simplex/node_queue.hpp

[error] 8-8: 'algorithm' file not found

(clang-diagnostic-error)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: wheel-build-cuopt-server / 12.9.1, 3.13, amd64, rockylinux8
  • GitHub Check: conda-cpp-build / 12.9.1, 3.10, arm64, rockylinux8
  • GitHub Check: conda-cpp-build / 13.1.0, 3.10, amd64, rockylinux8
  • GitHub Check: conda-cpp-build / 12.9.1, 3.10, amd64, rockylinux8
  • GitHub Check: conda-cpp-build / 13.1.0, 3.10, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-sh-client / 13.1.0, 3.10, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.10, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.12, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.11, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.12, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.13, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.10, amd64, rockylinux8
🔇 Additional comments (11)
cpp/src/dual_simplex/pseudo_costs.hpp (1)

46-49: New obj_estimate API declaration looks good.

The function signature follows the existing patterns in the class and addresses the previous review feedback about separating the objective estimate computation from variable selection.

cpp/src/dual_simplex/pseudo_costs.cpp (1)

202-202: RAII mutex management is a good improvement.

Using std::lock_guard<omp_mutex_t> instead of manual lock/unlock improves exception safety and reduces risk of deadlocks from early returns.

cpp/src/dual_simplex/node_queue.hpp (1)

63-76: LGTM: Dual-heap entry design for memory sharing.

The heap_entry_t with shared_ptr allows both heaps to reference the same node, enabling the memory-saving optimization mentioned in the PR objectives (33 GB vs 48 GB). The nullification pattern in pop_best_first correctly prevents double-processing.

cpp/src/dual_simplex/branch_and_bound.hpp (1)

176-177: Good: Unified node queue replaces separate heap and diving queue.

The node_queue_t abstraction simplifies node management and enables memory sharing between best-first and diving strategies as described in the PR objectives.

cpp/src/dual_simplex/branch_and_bound.cpp (7)

600-605: Iteration limit logic for diving threads looks correct.

The formula 0.05 * bnb_lp_iters - stats.total_lp_iters limits diving threads to 5% of total branch-and-bound LP iterations, preventing them from consuming too many resources. The early return for non-positive limits is appropriate.


735-739: Good: Objective estimate computed after variable selection.

Computing objective_estimate here (only for exploration threads that insert into the heap) is efficient since diving threads don't need it for their node ordering.


1020-1026: Correct lock/unlock pattern around pop_best_first.

The explicit locking ensures active_subtrees_++ is atomic with the pop operation, preventing race conditions where threads might exit early thinking the solver has finished.


1030-1037: Good: Fathom check before expensive operations.

Checking get_upper_bound() < node->lower_bound before starting the plunge avoids unnecessary bound strengthening and LP solves on already-fathomed nodes.


1100-1109: Thread-safe node detachment for diving.

The pattern of copying variable bounds and detaching the node while holding the lock is correct. This prevents the original node from being fathomed (deleted) before the diving thread reads its bounds.


1176-1180: Backtrack depth limiting is well-documented.

The comment explains the purpose clearly: limiting backtracks to nodes within 5 levels of the current depth encourages deep dives rather than broad exploration. This aligns with the PR's goal of finding feasible solutions quickly.


1466-1468: Final lower bound calculation handles empty queue correctly.

Using node_queue_.best_first_queue_size() > 0 to guard the get_lower_bound() call and falling back to search_tree_.root.lower_bound prevents issues when all nodes have been explored.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
cpp/src/dual_simplex/node_queue.hpp (1)

36-41: Minor: Redundant && in std::forward.

std::forward<Args&&> works but std::forward<Args> is the canonical form.

   template <typename... Args>
   void emplace(Args&&... args)
   {
-    buffer.emplace_back(std::forward<Args&&>(args)...);
+    buffer.emplace_back(std::forward<Args>(args)...);
     std::push_heap(buffer.begin(), buffer.end(), comp);
   }
cpp/src/dual_simplex/branch_and_bound.cpp (4)

599-604: Potential early termination before any LP work is done.

The iteration limit is calculated as 0.05 * bnb_lp_iters - stats.total_lp_iters. On the first call when both values are small, this could be zero or negative, causing immediate ITERATION_LIMIT return before any LP iterations are performed. This could cause diving threads to spin without making progress.

Consider adding a minimum iteration allowance:

   if (thread_type != thread_type_t::EXPLORATION) {
     i_t bnb_lp_iters            = exploration_stats_.total_lp_iters;
     f_t max_iter                = 0.05 * bnb_lp_iters;
-    lp_settings.iteration_limit = max_iter - stats.total_lp_iters;
-    if (lp_settings.iteration_limit <= 0) { return node_solve_info_t::ITERATION_LIMIT; }
+    lp_settings.iteration_limit = std::max(static_cast<f_t>(100), max_iter - stats.total_lp_iters);
+    if (max_iter > 0 && stats.total_lp_iters >= max_iter) { return node_solve_info_t::ITERATION_LIMIT; }
   }

1127-1131: Consider documenting initialization of dive_stats.

The explicit initialization of dive_stats fields to zero is clear, but since bnb_stats_t already has default member initializers (line 72-76 in header), this could be simplified. However, the explicit initialization is not incorrect and provides clarity.

       bnb_stats_t<i_t, f_t> dive_stats;
-      dive_stats.total_lp_iters      = 0;
-      dive_stats.total_lp_solve_time = 0;
-      dive_stats.nodes_explored      = 0;
-      dive_stats.nodes_unexplored    = 0;
+      // dive_stats uses default initialization from bnb_stats_t

1144-1145: Consider making dive limits configurable.

The hardcoded limits (time limit check and dive_stats.nodes_explored > 500) are reasonable heuristics but could benefit from being configurable via simplex_solver_settings_t for tuning on different problem types.


1180-1182: Consider documenting the sibling pruning heuristic.

The condition stack.front()->depth - stack.back()->depth > 5 prunes siblings that are more than 5 levels apart. This is a valid heuristic to keep dives focused, but a brief comment explaining the rationale would improve maintainability.

+        // Prune distant siblings to keep the dive focused on a single path
         if (stack.size() > 1 && stack.front()->depth - stack.back()->depth > 5) {
           stack.pop_back();
         }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f44e3ec and a26a816.

📒 Files selected for processing (8)
  • cpp/src/dual_simplex/bounds_strengthening.cpp (2 hunks)
  • cpp/src/dual_simplex/branch_and_bound.cpp (18 hunks)
  • cpp/src/dual_simplex/branch_and_bound.hpp (6 hunks)
  • cpp/src/dual_simplex/diving_queue.hpp (0 hunks)
  • cpp/src/dual_simplex/mip_node.hpp (4 hunks)
  • cpp/src/dual_simplex/node_queue.hpp (1 hunks)
  • cpp/src/dual_simplex/pseudo_costs.cpp (5 hunks)
  • cpp/src/dual_simplex/pseudo_costs.hpp (1 hunks)
💤 Files with no reviewable changes (1)
  • cpp/src/dual_simplex/diving_queue.hpp
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...

Files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/bounds_strengthening.cpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/node_queue.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{h,hpp,py}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Verify C API does not break ABI stability (no struct layout changes, field reordering); maintain backward compatibility in Python and server APIs with deprecation warnings

Files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/node_queue.hpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
**/*.{cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state

Files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/bounds_strengthening.cpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/node_queue.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cu,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/bounds_strengthening.cpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/node_queue.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
🧠 Learnings (10)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/bounds_strengthening.cpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/node_queue.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)

Applied to files:

  • cpp/src/dual_simplex/bounds_strengthening.cpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/node_queue.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover

Applied to files:

  • cpp/src/dual_simplex/bounds_strengthening.cpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)

Applied to files:

  • cpp/src/dual_simplex/bounds_strengthening.cpp
📚 Learning: 2025-12-04T20:09:09.264Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 602
File: cpp/src/linear_programming/solve.cu:732-742
Timestamp: 2025-12-04T20:09:09.264Z
Learning: In cpp/src/linear_programming/solve.cu, the barrier solver does not currently return INFEASIBLE or UNBOUNDED status. It only returns OPTIMAL, TIME_LIMIT, NUMERICAL_ISSUES, or CONCURRENT_LIMIT.

Applied to files:

  • cpp/src/dual_simplex/node_queue.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-12-04T04:11:12.640Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 500
File: cpp/src/dual_simplex/scaling.cpp:68-76
Timestamp: 2025-12-04T04:11:12.640Z
Learning: In the cuOPT dual simplex solver, CSR/CSC matrices (including the quadratic objective matrix Q) are required to have valid dimensions and indices by construction. Runtime bounds checking in performance-critical paths like matrix scaling is avoided to prevent slowdowns. Validation is performed via debug-only check_matrix() calls wrapped in #ifdef CHECK_MATRIX.

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
🧬 Code graph analysis (3)
cpp/src/dual_simplex/node_queue.hpp (1)
cpp/src/dual_simplex/mip_node.hpp (4)
  • lower (97-108)
  • lower (97-99)
  • lower (112-130)
  • lower (112-114)
cpp/src/dual_simplex/branch_and_bound.cpp (4)
cpp/src/dual_simplex/branch_and_bound.hpp (3)
  • node_ptr (245-257)
  • node_ptr (260-260)
  • node (209-212)
cpp/src/dual_simplex/mip_node.hpp (6)
  • node_ptr (277-283)
  • node_ptr (277-277)
  • log (329-337)
  • log (329-332)
  • log (339-354)
  • log (339-344)
cpp/src/dual_simplex/pseudo_costs.hpp (3)
  • node_ptr (31-31)
  • fractional (46-49)
  • fractional (51-52)
cpp/src/dual_simplex/node_queue.hpp (4)
  • node (24-28)
  • node (24-24)
  • node (30-34)
  • node (30-30)
cpp/src/dual_simplex/pseudo_costs.cpp (1)
cpp/src/dual_simplex/pseudo_costs.hpp (3)
  • fractional (46-49)
  • fractional (51-52)
  • num_initialized_down (41-44)
🪛 Cppcheck (2.18.0)
cpp/src/dual_simplex/pseudo_costs.cpp

[warning] 320-320: Array index -1 is out of bounds.

(negativeContainerIndex)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-sh-client / 13.0.2, 3.10, amd64, rockylinux8
  • GitHub Check: checks / check-style
🔇 Additional comments (30)
cpp/src/dual_simplex/bounds_strengthening.cpp (2)

157-165: LGTM: Debug logging for infeasibility detection.

Switching to settings.log.debug() aligns with the project's logging conventions and keeps diagnostic output manageable in production.


214-216: LGTM: Consistent debug logging for bound violation.

Same pattern applied for the variable bound infeasibility check.

cpp/src/dual_simplex/pseudo_costs.hpp (1)

46-49: LGTM: API update for combined variable selection and objective estimation.

The new signature returning std::pair<i_t, f_t> cleanly provides both the branch variable and the objective estimate in a single call, reducing redundant computation and aligning with the pseudocost-based node ordering in the diving heap.

cpp/src/dual_simplex/mip_node.hpp (3)

48-49: LGTM: Proper default initialization of objective_estimate.

Initializing to infinity is correct—nodes start with the worst possible estimate until computed.


85-86: LGTM: Parent-to-child propagation of objective_estimate.

Propagating the parent's estimate to children is correct; the estimate will be updated when the child node is processed by variable_selection_and_obj_estimate.


233-240: LGTM: detach_copy preserves objective_estimate.

The detached copy correctly includes all node state including the new objective_estimate field.

cpp/src/dual_simplex/pseudo_costs.cpp (6)

202-202: LGTM: RAII-based locking.

Using std::lock_guard instead of manual lock/unlock ensures the mutex is released even if an exception occurs.


262-262: LGTM: Thread-safe variable selection.

Acquiring the lock protects the pseudo-cost data structures during read operations.


268-268: LGTM: Objective estimate initialization.

Starting the estimate from lower_bound is correct—the pseudocost contributions are then added to this base.


301-304: LGTM: Pseudocost-based objective estimate accumulation.

The estimate uses min(down_cost, up_cost) per fractional variable, which is a standard optimistic estimate for the child node's objective (assuming optimal branching direction). This aligns with Achterberg's approach cited in the PR description.


323-323: LGTM: Return type matches new API.

Returning the pair {branch_var, estimate} correctly fulfills the new variable_selection_and_obj_estimate contract.


306-321: Undefined behavior concern is invalid—both callers guarantee non-empty fractional.

The first call site in solve_node() is protected by an else if that only executes when leaf_num_fractional != 0 (line 17). The second call site in root solving checks if (num_fractional == 0) and returns early (lines 13–35), ensuring variable_selection_and_obj_estimate() is never called with empty fractional. The suggested defensive guard is unnecessary.

Likely an incorrect or invalid review comment.

cpp/src/dual_simplex/node_queue.hpp (5)

18-61: LGTM: Clean STL-based heap wrapper.

The generic heap_t correctly leverages std::push_heap/std::pop_heap while exposing the underlying container for direct access when needed.


67-76: LGTM: Shared heap entry design.

Using heap_entry_t with shared_ptr allows both heaps to reference the same node without duplication, directly supporting the PR's memory reduction goal.


103-109: LGTM: Dual-heap push with thread safety.

Pushing to both heaps under a single lock ensures consistent state and avoids partial updates.


134-152: LGTM: pop_diving correctly handles consumed entries.

The loop skips entries where node_ptr is nullptr (already consumed via pop_best_first), and extracts bounds before returning the detached copy to avoid races with node fathoming.


166-170: LGTM: Safe lower bound retrieval.

Returns inf for an empty heap, which is the correct semantics (no nodes means no finite lower bound).

cpp/src/dual_simplex/branch_and_bound.hpp (3)

70-81: LGTM on the new bnb_stats_t structure.

The use of omp_atomic_t for thread-safe counters (total_lp_solve_time, nodes_explored, nodes_unexplored, total_lp_iters) is appropriate for the multi-threaded B&B context. The non-atomic start_time is acceptable since it's initialized once before the parallel region.

Minor observation: last_log and nodes_since_last_log are marked atomic but the comment states they're only used by the main thread. This is safe but slightly over-synchronized.


214-239: Approve updated traversal method signatures.

The new plunge_from and dive_from signatures properly separate shallow plunging (best-first with bounded depth) from deep diving. The dive_from taking explicit start_lower/start_upper bounds is consistent with the memory optimization goal of not storing bounds per-node in the diving queue.


176-177: Thread-safety verification complete: node_queue_t properly synchronizes concurrent access.

The node_queue_t implementation correctly provides internal synchronization via omp_mutex_t mutex. All public methods that may be called concurrently (push, pop_best_first, pop_diving, and size/bound queries) are protected with std::lock_guard, ensuring thread-safe access to the underlying heaps and shared state.

cpp/src/dual_simplex/branch_and_bound.cpp (10)

247-255: LGTM on get_lower_bound() refactoring.

The aggregation from lower_bound_ceiling_, node_queue.get_lower_bound(), and local_lower_bounds_ correctly computes the global lower bound. The final check returning -inf for non-finite values is a good defensive practice.


680-681: LGTM on stats accumulation.

The LP solve time and iteration counts are correctly accumulated into the stats structure passed to solve_node.


725-727: Approve new variable selection API usage.

The structured binding correctly captures both branch_var and obj_estimate from the new variable_selection_and_obj_estimate API, and objective_estimate is properly stored on the node for use in diving queue ordering.


751-752: LGTM on ITERATION_LIMIT propagation.

The new ITERATION_LIMIT status is correctly propagated from the LP solver through solve_node.


881-882: LGTM on node_queue.push() usage in ramp-up.

Nodes are correctly pushed to the unified queue once the ramp-up phase has generated enough nodes to keep threads busy.


924-924: Good fix: Reset basis when node is fathomed.

Setting recompute_bounds_and_basis = true when a node is fathomed ensures the next node processed will have its bounds and basis correctly recomputed rather than relying on incremental updates from a now-invalid parent path.


1039-1063: LGTM on best_first_thread refactoring.

The transition from direct heap access to node_queue.pop_best_first() is clean. The optional-based return correctly handles the case when no node is available while other subtrees are still active.


1163-1168: LGTM on ITERATION_LIMIT handling in diving.

Breaking out of the dive loop on ITERATION_LIMIT is correct behavior - it allows the diving thread to abandon this dive and potentially pick up a new starting node rather than continuing to consume iterations on a potentially unproductive path.


1404-1405: LGTM on root variable selection with new API.

The structured binding correctly uses the new variable_selection_and_obj_estimate API at the root node, consistent with usage in solve_node.


1469-1470: LGTM on final lower bound calculation.

The conditional correctly uses node_queue.get_lower_bound() when the queue is non-empty, falling back to the root's lower bound when the tree has been fully explored.

@nguidotti nguidotti mentioned this pull request Dec 16, 2025
8 tasks
@github-actions
Copy link

🔔 Hi @anandhkb @nguidotti, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

1 similar comment
@github-actions
Copy link

github-actions bot commented Jan 2, 2026

🔔 Hi @anandhkb @nguidotti, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
cpp/src/dual_simplex/branch_and_bound.cpp (1)

1127-1131: Hardcoded limits in diving thread.

The 500-node limit per dive (line 1145) and the 5-level depth difference for stack pruning (line 1180) are hardcoded. Consider making these configurable via settings for tuning on different problem types.

🔎 Suggested improvement
+  // In simplex_solver_settings_t or similar:
+  i_t max_dive_nodes = 500;
+  i_t max_backtrack_depth = 5;
+
   // In diving_thread:
-  if (dive_stats.nodes_explored > 500) { break; }
+  if (dive_stats.nodes_explored > settings_.max_dive_nodes) { break; }
   
-  if (stack.size() > 1 && stack.front()->depth - stack.back()->depth > 5) {
+  if (stack.size() > 1 && stack.front()->depth - stack.back()->depth > settings_.max_backtrack_depth) {

Also applies to: 1144-1145

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a26a816 and 319bd22.

📒 Files selected for processing (7)
  • cpp/src/dual_simplex/bounds_strengthening.cpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/node_queue.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • cpp/src/dual_simplex/bounds_strengthening.cpp
  • cpp/src/dual_simplex/node_queue.hpp
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...

Files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{h,hpp,py}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Verify C API does not break ABI stability (no struct layout changes, field reordering); maintain backward compatibility in Python and server APIs with deprecation warnings

Files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
**/*.{cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state

Files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cu,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
🧠 Learnings (18)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)

Applied to files:

  • cpp/src/dual_simplex/mip_node.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-12-04T20:09:09.264Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 602
File: cpp/src/linear_programming/solve.cu:732-742
Timestamp: 2025-12-04T20:09:09.264Z
Learning: In cpp/src/linear_programming/solve.cu, the barrier solver does not currently return INFEASIBLE or UNBOUNDED status. It only returns OPTIMAL, TIME_LIMIT, NUMERICAL_ISSUES, or CONCURRENT_LIMIT.

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-12-04T04:11:12.640Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 500
File: cpp/src/dual_simplex/scaling.cpp:68-76
Timestamp: 2025-12-04T04:11:12.640Z
Learning: In the cuOPT dual simplex solver, CSR/CSC matrices (including the quadratic objective matrix Q) are required to have valid dimensions and indices by construction. Runtime bounds checking in performance-critical paths like matrix scaling is avoided to prevent slowdowns. Validation is performed via debug-only check_matrix() calls wrapped in #ifdef CHECK_MATRIX.

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
🧬 Code graph analysis (2)
cpp/src/dual_simplex/branch_and_bound.cpp (2)
cpp/src/dual_simplex/pseudo_costs.hpp (3)
  • node_ptr (31-31)
  • fractional (46-49)
  • fractional (51-52)
cpp/src/dual_simplex/node_queue.hpp (4)
  • node (24-28)
  • node (24-24)
  • node (30-34)
  • node (30-30)
cpp/src/dual_simplex/pseudo_costs.cpp (1)
cpp/src/dual_simplex/pseudo_costs.hpp (3)
  • fractional (46-49)
  • fractional (51-52)
  • num_initialized_down (41-44)
🪛 Cppcheck (2.19.0)
cpp/src/dual_simplex/pseudo_costs.cpp

[warning] 320-320: Array index -1 is out of bounds.

(negativeContainerIndex)

🔇 Additional comments (16)
cpp/src/dual_simplex/pseudo_costs.hpp (1)

46-49: API change looks good.

The new variable_selection_and_obj_estimate method appropriately extends the previous variable_selection interface by returning both the selected branch variable and an objective estimate. The additional lower_bound parameter enables computing the pseudocost-based objective estimate as described in Achterberg's thesis (Section 6.4).

cpp/src/dual_simplex/mip_node.hpp (2)

48-49: Proper initialization and propagation of objective_estimate.

The objective_estimate field is correctly initialized to infinity in the default and root constructors, and properly propagated from the parent node in the child constructor. This ensures the estimate is available for node scoring in the diving queue.

Also applies to: 63-64, 85-86


233-239: detach_copy correctly preserves objective_estimate.

The detached copy preserves the objective estimate, which is essential for the diving queue that operates on detached node copies.

cpp/src/dual_simplex/pseudo_costs.cpp (3)

202-202: Good use of RAII for mutex locking.

Switching from manual lock/unlock to std::lock_guard ensures the mutex is properly released even if an exception occurs, improving safety. As per coding guidelines for thread-safe code.


256-268: New API implementation correctly computes objective estimate.

The function properly initializes the estimate from lower_bound and accumulates pseudocost-based estimates for each fractional variable, following the approach described in Achterberg's thesis for diving node ordering.


306-323: Review comment is unnecessary—both callers guarantee fractional is non-empty.

The function variable_selection_and_obj_estimate is called only when fractional is guaranteed to be non-empty by the calling code:

  • At line 725: called inside else if block after if (leaf_num_fractional == 0) return
  • At line 1405: called only after if (num_fractional == 0) return

Both callers enforce this precondition through explicit early returns. The function correctly relies on this precondition—adding a defensive check is redundant.

Likely an incorrect or invalid review comment.

cpp/src/dual_simplex/branch_and_bound.hpp (3)

70-81: Well-structured statistics tracking with thread-safe atomics.

The bnb_stats_t struct appropriately uses omp_atomic_t for thread-safe statistics accumulation. The comment on line 78-79 clarifying that last_log and nodes_since_last_log are main-thread-only is helpful for maintainability.


176-177: Unified node queue simplifies node management.

Replacing the separate heap and diving queue with a unified node_queue_t member aligns with the PR objective to reduce memory consumption and share information between best-first and diving strategies.


214-239: Updated method signatures support new traversal strategy.

The renamed plunge_from (shallow exploration) and new dive_from (deep exploration) methods, along with the bnb_stats_t& stats parameter in solve_node, properly support the new node queue architecture and per-thread statistics tracking.

Also applies to: 256-257

cpp/src/dual_simplex/branch_and_bound.cpp (7)

247-255: Lower bound aggregation correctly handles all sources.

The lower bound now properly aggregates from:

  1. lower_bound_ceiling_ (for numerical issues)
  2. node_queue.get_lower_bound() (from queued nodes)
  3. Local thread bounds

The fallback to -inf for non-finite values is appropriate.


599-604: Iteration limit for diving threads prevents resource starvation.

Limiting diving thread LP iterations to 5% of total B&B iterations is a reasonable heuristic to ensure diving doesn't consume excessive resources. The early return with ITERATION_LIMIT status allows proper handling upstream.


921-926: Correctly resets bounds/basis after cutoff.

Setting recompute_bounds_and_basis = true after a node is fathomed ensures the next node in the stack gets fresh bounds computed from the root, preventing use of stale bound values.


725-727: Correctly uses new variable selection API.

The structured binding auto [branch_var, obj_estimate] properly captures both return values, and the objective estimate is stored on the node for use in diving queue ordering.


1100-1123: Diving thread properly manages bounds state for detached nodes.

The diving thread correctly:

  1. Initializes start_lower/start_upper from original bounds
  2. Resets bounds_changed markers
  3. Uses pop_diving to get a detached node with updated bounds
  4. Applies bounds strengthening before solving

This ensures bound consistency when operating on detached node copies.


1039-1065: Best-first thread correctly uses node_queue operations.

The pop_best_first with active_subtrees_ counter and the cutoff check before plunge_from properly manages the unified queue. The optional return type handles the empty queue case gracefully.


1163-1168: ITERATION_LIMIT properly handled in diving thread.

Breaking out of the dive loop on ITERATION_LIMIT prevents the diving thread from consuming more than its allocated iteration budget, allowing exploration threads to make progress.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cpp/src/dual_simplex/pseudo_costs.cpp (2)

199-214: Guard against near-zero frac and non-finite objective deltas in update_pseudo_costs

change_in_obj / frac can explode when fractional_val is numerically integral (or if bounds already force integrality), and NaN/inf deltas will poison pseudo-cost stats. As per coding guidelines, this is core optimization logic and should be numerically robust.

Proposed fix
 void pseudo_costs_t<i_t, f_t>::update_pseudo_costs(mip_node_t<i_t, f_t>* node_ptr,
                                                    f_t leaf_objective)
 {
   std::lock_guard<omp_mutex_t> lock(mutex);
-  const f_t change_in_obj = leaf_objective - node_ptr->lower_bound;
+  f_t change_in_obj = leaf_objective - node_ptr->lower_bound;
+  if (!std::isfinite(change_in_obj)) { return; }
+  // Pseudocosts model objective *increase* for minimization; clamp small/negative noise.
+  if (change_in_obj < 0) { change_in_obj = 0; }
   const f_t frac          = node_ptr->branch_dir == rounding_direction_t::DOWN
                               ? node_ptr->fractional_val - std::floor(node_ptr->fractional_val)
                               : std::ceil(node_ptr->fractional_val) - node_ptr->fractional_val;
+  constexpr f_t frac_eps = 1e-9;
+  if (!(frac > frac_eps)) { return; }
   if (node_ptr->branch_dir == rounding_direction_t::DOWN) {
     pseudo_cost_sum_down[node_ptr->branch_var] += change_in_obj / frac;
     pseudo_cost_num_down[node_ptr->branch_var]++;
   } else {
     pseudo_cost_sum_up[node_ptr->branch_var] += change_in_obj / frac;
     pseudo_cost_num_up[node_ptr->branch_var]++;
   }
 }

Based on learnings, validate algorithm correctness + numerical stability in branching/pseudocost updates.


256-317: Fix potential OOB (select == -1) and NaN propagation in variable_selection

If fractional is empty (or scores become NaN), select can remain -1, and score[select] at Line 314 becomes an OOB access (matches the static-analysis warning). Also, NaN pseudo-costs can prevent any selection.

Proposed fix
 i_t pseudo_costs_t<i_t, f_t>::variable_selection(const std::vector<i_t>& fractional,
                                                  const std::vector<f_t>& solution,
                                                  logger_t& log)
 {
   std::lock_guard<omp_mutex_t> lock(mutex);

   const i_t num_fractional = fractional.size();
+  if (num_fractional == 0) {
+    log.debug("Pseudocost branching requested with 0 fractional variables.\n");
+    return -1;
+  }
   std::vector<f_t> pseudo_cost_up(num_fractional);
   std::vector<f_t> pseudo_cost_down(num_fractional);
   std::vector<f_t> score(num_fractional);
@@
   for (i_t k = 0; k < num_fractional; k++) {
     const i_t j = fractional[k];
@@
     constexpr f_t eps = 1e-6;
     const f_t f_down  = solution[j] - std::floor(solution[j]);
     const f_t f_up    = std::ceil(solution[j]) - solution[j];
-    score[k] =
-      std::max(f_down * pseudo_cost_down[k], eps) * std::max(f_up * pseudo_cost_up[k], eps);
+    f_t s = std::max(f_down * pseudo_cost_down[k], eps) * std::max(f_up * pseudo_cost_up[k], eps);
+    if (!std::isfinite(s)) { s = -std::numeric_limits<f_t>::infinity(); }
+    score[k] = s;
   }

   i_t branch_var = fractional[0];
-  f_t max_score  = -1;
+  f_t max_score  = -std::numeric_limits<f_t>::infinity();
   i_t select     = -1;
   for (i_t k = 0; k < num_fractional; k++) {
-    if (score[k] > max_score) {
+    if (select == -1 || score[k] > max_score) {
       max_score  = score[k];
       branch_var = fractional[k];
       select     = k;
     }
   }

-  log.debug("Pseudocost branching on %d. Value %e. Score %e.\n",
-            branch_var,
-            solution[branch_var],
-            score[select]);
+  if (select >= 0) {
+    log.debug("Pseudocost branching on %d. Value %e. Score %e.\n",
+              branch_var,
+              solution[branch_var],
+              score[select]);
+  }

   return branch_var;
 }

Based on learnings, ensure branch-and-bound decisions remain correct and race-free under multi-threaded exploration.

🤖 Fix all issues with AI agents
In @cpp/src/dual_simplex/branch_and_bound.hpp:
- Around line 70-81: In bnb_stats_t, total_lp_iters is incorrectly typed as f_t;
change it to an integer type (i_t or a fixed-width integer like std::int64_t)
since it counts iterations, and remove omp_atomic_t from fields intended for
main-thread-only access (last_log and nodes_since_last_log) making them plain
f_t and i_t respectively; keep atomic wrappers for concurrent counters
(total_lp_solve_time, nodes_explored, nodes_unexplored) and preserve the “main
thread only” comment to avoid unintended synchronization overhead.

In @cpp/src/dual_simplex/pseudo_costs.cpp:
- Around line 319-361: In obj_estimate, ensure pseudo-costs can't be NaN/inf and
the returned estimate never falls below lower_bound and silence the unused
logger: validate/replace per-variable pseudo_cost_down and pseudo_cost_up
(derived from pseudo_cost_sum_down/num or the averages
pseudo_cost_down_avg/pseudo_cost_up_avg) by checking finiteness (std::isfinite)
and clamping to a non-negative minimum (e.g., >= 0) before using them, keep the
per-term contribution bounded below by eps as already done, and after the loop
set estimate = std::max(estimate, lower_bound) to avoid decreasing below the
bound; also silence the unused logger_t& log parameter (e.g., (void)log;) or use
it for a debug message. Optionally factor the pseudo-cost extraction into a
helper used by obj_estimate and variable_selection to dedupe logic.
🧹 Nitpick comments (3)
cpp/src/dual_simplex/branch_and_bound.hpp (3)

176-178: Member naming consistency: node_queue vs trailing-underscore convention
Not blocking, but it stands out among *_ members and increases grep/scan friction.


214-223: plunge_from(...) signature is getting long; consider a small context struct
A lightweight “thread_context” (leaf_problem, presolver, basis_update, lists) would reduce call-site churn as this evolves.


230-240: Same: dive_from(...) could benefit from a context struct + explicit invariants
Particularly for start_lower/start_upper (must match LP cols) and diving_type (expect only thread_type_t::DIVING?).

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 319bd22 and 43f8b31.

📒 Files selected for processing (4)
  • cpp/src/dual_simplex/branch_and_bound.cpp
  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
  • cpp/src/dual_simplex/pseudo_costs.hpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • cpp/src/dual_simplex/pseudo_costs.hpp
  • cpp/src/dual_simplex/branch_and_bound.cpp
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...

Files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{h,hpp,py}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Verify C API does not break ABI stability (no struct layout changes, field reordering); maintain backward compatibility in Python and server APIs with deprecation warnings

Files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
**/*.{cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state

Files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cu,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
🧠 Learnings (19)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-12-04T04:11:12.640Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 500
File: cpp/src/dual_simplex/scaling.cpp:68-76
Timestamp: 2025-12-04T04:11:12.640Z
Learning: In the cuOPT dual simplex solver, CSR/CSC matrices (including the quadratic objective matrix Q) are required to have valid dimensions and indices by construction. Runtime bounds checking in performance-critical paths like matrix scaling is avoided to prevent slowdowns. Validation is performed via debug-only check_matrix() calls wrapped in #ifdef CHECK_MATRIX.

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-12-04T20:09:09.264Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 602
File: cpp/src/linear_programming/solve.cu:732-742
Timestamp: 2025-12-04T20:09:09.264Z
Learning: In cpp/src/linear_programming/solve.cu, the barrier solver does not currently return INFEASIBLE or UNBOUNDED status. It only returns OPTIMAL, TIME_LIMIT, NUMERICAL_ISSUES, or CONCURRENT_LIMIT.

Applied to files:

  • cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cpp,hpp,h} : Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
🧬 Code graph analysis (1)
cpp/src/dual_simplex/pseudo_costs.cpp (1)
cpp/src/dual_simplex/pseudo_costs.hpp (4)
  • num_initialized_down (41-44)
  • fractional (46-48)
  • fractional (50-53)
  • fractional (55-56)
🪛 Cppcheck (2.19.0)
cpp/src/dual_simplex/pseudo_costs.cpp

[warning] 314-314: Array index -1 is out of bounds.

(negativeContainerIndex)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, arm64, rockylinux8
🔇 Additional comments (3)
cpp/src/dual_simplex/branch_and_bound.hpp (3)

155-157: Stats extraction to bnb_stats_t looks good
Nice cleanup vs ad-hoc per-class structs; should make it easier to thread stats through plunge/dive flows consistently.


10-13: Include update and node_queue_t synchronization are correct

node_queue_t is properly self-synchronized with internal omp_mutex_t protection (line 100) and lock_guard on all public methods (pop_best_first, etc.). The removal of old heap/diving mutex members is safe; synchronization is correctly encapsulated within the queue itself.


246-259: solve_node() stats parameter: verified across all call-sites

All three call-sites correctly pass the intended stats object:

  • Line 851 & 974 (exploration_ramp_up): pass shared exploration_stats_ member with thread_type EXPLORATION
  • Line 1154 (dive_from): passes local dive_stats variable with thread_type DIVING

Exploration threads safely share exploration_stats_ across concurrent OpenMP tasks because critical counting fields (nodes_explored, nodes_unexplored, total_lp_iters, last_log, nodes_since_last_log) are protected with omp_atomic_t<>. Diving threads use properly isolated local dive_stats per dive. No risk of cross-contamination or double-counting.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/dual_simplex/pseudo_costs.cpp (1)

300-314: Potential out-of-bounds access if fractional vector is empty.

If fractional.size() == 0, the selection loop (lines 303-309) doesn't execute, leaving select = -1. Line 314 then accesses score[select], causing undefined behavior.

🛡️ Add defensive check for empty fractional vector
+  if (num_fractional == 0) {
+    log.debug("No fractional variables for pseudocost branching\n");
+    return -1;  // or throw an exception
+  }
+
   i_t branch_var = fractional[0];
   f_t max_score  = -1;
   i_t select     = -1;

Alternatively, add an assertion before line 300:

+  assert(num_fractional > 0 && "fractional vector must not be empty");
   i_t branch_var = fractional[0];

Note: Static analysis flagged this at line 314. Based on coding guidelines for algorithm correctness in optimization logic.

🤖 Fix all issues with AI agents
In @cpp/src/dual_simplex/pseudo_costs.cpp:
- Around line 325-362: The implementation currently clamps each product with eps
before taking the minimum which alters Achterberg's formula; change the logic in
the loop that computes the contribution to estimate so you compute the two
products (pseudo_cost_down * f_down and pseudo_cost_up * f_up) and then take
their minimum directly (i.e., replace the std::min(std::max(..., eps),
std::max(..., eps)) pattern); reference the symbols pseudo_cost_down,
pseudo_cost_up, f_down, f_up and update the line that adds to estimate to use
std::min(down_prod, up_prod) (optionally clamp the final min with eps only after
taking the min if you need a nonzero floor for stability).
🧹 Nitpick comments (2)
cpp/src/dual_simplex/pseudo_costs.cpp (2)

319-362: Refactor duplicated pseudo-cost computation logic.

The obj_estimate method duplicates significant logic from variable_selection:

  • Lines 330-335 duplicate the initialized() call and variable setup (identical to lines 267-272 in variable_selection)
  • Lines 337-358 duplicate the per-variable pseudo-cost lookup pattern (similar to lines 280-298)

Consider extracting a helper method to compute per-variable pseudo-costs to reduce duplication and improve maintainability.

♻️ Proposed refactoring to eliminate duplication

Add a private helper method:

template <typename i_t, typename f_t>
void pseudo_costs_t<i_t, f_t>::compute_variable_pseudocosts(
    const std::vector<i_t>& fractional,
    std::vector<f_t>& pseudo_cost_down,
    std::vector<f_t>& pseudo_cost_up,
    f_t pseudo_cost_down_avg,
    f_t pseudo_cost_up_avg) const
{
  const i_t num_fractional = fractional.size();
  pseudo_cost_down.resize(num_fractional);
  pseudo_cost_up.resize(num_fractional);
  
  for (i_t k = 0; k < num_fractional; k++) {
    const i_t j = fractional[k];
    pseudo_cost_down[k] = (pseudo_cost_num_down[j] != 0) 
      ? pseudo_cost_sum_down[j] / pseudo_cost_num_down[j]
      : pseudo_cost_down_avg;
    pseudo_cost_up[k] = (pseudo_cost_num_up[j] != 0)
      ? pseudo_cost_sum_up[j] / pseudo_cost_num_up[j]
      : pseudo_cost_up_avg;
  }
}

Then use it in both methods to eliminate the duplicated loops.

Based on coding guidelines: refactor code duplication in solver components (3+ occurrences) into shared utilities.


319-362: Document the new public API and empty vector behavior.

The new obj_estimate method is a public API but lacks documentation explaining:

  • What objective estimate it computes (lower bound estimate based on pseudo-costs)
  • The mathematical formula used (minimum of down/up branch estimates)
  • Expected behavior when fractional is empty (returns lower_bound unchanged)

Add a comment block above the method in the header file explaining the purpose and behavior, especially for the empty vector case which returns lower_bound without modification.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 43f8b31 and a36bf03.

📒 Files selected for processing (1)
  • cpp/src/dual_simplex/pseudo_costs.cpp
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{cu,cuh,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...

Files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state

Files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cu,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
🧠 Learnings (14)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cpp,hpp,h} : Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle

Applied to files:

  • cpp/src/dual_simplex/pseudo_costs.cpp
🪛 Cppcheck (2.19.0)
cpp/src/dual_simplex/pseudo_costs.cpp

[warning] 314-314: Array index -1 is out of bounds.

(negativeContainerIndex)

🔇 Additional comments (3)
cpp/src/dual_simplex/pseudo_costs.cpp (3)

3-3: LGTM: Copyright year updated.


202-202: LGTM: RAII-based locking improves exception safety.

The replacement of manual mutex lock/unlock with std::lock_guard ensures proper cleanup even in exception paths.


260-260: LGTM: Improved locking and logging consistency.

The RAII-based locking and migration to log.debug for internal diagnostics align with best practices.

Also applies to: 274-278, 311-311

Copy link
Contributor

@chris-maes chris-maes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the comments are minor. However, I'm worried that many of the implementations details of your parallelism scheme for branch and bound are leaking into the node_queue_t class. This spreads the important logic across multiple files and makes things more difficult to understand. Is it possible to move this logic back into branch and bound and keep the node_queue_t class simple?

@nguidotti
Copy link
Contributor Author

Most of the comments are minor. However, I'm worried that many of the implementations details of your parallelism scheme for branch and bound are leaking into the node_queue_t class. This spreads the important logic across multiple files and makes things more difficult to understand. Is it possible to move this logic back into branch and bound and keep the node_queue_t class simple?

Sure. Is it now better?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality mip non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants