Unified Node Queue + Diving Node and Iteration Limits #718
…iteration and node limit to the diving threads.
📝 Walkthrough
Replaces heap/diving-queue node management with a thread-safe dual-heap
Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
🪛 Clang (14.0.6)
cpp/src/dual_simplex/node_queue.hpp
[error] 8-8: 'algorithm' file not found
(clang-diagnostic-error)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
Actionable comments posted: 1
🧹 Nitpick comments (5)
cpp/src/dual_simplex/node_queue.hpp (1)
36-41: Minor: Redundant && in std::forward.
std::forward<Args&&> works but std::forward<Args> is the canonical form.

     template <typename... Args>
     void emplace(Args&&... args) {
    -  buffer.emplace_back(std::forward<Args&&>(args)...);
    +  buffer.emplace_back(std::forward<Args>(args)...);
       std::push_heap(buffer.begin(), buffer.end(), comp);
     }

cpp/src/dual_simplex/branch_and_bound.cpp (4)
599-604: Potential early termination before any LP work is done.
The iteration limit is calculated as 0.05 * bnb_lp_iters - stats.total_lp_iters. On the first call when both values are small, this could be zero or negative, causing immediate ITERATION_LIMIT return before any LP iterations are performed. This could cause diving threads to spin without making progress.
Consider adding a minimum iteration allowance:

     if (thread_type != thread_type_t::EXPLORATION) {
       i_t bnb_lp_iters = exploration_stats_.total_lp_iters;
       f_t max_iter = 0.05 * bnb_lp_iters;
    -  lp_settings.iteration_limit = max_iter - stats.total_lp_iters;
    -  if (lp_settings.iteration_limit <= 0) { return node_solve_info_t::ITERATION_LIMIT; }
    +  lp_settings.iteration_limit = std::max(static_cast<f_t>(100), max_iter - stats.total_lp_iters);
    +  if (max_iter > 0 && stats.total_lp_iters >= max_iter) { return node_solve_info_t::ITERATION_LIMIT; }
     }
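The arithmetic behind this comment can be checked in isolation. The following is a minimal sketch, not the PR's code: the helper names and the use of plain `long`/`double` are assumptions made for illustration, while the 0.05 fraction and the floor of 100 iterations are taken from the review text above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: LP iterations a diving thread may still spend, given
// the exploration threads' running total and the dive's own total so far.
inline double remaining_dive_iterations(long exploration_iters,
                                        long dive_iters,
                                        double fraction = 0.05)
{
  return fraction * static_cast<double>(exploration_iters) -
         static_cast<double>(dive_iters);
}

// Variant with a minimum allowance, mirroring the reviewer's std::max idea.
inline double remaining_dive_iterations_with_floor(long exploration_iters,
                                                   long dive_iters,
                                                   double min_iters = 100.0,
                                                   double fraction  = 0.05)
{
  return std::max(min_iters,
                  remaining_dive_iterations(exploration_iters, dive_iters, fraction));
}
```

With no exploration iterations recorded yet, the first helper yields a non-positive budget, which is the early-exit case the comment describes; the floored variant always leaves some room for LP work.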
1127-1131: Consider documenting initialization of dive_stats.
The explicit initialization of dive_stats fields to zero is clear, but since bnb_stats_t already has default member initializers (lines 72-76 in header), this could be simplified. However, the explicit initialization is not incorrect and provides clarity.

     bnb_stats_t<i_t, f_t> dive_stats;
    - dive_stats.total_lp_iters = 0;
    - dive_stats.total_lp_solve_time = 0;
    - dive_stats.nodes_explored = 0;
    - dive_stats.nodes_unexplored = 0;
    + // dive_stats uses default initialization from bnb_stats_t
1144-1145: Consider making dive limits configurable.
The hardcoded limits (time limit check and dive_stats.nodes_explored > 500) are reasonable heuristics but could benefit from being configurable via simplex_solver_settings_t for tuning on different problem types.
1180-1182: Consider documenting the sibling pruning heuristic.
The condition stack.front()->depth - stack.back()->depth > 5 prunes siblings that are more than 5 levels apart. This is a valid heuristic to keep dives focused, but a brief comment explaining the rationale would improve maintainability.

    + // Prune distant siblings to keep the dive focused on a single path
      if (stack.size() > 1 && stack.front()->depth - stack.back()->depth > 5) {
        stack.pop_back();
      }
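The pruning condition quoted above can be exercised on a toy stack. This is a hedged sketch only: `node_t`, `prune_distant_sibling`, and the `max_gap` parameter are hypothetical names, and it assumes the deepest (most recently created) node sits at the front of a `std::deque`, which is what the `front()`/`back()` comparison in the comment suggests.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-in for the solver's node type; only depth matters here.
struct node_t {
  int depth;
};

// Drops the oldest pending node when it lies more than `max_gap` levels above
// the newest one. Returns true if a node was pruned from the stack.
inline bool prune_distant_sibling(std::deque<node_t*>& stack, int max_gap = 5)
{
  if (stack.size() > 1 && stack.front()->depth - stack.back()->depth > max_gap) {
    stack.pop_back();
    return true;
  }
  return false;
}
```

Passing the gap as a parameter also shows how the hardcoded 5 could be surfaced as a setting, which is what the neighbouring comment on configurability asks for.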
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- cpp/src/dual_simplex/bounds_strengthening.cpp (2 hunks)
- cpp/src/dual_simplex/branch_and_bound.cpp (18 hunks)
- cpp/src/dual_simplex/branch_and_bound.hpp (6 hunks)
- cpp/src/dual_simplex/diving_queue.hpp (0 hunks)
- cpp/src/dual_simplex/mip_node.hpp (4 hunks)
- cpp/src/dual_simplex/node_queue.hpp (1 hunks)
- cpp/src/dual_simplex/pseudo_costs.cpp (5 hunks)
- cpp/src/dual_simplex/pseudo_costs.hpp (1 hunks)
💤 Files with no reviewable changes (1)
- cpp/src/dual_simplex/diving_queue.hpp
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...
Files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/bounds_strengthening.cpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/node_queue.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{h,hpp,py}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Verify C API does not break ABI stability (no struct layout changes, field reordering); maintain backward compatibility in Python and server APIs with deprecation warnings
Files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/node_queue.hpp
- cpp/src/dual_simplex/branch_and_bound.hpp
**/*.{cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state
Files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/bounds_strengthening.cpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/node_queue.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cu,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/bounds_strengthening.cpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/node_queue.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
🧠 Learnings (10)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/bounds_strengthening.cpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/node_queue.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Applied to files:
- cpp/src/dual_simplex/bounds_strengthening.cpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/node_queue.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
Applied to files:
- cpp/src/dual_simplex/bounds_strengthening.cpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Applied to files:
cpp/src/dual_simplex/bounds_strengthening.cpp
📚 Learning: 2025-12-04T20:09:09.264Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 602
File: cpp/src/linear_programming/solve.cu:732-742
Timestamp: 2025-12-04T20:09:09.264Z
Learning: In cpp/src/linear_programming/solve.cu, the barrier solver does not currently return INFEASIBLE or UNBOUNDED status. It only returns OPTIMAL, TIME_LIMIT, NUMERICAL_ISSUES, or CONCURRENT_LIMIT.
Applied to files:
- cpp/src/dual_simplex/node_queue.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-12-04T04:11:12.640Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 500
File: cpp/src/dual_simplex/scaling.cpp:68-76
Timestamp: 2025-12-04T04:11:12.640Z
Learning: In the cuOPT dual simplex solver, CSR/CSC matrices (including the quadratic objective matrix Q) are required to have valid dimensions and indices by construction. Runtime bounds checking in performance-critical paths like matrix scaling is avoided to prevent slowdowns. Validation is performed via debug-only check_matrix() calls wrapped in #ifdef CHECK_MATRIX.
Applied to files:
cpp/src/dual_simplex/branch_and_bound.hpp
🧬 Code graph analysis (3)
cpp/src/dual_simplex/node_queue.hpp (1)
cpp/src/dual_simplex/mip_node.hpp (4)
lower(97-108) lower(97-99) lower(112-130) lower(112-114)
cpp/src/dual_simplex/branch_and_bound.cpp (4)
cpp/src/dual_simplex/branch_and_bound.hpp (3)
node_ptr(245-257) node_ptr(260-260) node(209-212)
cpp/src/dual_simplex/mip_node.hpp (6)
node_ptr(277-283) node_ptr(277-277) log(329-337) log(329-332) log(339-354) log(339-344)
cpp/src/dual_simplex/pseudo_costs.hpp (3)
node_ptr(31-31) fractional(46-49) fractional(51-52)
cpp/src/dual_simplex/node_queue.hpp (4)
node(24-28) node(24-24) node(30-34) node(30-30)
cpp/src/dual_simplex/pseudo_costs.cpp (1)
cpp/src/dual_simplex/pseudo_costs.hpp (3)
fractional(46-49) fractional(51-52) num_initialized_down(41-44)
🪛 Cppcheck (2.18.0)
cpp/src/dual_simplex/pseudo_costs.cpp
[warning] 320-320: Array index -1 is out of bounds.
(negativeContainerIndex)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-sh-client / 13.0.2, 3.10, amd64, rockylinux8
- GitHub Check: checks / check-style
🔇 Additional comments (30)
cpp/src/dual_simplex/bounds_strengthening.cpp (2)
157-165: LGTM: Debug logging for infeasibility detection.
Switching to settings.log.debug() aligns with the project's logging conventions and keeps diagnostic output manageable in production.
214-216: LGTM: Consistent debug logging for bound violation.
Same pattern applied for the variable bound infeasibility check.
cpp/src/dual_simplex/pseudo_costs.hpp (1)
46-49: LGTM: API update for combined variable selection and objective estimation.
The new signature returning std::pair<i_t, f_t> cleanly provides both the branch variable and the objective estimate in a single call, reducing redundant computation and aligning with the pseudocost-based node ordering in the diving heap.
cpp/src/dual_simplex/mip_node.hpp (3)
48-49: LGTM: Proper default initialization of objective_estimate.
Initializing to infinity is correct: nodes start with the worst possible estimate until computed.
85-86: LGTM: Parent-to-child propagation of objective_estimate.
Propagating the parent's estimate to children is correct; the estimate will be updated when the child node is processed by variable_selection_and_obj_estimate.
233-240: LGTM: detach_copy preserves objective_estimate.
The detached copy correctly includes all node state including the new objective_estimate field.
cpp/src/dual_simplex/pseudo_costs.cpp (6)
202-202: LGTM: RAII-based locking.
Using std::lock_guard instead of manual lock/unlock ensures the mutex is released even if an exception occurs.
262-262: LGTM: Thread-safe variable selection.
Acquiring the lock protects the pseudo-cost data structures during read operations.
268-268: LGTM: Objective estimate initialization.
Starting the estimate from lower_bound is correct: the pseudocost contributions are then added to this base.
301-304: LGTM: Pseudocost-based objective estimate accumulation.
The estimate uses min(down_cost, up_cost) per fractional variable, which is a standard optimistic estimate for the child node's objective (assuming optimal branching direction). This aligns with Achterberg's approach cited in the PR description.
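The accumulation described in these comments can be sketched as follows. This is an illustration under stated assumptions rather than the code in pseudo_costs.cpp: `pseudo_cost_t` and `objective_estimate` are hypothetical names, and the way each direction's cost is formed (average pseudocost times the distance to the nearest integer in that direction) is the usual textbook definition, assumed here because the review does not spell it out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical per-variable average pseudocosts (objective change per unit).
struct pseudo_cost_t {
  double down;
  double up;
};

// Starts from the node's LP lower bound and adds, for every fractional
// variable, the cheaper of the estimated down-branch and up-branch costs.
inline double objective_estimate(double lower_bound,
                                 const std::vector<int>& fractional,
                                 const std::vector<double>& x,
                                 const std::vector<pseudo_cost_t>& pc)
{
  double estimate = lower_bound;
  for (int j : fractional) {
    const double f_down = x[j] - std::floor(x[j]);  // distance to round down
    const double f_up   = std::ceil(x[j]) - x[j];   // distance to round up
    estimate += std::min(pc[j].down * f_down, pc[j].up * f_up);
  }
  return estimate;
}
```

For example, with a lower bound of 10, x = 2.5 with pseudocosts (2, 4) contributes min(1.0, 2.0) = 1.0, and x = 1.25 with pseudocosts (4, 1) contributes min(1.0, 0.75) = 0.75, giving an estimate of 11.75.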
323-323: LGTM: Return type matches new API.
Returning the pair {branch_var, estimate} correctly fulfills the new variable_selection_and_obj_estimate contract.
306-321: Undefined behavior concern is invalid: both callers guarantee non-empty fractional.
The first call site in solve_node() is protected by an else if that only executes when leaf_num_fractional != 0 (line 17). The second call site in root solving checks if (num_fractional == 0) and returns early (lines 13–35), ensuring variable_selection_and_obj_estimate() is never called with empty fractional. The suggested defensive guard is unnecessary.
Likely an incorrect or invalid review comment.
cpp/src/dual_simplex/node_queue.hpp (5)
18-61: LGTM: Clean STL-based heap wrapper.
The generic heap_t correctly leverages std::push_heap/std::pop_heap while exposing the underlying container for direct access when needed.
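A wrapper of the kind described here might look like the sketch below. It is not the PR's heap_t: the template parameters, the `pop` and `empty` members, and the default comparator are assumptions; only `emplace` (with the canonical `std::forward<Args>`), the `buffer` and `comp` names, and the use of `std::push_heap`/`std::pop_heap` follow the review text.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

template <typename T, typename Comp = std::less<T>>
class heap_t {
 public:
  explicit heap_t(Comp c = Comp{}) : comp(c) {}

  // Constructs the element in place, then restores the heap property.
  template <typename... Args>
  void emplace(Args&&... args)
  {
    buffer.emplace_back(std::forward<Args>(args)...);
    std::push_heap(buffer.begin(), buffer.end(), comp);
  }

  // Removes and returns the top element; the heap must be non-empty.
  T pop()
  {
    std::pop_heap(buffer.begin(), buffer.end(), comp);
    T top = std::move(buffer.back());
    buffer.pop_back();
    return top;
  }

  bool empty() const { return buffer.empty(); }

  std::vector<T> buffer;  // exposed for direct access, as the review notes

 private:
  Comp comp;
};
```

Because the standard heap algorithms build a max-heap with respect to the comparator, passing `std::greater<T>` makes `pop` return the smallest element first.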
67-76: LGTM: Shared heap entry design.
Using heap_entry_t with shared_ptr allows both heaps to reference the same node without duplication, directly supporting the PR's memory reduction goal.
103-109: LGTM: Dual-heap push with thread safety.
Pushing to both heaps under a single lock ensures consistent state and avoids partial updates.
134-152: LGTM: pop_diving correctly handles consumed entries.
The loop skips entries where node_ptr is nullptr (already consumed via pop_best_first), and extracts bounds before returning the detached copy to avoid races with node fathoming.
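The three comments above describe one pattern: a single shared entry referenced from two heaps, with the diving side skipping entries that the best-first side has already consumed. A simplified sketch of that pattern follows. Every type and member name in it is hypothetical, `std::mutex` stands in for the omp_mutex_t mentioned elsewhere in the review, and `std::priority_queue` replaces the PR's own heap wrapper for brevity.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>
#include <vector>

struct node_t {  // hypothetical, heavily reduced node
  double lower_bound;
  double objective_estimate;
};

struct heap_entry {  // one entry shared by both heaps
  std::unique_ptr<node_t> node;  // null once best-first has consumed it
  double lower_bound;
  double estimate;
};
using entry_ptr = std::shared_ptr<heap_entry>;

struct by_lower_bound {
  bool operator()(const entry_ptr& a, const entry_ptr& b) const
  { return a->lower_bound > b->lower_bound; }  // smallest bound on top
};
struct by_estimate {
  bool operator()(const entry_ptr& a, const entry_ptr& b) const
  { return a->estimate > b->estimate; }  // smallest estimate on top
};

class dual_heap_queue {
 public:
  void push(const node_t& n)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    auto e         = std::make_shared<heap_entry>();
    e->node        = std::make_unique<node_t>(n);
    e->lower_bound = n.lower_bound;
    e->estimate    = n.objective_estimate;
    best_first_.push(e);  // both heaps reference the same entry
    diving_.push(e);
  }

  // Takes ownership of the node with the smallest lower bound (null if empty).
  std::unique_ptr<node_t> pop_best_first()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (best_first_.empty()) { return nullptr; }
    entry_ptr e = best_first_.top();
    best_first_.pop();
    return std::move(e->node);  // leaves e->node null for the diving heap
  }

  // Returns a copy of the most promising live node, skipping consumed entries.
  std::optional<node_t> pop_diving()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    while (!diving_.empty()) {
      entry_ptr e = diving_.top();
      diving_.pop();
      if (e->node) { return *e->node; }
    }
    return std::nullopt;
  }

 private:
  std::mutex mutex_;
  std::priority_queue<entry_ptr, std::vector<entry_ptr>, by_lower_bound> best_first_;
  std::priority_queue<entry_ptr, std::vector<entry_ptr>, by_estimate> diving_;
};
```

In this sketch a dive works on a copy, so the original node stays available to the best-first heap, while a node taken by best-first search is silently dropped from the diving heap the next time it surfaces.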
166-170: LGTM: Safe lower bound retrieval.
Returns inf for an empty heap, which is the correct semantics (no nodes means no finite lower bound).
cpp/src/dual_simplex/branch_and_bound.hpp (3)
70-81: LGTM on the new bnb_stats_t structure.
The use of omp_atomic_t for thread-safe counters (total_lp_solve_time, nodes_explored, nodes_unexplored, total_lp_iters) is appropriate for the multi-threaded B&B context. The non-atomic start_time is acceptable since it's initialized once before the parallel region.
Minor observation: last_log and nodes_since_last_log are marked atomic but the comment states they're only used by the main thread. This is safe but slightly over-synchronized.
214-239: Approve updated traversal method signatures.
The new plunge_from and dive_from signatures properly separate shallow plunging (best-first with bounded depth) from deep diving. The dive_from taking explicit start_lower/start_upper bounds is consistent with the memory optimization goal of not storing bounds per-node in the diving queue.
176-177: Thread-safety verification complete: node_queue_t properly synchronizes concurrent access.
The node_queue_t implementation correctly provides internal synchronization via omp_mutex_t mutex. All public methods that may be called concurrently (push, pop_best_first, pop_diving, and size/bound queries) are protected with std::lock_guard, ensuring thread-safe access to the underlying heaps and shared state.
cpp/src/dual_simplex/branch_and_bound.cpp (10)
247-255: LGTM on get_lower_bound() refactoring.
The aggregation from lower_bound_ceiling_, node_queue.get_lower_bound(), and local_lower_bounds_ correctly computes the global lower bound. The final check returning -inf for non-finite values is a good defensive practice.
680-681: LGTM on stats accumulation.
The LP solve time and iteration counts are correctly accumulated into the stats structure passed to solve_node.
725-727: Approve new variable selection API usage.
The structured binding correctly captures both branch_var and obj_estimate from the new variable_selection_and_obj_estimate API, and objective_estimate is properly stored on the node for use in diving queue ordering.
751-752: LGTM on ITERATION_LIMIT propagation.
The new ITERATION_LIMIT status is correctly propagated from the LP solver through solve_node.
881-882: LGTM on node_queue.push() usage in ramp-up.
Nodes are correctly pushed to the unified queue once the ramp-up phase has generated enough nodes to keep threads busy.
924-924: Good fix: Reset basis when node is fathomed.
Setting recompute_bounds_and_basis = true when a node is fathomed ensures the next node processed will have its bounds and basis correctly recomputed rather than relying on incremental updates from a now-invalid parent path.
1039-1063: LGTM on best_first_thread refactoring.
The transition from direct heap access to node_queue.pop_best_first() is clean. The optional-based return correctly handles the case when no node is available while other subtrees are still active.
1163-1168: LGTM on ITERATION_LIMIT handling in diving.
Breaking out of the dive loop on ITERATION_LIMIT is correct behavior: it allows the diving thread to abandon this dive and potentially pick up a new starting node rather than continuing to consume iterations on a potentially unproductive path.
1404-1405: LGTM on root variable selection with new API.
The structured binding correctly uses the new variable_selection_and_obj_estimate API at the root node, consistent with usage in solve_node.
1469-1470: LGTM on final lower bound calculation.
The conditional correctly uses node_queue.get_lower_bound() when the queue is non-empty, falling back to the root's lower bound when the tree has been fully explored.
🔔 Hi @anandhkb @nguidotti, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue.
Actionable comments posted: 0
🧹 Nitpick comments (1)
cpp/src/dual_simplex/branch_and_bound.cpp (1)
1127-1131: Hardcoded limits in diving thread.
The 500-node limit per dive (line 1145) and the 5-level depth difference for stack pruning (line 1180) are hardcoded. Consider making these configurable via settings for tuning on different problem types.
🔎 Suggested improvement

    + // In simplex_solver_settings_t or similar:
    + i_t max_dive_nodes = 500;
    + i_t max_backtrack_depth = 5;

    + // In diving_thread:
    - if (dive_stats.nodes_explored > 500) { break; }
    + if (dive_stats.nodes_explored > settings_.max_dive_nodes) { break; }

    - if (stack.size() > 1 && stack.front()->depth - stack.back()->depth > 5) {
    + if (stack.size() > 1 && stack.front()->depth - stack.back()->depth > settings_.max_backtrack_depth) {

Also applies to: 1144-1145
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- cpp/src/dual_simplex/bounds_strengthening.cpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/node_queue.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
- cpp/src/dual_simplex/pseudo_costs.hpp
🚧 Files skipped from review as they are similar to previous changes (2)
- cpp/src/dual_simplex/bounds_strengthening.cpp
- cpp/src/dual_simplex/node_queue.hpp
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{h,hpp,py}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Verify C API does not break ABI stability (no struct layout changes, field reordering); maintain backward compatibility in Python and server APIs with deprecation warnings
Files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.hpp
**/*.{cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state
Files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cu,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
🧠 Learnings (18)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Applied to files:
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Applied to files:
- cpp/src/dual_simplex/mip_node.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-12-04T20:09:09.264Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 602
File: cpp/src/linear_programming/solve.cu:732-742
Timestamp: 2025-12-04T20:09:09.264Z
Learning: In cpp/src/linear_programming/solve.cu, the barrier solver does not currently return INFEASIBLE or UNBOUNDED status. It only returns OPTIMAL, TIME_LIMIT, NUMERICAL_ISSUES, or CONCURRENT_LIMIT.
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
Applied to files:
cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-12-04T04:11:12.640Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 500
File: cpp/src/dual_simplex/scaling.cpp:68-76
Timestamp: 2025-12-04T04:11:12.640Z
Learning: In the cuOPT dual simplex solver, CSR/CSC matrices (including the quadratic objective matrix Q) are required to have valid dimensions and indices by construction. Runtime bounds checking in performance-critical paths like matrix scaling is avoided to prevent slowdowns. Validation is performed via debug-only check_matrix() calls wrapped in #ifdef CHECK_MATRIX.
Applied to files:
cpp/src/dual_simplex/branch_and_bound.hpp
🧬 Code graph analysis (2)
cpp/src/dual_simplex/branch_and_bound.cpp (2)
cpp/src/dual_simplex/pseudo_costs.hpp (3)
- node_ptr (31-31)
- fractional (46-49)
- fractional (51-52)
cpp/src/dual_simplex/node_queue.hpp (4)
- node (24-28)
- node (24-24)
- node (30-34)
- node (30-30)
cpp/src/dual_simplex/pseudo_costs.cpp (1)
cpp/src/dual_simplex/pseudo_costs.hpp (3)
- fractional (46-49)
- fractional (51-52)
- num_initialized_down (41-44)
🪛 Cppcheck (2.19.0)
cpp/src/dual_simplex/pseudo_costs.cpp
[warning] 320-320: Array index -1 is out of bounds.
(negativeContainerIndex)
🔇 Additional comments (16)
cpp/src/dual_simplex/pseudo_costs.hpp (1)
46-49: API change looks good.
The new variable_selection_and_obj_estimate method appropriately extends the previous variable_selection interface by returning both the selected branch variable and an objective estimate. The additional lower_bound parameter enables computing the pseudocost-based objective estimate as described in Achterberg's thesis (Section 6.4).
cpp/src/dual_simplex/mip_node.hpp (2)
48-49: Proper initialization and propagation of objective_estimate.
The objective_estimate field is correctly initialized to infinity in the default and root constructors, and properly propagated from the parent node in the child constructor. This ensures the estimate is available for node scoring in the diving queue.
Also applies to: 63-64, 85-86
233-239: detach_copy correctly preserves objective_estimate.
The detached copy preserves the objective estimate, which is essential for the diving queue that operates on detached node copies.
cpp/src/dual_simplex/pseudo_costs.cpp (3)
202-202: Good use of RAII for mutex locking.
Switching from manual lock/unlock to std::lock_guard ensures the mutex is properly released even if an exception occurs, improving safety. As per coding guidelines for thread-safe code.
256-268: New API implementation correctly computes objective estimate.
The function properly initializes the estimate from lower_bound and accumulates pseudocost-based estimates for each fractional variable, following the approach described in Achterberg's thesis for diving node ordering.
306-323: Review comment is unnecessary; both callers guarantee fractional is non-empty.
The function variable_selection_and_obj_estimate is called only when fractional is guaranteed to be non-empty by the calling code:
- At line 725: called inside else if block after if (leaf_num_fractional == 0) return
- At line 1405: called only after if (num_fractional == 0) return
Both callers enforce this precondition through explicit early returns. The function correctly relies on this precondition, so adding a defensive check is redundant.
Likely an incorrect or invalid review comment.
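For context on the 256-268 comment, here is a minimal sketch of a pseudocost-based objective estimate. It is not the PR's code: the names estimate_objective and pseudo_cost_pair are hypothetical, and the per-term rule (the cheaper of the two rounding directions) is the textbook best-estimate form, assumed here because the PR's exact per-term formula is not shown in this review.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the averaged pseudocosts of one variable.
struct pseudo_cost_pair {
  double down;  // average objective increase per unit of rounding down
  double up;    // average objective increase per unit of rounding up
};

// Start from the node's LP lower bound and add, for every fractional
// variable, the cheaper of the two estimated rounding costs.
inline double estimate_objective(double lower_bound,
                                 const std::vector<int>& fractional,
                                 const std::vector<double>& solution,
                                 const std::vector<pseudo_cost_pair>& pc)
{
  double estimate = lower_bound;
  for (int j : fractional) {
    const double f_down = solution[j] - std::floor(solution[j]);
    const double f_up   = std::ceil(solution[j]) - solution[j];
    estimate += std::min(f_down * pc[j].down, f_up * pc[j].up);
  }
  return estimate;
}
```

With non-negative pseudocosts the estimate can never drop below lower_bound, which is the property a diving queue ordered by this value relies on.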
cpp/src/dual_simplex/branch_and_bound.hpp (3)
70-81: Well-structured statistics tracking with thread-safe atomics.
The bnb_stats_t struct appropriately uses omp_atomic_t for thread-safe statistics accumulation. The comment on lines 78-79 clarifying that last_log and nodes_since_last_log are main-thread-only is helpful for maintainability.
176-177: Unified node queue simplifies node management.
Replacing the separate heap and diving queue with a unified node_queue_t member aligns with the PR objective to reduce memory consumption and share information between best-first and diving strategies.
214-239: Updated method signatures support new traversal strategy.
The renamed plunge_from (shallow exploration) and new dive_from (deep exploration) methods, along with the bnb_stats_t& stats parameter in solve_node, properly support the new node queue architecture and per-thread statistics tracking.
Also applies to: 256-257
cpp/src/dual_simplex/branch_and_bound.cpp (7)
247-255: Lower bound aggregation correctly handles all sources.
The lower bound now properly aggregates from:
- lower_bound_ceiling_ (for numerical issues)
- node_queue.get_lower_bound() (from queued nodes)
- Local thread bounds
The fallback to -inf for non-finite values is appropriate.
599-604: Iteration limit for diving threads prevents resource starvation.
Limiting diving thread LP iterations to 5% of total B&B iterations is a reasonable heuristic to ensure diving doesn't consume excessive resources. The early return with ITERATION_LIMIT status allows proper handling upstream.
921-926: Correctly resets bounds/basis after cutoff.
Setting recompute_bounds_and_basis = true after a node is fathomed ensures the next node in the stack gets fresh bounds computed from the root, preventing use of stale bound values.
725-727: Correctly uses new variable selection API.
The structured binding auto [branch_var, obj_estimate] properly captures both return values, and the objective estimate is stored on the node for use in diving queue ordering.
1100-1123: Diving thread properly manages bounds state for detached nodes.
The diving thread correctly:
- Initializes start_lower/start_upper from original bounds
- Resets bounds_changed markers
- Uses pop_diving to get a detached node with updated bounds
- Applies bounds strengthening before solving
This ensures bound consistency when operating on detached node copies.
1039-1065: Best-first thread correctly uses node_queue operations.
The pop_best_first call with the active_subtrees_ counter and the cutoff check before plunge_from properly manage the unified queue. The optional return type handles the empty queue case gracefully.
1163-1168: ITERATION_LIMIT properly handled in diving thread.
Breaking out of the dive loop on ITERATION_LIMIT prevents the diving thread from consuming more than its allocated iteration budget, allowing exploration threads to make progress.
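The diving iteration budget discussed in the 599-604 and 1163-1168 comments can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: check_dive_budget, run_dive, and dive_status are hypothetical names, and only the 5% fraction and the ITERATION_LIMIT status come from the review text.

```cpp
#include <cassert>
#include <cstdint>

enum class dive_status { CONTINUE, ITERATION_LIMIT };

// Hypothetical helper: a dive may spend at most `fraction` of the LP
// iterations accumulated so far by the branch-and-bound search.
inline dive_status check_dive_budget(std::int64_t dive_lp_iters,
                                     std::int64_t total_bnb_lp_iters,
                                     double fraction = 0.05)
{
  const double budget = fraction * static_cast<double>(total_bnb_lp_iters);
  return static_cast<double>(dive_lp_iters) > budget ? dive_status::ITERATION_LIMIT
                                                     : dive_status::CONTINUE;
}

// Hypothetical dive loop: check the budget before each node and stop
// as soon as it is exhausted. Returns the number of nodes solved.
inline int run_dive(std::int64_t iters_per_node,
                    std::int64_t total_bnb_lp_iters,
                    int max_nodes)
{
  std::int64_t dive_iters = 0;
  int nodes_solved        = 0;
  for (int n = 0; n < max_nodes; ++n) {
    if (check_dive_budget(dive_iters, total_bnb_lp_iters) == dive_status::ITERATION_LIMIT) {
      break;  // mirrors breaking out of the dive loop on ITERATION_LIMIT
    }
    dive_iters += iters_per_node;  // stand-in for solving one node's LP
    ++nodes_solved;
  }
  return nodes_solved;
}
```

Checking before each node means a dive can overshoot the budget by at most one node's worth of iterations, which is usually an acceptable trade for not interrupting an LP solve midway.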
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cpp/src/dual_simplex/pseudo_costs.cpp (2)
199-214: Guard against near-zero frac and non-finite objective deltas in update_pseudo_costs
change_in_obj / frac can explode when fractional_val is numerically integral (or if bounds already force integrality), and NaN/inf deltas will poison pseudo-cost stats. As per coding guidelines, this is core optimization logic and should be numerically robust.
Proposed fix
 void pseudo_costs_t<i_t, f_t>::update_pseudo_costs(mip_node_t<i_t, f_t>* node_ptr,
                                                    f_t leaf_objective)
 {
   std::lock_guard<omp_mutex_t> lock(mutex);
-  const f_t change_in_obj = leaf_objective - node_ptr->lower_bound;
+  f_t change_in_obj = leaf_objective - node_ptr->lower_bound;
+  if (!std::isfinite(change_in_obj)) { return; }
+  // Pseudocosts model objective *increase* for minimization; clamp small/negative noise.
+  if (change_in_obj < 0) { change_in_obj = 0; }
   const f_t frac = node_ptr->branch_dir == rounding_direction_t::DOWN
                      ? node_ptr->fractional_val - std::floor(node_ptr->fractional_val)
                      : std::ceil(node_ptr->fractional_val) - node_ptr->fractional_val;
+  constexpr f_t frac_eps = 1e-9;
+  if (!(frac > frac_eps)) { return; }
   if (node_ptr->branch_dir == rounding_direction_t::DOWN) {
     pseudo_cost_sum_down[node_ptr->branch_var] += change_in_obj / frac;
     pseudo_cost_num_down[node_ptr->branch_var]++;
   } else {
     pseudo_cost_sum_up[node_ptr->branch_var] += change_in_obj / frac;
     pseudo_cost_num_up[node_ptr->branch_var]++;
   }
 }
Based on learnings, validate algorithm correctness + numerical stability in branching/pseudocost updates.
256-317: Fix potential OOB (select == -1) and NaN propagation in variable_selection
If fractional is empty (or scores become NaN), select can remain -1, and score[select] at Line 314 becomes an OOB access (matches the static-analysis warning). Also, NaN pseudo-costs can prevent any selection.
Proposed fix
 i_t pseudo_costs_t<i_t, f_t>::variable_selection(const std::vector<i_t>& fractional,
                                                  const std::vector<f_t>& solution,
                                                  logger_t& log)
 {
   std::lock_guard<omp_mutex_t> lock(mutex);
   const i_t num_fractional = fractional.size();
+  if (num_fractional == 0) {
+    log.debug("Pseudocost branching requested with 0 fractional variables.\n");
+    return -1;
+  }
   std::vector<f_t> pseudo_cost_up(num_fractional);
   std::vector<f_t> pseudo_cost_down(num_fractional);
   std::vector<f_t> score(num_fractional);
@@
   for (i_t k = 0; k < num_fractional; k++) {
     const i_t j = fractional[k];
@@
     constexpr f_t eps = 1e-6;
     const f_t f_down = solution[j] - std::floor(solution[j]);
     const f_t f_up = std::ceil(solution[j]) - solution[j];
-    score[k] =
-      std::max(f_down * pseudo_cost_down[k], eps) * std::max(f_up * pseudo_cost_up[k], eps);
+    f_t s = std::max(f_down * pseudo_cost_down[k], eps) * std::max(f_up * pseudo_cost_up[k], eps);
+    if (!std::isfinite(s)) { s = -std::numeric_limits<f_t>::infinity(); }
+    score[k] = s;
   }
   i_t branch_var = fractional[0];
-  f_t max_score = -1;
+  f_t max_score = -std::numeric_limits<f_t>::infinity();
   i_t select = -1;
   for (i_t k = 0; k < num_fractional; k++) {
-    if (score[k] > max_score) {
+    if (select == -1 || score[k] > max_score) {
       max_score = score[k];
       branch_var = fractional[k];
       select = k;
     }
   }
-  log.debug("Pseudocost branching on %d. Value %e. Score %e.\n",
-            branch_var,
-            solution[branch_var],
-            score[select]);
+  if (select >= 0) {
+    log.debug("Pseudocost branching on %d. Value %e. Score %e.\n",
+              branch_var,
+              solution[branch_var],
+              score[select]);
+  }
   return branch_var;
 }
Based on learnings, ensure branch-and-bound decisions remain correct and race-free under multi-threaded exploration.
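The selection guard in the proposed fix above can be isolated into a small standalone sketch. The helper name select_best is hypothetical, and unlike the diff (which maps non-finite scores to negative infinity and may still select them) this variant skips non-finite scores entirely; that difference is a deliberate simplification for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Returns the index of the largest finite score, or -1 when the input is
// empty or contains no finite score. Callers must check for -1 before
// indexing with the result.
inline int select_best(const std::vector<double>& score)
{
  int select       = -1;
  double max_score = -std::numeric_limits<double>::infinity();
  for (int k = 0; k < static_cast<int>(score.size()); ++k) {
    if (!std::isfinite(score[k])) { continue; }  // ignore NaN and +/-inf
    if (select == -1 || score[k] > max_score) {
      max_score = score[k];
      select    = k;
    }
  }
  return select;
}
```

Returning a sentinel keeps the hot path free of exceptions, in line with the path-based guideline quoted in this review, at the cost of requiring every caller to test for -1.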
🤖 Fix all issues with AI agents
In @cpp/src/dual_simplex/branch_and_bound.hpp:
- Around line 70-81: In bnb_stats_t, total_lp_iters is incorrectly typed as f_t;
change it to an integer type (i_t or a fixed-width integer like std::int64_t)
since it counts iterations, and remove omp_atomic_t from fields intended for
main-thread-only access (last_log and nodes_since_last_log) making them plain
f_t and i_t respectively; keep atomic wrappers for concurrent counters
(total_lp_solve_time, nodes_explored, nodes_unexplored) and preserve the “main
thread only” comment to avoid unintended synchronization overhead.
In @cpp/src/dual_simplex/pseudo_costs.cpp:
- Around line 319-361: In obj_estimate, ensure pseudo-costs can't be NaN/inf and
the returned estimate never falls below lower_bound and silence the unused
logger: validate/replace per-variable pseudo_cost_down and pseudo_cost_up
(derived from pseudo_cost_sum_down/num or the averages
pseudo_cost_down_avg/pseudo_cost_up_avg) by checking finiteness (std::isfinite)
and clamping to a non-negative minimum (e.g., >= 0) before using them, keep the
per-term contribution bounded below by eps as already done, and after the loop
set estimate = std::max(estimate, lower_bound) to avoid decreasing below the
bound; also silence the unused logger_t& log parameter (e.g., (void)log;) or use
it for a debug message. Optionally factor the pseudo-cost extraction into a
helper used by obj_estimate and variable_selection to dedupe logic.
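The first item above (integer iteration counter, atomics only on concurrently updated fields) could look roughly like the sketch below. It is an assumption-laden illustration: std::atomic stands in for the project's omp_atomic_t, whose interface is not shown in this review, and bnb_stats_sketch and record_node are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Sketch of the suggested layout, with std::atomic standing in for
// omp_atomic_t (an assumption, not the project's actual wrapper).
struct bnb_stats_sketch {
  // Updated concurrently by worker threads.
  std::atomic<std::int64_t> total_lp_iters{0};  // a count, so an integer type
  std::atomic<std::int64_t> nodes_explored{0};
  std::atomic<std::int64_t> nodes_unexplored{0};

  // Main thread only: plain fields, no synchronization overhead.
  double last_log                    = 0.0;
  std::int64_t nodes_since_last_log  = 0;
};

// Hypothetical per-node bookkeeping called from any worker thread.
inline void record_node(bnb_stats_sketch& stats, std::int64_t lp_iters)
{
  stats.total_lp_iters.fetch_add(lp_iters, std::memory_order_relaxed);
  stats.nodes_explored.fetch_add(1, std::memory_order_relaxed);
}
```

Relaxed ordering is enough here because the counters are only aggregated for logging and limits; nothing else is published through them.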
🧹 Nitpick comments (3)
cpp/src/dual_simplex/branch_and_bound.hpp (3)
176-178: Member naming consistency: node_queue vs trailing-underscore convention
Not blocking, but it stands out among *_ members and increases grep/scan friction.
214-223: plunge_from(...) signature is getting long; consider a small context struct
A lightweight “thread_context” (leaf_problem, presolver, basis_update, lists) would reduce call-site churn as this evolves.
230-240: Same: dive_from(...) could benefit from a context struct + explicit invariants
Particularly for start_lower/start_upper (must match LP cols) and diving_type (expect only thread_type_t::DIVING?).
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- cpp/src/dual_simplex/branch_and_bound.cpp
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
- cpp/src/dual_simplex/pseudo_costs.hpp
🚧 Files skipped from review as they are similar to previous changes (2)
- cpp/src/dual_simplex/pseudo_costs.hpp
- cpp/src/dual_simplex/branch_and_bound.cpp
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...
Files:
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{h,hpp,py}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Verify C API does not break ABI stability (no struct layout changes, field reordering); maintain backward compatibility in Python and server APIs with deprecation warnings
Files:
cpp/src/dual_simplex/branch_and_bound.hpp
**/*.{cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state
Files:
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cu,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Files:
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
🧠 Learnings (19)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Applied to files:
cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Applied to files:
cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Applied to files:
- cpp/src/dual_simplex/branch_and_bound.hpp
- cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
Applied to files:
cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
Applied to files:
cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-12-04T04:11:12.640Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 500
File: cpp/src/dual_simplex/scaling.cpp:68-76
Timestamp: 2025-12-04T04:11:12.640Z
Learning: In the cuOPT dual simplex solver, CSR/CSC matrices (including the quadratic objective matrix Q) are required to have valid dimensions and indices by construction. Runtime bounds checking in performance-critical paths like matrix scaling is avoided to prevent slowdowns. Validation is performed via debug-only check_matrix() calls wrapped in #ifdef CHECK_MATRIX.
Applied to files:
cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-12-04T20:09:09.264Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 602
File: cpp/src/linear_programming/solve.cu:732-742
Timestamp: 2025-12-04T20:09:09.264Z
Learning: In cpp/src/linear_programming/solve.cu, the barrier solver does not currently return INFEASIBLE or UNBOUNDED status. It only returns OPTIMAL, TIME_LIMIT, NUMERICAL_ISSUES, or CONCURRENT_LIMIT.
Applied to files:
cpp/src/dual_simplex/branch_and_bound.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cpp,hpp,h} : Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
🧬 Code graph analysis (1)
cpp/src/dual_simplex/pseudo_costs.cpp (1)
cpp/src/dual_simplex/pseudo_costs.hpp (4)
- num_initialized_down (41-44)
- fractional (46-48)
- fractional (50-53)
- fractional (55-56)
🪛 Cppcheck (2.19.0)
cpp/src/dual_simplex/pseudo_costs.cpp
[warning] 314-314: Array index -1 is out of bounds.
(negativeContainerIndex)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, arm64, rockylinux8
🔇 Additional comments (3)
cpp/src/dual_simplex/branch_and_bound.hpp (3)
155-157: Stats extraction to bnb_stats_t looks good
Nice cleanup vs ad-hoc per-class structs; should make it easier to thread stats through plunge/dive flows consistently.
10-13: Include update and node_queue_t synchronization are correct
node_queue_t is properly self-synchronized with internal omp_mutex_t protection (line 100) and lock_guard on all public methods (pop_best_first, etc.). The removal of old heap/diving mutex members is safe; synchronization is correctly encapsulated within the queue itself.
246-259: solve_node() stats parameter: verified across all call-sites
All three call-sites correctly pass the intended stats object:
- Lines 851 & 974 (exploration_ramp_up): pass the shared exploration_stats_ member with thread_type EXPLORATION
- Line 1154 (dive_from): passes the local dive_stats variable with thread_type DIVING
Exploration threads safely share exploration_stats_ across concurrent OpenMP tasks because critical counting fields (nodes_explored, nodes_unexplored, total_lp_iters, last_log, nodes_since_last_log) are protected with omp_atomic_t<>. Diving threads use properly isolated local dive_stats per dive. No risk of cross-contamination or double-counting.
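The shared-versus-local stats pattern described in the last comment can be illustrated with a minimal sketch. It assumes std::atomic and std::thread in place of the project's omp_atomic_t<> wrapper and OpenMP tasks, and every name in it (stats_sketch_t, solve_node_sketch, explore_shared, dive_local) is hypothetical rather than taken from the PR.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the stats struct; std::atomic replaces the
// project's omp_atomic_t<> wrapper (an assumption made for portability).
struct stats_sketch_t {
  std::atomic<std::int64_t> nodes_explored{0};
  std::atomic<std::int64_t> total_lp_iters{0};
};

// Stand-in for solving one node: records the node and its LP iterations
// in whichever stats object the caller passes.
inline void solve_node_sketch(stats_sketch_t& stats, std::int64_t lp_iters)
{
  stats.nodes_explored.fetch_add(1, std::memory_order_relaxed);
  stats.total_lp_iters.fetch_add(lp_iters, std::memory_order_relaxed);
}

// Exploration: every worker updates the same shared stats object, so the
// counters must be atomic to avoid lost updates.
inline void explore_shared(stats_sketch_t& shared, int num_threads, int nodes_per_thread)
{
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([&shared, nodes_per_thread] {
      for (int k = 0; k < nodes_per_thread; ++k) { solve_node_sketch(shared, 3); }
    });
  }
  for (auto& w : workers) { w.join(); }
}

// Diving: each dive owns a local stats object, so nothing it counts can
// leak into the shared exploration counters.
inline std::int64_t dive_local(int depth)
{
  stats_sketch_t dive_stats;
  for (int k = 0; k < depth; ++k) { solve_node_sketch(dive_stats, 5); }
  return dive_stats.nodes_explored.load();
}
```

Passing the stats object as a parameter, as solve_node() is described as doing, is what lets the same routine serve both cases: the caller decides whether the counters are shared or private.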
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/src/dual_simplex/pseudo_costs.cpp (1)
300-314: Potential out-of-bounds access if fractional vector is empty.
If fractional.size() == 0, the selection loop (lines 303-309) doesn't execute, leaving select = -1. Line 314 then accesses score[select], causing undefined behavior.
🛡️ Add defensive check for empty fractional vector
    +  if (num_fractional == 0) {
    +    log.debug("No fractional variables for pseudocost branching\n");
    +    return -1;  // or throw an exception
    +  }
    +
       i_t branch_var = fractional[0];
       f_t max_score = -1;
       i_t select = -1;
Alternatively, add an assertion before line 300:
    +  assert(num_fractional > 0 && "fractional vector must not be empty");
       i_t branch_var = fractional[0];
Note: Static analysis flagged this at line 314. Based on coding guidelines for algorithm correctness in optimization logic.
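As a self-contained illustration of the guard suggested above, here is a minimal sketch of an argmax that never indexes with the -1 sentinel. The function name select_best_score and the use of double and std::vector are assumptions for the example; the project's code is templated on i_t and f_t and is not reproduced here.

```cpp
#include <cstddef>
#include <vector>

// Returns the position of the highest score, or -1 when `score` is empty.
// The empty case is handled before any element access, so the -1 sentinel
// is only ever returned, never used as an index.
inline int select_best_score(const std::vector<double>& score)
{
  if (score.empty()) { return -1; }  // nothing to branch on
  std::size_t select = 0;
  for (std::size_t k = 1; k < score.size(); ++k) {
    if (score[k] > score[select]) { select = k; }
  }
  return static_cast<int>(select);
}
```

A caller would need to check for -1 before using the result as an index; whether an early return or an assertion is the better fit depends on whether an empty fractional vector is a legitimate state at that point in the solver.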
🤖 Fix all issues with AI agents
In @cpp/src/dual_simplex/pseudo_costs.cpp:
- Around line 325-362: The implementation currently clamps each product with eps
before taking the minimum which alters Achterberg's formula; change the logic in
the loop that computes the contribution to estimate so you compute the two
products (pseudo_cost_down * f_down and pseudo_cost_up * f_up) and then take
their minimum directly (i.e., replace the std::min(std::max(..., eps),
std::max(..., eps)) pattern); reference the symbols pseudo_cost_down,
pseudo_cost_up, f_down, f_up and update the line that adds to estimate to use
std::min(down_prod, up_prod) (optionally clamp the final min with eps only after
taking the min if you need a nonzero floor for stability).
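The estimate described in this prompt can be sketched as a free function: the node's lower bound plus, for each fractional variable, the smaller of the two pseudo-cost products. This is a sketch under stated assumptions, not the PR's implementation: obj_estimate_sketch and its parameter layout are hypothetical, and f_down and f_up are assumed to be the distances to the floor and ceiling of the fractional value.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical free-function version of the estimate; in the PR the method
// lives on pseudo_costs_t and reads the pseudo-costs from its own members.
inline double obj_estimate_sketch(double lower_bound,
                                  const std::vector<double>& x_fractional,
                                  const std::vector<double>& pseudo_cost_down,
                                  const std::vector<double>& pseudo_cost_up,
                                  double eps = 0.0)
{
  double estimate = lower_bound;
  for (std::size_t k = 0; k < x_fractional.size(); ++k) {
    // Assumed definitions: distance to the floor and to the ceiling.
    const double f_down = x_fractional[k] - std::floor(x_fractional[k]);
    const double f_up = std::ceil(x_fractional[k]) - x_fractional[k];
    const double down_prod = pseudo_cost_down[k] * f_down;
    const double up_prod = pseudo_cost_up[k] * f_up;
    // Take the minimum first; eps is an optional floor applied afterwards.
    estimate += std::max(std::min(down_prod, up_prod), eps);
  }
  return estimate;
}
```

With an empty fractional list the function returns lower_bound unchanged. Because max(., eps) is monotone, flooring after the minimum yields the same value as flooring each product first, so the floor only changes the result when the smaller product is below eps; leaving eps at 0 for nonnegative products recovers the unclamped sum.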
🧹 Nitpick comments (2)
cpp/src/dual_simplex/pseudo_costs.cpp (2)
319-362: Refactor duplicated pseudo-cost computation logic.
The obj_estimate method duplicates significant logic from variable_selection:
- Lines 330-335 duplicate the initialized() call and variable setup (identical to lines 267-272 in variable_selection)
- Lines 337-358 duplicate the per-variable pseudo-cost lookup pattern (similar to lines 280-298)
Consider extracting a helper method to compute per-variable pseudo-costs to reduce duplication and improve maintainability.
♻️ Proposed refactoring to eliminate duplication
Add a private helper method:
    template <typename i_t, typename f_t>
    void pseudo_costs_t<i_t, f_t>::compute_variable_pseudocosts(
      const std::vector<i_t>& fractional,
      std::vector<f_t>& pseudo_cost_down,
      std::vector<f_t>& pseudo_cost_up,
      f_t pseudo_cost_down_avg,
      f_t pseudo_cost_up_avg) const
    {
      const i_t num_fractional = fractional.size();
      pseudo_cost_down.resize(num_fractional);
      pseudo_cost_up.resize(num_fractional);

      for (i_t k = 0; k < num_fractional; k++) {
        const i_t j = fractional[k];
        pseudo_cost_down[k] = (pseudo_cost_num_down[j] != 0)
                                ? pseudo_cost_sum_down[j] / pseudo_cost_num_down[j]
                                : pseudo_cost_down_avg;
        pseudo_cost_up[k] = (pseudo_cost_num_up[j] != 0)
                              ? pseudo_cost_sum_up[j] / pseudo_cost_num_up[j]
                              : pseudo_cost_up_avg;
      }
    }
Then use it in both methods to eliminate the duplicated loops.
Based on coding guidelines: refactor code duplication in solver components (3+ occurrences) into shared utilities.
319-362: Document the new public API and empty vector behavior.
The new obj_estimate method is a public API but lacks documentation explaining:
- What objective estimate it computes (lower bound estimate based on pseudo-costs)
- The mathematical formula used (minimum of down/up branch estimates)
- Expected behavior when fractional is empty (returns lower_bound unchanged)
Add a comment block above the method in the header file explaining the purpose and behavior, especially for the empty vector case which returns lower_bound without modification.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cpp/src/dual_simplex/pseudo_costs.cpp
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{cu,cuh,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...
Files:
cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state
Files:
cpp/src/dual_simplex/pseudo_costs.cpp
**/*.{cu,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Files:
cpp/src/dual_simplex/pseudo_costs.cpp
🧠 Learnings (14)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cpp,hpp,h} : Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Applied to files:
cpp/src/dual_simplex/pseudo_costs.cpp
🪛 Cppcheck (2.19.0)
cpp/src/dual_simplex/pseudo_costs.cpp
[warning] 314-314: Array index -1 is out of bounds.
(negativeContainerIndex)
🔇 Additional comments (3)
cpp/src/dual_simplex/pseudo_costs.cpp (3)
3-3: LGTM: Copyright year updated.
202-202: LGTM: RAII-based locking improves exception safety.
The replacement of manual mutex lock/unlock with std::lock_guard ensures proper cleanup even in exception paths.
260-260: LGTM: Improved locking and logging consistency.
The RAII-based locking and migration to log.debug for internal diagnostics align with best practices.
Also applies to: 274-278, 311-311
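To make the exception-safety point in these comments concrete, here is a minimal sketch of a mutex-protected container whose update method can throw while holding the lock. The class guarded_sums_t and its members are hypothetical, and std::mutex stands in for whatever mutex type the project uses; only std::lock_guard is taken from the review.

```cpp
#include <cstddef>
#include <mutex>
#include <stdexcept>
#include <vector>

// Hypothetical guarded container; names are illustrative only.
class guarded_sums_t {
 public:
  explicit guarded_sums_t(std::size_t n) : sums_(n, 0.0) {}

  // Adds `delta` to entry j and throws on a bad index. The guard's
  // destructor unlocks the mutex on both the normal and the throwing path.
  void update(std::size_t j, double delta)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (j >= sums_.size()) { throw std::out_of_range("index out of range"); }
    sums_[j] += delta;
  }

  double get(std::size_t j)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return sums_.at(j);
  }

  // Demo helper only: reports whether the mutex is currently free.
  // Must not be called by a thread that already holds the mutex.
  bool is_unlocked()
  {
    if (!mutex_.try_lock()) { return false; }
    mutex_.unlock();
    return true;
  }

 private:
  std::mutex mutex_;
  std::vector<double> sums_;
};
```

With manual lock/unlock calls, the throwing branch in update would leave the mutex held unless an explicit unlock preceded the throw; the guard removes that failure mode without any extra code on the error path.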
chris-maes
left a comment
Most of the comments are minor. However, I'm worried that many of the implementation details of your parallelism scheme for branch and bound are leaking into the node_queue_t class. This spreads the important logic across multiple files and makes things more difficult to understand. Is it possible to move this logic back into branch and bound and keep the node_queue_t class simple?
Sure. Is it better now?
This PR introduces the following changes:
node_queue) to allow the sharing of information between them. This also greatly reduces memory consumption (33 GB vs 48 GB for neos-848589 after 250 s) since the lower and upper variable bounds no longer need to be stored for diving.
The performance remains the same as the main branch: 226 feasible solutions with ~12.8% primal gap on a GH200 for the MIPLIB2017 dataset.
This PR was split from #697 to be easier to review.
Reference:
[1] T. Achterberg, “Constraint Integer Programming,” PhD thesis, Technische Universität Berlin, Berlin, 2007. doi: 10.14279/depositonce-1634.
Checklist
Summary by CodeRabbit
Bug Fixes
Refactor
New Features
API