-
Notifications
You must be signed in to change notification settings - Fork 108
Parallel Branch-and-Bound #412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…and diving threads.
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test 0e14fe2 |
Signed-off-by: nicolas <nguidotti@nvidia.com>
aliceb-nv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the great work and results Nicolas! Just a few minor nitpicks, otherwise looks good :) I'll let Chris review the algorithm side
| : std::thread::hardware_concurrency()), | ||
| num_threads(std::thread::hardware_concurrency() - 1), | ||
| num_bfs_threads(std::min(num_threads, 4)), | ||
| num_diving_threads(num_threads - num_bfs_threads), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we guarantee one diving thread at least maybe? The following code might make this assumption (and diving even with thread contention might be better than none at all)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Also, if num_threads < 4, there will be have no diving threads. Instead, we can set something like this:
num_bfs_threads = std::min(1, num_threads / 4)
num_diving_threads = std::min(1, num_threads - num_bfs_threads)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will experiment a bit with the different configurations
| branch_and_bound_settings.absolute_mip_gap_tol = context.settings.tolerances.absolute_mip_gap; | ||
| branch_and_bound_settings.relative_mip_gap_tol = context.settings.tolerances.relative_mip_gap; | ||
| branch_and_bound_settings.integer_tol = context.settings.tolerances.integrality_tolerance; | ||
| branch_and_bound_settings.num_threads = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we experimented with this parameter? Submip tends to yield good solutions, it might perform better with a few more threads allocated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet. It was set to 1 to avoid oversubscription of the threads. I can allocate a few more threads to submip and see how it behaves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I set to 1 best first thread and 1 diving thread. Later, we can experiment in allocating more threads to submip. @akifcorduk
| stack.push_front(second); | ||
| stack.push_front(first); | ||
|
|
||
| if (dive_queue_.size() < min_diving_queue_size_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have a comment here about why you are adding to the dive_queue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the only place we add to the dive queue? I thought the BFS threads would also add to the dive queue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BFS also adds to the diving queue. It is around the line 856 in the explore_subtree routine.
| UNSET = 6, // The status is not set | ||
| }; | ||
|
|
||
| enum class mip_exploration_status_t { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for splitting this apart. I'm still a little confused why you need the status TIME_LIMIT, NODE_LIMIT, and NUMERICAL here. Why can't this status just be RUNNING or COMPLETED?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The solver checks the (internal) status to determines when to stop (basically, when it is not RUNNING). The other status is to keep track of why we stopped the solver (it will be translated later to the mip_status_t).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if we hit a TIME_LIMIT or NODE_LIMIT shouldn't any part of the code be able to tell that just by checking the time and the number of nodes?
I think there is probably an opportunity to simplify the code a bit here and not mix the return status with whether branch and bound is running or not.
| FATHOMED = 3, // Node objective is greater than the upper bound | ||
| HAS_CHILDREN = 4, // Node has children to explore | ||
| NUMERICAL = 5, // Encountered numerical issue when solving the LP relaxation | ||
| TIME_LIMIT = 6 // Time out during the LP relaxation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is TIME_LIMIT needed here?
chris-maes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm removing the request for changes. But I'm still a bit fuzzy on how you handle nodes with numerical errors, in particular in the case where they are encountered by the diving thread.
I also a bit fuzzy on how the lower bound is handled in the ramp up phase versus normal BFS threads.
chris-maes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to remove request changes
|
/merge |
Description
This PR implement a parallel branch-and-bound procedure, which is split into two phases. In the first phase, the algorithm will greedily expand the search tree until a certain depth and then add the bottom nodes to a global heap. The parallel expansion is implemented using
omp task.In the second phase, some threads will explore the tree using best first search with plunging, i.e., they take the first node from the global heap and then explore the entire branch that starts on this node. Any unexplored node are insert into the heap. The remaining threads will perform deep dives in order to find feasible solutions. The solver keep a small heap contains the most promising nodes to perform the dives, which is keep in sync with the global heap.
This PR also
std::thread-based parallelization in the strong branching with OpenMP in order to use dynamic scheduling. This ensures that all threads have similar amount of work and improve parallel performance.std::mutexwithomp atomicwhatever applicable.dive_queue_tandsearch_tree_tto store the diving heap and the search tree, respectively.This is an extension of #305.
Closes #320.
Closes #417.
Benchmark results (MIPLIB2017):
master branch (53d6e74)
This PR:
i.e., a
1.8%improvement. In terms of the geomean of the gap ratio, this is equal to1.62x.Checklist