Refactor/try job additions #427

Merged
merged 6 commits into from
Jan 13, 2021

Conversation

krypt-n
Contributor

@krypt-n krypt-n commented Jan 5, 2021

Issue

Closes #392

Tasks

  • Apply some more micro optimisations I have on a local branch somewhere
  • Look into allocations in compute_best_insertion_pd
  • Benchmark properly
  • Update CHANGELOG.md (remove if irrelevant)
  • review

This is my current progress on #392; the results so far seem promising. For comparison, here are the li_lim_100 benchmarks:

Baseline, current master@24a6dd5:

,Gaps,Computing times
Min,-20.04,148
First decile,-0.0,301
Lower quartile,0.0,557
Median,0.0,685
Upper quartile,1.25,913
Ninth decile,4.19,986
Max,9.86,1236

This branch:

,Gaps,Computing times
Min,-20.04,114
First decile,-0.0,184
Lower quartile,0.0,231
Median,0.0,307
Upper quartile,1.25,397
Ninth decile,4.19,478
Max,9.86,601

Additionally I verified that the solutions for a range of different selected benchmarks are exactly the same. This makes me quite confident that my refactoring is correct. I'm going to continue working on this some time in the next couple of days.
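
For reference, the relative change per statistic can be worked out from the two tables above. This is only a sketch: the figures are copied from the tables, the helper name `percent_change` is made up for illustration, and the unit of the computing times is not stated in this comment.

```python
# Computing times copied from the two li_lim_100 tables above.
baseline = {
    "Min": 148, "First decile": 301, "Lower quartile": 557, "Median": 685,
    "Upper quartile": 913, "Ninth decile": 986, "Max": 1236,
}
branch = {
    "Min": 114, "First decile": 184, "Lower quartile": 231, "Median": 307,
    "Upper quartile": 397, "Ninth decile": 478, "Max": 601,
}


def percent_change(old, new):
    """Relative change from old to new in percent (negative means faster)."""
    return 100.0 * (new - old) / old


# One rounded figure per row of the tables.
changes = {name: round(percent_change(baseline[name], branch[name]), 1)
           for name in baseline}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

The median, for instance, goes from 685 to 307, a reduction of roughly 55%.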

@jcoupey jcoupey added this to the v1.9.0 milestone Jan 7, 2021
@jcoupey
Collaborator

jcoupey commented Jan 7, 2021

Sounds great! From what I can tell from the current commits, this already covers the scope of #392.

I'd like to run some benchmarking on my side too, but maybe I should wait as you listed more todo items? Or if the other items are not directly related to this refactoring, maybe we can move on with the current change and have them in a separate PR?

@krypt-n
Contributor Author

krypt-n commented Jan 7, 2021

You're right, #392 is basically done. Getting rid of the allocations in compute_best_insertion_pd seems to me to be a bit more invasive. I'd like to follow up on that though, since that and PDShift (similar algorithm) are responsible for around 50% of the calls to malloc in my benchmark.

I'd still like to add a few comments and format the code in a future commit, but feel free to benchmark and review this as is.
The commits should be reviewable individually. The first four only move stuff around and should not change behaviour in any way; the last one reduces the number of insertion calculations.
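
One way to picture how fewer insertion calculations can still give identical solutions is the toy model below. It is not VROOM code. It assumes the saving comes from caching each (job, route) insertion result and recomputing only for the route that was just modified; the 1-D detour cost, the capacity rule and all names (`best_insertion`, `add_jobs`) are hypothetical.

```python
from math import inf


def best_insertion(route, job, capacity, counter):
    """Cheapest position for `job` in `route` (toy 1-D detour cost)."""
    counter[0] += 1  # count every insertion evaluation
    if len(route) >= capacity:
        return inf, None
    stops = [0] + route + [0]  # hypothetical depot at coordinate 0
    best_cost, best_pos = inf, None
    for pos in range(len(stops) - 1):
        a, b = stops[pos], stops[pos + 1]
        cost = abs(a - job) + abs(job - b) - abs(a - b)
        if cost < best_cost:
            best_cost, best_pos = cost, pos
    return best_cost, best_pos


def add_jobs(routes, jobs, capacity, use_cache):
    """Greedily insert jobs; return (routes, unassigned, evaluation count)."""
    routes = [list(route) for route in routes]
    unassigned = list(jobs)
    counter = [0]
    costs = {}
    stale = set(range(len(routes)))  # routes whose costs must be computed
    while unassigned:
        for job in unassigned:
            for r in stale:
                costs[(job, r)] = best_insertion(routes[r], job, capacity, counter)
        candidates = [
            (costs[(job, r)][0], job, r, costs[(job, r)][1])
            for job in unassigned
            for r in range(len(routes))
        ]
        cost, job, r, pos = min(candidates, key=lambda c: c[0])
        if cost == inf:
            break  # nothing fits any more
        routes[r].insert(pos, job)
        unassigned.remove(job)
        # Only route r changed, so only its cached costs are out of date.
        stale = {r} if use_cache else set(range(len(routes)))
    return routes, unassigned, counter[0]


jobs = [5, -3, 8, 2, -7, 10]
naive_routes, naive_left, naive_calls = add_jobs([[], []], jobs, 2, use_cache=False)
cached_routes, cached_left, cached_calls = add_jobs([[], []], jobs, 2, use_cache=True)
print(naive_routes == cached_routes, naive_calls, cached_calls)
```

On this toy instance both variants return the same routes and leave the same two jobs unassigned, while the cached variant performs 26 evaluations instead of 40.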

@jcoupey
Collaborator

jcoupey commented Jan 7, 2021

Yes, I took a look at the PR with incremental diffs. Thanks for splitting the changes into a few commits like you did; it makes it really easy to follow the overall logic.

I'll just run a few solving rounds on the usual benchmarks, which should provide a ballpark figure for the speedup anyway. Take the time you need to carry on; I'll simply review the current state and leave a couple of comments.

Collaborator

@jcoupey jcoupey left a comment


Dropped a couple comments. Also ./scripts/format.sh generates a lot of changes, mostly indenting stuff.

src/algorithms/local_search/local_search.cpp (4 review threads, outdated and resolved)
@jcoupey
Collaborator

jcoupey commented Jan 8, 2021

The following table shows the difference between 24a6dd5 (master) and 3f2118f (this PR) for various benchmarks. Average computing time variation (in %) is reported across several exploration levels. Note that for small x values the comparisons are not always meaningful as the computing times are sometimes just a few milliseconds (Solomon and Li & Lim 100). All solutions are strictly identical.

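| Benchmark \ x= | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| CVRP | 0.7 | -11.2 | -13.0 | -17.5 | -21.8 | -26.5 |
| Solomon | -1.8 | -1.3 | -4.2 | -3.6 | -6.8 | -9.4 |
| Homberger 200 | 0.0 | 1.0 | -1.3 | -5.0 | -9.2 | -12.0 |
| Li & Lim 100 | -18.2 | -5.6 | -29.2 | -41.2 | -48.0 | -53.4 |
| Li & Lim 200 | -3.3 | -8.7 | -30.1 | -46.8 | -54.3 | -59.8 |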

When I spotted the problem and wrote the ticket, I thought we'd get nice improvements, but I did not suspect this order of magnitude. Especially for instances where there are actually no unassigned jobs (all except for a few CVRP instances here).

So the good news is: this already has a significant impact on computing times for the use of try_job_additions as part of the normal local search process, which happens when we loosen the routes and re-insert jobs to move on to another exploration level. The higher the exploration level, the more intensively we use try_job_additions, which explains the changes across x values.

Instances with shipments (Li & Lim classes) see a much bigger gain, which is in line with the fact that the insertion checks have a higher complexity than for regular jobs, so skipping the unnecessary checks is even more profitable.
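
To illustrate why shipment insertion checks are more expensive, the sketch below simply counts candidate positions in a route with `n` existing stops. It assumes nothing about VROOM's actual code, which may prune candidates early; the function names are made up for illustration.

```python
def job_positions(n):
    """A single job can go before, between or after the n stops."""
    return n + 1


def shipment_positions(n):
    """A pickup rank p and a delivery rank d, with the delivery not before the pickup."""
    return sum(1 for p in range(n + 1) for d in range(p, n + 1))


for n in (5, 10, 20):
    print(n, job_positions(n), shipment_positions(n))
```

The count grows linearly for a single job and quadratically for a pickup/delivery pair, as (n + 1)(n + 2) / 2, so every skipped check saves more work on shipment instances.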

Todo: I'll benchmark this at some point on instances with more unassigned jobs and we should see an even bigger impact.

@jcoupey
Collaborator

jcoupey commented Jan 8, 2021

> Instances with shipments (Li & Lim classes) see a much bigger gain, which is in line with the fact that the insertion checks have a higher complexity

@krypt-n all the more interested to see the improvements you have in store for compute_best_insertion_pd.

@krypt-n
Contributor Author

krypt-n commented Jan 13, 2021

I briefly looked into the performance of the Homberger instances with exploration level 0, but could not find a reliable slowdown on my machine on any instance. Do you know of any instance that performs worse with this PR, or is the 2.7% just noise?

Other than that, I formatted this PR and applied the changes you suggested during the review

@krypt-n krypt-n changed the title [WIP] Refactor/try job additions Refactor/try job additions Jan 13, 2021
@jcoupey
Collaborator

jcoupey commented Jan 13, 2021

Great! Looks like this is ready to merge.

My mistake on the Homberger 200 instances: I reported the difference for the computing time on the last instance only (364 ms vs 374 ms). The average is 228 ms for both runs. I just updated the table above.
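
As a quick check of the figures in this correction, the variation can be recomputed as below. This is a sketch that assumes 364 ms is the master run and 374 ms the PR run (the order in which they are quoted); `variation_percent` is a made-up helper name.

```python
def variation_percent(master_ms, pr_ms):
    """Computing time variation of the PR relative to master, in percent."""
    return 100.0 * (pr_ms - master_ms) / master_ms


# Last Homberger 200 instance only (assumed order: master, then PR).
last_instance = variation_percent(364, 374)
# Average computing time over the class, identical for both runs.
class_average = variation_percent(228, 228)

print(f"last instance only: {last_instance:+.1f}%")
print(f"class average: {class_average:+.1f}%")
```

Under that assumption the single-instance figure rounds to the 2.7% mentioned earlier in the thread, while the class average gives the 0.0 now shown in the table.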

Last question: you crossed out the first two items in the list. Does it mean you did not look into them for this PR, or that you did not find any interesting improvements?

@krypt-n
Contributor Author

krypt-n commented Jan 13, 2021

Ah great. I wasn't sure about that since the running time varies quite a bit from run to run and I didn't do enough benchmark runs to determine if the distribution changed.

I crossed them off the list because I don't feel they need to be part of this PR. I will most likely open new PRs for them if I get any noteworthy results.

Yep, looks ready to merge for me as well.

@jcoupey
Collaborator

jcoupey commented Jan 13, 2021

For completeness, I've generated random problems with time windows and 600 tasks using only 10 vehicles so more or less half the tasks end up unassigned.
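
A rough idea of how such an instance could be produced is sketched below. It is not the generator used for the numbers in this comment. The top-level keys follow VROOM's JSON input format as I understand it (`vehicles`, `jobs`, `shipments`, `time_windows`) and should be checked against the API documentation; the bounding box, the 8-hour horizon, the 1-hour windows and the function name `random_problem` are all hypothetical choices.

```python
import json
import random


def random_problem(n_tasks=600, n_vehicles=10, shipments=False, seed=0):
    """Build a random problem with time windows as a plain dict."""
    rng = random.Random(seed)
    horizon = 8 * 3600  # hypothetical working day, in seconds

    def location():
        # Hypothetical bounding box, given as [lon, lat].
        return [round(rng.uniform(2.25, 2.42), 5),
                round(rng.uniform(48.81, 48.90), 5)]

    def step(task_id):
        start = rng.randrange(0, horizon - 3600)
        return {"id": task_id, "location": location(),
                "time_windows": [[start, start + 3600]]}

    depot = location()
    problem = {
        "vehicles": [
            {"id": v, "start": depot, "end": depot, "time_window": [0, horizon]}
            for v in range(1, n_vehicles + 1)
        ]
    }
    if shipments:
        # Two tasks (pickup and delivery) per shipment.
        problem["shipments"] = [
            {"pickup": step(2 * i + 1), "delivery": step(2 * i + 2)}
            for i in range(n_tasks // 2)
        ]
    else:
        problem["jobs"] = [step(i) for i in range(1, n_tasks + 1)]
    return problem


jobs_problem = random_problem()
shipments_problem = random_problem(shipments=True)
payload = json.dumps(jobs_problem)  # ready to be written to an input file
```

How many tasks end up unassigned depends on these parameters and on the routing data used, so the values would need tuning to reproduce the "roughly half unassigned" setting.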

| Instance \ x= | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 600 jobs | 1.4 | -1.6 | -4.9 | -6.1 | -2.5 | -8.5 |
| 300 shipments | -1.6 | -42.7 | -65.4 | -67.6 | -71.9 | -74.6 |

Of course that's only a couple of instances, but modulo computing time variations across runs, this is pretty much in line with previous tests and expectations.

@jcoupey jcoupey merged commit 4c0a788 into VROOM-Project:master Jan 13, 2021
@jcoupey
Collaborator

jcoupey commented Jan 15, 2021

@krypt-n about the other potential optimizations you first listed, feel free to open a dedicated ticket if it makes sense, if only as a way to track the idea.

Development

Successfully merging this pull request may close these issues.

Improve try_job_additions
2 participants