Skip to content

Benchmarks

Julien Coupey edited this page May 31, 2022 · 28 revisions

All computations done on a machine using an Intel Xeon E5-1620 CPU @ 3.50GHz, 4c/8t.

VRPTW

Solomon

The Solomon benchmark is used as a reference in most research papers including time window constraints.

All required instances and scripts are available to reproduce results.

Benchmark description

56 instances with 100 jobs, grouped in classes C (clustered), R (random), RC (mix of both), and 1 or 2 for short or long planning horizon.

Global indicators

Using v1.12.0 with -x 5 and comparing to best known solutions targeting cumulated travel time (more on objectives).

Computing times (ms)
Average 359
Median 382
Longest 716
Quality
Best known solutions 14 (25.0%)
Minimum gap +0.00%
Median gap +1.12%
Average gap +1.63%
Worst gap +7.54%

Results by problem class

Comparison to best known solutions (BKS) targeting cumulated travel time (CTT). Cumulated number of vehicles (CNV) is only reported for the record.

Class C1 C2 R1 R2 RC1 RC2 Total
BKS CNV 10.0 3.0 13.25 5.36 12.88 6.25 485
BKS CTT 828.38 589.86 1175.75 878.41 1340.02 1004.21 54,699
VROOM CNV 10.0 3.0 13.25 4.27 13.25 5.25 468
VROOM CTT 828.38 589.86 1191.99 912.60 1356.76 1030.75 55,616
CTT gap +0.00% +0.00% +1.38% +3.89% +1.25% +2.64% +1.68

MDVRP

Cordeau

All required instances and scripts are available to reproduce results on the Cordeau et al. (1997) dataset.

Benchmark description

33 instances with capacity constraints and a number of jobs ranging from 48 to 360 (average 176), with vehicles starting from 2 to 9 depots.

Results

Using v1.12.0 with -x 5.

Computing times (ms)
Average 1080
Median 451
Longest 4092
Quality
Best known solutions 5 (15.2%)
Minimum gap +0.00%
Median gap +1.20%
Average gap +1.57%
Worst gap +4.90%

PDPTW

Li & Lim

The Li & Lim 100 benchmark is derived from the Solomon instances with the addition of pickup and delivery constraints.

All required instances and scripts are available to reproduce results.

Benchmark description

56 instances with around 100 jobs, grouped in classes C (clustered), R (random), RC (mix of both), and 1 or 2 for short or long planning horizon.

Global indicators

Using v1.12.0 with -x 5.

Computing times

Computing times (ms)
Average 142
Median 138
Longest 275

Quality

We target cumulated travel time (CTT), but best known results are only reported for approaches using the number of vehicles as the first optimization objective. For a meaningful comparison we rule out instances lc103, lc104 and lc109 for which we provide solutions that are better in term of CTT but use one more vehicle than the reported "best known solution".

CTT comparison
Best known solutions 24 (46.2%)
Minimum gap +0.00%
Median gap +0.00%
Average gap +1.25%
Worst gap +10.62%

CVRP

CVRPLIB

CVRPLIB is a set of benchmark instances for single-depot CVRP. We use all instances described using euclidean distance (EDGE_WEIGHT_TYPE = EUC_2D), from classes A, B, E, F, M, P and X.

All required instances and scripts are available to reproduce results.

Benchmark description

  • 189 instances
  • Sizes ranging from 15 to 1,000 jobs
  • Average size around 240 jobs
  • Number of vehicles ranging from 2 to 207
  • Average capacity tightness (total job demand / total vehicle capacity): 0.95

Global indicators

Using v1.12.0.

Exploration level 0 (fastest) 5 (best)
Median computing time 39ms 505ms
Longest computing time 2.3s 110.1s
Jobs assigned 99.6% 99.9%
Solutions with all jobs 158 (83.6%) 171 (90.5%)
Best known solutions 14 (7.4%) 52 (27.5%)

Gaps to best known solutions

Only reported for instances with all jobs assigned, so cost comparison makes sense.

Exploration level 0 (fastest) 5 (best)
Minimum gap +0.00% +0.00%
Median gap +2.40% +0.68%
Average gap +2.65% +1.30%
Worst gap +13.71% +9.37%

TSP

TSPLIB

TSPLIB is the reference benchmark for TSP. We use all instances described using euclidean distance (EDGE_WEIGHT_TYPE = EUC_2D).

All required instances and scripts are available to reproduce results.

Benchmark description

  • 78 instances
  • Sizes ranging from 50 to 18,511 points
  • Average size around 1,170 points

Global indicators

Using v1.8.0.

  • Median computing time: 35 ms
  • Average gap to optimal solution: +2.69%
  • Worst gap to optimal solution: +6.21%

Examples

Instance Size Computing time Gap
kroB100 100 5ms +0.26%
a280 280 18ms +1.12%
rat575 575 61ms +3.35%
u1060 1,060 294ms +3.04%
u2152 2,152 1824ms +4.77%
rl5915 5,915 25s +1.92%
usa13509 13,509 14m18s +2.81%
d18512 18,512 35m37s +2.99%