Welcome to PytorchAutoDrive benchmark

The FLOPs and parameter counts in this benchmark rely entirely on thop to identify the underlying basic ops, so they might be inaccurate. A FLOPs count is only an estimate to begin with, though. The aim here is simply to provide a relatively fair benchmark for comparing different methods.
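To make "counting basic ops" concrete, here is a minimal sketch of how an op counter might tally a single convolution layer. It is not thop's actual code: the function names and the example layer shape are hypothetical, and bias additions are ignored in the operation count. Tools also differ on whether one multiply-accumulate counts as one or two FLOPs, which is one reason such figures are best treated as estimates.

```python
def conv2d_macs(in_channels, out_channels, kernel_size, out_height, out_width, groups=1):
    """Multiply-accumulate count for one 2D convolution (bias ignored)."""
    kernel_h, kernel_w = kernel_size
    # Each output value needs one MAC per input channel (within its group)
    # and per kernel position.
    macs_per_output = (in_channels // groups) * kernel_h * kernel_w
    return macs_per_output * out_channels * out_height * out_width


def conv2d_params(in_channels, out_channels, kernel_size, groups=1, bias=True):
    """Learnable parameter count for one 2D convolution."""
    kernel_h, kernel_w = kernel_size
    weights = out_channels * (in_channels // groups) * kernel_h * kernel_w
    return weights + (out_channels if bias else 0)


# Hypothetical layer: 3x3 conv, 64 -> 128 channels, 90 x 160 output map.
macs = conv2d_macs(64, 128, (3, 3), 90, 160)
params = conv2d_params(64, 128, (3, 3))
print(f"{macs / 1e9:.3f} GMACs, {params / 1e6:.3f} M params")
```

In this sketch the operation count grows with the output resolution while the parameter count does not, which matches the pattern in the tables below: the same model at a larger input has more FLOPs but identical Params.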

Lane detection performance

| method | backbone | resolution (H x W) | FPS | FLOPs (G) | Params (M) |
|---|---|---|---|---|---|
| Baseline | VGG16 | 360 x 640 | 56.36 | 214.50 | 20.37 |
| Baseline | ResNet18 | 360 x 640 | 148.59 | 85.24 | 12.04 |
| Baseline | ResNet34 | 360 x 640 | 79.97 | 159.60 | 22.15 |
| Baseline | ResNet50 | 360 x 640 | 50.58 | 177.62 | 24.57 |
| Baseline | ResNet101 | 360 x 640 | 27.41 | 314.36 | 43.56 |
| Baseline | ERFNet | 360 x 640 | 85.87 | 26.32 | 2.67 |
| Baseline | ENet | 360 x 640 | 56.63 | 4.26 | 0.95 |
| Baseline | MobileNetV2 | 360 x 640 | 126.54 | 4.49 | 2.06 |
| Baseline | MobileNetV3-Large | 360 x 640 | 104.34 | 3.63 | 3.30 |
| SCNN | VGG16 | 360 x 640 | 21.18 | 218.64 | 20.96 |
| SCNN | ResNet18 | 360 x 640 | 21.12 | 89.38 | 12.63 |
| SCNN | ResNet34 | 360 x 640 | 20.77 | 163.74 | 22.74 |
| SCNN | ResNet50 | 360 x 640 | 19.59 | 181.76 | 25.16 |
| SCNN | ResNet101 | 360 x 640 | 13.50 | 318.50 | 44.15 |
| SCNN | ERFNet | 360 x 640 | 18.40 | 30.46 | 3.26 |
| LSTR | ResNet18s | 360 x 640 | 98.13 | 1.15 | 0.77 |
| LSTR | ResNet18s-2x | 360 x 640 | 97.27 | 4.05 | 3.05 |
| LSTR | ResNet18s | 1080 x 1920 | 91.23 | 10.20 | 0.77 |
| LSTR | ResNet18s | 2160 x 4320 | 23.60 | 40.75 | 0.77 |
| LSTR | ResNet34 | 360 x 640 | 63.52 | 34.54 | 22.34 |
| RESA | ResNet18 | 360 x 640 | 67.66 | 61.35 | 6.61 |
| RESA | ResNet34 | 360 x 640 | 54.49 | 101.74 | 11.99 |
| RESA | ResNet50 | 360 x 640 | 44.80 | 105.71 | 12.46 |
| RESA | ResNet101 | 360 x 640 | 25.14 | 242.45 | 31.46 |
| RESA | MobileNetV2 | 360 x 640 | 60.53 | 12.80 | 4.63 |
| RESA | MobileNetV3-Large | 360 x 640 | 54.39 | 11.95 | 5.88 |
| LaneATT | ResNet18 | 360 x 640 | 198.29 | 18.67 | 12.02 |
| LaneATT | ResNet34 | 360 x 640 | 133.84 | 36.01 | 22.12 |
| BézierLaneNet | ResNet18 | 360 x 640 | 212.83 | 14.77 | 4.10 |
| BézierLaneNet | ResNet34 | 360 x 640 | 149.52 | 29.85 | 9.49 |
| Baseline | VGG16 | 288 x 800 | 55.31 | 214.50 | 20.15 |
| Baseline | ResNet18 | 288 x 800 | 136.28 | 85.22 | 11.82 |
| Baseline | ResNet34 | 288 x 800 | 72.42 | 159.60 | 21.93 |
| Baseline | ResNet50 | 288 x 800 | 49.41 | 177.60 | 24.35 |
| Baseline | ResNet101 | 288 x 800 | 27.19 | 314.34 | 43.34 |
| Baseline | ERFNet | 288 x 800 | 88.76 | 26.26 | 2.68 |
| Baseline | ENet | 288 x 800 | 57.99 | 4.12 | 0.96 |
| Baseline | MobileNetV2 | 288 x 800 | 129.24 | 4.41 | 2.00 |
| Baseline | MobileNetV3-Large | 288 x 800 | 107.83 | 3.56 | 3.25 |
| Baseline | RepVGG-A0 | 288 x 800 | 162.61 | 207.81 | 9.06 |
| Baseline | RepVGG-A1 | 288 x 800 | 117.30 | 339.83 | 13.54 |
| Baseline | RepVGG-B0 | 288 x 800 | 103.68 | 390.83 | 15.09 |
| Baseline | RepVGG-B1g2 | 288 x 800 | 36.91 | 1166.76 | 42.20 |
| Baseline | RepVGG-B2 | 288 x 800 | 18.98 | 2310.13 | 81.23 |
| Baseline | Swin-Tiny | 288 x 800 | 51.90 | 44.24 | 27.72 |
| SCNN | VGG16 | 288 x 800 | 21.40 | 218.62 | 20.74 |
| SCNN | ResNet18 | 288 x 800 | 20.80 | 89.34 | 12.42 |
| SCNN | ResNet34 | 288 x 800 | 19.77 | 163.72 | 22.52 |
| SCNN | ResNet50 | 288 x 800 | 18.88 | 181.72 | 24.94 |
| SCNN | ResNet101 | 288 x 800 | 13.42 | 318.46 | 43.94 |
| SCNN | ERFNet | 288 x 800 | 18.80 | 30.40 | 3.27 |
| SCNN | RepVGG-A1 | 288 x 800 | 20.53 | 343.96 | 14.13 |
| RESA | ResNet18 | 288 x 800 | 69.58 | 61.33 | 6.62 |
| RESA | ResNet34 | 288 x 800 | 55.61 | 101.72 | 12.01 |
| RESA | ResNet50 | 288 x 800 | 46.75 | 105.70 | 12.48 |
| RESA | ResNet101 | 288 x 800 | 26.08 | 242.44 | 31.47 |
| RESA | MobileNetV2 | 288 x 800 | 59.49 | 12.55 | 4.63 |
| RESA | MobileNetV3-Large | 288 x 800 | 53.85 | 11.70 | 5.88 |
| LSTR | ResNet34 | 288 x 800 | 65.39 | 33.86 | 22.34 |
| BézierLaneNet | ResNet18 | 288 x 800 | 210.79 | 14.66 | 4.10 |
| BézierLaneNet | ResNet34 | 288 x 800 | 144.65 | 29.54 | 9.49 |

Segmentation performance

| method | resolution (H x W) | FPS | FLOPs (G) | Params (M) |
|---|---|---|---|---|
| FCN | 256 x 512 | 43.32 | 216.42 | 51.95 |
| FCN | 512 x 1024 | 12.06 | 865.69 | 51.95 |
| FCN | 1024 x 2048 | 3.06 | 3462.77 | 51.95 |
| ERFNet | 256 x 512 | 91.20 | 15.03 | 2.07 |
| ERFNet | 512 x 1024 | 85.51 | 60.11 | 2.07 |
| ERFNet | 1024 x 2048 | 21.53 | 240.44 | 2.07 |
| ENet | 256 x 512 | 59.31 | 2.72 | 0.35 |
| ENet | 512 x 1024 | 55.69 | 10.88 | 0.35 |
| ENet | 1024 x 2048 | 30.88 | 43.53 | 0.35 |
| DeeplabV2 | 256 x 512 | 44.87 | 180.59 | 43.90 |
| DeeplabV2 | 512 x 1024 | 12.93 | 722.37 | 43.90 |
| DeeplabV2 | 1024 x 2048 | 3.23 | 2889.49 | 43.90 |
| DeeplabV3 | 256 x 512 | 35.26 | 241.65 | 58.63 |
| DeeplabV3 | 512 x 1024 | 10.26 | 966.61 | 58.63 |
| DeeplabV3 | 1024 x 2048 | 2.56 | 3866.45 | 58.63 |

All results are the maximum over 3 runs on an RTX 2080 Ti.

Lane detection post-processing is not counted.

LaneATT NMS is not counted yet.
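The "maximum over 3 runs" protocol can be sketched as below. This is a minimal illustration, not the code in tools/profiling.py: `measure_fps`, `best_of`, the warm-up count and the dummy workload are all assumptions made for the example. When timing a real model on a GPU, the device would also need to be synchronized before each timer read, since kernel launches are asynchronous.

```python
import time


def measure_fps(run_once, iterations=100, warmup=10):
    """Time `iterations` calls of `run_once` and return calls per second."""
    # Warm-up calls are excluded from timing so one-off setup costs
    # do not distort the measurement.
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def best_of(run_once, times=3, **kwargs):
    """Repeat the measurement `times` times and keep the highest FPS."""
    return max(measure_fps(run_once, **kwargs) for _ in range(times))


def dummy_forward():
    # Stand-in for a model forward pass (hypothetical workload).
    sum(i * i for i in range(1000))


fps = best_of(dummy_forward, times=3, iterations=50, warmup=5)
print(f"best of 3: {fps:.2f} FPS")
```

Taking the maximum rather than the mean reports the least-disturbed run, which reduces the influence of background load on the machine.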

Profiling Models Yourself

With mode=simple, a random tensor stands in for the real image. This bypasses the DataLoader, so the measurement reflects the best achievable speed of the model itself.

This is also the setting for the above benchmark.

python tools/profiling.py --mode=simple \
                          --config=<config file path> \
                          --times=3 \
                          --height=<image height in pixels> \
                          --width=<image width in pixels>

The config mechanism and the command-line overrides via --cfg-options are the same as in training/testing.

With mode=real, we set 'batch_size=1' and 'num_workers=0' in the DataLoader to simulate a real camera transmitting frames to the model. Just use --mode=real, and you will probably want to provide an actual model via --checkpoint.

For detailed instructions and the available command-line shortcuts, run:

python tools/profiling.py --help