@inproceedings{xiao2018unified,
title={Unified perceptual parsing for scene understanding},
author={Xiao, Tete and Liu, Yingcheng and Zhou, Bolei and Jiang, Yuning and Sun, Jian},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
pages={418--434},
year={2018}
}

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UPerNet | R-50 | 512x1024 | 40000 | 6.4 | 4.25 | 77.10 | 78.37 | model \| log |
| UPerNet | R-101 | 512x1024 | 40000 | 7.4 | 3.79 | 78.69 | 80.11 | model \| log |
| UPerNet | R-50 | 769x769 | 40000 | 7.2 | 1.76 | 77.98 | 79.70 | model \| log |
| UPerNet | R-101 | 769x769 | 40000 | 8.4 | 1.56 | 79.03 | 80.77 | model \| log |
| UPerNet | R-50 | 512x1024 | 80000 | - | - | 78.19 | 79.19 | model \| log |
| UPerNet | R-101 | 512x1024 | 80000 | - | - | 79.40 | 80.46 | model \| log |
| UPerNet | R-50 | 769x769 | 80000 | - | - | 79.39 | 80.92 | model \| log |
| UPerNet | R-101 | 769x769 | 80000 | - | - | 80.10 | 81.49 | model \| log |

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UPerNet | R-50 | 512x512 | 80000 | 8.1 | 23.40 | 40.70 | 41.81 | model \| log |
| UPerNet | R-101 | 512x512 | 80000 | 9.1 | 20.34 | 42.91 | 43.96 | model \| log |
| UPerNet | R-50 | 512x512 | 160000 | - | - | 42.05 | 42.78 | model \| log |
| UPerNet | R-101 | 512x512 | 160000 | - | - | 43.82 | 44.85 | model \| log |

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UPerNet | R-50 | 512x512 | 20000 | 6.4 | 23.17 | 74.82 | 76.35 | model \| log |
| UPerNet | R-101 | 512x512 | 20000 | 7.5 | 19.98 | 77.10 | 78.29 | model \| log |
| UPerNet | R-50 | 512x512 | 40000 | - | - | 75.92 | 77.44 | model \| log |
| UPerNet | R-101 | 512x512 | 40000 | - | - | 77.43 | 78.56 | model \| log |