NMS with CUDA only #1824

grimoire · 2022-03-22T07:34:57Z

This PR adds a CUDA kernel for NMS so that no part of the computation runs on the CPU.

I am not sure whether this should be called an "optimization": on small inputs, CPU performance might even be better than GPU performance, and the data distribution also affects performance.
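For readers unfamiliar with mask-based NMS, here is a minimal pure-Python sketch of its two stages. It is not the kernel from this PR. It assumes boxes are `(x1, y1, x2, y2)` tuples already sorted by descending score. The names `iou`, `build_mask`, and `nms` and the `thr` parameter are illustrative, and `gather_keep_from_mask` is only borrowed from the commit title.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def build_mask(boxes, thr):
    """Stage 1: mask[i][j] is True when box j (j > i) overlaps box i above thr.

    Every (i, j) pair is independent of every other pair, so this stage
    maps naturally onto many parallel threads.
    """
    n = len(boxes)
    return [[j > i and iou(boxes[i], boxes[j]) > thr for j in range(n)]
            for i in range(n)]


def gather_keep_from_mask(mask):
    """Stage 2: walk boxes in score order; each kept box suppresses its overlaps."""
    n = len(mask)
    removed = [False] * n
    keep = []
    for i in range(n):
        if removed[i]:
            continue  # already suppressed by a higher-scoring kept box
        keep.append(i)
        for j in range(i + 1, n):
            if mask[i][j]:
                removed[j] = True
    return keep


def nms(boxes, thr=0.5):
    """Return indices of the boxes that survive suppression."""
    return gather_keep_from_mask(build_mask(boxes, thr))


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
print(nms(boxes))  # [0, 2]: box 1 overlaps box 0 with IoU 81/119
```

The second stage has a sequential dependency: whether box `i` is kept depends on which earlier boxes were kept. That makes it the harder stage to run on a GPU.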

envs:

Device: RTX 2070 SUPER
CUDA: 11.3
nvidia driver: 465.19.01
CPU: Intel(R) Core(TM) i7-9700KF CPU @ 3.60GHz

The test data comes from Faster R-CNN run on the demo image. NMS in both the RPN and the bbox head is tested.

input	old (before this PR)	new (this PR)
RPN data (4741 boxes)	0.98435ms	0.88798ms
R-CNN data (583 boxes)	0.23121ms	0.22827ms
random data (500 boxes)	0.23056ms	0.29762ms
random data (1000 boxes)	0.26279ms	0.41685ms
random data (5000 boxes)	1.27405ms	1.34129ms
random data (10000 boxes)	3.62377ms	2.74194ms
random data (20000 boxes)	21.07619ms	4.80819ms

Real data tends to contain clustered bboxes, which reduces the GPU computation. I guess that is why performance on real data is better than on random data.
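One way to picture the clustering effect is a sequential greedy NMS that counts its IoU evaluations. This is only an analogy under stated assumptions, not a model of the CUDA kernel in this PR: the function name `greedy_nms_with_count`, the synthetic box layouts, and the 0.5 threshold are all hypothetical. When boxes cluster, one kept box suppresses many others, and suppressed boxes never need their own pass.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms_with_count(boxes, thr=0.5):
    """Greedy NMS over score-sorted boxes; also returns the IoU evaluation count."""
    n = len(boxes)
    removed = [False] * n
    keep = []
    evals = 0
    for i in range(n):
        if removed[i]:
            continue  # a suppressed box never triggers its own pass
        keep.append(i)
        for j in range(i + 1, n):
            if removed[j]:
                continue  # already suppressed, no need to compare again
            evals += 1
            if iou(boxes[i], boxes[j]) > thr:
                removed[j] = True
    return keep, evals


n = 100
# Clustered: 100 nearly identical boxes, each shifted by 0.01 from the previous one.
clustered = [(0.01 * k, 0.01 * k, 10 + 0.01 * k, 10 + 0.01 * k) for k in range(n)]
# Spread: 100 boxes laid out side by side with gaps, so no two overlap.
spread = [(20.0 * k, 0.0, 20.0 * k + 10.0, 10.0) for k in range(n)]

keep_c, evals_c = greedy_nms_with_count(clustered)
keep_s, evals_s = greedy_nms_with_count(spread)
print(len(keep_c), evals_c)  # 1 kept, 99 evaluations
print(len(keep_s), evals_s)  # 100 kept, 4950 evaluations
```

In this toy setup the clustered input needs about 50 times fewer comparisons than the spread input for the same number of boxes. Whether the kernel benefits in the same way depends on how it skips suppressed boxes, which this sketch does not model.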

mmcv/ops/csrc/pytorch/cuda/nms_cuda.cu

mmcv/ops/csrc/common/cuda/nms_cuda_kernel.cuh

grimoire added 4 commits March 21, 2022 19:08

add gather_keep_from_mask_parallize

d5a772c

remove unused cache

2b18a47

move syncthread

03765c3

remove unused comment

7c30a5b

mm-assistant bot added the size/XS label Mar 22, 2022

zhouzaida requested review from teamwong111 and ZwwWayne March 29, 2022 02:26

zhouzaida added the Op label Apr 7, 2022

ZwwWayne reviewed Apr 10, 2022

mmcv/ops/csrc/pytorch/cuda/nms_cuda.cu (outdated, resolved)

ZwwWayne reviewed Apr 10, 2022

mmcv/ops/csrc/pytorch/cuda/nms_cuda.cu (resolved)

ZwwWayne reviewed Apr 10, 2022

mmcv/ops/csrc/common/cuda/nms_cuda_kernel.cuh (outdated, resolved)

add more comments, rename the kernel and variable

b47661d

zhouzaida assigned ZwwWayne and teamwong111 and unassigned ZwwWayne Apr 13, 2022

ZwwWayne approved these changes Apr 15, 2022

ZwwWayne merged commit 74031cc into open-mmlab:master Apr 15, 2022
