Too high GRAM consumption for spconv indexing #983

zhanggefan · 2021-10-07T11:24:15Z

I am trying to reimplement PV-RCNN using this codebase, and I found that the GRAM (GPU memory) consumption during training on KITTI is unacceptably high compared to OpenPCDet. GRAM consumption surges by approximately 4 GB the first time get_indice_pairs is called.

After some investigation I found that the spconv used by MMDet3D is a customized version, and that the hardcoded 4096 in the following template arguments seems to be the root cause. Changing them to 27 solves the problem, but in an ugly way: any conv kernel larger than 3x3x3 would then fail.

mmdetection3d/mmdet3d/ops/spconv/src/indice_cuda.cu

Line 48 in e5a87f3

prepareDeConvIndicePairsKernel<Index, IndexGrid, NDim, 4096>

mmdetection3d/mmdet3d/ops/spconv/src/indice_cuda.cu

Line 54 in e5a87f3

prepareIndicePairsKernel<Index, IndexGrid, NDim, 4096>

mmdetection3d/mmdet3d/ops/spconv/src/indice_cuda.cu

Line 119 in e5a87f3

getSubMIndicePairsKernel<Index, IndexGrid, NDim, 4096>
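To put rough numbers on the difference between 4096 and 27, here is a minimal sketch of the per-thread scratch size this template argument would imply. It assumes the last template argument bounds a per-thread array of `KernelMaxVolume * (NDim + 1)` indices, which is an assumption and not something quoted in this thread; the helper name `scratchBytesPerThread` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes of per-thread scratch, ASSUMING each thread
// holds an array of kernelMaxVolume * (ndim + 1) indices. That layout is
// an assumption made for illustration, not taken from the sources above.
inline std::size_t scratchBytesPerThread(std::size_t indexSize,
                                         unsigned ndim,
                                         int kernelMaxVolume) {
  assert(kernelMaxVolume > 0);
  return indexSize * static_cast<std::size_t>(kernelMaxVolume) * (ndim + 1u);
}

// With 32-bit indices and NDim = 3:
//   bound 4096 -> 4 * 4096 * 4 = 65536 bytes per thread
//   bound 27   -> 4 * 27 * 4   = 432 bytes per thread
inline std::size_t bytesAt4096() {
  return scratchBytesPerThread(sizeof(std::int32_t), 3, 4096);
}
inline std::size_t bytesAt27() {
  return scratchBytesPerThread(sizeof(std::int32_t), 3, 27);
}
```

If that assumption holds, the per-thread footprint shrinks by a factor of about 150 when the bound drops to 27, which would be consistent with the surge disappearing. How much device memory the runtime actually reserves for such arrays depends on the GPU and the launch configuration, so this is only an order-of-magnitude argument.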


ZwwWayne · 2021-10-12T07:14:21Z

The spconv in MMDet3D is adapted from an early version (around 1.0) of spconv. The kernel settings have not been changed since.

Do we have a more flexible way to handle this case? In the CUDA operators in MMCV, numbers like these are usually not hardcoded. If we have a flexible way to handle these numbers, we can create a PR to fix that.
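One possible direction, sketched here under stated assumptions and not as the way MMCV or upstream spconv does it: pick the smallest compiled bound that covers the runtime kernel volume, so 3x3x3 kernels get a bound of 27 while larger kernels still work. The names `launchWithBound` and `dispatchByKernelVolume` and the list of bounds (27, 125, 343, 4096) are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a launcher templated on the scratch bound.
// A real version would launch the CUDA kernel with KernelMaxVolume as its
// last template argument; here it only reports which bound was selected.
template <int KernelMaxVolume>
int launchWithBound(int kernelVolume) {
  assert(kernelVolume <= KernelMaxVolume);
  return KernelMaxVolume;
}

// Map the runtime kernel volume (e.g. 3 * 3 * 3 = 27) to the smallest
// compiled bound that can hold it. The set of bounds is an assumption.
inline int dispatchByKernelVolume(int kernelVolume) {
  if (kernelVolume <= 0) {
    throw std::invalid_argument("kernel volume must be positive");
  }
  if (kernelVolume <= 27) return launchWithBound<27>(kernelVolume);      // up to 3x3x3
  if (kernelVolume <= 125) return launchWithBound<125>(kernelVolume);    // up to 5x5x5
  if (kernelVolume <= 343) return launchWithBound<343>(kernelVolume);    // up to 7x7x7
  if (kernelVolume <= 4096) return launchWithBound<4096>(kernelVolume);  // legacy bound
  throw std::invalid_argument("kernel volume exceeds the largest compiled bound");
}
```

The trade-off is a few extra template instantiations (longer compile time, larger binary) in exchange for keeping the small bound on the common 3x3x3 path.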

BTW, we are migrating CUDA operators to MMCV, feel free to have a look or review it if you are interested.

ZwwWayne · 2021-10-12T07:19:21Z

The progress can be found here: #994

Referencing open-mmlab/mmdetection3d#983: MMDetection3D's spconv implementation previously could not fit PV-RCNN with batch size 2 into a 2080 Ti. With this change, it now can. However, as mentioned in the issue, this limits kernel sizes to 3x3x3 (which is not an issue for OpenPCDet models).

Tai-Wang added the "enhancement" (New feature or request) label Oct 12, 2021

Tai-Wang assigned ZwwWayne Oct 12, 2021

Divadi mentioned this issue Apr 14, 2022

CUDA out of memory for CenterPoint even with batch size 1. #1398

Closed
