Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds (CVPR2022)[paper]
Authors: Chenhang He, Ruihuang Li, Shuai Li, Lei Zhang.
This project is built on OpenPCDet.
2022-04-09: Added Waymo config and multi-frame input.
The performance of VoxSeT (single-stage, single-frame) on the Waymo validation split is as follows.
Difficulty | % Training | Car AP/APH | Ped AP/APH | Cyc AP/APH | Log file |
---|---|---|---|---|---|
Level 1 | 20% | 72.10/71.59 | 77.94/69.58 | 69.88/68.54 | Download |
Level 2 | 20% | 63.62/63.17 | 70.20/62.51 | 67.31/66.02 | |
Level 1 | 100% | 74.50/74.03 | 80.03/72.42 | 71.56/70.29 | Download |
Level 2 | 100% | 65.99/65.56 | 72.45/65.39 | 68.95/67.73 | |
- Linux (tested on Ubuntu 16.04)
- Python 3.7
- PyTorch 1.9 or higher (tested on PyTorch 1.10.1)
- CUDA 9.0 or higher (tested on CUDA 10.2)
pip install -r requirements.txt
python setup.py build_ext --inplace
The torch_scatter package is also required.
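As a quick sanity check before building, the requirements listed above can be compared against the local environment. The following is a minimal sketch, not part of this repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the minimum versions are simply the ones stated in the list.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Turn a string such as '1.10.1+cu102' into (1, 10, 1).

    Anything after '+' is ignored and missing fields are padded with zeros.
    """
    numbers = [int(n) for n in re.findall(r"\d+", text.split("+")[0])][:3]
    return tuple(numbers + [0] * (3 - len(numbers)))


def meets_minimum(found, minimum):
    """True when version string `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def check_environment():
    """Return a list of human-readable notes; an empty list means nothing was flagged."""
    notes = []
    if sys.version_info[:2] != (3, 7):
        notes.append("README lists Python 3.7, found %d.%d" % sys.version_info[:2])
    for module in ("torch", "torch_scatter"):
        # find_spec only locates the package; it does not import it.
        if importlib.util.find_spec(module) is None:
            notes.append("package not found: " + module)
    if importlib.util.find_spec("torch") is not None:
        import torch

        if not meets_minimum(torch.__version__, "1.9"):
            notes.append("PyTorch 1.9 or higher expected, found " + torch.__version__)
        # torch.version.cuda is None for CPU-only builds.
        if torch.version.cuda is None or not meets_minimum(torch.version.cuda, "9.0"):
            notes.append("CUDA 9.0 or higher expected, found %s" % torch.version.cuda)
    return notes


if __name__ == "__main__":
    for note in check_environment():
        print(note)
```

A Python version other than 3.7 is only reported as a note, since the list above names the tested version rather than a strict bound.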
- Prepare KITTI dataset and road planes
# Download KITTI and organize it into the following form:
├── data
│ ├── kitti
│ │ │── ImageSets
│ │ │── training
│ │ │ ├──calib & velodyne & label_2 & image_2 & (optional: planes)
│ │ │── testing
│ │ │ ├──calib & velodyne & image_2
# Generate data infos:
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml
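Before generating the data infos, it can help to confirm that the folders match the layout shown above. The sketch below is an illustration only: `missing_kitti_dirs` and `EXPECTED` are hypothetical names, the check assumes the `data` folder sits directly under the given root, and the optional `planes` folder is not checked.

```python
from pathlib import Path

# Sub-folders named in the layout above; "planes" is optional and skipped here.
EXPECTED = {
    "training": ("calib", "velodyne", "label_2", "image_2"),
    "testing": ("calib", "velodyne", "image_2"),
}


def missing_kitti_dirs(root):
    """Return the expected directories that are absent under <root>/data/kitti."""
    root = Path(root)
    kitti = root / "data" / "kitti"
    wanted = [kitti / "ImageSets"]
    for split, subdirs in EXPECTED.items():
        wanted.extend(kitti / split / name for name in subdirs)
    # Report paths relative to the root, with forward slashes on every platform.
    return [p.relative_to(root).as_posix() for p in wanted if not p.is_dir()]


if __name__ == "__main__":
    for path in missing_kitti_dirs("."):
        print("missing:", path)
```

Run from the repository root, the script prints one line per missing folder and nothing when the layout is complete.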
You can download the pretrained model here and the log file here.
The performance (using 11 recall positions) on the KITTI validation set is as follows:
Car AP@0.70, 0.70, 0.70:
bev AP:90.1572, 88.0972, 86.8397
3d AP:88.8694, 78.7660, 77.5758
Pedestrian AP@0.50, 0.50, 0.50:
bev AP:63.1125, 58.5591, 55.1318
3d AP:60.2515, 55.5535, 50.1888
Cyclist AP@0.50, 0.50, 0.50:
bev AP:85.6768, 71.9008, 67.1551
3d AP:85.4238, 70.2774, 64.9804
The runtime is about 33 ms per sample.
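For readers unfamiliar with the "11 recall positions" metric used for the KITTI numbers above, the following sketch shows the usual 11-point interpolated average precision. It is a generic illustration, not the evaluation code shipped with OpenPCDet, which may differ in details such as how the precision-recall curve is sampled; `ap_11_point` is a hypothetical name.

```python
def ap_11_point(recalls, precisions):
    """11-point interpolated AP.

    For each recall threshold r in {0.0, 0.1, ..., 1.0}, take the highest
    precision among operating points whose recall is at least r (0 if none),
    then average the 11 values.
    """
    total = 0.0
    for i in range(11):
        threshold = i / 10.0
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(candidates) if candidates else 0.0
    return total / 11.0


if __name__ == "__main__":
    # Two operating points: precision 1.0 at recall 0.5, precision 0.5 at recall 1.0.
    print(ap_11_point([0.5, 1.0], [1.0, 0.5]))
```

In this toy case the six thresholds up to 0.5 contribute a precision of 1.0 and the remaining five contribute 0.5, giving 8.5 / 11. The KITTI figures above are reported as percentages, that is, this value multiplied by 100.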
- Train with a single GPU
python train.py --cfg_file tools/cfgs/kitti_models/voxset.yaml
- Train with multiple GPUs
cd VoxSeT/tools
bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file ./cfgs/kitti_models/voxset.yaml
cd VoxSeT/tools
python test.py --cfg_file ./cfgs/kitti_models/voxset.yaml --ckpt ${CKPT_FILE}
@inproceedings{he2022voxset,
title={Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds},
author={He, Chenhang and Li, Ruihuang and Li, Shuai and Zhang, Lei},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
year={2022}
}