Authors: Anshul Paigwar, David Sierra-Gonzalez, Ozgur Erkent, Christian Laugier
This repository is code release for our GndNet paper published in IEEE International Conference of Computer Vision, ICCV'2021, Workshop on Autonomous Vehicle Vision. Link
Accurate 3D object detection is a key part of the perception module for autonomous vehicles. A better understanding of the objects in 3D facilitates better decision-making and path planning. RGB Cameras and LiDAR are the most commonly used sensors in autonomous vehicles for environment perception. Many approaches have shown promising results for 2D detection with RGB Images, but efficiently localizing small objects like pedestrians in the 3D point cloud of large scenes has remained a challenging area of research. We propose a novel method, Frustum-PointPillars, for 3D object detection using LiDAR data. Instead of solely relying on point cloud features, we leverage the mature field of 2D object detection to reduce the search space in the 3D space. Then, we use the Pillar Feature Encoding network for object localization in the reduced point cloud. We also propose a novel approach for masking point clouds to further improve the localization of objects. We train our network on the KITTI dataset and perform experiments to show the effectiveness of our network. On the KITTI test set our method outperforms other multi-sensor SOTA approaches for 3D pedestrian localization (Bird’s Eye View) while achieving a significantly faster runtime of 14 Hz.
We would like to thank authors of PointPillars and SECOND detector. This repository is forked from nutonomy PointPillars and SECOND for KITTI object detection.
ONLY supports python 3.6+, pytorch 1.4 +. Code has only been tested on Ubuntu 18.04.
git clone https://github.com/anshulpaigwar/Frustum-Pointpillars.git
You can use pip or Anaconda package manager to install following packages.
pip install --upgrade pip
pip install fire tensorboardX shapely pybind11 protobuf scikit-image numba pillow sparsehash
Finally, install SparseConvNet. This is not required for PointPillars, but the general SECOND code base expects this to be correctly configured.
git clone git@github.com:facebookresearch/SparseConvNet.git
cd SparseConvNet/
bash build.sh
# NOTE: if bash build.sh fails, try bash develop.sh instead
Additionally, you may need to install Boost geometry:
sudo apt-get install libboost-all-dev
Add Frustum-PointPillars/ to your PYTHONPATH.
Download KITTI dataset and create some directories first:
└── KITTI_DATASET_ROOT
├── training <-- 7481 train data
| ├── image_2 <-- for visualization
| ├── calib
| ├── label_2
| ├── velodyne
| └── velodyne_reduced <-- empty directory
└── testing <-- 7580 test data
├── image_2 <-- for visualization
├── calib
├── velodyne
└── velodyne_reduced <-- empty directory
Note: PointPillar's protos use KITTI_DATASET_ROOT=/data/sets/kitti_second/
.
python create_data.py create_kitti_info_file --data_path=KITTI_DATASET_ROOT
python create_data.py create_reduced_point_cloud --data_path=KITTI_DATASET_ROOT
python create_data.py create_groundtruth_database --data_path=KITTI_DATASET_ROOT
The config file needs to be edited to point to the above datasets:
train_input_reader: {
...
database_sampler {
database_info_path: "/path/to/kitti_dbinfos_train.pkl"
...
}
kitti_info_path: "/path/to/kitti_infos_train.pkl"
kitti_root_path: "KITTI_DATASET_ROOT"
}
...
eval_input_reader: {
...
kitti_info_path: "/path/to/kitti_infos_val.pkl"
kitti_root_path: "KITTI_DATASET_ROOT"
}
cd ~/second.pytorch/second
python ./pytorch/train.py train --config_path=./configs/pointpillars/car/xyres_16.proto --model_dir=/path/to/model_dir
- If you want to train a new model, make sure "/path/to/model_dir" doesn't exist.
- If "/path/to/model_dir" does exist, training will be resumed from the last checkpoint.
- Training only supports a single GPU.
- Training uses a batchsize=2 which should fit in memory on most standard GPUs.
- On a single 1080Ti, training xyres_16 requires approximately 20 hours for 160 epochs.
cd ~/second.pytorch/second/
python pytorch/train.py evaluate --config_path= configs/pointpillars/car/xyres_16.proto --model_dir=/path/to/model_dir
- Detection result will saved in model_dir/eval_results/step_xxx.
- By default, results are stored as a result.pkl file. To save as official KITTI label format use --pickle_result=False.
ICCV workshop presentation: https://www.youtube.com/watch?v=0z7OPPRsqTk
If you find this project useful in your research, please star this GitHub repository and consider citing our work:
@INPROCEEDINGS{9607424,
author={Paigwar, Anshul and Sierra-Gonzalez, David and Erkent, Özgür and Laugier, Christian},
booktitle={2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
title={Frustum-PointPillars: A Multi-Stage Approach for 3D Object Detection using RGB Camera and LiDAR},
year={2021},
pages={2926-2933},
doi={10.1109/ICCVW54120.2021.00327}}
}
We welcome you for contributing to this repo, and feel free to contact us for any potential bugs and issues.
[1] Qi, Charles R., et al. "Pointnet: Deep learning on point sets for 3d classification and segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
[2] Lang, A. H., Vora, S., Caesar, H., Zhou, L., Yang, J., & Beijbom, O. (2019). Pointpillars: Fast encoders for object detection from point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 12697-12705).