ProFusion3D is a progressive fusion framework that combines features in both Bird's Eye View (BEV) and Perspective View (PV) at both intermediate and object query levels. Our architecture hierarchically fuses local and global features, enhancing the robustness of 3D object detection.
This repository contains the PyTorch implementation of our CoRL 2024 paper *Progressive Multi-Modal Fusion for Robust 3D Object Detection*. The repository builds on MMDetection3D.
If you find this code useful for your research, please consider citing our paper:
@inproceedings{mohan2024progressive,
  title={Progressive Multi-Modal Fusion for Robust 3D Object Detection},
  shorttitle={ProFusion3D},
  author={Mohan, Rohit and Cattaneo, Daniele and Drews, Florian and Valada, Abhinav},
  booktitle={Conference on Robot Learning (CoRL)},
  year={2024}
}
Please refer to the MMDetection3D documentation for detailed installation instructions.
Please refer to the official MMDetection3D documentation for instructions on preparing the nuScenes dataset: [MMDetection3D – nuScenes Dataset Preparation](https://mmdetection3d.readthedocs.io/en/latest/datasets/nuscenes_det.html).
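As a rough sketch (paths are placeholders, and exact flags can differ between MMDetection3D versions, so consult the linked documentation), the upstream data-converter command for nuScenes typically looks like:

```shell
# Generate nuScenes info files with the MMDetection3D converter.
# Adjust --root-path and --out-dir to where your nuScenes data lives.
python tools/create_data.py nuscenes \
    --root-path ./data/nuscenes \
    --out-dir ./data/nuscenes \
    --extra-tag nuscenes
```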
# Training
bash tools/dist_train.sh /path/to/your/config 8
# Inference
bash tools/dist_test.sh /path/to/your/config /path/to/your/checkpoint.pth 8 --eval bbox
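For quick debugging on a single GPU, the standard MMDetection3D entry points can be used instead of the distributed launchers above (assuming this repository follows the upstream tool layout):

```shell
# Single-GPU training (config path is a placeholder)
python tools/train.py /path/to/your/config

# Single-GPU inference with bounding-box evaluation
python tools/test.py /path/to/your/config /path/to/your/checkpoint.pth --eval bbox
```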
Pre-trained models can be found in the model zoo.
This repository uses utility functions from other open-source projects. We especially thank the authors of: