This repository is an official implementation of LODE:
LODE: Locally Conditioned Eikonal Implicit Scene Completion from Sparse LiDAR
Pengfei Li, Ruowen Zhao, Yongliang Shi, Hao Zhao, Jirui Yuan, Guyue Zhou, Ya-Qin Zhang from the Institute for AI Industry Research (AIR), Tsinghua University.
For the complete video, click HERE.
We use the proposed model trained on the KITTI dataset to predict implicit completion results on the novel DAIR-V2X dataset. The results are impressive:
Scene completion refers to obtaining a dense scene representation from an incomplete perception of complex 3D scenes. This helps robots detect multi-scale obstacles and analyse object occlusions in scenarios such as autonomous driving. Recent advances show that implicit representation learning can be leveraged for continuous scene completion and achieved through physical constraints like Eikonal equations. However, former Eikonal completion methods only demonstrate results on watertight meshes at a scale of tens of meshes. None of them have been successfully applied to non-watertight LiDAR point clouds of large open scenes at a scale of thousands of scenes. In this paper, we propose a novel Eikonal formulation that conditions the implicit representation on localized shape priors which function as dense boundary value constraints, and demonstrate that it works on SemanticKITTI and SemanticPOSS. It can also be extended to semantic Eikonal scene completion with only small modifications to the network architecture. With extensive quantitative and qualitative results, we demonstrate the benefits and drawbacks of existing Eikonal methods, which naturally leads to the new locally conditioned formulation. Notably, we improve IoU from 31.7% to 51.2% on SemanticKITTI and from 40.5% to 48.7% on SemanticPOSS. We extensively ablate our methods and demonstrate that the proposed formulation is robust to a wide spectrum of implementation hyper-parameters.
If you find our work useful in your research, please consider citing:
@article{li2023lode,
  title={LODE: Locally Conditioned Eikonal Implicit Scene Completion from Sparse LiDAR},
  author={Li, Pengfei and Zhao, Ruowen and Shi, Yongliang and Zhao, Hao and Yuan, Jirui and Zhou, Guyue and Zhang, Ya-Qin},
  journal={arXiv preprint arXiv:2302.14052},
  year={2023}
}
CUDA=11.1
python>=3.8
PyTorch>=1.8
numpy
ninja
MinkowskiEngine
tensorboard
pyyaml
configargparse
scipy
open3d
h5py
plyfile
scikit-image
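As a reference, the pure-Python dependencies above can typically be installed with pip as shown below. PyTorch and MinkowskiEngine are better installed by following their official instructions for your CUDA version; this command is an illustration, not a setup script shipped with the repository.

pip install numpy ninja tensorboard pyyaml configargparse scipy open3d h5py plyfile scikit-image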
Download the SemanticKITTI dataset from HERE.
Download the SemanticPOSS dataset from HERE.
Unzip them into the same directory as LODE.
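For reference, a typical layout after unzipping might look like the sketch below. The top-level folder names are assumptions for illustration only; check the dataset paths configured in opt.yaml and dataio.py for what the code actually expects.

&lt;parent directory&gt;/
├── LODE/                                        # this repository
├── SemanticKITTI/dataset/sequences/00 ... 21/   # velodyne scans and labels
└── SemanticPOSS/dataset/sequences/00 ... 05/    # velodyne scans and labels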
The default configuration in our codes is used for the SemanticKITTI dataset.
The configuration for training/inference is stored in opt.yaml, which can be modified as needed.
For scene completion, run the following command for the desired task (train/valid/visualize):
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 main_sc.py --task=[task] --experiment_name=[experiment_name]
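For example, to train the scene completion model and then visualize its predictions (the experiment name sc_baseline is an arbitrary example):

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 main_sc.py --task=train --experiment_name=sc_baseline
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 main_sc.py --task=visualize --experiment_name=sc_baseline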
For semantic scene completion with extension A, run the following command for the desired task (ssc_pretrain/ssc_valid/train/valid/visualize):
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 main_ssc_a.py --task=[task] --experiment_name=[experiment_name]
Here, use ssc_pretrain/ssc_valid to train/validate the semantic extension module. Then the pre-trained model can be used to further train the whole model.
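For example, a full extension A run might first pre-train and validate the semantic extension module and then train the whole model. The experiment name ssc_a_run is an arbitrary example, and how the pre-trained module is subsequently loaded for full training depends on the settings in opt.yaml, which are not shown here:

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 main_ssc_a.py --task=ssc_pretrain --experiment_name=ssc_a_run
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 main_ssc_a.py --task=ssc_valid --experiment_name=ssc_a_run
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 main_ssc_a.py --task=train --experiment_name=ssc_a_run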
For semantic scene completion with extension B, run the following command for the desired task (train/valid/visualize):
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 main_ssc_b.py --task=[task] --experiment_name=[experiment_name]
For the SemanticPOSS dataset, change opt.yaml to opt_semanticposs.yaml, change config_file in all code files from semantic-kitti.yaml to SemanticPOSS.yaml, and change SPLIT_SEQUENCES in dataio.py to SPLIT_SEQUENCES_SPOSS.
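After applying these changes, the commands above can be run unchanged, e.g. for scene completion training on SemanticPOSS (experiment name again an arbitrary example):

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 main_sc.py --task=train --experiment_name=sc_semanticposs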
Our pre-trained models can be downloaded here:
| Table | Ablation | Checkpoints | | | | |
|-------|----------|-------------|---|---|---|---|
| Table I | Dataset | SemanticKITTI | SemanticPOSS | | | |
| Table II | Discriminative Model Structure | last1 pruning | last2 pruning | last3 pruning | last4 pruning | 4convs output |
| Table III | Generative Model Structure | width128 depth4 | width512 depth4 | width256 depth3 | width256 depth5 | Gnet relu |
| Table IV | Shape Dimension | shape 128 | shape 512 | | | |
| Table IV | Scale Size | scale 2 | scale 4 | scale 8 | scale 16 | scale 32 |
| Table V | Positional Encoding | no encoding | incF level10 | incT level5 | incT level15 | |
| Table VI | Sample Strategy | nearest | | | | |
| Table VII | Semantic Extension | semantic extension A | semantic extension B | | | |
| Table VII | Semantic Extension A Module | ssc_pretrain | | | | |
| Table | Ablation | Corresponding Configs | | | | |
|-------|----------|-----------------------|---|---|---|---|
| Table I | Dataset | SemanticKITTI | SemanticPOSS | | | |
| Table II | Discriminative Model Structure | last1 pruning | last2 pruning | last3 pruning | last4 pruning | 4convs output |
| Table III | Generative Model Structure | width128 depth4 | width512 depth4 | width256 depth3 | width256 depth5 | Gnet relu |
| Table IV | Shape Dimension | shape 128 | shape 512 | | | |
| Table IV | Scale Size | scale 2 | scale 4 | scale 8 | scale 16 | scale 32 |
| Table V | Positional Encoding | no encoding | incF level10 | incT level5 | incT level15 | |
| Table VI | Sample Strategy | nearest | | | | |
| Table VII | Semantic Extension | semantic extension A | semantic extension B | | | |