[Docs] Add Tpvformer readme #2517

Merged
merged 35 commits into from
May 12, 2023
Changes from 32 commits
Commits
8a629f5
fix polarmix UT
sunjiahao1999 Feb 27, 2023
10bdefe
Merge branch 'dev-1.x' of github.com:open-mmlab/mmdetection3d into de…
sunjiahao1999 Feb 28, 2023
e66e5a7
Merge branch 'dev-1.x' of github.com:open-mmlab/mmdetection3d into de…
sunjiahao1999 Mar 1, 2023
372ecba
Merge branch 'dev-1.x' of github.com:open-mmlab/mmdetection3d into de…
sunjiahao1999 Mar 6, 2023
e78b860
Merge branch 'dev-1.x' of github.com:open-mmlab/mmdetection3d into de…
sunjiahao1999 Mar 8, 2023
e74d324
init tpvformer
sunjiahao1999 Mar 14, 2023
cbd6b80
add nus seg
sunjiahao1999 Mar 22, 2023
90ff44e
add nus seg
sunjiahao1999 Mar 22, 2023
851a7c9
Merge branch 'dev-1.x' of github.com:open-mmlab/mmdetection3d into de…
sunjiahao1999 Mar 22, 2023
94c8d89
Merge branch 'dev-1.x' into tpvformer
sunjiahao1999 Mar 22, 2023
6f19324
merge from dev-1.x
sunjiahao1999 Mar 28, 2023
f37db85
test done
sunjiahao1999 Mar 29, 2023
5bf1961
Merge branch 'dev-1.x' into tpvformer
sunjiahao1999 Mar 29, 2023
dfdb70f
Delete change_key.py
sunjiahao1999 Mar 29, 2023
175fe18
Delete test_dcn.py
sunjiahao1999 Mar 29, 2023
03ca29c
remove seg eval
sunjiahao1999 Mar 29, 2023
64be4a9
fix encoder
sunjiahao1999 Mar 29, 2023
9bd5d93
init train
sunjiahao1999 Apr 9, 2023
79923c1
train ready
sunjiahao1999 Apr 12, 2023
1a52343
Merge branch 'dev-1.x' into tpvformer
sunjiahao1999 Apr 13, 2023
0c337a5
remove asynctest
sunjiahao1999 Apr 19, 2023
7782c5d
change test.yml
sunjiahao1999 Apr 19, 2023
ee69b5f
pr_stage_test.yml & merge_stage_test.yml
sunjiahao1999 Apr 19, 2023
0b3342a
pip install wheel
sunjiahao1999 Apr 19, 2023
baa35c9
pip install wheel all
sunjiahao1999 Apr 19, 2023
71159c0
Merge branch 'dev-1.x' into tpvformer
sunjiahao1999 Apr 25, 2023
5893451
check type hint
sunjiahao1999 Apr 25, 2023
3369fd6
check comments
sunjiahao1999 Apr 25, 2023
8a06062
remove Photo aug
sunjiahao1999 Apr 25, 2023
8414600
fix p2v
sunjiahao1999 Apr 25, 2023
5fbf125
fix docsting & fix config filepath
sunjiahao1999 May 10, 2023
ce69084
add readme
sunjiahao1999 May 10, 2023
74850e3
Merge branch 'dev-1.x' into tpvformer_readme
sunjiahao1999 May 11, 2023
742812e
rename configs
sunjiahao1999 May 11, 2023
08b16be
fix log path
sunjiahao1999 May 12, 2023
3 changes: 2 additions & 1 deletion mmdet3d/models/decode_heads/__init__.py
@@ -1,11 +1,12 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .cylinder3d_head import Cylinder3DHead
+from .decode_head import Base3DDecodeHead
from .dgcnn_head import DGCNNHead
from .minkunet_head import MinkUNetHead
from .paconv_head import PAConvHead
from .pointnet2_head import PointNet2Head

__all__ = [
'PointNet2Head', 'DGCNNHead', 'PAConvHead', 'Cylinder3DHead',
-'MinkUNetHead'
+'Base3DDecodeHead', 'MinkUNetHead'
]
2 changes: 1 addition & 1 deletion projects/CenterFormer/README.md
@@ -57,7 +57,7 @@ python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=${N
In MMDetection3D's root directory, run the following command to test the model:

```bash
-python tools/train.py projects/CenterFormer/configs/centerformer_voxel01_second-atten_secfpn-atten_4xb4-cyclic-20e_waymoD5-3d-3class.py ${CHECKPOINT_PATH}
+python tools/test.py projects/CenterFormer/configs/centerformer_voxel01_second-atten_secfpn-atten_4xb4-cyclic-20e_waymoD5-3d-3class.py ${CHECKPOINT_PATH}
```

## Results and models
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
-_base_ = ['mmdet3d::_base_/default_runtime.py']
+_base_ = ['../../../configs/_base_/default_runtime.py']
custom_imports = dict(
imports=['projects.CenterFormer.centerformer'], allow_failed_imports=False)

2 changes: 1 addition & 1 deletion projects/DETR3D/configs/detr3d_r101_gridmask.py
@@ -1,6 +1,6 @@
_base_ = [
# 'mmdet3d::_base_/datasets/nus-3d.py',
-'mmdet3d::_base_/default_runtime.py'
+'../../../configs/_base_/default_runtime.py'
]

custom_imports = dict(imports=['projects.DETR3D.detr3d'])
5 changes: 3 additions & 2 deletions projects/PETR/configs/petr_vovnet_gridmask_p4_800x320.py
@@ -1,6 +1,7 @@
_base_ = [
-'mmdet3d::_base_/datasets/nus-3d.py', 'mmdet3d::_base_/default_runtime.py',
-'mmdet3d::_base_/schedules/cyclic-20e.py'
+'../../../configs/_base_/datasets/nus-3d.py',
+'../../../configs/_base_/default_runtime.py',
+'../../../configs/_base_/schedules/cyclic-20e.py'
]
backbone_norm_cfg = dict(type='LN', requires_grad=True)
custom_imports = dict(imports=['projects.PETR.petr'])
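
The config hunks in this PR swap the `mmdet3d::` scope prefix in `_base_` for paths relative to each project config. As a rough check of where those relative entries point, the sketch below resolves them against the config file's directory using only the standard library. The helper name `resolve_base` is hypothetical, and this is plain path arithmetic under the assumption that relative `_base_` entries are interpreted from the config's own directory; it is not MMEngine's config loader.

```python
import posixpath


def resolve_base(config_file: str, base_entry: str) -> str:
    """Resolve a relative `_base_` entry against the config file's directory.

    Hypothetical helper for illustration only: it joins and normalizes
    paths and never touches the file system.
    """
    config_dir = posixpath.dirname(config_file)
    return posixpath.normpath(posixpath.join(config_dir, base_entry))


# Config path and `_base_` entries as they appear in the PETR hunk.
petr_config = "projects/PETR/configs/petr_vovnet_gridmask_p4_800x320.py"
petr_bases = [
    "../../../configs/_base_/datasets/nus-3d.py",
    "../../../configs/_base_/default_runtime.py",
    "../../../configs/_base_/schedules/cyclic-20e.py",
]

for entry in petr_bases:
    # Each entry climbs three levels, back to the repository root.
    print(resolve_base(petr_config, entry))
```

Under that assumption, all three entries land in the repository's own `configs/_base_/` directory, so the commands have to be run from a source checkout that contains it.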
60 changes: 60 additions & 0 deletions projects/TPVFormer/README.md
@@ -0,0 +1,60 @@
# Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction

> [Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction](https://arxiv.org/abs/2302.07817)

<!-- [ALGORITHM] -->

## Abstract

Modern methods for vision-centric autonomous driving perception widely adopt the bird's-eye-view (BEV) representation to describe a 3D scene. Despite its better efficiency than voxel representation, it has difficulty describing the fine-grained 3D structure of a scene with a single plane. To address this, we propose a tri-perspective view (TPV) representation which accompanies BEV with two additional perpendicular planes. We model each point in the 3D space by summing its projected features on the three planes. To lift image features to the 3D TPV space, we further propose a transformer-based TPV encoder (TPVFormer) to obtain the TPV features effectively. We employ the attention mechanism to aggregate the image features corresponding to each query in each TPV plane. Experiments show that our model trained with sparse supervision effectively predicts the semantic occupancy for all voxels. We demonstrate for the first time that using only camera inputs can achieve comparable performance with LiDAR-based methods on the LiDAR segmentation task on nuScenes. Code: https://github.com/wzzheng/TPVFormer.

<div align=center>
<img src="https://github.com/traveller59/spconv/assets/72679458/8cc8caa6-b330-4f32-9599-3811dc5d7332" width="800"/>
</div>
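
The abstract states that each point in 3D space is modeled by summing its projected features on the three planes. A minimal sketch of that aggregation follows, assuming integer voxel indices `(h, w, d)`, planes stored as nested lists indexed `[h][w]`, `[d][h]` and `[w][d]`, and no interpolation between cells. The names `tpv_point_feature` and `constant_plane` are hypothetical and are not taken from the TPVFormer code.

```python
from typing import List

# plane[a][b] holds a feature vector of length C for that cell.
Plane = List[List[List[float]]]


def constant_plane(rows: int, cols: int, feature: List[float]) -> Plane:
    """Build a toy plane in which every cell holds a copy of `feature`."""
    return [[list(feature) for _ in range(cols)] for _ in range(rows)]


def tpv_point_feature(t_hw: Plane, t_dh: Plane, t_wd: Plane,
                      h: int, w: int, d: int) -> List[float]:
    """Sum the features that voxel (h, w, d) projects to on the three planes."""
    f_hw = t_hw[h][w]  # projection that drops the d index
    f_dh = t_dh[d][h]  # projection that drops the w index
    f_wd = t_wd[w][d]  # projection that drops the h index
    return [a + b + c for a, b, c in zip(f_hw, f_dh, f_wd)]


# Toy example: H = W = D = 2 and C = 2 channels (illustrative sizes only).
t_hw = constant_plane(2, 2, [1.0, 0.0])
t_dh = constant_plane(2, 2, [0.0, 2.0])
t_wd = constant_plane(2, 2, [0.5, 0.5])

print(tpv_point_feature(t_hw, t_dh, t_wd, h=1, w=0, d=1))  # [1.5, 2.5]
```

Storing three planes takes on the order of `H*W + D*H + W*D` feature vectors, compared with `H*W*D` for a dense voxel grid.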

## Introduction

We implement TPVFormer and provide the results and checkpoints on the nuScenes dataset.

## Usage

<!-- For a typical model, this section should contain the commands for training and testing. You are also suggested to dump your environment specification to env.yml by `conda env export > env.yml`. -->

### Training commands

In MMDetection3D's root directory, follow these steps to train the model:

1. Download the [pretrained backbone weights](<>) to `checkpoints/`.

2. Train the model. For example, to train TPVFormer on 8 GPUs, run:

```bash
bash tools/dist_train.sh projects/TPVFormer/config/tpvformer_8xb1-2x_nus-seg.py 8
```

### Testing commands

In MMDetection3D's root directory, run the following command to test the model on 8 GPUs:

```bash
bash tools/dist_test.sh projects/TPVFormer/config/tpvformer_8xb1-2x_nus-seg.py ${CHECKPOINT_PATH} 8
```

## Results and models

### nuScenes

| Backbone | Neck | Mem (GB) | Inf time (fps) | mIoU | Downloads |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---- | -------- | -------------- | ---- | ------------------------ |
| [ResNet101 w/ DCN](https://github.com/open-mmlab/mmdetection3d/blob/main/configs/fcos3d/fcos3d_r101-caffe-dcn_fpn_head-gn_8xb2-1x_nus-mono3d.py) | FPN | 32.0 | - | 68.9 | [model](<>) \| [log](<>) |

## Citation

```latex
@article{huang2023tri,
  title={Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction},
  author={Huang, Yuanhui and Zheng, Wenzhao and Zhang, Yunpeng and Zhou, Jie and Lu, Jiwen},
  journal={arXiv preprint arXiv:2302.07817},
  year={2023}
}
```