Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers
Chang Chen, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen
In this work, we propose an end-to-end one-stage Transformer-based framework for mapping, termed Trans4Map. Our egocentric-to-allocentric mapping process comprises three steps: (1) an efficient transformer extracts contextual features from a batch of egocentric images; (2) the proposed Bidirectional Allocentric Memory (BAM) module projects the egocentric features into the allocentric memory; (3) the map decoder parses the accumulated memory and predicts the top-down semantic segmentation map.
More details can be found in our arXiv paper.
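To make the three steps concrete, here is a minimal PyTorch-style sketch. Every name in it (`Trans4MapSketch`, the convolutional stand-ins for the encoder and decoder, the precomputed `map_indices`) is an illustrative assumption rather than the repository's actual API; in particular, the single max-pooling write below only gestures at what the bidirectional BAM module does:

```python
# Illustrative sketch of the egocentric-to-allocentric pipeline.
# Class and argument names are assumptions for exposition, NOT the
# actual Trans4Map API.
import torch
import torch.nn as nn

class Trans4MapSketch(nn.Module):
    def __init__(self, num_classes=13, feat_dim=64, map_size=250):
        super().__init__()
        self.map_size = map_size
        # (1) Stand-in for the efficient transformer encoder: any
        #     per-frame extractor producing a (C, H, W) feature map.
        self.encoder = nn.Conv2d(3, feat_dim, kernel_size=3, padding=1)
        # (3) Stand-in for the map decoder: predicts top-down
        #     semantics from the accumulated memory.
        self.decoder = nn.Conv2d(feat_dim, num_classes, kernel_size=1)

    def forward(self, frames, map_indices):
        # frames:      (T, 3, H, W) egocentric RGB sequence
        # map_indices: (T, H*W) long tensor; flat top-down cell index
        #              per pixel, assumed precomputed from depth + pose
        T = frames.shape[0]
        feat = self.encoder(frames)                     # (T, C, H, W)
        C = feat.shape[1]
        memory = feat.new_zeros(C, self.map_size ** 2)  # allocentric memory
        # (2) Project egocentric features into the allocentric memory.
        #     A simple max write per frame; BAM itself is more elaborate.
        for t in range(T):
            f = feat[t].reshape(C, -1)                  # (C, H*W)
            memory.scatter_reduce_(1, map_indices[t].expand(C, -1), f,
                                   reduce="amax", include_self=True)
        memory = memory.view(1, C, self.map_size, self.map_size)
        return self.decoder(memory)                     # (1, K, M, M)

# Toy usage with random inputs.
model = Trans4MapSketch()
frames = torch.randn(5, 3, 64, 64)                        # 5 egocentric frames
map_indices = torch.randint(0, 250 * 250, (5, 64 * 64))   # assumed: from depth + pose
semmap_logits = model(frames, map_indices)                # (1, 13, 250, 250)
```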
```bash
conda create -n Trans4Map python=3.7
conda activate Trans4Map
cd /path/to/Trans4Map
pip install -r requirements.txt
```
To obtain RGB-D renderings of the Matterport3D dataset, we need to install Habitat-sim and Habitat-lab. To ensure consistency with our working environment, please install Habitat-sim == 0.1.5 and Habitat-lab == 0.1.5.
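One possible way to pin these versions is to build from the upstream release tags; the tag names and build commands below are assumptions based on the Habitat repositories, so consult their installation guides if your platform needs different flags:

```bash
# Assumed release tags and build steps -- verify against the Habitat docs.
git clone --branch v0.1.5 https://github.com/facebookresearch/habitat-sim.git
cd habitat-sim
pip install -r requirements.txt
python setup.py install --headless   # drop --headless on a machine with a display
cd ..

git clone --branch v0.1.5 https://github.com/facebookresearch/habitat-lab.git
cd habitat-lab
pip install -e .
cd ..
```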
You can prepare the training and test datasets in the same way as SMNet:

- `data/paths.json` contains the trajectories manually recorded by SMNet.
- The semantic top-down ground truth is also available: GT. Please place the files under `data/semmap`.
- Our project works with the Matterport3D and Replica datasets; please download them and place them under `data/mp3d` or `data/replica`.

The expected layout is sketched below.
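Putting the pieces together, the data directory should look roughly like this (folder contents follow SMNet's preprocessing; the file names inside each folder are not spelled out here):

```
data/
├── paths.json   # trajectories recorded by SMNet
├── semmap/      # top-down semantic ground truth
├── mp3d/        # Matterport3D scenes
└── replica/     # Replica scenes
```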
To train our Trans4Map with different backbones, run:
```bash
python train.py
```
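Conceptually, training reduces to per-cell cross-entropy between the predicted top-down map and the semantic ground truth. The following is a minimal sketch of one step, not the actual loop in `train.py`; the ignore label and tensor shapes are assumptions:

```python
import torch.nn as nn

# Assumed void/ignore label; the repository's label convention may differ.
criterion = nn.CrossEntropyLoss(ignore_index=255)

def training_step(model, frames, map_indices, gt_semmap, optimizer):
    # frames: (T, 3, H, W) egocentric sequence; gt_semmap: (1, M, M) class ids
    logits = model(frames, map_indices)   # (1, num_classes, M, M)
    loss = criterion(logits, gt_semmap)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```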
To generate the test results, run:

```bash
python build_test_date_feature.py
python test.py
```
To obtain the mIoU and the mean boundary F1 score (mBF1), run:

```bash
python eval/eval.py
python eval/eval_bfscore.py
```
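For reference, mIoU averages the per-class intersection-over-union derived from the confusion matrix. A self-contained sketch is given below; the ignore label and class count are assumptions, and the repository's eval scripts may differ in such details:

```python
import numpy as np

def mean_iou(pred, gt, num_classes, ignore_label=255):
    """pred, gt: integer arrays of identical shape holding class ids."""
    mask = gt != ignore_label
    # Confusion matrix via bincount over the joint (gt, pred) index.
    idx = gt[mask].astype(np.int64) * num_classes + pred[mask].astype(np.int64)
    conf = np.bincount(idx, minlength=num_classes**2).reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    iou = inter / np.maximum(union, 1)   # avoid division by zero
    return iou[union > 0].mean()         # average over classes that occur

# Toy usage with random maps.
pred = np.random.randint(0, 12, (250, 250))
gt = np.random.randint(0, 12, (250, 250))
print(f"mIoU: {mean_iou(pred, gt, num_classes=12):.4f}")
```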
| Method | Backbone | mIoU (%) | Weight |
|---|---|---|---|
| ConvNeXt | ConvNeXt-T | 35.91 | |
| ConvNeXt | ConvNeXt-S | 36.49 | |
| FAN | FAN-T | 31.07 | |
| FAN | FAN-S | 34.62 | |
| Swin | Swin-T | 34.19 | |
| Swin | Swin-S | 36.80 | |
| Trans4Map | MiT-B2 | 40.02 | B2 |
| Trans4Map | MiT-B4 | 40.88 | B4 |
This repository is released under the Apache-2.0 license. For commercial use, please contact the authors.
If you find this work useful, please consider citing:
```bibtex
@inproceedings{chen2023trans4map,
  title={Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers},
  author={Chen, Chang and Zhang, Jiaming and Yang, Kailun and Peng, Kunyu and Stiefelhagen, Rainer},
  booktitle={2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2023}
}
```