Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery

CVPR 2024

Yuqi Zhang · Guanying Chen · Jiaxing Chen · Shuguang Cui

Project Page

We present a neural radiance field method for urban-scale semantic and building-level instance segmentation from aerial images by lifting noisy 2D labels to 3D.

Overview

This repository contains the following components to train Aerial Lifting:

Dataset processing scripts, including:
1. far-view semantic label fusion;
2. cross-view instance label grouping.
Training and evaluation scripts.

Note: This is a preliminary release and there may still be some bugs.

Installation

Create new conda env (CUDA)

Clone this repo:

git clone https://github.com/zyqz97/Aerial_lifting.git

Create a conda environment (installation via Anaconda is recommended):
```
conda create -n aeriallift python=3.9
conda activate aeriallift
```

Install PyTorch (tested version):

conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge

Install tiny-cuda-nn and the other dependencies:

pip install -r requirements.txt
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

Install the extension of torch-ngp

cd ./gp_nerf/torch_ngp/gridencoder
python setup.py install
cd ../raymarching
python setup.py install
cd ../shencoder
python setup.py install

Follow the official neuralsim instructions to install nr3d_lib.

Install SAM

git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything
pip install -e .
cd tools/segment_anything
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

Tested environments

Ubuntu 20.04 with torch 1.10.1 & CUDA 11.3 on A100 GPU.
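
A quick sanity check against the tested configuration can be sketched as follows. This is a minimal sketch and not part of the repository: the helper names `matches_tested_env` and `check_installed` are hypothetical, and the check only compares the version strings that PyTorch reports (`torch.__version__`, `torch.version.cuda`) with the tested values listed above.

```python
# Hypothetical helper, not part of the Aerial Lifting repository.
# Tested configuration taken from this README.
TESTED_TORCH = "1.10.1"
TESTED_CUDA = "11.3"


def matches_tested_env(torch_version, cuda_version):
    """Return True if the given version strings match the tested setup.

    torch_version may carry a local build suffix such as "+cu113",
    which is ignored for the comparison. cuda_version may be None
    for CPU-only builds, which never match.
    """
    base = torch_version.split("+", 1)[0]
    return base == TESTED_TORCH and cuda_version == TESTED_CUDA


def check_installed():
    """Compare the installed PyTorch build with the tested setup."""
    import torch  # imported lazily so the string check works without torch

    ok = matches_tested_env(torch.__version__, torch.version.cuda)
    print(f"torch {torch.__version__}, CUDA {torch.version.cuda}, "
          f"CUDA available: {torch.cuda.is_available()}, "
          f"matches tested setup: {ok}")
    return ok
```

Calling `check_installed()` inside the activated `aeriallift` environment prints the installed versions. A mismatch does not necessarily mean the code will fail, only that the setup differs from the one that was tested.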

Data Processing & Training Step

We use the Yingrenshi dataset as an example. Set 'dataset_path=$YOURPATH/Aerial_lifting_data/Yingrenshi' and 'config_file=configs/yingrenshi.yaml'.
The processed data is also provided in the Processed Dataset & Trained Models section below. If you download it, the training scripts (Step 1.1, Step 2.4, and Step 3.3) can be run directly.
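
The two settings above can be derived from a data root and a scene name, as in this sketch. The helper `scene_settings` and the `/data` root are hypothetical and not part of the repository. The sketch also assumes that other scenes follow the same naming pattern as Yingrenshi (capitalized dataset folder, lowercase config file), which this README only confirms for Yingrenshi.

```python
from pathlib import PurePosixPath


def scene_settings(your_path, scene="Yingrenshi"):
    """Build the dataset_path and config_file values for one scene.

    Assumption: the dataset folder is <your_path>/Aerial_lifting_data/<Scene>
    and the config is configs/<scene in lowercase>.yaml, as in the
    Yingrenshi example. Other scene names are not verified here.
    """
    dataset_path = PurePosixPath(your_path) / "Aerial_lifting_data" / scene
    config_file = PurePosixPath("configs") / f"{scene.lower()}.yaml"
    return {"dataset_path": str(dataset_path), "config_file": str(config_file)}


# "/data" stands in for $YOURPATH and is only a placeholder.
settings = scene_settings("/data")
for name, value in settings.items():
    # Print in the same name=value form used in this README.
    print(f"{name}={value}")
```

The printed values can then be copied to wherever the scripts expect them.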

Step 1. Training Geometry

1.1 Train the geometry field.
```
sh bash/train_geo.sh
```
Note: $exp_name is the path where logs are saved (e.g. exp_name=logs/train_geo_yingrenshi).

Step 2. Training Semantic Field

2.1 Get Mask2former semantic labels

To generate the Mask2former semantic labels, please use our modified version of Mask2former from here. You need to create a separate conda environment for it. That code is largely based on MaskFormer and a modified version of Panoptic-Lifting.

After installing the environment of Mask2former:
```
sh bash/2_1_m2f_labels.sh
```
2.2 Render far-view RGB images from the checkpoint of Step 1.
```
sh bash/2_2_get_far_view_images.sh
```
Note: you need to specify $M2F_path, $exp_name, and $ckpt_path.
2.3 Get the fused semantic labels.
```
sh bash/2_3_fusion.sh
```
2.4 Train the semantic field.

After processing or downloading the data, you can use the script below to train the semantic field.
```
sh bash/train_semantic.sh
```

Step 3. Training Instance Field

3.1 Generate the SAM instance mask with geo-filter
```
sh bash/3_1_get_sam_mask_depth_filter.sh
```
3.2 Generate the cross-view guidance map
```
sh bash/3_2_cross_view_process.sh
```
3.3 Train the instance field.

After processing or downloading the data, you can use the script below to train the instance field.
```
sh bash/train_instance.sh
```
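
The order of the steps above can be summarized in a small driver, sketched below. It is not part of the repository: `ALL_STEPS`, `plan`, and `run` are hypothetical names, and the script paths are the ones listed in this README. The sketch defaults to a dry run that only prints the commands, because each script still needs its variables set, and Step 2.1 runs in the separate Mask2former conda environment, so a single process would not cover every step as-is.

```python
import subprocess

# Script order as listed in this README: (step label, script, is_training).
ALL_STEPS = [
    ("1.1", "bash/train_geo.sh", True),
    ("2.1", "bash/2_1_m2f_labels.sh", False),
    ("2.2", "bash/2_2_get_far_view_images.sh", False),
    ("2.3", "bash/2_3_fusion.sh", False),
    ("2.4", "bash/train_semantic.sh", True),
    ("3.1", "bash/3_1_get_sam_mask_depth_filter.sh", False),
    ("3.2", "bash/3_2_cross_view_process.sh", False),
    ("3.3", "bash/train_instance.sh", True),
]


def plan(have_processed_data):
    """Return the scripts to run, in order.

    With the processed data downloaded, only the training scripts
    (Steps 1.1, 2.4, and 3.3) are needed.
    """
    return [script for _, script, is_training in ALL_STEPS
            if is_training or not have_processed_data]


def run(scripts, dry_run=True):
    """Print each command and, unless dry_run, run it and stop on failure."""
    for script in scripts:
        print("sh", script)
        if not dry_run:
            subprocess.run(["sh", script], check=True)


# Dry run for the case where the processed data has been downloaded.
run(plan(have_processed_data=True))
```

Passing `have_processed_data=False` lists all eight scripts, which is the order to follow when processing the raw data yourself.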

Processed Dataset & Trained Models.

Download the processed data and trained checkpoints.

We thank the authors for providing the datasets. If you find the datasets useful in your research, please cite the papers that provided the original aerial images:

@inproceedings{UrbanBIS,
title = {UrbanBIS: a Large-scale Benchmark for Fine-grained Urban Building Instance Segmentation},
author = {Guoqing Yang and Fuyou Xue and Qi Zhang and Ke Xie and Chi-Wing Fu and Hui Huang},
booktitle = {SIGGRAPH},
year = {2023},
}

@inproceedings{UrbanScene3D,
title={Capturing, Reconstructing, and Simulating: the UrbanScene3D Dataset},
author={Liqiang Lin and Yilin Liu and Yue Hu and Xingguang Yan and Ke Xie and Hui Huang},
booktitle={ECCV},
year={2022},
}

Citation

If you find this work useful for your research and applications, please cite our paper:

@inproceedings{zhang2024aerial,
  title={Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery},
  author={Zhang, Yuqi and Chen, Guanying and Chen, Jiaxing and Cui, Shuguang},
  booktitle={CVPR},
  year={2024}
}

Acknowledgements

Large parts of this codebase are based on existing work: Mega-NeRF, torch-ngp, neuralsim, Panoptic-Lifting, Contrastive-Lift, SAM, and Mask2Former. We thank the authors for releasing their code.
