Skip to content

Latest commit

 

History

History
137 lines (106 loc) · 6.13 KB

readme.md

File metadata and controls

137 lines (106 loc) · 6.13 KB

SANeRF-HQ

arXiv

SANeRF-HQ: Segment Anything for NeRF in High Quality [CVPR 2024].

This is the official implementation of SANeRF-HQ.

Set up

The code is based on this repo.

First, install requirement packages

pip install -r requirements.txt

Also, you can build the extension (optional)

# install all extension modules
bash scripts/install_ext.sh

# if you want to install manually, here is an example:
cd raymarching
python setup.py build_ext --inplace # build ext only, do not install (only can be used in the parent directory)
pip install . # install to python path (you still need the raymarching/ folder, since this only install the built extension.)

Dataset

We use the dataset from Mip-NeRF 360, LERF, LLFF, 3DFRONT, Panoptic Lifting and Contrastive Lift. You can download the dataset from their website by clicking the following hyperlinks. To switch different dataset, simply change the value of the flag --data_type during training.

  • Mip-NeRF 360: --data_type=colmap.
  • LERF: --data_type=colmap. Note: For LERF dataset, we do not obtain good NeRF reconstruction results by their camera poses (probably because of some hyper parameteres). Thus we use the colmap pose estimiation provided by this. Please following their instructions to run colmap first if you would like to test LERF. The corresponding scripts are also included in this repo.
  • LLFF: --data_type=llff. We use the data provided by Mip-NeRF 360.
  • 3D-FRONT: --data_type 3dfront. We use the data provided by Instance NeRF
  • Panoptic Lifting / Contrastive Lift: --data_type=others.

For the evaluation masks we selected, you can download them [here]. Some datasets actually have ground truth segmentation (e.g. 3D-FRONT and Panoptic Lifting) so we directly use their annotation. For those without ground truth segmentation (e.g. Mip-NeRF 360), we randomly select some views and use this to obtain masks. Then, we pass the masks through CascadePSP for refinement if necessary.

Training

We provide some sample scripts to use our code. Please download at least one scene from the remote

To train the RGB NeRF, run

bash scripts/train_rgb_nerf.sh

Then run the following script to obtain feature container.

bash scripts/train_sam_nerf.sh

You can change the container type by the flag--feature_container.

With the feature container, you can decode the object mask per image.

bash scripts/decode.sh

In decoding, 3D points are required as input. To obtain 3D points, you can project 2D points onto 3D (The script is not provided but you can find the corresponding code in test_step in nerf/train.py) or use the GUI to select points.

To use the GUI, you should add --gui or you can run

bash scripts/gui.sh

Use you

To train object field, run

bash scripts/train_obj_nerf.sh

Simply set ray_pair_rgb_iter > iter if you think that the ray pair rgb loss is slow or does not help in some cases.

Evaluation

To evaluate our results, you can run scripts/test_obj_nerf.sh. You can add --use_default_intrinsics in the test script to render mask with the default intrinsics. You can be download the evaluation views [here]

Other Results

In our paper, we demonstrate the potential of our pipeline to achieve various segmentation tasks.

Text-prompt Segmentation

We use Grounding-DINO to generate the bounding box based on text and then use the bounding box as prompt for SAM to generate mask.

Auto-segmentation and Dynamic Segmentation

We use DEVA for a sequence of images in video. First, render a video from NeRF. You can utilize the 'save trajectory' function in GUI to store a sequence of camera poses and then render the corresponding images.

Acknowledgement

SAM and HQ-SAM

@article{kirillov2023segany,
    title={Segment Anything},
    author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
    journal={arXiv:2304.02643},
    year={2023}
}

@inproceedings{sam_hq,
    title={Segment Anything in High Quality},
    author={Ke, Lei and Ye, Mingqiao and Danelljan, Martin and Liu, Yifan and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
    booktitle={NeurIPS},
    year={2023}
}  

Segment Anything NeRF

@misc{segment-anything-nerf,
    Author = {Jiaxiang Tang and Xiaokang Chen and Diwen Wan and Jingbo Wang and Gang Zeng},
    Year = {2023},
    Note = {https://github.com/ashawkey/Segment-Anything-NeRF},
    Title = {Segment-Anything NeRF}
}

CascadePSP

@inproceedings{cheng2020cascadepsp,
  title={{CascadePSP}: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement},
  author={Cheng, Ho Kei and Chung, Jihoon and Tai, Yu-Wing and Tang, Chi-Keung},
  booktitle={CVPR},
  year={2020}
}

OpenMMLab Playground: https://github.com/open-mmlab/playground

Citation

If you find this repo or our paper useful, please ⭐ this repository and consider citing 📝:

@article{liu2023sanerf,
  title={SANeRF-HQ: Segment Anything for NeRF in High Quality},
  author={Liu, Yichen and Hu, Benran and Tang, Chi-Keung and Tai, Yu-Wing},
  journal={arXiv preprint arXiv:2312.01531},
  year={2023}
}