SANeRF-HQ: Segment Anything for NeRF in High Quality [CVPR 2024].
This is the official implementation of SANeRF-HQ.
The code is based on this repo.
First, install the required packages:
pip install -r requirements.txt
Optionally, you can also build the extensions:
# install all extension modules
bash scripts/install_ext.sh
# if you want to install manually, here is an example:
cd raymarching
python setup.py build_ext --inplace # build the extension only, without installing it (it can then only be used from the parent directory)
pip install . # install to the Python path (you still need the raymarching/ folder, since this only installs the built extension)
We use datasets from Mip-NeRF 360, LERF, LLFF, 3D-FRONT, Panoptic Lifting and Contrastive Lift. You can download each dataset from its website by clicking the following hyperlinks. To switch between datasets, change the value of the --data_type flag during training.
- Mip-NeRF 360: --data_type=colmap.
- LERF: --data_type=colmap. Note: With the LERF dataset, we could not obtain good NeRF reconstructions using their camera poses (probably because of some hyperparameters). We therefore use the COLMAP pose estimation provided by this. Please follow their instructions to run COLMAP first if you would like to test LERF. The corresponding scripts are also included in this repo.
- LLFF: --data_type=llff. We use the data provided by Mip-NeRF 360.
- 3D-FRONT: --data_type=3dfront. We use the data provided by Instance NeRF.
- Panoptic Lifting / Contrastive Lift: --data_type=others.
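The mapping above can be kept in one place if you launch training from your own wrapper. The following is a minimal sketch; the dictionary and the `data_type_flag` helper are hypothetical names for illustration and are not part of this repo. Only the flag values come from the list above.

```python
# Dataset name -> value of the --data_type flag, as listed in this README.
DATA_TYPE_BY_DATASET = {
    "Mip-NeRF 360": "colmap",
    "LERF": "colmap",        # uses COLMAP poses, see the note above
    "LLFF": "llff",
    "3D-FRONT": "3dfront",
    "Panoptic Lifting": "others",
    "Contrastive Lift": "others",
}


def data_type_flag(dataset: str) -> str:
    """Return the --data_type argument for a dataset name (hypothetical helper)."""
    try:
        return f"--data_type={DATA_TYPE_BY_DATASET[dataset]}"
    except KeyError:
        known = ", ".join(sorted(DATA_TYPE_BY_DATASET))
        raise ValueError(
            f"Unknown dataset {dataset!r}; expected one of: {known}"
        ) from None


print(data_type_flag("LLFF"))  # --data_type=llff
```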
You can download the evaluation masks we selected [here]. Some datasets have ground-truth segmentation (e.g. 3D-FRONT and Panoptic Lifting), so we use their annotations directly. For those without ground-truth segmentation (e.g. Mip-NeRF 360), we randomly select some views and use this to obtain masks. We then pass the masks through CascadePSP for refinement if necessary.
We provide some sample scripts to use our code. Please download at least one scene from the remote
To train the RGB NeRF, run
bash scripts/train_rgb_nerf.sh
Then run the following script to obtain the feature container:
bash scripts/train_sam_nerf.sh
You can change the container type with the --feature_container flag.
With the feature container, you can decode the object mask per image.
bash scripts/decode.sh
Decoding requires 3D points as input. To obtain 3D points, you can either project 2D points into 3D (a script is not provided, but you can find the corresponding code in test_step in nerf/train.py) or use the GUI to select points.
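For readers who want to write their own projection script, the general idea of lifting a 2D point into 3D is sketched below. This is not the code from test_step. It assumes a pinhole camera with +z pointing forward, a z-depth value (for example from a rendered depth map), and a 4x4 row-major camera-to-world matrix. The repo's own conventions may differ, so check them against nerf/train.py before reusing this. The function name is hypothetical.

```python
def unproject_pixel(u, v, depth, fx, fy, cx, cy, cam_to_world):
    """Lift pixel (u, v) with z-depth `depth` to a 3D world-space point.

    Assumptions (not taken from the repo): pinhole intrinsics (fx, fy, cx, cy),
    camera looking along +z, and `cam_to_world` given as a 4x4 row-major matrix.
    """
    # Back-project the pixel into camera space.
    x_cam = (u - cx) / fx * depth
    y_cam = (v - cy) / fy * depth
    p_cam = (x_cam, y_cam, depth, 1.0)
    # Apply the rigid camera-to-world transform (top three rows only).
    return tuple(
        sum(row[k] * p_cam[k] for k in range(4)) for row in cam_to_world[:3]
    )


# Camera at (1, 2, 3) with identity rotation; the principal-point pixel at
# depth 2 lands two units in front of the camera along +z.
pose = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
print(unproject_pixel(320, 240, 2.0, 500.0, 500.0, 320.0, 240.0, pose))
# (1.0, 2.0, 5.0)
```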
To use the GUI, add the --gui flag, or run:
bash scripts/gui.sh
Use you
To train the object field, run
bash scripts/train_obj_nerf.sh
If the ray pair RGB loss is slow or does not help in some cases, simply set ray_pair_rgb_iter > iter.
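The training stages described so far run in a fixed order. The sketch below chains them from Python; the script paths are the ones named in this README, while the `PIPELINE` list and `run_pipeline` helper are hypothetical. It assumes each script is already configured for your scene (decoding in particular needs 3D points as input), so the default is a dry run that only prints the commands.

```python
import subprocess
from typing import List

# Script paths are taken from this README; the order follows the text above.
PIPELINE = [
    "scripts/train_rgb_nerf.sh",  # 1. train the RGB NeRF
    "scripts/train_sam_nerf.sh",  # 2. obtain the feature container
    "scripts/decode.sh",          # 3. decode the object mask per image
    "scripts/train_obj_nerf.sh",  # 4. train the object field
]


def run_pipeline(dry_run: bool = True) -> List[List[str]]:
    """Print (dry run) or execute each stage in order; return the commands."""
    commands = [["bash", script] for script in PIPELINE]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(cmd, check=True)
    return commands


run_pipeline()  # prints the four commands without running them
```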
To evaluate our results, run scripts/test_obj_nerf.sh. You can add --use_default_intrinsics to the test script to render masks with the default intrinsics. You can download the evaluation views [here].
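If you want to compare a rendered mask against an evaluation mask yourself, intersection over union (IoU) is a common choice. The sketch below is an illustration only: it is not taken from the test script, and whether the script reports exactly this quantity is an assumption. Masks are plain nested lists of 0/1 values here to keep the example dependency-free, and `mask_iou` is a hypothetical name.

```python
def mask_iou(pred, target):
    """Intersection over union of two binary masks given as nested lists."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")
    inter = 0
    union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            inter += p and t  # pixel is foreground in both masks
            union += p or t   # pixel is foreground in at least one mask
    # Convention chosen here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(mask_iou(pred, target))  # 0.5
```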
In our paper, we demonstrate the potential of our pipeline to achieve various segmentation tasks.
We use Grounding-DINO to generate a bounding box from the text prompt, then use that bounding box as the prompt for SAM to generate a mask.
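One practical detail when connecting a detector to SAM is the box format. The sketch below assumes the detector returns boxes as normalized (center x, center y, width, height) and that the SAM box prompt expects pixel (x0, y0, x1, y1). Both are assumptions you should verify against the versions you use, and the function name is hypothetical.

```python
def cxcywh_norm_to_xyxy_pixels(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1).

    The input and output conventions are assumptions about the detector and
    the SAM box prompt, respectively; they are not taken from this repo.
    """
    cx, cy, w, h = box
    x0 = (cx - w / 2) * image_width
    y0 = (cy - h / 2) * image_height
    x1 = (cx + w / 2) * image_width
    y1 = (cy + h / 2) * image_height

    # Clamp to the image bounds so the prompt never leaves the image.
    def clamp(value, upper):
        return float(max(0.0, min(value, upper)))

    return (
        clamp(x0, image_width),
        clamp(y0, image_height),
        clamp(x1, image_width),
        clamp(y1, image_height),
    )


print(cxcywh_norm_to_xyxy_pixels((0.5, 0.5, 0.5, 0.25), 640, 480))
# (160.0, 180.0, 480.0, 300.0)
```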
For a sequence of images in a video, we use DEVA. First, render a video from the NeRF. You can use the 'save trajectory' function in the GUI to store a sequence of camera poses and then render the corresponding images.
SAM and HQ-SAM
@article{kirillov2023segany,
title={Segment Anything},
author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
journal={arXiv:2304.02643},
year={2023}
}
@inproceedings{sam_hq,
title={Segment Anything in High Quality},
author={Ke, Lei and Ye, Mingqiao and Danelljan, Martin and Liu, Yifan and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
booktitle={NeurIPS},
year={2023}
}
Segment Anything NeRF
@misc{segment-anything-nerf,
Author = {Jiaxiang Tang and Xiaokang Chen and Diwen Wan and Jingbo Wang and Gang Zeng},
Year = {2023},
Note = {https://github.com/ashawkey/Segment-Anything-NeRF},
Title = {Segment-Anything NeRF}
}
CascadePSP
@inproceedings{cheng2020cascadepsp,
title={{CascadePSP}: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement},
author={Cheng, Ho Kei and Chung, Jihoon and Tai, Yu-Wing and Tang, Chi-Keung},
booktitle={CVPR},
year={2020}
}
OpenMMLab Playground: https://github.com/open-mmlab/playground
If you find this repo or our paper useful, please ⭐ this repository and consider citing 📝:
@article{liu2023sanerf,
title={SANeRF-HQ: Segment Anything for NeRF in High Quality},
author={Liu, Yichen and Hu, Benran and Tang, Chi-Keung and Tai, Yu-Wing},
journal={arXiv preprint arXiv:2312.01531},
year={2023}
}