PF-GRT, a Pose-Free framework for the Generalizable Rendering Transformer, eliminates the need for pre-computed camera poses and renders novel views of unseen scenes in a single feed-forward pass.
- Novel-view synthesis on custom scenes.
- Training and evaluation code for PF-GRT is released.
1. Download the training and evaluation datasets, following IBRNet's data preparation:
cd data/
# IBRNet captures
gdown https://drive.google.com/uc?id=1rkzl3ecL3H0Xxf5WTyc2Swv30RIyr1R_
unzip ibrnet_collected.zip
# LLFF
gdown https://drive.google.com/uc?id=1ThgjloNt58ZdnEuiCeRf9tATJ-HI0b01
unzip real_iconic_noface.zip
## [IMPORTANT] remove scenes that appear in the test set
cd real_iconic_noface/
rm -rf data2_fernvlsb data2_hugetrike data2_trexsanta data3_orchid data5_leafscene data5_lotr data5_redflower
cd ../
# Spaces dataset
git clone https://github.com/augmentedperception/spaces_dataset
# RealEstate 10k
## make sure to install ffmpeg - sudo apt-get install ffmpeg
git clone https://github.com/qianqianwang68/RealEstate10K_Downloader
cd RealEstate10K_Downloader
python3 generate_dataset.py train
cd ../
# Google Scanned Objects
gdown https://drive.google.com/uc?id=1w1Cs0yztH6kE3JIz7mdggvPGCwIKkVi2
unzip google_scanned_objects_renderings.zip
# Blender dataset
gdown https://drive.google.com/uc?id=18JxhpWD-4ZmuFKLzKlAw-w5PpzZxXOcG
unzip nerf_synthetic.zip
# LLFF dataset (eval)
gdown https://drive.google.com/uc?id=16VnMcF1KJYxN9QId6TClMsZRahHNMW5g
unzip nerf_llff_data.zip
The resulting directory structure should look like this:
${ROOT}
├── 📂 data/
│   ├── 📂 ibrnet_collected_1/
│   │   ├── 📂 ...
│   │   └── 📜 ...
│   ├── 📂 ibrnet_collected_2/
│   ├── 📂 real_iconic_noface/
│   ├── 📂 spaces_dataset/
│   ├── 📂 RealEstate10K-subset/
│   ├── 📂 google_scanned_objects/
│   ├── 📂 nerf_synthetic/
│   └── 📂 nerf_llff_data/
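As a quick sanity check (a minimal sketch; the folder names are taken from the tree above), you can verify that every expected dataset directory is in place:
# Run from ${ROOT}; prints MISSING for any dataset folder that is not found
for d in ibrnet_collected_1 ibrnet_collected_2 real_iconic_noface spaces_dataset \
         RealEstate10K-subset google_scanned_objects nerf_synthetic nerf_llff_data; do
  [ -d "data/$d" ] && echo "OK      $d" || echo "MISSING $d"
done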
The code is tested with Python 3.9, CUDA 11.3, and PyTorch 1.10.1. Additional dependencies include:
torchvision
ConfigArgParse
imageio
matplotlib
numpy
opencv_contrib_python
Pillow
scipy
imageio-ffmpeg
lpips
scikit-image
loguru
Setup with Conda:
conda create -n pfgrt python=3.9
conda activate pfgrt
pip3 install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1 -f https://download.pytorch.org/whl/torch_stable.html
pip3 install -r ./requirements.txt
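To confirm the environment matches the tested versions (a quick check; the expected output assumes the CUDA 11.3 wheels above):
# Should print 1.10.1+cu113 and True (on a machine with a CUDA-capable GPU)
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"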
# Usage: python3 train.py --config <config> [--<other kwargs>]
# Train the view selector:
python3 train.py --config configs/view_selector.yaml
# Train the pose-free rendering transformer:
python3 train.py --config configs/pose_free_transfomer.yaml
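Since the project uses ConfigArgParse, command-line flags override values from the config file. For example, naming the run explicitly (a sketch; it assumes train.py accepts --expname the way eval.py does below):
# Hypothetical override: give the experiment an explicit name
python3 train.py --config configs/pose_free_transfomer.yaml --expname pfgrt_run1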
# Usage: python3 eval.py --run_val --N_samples 192 --config <config> [--<other kwargs>]
# --chunk_size sets the number of rays processed per forward pass; reduce it if you run out of GPU memory.
# Evaluate a single scene, e.g. orchids (LLFF) or drums (Blender):
python3 eval.py --config configs/pose_free_transfomer.yaml --eval_scenes orchids --expname gnt_orchids --chunk_size 10240 --run_val --N_samples 192
python3 eval.py --config configs/pose_free_transfomer.yaml --eval_scenes drums --expname gnt_drums --chunk_size 10240 --run_val --N_samples 192
# Evaluate all scenes in a dataset (e.g. LLFF):
python3 eval.py --config configs/pose_free_transfomer.yaml --expname llff --chunk_size 10240 --run_val --N_samples 192
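To sweep several scenes in one go, the single-scene command can be wrapped in a loop (the scene names here are standard LLFF scenes; adjust them to the folders in nerf_llff_data/):
# Evaluate a list of LLFF scenes one after another
for scene in orchids fern flower; do
  python3 eval.py --config configs/pose_free_transfomer.yaml --eval_scenes $scene \
    --expname gnt_$scene --chunk_size 10240 --run_val --N_samples 192
done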
# Render novel views on a custom scene
bash demo.sh
The code was recently tidied up for release and may still contain minor bugs. If anything breaks, please feel free to open an issue.
If you find our work useful for your research, please consider citing the paper:
@inproceedings{Fan2023PoseFreeGR,
  title={Pose-Free Generalizable Rendering Transformer},
  author={Zhiwen Fan and Panwang Pan and Peihao Wang and Yifan Jiang and Hanwen Jiang and Dejia Xu and Zehao Zhu and Dilin Wang and Zhangyang Wang},
  year={2023},
  url={https://api.semanticscholar.org/CorpusID:263671855}
}