Improving novel view synthesis of 3D Gaussian splats using 2D image enhancement methods

Wouter Bant, Ádám Divák, Jasper Eppink, Clio Feng, Roos Hutter

Novel view reconstruction based on only 2 input images is an important but extremely challenging task. pixelSplat is a potential solution that was shown to deliver high-quality results at a competitive speed. We first evaluate pixelSplat on more challenging reconstruction tasks by applying camera positions that are further away from each other, and find that its performance is heavily impacted. We then explore 2D image enhancement methods to fix the corrupted novel view images. A diffusion model-based solution proves to be able to restore significantly impacted areas, but fails to stay consistent with the original scene even after long fine-tuning, resulting in flickering videos. An alternative solution based on an image restoration model results in pleasant videos and quantitative improvements in most metrics, but does not address all errors seen in the novel view images. We explore the underlying reasons for these shortcomings, and propose future research directions for fixing them.

This code builds upon the code from the paper pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction by David Charatan, Sizhe Lester Li, Andrea Tagliasacchi, and Vincent Sitzmann.

Check out their project website at https://dcharatan.github.io/pixelsplat.

Demo

demo.mp4

Important files

Training ControlNet
Training InstructIR (the InstructIR authors did not provide a training script, so we wrote one based on the information in their paper)
Inference
Testing ControlNet on out of domain data
Creating demo video

Export training images

!python3 -m src.main +experiment=re10k mode=test test.data_loader="train" test.output_path="outputs/re10k_train_data" data_loader.train.batch_size=1 checkpointing.load=checkpoints/re10k.ckpt
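The leading `!` in the command above is the notebook shell escape; from a plain script the same call can be launched with `subprocess`. The following is a minimal sketch rather than part of the repository: the helper names `build_export_command` and `run_export` are hypothetical, the overrides are copied from the command above (with the shell quotes dropped), and `sys.executable` stands in for `python3`.

```python
import subprocess
import sys


def build_export_command(
    output_path="outputs/re10k_train_data",
    checkpoint="checkpoints/re10k.ckpt",
):
    """Return the argv list for the export command (hypothetical helper)."""
    return [
        sys.executable, "-m", "src.main",
        "+experiment=re10k",
        "mode=test",
        "test.data_loader=train",
        f"test.output_path={output_path}",
        "data_loader.train.batch_size=1",
        f"checkpointing.load={checkpoint}",
    ]


def run_export(**kwargs):
    """Run the export; call this from the project root so src.main is importable."""
    subprocess.run(build_export_command(**kwargs), check=True)
```

Calling `run_export()` from the project root should behave like the command above, provided the checkpoint and dataset are in place.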

Installation

Installation on Snellius supercomputer

This is not straightforward, as we don't have sudo privileges on Snellius and many default packages are outdated. Newer versions of g++ are also incompatible. After cloning the repo, run the following commands in order, starting each one only after the previous one has finished:

Step by step instructions

cd installation_jobs

This takes approximately 30 minutes; all other jobs are much faster.

sbatch install_env.job

This will return an error, but we will fix this afterwards.

sbatch install_packages.job

Debugging jobs:

sbatch debug.job

sbatch debug2.job

sbatch debug3.job

sbatch debug4.job

sbatch debug5.job

Now this should run without any errors.

sbatch install_packages.job
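The sequence above can be scripted so that each job is submitted only after the previous one has finished. This is a minimal sketch, not part of the repository. It assumes Slurm's `sbatch --wait` flag, which blocks until the submitted job terminates and passes on its exit code. The helper names are hypothetical, and the sketch further assumes that every job except the first `install_packages.job` run exits cleanly.

```python
import subprocess

# Job scripts in the order given above, with a flag saying whether a
# non-zero exit code should abort the sequence (an assumption of this sketch).
JOBS = [
    ("install_env.job", True),
    ("install_packages.job", False),  # expected to return an error
    ("debug.job", True),
    ("debug2.job", True),
    ("debug3.job", True),
    ("debug4.job", True),
    ("debug5.job", True),
    ("install_packages.job", True),   # should now run without errors
]


def build_sbatch_commands(jobs=JOBS):
    """Return (argv, must_succeed) pairs; --wait makes sbatch block until the job ends."""
    return [(["sbatch", "--wait", script], must_succeed) for script, must_succeed in jobs]


def run_installation(cwd="installation_jobs"):
    """Submit the jobs one at a time from the installation_jobs directory."""
    for argv, must_succeed in build_sbatch_commands():
        subprocess.run(argv, cwd=cwd, check=must_succeed)
```

Running `run_installation()` from the repository root replaces the manual `cd installation_jobs` step, since the working directory is passed to each `sbatch` call.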

Acquiring Datasets

pixelSplat was trained using versions of the RealEstate10k and ACID datasets that were split into ~100 MB chunks for use on server cluster file systems. Small subsets of the RealEstate10k and ACID datasets in this format can be found here. To use them, unzip them into a newly created datasets folder in the project root directory.
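Any archive tool can do the unzipping. As a minimal sketch in Python, the helper below creates the `datasets` folder in the project root if needed and extracts an archive into it. `extract_dataset` is a hypothetical helper, and the archive filename in the usage note is a placeholder, since the actual download name is not stated here.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive, project_root="."):
    """Extract a dataset zip into <project_root>/datasets and return that folder."""
    target = Path(project_root) / "datasets"
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target
```

For example, `extract_dataset("re10k_subset.zip")` (placeholder filename) run from the project root would place the archive contents under `datasets/`.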

The datasets that were used to fine-tune the diffusion model and InstructIR can be found on Hugging Face (https://huggingface.co/datasets/Wouter01/re10k_hard).

Acquiring Pre-trained Checkpoints

You can find pre-trained checkpoints here. You can find the checkpoints for the original codebase (without the improvements from the camera-ready version of the paper) here.

The fine-tuned diffusion and InstructIR models are also available on Hugging Face (https://huggingface.co/Wouter01/diffusion_re10k_hard, https://huggingface.co/Wouter01/InstructIR_re10k_hard).
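The Hugging Face repositories listed in this README can also be fetched programmatically. This is a minimal sketch, assuming the `huggingface_hub` package is installed (it is not stated to be a dependency of this repository). `snapshot_download` is that library's function for downloading a whole repository; the dictionary and helper names, and the `hf_downloads` target folder, are hypothetical.

```python
# Repository IDs taken from the Hugging Face URLs in this README,
# paired with their Hub repository type.
HF_REPOS = {
    "Wouter01/re10k_hard": "dataset",
    "Wouter01/diffusion_re10k_hard": "model",
    "Wouter01/InstructIR_re10k_hard": "model",
}


def download_all(local_root="hf_downloads"):
    """Download each repository into its own sub-folder and return the local paths."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    paths = {}
    for repo_id, repo_type in HF_REPOS.items():
        paths[repo_id] = snapshot_download(
            repo_id=repo_id,
            repo_type=repo_type,
            local_dir=f"{local_root}/{repo_id.split('/')[-1]}",
        )
    return paths
```

Where the training and inference code expects these files to live is not specified here, so the returned paths may need to be moved or passed on explicitly.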

Citation

@misc{wouter2024improve_nvs,
  title={Improving novel view synthesis of 3D Gaussian splats using 2D image enhancement methods},
  author={Wouter Bant and Ádám Divák and Jasper Eppink and Clio Feng and Roos Hutter},
  year={2024},
  url={https://github.com/adamdivak/diffusion_augmented_pixelsplat}
}

Acknowledgements

This code is mainly based on pixelSplat (https://dcharatan.github.io/pixelsplat):

@inproceedings{charatan23pixelsplat,
      title={pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction},
      author={David Charatan and Sizhe Li and Andrea Tagliasacchi and Vincent Sitzmann},
      year={2023},
      booktitle={arXiv},
}

