PointMVSNet is a deep point-based framework for multi-view stereo (MVS). PointMVSNet directly processes the target scene as a point cloud and predicts depth in a coarse-to-fine manner. Our network leverages 3D geometry priors and 2D texture information jointly and effectively by fusing them into a feature-augmented point cloud, and processes the point cloud to estimate a 3D flow for each point.

VAPointMVSNet extends PointMVSNet with visibility-aware multi-view feature aggregation, which allows the network to aggregate multi-view appearance cues while taking occlusions into account.
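For intuition, here is a minimal PyTorch sketch of those two ideas: fetching 2D features for a 3D point by projecting it into each view, and aggregating the per-view features with learned visibility weights. All names (`fetch_feature`, `VisibilityAwareAggregation`) and the exact weighting scheme are assumptions for illustration, not the repository's implementation.

```python
# Minimal sketch (NOT the actual implementation) of feature fetching and
# visibility-aware aggregation. Names and weighting scheme are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fetch_feature(feature_map, points, intrinsics, extrinsics):
    """Project 3D points into one view and bilinearly sample its feature map.

    feature_map: (C, H, W) 2D features of one view
    points:      (N, 3) 3D points in world coordinates
    intrinsics:  (3, 3) camera intrinsic matrix K
    extrinsics:  (3, 4) world-to-camera [R|t]
    returns:     (N, C) sampled per-point features
    """
    C, H, W = feature_map.shape
    # World -> camera -> pixel coordinates.
    cam = points @ extrinsics[:, :3].T + extrinsics[:, 3]
    pix = cam @ intrinsics.T
    uv = pix[:, :2] / pix[:, 2:3].clamp(min=1e-6)
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    grid = torch.stack([2 * uv[:, 0] / (W - 1) - 1,
                        2 * uv[:, 1] / (H - 1) - 1], dim=-1)
    sampled = F.grid_sample(feature_map[None], grid[None, :, None, :])  # (1, C, N, 1)
    return sampled[0, :, :, 0].T                                        # (N, C)


class VisibilityAwareAggregation(nn.Module):
    """Aggregate per-view point features with predicted visibility weights."""

    def __init__(self, feat_dim):
        super().__init__()
        # A small MLP scores how visible each point is in each view.
        self.vis_mlp = nn.Sequential(
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, 1))

    def forward(self, per_view_feats):
        # per_view_feats: (V, N, C), one feature vector per point per view.
        weights = torch.softmax(self.vis_mlp(per_view_feats), dim=0)  # (V, N, 1)
        return (weights * per_view_feats).sum(dim=0)                  # (N, C)
```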
If you find this project useful for your research, please cite:
```
@article{ChenVAPMVSNet2020TPAMI,
  author  = {Chen, Rui and Han, Songfang and Xu, Jing and Su, Hao},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title   = {Visibility-Aware Point-Based Multi-View Stereo Network},
  year    = {2020},
  pages   = {1-1},
}

@inproceedings{ChenPMVSNet2019ICCV,
  author    = {Chen, Rui and Han, Songfang and Xu, Jing and Su, Hao},
  title     = {Point-based Multi-view Stereo Network},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  year      = {2019},
}
```
The environment requirements are listed as follows:
- PyTorch 1.0.1
- CUDA 9.0
- cuDNN 7.4.2
- GCC 5
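An optional way to confirm that the installed toolchain matches these versions from Python (this snippet is just a convenience, not part of the repository):

```python
# Optional environment sanity check (not part of the repository).
import torch

print("PyTorch:", torch.__version__)             # expect 1.0.1
print("CUDA:", torch.version.cuda)               # expect 9.0
print("cuDNN:", torch.backends.cudnn.version())  # expect 7402 for cuDNN 7.4.2
assert torch.cuda.is_available(), "no CUDA device visible to PyTorch"
```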
- Check out the source code

  ```bash
  git clone https://github.com/callmeray/PointMVSNet && cd PointMVSNet
  ```

- Install dependencies

  ```bash
  bash install_dependencies.sh
  ```

- Compile CUDA extensions

  ```bash
  bash compile.sh
  ```

- Download the preprocessed DTU training data from MVSNet and unzip it to `data/dtu`.
- Train the network

  ```bash
  python pointmvsnet/train.py --cfg configs/dtu_wde3.yaml
  ```

  You can change the batch size in the configuration file according to your GPU memory.
- Download the rectified images from the DTU benchmark and unzip them to `data/dtu/Eval`.
- Test with your own model

  ```bash
  python pointmvsnet/test.py --cfg configs/dtu_wde3.yaml
  ```

- Test with the pretrained model

  ```bash
  python pointmvsnet/test.py --cfg configs/dtu_wde3.yaml TEST.WEIGHT outputs/dtu_wde3/model_pretrained.pth
  ```
PointMVSNet generates a depth map for each reference view. To obtain the complete point cloud, apply depth fusion with `tools/depthfusion.py`. Please refer to MVSNet for more details.
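For intuition, the sketch below back-projects a single depth map into a world-space point cloud, which is the core operation behind depth fusion. It is a simplified illustration under assumed camera conventions, not the logic of `tools/depthfusion.py`, which typically also filters points for cross-view consistency before merging the views.

```python
# Simplified illustration of lifting one depth map to 3D points (NumPy).
# Real depth fusion additionally checks photometric/geometric consistency
# across views before merging the per-view point clouds.
import numpy as np


def backproject_depth(depth, intrinsics, cam_to_world):
    """Back-project a depth map into a world-space point cloud.

    depth:        (H, W) per-pixel depth in the reference view
    intrinsics:   (3, 3) camera intrinsic matrix K
    cam_to_world: (4, 4) camera-to-world transform
    returns:      (H*W, 3) 3D points in world coordinates
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel coordinates (u, v, 1).
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Pixel -> camera ray -> 3D point at the given depth.
    cam = (np.linalg.inv(intrinsics) @ pix.T).T * depth.reshape(-1, 1)
    cam_h = np.concatenate([cam, np.ones((cam.shape[0], 1))], axis=1)
    world = (cam_to_world @ cam_h.T).T[:, :3]
    return world
```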