Detector-Free Structure from Motion
Xingyi He, Jiaming Sun, Yifan Wang, Sida Peng, Qixing Huang, Hujun Bao, Xiaowei Zhou
CVPR 2024, 1st in Image Matching Challenge 2023
Please refer to INSTALL.md for installation instructions.
The data structure of our system is organized as follows:
```
repo_path/SfM_dataset
- dataset_name1
    - scene_name_1
        - images
            - image_name_1.jpg (or .png, ...)
            - image_name_2.jpg
            - ...
        - intrins (optional, used for evaluation)
            - camera_name_1.txt
            - camera_name_2.txt
            - ...
        - poses (optional, used for evaluation)
            - pose_name_1.txt
            - pose_name_2.txt
            - ...
    - scene_name_2
        - ...
- dataset_name2
    - ...
```
The folder names `images`, `intrins`, and `poses` are compulsory, since our system uses them to identify the data.
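The sketch below is a minimal sanity check (not part of the repo) that walks an `SfM_dataset` tree laid out as above and reports, for each scene, whether the compulsory `images` folder and the optional `intrins`/`poses` folders are present:

```shell
# Minimal sketch, not part of the repo: verify each scene against the layout above.
# Assumes the dataset root is SfM_dataset under the current directory.
for scene in SfM_dataset/*/*/; do
    if [ ! -d "${scene}images" ]; then
        echo "WARNING: ${scene} is missing the compulsory 'images' folder"
        continue
    fi
    n_images=$(ls "${scene}images" | wc -l)
    echo "${scene}: ${n_images} images," \
         "intrins: $([ -d "${scene}intrins" ] && echo present || echo absent)," \
         "poses: $([ -d "${scene}poses" ] && echo present || echo absent)"
done
```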
Now, download the training and evaluation datasets and format them into the required structure by following the instructions in DATASET_PREPARE.md.
First, modify L22 in `hydra_configs/demo/dfsfm.yaml` to specify the absolute path of the repo.
Then run the following command:

```shell
python eval_dataset.py +demo=dfsfm.yaml
```
The SfM result will be saved in `SfM_dataset/example_dataset/example_scene/DetectorFreeSfM_loftr_official_coarse_only__scratch_no_intrin/colmap_refined` in COLMAP format, and it can be visualized with the COLMAP GUI.
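Besides the GUI, COLMAP's `model_analyzer` command can print summary statistics of the reconstruction from the command line. This is optional and assumes the model files (`cameras`, `images`, `points3D`) sit directly in `colmap_refined`; adjust the path if they are stored in a subfolder:

```shell
# Optional: print statistics (registered images, 3D points, track length, ...) of the demo output.
colmap model_analyzer \
    --path SfM_dataset/example_dataset/example_scene/DetectorFreeSfM_loftr_official_coarse_only__scratch_no_intrin/colmap_refined
```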
```shell
# For ETH3D dataset:
python eval_dataset.py +eth3d_sfm=dfsfm.yaml neuralsfm.NEUSFM_coarse_matcher='loftr_official'
python eval_dataset.py +eth3d_sfm=dfsfm.yaml neuralsfm.NEUSFM_coarse_matcher='aspanformer'
python eval_dataset.py +eth3d_sfm=dfsfm.yaml neuralsfm.NEUSFM_coarse_matcher='matchformer'

# For IMC dataset:
sh scripts/eval_imc_dataset.sh

# For TexturePoorSfM dataset:
sh scripts/eval_texturepoorsfm_dataset.sh
```
```shell
# For ETH3D dataset (triangulation):
python eval_dataset.py +eth3d_tri=dfsfm.yaml neuralsfm.NEUSFM_coarse_matcher='loftr_official'
python eval_dataset.py +eth3d_tri=dfsfm.yaml neuralsfm.NEUSFM_coarse_matcher='aspanformer'
python eval_dataset.py +eth3d_tri=dfsfm.yaml neuralsfm.NEUSFM_coarse_matcher='matchformer'
```
- You can speed up evaluation by enabling multi-processing if you have multiple GPUs. Set `ray.enable=True` and `ray.n_workers=your_gpu_number` in the configs to evaluate multiple scenes within a dataset simultaneously.
- For a scene with many images, such as `Bridge` in the ETH3D dataset, you can use multiple workers for image matching in the coarse SfM and multi-view refinement matching phases by setting `sub_use_ray=True` and `sub_ray_n_worker=your_gpu_number`.
- Increase the batch size in the multi-view refinement phase. Currently, we chunk the tracks in refinement matching, and `neuralsfm.NEUSFM_refinement_chunk_size` is set to `2000` so that it can run on GPUs with less than 12GB of VRAM. If your GPUs have larger VRAM, consider increasing this value to speed up the process. An example command combining these overrides is shown below.
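As an illustration, these options are ordinary Hydra overrides and can be appended to the evaluation commands above; the worker count and chunk size below are example values only, not recommendations:

```shell
# Example only: ETH3D SfM evaluation with Ray multi-processing enabled and a larger refinement chunk size.
python eval_dataset.py +eth3d_sfm=dfsfm.yaml \
    neuralsfm.NEUSFM_coarse_matcher='loftr_official' \
    ray.enable=True ray.n_workers=4 \
    neuralsfm.NEUSFM_refinement_chunk_size=4000
```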
Be sure you have downloaded and formatted the MegaDepth dataset following DATASET_PREPARE.md.
```shell
python train_multiview_matcher.py +experiment=multiview_refinement_matching.yaml paths=dataset_path_config trainer=trainer_config
```
You can modify the GPU ids in `hydra_training_configs/trainer/trainer_config.yaml`. By default, we use 8 GPUs for training.
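If you would rather keep `trainer_config.yaml` untouched, a standard Hydra alternative is to copy it, edit the GPU ids in the copy, and select the copy through the `trainer` config group; the file name below is hypothetical:

```shell
# Hypothetical: hydra_training_configs/trainer/my_trainer.yaml is a copy of trainer_config.yaml
# with the GPU ids edited; select it through the existing `trainer` config group.
python train_multiview_matcher.py +experiment=multiview_refinement_matching.yaml \
    paths=dataset_path_config trainer=my_trainer
```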
Our code is partially based on COLMAP and HLoc; we thank the authors for their great work.
If you find this code useful for your research, please use the following BibTeX entry.
@inproceedings{he2024dfsfm,
    title={Detector-Free Structure from Motion},
    author={He, Xingyi and Sun, Jiaming and Wang, Yifan and Peng, Sida and Huang, Qixing and Bao, Hujun and Zhou, Xiaowei},
    booktitle={CVPR},
    year={2024}
}