Commit 91f40bf
Merge pull request #78 from cvg/dev - v1.1
- **[BREAKING]** improved structure of the SfM folders (triangulation and reconstruction), see [#76](https://github.com/cvg/Hierarchical-Localization/pull/76)
- Support for image retrieval (NetVLAD, DIR) and more local features (SIFT, R2D2)
- Support for more datasets: Aachen v1.1, Extended CMU Seasons, RobotCar Seasons, 4Seasons, Cambridge Landmarks, 7-Scenes
- Simplified pipeline and API
- Spatial matcher
- Support for arbitrary paths of features and matches
- Support for matching multiple feature files together
2 parents e64814c + bd268b4 commit 91f40bf
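The headline change in this release is that image retrieval goes through the same extraction entry point as local features. A minimal sketch of the v1.1 flow, based on the `extract_features.py` changes in this commit; the dataset and output paths are hypothetical, and the `pairs_from_retrieval.main` signature (descriptor file, output pair list, number of pairs) is an assumption, not something shown in this diff:

```python
from pathlib import Path
from hloc import extract_features, pairs_from_retrieval

images = Path('datasets/aachen/images/')   # hypothetical dataset root
outputs = Path('outputs/aachen/')          # hypothetical output directory

# New in v1.1: global descriptors use the same entry point as local features.
retrieval_conf = extract_features.confs['netvlad']
descriptors = extract_features.main(retrieval_conf, images, outputs)

# Turn the descriptors into image pairs for matching and localization.
pairs_from_retrieval.main(descriptors, outputs / 'pairs-netvlad.txt',
                          num_matched=20)
```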


63 files changed (+207,970 −19,332 lines)

.gitignore (+3)

@@ -1,2 +1,5 @@
 __pycache__
 *.pyc
+*.egg-info
+.ipynb_checkpoints
+outputs/

.gitmodules (+6)

@@ -5,3 +5,9 @@
 	path = third_party/SuperGluePretrainedNetwork
 	url = https://github.com/skydes/SuperGluePretrainedNetwork.git
 	branch = fix-memory
+[submodule "third_party/deep-image-retrieval"]
+	path = third_party/deep-image-retrieval
+	url = https://github.com/naver/deep-image-retrieval.git
+[submodule "third_party/r2d2"]
+	path = third_party/r2d2
+	url = https://github.com/naver/r2d2.git

README.md (+38 −9)
@@ -1,6 +1,6 @@
 # hloc - the hierarchical localization toolbox
 
-This is `hloc`, a modular toolbox for state-of-the-art 6-DoF visual localization. It implements [Hierarchical Localization](https://arxiv.org/abs/1812.03506), leveraging image retrieval and feature matching, and is fast, accurate, and scalable. This codebase won the indoor/outdoor [localization challenge at CVPR 2020](https://sites.google.com/view/vislocslamcvpr2020/home), in combination with [SuperGlue](https://psarlin.com/superglue/), our graph neural network for feature matching.
+This is `hloc`, a modular toolbox for state-of-the-art 6-DoF visual localization. It implements [Hierarchical Localization](https://arxiv.org/abs/1812.03506), leveraging image retrieval and feature matching, and is fast, accurate, and scalable. This codebase won the indoor/outdoor localization challenges at [CVPR 2020](https://sites.google.com/view/vislocslamcvpr2020/home) and [ECCV 2020](https://sites.google.com/view/ltvl2020/), in combination with [SuperGlue](https://psarlin.com/superglue/), our graph neural network for feature matching.
 
 With `hloc`, you can:
 
@@ -34,8 +34,6 @@ docker run -it --rm -p 8888:8888 hloc:latest # for GPU support, add `--runtime=
 jupyter notebook --ip 0.0.0.0 --port 8888 --no-browser --allow-root
 ```
 
-
-
 ## General pipeline
 
 The toolbox is composed of scripts, which roughly perform the following steps:
@@ -57,6 +55,7 @@ Strcture of the toolbox:
 - `hloc/*.py` : top-level scripts
 - `hloc/extractors/` : interfaces for feature extractors
 - `hloc/matchers/` : interfaces for feature matchers
+- `hloc/pipelines/` : entire pipelines for multiple datasets
 
 ## Tasks
 
@@ -84,7 +83,11 @@ We show in [`pipeline_SfM.ipynb`](https://nbviewer.jupyter.org/github/cvg/Hierar
 
 ## Results
 
-`hloc` currently supports [SuperPoint](https://arxiv.org/abs/1712.07629) and [D2-Net](https://arxiv.org/abs/1905.03561) local feature extractors; and [SuperGlue](https://arxiv.org/abs/1911.11763) and Nearest Neighbor matchers. Using [NetVLAD](https://arxiv.org/abs/1511.07247) for retrieval, we obtain the following best results:
+- Supported local feature extractors: [SuperPoint](https://arxiv.org/abs/1712.07629), [D2-Net](https://arxiv.org/abs/1905.03561), [SIFT](https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf), and [R2D2](https://arxiv.org/abs/1906.06195).
+- Supported feature matchers: [SuperGlue](https://arxiv.org/abs/1911.11763) and the Nearest Neighbor matcher with ratio test and/or mutual check.
+- Supported image retrieval: [NetVLAD](https://arxiv.org/abs/1511.07247) and [AP-GeM/DIR](https://github.com/naver/deep-image-retrieval).
+
+Using [NetVLAD](https://arxiv.org/abs/1511.07247) for retrieval, we obtain the following best results:
 
 | Methods | Aachen day | Aachen night | Retrieval |
 | ------------------------------------------------------------ | ------------------ | ------------------ | -------------- |
@@ -101,6 +104,10 @@ We show in [`pipeline_SfM.ipynb`](https://nbviewer.jupyter.org/github/cvg/Hierar
 
 Check out [visuallocalization.net/benchmark](https://www.visuallocalization.net/benchmark) for more details and additional baselines.
 
+## Supported datasets
+
+We provide in [`hloc/pipelines/`](./hloc/pipelines) scripts to run the reconstruction and the localization on the following datasets: Aachen Day-Night (v1.0 and v1.1), InLoc, Extended CMU Seasons, RobotCar Seasons, 4Seasons, Cambridge Landmarks, and 7-Scenes.
+
 ## BibTex Citation
 
 If you report any of the above results in a publication, or use any of the tools provided here, please consider citing both [Hierarchical Localization](https://arxiv.org/abs/1812.03506) and [SuperGlue](https://arxiv.org/abs/1911.11763) papers:
@@ -162,17 +169,39 @@ In a match file, each key corresponds to the string `path0.replace('/', '-')+'_'
 <details>
 <summary>[Click to expand]</summary>
 
-For now `hloc` does not have an interface for image retrieval. You will need to export the global descriptors into an HDF5 file, in which each key corresponds to the relative path of an image w.r.t. the dataset root, and contains a dataset `global_descriptor` with size D. You can then export the images pairs with [`hloc/pairs_from_retrieval.py`](hloc/pairs_from_retrieval.py).
+`hloc` also provides an interface for image retrieval via `hloc/extract_features.py`. As previously, simply add a new interface to [`hloc/extractors/`](hloc/extractors/). Alternatively, you will need to export the global descriptors into an HDF5 file, in which each key corresponds to the relative path of an image w.r.t. the dataset root, and contains a dataset `global_descriptor` with size D. You can then export the images pairs with [`hloc/pairs_from_retrieval.py`](hloc/pairs_from_retrieval.py).
+</details>
+
+## Versions
+
+<details>
+<summary>dev branch</summary>
+
+Continuously adds new features.
+</details>
+
+<details>
+<summary>v1.1 (July 2021)</summary>
+
+- **Breaking**: improved structure of the SfM folders (triangulation and reconstruction), see [#76](https://github.com/cvg/Hierarchical-Localization/pull/76)
+- Support for image retrieval (NetVLAD, DIR) and more local features (SIFT, R2D2)
+- Support for more datasets: Aachen v1.1, Extended CMU Seasons, RobotCar Seasons, Cambridge Landmarks, 7-Scenes
+- Simplified pipeline and API
+- Spatial matcher
+</details>
+
+<details>
+<summary>v1.0 (July 2020)</summary>
+
+Initial public version.
 </details>
 
 ## Contributions welcome!
 
 External contributions are very much welcome. This is a non-exaustive list of features that might be valuable additions:
 
-- [ ] more localization datasets (RobotCar Seasons, CMU Seasons, Aachen v1.1, Cambridge Landmarks, 7Scenes)
 - [ ] covisibility clustering for InLoc
 - [ ] visualization of the raw predictions (features and matches)
-- [ ] interfaces for image retrieval (e.g. [DIR](https://github.com/almazan/deep-image-retrieval), [NetVLAD](https://github.com/uzh-rpg/netvlad_tf_open))
-- [ ] other local features
+- [ ] other local features or image retrieval
 
-Created and maintained by [Paul-Edouard Sarlin](https://psarlin.com/).
+Created and maintained by [Paul-Edouard Sarlin](https://psarlin.com/) with the help of many.
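The retrieval paragraph added above specifies the on-disk format for externally computed global descriptors: one HDF5 group per image, keyed by its path relative to the dataset root, holding a dataset named `global_descriptor` of size D. A minimal sketch of writing such a file with `h5py`; the image names are hypothetical and D = 4096 is an arbitrary choice:

```python
import h5py
import numpy as np

# Hypothetical descriptors, keyed by image path relative to the dataset root.
descriptors = {
    'db/0001.jpg': np.random.rand(4096).astype(np.float32),
    'query/0001.jpg': np.random.rand(4096).astype(np.float32),
}

with h5py.File('global-feats-custom.h5', 'a') as fd:
    for name, desc in descriptors.items():
        # One group per image, holding a 'global_descriptor' dataset of size D.
        fd.create_group(name).create_dataset('global_descriptor', data=desc)
```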

doc/depth_aachen.svg (+7,377 −6,903)

doc/loc_aachen.svg (+8,312 −7,888)

doc/loc_inloc.svg (+4,650 −4,196)

hloc/colmap_from_nvm.py (−1)

@@ -18,7 +18,6 @@ def recover_database_images_and_ids(database_path):
     for name, image_id, camera_id in ret:
         images[name] = image_id
         cameras[name] = camera_id
-
     db.close()
     logging.info(
         f'Found {len(images)} images and {len(cameras)} cameras in database.')

hloc/extract_features.py (+103 −35)
@@ -8,10 +8,13 @@
 import numpy as np
 from tqdm import tqdm
 import pprint
+import collections.abc as collections
 
 from . import extractors
 from .utils.base_model import dynamic_load
 from .utils.tools import map_tensor
+from .utils.parsers import parse_image_lists
+from .utils.io import read_image, list_h5_names
 
 
 '''
@@ -34,6 +37,21 @@
             'resize_max': 1024,
         },
     },
+    # Resize images to 1600px even if they are originally smaller.
+    # Improves the keypoint localization if the images are of good quality.
+    'superpoint_max': {
+        'output': 'feats-superpoint-n4096-rmax1600',
+        'model': {
+            'name': 'superpoint',
+            'nms_radius': 3,
+            'max_keypoints': 4096,
+        },
+        'preprocessing': {
+            'grayscale': True,
+            'resize_max': 1600,
+            'resize_force': True,
+        },
+    },
     'superpoint_inloc': {
         'output': 'feats-superpoint-n4096-r1600',
         'model': {
@@ -57,6 +75,34 @@
             'resize_max': 1600,
         },
     },
+    'sift': {
+        'output': 'feats-sift',
+        'model': {
+            'name': 'sift'
+        },
+        'preprocessing': {
+            'grayscale': True,
+            'resize_max': 1600,
+        },
+    },
+    'dir': {
+        'output': 'global-feats-dir',
+        'model': {
+            'name': 'dir',
+        },
+        'preprocessing': {
+            'resize_max': 1024,
+        },
+    },
+    'netvlad': {
+        'output': 'global-feats-netvlad',
+        'model': {
+            'name': 'netvlad',
+        },
+        'preprocessing': {
+            'resize_max': 1024,
+        },
+    },
 }
 
 
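These configurations are plain dictionaries, so downstream code can define its own entry with the same schema and pass it to `main` directly. A sketch; the output name and resize value below are arbitrary choices, not hloc defaults:

```python
# A custom configuration following the schema of the entries added above.
my_conf = {
    'output': 'feats-superpoint-n4096-rmax2048',  # arbitrary output name
    'model': {
        'name': 'superpoint',
        'nms_radius': 3,
        'max_keypoints': 4096,
    },
    'preprocessing': {
        'grayscale': True,
        'resize_max': 2048,      # arbitrary; with resize_force, upsamples too
        'resize_force': True,
    },
}
```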
@@ -65,37 +111,45 @@ class ImageDataset(torch.utils.data.Dataset):
         'globs': ['*.jpg', '*.png', '*.jpeg', '*.JPG', '*.PNG'],
         'grayscale': False,
         'resize_max': None,
+        'resize_force': False,
     }
 
-    def __init__(self, root, conf):
+    def __init__(self, root, conf, paths=None):
         self.conf = conf = SimpleNamespace(**{**self.default_conf, **conf})
         self.root = root
 
-        self.paths = []
-        for g in conf.globs:
-            self.paths += list(Path(root).glob('**/'+g))
-        if len(self.paths) == 0:
-            raise ValueError(f'Could not find any image in root: {root}.')
-        self.paths = sorted(list(set(self.paths)))
-        self.paths = [i.relative_to(root) for i in self.paths]
-        logging.info(f'Found {len(self.paths)} images in root {root}.')
+        if paths is None:
+            paths = []
+            for g in conf.globs:
+                paths += list(Path(root).glob('**/'+g))
+            if len(paths) == 0:
+                raise ValueError(f'Could not find any image in root: {root}.')
+            paths = sorted(list(set(paths)))
+            self.names = [i.relative_to(root).as_posix() for i in paths]
+            logging.info(f'Found {len(self.names)} images in root {root}.')
+        else:
+            if isinstance(paths, (Path, str)):
+                self.names = parse_image_lists(paths)
+            elif isinstance(paths, collections.Iterable):
+                self.names = [p.as_posix() if isinstance(p, Path) else p
+                              for p in paths]
+            else:
+                raise ValueError(f'Unknown format for path argument {paths}.')
+
+            for name in self.names:
+                if not (root / name).exists():
+                    raise ValueError(
+                        f'Image {name} does not exists in root: {root}.')
 
     def __getitem__(self, idx):
-        path = self.paths[idx]
-        if self.conf.grayscale:
-            mode = cv2.IMREAD_GRAYSCALE
-        else:
-            mode = cv2.IMREAD_COLOR
-        image = cv2.imread(str(self.root / path), mode)
-        if not self.conf.grayscale:
-            image = image[:, :, ::-1]  # BGR to RGB
-        if image is None:
-            raise ValueError(f'Cannot read image {str(path)}.')
+        name = self.names[idx]
+        image = read_image(self.root / name, self.conf.grayscale)
         image = image.astype(np.float32)
         size = image.shape[:2][::-1]
         w, h = size
 
-        if self.conf.resize_max and max(w, h) > self.conf.resize_max:
+        if self.conf.resize_max and (self.conf.resize_force
+                                     or max(w, h) > self.conf.resize_max):
             scale = self.conf.resize_max / max(h, w)
             h_new, w_new = int(round(h*scale)), int(round(w*scale))
             image = cv2.resize(
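With this change, `ImageDataset` accepts an optional `paths` argument instead of always globbing the root. A sketch of the three accepted forms, assuming `root` points at a dataset directory and that `parse_image_lists` reads one image name per line; the file and image names are hypothetical:

```python
conf = confs['superpoint_aachen']['preprocessing']

dataset = ImageDataset(root, conf)                    # glob all images under root
dataset = ImageDataset(root, conf, 'image_list.txt')  # parsed by parse_image_lists
dataset = ImageDataset(root, conf, ['db/0001.jpg',    # any iterable of str or Path
                                    Path('db/0002.jpg')])
```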
@@ -108,33 +162,43 @@ def __getitem__(self, idx):
         image = image / 255.
 
         data = {
-            'name': path.as_posix(),
+            'name': name,
             'image': image,
             'original_size': np.array(size),
         }
         return data
 
     def __len__(self):
-        return len(self.paths)
+        return len(self.names)
 
 
 @torch.no_grad()
-def main(conf, image_dir, export_dir, as_half=False):
+def main(conf, image_dir, export_dir=None, as_half=False,
+         image_list=None, feature_path=None):
     logging.info('Extracting local features with configuration:'
                  f'\n{pprint.pformat(conf)}')
 
-    device = 'cuda' if torch.cuda.is_available() else 'cpu'
-    Model = dynamic_load(extractors, conf['model']['name'])
-    model = Model(conf['model']).eval().to(device)
-
-    loader = ImageDataset(image_dir, conf['preprocessing'])
+    loader = ImageDataset(image_dir, conf['preprocessing'], image_list)
     loader = torch.utils.data.DataLoader(loader, num_workers=1)
 
-    feature_path = Path(export_dir, conf['output']+'.h5')
+    if feature_path is None:
+        feature_path = Path(export_dir, conf['output']+'.h5')
     feature_path.parent.mkdir(exist_ok=True, parents=True)
-    feature_file = h5py.File(str(feature_path), 'a')
+    skip_names = set(list_h5_names(feature_path)
+                     if feature_path.exists() else ())
+    if set(loader.dataset.names).issubset(set(skip_names)):
+        logging.info('Skipping the extraction.')
+        return feature_path
+
+    device = 'cuda' if torch.cuda.is_available() else 'cpu'
+    Model = dynamic_load(extractors, conf['model']['name'])
+    model = Model(conf['model']).eval().to(device)
 
     for data in tqdm(loader):
+        name = data['name'][0]  # remove batch dimension
+        if name in skip_names:
+            continue
+
         pred = model(map_tensor(data, lambda x: x.to(device)))
         pred = {k: v[0].cpu().numpy() for k, v in pred.items()}
 
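With the skip logic above, extraction becomes resumable: images whose names are already in the feature file are skipped, and the call returns early when nothing is left to do. A sketch with hypothetical paths:

```python
images = Path('datasets/aachen/images/')     # hypothetical dataset root
feats = Path('outputs/feats-superpoint.h5')  # hypothetical feature file
main(confs['superpoint_aachen'], images, feature_path=feats)
# Rerunning the same call finds every name already in the HDF5 file,
# logs 'Skipping the extraction.', and returns the same path.
main(confs['superpoint_aachen'], images, feature_path=feats)
```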
@@ -150,14 +214,15 @@ def main(conf, image_dir, export_dir, as_half=False):
             if (dt == np.float32) and (dt != np.float16):
                 pred[k] = pred[k].astype(np.float16)
 
-        grp = feature_file.create_group(data['name'][0])
-        for k, v in pred.items():
-            grp.create_dataset(k, data=v)
+        with h5py.File(str(feature_path), 'a') as fd:
+            grp = fd.create_group(name)
+            for k, v in pred.items():
+                grp.create_dataset(k, data=v)
 
         del pred
 
-    feature_file.close()
     logging.info('Finished exporting features.')
+    return feature_path
 
 
 if __name__ == '__main__':
@@ -166,5 +231,8 @@ def main(conf, image_dir, export_dir, as_half=False):
     parser.add_argument('--export_dir', type=Path, required=True)
     parser.add_argument('--conf', type=str, default='superpoint_aachen',
                         choices=list(confs.keys()))
+    parser.add_argument('--as_half', action='store_true')
+    parser.add_argument('--image_list', type=Path)
+    parser.add_argument('--feature_path', type=Path)
     args = parser.parse_args()
-    main(confs[args.conf], args.image_dir, args.export_dir)
+    main(confs[args.conf], args.image_dir, args.export_dir, args.as_half)

hloc/extractors/d2net.py (+3 −7)
@@ -1,7 +1,6 @@
 import sys
 from pathlib import Path
 import subprocess
-import logging
 import torch
 
 from ..utils.base_model import BaseModel
@@ -15,22 +14,19 @@
 class D2Net(BaseModel):
     default_conf = {
         'model_name': 'd2_tf.pth',
+        'checkpoint_dir': d2net_path / 'models',
         'use_relu': True,
         'multiscale': False,
     }
     required_inputs = ['image']
 
     def _init(self, conf):
-        model_file = d2net_path / 'models' / conf['model_name']
+        model_file = conf['checkpoint_dir'] / conf['model_name']
         if not model_file.exists():
             model_file.parent.mkdir(exist_ok=True)
             cmd = ['wget', 'https://dsmn.ml/files/d2-net/'+conf['model_name'],
                    '-O', str(model_file)]
-            ret = subprocess.call(cmd)
-            if ret != 0:
-                logging.warning(
-                    f'Cannot download the D2-Net model with `{cmd}`.')
-                exit(ret)
+            subprocess.run(cmd, check=True)
 
         self.net = _D2Net(
             model_file=model_file,
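The new `checkpoint_dir` option lets callers point D2-Net at an already-downloaded checkpoint rather than the default `models/` location. A sketch using `dynamic_load` as in `extract_features.py`; the local directory is hypothetical:

```python
from pathlib import Path
from hloc import extractors
from hloc.utils.base_model import dynamic_load

conf = {
    'model_name': 'd2_tf.pth',
    'checkpoint_dir': Path('/data/checkpoints/d2net'),  # hypothetical local dir
}
Model = dynamic_load(extractors, 'd2net')
model = Model(conf).eval()  # downloads the weights only if the file is missing
```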
