This project is a fork of the original ann-benchmarks repository. We make significant use of their codebase and extend it to handle dynamically updating algorithms. Please see the original project for more information. Attempts have been made to keep the original functionality of the benchmarks accessible, but this project is mainly concerned with dynamically updating algorithms.
The following methods are those evaluated in the original project, with annotations concerning their support for dynamic updates. We have also added `dci-knn` and `pydci`.
Method | Dynamic updates? | Notes on dynamic updates |
---|---|---|
Annoy | ❌ | "in particular you can not add more items once the tree has been created" |
FLANN | ✔️ | "Working with dynamic point clouds without a need to rebuild entire kd-tree index" |
scikit-learn: LSHForest, KDTree, BallTree | ❌ | Requires rebuilding the tree each time |
PANNS | ❌ | Can be compared to old indices but needs to be recreated |
NearPy | ❌ | Rehashes with each new query vector |
KGraph | ❌ | The index file has to be rebuilt with each new dataset. |
NMSLIB (Non-Metric Space Library): SWGraph, HNSW, BallTree, MPLSH | ❌ | Static data only |
hnswlib (part of the nmslib project) | ✔️ | Updatable module |
RPForest | ❌ | Tree structures have to be rebuilt |
FAISS | ❌ | The x_i's are assumed to be fixed |
DolphinnPy | ❌ | Hyperplane LSH family model |
Datasketch | ❌ | Rebuild index each time |
PyNNDescent | ❌ | Rebuild index each time |
MRPT | ❌ | Rebuild index each time |
NGT: ONNG, PANNG, QG | ❌ | Project has evolved into VALD, which supports dynamic updating |
SPTAG | ✔️ | "Fresh update: Support online vector deletion and insertion" |
PUFFINN | ❌ | Can insert points but needs to be rebuilt after insertion |
N2 | ❌ | Rebuild each time, but can be split over several threads |
ScaNN | ❌ | Works well with large data, but does not support dynamic updating |
Elastiknn | ✔️ | "Implementation based on standard Elasticsearch and Lucene primitives, entirely in the JVM. Indexing and querying scale horizontally with Elasticsearch." |
OpenSearch KNN | ❌ | High latency in large dimensional vectors |
DiskANN: Vamana, Vamana-PQ | ✔️ | Updates when points are added |
Vespa | ✔️ | "Vespa is self-repairing and dynamic" |
scipy: cKDTree | ❌ | Tree structure needs to be rebuilt |
vald | ✔️ | "Vald uses distributed index graphs so it continues to work during indexing" |
dci-knn | ❌ | |
pydci | ✔️ | |
The datasets used for this project were provided to us by Siemens and are not included in this repository. Before running any tests on this data, please move the following files into the `data` directory: `AS.csv`, `OLHC.csv`, `SHERPA.csv`, and `SHERPA_100000.csv`.
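Assuming the CSV files sit in the repository root, a move like the following should work (the `data` directory may need to be created first):

```bash
mkdir -p data
mv AS.csv OLHC.csv SHERPA.csv SHERPA_100000.csv data/
```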
Note that the data included with the original project should be adaptable for dynamic tests. See the function `ann_benchmarks.datasets.write_dynamic_output` for more information and `ann_benchmarks.datasets.siemens_dynamic` for an example.
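Conceptually, adapting a static dataset just means designating part of it as the initial index and streaming in the rest as insertions interleaved with queries. The snippet below is only an illustrative sketch of that split; the helper name and parameters are hypothetical, and the actual logic lives in `write_dynamic_output` and `siemens_dynamic`:

```python
import numpy as np

def split_for_dynamic_test(X, initial_fraction=0.25, seed=0):
    """Hypothetical helper: split a static dataset into an initial index set
    and a stream of points to be inserted while querying."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))       # shuffle so the stream is not ordered
    n_initial = int(initial_fraction * len(X))
    initial = X[order[:n_initial]]        # indexed before any queries run
    stream = X[order[n_initial:]]         # inserted incrementally during the run
    return initial, stream

# Random data standing in for one of the original static datasets:
X = np.random.rand(10_000, 32).astype(np.float32)
initial, stream = split_for_dynamic_test(X)
```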
The only prerequisites are Python (tested with 3.6) and Docker.
- Clone the repo.
- Run `pip install -r requirements.txt`.
- Run `python install.py` to build all the libraries inside Docker containers (this can take a while, roughly 10-30 minutes).
In order to replicate the results summarized in the report, try using the script `run_script.sh`. As currently set up, it will take a very long time to run, so running only one dataset or algorithm at a time may be preferable. This can be done with the command

```bash
python run.py --dataset $DATASET --count -1 --runs 5 --algorithm $ALGORITHM
```

where `$DATASET` can be any of

- `siemens-sherpa`
- `siemens-big-sherpa`
- `siemens-olhc`
- `siemens-as`

and `$ALGORITHM` can be any of

- `bruteforce`
- `dciknn`
- `flann`
- `hnswlib`
- `pydci`
- `sptag`
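For example, to benchmark hnswlib on the SHERPA dataset with five runs:

```bash
python run.py --dataset siemens-sherpa --count -1 --runs 5 --algorithm hnswlib
```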
See `python run.py --help` for further explanation and command line arguments.
In order to change the values tested in the parameter sweep for each method, change the `args` and `query-args` entries under the desired methods in `algos.yaml`.
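For instance, an entry following the upstream ann-benchmarks layout looks roughly like the sketch below. The exact keys and values for each method in this fork's `algos.yaml` may differ (the tag name and parameter values here are assumptions), so treat this only as an illustration of where `args` and `query-args` live:

```yaml
hnswlib:
  docker-tag: ann-benchmarks-hnswlib        # assumed tag name
  module: ann_benchmarks.algorithms.hnswlib
  constructor: HnswLib
  run-groups:
    M-16:
      args: [[16], [100, 200]]              # build-time parameters swept over
      query-args: [[10, 50, 100, 200]]      # query-time parameters swept over
```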
Since the testing workflow is somewhat complex, a detailed walkthrough is provided in `docs/dynamic_workflow_outline.md`.
To replicate the majority of plots in the report, run `plot_script.sh` after running `run_script.sh`. This script will plot results for a few combinations of algorithms on all datasets. See `python plot_dynamic.py --help` for more information, and see the commands in `plot_script.sh` for examples. Note that specifying metrics in the `--best_metric` argument will find the run with the best behavior in that metric over the last 10% of queries. These runs will also be cached separately so that replotting is much faster. Using the `--force` option will recompute these best runs rather than using their cached versions.
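As a purely hypothetical example (flags other than `--best_metric`, `--force`, and `--algorithms` are assumptions, as is the metric name; check `python plot_dynamic.py --help` for the options this fork actually accepts), a best-recall plot might be requested with something like:

```bash
python plot_dynamic.py --dataset siemens-sherpa --algorithms hnswlib pydci \
    --best_metric recall --force
```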
In addition, see `custom_plot_dynamic.py` for plotting individual runs rather than averages of all runs or the runs best in a certain metric. The main difference is that the `--algorithms` argument takes the specific HDF5 files of the runs to be plotted rather than algorithm names.
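For illustration only (the results paths shown here are hypothetical; pass whichever HDF5 files your runs actually produced), such an invocation might look like:

```bash
python custom_plot_dynamic.py \
    --algorithms results/siemens-sherpa/hnswlib/run_0.hdf5 \
                 results/siemens-sherpa/pydci/run_0.hdf5
```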
- Add your algorithm into `ann_benchmarks/algorithms` by providing a small Python wrapper (a hedged sketch of such a wrapper follows this list).
- Add a Dockerfile for it in `install/`.
- Add it to `algos.yaml`.
- Make everything easy to replicate, including installing and preparing the datasets.
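The sketch below follows the upstream ann-benchmarks convention of a wrapper class with `fit`, `set_query_arguments`, `query`, and `__str__` methods. The exact base class and any dynamic-update hooks this fork expects may differ, so treat it as a minimal illustration (backed here by an exact brute-force search purely so the example runs):

```python
import numpy as np


class MyWrapper:
    """Illustrative wrapper sketch. A real wrapper would delegate to the
    library being benchmarked instead of the exact search used here."""

    def __init__(self, metric):
        self.metric = metric          # build-time parameter (from `args`)
        self.data = None

    def fit(self, X):
        # Build the index over the initial data; here we just store the vectors.
        self.data = np.asarray(X, dtype=np.float32)

    def set_query_arguments(self, n_candidates):
        # Query-time parameter (from `query-args`); unused by this exact search.
        self.n_candidates = n_candidates

    def query(self, q, n):
        # Return the indices of the n nearest neighbors of q.
        dists = np.linalg.norm(self.data - np.asarray(q, dtype=np.float32), axis=1)
        return np.argsort(dists)[:n]

    def __str__(self):
        return f"MyWrapper(metric={self.metric})"
```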
- Single queries are used by default. ANN-Benchmarks enforces that only one CPU is saturated during experimentation, i.e., no multi-threading.
- We mainly support CPU-based ANN algorithms. GPU support exists for FAISS, but it has to be compiled with GPU support locally and experiments must be run using the flags `--local --batch`.
- Set similarity datasets are treated as sparse, so we pass a sorted array of integers to algorithms to represent each user's set.
Original project built by Erik Bernhardsson with significant contributions from Martin Aumüller and Alexander Faithfull.
The modifications for this project were primarily made by Craig Gross with support from Daniel Ejsmont, Cullen Haselby, and Jim Lewis at Michigan State University.
The following publication details design principles behind the original benchmarking framework:
- M. Aumüller, E. Bernhardsson, A. Faithfull: ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms. Information Systems 2019. DOI: 10.1016/j.is.2019.02.006
- big-ann-benchmarks is a benchmarking effort for billion-scale approximate nearest neighbor search as part of the NeurIPS'21 Competition track.