Vector codec benchmarks

We compare the vector codecs on a few datasets that hopefully are representative of real use cases.

For all cases, the procedure is the same: we instantiate the codec with an index_factory. Then we train it on 1M representative vectors. Then we evaluate:

the reconstruction error on an independent set of 1M vectors. The reconstruction is the average L2 error (non squared) after encoding and decoding back. Note that some codecs do not support reconstruction (ITQ) in a meaningful way.
the symmetric similarity search accuracy (as 1-nearest-neighbor recall at rank 1). We use 10k vectors, that we encode/decode and search the nearest neighbors in the 1M previous vectors (that are also encoded/decoded). This reflects how much the codec destroyed the discriminative ability of vectors. For binary codecs, we use the Hamming distance to evaluate this.
the asymmetric similarity search accuracy. This is the same as above, except that the queries are not encoded. This reflects a use case where query vectors that are immediately available are compared against encoded vectors from a database.

We report these measures in tables, sorted by increasing code size. We also show plots of the symmetric search accuracy. The codecs we tested were chosen from a reasonable set of baselines. It is possible that a bit more of fine-tuning can increase their performance.

SIFT1M

This is a very classical academic dataset of 1M SIFT vectors in 128D. The vectors come encoded as bytes, so it does not make sense to use more than 128 bytes to encode it.

The table below reports the most interesting operating points:

factory key	code size	codec size	recons error	sym recall @ 1	asym recall @ 1	decode time
ITQ64,LSH	8	48.7 k	-	0.051	-	-
OPQ8_48,PQ8	8	72.2 k	308.2	0.141	0.220	0.23
PCAR24,ZnLattice1x54_6	8	77.3 k	364	0.101	0.145	0.20
OPQ8_48,PQ8x10	10	216 k	301.5	0.215	0.302	0.25
OPQ8_48,PQ8x12	12	792 k	297.1	0.282	0.353	0.27
Pad128,ITQ128,LSH	16	129 k	-	0.129	-	-
OPQ16_64,PQ16	16	96.2 k	138.8	0.307	0.412	0.23
OPQ16_64,PQ16x10	20	288 k	125.9	0.428	0.504	0.28
OPQ16_64,PQ16x12	24	1.03 M	118.7	0.523	0.567	0.34
Pad256,ITQ256,LSH	32	514 k	-	0.223	-	-
PCAR64,SQ4	32	97.9 k	121.2	0.481	0.540	0.22
PCAR96,ZnLattice4x54_6	32	114 k	98.18	0.468	0.572	0.60
OPQ32_128,IVF16384,PQ32	34	8.19 M	61.86	0.612	0.666	0.14
OPQ32_128,PQ32x10	40	576 k	56.2	0.649	0.718	0.57
OPQ32_128,Residual2x14,PQ32x10	44	8.56 M	45.34	0.727	0.764	0.11
OPQ32_128,Residual2x14,PQ32x12	52	10.1 M	36.2	0.789	0.809	0.11
OPQ32_128,Residual2x14,PQ32x14	60	16.1 M	29.55	0.835	0.850	0.14
SQ4	64	1105	42.19	0.740	0.780	0.03
OPQ64_256,PQ64x10	80	1.13 M	17.32	0.878	0.896	1.04
SQ6	96	1105	9.904	0.928	0.942	0.06
OPQ64_256,PQ64x12	96	4.13 M	10.12	0.922	0.933	1.25
PCAR128,SQ8	128	131 k	3.201	0.968	0.977	0.38
SQ8	128	1105	2.468	0.976	0.979	0.04
SQfp16	256	81	0	0.992	0.992	0.03

nan means that the operation does not make sense (ITQ cannot reconstruct). The decode time is indicated as seconds to decode 1M vectors on a full machine (48 threads).

We can make the following observations:

ITQ64,LSH is a very calssical binary hash. For all operating points, it is outperformed by the product quantizer variants and even the scalar quantizer.
PQ variants give very good similarity search results. However, note that the pre-processing (OPQx) requires to store a rotation matrix in the codec, which can make it large.
for a given PQ size (eg. PQ16) increasing the number of bits per quantizer (beyond the default 8) is a good way to scale up the accuracy for a modest increase of code size
residual codecs improve a bit over the PQ, but they are a lot slower to encode, and they require even more training data
SQ8 does not give 100% accuracy as it should because of quantization artifacts.
OPQ and PCA matrices for this large datasets are bulky to store in the codec. Note that PCA matrices (but not OPQ) are stored twice, because the raw eigenvalues are kept. Hence, for a PCA with input dimension d, it is possible to subtract d*d*4 bytes from the codec size.
the lattice quantizer is better than ITQ but not as good as the scalar quantizer. This may be because the parameters were not optimized for such short vectors

Resnet50

This is a dataset of 2048-dim vectors produced by the average pooling layer of a resnet50. The vectors come from the repository low-shot-with-diffusion.

factory key	code size	codec size	recons error	sym recall @ 1	asym recall @ 1	encode time	decode time
ITQ64,LSH	8	536 k	-	0.028	-	42.41	-
OPQ8_48,PQ8	8	432 k	11.14	0.049	0.086	18.95	1.91
PCAR24,ZnLattice1x54_6	8	16.2 M	12.67	0.045	0.060	18.93	1.21
OPQ8_48,PQ8x10	10	576 k	10.89	0.082	0.129	23.43	1.95
OPQ24_96,PQ12	12	864 k	10.29	0.071	0.129	14.53	0.94
OPQ8_48,PQ8x12	12	1.13 M	10.8	0.124	0.171	40.86	0.79
Pad128,ITQ128,LSH	16	1.07 M	-	0.069	-	41.59	-
RR128,LSHt	16	1 M	-	0.056	-	18.27	-
OPQ16_64,PQ16	16	576 k	10.14	0.161	0.214	13.93	0.89
OPQ16_64,PQ16x10	20	768 k	9.904	0.235	0.278	15.74	0.86
OPQ16_64,PQ16x12	24	1.5 M	9.779	0.284	0.316	21.72	0.92
OPQ16_64,PQ16x14	28	4.5 M	9.717	0.318	0.342	45.72	0.92
Pad256,ITQ256,LSH	32	2.26 M	-	0.125	-	43.74	-
PCAR96,ZnLattice4x54_6	32	16.8 M	8.84	0.361	0.408	22.24	1.20
PCAR64,SQ4	32	16.5 M	9.96	0.255	0.291	14.65	0.84
OPQ32_128,IVF16384,PQ32	34	9.13 M	7.783	0.419	0.486	23.32	0.86
OPQ32_128,PQ32x10	40	1.5 M	7.773	0.440	0.496	29.70	1.37
OPQ32_128,Residual2x14,PQ32x10	44	9.5 M	7.571	0.527	0.569	46.01	1.00
OPQ32_128,PQ32x12	48	3 M	7.566	0.516	0.554	32.23	1.22
OPQ32_128,Residual2x14,PQ32x12	52	11 M	7.472	0.582	0.609	57.47	0.91
OPQ32_128,Residual2x14,PQ32x14	60	17 M	7.415	0.603	0.615	95.27	0.71
OPQ64_256,PQ64	64	2.25 M	5.801	0.540	0.609	17.50	1.47
PCAR128,SQ4	64	17 M	7.763	0.457	0.515	14.18	1.03
OPQ64_256,PQ64x10	80	3 M	5.367	0.630	0.691	28.46	1.87
OPQ64_256,PQ64x12	96	6 M	5.134	0.716	0.756	47.76	1.93
OPQ128_512,PQ128	128	4.5 M	4.159	0.713	0.752	19.18	4.18
PCAR128,SQ8	128	17 M	7.404	0.642	0.647	15.39	1.03
PCAR384,ZnLattice16x54_6	128	19 M	4.688	0.679	0.737	19.05	2.75
OPQ128_512,PQ128x10	160	6 M	3.769	0.781	0.811	29.75	3.30
OPQ128_512,PQ128x12	192	12 M	3.564	0.829	0.851	76.59	4.11
OPQ128_512,PQ128x14	224	36 M	3.428	0.854	0.873	309.20	9.02
PCAR256,SQ8	256	18 M	4.861	0.846	0.850	38.03	1.40
PCAR1024,Residual2x14,PQ256	260	89 M	3.6	0.709	0.769	60.21	4.64
PCAR1024,Residual2x14,PQ256x12	388	104 M	2.529	0.863	0.890	181.39	4.76
PCAR512,SQ8	512	20 M	3.233	0.919	0.921	18.14	4.00
PCAR1024,PQ512x12	768	40 M	2.01	0.953	0.955	192.48	13.30
SQ4	1024	16.1 k	3.534	0.725	0.760	50.48	0.57
SQ8	2048	16.1 k	0.2162	0.980	0.982	16.80	0.58
SQfp16	4096	81	0.004493	0.994	0.994	15.67	0.53

Observations:

ITQ is again largely outperformed
in high dimensions, the PQ codecs are easily > 100MB just to store the codec itself. The encoding time can also be prohibitive.
in high dimensions, the scalar quantizer options are interesting, especially because they do not need large codecs
the residual codecs are a bit better than the scalar ones
the lattice quantizer is competitive with OPQ variants for low bit rates (up to 32). It is much cheaper to train but a lot more expensive to code and decode.

reconstruction error plot

For comparison, here is the reconstruction error plot. The recall measures and reconstruction are heavily correlated: the lower the reconstruction error, the better the retrieval performance.

Sentence embeddings

This is a dataset with 1024-dim German sentences from the LASER toolkit.

factory key	code size	codec size	recons error	sym recall @ 1	asym recall @ 1	encode time	decode time
ITQ64,LSH	8	276 k	-	0.025	-	43.36	-
OPQ8_48,PQ8	8	240 k	0.04436	0.034	0.062	9.93	1.82
PCAR24,ZnLattice1x54_6	8	4.1 M	0.04515	0.030	0.038	9.56	1.75
OPQ8_48,PQ8x10	10	384 k	0.04405	0.052	0.080	13.59	0.47
OPQ24_96,PQ12	12	480 k	0.04337	0.054	0.101	7.58	0.63
OPQ8_48,PQ8x12	12	960 k	0.04384	0.070	0.090	33.32	0.47
Pad128,ITQ128,LSH	16	580 k	-	0.063	-	32.96	-
OPQ16_64,PQ16	16	320 k	0.04318	0.098	0.127	13.10	0.52
OPQ16_64,PQ16x10	20	512 k	0.04292	0.126	0.143	10.93	0.65
OPQ16_64,PQ16x12	24	1.25 M	0.04278	0.141	0.157	15.56	0.66
OPQ24_96,PQ24	24	480 k	0.04203	0.155	0.202	40.63	0.59
OPQ16_64,PQ16x14	28	4.25 M	0.0427	0.151	0.160	102.27	0.59
Pad256,ITQ256,LSH	32	1.25 M	-	0.121	-	33.70	-
OPQ32_128,PQ32	32	640 k	0.04111	0.206	0.255	18.77	0.83
PCAR64,SQ4	32	4.26 M	0.04323	0.090	0.116	16.19	0.51
PCAR96,ZnLattice4x54_6	32	4.38 M	0.04168	0.201	0.222	8.69	0.81
OPQ32_128,IVF16384,PQ32	34	8.63 M	0.04079	0.242	0.275	82.16	0.42
OPQ32_128,PQ32x10	40	1 M	0.04071	0.258	0.288	22.44	0.79
OPQ32_128,Residual2x14,PQ32x10	44	9 M	0.04055	0.289	0.305	34.87	0.41
OPQ32_128,PQ32x12	48	2.5 M	0.04052	0.292	0.307	91.29	0.89
OPQ32_128,Residual2x14,PQ32x14	60	16.5 M	0.04038	0.315	0.322	94.09	0.41
OPQ64_256,PQ64	64	1.25 M	0.03845	0.363	0.421	12.66	1.44
PCAR128,SQ4	64	4.51 M	0.04125	0.189	0.230	13.19	0.71
PCAR64,SQ8	64	4.26 M	0.0426	0.166	0.165	17.95	0.54
OPQ64_256,PQ64x10	80	2 M	0.03791	0.435	0.469	111.14	1.75
OPQ64_256,PQ64x12	96	5 M	0.03767	0.473	0.498	213.54	1.64
OPQ128_512,PQ128	128	2.5 M	0.0355	0.534	0.599	16.70	3.06
PCAR384,ZnLattice16x54_6	128	5.51 M	0.03623	0.554	0.596	10.71	1.80
OPQ128_512,PQ128x10	160	4 M	0.03497	0.630	0.668	25.14	3.18
OPQ128_512,PQ128x12	192	10 M	0.03467	0.680	0.713	68.92	3.09
OPQ128_512,PQ128x14	224	34 M	0.03452	0.711	0.729	338.22	7.07
PCAR1024,PQ256	256	9.01 M	0.01069	0.646	0.700	122.88	7.62
PCAR256,SQ8	256	5.01 M	0.03747	0.528	0.528	10.69	1.08
PCAR768,ZnLattice32x54_6	256	7.01 M	0.03314	0.724	0.780	12.76	3.42
PCAR1024,Residual2x14,PQ256x12	388	88 M	0.004612	0.839	0.867	208.75	3.67
PCAR1024,PQ512	512	9.01 M	0.003313	0.881	0.901	33.06	8.53
PCAR512,SQ8	512	6.01 M	0.03431	0.768	0.768	11.32	2.00
PCAR1024,PQ512x12	768	24 M	0.001003	0.965	0.970	227.62	13.56
SQ6	768	8273	0.003924	0.843	0.872	22.25	0.47
SQ8	1024	8273	0.0009623	0.955	0.965	18.18	0.27
SQfp16	2048	81	1.041e-05	0.999	0.999	37.52	0.28

Observations:

Although the dataset is lower-dimensional than the resnet50 embeddings, the raw retrieval performance is lower, so it is also more difficult to index.
Encoding times can be quite prohibitive for residual codecs.
the lattice quantizer is very competitive in this setting. It obtains competitive accuracies with the PQ variants.

The evaluation code is in bench_standalone_codec.py.

The plots were produced with this notebook