Collections can have multiple [named vectors](/weaviate/config-refs/collections#named-vectors). The vectors in a collection can have their own configurations, and compression must be enabled independently for each vector. Every vector is independent and can use [PQ](/weaviate/configuration/compression/pq-compression), [BQ](/weaviate/configuration/compression/bq-compression), [RQ](/weaviate/configuration/compression/rq-compression), [SQ](/weaviate/configuration/compression/sq-compression), or no compression.
docs/weaviate/concepts/vector-quantization.md
Lines changed: 16 additions & 14 deletions
@@ -118,25 +118,17 @@ When SQ is enabled, Weaviate boosts recall by over-fetching compressed results.
118
118
119
119
## Rotational quantization
120
120
121
-
:::info Added in `v1.32`
122
-
123
-
**8-bit Rotational quantization (RQ)** was added in **`v1.32`**.
124
-
125
-
:::
121
+
**Rotational quantization (RQ)** is a quantization technique that provides significant compression while maintaining high recall in internal testing. Unlike SQ, RQ requires no training phase and can be enabled immediately at index creation. RQ is available in two variants: **8-bit RQ** and **1-bit RQ**.
126
122
127
-
:::caution Preview
123
+
### 8-bit RQ
128
124
129
-
**1-bit Rotational quantization (RQ)** was added in **`v1.33`** as a **preview**.<br/>
125
+
:::info Added in `v1.32` and `v1.34`
130
126
131
-
This means that the feature is still under development and may change in future releases, including potential breaking changes.
132
-
**We do not recommend using this feature in production environments at this time.**
127
+
**8-bit Rotational quantization (RQ)** for HNSW indexes was added in **`v1.32`**.<br/>
128
+
**8-bit Rotational quantization (RQ)** for flat indexes was added in **`v1.34`** as a **preview**.<br/>
133
129
134
130
:::
135
131
136
-
**Rotational quantization (RQ)** is a quantization technique that provides significant compression while maintaining high recall in internal testing. Unlike SQ, RQ requires no training phase and can be enabled immediately at index creation. RQ is available in two variants: **8-bit RQ** and **1-bit RQ**.
137
-
138
-
### 8-bit RQ
139
-
140
132
8-bit RQ provides 4x compression while maintaining 98-99% recall in internal testing. The method works as follows:
141
133
142
134
1. **Fast pseudorandom rotation**: The input vector is transformed using a fast rotation based on the Walsh-Hadamard transform. This rotation takes approximately 7-10 microseconds for a 1536-dimensional vector. The output dimension is rounded up to the nearest multiple of 64.
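The rotation step can be illustrated with a minimal Python sketch. This is not Weaviate's implementation: the function names, the random sign flips, and the choice to apply the transform independently to each 64-wide block are assumptions made for illustration. It shows only the two properties the text describes, namely padding to a multiple of 64 and a fast, norm-preserving Walsh-Hadamard-based rotation.

```python
import math
import random

BLOCK = 64  # the text states the output dimension is a multiple of 64


def fwht(block):
    """Orthonormal Walsh-Hadamard transform, in place, on a power-of-two-length list."""
    n = len(block)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = block[i], block[i + h]
                block[i], block[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # scaling makes the transform norm-preserving
    for i in range(n):
        block[i] *= scale


def pseudorandom_rotate(vector, seed=0):
    """Hypothetical helper: pad, flip signs pseudorandomly, then transform each block."""
    padded_len = -(-len(vector) // BLOCK) * BLOCK  # round up to a multiple of 64
    rng = random.Random(seed)  # a fixed seed gives the same rotation for every vector
    signs = [rng.choice((-1.0, 1.0)) for _ in range(padded_len)]
    padded = list(vector) + [0.0] * (padded_len - len(vector))
    flipped = [s * x for s, x in zip(signs, padded)]
    rotated = []
    for start in range(0, padded_len, BLOCK):
        block = flipped[start:start + BLOCK]
        fwht(block)
        rotated.extend(block)
    return rotated
```

Because both the sign flips and the scaled transform are orthogonal, the rotated vector has the same length (norm) as the input, so distances between vectors are unchanged by this step.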
@@ -145,6 +137,16 @@ This means that the feature is still under development and may change in future
145
137
146
138
### 1-bit RQ
147
139
140
+
:::caution Preview
141
+
142
+
**1-bit Rotational quantization (RQ)** for HNSW indexes was added in **`v1.33`** as a **preview**.<br/>
143
+
**1-bit Rotational quantization (RQ)** for flat indexes was added in **`v1.34`** as a **preview**.<br/>
144
+
145
+
This means that the feature is still under development and may change in future releases, including potential breaking changes.
146
+
**We do not recommend using this feature in production environments at this time.**
147
+
148
+
:::
149
+
148
150
1-bit RQ is an asymmetric quantization method that provides close to 32x compression as dimensionality increases. **1-bit RQ serves as a more robust and accurate alternative to BQ** with only a slight performance trade-off (approximately 10% decrease in throughput in internal testing compared to BQ). While more performant than PQ in terms of encoding time and distance calculations, 1-bit RQ typically offers slightly lower recall than well-tuned PQ.
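The phrase "close to 32x compression as dimensionality increases" can be made concrete with a rough back-of-the-envelope sketch. The per-vector overhead of 16 bytes below is a hypothetical placeholder, not a documented Weaviate value, and the padding to a multiple of 64 is assumed to apply to both variants. The point is only that a fixed per-vector cost matters less as vectors get longer.

```python
def compression_ratio(dimensions, bits_per_dim, overhead_bytes=16):
    """Approximate float32-to-quantized size ratio for one vector.

    overhead_bytes is a hypothetical per-vector metadata cost used for illustration.
    """
    uncompressed = 4 * dimensions              # float32 uses 4 bytes per dimension
    padded = -(-dimensions // 64) * 64         # assumed padding to a multiple of 64
    compressed = padded * bits_per_dim / 8 + overhead_bytes
    return uncompressed / compressed


for d in (256, 768, 1536, 4096):
    one_bit = compression_ratio(d, bits_per_dim=1)
    eight_bit = compression_ratio(d, bits_per_dim=8)
    print(f"d={d}: 1-bit ~{one_bit:.1f}x, 8-bit ~{eight_bit:.2f}x")
```

Under these assumptions the 1-bit ratio climbs toward 32x and the 8-bit ratio toward 4x as the dimension grows, without ever reaching them.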
149
151
150
152
The method works as follows:
@@ -203,7 +205,7 @@ You might be also interested in our blog post [HNSW+PQ - Exploring ANN algorithm
203
205
204
206
### With a flat index
205
207
206
-
[BQ](#binary-quantization) can use a [flat index](./indexing/inverted-index.md). A flat index search reads from disk, compression reduces the amount of data Weaviate has to read so searches are faster.
208
+
[RQ](#rotational-quantization) and [BQ](#binary-quantization) can use a [flat index](./indexing/inverted-index.md). Because a flat index search reads from disk, compression reduces the amount of data Weaviate has to read, which makes searches faster.
docs/weaviate/config-refs/indexing/vector-index.mdx
Lines changed: 2 additions & 2 deletions
@@ -56,9 +56,9 @@ Some HNSW parameters are mutable, but others cannot be modified after you create
56
56
|`flatSearchCutoff`| integer | Optional. Threshold for the [flat-search cutoff](/weaviate/concepts/filtering.md#flat-search-cutoff). To force a vector index search, set `"flatSearchCutoff": 0`. | 40000 | Yes |
57
57
|`skip`| boolean | When true, do not index the collection. <br/><br/> Weaviate decouples vector creation and vector storage. If you skip vector indexing, but a vectorizer is configured (or a vector is provided manually), Weaviate logs a warning on each import. <br/><br/> To skip indexing and vector generation, set `"vectorizer": "none"` when you set `"skip": true`. <br/><br/> See [When to skip indexing](../../concepts/indexing/vector-index.md#when-to-skip-indexing). |`false`| No |
58
58
|`vectorCacheMaxObjects`| integer | Maximum number of objects in the memory cache. By default, this limit is set to one trillion (`1e12`) objects when a new collection is created. For sizing recommendations, see [Vector cache considerations](../../concepts/indexing/vector-index.md#vector-cache-considerations). |`1e12`| Yes |
59
-
|`rq`| object | Enable and configure [rotational quantization (RQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For RQ configuration details, see [RQ configuration parameters](#pq-parameters). | -- | Yes |
59
+
|`rq`| object | Enable and configure [rotational quantization (RQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For RQ configuration details, see [RQ configuration parameters](#rq-parameters). | -- | Yes |
60
60
|`pq`| object | Enable and configure [product quantization (PQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> PQ assumes some data has already been loaded. You should have 10,000 to 100,000 vectors per shard loaded before you enable PQ. <br/><br/> For PQ configuration details, see [PQ configuration parameters](#pq-parameters). | -- | Yes |
61
-
|`bq`| object | Enable and configure [binery quantization (BQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For BQ configuration details, see [BQ configuration parameters](#bq-parameters). | -- | Yes |
61
+
|`bq`| object | Enable and configure [binary quantization (BQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For BQ configuration details, see [BQ configuration parameters](#bq-parameters). | -- | Yes |
62
62
|`sq`| object | Enable and configure [scalar quantization (SQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For SQ configuration details, see [SQ configuration parameters](#sq-parameters). | -- | Yes |
The [dynamic index](/weaviate/config-refs/indexing/vector-index.mdx#dynamic-index) is new in v1.25. This type of index is a [flat index](/weaviate/config-refs/indexing/vector-index.mdx#flat-index) until a collection reaches a threshold size. When the collection grows larger than the threshold size (10,000 objects by default), the collection is automatically reindexed and converted to an HNSW index.
@@ -130,12 +130,10 @@ Most applications benefit from compression. The cost savings are significant. In
130
130
131
131
- For most users with HNSW indexes who want the best combination of simplicity, performance, and recall, **consider 8-bit RQ compression**. RQ provides 4x compression with 98-99% recall and requires no configuration or training. It's ideal for standard use cases with embeddings from providers like OpenAI.
132
132
133
-
- If you have a small collection that uses a flat index, consider a BQ index. The BQ index is 32 times smaller and much faster than the uncompressed equivalent.
133
+
- If you have a small collection that uses a flat index, consider RQ compression. The flat index with RQ enabled is smaller and much faster than the uncompressed equivalent.
134
134
135
135
- If you have a very large data set or specialized search needs, consider PQ compression. PQ compression is very configurable, but it requires more expertise to tune well than SQ, RQ, or BQ.
136
136
137
-
For collections that are small, but that are expected to grow, consider a dynamic index. In addition to setting the dynamic index type, configure the collection to use BQ compression while the index is flat and RQ compression when the collection grows large enough to move from a flat index to an HNSW index.
138
-
139
137
## Further resources
140
138
141
139
To enable compression, follow the steps on these pages: