Commit a80e713

Merge pull request #224 from weaviate/v1-34/flat-index-rq
Update docs
2 parents bd0c244 + a2bdb96 commit a80e713

5 files changed: +25 -26 lines changed

Lines changed: 0 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -1,4 +1 @@
-:::info Added in `v1.24`
-:::
-
 Collections can have multiple [named vectors](/weaviate/config-refs/collections#named-vectors). The vectors in a collection can have their own configurations, and compression must be enabled independently for each vector. Every vector is independent and can use [PQ](/weaviate/configuration/compression/pq-compression), [BQ](/weaviate/configuration/compression/bq-compression), [RQ](/weaviate/configuration/compression/rq-compression), [SQ](/weaviate/configuration/compression/sq-compression), or no compression.

docs/weaviate/concepts/vector-quantization.md

Lines changed: 16 additions & 14 deletions
@@ -118,25 +118,17 @@ When SQ is enabled, Weaviate boosts recall by over-fetching compressed results.

 ## Rotational quantization

-:::info Added in `v1.32`
-
-**8-bit Rotational quantization (RQ)** was added in **`v1.32`**.
-
-:::
+**Rotational quantization (RQ)** is a quantization technique that provides significant compression while maintaining high recall in internal testing. Unlike SQ, RQ requires no training phase and can be enabled immediately at index creation. RQ is available in two variants: **8-bit RQ** and **1-bit RQ**.

-:::caution Preview
+### 8-bit RQ

-**1-bit Rotational quantization (RQ)** was added in **`v1.33`** as a **preview**.<br/>
+:::info Added in `v1.32` and `v1.34`

-This means that the feature is still under development and may change in future releases, including potential breaking changes.
-**We do not recommend using this feature in production environments at this time.**
+**8-bit Rotational quantization (RQ)** for HNSW indexes was added in **`v1.32`**.<br/>
+**8-bit Rotational quantization (RQ)** for flat indexes was added in **`v1.34`** as a **preview**.<br/>

 :::

-**Rotational quantization (RQ)** is a quantization technique that provides significant compression while maintaining high recall in internal testing. Unlike SQ, RQ requires no training phase and can be enabled immediately at index creation. RQ is available in two variants: **8-bit RQ** and **1-bit RQ**.
-
-### 8-bit RQ
-
 8-bit RQ provides 4x compression while maintaining 98-99% recall in internal testing. The method works as follows:

 1. **Fast pseudorandom rotation**: The input vector is transformed using a fast rotation based on the Walsh Hadamard Transform. This rotation takes approximately 7-10 microseconds for a 1536-dimensional vector. The output dimension is rounded up to the nearest multiple of 64.
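
Editor's illustration (not part of the commit): the rotation step quoted in the hunk above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Weaviate's implementation. The diff only says the rotation is based on the Walsh Hadamard Transform and that the output dimension is rounded up to a multiple of 64; the per-block transform, the random sign flips, the seed handling, and the names `fwht` and `pseudorandom_rotate` are all hypothetical choices made here for illustration.

```python
import math
import random

BLOCK = 64  # taken from the "multiple of 64" rounding mentioned in the text


def fwht(block):
    """Normalized Walsh-Hadamard transform of a list whose length is a power of two."""
    out = list(block)
    n = len(out)
    h = 1
    while h < n:
        # Butterfly step: combine pairs that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # makes the transform orthogonal (norm preserving)
    return [v * scale for v in out]


def pseudorandom_rotate(vector, seed=0):
    """Assumed scheme: zero-pad to a multiple of BLOCK, flip signs, transform each block."""
    padded_dim = math.ceil(len(vector) / BLOCK) * BLOCK
    padded = list(vector) + [0.0] * (padded_dim - len(vector))
    rng = random.Random(seed)  # fixed seed so every vector gets the same sign pattern
    signed = [v if rng.random() < 0.5 else -v for v in padded]
    rotated = []
    for start in range(0, padded_dim, BLOCK):
        rotated.extend(fwht(signed[start:start + BLOCK]))
    return rotated


vec = [math.sin(i) for i in range(100)]  # a 100-dimensional example vector
rot = pseudorandom_rotate(vec)
print(len(vec), "->", len(rot))
```

Because sign flips and the normalized transform are both orthogonal, the sketch preserves vector length, which is the property a rotation needs before quantization. A block-wise transform mixes coordinates only within each 64-wide block, so it is weaker than whatever the real implementation does.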
@@ -145,6 +137,16 @@ This means that the feature is still under development and may change in future

 ### 1-bit RQ

+:::caution Preview
+
+**1-bit Rotational quantization (RQ)** for HNSW indexes was added in **`v1.33`** as a **preview**.<br/>
+**1-bit Rotational quantization (RQ)** for flat indexes was added in **`v1.34`** as a **preview**.<br/>
+
+This means that the feature is still under development and may change in future releases, including potential breaking changes.
+**We do not recommend using this feature in production environments at this time.**
+
+:::
+
 1-bit RQ is an asymmetric quantization method that provides close to 32x compression as dimensionality increases. **1-bit RQ serves as a more robust and accurate alternative to BQ** with only a slight performance trade-off (approximately 10% decrease in throughput in internal testing compared to BQ). While more performant than PQ in terms of encoding time and distance calculations, 1-bit RQ typically offers slightly lower recall than well-tuned PQ.

 The method works as follows:
@@ -203,7 +205,7 @@ You might be also interested in our blog post [HNSW+PQ - Exploring ANN algorithm

 ### With a flat index

-[BQ](#binary-quantization) can use a [flat index](./indexing/inverted-index.md). A flat index search reads from disk, compression reduces the amount of data Weaviate has to read so searches are faster.
+[RQ](#rotational-quantization) and [BQ](#binary-quantization) can use a [flat index](./indexing/inverted-index.md). A flat index search reads from disk, compression reduces the amount of data Weaviate has to read so searches are faster.

 ## Rescoring
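
Editor's illustration (not part of the commit): the compression figures quoted in this file (4x for 8-bit RQ, "close to 32x ... as dimensionality increases" for 1-bit RQ) follow from simple arithmetic on float32 vectors. The sketch below is one way to see why the 1-bit ratio only approaches 32x. It assumes each compressed vector carries a small fixed overhead and is padded to a multiple of 64 dimensions; the 16-byte overhead is a made-up placeholder, not a value from the Weaviate docs, and the function name is hypothetical.

```python
import math

FLOAT32_BYTES = 4
ASSUMED_OVERHEAD_BYTES = 16  # hypothetical per-vector metadata; not from the source


def rq_compression_ratio(dim, bits, overhead=ASSUMED_OVERHEAD_BYTES):
    """Ratio of uncompressed float32 size to the assumed compressed size."""
    padded_dim = math.ceil(dim / 64) * 64  # rounding described for the rotation output
    compressed_bytes = padded_dim * bits / 8 + overhead
    return dim * FLOAT32_BYTES / compressed_bytes


for dim in (128, 768, 1536, 4096):
    ratio_8bit = rq_compression_ratio(dim, bits=8)
    ratio_1bit = rq_compression_ratio(dim, bits=1)
    print(f"dim={dim}: 8-bit ~{ratio_8bit:.2f}x, 1-bit ~{ratio_1bit:.2f}x")
```

Under these assumptions the fixed overhead matters less as the dimension grows, so the 1-bit ratio climbs toward 32x and the 8-bit ratio toward 4x. The same arithmetic explains the flat-index change in this hunk: a flat search scans every stored vector, so fewer bytes per vector means less data to read.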

docs/weaviate/config-refs/indexing/vector-index.mdx

Lines changed: 2 additions & 2 deletions
@@ -56,9 +56,9 @@ Some HNSW parameters are mutable, but others cannot be modified after you create
 | `flatSearchCutoff` | integer | Optional. Threshold for the [flat-search cutoff](/weaviate/concepts/filtering.md#flat-search-cutoff). To force a vector index search, set `"flatSearchCutoff": 0`. | 40000 | Yes |
 | `skip` | boolean | When true, do not index the collection. <br/><br/> Weaviate decouples vector creation and vector storage. If you skip vector indexing, but a vectorizer is configured (or a vector is provided manually), Weaviate logs a warning each import. <br/><br/> To skip indexing and vector generation, set `"vectorizer": "none"` when you set `"skip": true`. <br/><br/> See [When to skip indexing](../../concepts/indexing/vector-index.md#when-to-skip-indexing). | `false` | No |
 | `vectorCacheMaxObjects` | integer | Maximum number of objects in the memory cache. By default, this limit is set to one trillion (`1e12`) objects when a new collection is created. For sizing recommendations, see [Vector cache considerations](../../concepts/indexing/vector-index.md#vector-cache-considerations). | `1e12` | Yes |
-| `rq` | object | Enable and configure [rotational quantization (RQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For RQ configuration details, see [RQ configuration parameters](#pq-parameters). | -- | Yes |
+| `rq` | object | Enable and configure [rotational quantization (RQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For RQ configuration details, see [RQ configuration parameters](#rq-parameters). | -- | Yes |
 | `pq` | object | Enable and configure [product quantization (PQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> PQ assumes some data has already been loaded. You should have 10,000 to 100,000 vectors per shard loaded before you enable PQ. <br/><br/> For PQ configuration details, see [PQ configuration parameters](#pq-parameters). | -- | Yes |
-| `bq` | object | Enable and configure [binery quantization (BQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For BQ configuration details, see [BQ configuration parameters](#bq-parameters). | -- | Yes |
+| `bq` | object | Enable and configure [binary quantization (BQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For BQ configuration details, see [BQ configuration parameters](#bq-parameters). | -- | Yes |
 | `sq` | object | Enable and configure [product quantization (SQ)](/weaviate/concepts/indexing/vector-index.md) compression. <br/><br/> For SQ configuration details, see [SQ configuration parameters](#sq-parameters). | -- | Yes |

 ### Database parameters for HNSW

docs/weaviate/configuration/compression/rq-compression.md

Lines changed: 5 additions & 3 deletions
@@ -29,9 +29,10 @@ RQ is currently not supported for the flat index type.

 ## 8-bit RQ

-:::info Added in `v1.32`
+:::info Added in `v1.32` and `v1.34`

-**8-bit Rotational quantization (RQ)** was added in **`v1.32`**.
+**8-bit Rotational quantization (RQ)** for HNSW indexes was added in **`v1.32`**.<br/>
+**8-bit Rotational quantization (RQ)** for flat indexes was added in **`v1.34`** as a **preview**.<br/>

 :::

@@ -119,7 +120,8 @@ RQ can also be enabled for an existing collection by updating the collection def

 :::caution Preview

-**1-bit Rotational quantization (RQ)** was added in **`v1.33`** as a **preview**.<br/>
+**1-bit Rotational quantization (RQ)** for HNSW indexes was added in **`v1.33`** as a **preview**.<br/>
+**1-bit Rotational quantization (RQ)** for flat indexes was added in **`v1.34`** as a **preview**.<br/>

 This means that the feature is still under development and may change in future releases, including potential breaking changes.
 **We do not recommend using this feature in production environments at this time.**
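
Editor's illustration (not part of the commit): with this change, RQ is documented as usable on flat indexes as a preview. The sketch below builds a collection definition as a plain Python dictionary to show roughly where such a setting might live. It is an assumption-laden sketch, not the documented API: the diff only confirms that `rq` is an object in the vector index configuration, so the `enabled` and `bits` keys, the `vectorIndexType` and `vectorIndexConfig` field names, the collection name `Article`, and the helper `flat_rq_collection` are assumptions or hypothetical names to check against the RQ configuration page.

```python
import json


def flat_rq_collection(name, bits=8):
    """Build an assumed collection definition that enables RQ on a flat index."""
    if bits not in (1, 8):
        # The text describes only two variants: 8-bit RQ and 1-bit RQ.
        raise ValueError("bits must be 8 or 1")
    return {
        "class": name,                 # hypothetical collection name
        "vectorIndexType": "flat",     # assumed field name for the index type
        "vectorIndexConfig": {
            "rq": {"enabled": True, "bits": bits},  # assumed keys inside `rq`
        },
    }


definition = flat_rq_collection("Article", bits=8)
print(json.dumps(definition, indent=2))
```

The dictionary is only serialized and printed here; sending it to a Weaviate instance is deliberately left out, since the exact client calls are not shown in this diff.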

docs/weaviate/starter-guides/managing-resources/compression.mdx

Lines changed: 2 additions & 4 deletions
@@ -41,7 +41,7 @@ This table shows the compression algorithms that are available for each index ty
 | :--------------- | :--------- | :--------- | :------------ |
 | PQ | Yes | No | Yes |
 | SQ | Yes | No | Yes |
-| RQ | Yes | No | Yes |
+| RQ | Yes | Yes | Yes |
 | BQ | Yes | Yes | Yes |

 The [dynamic index](/weaviate/config-refs/indexing/vector-index.mdx#dynamic-index) is new in v1.25. This type of index is a [flat index](/weaviate/config-refs/indexing/vector-index.mdx#flat-index) until a collection reaches a threshold size. When the collection grows larger than the threshold size, the default is 10,000 objects, the collection is automatically reindexed and converted to an HNSW index.
@@ -130,12 +130,10 @@ Most applications benefit from compression. The cost savings are significant. In

 - For most users with HNSW indexes who want the best combination of simplicity, performance, and recall, **consider 8-bit RQ compression**. RQ provides 4x compression with 98-99% recall and requires no configuration or training. It's ideal for standard use cases with embeddings from providers like OpenAI.

-- If you have a small collection that uses a flat index, consider a BQ index. The BQ index is 32 times smaller and much faster than the uncompressed equivalent.
+- If you have a small collection that uses a flat index, consider RQ compression. The flat index with RQ enabled is smaller and much faster than the uncompressed equivalent.

 - If you have a very large data set or specialized search needs, consider PQ compression. PQ compression is very configurable, but it requires more expertise to tune well than SQ, RQ, or BQ.

-For collections that are small, but that are expected to grow, consider a dynamic index. In addition to setting the dynamic index type, configure the collection to use BQ compression while the index is flat and RQ compression when the collection grows large enough to move from a flat index to an HNSW index.
-
 ## Further resources

 To enable compression, follow the steps on these pages:
