Commit
Add vector search to the experimental page (#1051)
Showing 5 changed files with 236 additions and 18 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,139 @@ | ||
--- | ||
title: Vector search | ||
description: Learn how to use vector search and manage vector indices in Memgraph. | ||
--- | ||
|
||
import { Callout } from 'nextra/components' | ||
|
||
# Vector search | ||
|
||
Vector search, also known as vector similarity search or nearest neighbor search, is a technique used to find the most similar items in a collection of data based on their vector representations. | ||
Vector search is currently an [experimental feature](/database-management/experimental-features). | ||
Memgraph implements a READ_UNCOMMITTED isolation level specifically for vector indices. While the main database can operate at any isolation level, the vector index specifically operates at READ_UNCOMMITTED. | ||
This design maintains all transactional guarantees at the database level - only the vector index operations use this relaxed isolation level, ensuring the database's ACID properties remain intact for all other operations. | ||
|
||
## Run Memgraph with the vector search feature | ||
|
||
To try out the vector search feature, you need to **enable** it and **configure** it. | ||
To enable the feature, set the `--experimental-enabled` flag to `vector-search`. To | ||
configure the feature, set the `--experimental-config` flag. | ||
|
||
<Callout type="info"> | ||
Changing the configuration settings depends on the way you are using Memgraph, | ||
so please refer to the [configuration | ||
docs](/database-management/configuration#changing-configuration) for more | ||
information. | ||
</Callout> | ||
|
||
Here is an example configuration: | ||
|
||
```shell | ||
--experimental-config=' | ||
{ | ||
  "vector-search": { | ||
    "index_name": { | ||
      "label": "Node", | ||
      "property": "vector", | ||
      "dimension": 2, | ||
      "capacity": 1000, | ||
      "metric": "cos", | ||
      "resize_coefficient": 2 | ||
    } | ||
  } | ||
} | ||
' | ||
``` | ||
|
||
Keep in mind that both the `--experimental-enabled` and `--experimental-config` flags are | ||
required, and that the following configuration fields are mandatory: | ||
`label`, `property`, `dimension` and `capacity`. | ||
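
As a minimal sketch, the two flags can be assembled programmatically so that the JSON is always well-formed and the mandatory fields are checked before Memgraph starts. The helper name `build_vector_search_flags` is hypothetical, and the `--flag=value` form is an assumption about how the flags are passed; only the flag names and field names come from the text above.

```python
import json
import shlex

# Mandatory configuration fields, as listed in the documentation above.
MANDATORY_FIELDS = ("label", "property", "dimension", "capacity")


def build_vector_search_flags(indices):
    """Return the two experimental flags for the given index specs.

    `indices` maps an index name to its configuration dictionary.
    This is a hypothetical helper, not part of Memgraph.
    """
    for name, spec in indices.items():
        missing = [field for field in MANDATORY_FIELDS if field not in spec]
        if missing:
            raise ValueError(f"index {name!r} is missing fields: {missing}")
    # json.dumps never emits trailing commas, so the result is valid JSON.
    config = json.dumps({"vector-search": indices})
    return [
        "--experimental-enabled=vector-search",
        f"--experimental-config={config}",
    ]


flags = build_vector_search_flags({
    "index_name": {
        "label": "Node",
        "property": "vector",
        "dimension": 2,
        "capacity": 1000,
        "metric": "cos",
        "resize_coefficient": 2,
    }
})
# Quote each flag so it can be pasted into a shell command line.
print(" ".join(shlex.quote(flag) for flag in flags))
```

Generating the JSON instead of writing it by hand avoids syntax slips such as trailing commas, which strict JSON parsers reject.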
|
||
{<h3> Input: </h3>} | ||
|
||
- `index_name: string` ➡ The name of the vector index, used as the key of its configuration object. | ||
- `label: string` ➡ The label of the nodes covered by the vector index. | ||
- `property: string` ➡ The node property that holds the indexed vector. | ||
- `dimension: int` ➡ The dimension of vectors in the index. | ||
- `capacity: int` ➡ The capacity of the vector index. | ||
- `metric: string` ➡ The metric used for the vector search. The default value is `l2sq`. | ||
- `resize_coefficient: int` ➡ When the index gets full, the capacity is multiplied by the resize coefficient to determine the new capacity, if possible. | ||
If the index cannot be resized due to insufficient memory, an exception will be thrown. The default value is `2`. | ||
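
The resize rule is plain multiplication. The following sketch only illustrates the arithmetic described above; `next_capacity` is a hypothetical name, and the real resize can additionally fail when memory is insufficient.

```python
def next_capacity(capacity: int, resize_coefficient: int = 2) -> int:
    """Capacity after one resize: the old capacity times the coefficient."""
    return capacity * resize_coefficient


capacity = 1000
for _ in range(3):
    capacity = next_capacity(capacity)
    print(capacity)  # 2000, then 4000, then 8000
```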
|
||
|
||
## Usage | ||
|
||
Currently, vector indices are used through the `vector_search` query module. | ||
|
||
<Callout type="info"> | ||
|
||
Unlike other index types, vector indices are not currently utilized by the query planner. | ||
|
||
</Callout> | ||
|
||
### Show vector indices | ||
|
||
You can retrieve information about vector indices using the `vector_search.show_index_info()` procedure. | ||
|
||
{<h3> Output: </h3>} | ||
|
||
- `index_name: string` ➡ The name of the vector index. | ||
- `label: string` ➡ The label of the nodes covered by the vector index. | ||
- `property: string` ➡ The node property that holds the indexed vector. | ||
- `dimension: int` ➡ The dimension of vectors in the index. | ||
- `capacity: int` ➡ The capacity of the vector index. | ||
- `size: int` ➡ The number of entries in the vector index. | ||
|
||
{<h3> Usage: </h3>} | ||
|
||
```cypher | ||
CALL vector_search.show_index_info() YIELD *; | ||
``` | ||
|
||
### Query vector index | ||
|
||
To search for similar vectors within a vector index, use the `vector_search.search()` procedure. This procedure finds the vectors closest to a query vector according to the similarity metric configured for the index. | ||
|
||
{<h3> Input: </h3>} | ||
|
||
- `index_name: string` ➡ The vector index to search. | ||
- `limit: int` ➡ The number of nearest neighbors to return. | ||
- `search_query: List[float]` ➡ The vector to query in the index. | ||
|
||
{<h3> Output: </h3>} | ||
|
||
- `node: Vertex` ➡ A node in the vector index matching the given query. | ||
- `distance: double` ➡ The distance from the node to the query. | ||
- `similarity: double` ➡ The similarity of the node and the query. | ||
|
||
{<h3> Usage: </h3>} | ||
|
||
```cypher | ||
CALL vector_search.search("index_name", 1, [2.0, 2.0]) YIELD *; | ||
``` | ||
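
To make the meaning of `limit` and `distance` concrete, here is a conceptual brute-force sketch of nearest neighbor search using squared Euclidean distance. It is not how Memgraph implements the index; the function names and sample data are made up for illustration.

```python
def l2sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(vectors, limit, search_query):
    """Return the `limit` closest (name, distance) pairs, nearest first."""
    scored = [(name, l2sq(vec, search_query)) for name, vec in vectors.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:limit]


# Hypothetical data: three nodes with 2-dimensional vectors.
vectors = {"a": [1.0, 1.0], "b": [2.0, 2.5], "c": [5.0, 5.0]}
print(brute_force_search(vectors, 1, [2.0, 2.0]))  # [('b', 0.25)]
```

A brute-force scan compares the query against every stored vector, which is the cost a dedicated vector index is meant to avoid.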
|
||
### Similarity metrics | ||
|
||
The following table lists the supported similarity metrics for vector search. These | ||
metrics determine how similarities between vectors are calculated. The default | ||
metric is `l2sq`. | ||
|
||
| Metric | Description | | ||
|-------------|------------------------------------------------------| | ||
| `ip` | Inner product (dot product). | | ||
| `cos` | Cosine similarity. | | ||
| `l2sq` | Squared Euclidean distance. | | ||
| `pearson` | Pearson correlation coefficient. | | ||
| `haversine` | Haversine distance (suitable for geographic data). | | ||
| `divergence`| A divergence-based metric. | | ||
| `hamming` | Hamming distance. | | ||
| `tanimoto` | Tanimoto coefficient. | | ||
| `sorensen` | Sørensen-Dice coefficient. | | ||
| `jaccard` | Jaccard index. | | ||
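
For reference, the first three metrics in the table can be written out using their usual textbook definitions. This is an illustrative sketch only; how Memgraph converts each metric into the `distance` and `similarity` values it returns is not specified here.

```python
import math


def inner_product(a, b):
    """`ip`: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """`cos`: inner product of the vectors divided by the product of their norms."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


def squared_euclidean(a, b):
    """`l2sq`: sum of squared element-wise differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


a, b = [1.0, 0.0], [0.0, 1.0]
print(inner_product(a, b))      # 0.0 (orthogonal vectors)
print(cosine_similarity(a, b))  # 0.0
print(squared_euclidean(a, b))  # 2.0
```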
|
||
### Scalar type | ||
|
||
Properties are stored as 64-bit values in the property store and as 32-bit values in the vector index. | ||
The scalar type defines the data type of each vector element. The default scalar | ||
type is `f32`. |
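
The practical effect of the 64-bit to 32-bit conversion is a small rounding of each element. The sketch below uses Python's `struct` module to round-trip a value through a 32-bit float; it illustrates the general precision loss and makes no claim about Memgraph's internal conversion code.

```python
import struct


def to_f32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


stored = 0.1              # 64-bit value, as in the property store
indexed = to_f32(stored)  # 32-bit value, as in the vector index
print(stored == indexed)          # False: 0.1 is not exactly representable
print(to_f32(0.5) == 0.5)         # True: 0.5 is exact in both widths
print(abs(stored - indexed))      # a small rounding error
```

Distances computed from the index can therefore differ slightly from distances computed on the stored 64-bit property values.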