@@ -16,13 +16,13 @@ You can calculate vector embeddings using [ArangoDB's GraphML](../../data-scienc
1616capabilities (available in ArangoGraph) or using external tools.
1717
1818{{< warning >}}
19- The vector index is an experimental feature that you need to enable  for the
20- ArangoDB server with the ` --experimental- vector-index `  startup option.
19+ You need to enable the vector index feature  for the
20+ ArangoDB server with the ` --vector-index `  startup option.
2121Once enabled for a deployment, it cannot be disabled anymore because it
2222permanently changes how the data is managed by the RocksDB storage engine
2323(it adds an additional column family).
2424
25- To restore a dump that contains vector indexes, the ` --experimental- vector-index ` 
25+ To restore a dump that contains vector indexes, the ` --vector-index ` 
2626startup option needs to be enabled on the deployment you want to restore to.
2727{{< /warning >}}
2828
@@ -56,21 +56,37 @@ be found depends on the data as well as the search effort (see the `nProbe` opti
5656{{< info >}}
5757-  If there is more than one suitable vector index over the same attribute, it is
5858  undefined which one is selected.
59- -  You cannot have any ` FILTER `  operation between ` FOR `  and ` LIMIT `  for
60-   pre-filtering.
59+ 
60+ -  In v3.12.4 and v3.12.5, you cannot have any ` FILTER `  operation between ` FOR ` 
61+   and ` LIMIT `  for pre-filtering. From v3.12.6 onward, you can add ` FILTER ` 
62+   operations between ` FOR `  and ` SORT `  that are then applied during the lookup in
63+   the vector index. Example:
64+ 
65+   ``` aql 
66+   FOR doc IN coll 
67+     FILTER doc.val > 3 
68+     SORT APPROX_NEAR_COSINE(doc.vector, @q) DESC 
69+     LIMIT 5 
70+     RETURN doc 
71+   ```
72+ 
73+   Note that `LIMIT 5`, for example, does not guarantee 5 results. The search
74+   does not expand to as many neighboring Voronoi cells as necessary. It only
75+   considers as many cells as configured via the `nProbe` option.
6176{{< /info >}}
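
The note about `LIMIT` and `nProbe` can be illustrated with a toy model of an inverted-file lookup. This is only a sketch under stated assumptions, not ArangoDB's implementation: the `cells` layout and the `ivf_search()` helper are hypothetical, and the Euclidean distance stands in for whichever metric the index uses.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ivf_search(cells, query, n_probe, limit):
    """Toy lookup: scan only the n_probe cells whose centroids are
    closest to the query, then return the best `limit` candidates."""
    nearest_cells = sorted(cells, key=lambda c: l2(c["centroid"], query))[:n_probe]
    candidates = [v for cell in nearest_cells for v in cell["vectors"]]
    return sorted(candidates, key=lambda v: l2(v, query))[:limit]

# Hypothetical index with three Voronoi cells of unequal size.
cells = [
    {"centroid": (0.0, 0.0), "vectors": [(0.1, 0.0), (0.0, 0.2)]},
    {"centroid": (5.0, 5.0), "vectors": [(5.0, 5.1), (4.9, 5.0), (5.2, 5.2)]},
    {"centroid": (9.0, 0.0), "vectors": [(9.0, 0.1)]},
]
query = (0.0, 0.1)

few = ivf_search(cells, query, n_probe=1, limit=5)   # only the closest cell
more = ivf_search(cells, query, n_probe=3, limit=5)  # all three cells
print(len(few), len(more))  # 2 5
```

With a single probed cell, only two candidates exist, so a limit of 5 yields two results. Probing more cells trades speed for more complete results.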
6277
6378### APPROX_NEAR_COSINE()  
6479
6580` APPROX_NEAR_COSINE(vector1, vector2, options) → similarity ` 
6681
67- Retrieve the approximate angular similarity using the cosine metric, accelerated
68- by a matching vector index.
6982
70- The higher the cosine similarity value is, the more similar the two vectors
71- are. The closer it is to 0, the more different they are. The value can also
72- be negative, indicating that the vectors are not similar and point in opposite
73- directions. You need to sort in descending order so that the most similar
83+ Retrieve the approximate cosine of the angle between two vectors, accelerated
84+ by a matching vector index with the ` cosine `  metric.
85+ 
86+ The closer the similarity value is to 1, the more similar the two vectors
87+ are. The closer it is to 0, the more different they are. The value can also be
88+ negative, down to -1, indicating that the vectors are not similar and point in opposite
89+ directions. You need to **sort in descending order** so that the most similar
7490documents come first, which is what a vector index using the ` cosine `  metric
7591can provide.
7692
@@ -83,8 +99,8 @@ can provide.
8399    closest Voronoi cells to consider for the search results. The larger the number,
84100    the slower the search but the better the search results. If not specified, the
85101    ` defaultNProbe `  value of the vector index is used.
86- -  returns ** similarity**  (number): The approximate angular  similarity between 
87-   both vectors.
102+ -  returns ** similarity**  (number): The approximate cosine  similarity of 
103+   both normalized  vectors. The value range is  ` [-1, 1] ` .
88104
89105** Examples** 
90106
@@ -126,15 +142,83 @@ FOR docOuter IN coll
126142  RETURN { key: docOuter._key, neighbors } 
127143``` 
128144
145+ ### APPROX_NEAR_INNER_PRODUCT()  
146+ 
147+ <small >Introduced in: v3.12.6</small >
148+ 
149+ ` APPROX_NEAR_INNER_PRODUCT(vector1, vector2, options) → similarity ` 
150+ 
151+ Retrieve the approximate dot product of two vectors, accelerated by a matching
152+ vector index with the ` innerProduct `  metric.
153+ 
154+ The higher the similarity value is, the more similar the two vectors
155+ are. The closer it is to 0, the more different they are. The value can also
156+ be negative, indicating that the vectors are not similar and point in opposite
157+ directions. You need to **sort in descending order** so that the most similar
158+ documents come first, which is what a vector index using the ` innerProduct ` 
159+ metric can provide.
160+ 
161+ -  ** vector1**  (array of numbers): The first vector. Either this parameter or
162+   ` vector2 `  needs to reference a stored attribute holding the vector embedding.
163+ -  ** vector2**  (array of numbers): The second vector. Either this parameter or
164+   ` vector1 `  needs to reference a stored attribute holding the vector embedding.
165+ -  ** options**  (object, _ optional_ ):
166+   -  ** nProbe**  (number, _ optional_ ): How many neighboring centroids respectively
167+     closest Voronoi cells to consider for the search results. The larger the number,
168+     the slower the search but the better the search results. If not specified, the
169+     ` defaultNProbe `  value of the vector index is used.
170+ -  returns ** similarity**  (number): The approximate dot product
171+   of both vectors without normalization. The value range is unbounded.
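
To make the difference between the normalized cosine similarity and the unnormalized dot product concrete, here is a minimal sketch in plain Python. The helper names `dot()` and `cosine()` and the sample vectors are assumptions for illustration. The values are computed exactly here, whereas the AQL functions return approximations.

```python
import math

def dot(a, b):
    """Inner (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine of the angle between a and b: the dot product of the
    normalized vectors, always within [-1, 1]."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

q = (1.0, 0.0)
short = (2.0, 0.0)       # same direction as q, small magnitude
long_ = (10.0, 0.0)      # same direction as q, large magnitude
opposite = (-3.0, 0.0)   # opposite direction

# Cosine ignores magnitude: both vectors are equally similar to q.
print(cosine(q, short), cosine(q, long_))  # 1.0 1.0
# The dot product grows with magnitude and is unbounded.
print(dot(q, short), dot(q, long_))        # 2.0 10.0
# Opposite directions give negative values for both measures.
print(cosine(q, opposite), dot(q, opposite))  # -1.0 -3.0
```

Because the dot product depends on vector length, a ranking by inner product can differ from a ranking by cosine similarity unless all vectors have the same length.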
172+ 
173+ ** Examples** 
174+ 
175+ Return up to ` 10 `  similar documents based on their closeness to the vector
176+ ` @q `  according to the inner product metric:
177+ 
178+ ``` aql 
179+ FOR doc IN coll 
180+   SORT APPROX_NEAR_INNER_PRODUCT(doc.vector, @q) DESC 
181+   LIMIT 10 
182+   RETURN doc 
183+ ``` 
184+ 
185+ Return up to ` 5 `  similar documents as well as the similarity value,
186+ considering ` 20 `  neighboring centroids respectively closest Voronoi cells:
187+ 
188+ ``` aql 
189+ FOR doc IN coll 
190+   LET similarity = APPROX_NEAR_INNER_PRODUCT(doc.vector, @q, { nProbe: 20 }) 
191+   SORT similarity DESC 
192+   LIMIT 5 
193+   RETURN MERGE( { similarity }, doc) 
194+ ``` 
195+ 
196+ Return the similarity value and the document keys of up to ` 3 `  similar documents
197+ for multiple input vectors using a subquery. In this example, the input vectors
198+ are taken from ten random documents of the same collection:
199+ 
200+ ``` aql 
201+ FOR docOuter IN coll 
202+   LIMIT 10 
203+   LET neighbors = ( 
204+     FOR docInner IN coll 
205+       LET similarity = APPROX_NEAR_INNER_PRODUCT(docInner.vector, docOuter.vector) 
206+       SORT similarity DESC 
207+       LIMIT 3 
208+       RETURN { key: docInner._key, similarity } 
209+   ) 
210+   RETURN { key: docOuter._key, neighbors } 
211+ ``` 
212+ 
129213### APPROX_NEAR_L2()  
130214
131- ` APPROX_NEAR_L2(vector1, vector2, options) → similarity  ` 
215+ ` APPROX_NEAR_L2(vector1, vector2, options) → distance  ` 
132216
133217Retrieve the approximate distance using the L2 (Euclidean) metric, accelerated
134- by a matching vector index.
218+ by a matching vector index with the  ` l2 `  metric .
135219
136220The closer the distance is to 0, the more similar the two vectors are. The higher
137- the value, the more different the they are. You need to sort in ascending order
221+ the value, the more different they are. You need to **sort in ascending order**
138222so that the most similar documents come first, which is what a vector index using
139223the ` l2 `  metric can provide.
140224
@@ -147,7 +231,7 @@ the `l2` metric can provide.
147231    for the search results. The larger the number, the slower the search but the
148232    better the search results. If not specified, the ` defaultNProbe `  value of
149233    the vector index is used.
150- -  returns ** similarity **  (number): The approximate L2 (Euclidean) distance between
234+ -  returns ** distance **  (number): The approximate L2 (Euclidean) distance between
151235  both vectors.
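
As a small illustration of why L2 results are sorted in ascending order, the following sketch computes exact Euclidean distances in plain Python. The `l2_distance()` helper and the sample data are hypothetical, and the index-accelerated function only approximates these values.

```python
import math

def l2_distance(a, b):
    """Exact L2 (Euclidean) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

q = (0.0, 0.0)
docs = {"a": (3.0, 4.0), "b": (0.0, 1.0), "c": (6.0, 8.0)}

# Smaller distance means more similar, so sort ascending.
ranked = sorted(docs, key=lambda key: l2_distance(docs[key], q))
print(ranked)  # ['b', 'a', 'c']
```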
152236
153237** Examples** 