[Cosmos] Adds support for non streaming ORDER BY #35468

Merged · 69 commits · May 15, 2024

Changes from 13 commits

Commits
991a274
sync changes and sample for vector search control plane
simorenoh Mar 21, 2024
3a3a652
Update index_management.py
simorenoh Mar 21, 2024
20f533c
Update index_management.py
simorenoh Mar 21, 2024
09f33b7
async and samples
simorenoh Mar 28, 2024
8e527fd
sync and async tests
simorenoh Mar 28, 2024
7c44137
Update CHANGELOG.md
simorenoh Mar 28, 2024
7eb5439
developed typehints
simorenoh Mar 28, 2024
c428476
skip tests
simorenoh Mar 29, 2024
58000fd
create_if_not_exists, README
simorenoh Apr 2, 2024
4c4b1ab
Update README.md
simorenoh Apr 2, 2024
0e6b24f
add provisional, add dimension limit
simorenoh Apr 3, 2024
b42f3cb
Merge branch 'main' into vector-search-query
simorenoh Apr 16, 2024
fef391d
adds sync changes, adds changelog
simorenoh May 3, 2024
8583dbf
async changes
simorenoh May 3, 2024
158f60f
some comments addressed
simorenoh May 3, 2024
c880436
Update CHANGELOG.md
simorenoh May 3, 2024
a414f05
bug fix on ordering
simorenoh May 8, 2024
d217210
ordering bug fix
simorenoh May 8, 2024
8869ea4
fix datetime
simorenoh May 8, 2024
0c6d8eb
samples added
simorenoh May 8, 2024
30b0645
small fixes
simorenoh May 9, 2024
5056d89
fix some additional PQ logic
simorenoh May 9, 2024
358deae
last bit of pq fixes
simorenoh May 9, 2024
617c709
Update non_streaming_order_by_aggregator.py
simorenoh May 9, 2024
73e3709
memory optimization
simorenoh May 10, 2024
6bb8090
Update sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/aio/do…
simorenoh May 10, 2024
326b155
Merge branch 'main' into vector-search-query
simorenoh May 10, 2024
540a645
addressing comments
simorenoh May 10, 2024
98a4fc9
test name fix, improve readme/ samples
simorenoh May 10, 2024
d487519
add sync tests, improve readme
simorenoh May 10, 2024
abd2bc0
async tests
simorenoh May 10, 2024
a0547b1
pylint
simorenoh May 10, 2024
07acb93
remove print
simorenoh May 10, 2024
7cd5b92
pylint
simorenoh May 10, 2024
5834b29
adds env variable
simorenoh May 10, 2024
f615f3e
adds JS tests
simorenoh May 13, 2024
0081bbe
error logic improvements
simorenoh May 13, 2024
674f483
readme updates
simorenoh May 13, 2024
0e26bf6
more fixes to logic
simorenoh May 13, 2024
a65eb0a
oops
simorenoh May 13, 2024
6563bc3
memory optimization
simorenoh May 13, 2024
9935dc1
Update sdk/cosmos/azure-cosmos/README.md
simorenoh May 13, 2024
ad36a9c
update variable for naming conventions
simorenoh May 13, 2024
86b78b7
remove/ comment out diskANN
simorenoh May 13, 2024
3cff42f
offset + limit fix, tests fixes
simorenoh May 14, 2024
dd187dd
add capabilities env var flag
simorenoh May 14, 2024
d2fbb1b
use feature flag for existing query tests
simorenoh May 14, 2024
fe7742a
disable emulator for query tests
simorenoh May 14, 2024
7cd4d9d
missed some tests
simorenoh May 14, 2024
b3876c6
Update test_aggregate.py
simorenoh May 14, 2024
d8bc50d
Update test-resources.bicep
simorenoh May 15, 2024
1e699e4
forgot tests were being skipped
simorenoh May 15, 2024
e79839b
Update sdk/cosmos/azure-cosmos/test/test_vector_policy.py
Pilchie May 15, 2024
16860dc
Update sdk/cosmos/azure-cosmos/test/test_vector_policy_async.py
Pilchie May 15, 2024
1431e9e
test fixes
simorenoh May 15, 2024
28bef5b
Merge branch 'vector-search-query' of https://github.com/simorenoh/az…
simorenoh May 15, 2024
8701b80
Update README.md
simorenoh May 15, 2024
58af1bb
create separate db for vectors
simorenoh May 15, 2024
9bfdf57
tests
simorenoh May 15, 2024
45e5b6d
tests
simorenoh May 15, 2024
c4a7c60
more tests
simorenoh May 15, 2024
b6dbe45
small bit
simorenoh May 15, 2024
fca1294
final fixes hopefully
simorenoh May 15, 2024
445ba94
raise time limit on test so it doesnt fail
simorenoh May 15, 2024
f64775d
Update test_query_vector_similarity_async.py
simorenoh May 15, 2024
ae9524d
add date for release prep
simorenoh May 15, 2024
e616c4a
Merge branch 'main' into vector-search-query
simorenoh May 15, 2024
8ad2591
Update CHANGELOG.md
simorenoh May 15, 2024
fd10e89
Merge branch 'main' into vector-search-query
simorenoh May 15, 2024
3 changes: 2 additions & 1 deletion sdk/cosmos/azure-cosmos/CHANGELOG.md
@@ -3,6 +3,8 @@
### 4.6.1 (Unreleased)

#### Features Added
* Adds vector embedding policy and vector indexing policy. See [PR 34882](https://github.com/Azure/azure-sdk-for-python/pull/34882).
* Adds support for vector search non-streaming order by queries. See [PR 10101](https://github.com/Azure/azure-sdk-for-python/pull/10101).
* Added support for using the start time option for change feed query API. See [PR 35090](https://github.com/Azure/azure-sdk-for-python/pull/35090)

#### Breaking Changes
@@ -11,7 +13,6 @@
* Fixed a bug where change feed query in Async client was not returning all pages due to case-sensitive response headers. See [PR 35090](https://github.com/Azure/azure-sdk-for-python/pull/35090)
* Fixed a bug when a retryable exception occurs in the first page of a query execution causing query to return 0 results. See [PR 35090](https://github.com/Azure/azure-sdk-for-python/pull/35090).


#### Other Changes

### 4.6.0 (2024-03-14)
66 changes: 66 additions & 0 deletions sdk/cosmos/azure-cosmos/README.md
@@ -628,6 +628,72 @@ as well as containing the list of failed responses for the failed request.

For more information on Transactional Batch, see [Azure Cosmos DB Transactional Batch][cosmos_transactional_batch].

### Private Preview - Vector Embeddings and Vector Indexes
We have added new capabilities that let users leverage vector embeddings and vector indexing for vector
search through the Cosmos SDK. These two container-level configurations must be enabled at the account level
before they can be used.

Each vector embedding needs a path to the relevant vector field in the items being stored, a supported data type
(float32, int8, uint8), the vector's dimensions (a positive integer <= 1536), and the distance function used for that embedding.
A sample vector embedding policy looks like this:
```python
vector_embedding_policy = {
    "vectorEmbeddings": [
        {
            "path": "/vector1",
            "dataType": "float32",
            "dimensions": 1000,
            "distanceFunction": "euclidean"
        },
        {
            "path": "/vector2",
            "dataType": "int8",
            "dimensions": 200,
            "distanceFunction": "dotproduct"
        },
        {
            "path": "/vector3",
            "dataType": "uint8",
            "dimensions": 400,
            "distanceFunction": "cosine"
        }
    ]
}
```

Separately, vector indexes have been added to the existing indexing_policy and require only two fields per index:
the path to the relevant field and the index type, chosen from the available options (flat, quantizedFlat, or diskANN).
A sample indexing policy with vector indexes looks like this:
```python
indexing_policy = {
    "automatic": True,
    "indexingMode": "consistent",
    "compositeIndexes": [
        [
            {"path": "/numberField", "order": "ascending"},
            {"path": "/stringField", "order": "descending"}
        ]
    ],
    "spatialIndexes": [
        {"path": "/location/*", "types": ["Point", "Polygon"]}
    ],
    "vectorIndexes": [
        {"path": "/vector1", "type": "flat"},
        {"path": "/vector2", "type": "quantizedFlat"},
        {"path": "/vector3", "type": "diskANN"}
    ]
}
```
You would then pass the relevant policies into your container creation method so the container is created with these configurations.
The operation will fail if you include vector indexes in your indexing policy but do not also pass in a vector embedding policy.
```python
database.create_container(
    id=container_id,
    partition_key=PartitionKey(path="/id"),
    indexing_policy=indexing_policy,
    vector_embedding_policy=vector_embedding_policy,
)
```
***Note: vector embeddings and vector indexes CANNOT be edited by container replace operations. They can only be set when the container is created.***
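
A vector similarity query can then be issued against the container. The snippet below is only a sketch under a few assumptions: it presumes a `container` proxy created as shown above, a query embedding with the same 1000 dimensions as `/vector1`, and the availability of the `VectorDistance` system function on the account. It uses TOP because a vector search query must include a TOP or LIMIT filter.

```python
query_embedding = [0.001] * 1000  # placeholder vector; must match the 1000 dimensions declared for /vector1

results = container.query_items(
    query=(
        "SELECT TOP 5 c.id, VectorDistance(c.vector1, @embedding) AS similarity_score "
        "FROM c ORDER BY VectorDistance(c.vector1, @embedding)"
    ),
    parameters=[{"name": "@embedding", "value": query_embedding}],
    enable_cross_partition_query=True,
)

for item in results:
    print(item["id"], item["similarity_score"])
```

Limiting the query with TOP (or LIMIT) is what allows the SDK to bound the size of the priority queue it uses to sort non-streaming ORDER BY results.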

## Troubleshooting

### General
@@ -3107,7 +3107,8 @@ def _GetQueryPlanThroughGateway(self, query: str, resource_link: str, **kwargs:
                                 documents._QueryFeature.MultipleOrderBy + "," +
                                 documents._QueryFeature.OffsetAndLimit + "," +
                                 documents._QueryFeature.OrderBy + "," +
                                 documents._QueryFeature.Top + "," +
                                 documents._QueryFeature.NonStreamingOrderBy)

        options = {
            "contentType": runtime_constants.MediaTypes.Json,
@@ -276,3 +276,57 @@ def _validate_orderby_items(self, res1, res2):
            type2 = _OrderByHelper.getTypeStr(elt2)
            if type1 != type2:
                raise ValueError("Expected {}, but got {}.".format(type1, type2))


class _NonStreamingDocumentProducer(object):
"""This class takes care of handling of the items to be sorted in a non-streaming context.

One instance of this document producer goes attached to every item coming in for the priority queue to be able
to properly sort items as they get inserted.
"""

def __init__(self, item_result, sort_order):
"""
Constructor
"""
self._item_result = item_result
self._doc_producer_comp = _NonStreamingOrderByComparator(sort_order)

def __lt__(self, other):
return self._doc_producer_comp.compare(self, other) < 0


class _NonStreamingOrderByComparator(object):
"""Provide a Comparator for item results which respects orderby sort order.
"""

def __init__(self, sort_order): # pylint: disable=super-init-not-called
"""Instantiates this class

:param list sort_order:
List of sort orders (i.e., Ascending, Descending)

:ivar list sort_order:
List of sort orders (i.e., Ascending, Descending)

"""
self._sort_order = sort_order

def compare(self, doc_producer1, doc_producer2):
"""Compares the given two instances of DocumentProducers.

Based on the orderby query items and whether the sort order is Ascending
or Descending compares the peek result of the two DocumentProducers.

:param _DocumentProducer doc_producer1: first instance to be compared
:param _DocumentProducer doc_producer2: second instance to be compared
:return:
Integer value of compare result.
positive integer if doc_producers1 > doc_producers2
negative integer if doc_producers1 < doc_producers2
:rtype: int
"""
# TODO: this is not fully safe - doesn't deal with scenario of having orderByItems of [{}]
rank1 = doc_producer1._item_result["orderByItems"][0]['item']
rank2 = doc_producer2._item_result["orderByItems"][0]['item']
return _compare_helper(rank1, rank2)
@@ -60,6 +60,14 @@ def __next__(self):

    next = __next__  # Python 2 compatibility.


class _QueryExecutionNonStreamingEndpointComponent(_QueryExecutionEndpointComponent):
    """Represents an endpoint in handling a non-streaming order by query results.

    For each processed orderby result it returns the item result.
    """
    def __next__(self):
        return next(self._execution_context)._item_result["payload"]


class _QueryExecutionTopEndpointComponent(_QueryExecutionEndpointComponent):
    """Represents an endpoint in handling top query.
@@ -25,11 +25,11 @@

import json
from azure.cosmos.exceptions import CosmosHttpResponseError
from azure.cosmos._execution_context import multi_execution_aggregator
from azure.cosmos._execution_context import endpoint_component, multi_execution_aggregator
from azure.cosmos._execution_context import non_streaming_order_by_aggregator
from azure.cosmos._execution_context.base_execution_context import _QueryExecutionContextBase
from azure.cosmos._execution_context.base_execution_context import _DefaultQueryExecutionContext
from azure.cosmos._execution_context.query_execution_info import _PartitionedQueryExecutionInfo
from azure.cosmos._execution_context import endpoint_component
from azure.cosmos.documents import _DistinctType
from azure.cosmos.http_constants import StatusCodes, SubStatusCodes

@@ -111,15 +111,29 @@ def fetch_next_block(self):
        return self._execution_context.fetch_next_block()

    def _create_pipelined_execution_context(self, query_execution_info):

        assert self._resource_link, "code bug, resource_link is required."
        if query_execution_info.has_aggregates() and not query_execution_info.has_select_value():
            if self._options and ("enableCrossPartitionQuery" in self._options
                                  and self._options["enableCrossPartitionQuery"]):
                raise CosmosHttpResponseError(StatusCodes.BAD_REQUEST,
                                              "Cross partition query only supports 'VALUE <AggregateFunc>' for aggregates")

        # throw exception here for vector search query without limit filter
        if query_execution_info.get_has_non_streaming_order_by():
            if query_execution_info.get_top() is None and query_execution_info.get_limit() is None:
                # TODO: missing one last if statement here to check for the system variable bypass - need name
                raise CosmosHttpResponseError(StatusCodes.BAD_REQUEST,
                                              "Executing a vector search query without TOP or LIMIT can consume many" +
                                              " RUs very fast and have long runtimes. Please ensure you are using one" +
                                              " of the two filters with your vector search query.")
            execution_context_aggregator = \
                non_streaming_order_by_aggregator._NonStreamingOrderByContextAggregator(self._client,
                                                                                        self._resource_link,
                                                                                        self._query,
                                                                                        self._options,
                                                                                        query_execution_info)
        else:
            execution_context_aggregator = multi_execution_aggregator._MultiExecutionContextAggregator(self._client,
                                                                                                        self._resource_link,
                                                                                                        self._query,
                                                                                                        self._options,
@@ -147,7 +161,9 @@ def __init__(self, client, options, execution_context, query_execution_info):
        self._endpoint = endpoint_component._QueryExecutionEndpointComponent(execution_context)

        order_by = query_execution_info.get_order_by()
        if query_execution_info.get_has_non_streaming_order_by():
            self._endpoint = endpoint_component._QueryExecutionNonStreamingEndpointComponent(self._endpoint)
        elif order_by:
            self._endpoint = endpoint_component._QueryExecutionOrderByEndpointComponent(self._endpoint)

        aggregates = query_execution_info.get_aggregates()
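
The hunk above wires the new `_QueryExecutionNonStreamingEndpointComponent` into the endpoint pipeline, where each component wraps the previous one and post-processes whatever its `__next__` yields. Below is a minimal, self-contained sketch of that chaining pattern; the class names and data are illustrative only and are not part of the SDK.

```python
class Endpoint:
    """Base component: yields results from an underlying execution context (any iterator)."""

    def __init__(self, execution_context):
        self._execution_context = execution_context

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._execution_context)


class PayloadEndpoint(Endpoint):
    """Unwraps the payload of each ordered result, like the new non-streaming component."""

    def __next__(self):
        return next(self._execution_context)["payload"]


class TopEndpoint(Endpoint):
    """Stops after `top` results, like the existing TOP component."""

    def __init__(self, execution_context, top):
        super().__init__(execution_context)
        self._top = top

    def __next__(self):
        if self._top <= 0:
            raise StopIteration
        self._top -= 1
        return next(self._execution_context)


ordered_results = iter([{"payload": {"id": "1"}}, {"payload": {"id": "2"}}, {"payload": {"id": "3"}}])
pipeline = TopEndpoint(PayloadEndpoint(Endpoint(ordered_results)), top=2)
print(list(pipeline))  # [{'id': '1'}, {'id': '2'}]
```

Because each component only overrides `__next__`, behaviors such as unwrapping the ordered payload can be layered in without touching the underlying execution context.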
@@ -0,0 +1,168 @@
# The MIT License (MIT)
# Copyright (c) 2024 Microsoft Corporation

"""Internal class for multi execution context aggregator implementation in the Azure Cosmos database service.
"""

import heapq
from azure.cosmos._execution_context.base_execution_context import _QueryExecutionContextBase
from azure.cosmos._execution_context import document_producer
from azure.cosmos._routing import routing_range
from azure.cosmos import exceptions


# pylint: disable=protected-access

class FixedSizePriorityQueue:
    """Provides a Fixed Size Priority Queue abstraction data structure"""

    def __init__(self, max_size):
        self._heap = []
        self.max_size = max_size

    def pop(self):
        return heapq.heappop(self._heap)

    def push(self, item):
        if len(self._heap) < self.max_size:
            heapq.heappush(self._heap, item)
        else:
            heapq.heappushpop(self._heap, item)

    def peek(self):
        return self._heap[0]

    def size(self):
        return len(self._heap)


class _NonStreamingOrderByContextAggregator(_QueryExecutionContextBase):
"""This class is a subclass of the query execution context base and serves for
non-streaming order by queries. It is very similar to the existing MultiExecutionContextAggregator,
but is needed since we're dealing with items and not document producers.

This class builds upon the multi-execution aggregator, building a document producer per partition
and draining their results entirely in order to create the result set relevant to the filters passed
by the user.
"""

    def __init__(self, client, resource_link, query, options, partitioned_query_ex_info):
        super(_NonStreamingOrderByContextAggregator, self).__init__(client, options)

        # use the routing provider in the client
        self._routing_provider = client._routing_map_provider
        self._client = client
        self._resource_link = resource_link
        self._query = query
        self._partitioned_query_ex_info = partitioned_query_ex_info
        self._sort_orders = partitioned_query_ex_info.get_order_by()

        # will be a list of (partition_min, partition_max) tuples
        targetPartitionRanges = self._get_target_partition_key_range()

        self._document_producer_comparator = document_producer._NonStreamingOrderByComparator(self._sort_orders)

        targetPartitionQueryExecutionContextList = []
        for partitionTargetRange in targetPartitionRanges:
            # create a document producer for each partition key range
            targetPartitionQueryExecutionContextList.append(
                self._createTargetPartitionQueryExecutionContext(partitionTargetRange)
            )

        self._doc_producers = []
        # verify all document producers have items/ no splits
        for targetQueryExContext in targetPartitionQueryExecutionContextList:
            try:
                targetQueryExContext.peek()
                self._doc_producers.append(targetQueryExContext)
            except exceptions.CosmosHttpResponseError as e:
                if exceptions._partition_range_is_gone(e):
                    # repairing document producer context on partition split
                    self._repair_document_producer()
                else:
                    raise
            except StopIteration:
                continue

        pq_size = partitioned_query_ex_info.get_top() or partitioned_query_ex_info.get_limit()
        self._orderByPQ = FixedSizePriorityQueue(pq_size)
        for doc_producer in self._doc_producers:
            while True:
                try:
                    result = doc_producer.peek()
                    item_result = document_producer._NonStreamingDocumentProducer(result, self._sort_orders)
                    self._orderByPQ.push(item_result)
                    next(doc_producer)
                except StopIteration:
                    break

    def __next__(self):
        """Returns the next item result.

        :return: The next result.
        :rtype: dict
        :raises StopIteration: If no more results are left.
        """
        if self._orderByPQ.size() > 0:
            res = self._orderByPQ.pop()
            return res
        raise StopIteration

    def fetch_next_block(self):
        raise NotImplementedError("You should use pipeline's fetch_next_block.")

    def _repair_document_producer(self):
        """Repairs the document producer context by using the re-initialized routing map provider in the client,
        which loads in a refreshed partition key range cache to re-create the partition key ranges.
        After loading this new cache, the document producers get re-created with the new valid ranges.
        """
        # refresh the routing provider to get the newly initialized one post-refresh
        self._routing_provider = self._client._routing_map_provider
        # will be a list of (partition_min, partition_max) tuples
        targetPartitionRanges = self._get_target_partition_key_range()

        targetPartitionQueryExecutionContextList = []
        for partitionTargetRange in targetPartitionRanges:
            # create and add the child execution context for the target range
            targetPartitionQueryExecutionContextList.append(
                self._createTargetPartitionQueryExecutionContext(partitionTargetRange)
            )

        self._doc_producers = []
        for targetQueryExContext in targetPartitionQueryExecutionContextList:
            try:
                # TODO: we can also use more_itertools.peekable to be more python friendly
                targetQueryExContext.peek()
                # if there are matching results in the target ex range add it to the priority queue
                self._doc_producers.append(targetQueryExContext)

            except StopIteration:
                continue

    def _createTargetPartitionQueryExecutionContext(self, partition_key_target_range):

        rewritten_query = self._partitioned_query_ex_info.get_rewritten_query()
        if rewritten_query:
            if isinstance(self._query, dict):
                # this is a parameterized query, collect all the parameters
                query = dict(self._query)
                query["query"] = rewritten_query
            else:
                query = rewritten_query
        else:
            query = self._query

        return document_producer._DocumentProducer(
            partition_key_target_range,
            self._client,
            self._resource_link,
            query,
            self._document_producer_comparator,
            self._options,
        )

    def _get_target_partition_key_range(self):
        query_ranges = self._partitioned_query_ex_info.get_query_ranges()
        return self._routing_provider.get_overlapping_ranges(
            self._resource_link, [routing_range.Range.ParseFromDict(range_as_dict) for range_as_dict in query_ranges]
        )
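
The aggregator above drains every partition's document producer into a `FixedSizePriorityQueue` bounded by the query's TOP/LIMIT value, relying on `_NonStreamingDocumentProducer.__lt__` so that `heapq` can order the wrapped items. The snippet below is an illustrative, self-contained sketch of that bounded-heap technique only; the class and field names are invented for the example, and the SDK's real comparator (sort orders, orderByItems) differs.

```python
import heapq


class BoundedHeap:
    """Keeps at most max_size items; once full, each push evicts the smallest element."""

    def __init__(self, max_size):
        self._heap = []
        self.max_size = max_size

    def push(self, item):
        if len(self._heap) < self.max_size:
            heapq.heappush(self._heap, item)
        else:
            heapq.heappushpop(self._heap, item)  # push the new item, then drop the smallest

    def drain(self):
        while self._heap:
            yield heapq.heappop(self._heap)


class RankedItem:
    """Wrapper whose __lt__ lets heapq order items by their ORDER BY rank."""

    def __init__(self, rank, payload):
        self.rank = rank
        self.payload = payload

    def __lt__(self, other):
        return self.rank < other.rank


pq = BoundedHeap(max_size=3)
for rank, payload in [(0.9, "a"), (0.1, "b"), (0.5, "c"), (0.3, "d"), (0.7, "e")]:
    pq.push(RankedItem(rank, payload))

print([item.payload for item in pq.drain()])  # ['c', 'e', 'a'] -- the three largest ranks, smallest first
```

With a min-heap and `heappushpop`, the queue retains the `max_size` largest items under the wrapper's ordering, and draining returns them smallest-first.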