Merge remote-tracking branch 'origin/main' into es-117596-fix
ywangd committed Nov 28, 2024
2 parents 0809994 + 8350ff2 commit d91680a
Showing 76 changed files with 2,302 additions and 414 deletions.
15 changes: 1 addition & 14 deletions .buildkite/scripts/dra-workflow.sh
@@ -75,7 +75,6 @@ find "$WORKSPACE" -type d -path "*/build/distributions" -exec chmod a+w {} \;

echo --- Running release-manager

-set +e
# Artifacts should be generated
docker run --rm \
--name release-manager \
@@ -92,16 +91,4 @@ docker run --rm \
--version "$ES_VERSION" \
--artifact-set main \
--dependency "beats:https://artifacts-${WORKFLOW}.elastic.co/beats/${BEATS_BUILD_ID}/manifest-${ES_VERSION}${VERSION_SUFFIX}.json" \
---dependency "ml-cpp:https://artifacts-${WORKFLOW}.elastic.co/ml-cpp/${ML_CPP_BUILD_ID}/manifest-${ES_VERSION}${VERSION_SUFFIX}.json" \
-2>&1 | tee release-manager.log
-EXIT_CODE=$?
-set -e
-
-# This failure is just generating a ton of noise right now, so let's just ignore it
-# This should be removed once this issue has been fixed
-if grep "elasticsearch-ubi-9.0.0-SNAPSHOT-docker-image.tar.gz" release-manager.log; then
-echo "Ignoring error about missing ubi artifact"
-exit 0
-fi
-
-exit "$EXIT_CODE"
+--dependency "ml-cpp:https://artifacts-${WORKFLOW}.elastic.co/ml-cpp/${ML_CPP_BUILD_ID}/manifest-${ES_VERSION}${VERSION_SUFFIX}.json"
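The deleted lines implemented a common capture-then-inspect shell pattern: disable `set -e`, tee the command's output to a log, save `$?`, and treat one known error signature as benign. A minimal Python sketch of the same pattern, with hypothetical helper names and marker strings:

```python
import subprocess

def run_and_log(cmd, log_path):
    # Run the command without aborting on failure (the shell's `set +e`),
    # capture combined stdout/stderr to a log file (the shell's `2>&1 | tee`),
    # and hand back the exit code for later inspection.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    with open(log_path, "w") as f:
        f.write(proc.stdout)
    return proc.returncode

def is_known_benign(log_path, marker):
    # The removed workaround treated one specific missing-artifact
    # error message as benign and exited 0 when it appeared.
    with open(log_path) as f:
        return marker in f.read()
```

The commit drops this workaround, so a non-zero exit from the release-manager container now fails the build directly.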
5 changes: 5 additions & 0 deletions docs/changelog/111494.yaml
@@ -0,0 +1,5 @@
pr: 111494
summary: Extensible Completion Postings Formats
area: "Suggesters"
type: enhancement
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/113120.yaml
@@ -0,0 +1,5 @@
pr: 113120
summary: ESQL - enabling scoring with METADATA `_score`
area: ES|QL
type: enhancement
issues: []
5 changes: 0 additions & 5 deletions docs/changelog/117235.yaml

This file was deleted.

5 changes: 5 additions & 0 deletions docs/changelog/117618.yaml
@@ -0,0 +1,5 @@
pr: 117618
summary: SearchStatesIt failures reported by CI
area: Search
type: bug
issues: [116617, 116618]
5 changes: 5 additions & 0 deletions docs/changelog/117655.yaml
@@ -0,0 +1,5 @@
pr: 117655
summary: Add nulls support to Categorize
area: ES|QL
type: enhancement
issues: []
5 changes: 4 additions & 1 deletion docs/reference/cluster/stats.asciidoc
@@ -1644,7 +1644,10 @@ The API returns the following response:
"total_deduplicated_mapping_size": "0b",
"total_deduplicated_mapping_size_in_bytes": 0,
"field_types": [],
-"runtime_field_types": []
+"runtime_field_types": [],
+"source_modes" : {
+  "stored": 0
+}
},
"analysis": {
"char_filter_types": [],
41 changes: 34 additions & 7 deletions docs/reference/inference/service-elasticsearch.asciidoc
@@ -69,15 +69,15 @@ include::inference-shared.asciidoc[tag=service-settings]
These settings are specific to the `elasticsearch` service.
--

-`adaptive_allocations`:::
-(Optional, object)
-include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=adaptive-allocation]

`deployment_id`:::
(Optional, string)
The `deployment_id` of an existing trained model deployment.
When `deployment_id` is used the `model_id` is optional.

+`adaptive_allocations`:::
+(Optional, object)
+include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=adaptive-allocation]

`enabled`::::
(Optional, Boolean)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=adaptive-allocation-enabled]
@@ -119,7 +119,6 @@ include::inference-shared.asciidoc[tag=task-settings]
Returns the document instead of only the index. Defaults to `true`.
=====


[discrete]
[[inference-example-elasticsearch-elser]]
==== ELSER via the `elasticsearch` service
@@ -137,7 +136,7 @@ PUT _inference/sparse_embedding/my-elser-model
"adaptive_allocations": { <1>
"enabled": true,
"min_number_of_allocations": 1,
-"max_number_of_allocations": 10
+"max_number_of_allocations": 4
},
"num_threads": 1,
"model_id": ".elser_model_2" <2>
@@ -150,6 +149,34 @@ PUT _inference/sparse_embedding/my-elser-model
Valid values are `.elser_model_2` and `.elser_model_2_linux-x86_64`.
For further details, refer to the {ml-docs}/ml-nlp-elser.html[ELSER model documentation].

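Request bodies like the one above can be assembled and sanity-checked programmatically before issuing the PUT. A minimal Python sketch; the helper name is hypothetical and not part of any client library:

```python
def elser_endpoint_body(model_id=".elser_model_2", num_threads=1,
                        min_allocations=1, max_allocations=4):
    # Mirror the PUT _inference/sparse_embedding body shown above,
    # validating the adaptive-allocations bounds up front.
    if not (1 <= min_allocations <= max_allocations):
        raise ValueError("need 1 <= min_allocations <= max_allocations")
    if model_id not in (".elser_model_2", ".elser_model_2_linux-x86_64"):
        raise ValueError("unsupported ELSER model_id: " + model_id)
    return {
        "service": "elasticsearch",
        "service_settings": {
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": min_allocations,
                "max_number_of_allocations": max_allocations,
            },
            "num_threads": num_threads,
            "model_id": model_id,
        },
    }
```

The resulting dict can be sent as the JSON body of `PUT _inference/sparse_embedding/my-elser-model` with any HTTP client.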
[discrete]
[[inference-example-elastic-reranker]]
==== Elastic Rerank via the `elasticsearch` service

The following example shows how to create an {infer} endpoint called `my-elastic-rerank` to perform a `rerank` task type using the built-in Elastic Rerank cross-encoder model.

The API request below automatically downloads the Elastic Rerank model if it isn't already present, then deploys it.
Once deployed, the model can be used for semantic re-ranking with a <<text-similarity-reranker-retriever-example-elastic-rerank,`text_similarity_reranker` retriever>>.

[source,console]
------------------------------------------------------------
PUT _inference/rerank/my-elastic-rerank
{
"service": "elasticsearch",
"service_settings": {
"model_id": ".rerank-v1", <1>
"num_threads": 1,
"adaptive_allocations": { <2>
"enabled": true,
"min_number_of_allocations": 1,
"max_number_of_allocations": 4
}
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The `model_id` must be the ID of the built-in Elastic Rerank model: `.rerank-v1`.
<2> {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[Adaptive allocations] will be enabled with a minimum of 1 and a maximum of 4 allocations.

[discrete]
[[inference-example-elasticsearch]]
@@ -186,7 +213,7 @@ If using the Python client, you can set the `timeout` parameter to a higher value.

[discrete]
[[inference-example-eland]]
-==== Models uploaded by Eland via the elasticsearch service
+==== Models uploaded by Eland via the `elasticsearch` service

The following example shows how to create an {infer} endpoint called
`my-msmarco-minilm-model` to perform a `text_embedding` task type.
@@ -235,7 +235,7 @@ The reason for the current state. Usually only populated when the `routing_state
(string)
The current routing state.
--
-* `starting`: The model is attempting to allocate on this model, inference calls are not yet accepted.
+* `starting`: The model is attempting to allocate on this node, inference calls are not yet accepted.
* `started`: The model is allocated and ready to accept inference requests.
* `stopping`: The model is being deallocated from this node.
* `stopped`: The model is fully deallocated from this node.
32 changes: 16 additions & 16 deletions docs/reference/quickstart/full-text-filtering-tutorial.asciidoc
@@ -511,8 +511,9 @@ In this tutorial scenario it's useful for when users have complex requirements f

Let's create a query that addresses the following user needs:

-* Must be a vegetarian main course
+* Must be a vegetarian recipe
* Should contain "curry" or "spicy" in the title or description
+* Should be a main course
* Must not be a dessert
* Must have a rating of at least 4.5
* Should prefer recipes published in the last month
@@ -524,16 +525,7 @@ GET /cooking_blog/_search
"query": {
"bool": {
"must": [
-{
-  "term": {
-    "category.keyword": "Main Course"
-  }
-},
-{
-  "term": {
-    "tags": "vegetarian"
-  }
-},
+{ "term": { "tags": "vegetarian" } },
{
"range": {
"rating": {
@@ -543,10 +535,18 @@ }
}
],
"should": [
+{
+  "term": {
+    "category": "Main Course"
+  }
+},
{
"multi_match": {
"query": "curry spicy",
-"fields": ["title^2", "description"]
+"fields": [
+  "title^2",
+  "description"
+]
}
},
{
@@ -590,12 +590,12 @@ GET /cooking_blog/_search
"value": 1,
"relation": "eq"
},
-"max_score": 7.9835095,
+"max_score": 7.444513,
"hits": [
{
"_index": "cooking_blog",
"_id": "2",
-"_score": 7.9835095,
+"_score": 7.444513,
"_source": {
"title": "Spicy Thai Green Curry: A Vegetarian Adventure", <1>
"description": "Dive into the flavors of Thailand with this vibrant green curry. Packed with vegetables and aromatic herbs, this dish is both healthy and satisfying. Don't worry about the heat - you can easily adjust the spice level to your liking.", <2>
Expand All @@ -619,8 +619,8 @@ GET /cooking_blog/_search
<1> The title contains "Spicy" and "Curry", matching our should condition. With the default <<type-best-fields,best_fields>> behavior, this field contributes most to the relevance score.
<2> While the description also contains matching terms, only the best matching field's score is used by default.
<3> The recipe was published within the last month, satisfying our recency preference.
-<4> The "Main Course" category matches our `must` condition.
-<5> The "vegetarian" tag satisfies another `must` condition, while "curry" and "spicy" tags align with our `should` preferences.
+<4> The "Main Course" category satisfies another `should` condition.
+<5> The "vegetarian" tag satisfies a `must` condition, while "curry" and "spicy" tags align with our `should` preferences.
<6> The rating of 4.6 meets our minimum rating requirement of 4.5.
==============
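The clause placement this tutorial teaches (hard requirements in `must` and `must_not`, preferences in `should`) can be sketched as a query builder. A minimal Python sketch; the helper name is hypothetical, and the `category.keyword: Dessert` exclusion is an assumption standing in for the tutorial's "must not be a dessert" requirement:

```python
def build_recipe_query(min_rating=4.5):
    # Hard requirements go in `must`/`must_not` (they filter the result set);
    # preferences go in `should` (they only boost matching documents).
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"tags": "vegetarian"}},
                    {"range": {"rating": {"gte": min_rating}}},
                ],
                "must_not": [
                    {"term": {"category.keyword": "Dessert"}},
                ],
                "should": [
                    {"term": {"category": "Main Course"}},
                    {"multi_match": {"query": "curry spicy",
                                     "fields": ["title^2", "description"]}},
                ],
            }
        }
    }
```

Moving the `Main Course` term from `must` to `should`, as this commit does, means non-main-course vegetarian recipes still match; they just score lower.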

20 changes: 11 additions & 9 deletions docs/reference/reranking/semantic-reranking.asciidoc
@@ -85,14 +85,16 @@ In {es}, semantic re-rankers are implemented using the {es} <<inference-apis,Inference APIs>>

To use semantic re-ranking in {es}, you need to:

-. *Choose a re-ranking model*.
-Currently you can:
-
-** Integrate directly with the <<infer-service-cohere,Cohere Rerank inference endpoint>> using the `rerank` task type
-** Integrate directly with the <<infer-service-google-vertex-ai,Google Vertex AI inference endpoint>> using the `rerank` task type
-** Upload a model to {es} from Hugging Face with {eland-docs}/machine-learning.html#ml-nlp-pytorch[Eland]. You'll need to use the `text_similarity` NLP task type when loading the model using Eland. Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third party text similarity models supported by {es} for semantic re-ranking.
-*** Then set up an <<inference-example-eland,{es} service inference endpoint>> with the `rerank` task type
-. *Create a `rerank` task using the <<put-inference-api,{es} Inference API>>*.
+. *Select and configure a re-ranking model*.
+You have the following options:
+.. Use the <<inference-example-elastic-reranker,Elastic Rerank>> cross-encoder model via the inference API's {es} service.
+.. Use the <<infer-service-cohere,Cohere Rerank inference endpoint>> to create a `rerank` endpoint.
+.. Use the <<infer-service-google-vertex-ai,Google Vertex AI inference endpoint>> to create a `rerank` endpoint.
+.. Upload a model to {es} from Hugging Face with {eland-docs}/machine-learning.html#ml-nlp-pytorch[Eland]. You'll need to use the `text_similarity` NLP task type when loading the model using Eland. Then set up an <<inference-example-eland,{es} service inference endpoint>> with the `rerank` endpoint type.
++
+Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third party text similarity models supported by {es} for semantic re-ranking.
+
+. *Create a `rerank` endpoint using the <<put-inference-api,{es} Inference API>>*.
The Inference API creates an inference endpoint and configures your chosen machine learning model to perform the re-ranking task.
. *Define a `text_similarity_reranker` retriever in your search request*.
The retriever syntax makes it simple to configure both the retrieval and re-ranking of search results in a single API call.
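The retriever clause described in these steps can be sketched as a small builder. A minimal Python sketch with a hypothetical helper name; the parameter names mirror the fields of the `text_similarity_reranker` request shown below:

```python
def text_similarity_reranker(retriever, field, inference_id,
                             inference_text, rank_window_size=100,
                             min_score=0.0):
    # Wrap any first-stage retriever (e.g. a `standard` match query)
    # with a semantic re-ranking stage driven by an inference endpoint.
    return {
        "retriever": {
            "text_similarity_reranker": {
                "retriever": retriever,
                "field": field,
                "inference_id": inference_id,
                "inference_text": inference_text,
                "rank_window_size": rank_window_size,
                "min_score": min_score,
            }
        }
    }
```

The returned dict is the body of a `POST _search`: the first stage fetches `rank_window_size` candidates, and the re-ranker reorders them against `inference_text`.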
@@ -117,7 +119,7 @@ POST _search
}
},
"field": "text",
-"inference_id": "my-cohere-rerank-model",
+"inference_id": "elastic-rerank",
"inference_text": "How often does the moon hide the sun?",
"rank_window_size": 100,
"min_score": 0.5