Always use constant_score query for match_only_text #16964

Collaborator

@msfroh msfroh commented Jan 6, 2025

Description

In some cases, when we create a term query over a match_only_text field, it may still try to compute scores, which prevents early termination. We should always use a constant score query when querying match_only_text, since we don't have the statistics required to compute scores.
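
As a rough illustration of why scoring can block early termination, here is a minimal sketch using a toy model. It does not use Lucene or OpenSearch classes: `ToyQuery`, `term`, `constantScore`, and `docsVisited` are hypothetical names invented for this example, and real collectors are considerably more sophisticated. The assumption is that a query which needs scores must visit every match to rank them, while a constant-score query can stop once it has collected the requested number of hits.

```java
import java.util.List;

final class ConstantScoreSketch {
    /** Toy stand-in for a query; not a Lucene or OpenSearch type. */
    interface ToyQuery {
        boolean matches(String doc);
        boolean needsScores();
    }

    /** Hypothetical term query that reports it needs scores. */
    static ToyQuery term(String token) {
        return new ToyQuery() {
            public boolean matches(String doc) { return List.of(doc.split(" ")).contains(token); }
            public boolean needsScores() { return true; }
        };
    }

    /** Hypothetical wrapper that matches the same docs but never needs scores. */
    static ToyQuery constantScore(ToyQuery inner) {
        return new ToyQuery() {
            public boolean matches(String doc) { return inner.matches(doc); }
            public boolean needsScores() { return false; }
        };
    }

    /** Returns how many documents were visited while collecting the top `size` hits. */
    static int docsVisited(ToyQuery query, List<String> docs, int size) {
        int visited = 0;
        int hits = 0;
        for (String doc : docs) {
            visited++;
            if (query.matches(doc)) {
                hits++;
            }
            // All hits score the same, so the first `size` hits are already final.
            if (!query.needsScores() && hits >= size) {
                break;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("error disk full", "error timeout", "ok", "error retry", "ok");
        System.out.println(docsVisited(term("error"), docs, 2));                // 5
        System.out.println(docsVisited(constantScore(term("error")), docs, 2)); // 2
    }
}
```

In this toy model the scoring path visits all five documents, while the constant-score path stops after the second hit.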

Related Issues

N/A

We've seen elevated benchmark latency on the Big5 query-string-on-message operation, which we can attribute to this.

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

In some cases, when we create a term query over a `match_only_text`
field, it may still try to compute scores, which prevents early
termination. We should *always* use a constant score query when
querying `match_only_text`, since we don't have the statistics
required to compute scores.

Signed-off-by: Michael Froh <froh@amazon.com>
Collaborator Author

msfroh commented Jan 6, 2025

@rishabhmaurya -- you may be interested in this

@msfroh msfroh added the v2.19.0 (Issues and PRs related to version 2.19.0) and backport 2.x (Backport to 2.x branch) labels Jan 6, 2025
Signed-off-by: Michael Froh <froh@amazon.com>
Contributor

@rishabhmaurya rishabhmaurya left a comment

good catch. LGTM!

Contributor

github-actions bot commented Jan 6, 2025

❌ Gradle check result for 399ddaf: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Michael Froh <froh@amazon.com>
Contributor

github-actions bot commented Jan 7, 2025

✅ Gradle check result for 81a665c: SUCCESS

codecov bot commented Jan 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.21%. Comparing base (4a53ff2) to head (81a665c).
Report is 4 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16964      +/-   ##
============================================
- Coverage     72.32%   72.21%   -0.12%     
+ Complexity    65310    65246      -64     
============================================
  Files          5299     5299              
  Lines        303534   303536       +2     
  Branches      43941    43941              
============================================
- Hits         219527   219187     -340     
- Misses        66021    66391     +370     
+ Partials      17986    17958      -28     

@msfroh msfroh merged commit 0b36599 into opensearch-project:main Jan 7, 2025
36 checks passed
@msfroh msfroh deleted the constant_score_for_match_only_term_query branch January 7, 2025 00:24
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-16964-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 0b365998ed6e4f537dbdf7983a077bc53e785bb9
# Push it to GitHub
git push --set-upstream origin backport/backport-16964-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-16964-to-2.x.

msfroh added a commit to msfroh/OpenSearch that referenced this pull request Jan 7, 2025
…ct#16964)

In some cases, when we create a term query over a `match_only_text`
field, it may still try to compute scores, which prevents early
termination. We should *always* use a constant score query when
querying `match_only_text`, since we don't have the statistics
required to compute scores.

---------

Signed-off-by: Michael Froh <froh@amazon.com>
(cherry picked from commit 0b36599)
Collaborator Author

msfroh commented Jan 7, 2025

Manual backport PR: #16969

msfroh added a commit that referenced this pull request Jan 8, 2025
In some cases, when we create a term query over a `match_only_text`
field, it may still try to compute scores, which prevents early
termination. We should *always* use a constant score query when
querying `match_only_text`, since we don't have the statistics
required to compute scores.

---------

Signed-off-by: Michael Froh <froh@amazon.com>
(cherry picked from commit 0b36599)
meet-v25 pushed a commit to meet-v25/OpenSearch that referenced this pull request Jan 16, 2025
…ct#16964)

In some cases, when we create a term query over a `match_only_text`
field, it may still try to compute scores, which prevents early
termination. We should *always* use a constant score query when
querying `match_only_text`, since we don't have the statistics
required to compute scores.

---------

Signed-off-by: Michael Froh <froh@amazon.com>
meet-v25 pushed a commit to meet-v25/OpenSearch that referenced this pull request Jan 17, 2025
Signed-off-by: meetvm <meetvm@amazon.com>

Bump com.nimbusds:oauth2-oidc-sdk from 11.19.1 to 11.20.1 in /plugins/repository-azure (opensearch-project#16895)

* Bump com.nimbusds:oauth2-oidc-sdk in /plugins/repository-azure

Bumps [com.nimbusds:oauth2-oidc-sdk](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions) from 11.19.1 to 11.20.1.
- [Changelog](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions/src/master/CHANGELOG.txt)
- [Commits](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions/branches/compare/11.20.1..11.19.1)

---
updated-dependencies:
- dependency-name: com.nimbusds:oauth2-oidc-sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Bump com.netflix.nebula.ospackage-base from 11.10.0 to 11.10.1 in /distribution/packages (opensearch-project#16896)

* Bump com.netflix.nebula.ospackage-base in /distribution/packages

Bumps com.netflix.nebula.ospackage-base from 11.10.0 to 11.10.1.

---
updated-dependencies:
- dependency-name: com.netflix.nebula.ospackage-base
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Bump ch.qos.logback:logback-classic from 1.5.12 to 1.5.15 in /test/fixtures/hdfs-fixture (opensearch-project#16898)

* Bump ch.qos.logback:logback-classic in /test/fixtures/hdfs-fixture

Bumps [ch.qos.logback:logback-classic](https://github.com/qos-ch/logback) from 1.5.12 to 1.5.15.
- [Commits](qos-ch/logback@v_1.5.12...v_1.5.15)

---
updated-dependencies:
- dependency-name: ch.qos.logback:logback-classic
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Bump lycheeverse/lychee-action from 2.1.0 to 2.2.0 (opensearch-project#16897)

* Bump lycheeverse/lychee-action from 2.1.0 to 2.2.0

Bumps [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action) from 2.1.0 to 2.2.0.
- [Release notes](https://github.com/lycheeverse/lychee-action/releases)
- [Commits](lycheeverse/lychee-action@v2.1.0...v2.2.0)

---
updated-dependencies:
- dependency-name: lycheeverse/lychee-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Create sub directories for ThirdPartyAudit dependency metadata (opensearch-project#16844)

* Extract jars to sub dirs during thirdPartyAudit task.

Signed-off-by: Finn Carroll <carrofin@amazon.com>

* Change regex to split on '-'/'.'. Ignore version.

Signed-off-by: Finn Carroll <carrofin@amazon.com>

* Split on .jar for sub folder prefix.

Signed-off-by: Finn Carroll <carrofin@amazon.com>

---------

Signed-off-by: Finn Carroll <carrofin@amazon.com>

Retrieve value from DocValues in a flat_object field (opensearch-project#16802)

Bump com.microsoft.azure:msal4j from 1.17.2 to 1.18.0 in /plugins/repository-azure (opensearch-project#16918)

* Bump com.microsoft.azure:msal4j in /plugins/repository-azure

Bumps [com.microsoft.azure:msal4j](https://github.com/AzureAD/microsoft-authentication-library-for-java) from 1.17.2 to 1.18.0.
- [Release notes](https://github.com/AzureAD/microsoft-authentication-library-for-java/releases)
- [Changelog](https://github.com/AzureAD/microsoft-authentication-library-for-java/blob/dev/changelog.txt)
- [Commits](AzureAD/microsoft-authentication-library-for-java@v1.17.2...v1.18.0)

---
updated-dependencies:
- dependency-name: com.microsoft.azure:msal4j
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Bump org.apache.commons:commons-text from 1.12.0 to 1.13.0 in /test/fixtures/hdfs-fixture (opensearch-project#16919)

* Bump org.apache.commons:commons-text in /test/fixtures/hdfs-fixture

Bumps org.apache.commons:commons-text from 1.12.0 to 1.13.0.

---
updated-dependencies:
- dependency-name: org.apache.commons:commons-text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Add gRPC server as transport-grpc plugin (opensearch-project#16534)

Introduce auxiliary transport to NetworkPlugin and add gRPC plugin.

Auxiliary transports are optional lifecycle components provided by
network plugins which run in parallel to the http server/native
transport. They are distinct from the existing NetworkPlugin
interfaces of 'getTransports' and 'getHttpTransports' as auxiliary
transports are optional. Each AuxTransport implements its own
'aux.transport.type' and 'aux.transport.<type>.ports' setting. Since
Security.java initializes before Node.java during bootstrap,
socket binding permissions are granted based on
'aux.transport.<type>.ports' for each enabled 'aux.transport.type',
falling back to a default if no ports are specified.

Signed-off-by: Finn Carroll <carrofin@amazon.com>

Update script supports java.lang.String.sha1() and java.lang.String.sha256() methods (opensearch-project#16923)

* Update script supports java.lang.String.sha1() and java.lang.String.sha256() methods

Signed-off-by: Gao Binlong <gbinlong@amazon.com>

* Modify change log

Signed-off-by: Gao Binlong <gbinlong@amazon.com>

---------

Signed-off-by: Gao Binlong <gbinlong@amazon.com>

Workflow benchmark-pull-request.yml fix (opensearch-project#16925)

Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com>

Add benchmark confirm for lucene-10 big5 index snapshot (opensearch-project#16940)

Signed-off-by: Rishabh Singh <sngri@amazon.com>

Remove duplicate DCO check (opensearch-project#16942)

Signed-off-by: Andriy Redko <drreta@gmail.com>

Allow extended plugins to be optional (opensearch-project#16909)

* Make extended plugins optional

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Make extended plugins optional

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Load extensions for classpath plugins

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Ensure only single instance for each classpath extension

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Add test for classpath plugin extended plugin loading

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Modify test to allow optional extended plugin

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Only optional extended plugins

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Add additional warning message

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Add to CHANGELOG

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Add tag to make extended plugin optional

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Only send plugin names when serializing PluginInfo

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Keep track of optional extended plugins in separate set

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Include in ser/de of PluginInfo

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Change to 3_0_0

Signed-off-by: Craig Perkins <cwperx@amazon.com>

---------

Signed-off-by: Craig Perkins <cwperx@amazon.com>

Change version in PluginInfo to V_2_19_0 after backport to 2.x merged (opensearch-project#16947)

Signed-off-by: Craig Perkins <cwperx@amazon.com>

Support object fields in star-tree index (opensearch-project#16728)

---------

Signed-off-by: bharath-techie <bharath78910@gmail.com>

Bump ch.qos.logback:logback-core from 1.5.12 to 1.5.16 in /test/fixtures/hdfs-fixture (opensearch-project#16951)

* Bump ch.qos.logback:logback-core in /test/fixtures/hdfs-fixture

Bumps [ch.qos.logback:logback-core](https://github.com/qos-ch/logback) from 1.5.12 to 1.5.16.
- [Commits](qos-ch/logback@v_1.5.12...v_1.5.16)

---
updated-dependencies:
- dependency-name: ch.qos.logback:logback-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

[Workload Management] Add Workload Management IT (opensearch-project#16359)

* add workload management IT
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* address comments
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

---------

Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

Add new benchmark config for nested workload (opensearch-project#16956)

Signed-off-by: Rishabh Singh <sngri@amazon.com>

Bump com.azure:azure-core-http-netty from 1.15.5 to 1.15.7 in /plugins/repository-azure (opensearch-project#16952)

* Bump com.azure:azure-core-http-netty in /plugins/repository-azure

Bumps [com.azure:azure-core-http-netty](https://github.com/Azure/azure-sdk-for-java) from 1.15.5 to 1.15.7.
- [Release notes](https://github.com/Azure/azure-sdk-for-java/releases)
- [Commits](Azure/azure-sdk-for-java@azure-core-http-netty_1.15.5...azure-core-http-netty_1.15.7)

---
updated-dependencies:
- dependency-name: com.azure:azure-core-http-netty
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Always use constant_score query for match_only_text (opensearch-project#16964)

In some cases, when we create a term query over a `match_only_text`
field, it may still try to compute scores, which prevents early
termination. We should *always* use a constant score query when
querying `match_only_text`, since we don't have the statistics
required to compute scores.

---------

Signed-off-by: Michael Froh <froh@amazon.com>

Changes to support unmapped fields in metric aggregation (opensearch-project#16481)

Avoids an exception when querying an unmapped field when the star tree
experimental feature is enabled.

---------

Signed-off-by: expani <anijainc@amazon.com>

Use async client for delete blob or path in S3 Blob Container (opensearch-project#16788)

* Use async client for delete blob or path in S3 Blob Container

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Fix UTs

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Fix failures in S3BlobStoreRepositoryTests

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Fix S3BlobStoreRepositoryTests

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Fix failures in S3RepositoryThirdPartyTests

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Fix failures in S3RepositoryPluginTests

Signed-off-by: Ashish Singh <ssashish@amazon.com>

---------

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Fix Shallow copy snapshot failures on closed index (opensearch-project#16868)

* Fix shallow v1 snapshot failures on closed index

Signed-off-by: Shubh Sahu <shubhvs@amazon.com>

* UT fix

Signed-off-by: Shubh Sahu <shubhvs@amazon.com>

* Adding UT

Signed-off-by: Shubh Sahu <shubhvs@amazon.com>

* small fix

Signed-off-by: Shubh Sahu <shubhvs@amazon.com>

* Addressing comments

Signed-off-by: Shubh Sahu <shubhvs@amazon.com>

* Addressing comments

Signed-off-by: Shubh Sahu <shubhvs@amazon.com>

* Modifying IT to restore snapshot

Signed-off-by: Shubh Sahu <shubhvs@amazon.com>

---------

Signed-off-by: Shubh Sahu <shubhvs@amazon.com>
Co-authored-by: Shubh Sahu <shubhvs@amazon.com>

Add Response Status Number in http trace logs. (opensearch-project#16978)

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

support termQueryCaseInsensitive/termQuery can search from doc_value in flat_object/keyword field (opensearch-project#16974)

Signed-off-by: kkewwei <kewei.11@bytedance.com>
Signed-off-by: kkewwei <kkewwei@163.com>

use the correct type to widen the sort fields when merging top docs (opensearch-project#16881)

* use the correct type to widen the sort fields when merging top docs

Signed-off-by: panguixin <panguixin@bytedance.com>

* fix

Signed-off-by: panguixin <panguixin@bytedance.com>

* apply comments

Signed-off-by: panguixin <panguixin@bytedance.com>

* changelog

Signed-off-by: panguixin <panguixin@bytedance.com>

* add more tests

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>

Fix multi-value sort for unsigned long (opensearch-project#16732)

* Fix multi-value sort for unsigned long

Signed-off-by: panguixin <panguixin@bytedance.com>

* Add initial rest-api-spec tests

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* add more rest tests

Signed-off-by: panguixin <panguixin@bytedance.com>

* fix

Signed-off-by: panguixin <panguixin@bytedance.com>

* fix

Signed-off-by: panguixin <panguixin@bytedance.com>

* Extend MultiValueMode with dedicated support of unsigned_long doc values

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Add CHANGELOG.md, minor cleanups

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Correct the license headers

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Correct the @publicapi version

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Replace SingletonSortedNumericUnsignedLongValues with LongToSortedNumericUnsignedLongValues (as per review comments)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Co-authored-by: Andriy Redko <andriy.redko@aiven.io>

Update Gradle to 8.12 (opensearch-project#16884)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

`phone-search` analyzer: don't emit sip/tel prefix, int'l prefix, extension & unformatted input (opensearch-project#16993)

* `phone-search` analyzer: don't emit int'l prefix

this was an oversight in the initial implementation: if the tokenizer
emits the international calling prefix in the search analyzer then all
documents with the same international calling prefix will match.

e.g. when searching for `+1-555-123-4567` not only documents with this
number would match but also any other document with a `1` token (i.e.
any other number with this prefix).

thus the search functionality is currently broken for this analyzer,
making it useless.

the test coverage has now been extended to cover these and other
use-cases.
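
to make the failure mode concrete, here is a small sketch. it is a toy model only: the token sets are hypothetical and are not the real output of the `phone` or `phone-search` analyzers, and `matches` is an invented helper that assumes a document matches when it shares at least one token with the search input.

```java
import java.util.Set;

final class PhoneSearchSketch {
    /** Assumed matching rule: one shared token is enough for a hit. */
    static boolean matches(Set<String> indexedTokens, Set<String> searchTokens) {
        for (String token : searchTokens) {
            if (indexedTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical index-time tokens for two stored numbers sharing the +1 prefix.
        Set<String> docA = Set.of("1", "5551234567", "15551234567"); // +1-555-123-4567
        Set<String> docB = Set.of("1", "5559876543", "15559876543"); // +1-555-987-6543

        // Hypothetical search-time tokens for the input +1-555-123-4567.
        Set<String> searchWithPrefix = Set.of("1", "15551234567"); // prefix token emitted
        Set<String> searchWithoutPrefix = Set.of("15551234567");   // prefix token dropped

        System.out.println(matches(docB, searchWithPrefix));    // true: unrelated number matches
        System.out.println(matches(docB, searchWithoutPrefix)); // false
        System.out.println(matches(docA, searchWithoutPrefix)); // true: intended number still matches
    }
}
```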

Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com>

* `phone-search` analyzer: don't emit extension & unformatted input

if these tokens were emitted, phone numbers with other
international dialling prefixes still matched.

e.g. searching for `+1 1234` would also match a number stored as
`+2 1234`, which was wrong.

the tokens still need to be emitted for the `phone` analyzer, e.g. when
the user only enters the extension / local number it should still match.
the same applies to the other ngrams: these are needed for
search-as-you-type style queries where the user input needs to match
against partial phone numbers.

Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com>

* `phone-search` analyzer: don't emit sip/tel prefix

in line with the previous two commits, this is something else the search
analyzer shouldn't emit since otherwise searching for any number with
such a prefix will match _any_ document with the same prefix.

Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com>

---------

Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com>

Limit RW separation to remote store enabled clusters and update recovery flow (opensearch-project#16760)

* Update search only replica recovery flow

This PR includes multiple changes to search replica recovery.
1. Change search only replica copies to recover as empty store instead of PEER. This will run a store recovery that syncs segments from remote store directly and eliminate any primary communication.
2. Remove search replicas from the in-sync allocation ID set and update routing table to exclude them from allAllocationIds. This ensures primaries aren't tracking or validating the routing table for any search replica's presence.
3. Change search replica validation to require remote store. There are versions of the above changes that are still possible with primary based node-node replication, but I don't think they are worth making at this time.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* more coverage

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* add changelog entry

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* add assertions that Search Replicas are not in the in-sync id set nor the AllAllocationIds set in the routing table

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* update async task to only run if the FF is enabled and we are a remote store cluster.

This check had previously only checked for segrep

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* clean up max shards logic

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* remove search replicas from check during renewPeerRecoveryRetentionLeases

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* Revert "update async task to only run if the FF is enabled and we are a remote store cluster."

reverting this, we already check for remote store earlier.

This reverts commit 48ca1a3.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* Add more tests for failover case

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* Update remotestore restore logic and add test ensuring we can restore only writers when red

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* Fix Search replicas to honor node level recovery limits

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* Fix translog UUID mismatch on existing store recovery.

This commit adds PR feedback and recovery tests post node restart.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* Fix spotless

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* Fix bug with remote restore and add more tests

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

---------

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

Fix case insensitive and escaped query on wildcard (opensearch-project#16827)

* fix case insensitive and escaped query on wildcard

Signed-off-by: gesong.samuel <gesong.samuel@bytedance.com>

* add changelog

Signed-off-by: gesong.samuel <gesong.samuel@bytedance.com>

---------

Signed-off-by: gesong.samuel <gesong.samuel@bytedance.com>
Signed-off-by: Michael Froh <froh@amazon.com>
Co-authored-by: gesong.samuel <gesong.samuel@bytedance.com>
Co-authored-by: Michael Froh <froh@amazon.com>

Bump opentelemetry from 1.41.0 to 1.46.0 and opentelemetry-semconv from 1.27.0-alpha to 1.29.0-alpha (opensearch-project#17000)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

TransportBulkAction.doRun() (opensearch-project#16950)

Signed-off-by: kkewwei <kewei.11@bytedance.com>
Signed-off-by: kkewwei <kkewwei@163.com>

Show only intersecting buckets to the Adjacency matrix aggregation (opensearch-project#11733)

Signed-off-by: Ivan Brusic <ivan@brusic.com>

Bump com.google.re2j:re2j from 1.7 to 1.8 in /plugins/repository-hdfs (opensearch-project#17012)

* Bump com.google.re2j:re2j from 1.7 to 1.8 in /plugins/repository-hdfs

Bumps [com.google.re2j:re2j](https://github.com/google/re2j) from 1.7 to 1.8.
- [Release notes](https://github.com/google/re2j/releases)
- [Commits](google/re2j@re2j-1.7...re2j-1.8)

---
updated-dependencies:
- dependency-name: com.google.re2j:re2j
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump com.nimbusds:oauth2-oidc-sdk from 11.20.1 to 11.21 in /plugins/repository-azure (opensearch-project#17010)

* Bump com.nimbusds:oauth2-oidc-sdk in /plugins/repository-azure

Bumps [com.nimbusds:oauth2-oidc-sdk](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions) from 11.20.1 to 11.21.
- [Changelog](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions/src/master/CHANGELOG.txt)
- [Commits](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions/branches/compare/11.21..11.20.1)

---
updated-dependencies:
- dependency-name: com.nimbusds:oauth2-oidc-sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

improve `PhoneNumberAnalyzerTests#testTelPrefixSearch` (opensearch-project#17016)

this way we ensure that it doesn't include any additional tokens which
we don't want.

this is a follow-up to commit 4d94399 / opensearch-project#16993.

Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com>

Filter shards for sliced search at coordinator (opensearch-project#16771)

* Filter shards for sliced search at coordinator

Prior to this commit, a sliced search would fan out to every shard,
then apply a MatchNoDocsQuery filter on shards that don't correspond
to the current slice. This still creates a (useless) search context
on each shard for every slice, though. For a long-running sliced
scroll, this can quickly exhaust the number of available scroll
contexts.

This change avoids fanning out to all the shards by checking at the
coordinator if a shard is matched by the current slice. This should
reduce the number of open scroll contexts to max(numShards, numSlices)
instead of numShards * numSlices.
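
The arithmetic can be checked with a short sketch. The modulo-based mapping of slices to shards in `sliceMatchesShard` is an assumption made for illustration, and all names here are hypothetical rather than taken from the OpenSearch code.

```java
final class SlicedSearchContexts {
    /** Assumed slice-to-shard mapping (simple modulo); hypothetical helper. */
    static boolean sliceMatchesShard(int sliceId, int numSlices, int shardId, int numShards) {
        if (numSlices >= numShards) {
            // Slices are spread over shards: each slice targets exactly one shard.
            return sliceId % numShards == shardId;
        }
        // More shards than slices: each shard belongs to exactly one slice.
        return shardId % numSlices == sliceId;
    }

    /** Contexts opened when every slice fans out to every shard. */
    static int contextsWithFanOut(int numShards, int numSlices) {
        return numShards * numSlices;
    }

    /** Contexts opened when the coordinator skips shards the slice cannot match. */
    static int contextsWithCoordinatorFilter(int numShards, int numSlices) {
        int contexts = 0;
        for (int slice = 0; slice < numSlices; slice++) {
            for (int shard = 0; shard < numShards; shard++) {
                if (sliceMatchesShard(slice, numSlices, shard, numShards)) {
                    contexts++;
                }
            }
        }
        return contexts;
    }

    public static void main(String[] args) {
        System.out.println(contextsWithFanOut(5, 3));            // 15
        System.out.println(contextsWithCoordinatorFilter(5, 3)); // 5
        System.out.println(contextsWithFanOut(3, 8));            // 24
        System.out.println(contextsWithCoordinatorFilter(3, 8)); // 8
    }
}
```

Under this assumed mapping, the filtered count equals max(numShards, numSlices) in both cases.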

---------

Signed-off-by: Michael Froh <froh@amazon.com>

Upgrade HttpCore5/HttpClient5 to support ExtendedSocketOption in HttpAsyncClient (opensearch-project#16757)

* upgrade httpcore5/httpclient5 to support ExtendedSocketOption in HttpAsyncClient

Signed-off-by: kkewwei <kewei.11@bytedance.com>
Signed-off-by: kkewwei <kkewwei@163.com>

* Use the Upgrade flow by default

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Update Reactor Netty to 1.1.26.Final

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Add SETTING_H2C_MAX_CONTENT_LENGTH to configure h2cMaxContentLength for reactor-netty4 transport

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Update Apache HttpCore5 to 5.3.2

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

---------

Signed-off-by: kkewwei <kewei.11@bytedance.com>
Signed-off-by: kkewwei <kkewwei@163.com>
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Co-authored-by: Andriy Redko <andriy.redko@aiven.io>

Update version checks for backport (opensearch-project#17030)

Signed-off-by: Michael Froh <froh@amazon.com>
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Co-authored-by: Michael Froh <froh@amazon.com>

Fix versions and breaking API changes (opensearch-project#17031)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

Bump com.nimbusds:nimbus-jose-jwt from 9.47 to 10.0.1 in /test/fixtures/hdfs-fixture (opensearch-project#17011)

* Bump com.nimbusds:nimbus-jose-jwt in /test/fixtures/hdfs-fixture

Bumps [com.nimbusds:nimbus-jose-jwt](https://bitbucket.org/connect2id/nimbus-jose-jwt) from 9.47 to 10.0.1.
- [Changelog](https://bitbucket.org/connect2id/nimbus-jose-jwt/src/master/CHANGELOG.txt)
- [Commits](https://bitbucket.org/connect2id/nimbus-jose-jwt/branches/compare/10.0.1..9.47)

---
updated-dependencies:
- dependency-name: com.nimbusds:nimbus-jose-jwt
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Remove user data from logs when not in debug/trace mode (opensearch-project#17007)

* Remove user data from logs when not in debug/trace mode

Signed-off-by: Mohit Godwani <mgodwan@amazon.com>

Signed-off-by: meetvm <meetvm@amazon.com>
Labels
backport 2.x (Backport to 2.x branch), backport-failed, v2.19.0 (Issues and PRs related to version 2.19.0)
3 participants