Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new cluster setting for keyword indexordocvalues query #15637

Merged
merged 3 commits into from
Sep 4, 2024

Conversation

harshavamsi
Copy link
Contributor

Description

Adds a new cluster setting for MultiTermQueries in keyword fields and chooses query based on setting

Related Issues

Resolves #14755

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
Copy link
Contributor

github-actions bot commented Sep 4, 2024

❌ Gradle check result for 25d43d4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
@harshavamsi
Copy link
Contributor Author

@msfroh @reta Coming from #15012, I added a new cluster setting to enable/disable this optimization. But I don't think cluster settings are accessible from QueryShardContext so I had to hook it in from SearchService. Does this look right?

Copy link
Contributor

github-actions bot commented Sep 4, 2024

✅ Gradle check result for cc7e98d: SUCCESS

Copy link

codecov bot commented Sep 4, 2024

Codecov Report

Attention: Patch coverage is 86.20690% with 4 lines in your changes missing coverage. Please review.

Project coverage is 71.96%. Comparing base (e087272) to head (cc7e98d).
Report is 13 commits behind head on main.

Files with missing lines Patch % Lines
...rg/opensearch/index/mapper/KeywordFieldMapper.java 70.00% 1 Missing and 2 partials ⚠️
.../org/opensearch/search/internal/SearchContext.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #15637      +/-   ##
============================================
- Coverage     72.00%   71.96%   -0.05%     
- Complexity    64149    64176      +27     
============================================
  Files          5269     5269              
  Lines        299834   299894      +60     
  Branches      43330    43345      +15     
============================================
- Hits         215895   215811      -84     
- Misses        66217    66352     +135     
- Partials      17722    17731       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jainankitk
Copy link
Collaborator

jainankitk commented Sep 4, 2024

@harshavamsi - Can you provide some context for this change? Not completely clear from the linked issue as well

@msfroh msfroh merged commit d64baa6 into opensearch-project:main Sep 4, 2024
64 of 69 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-15637-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d64baa6808a14fa021b16972459257b43ac6b7da
# Push it to GitHub
git push --set-upstream origin backport/backport-15637-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-15637-to-2.x.

@msfroh
Copy link
Collaborator

msfroh commented Sep 4, 2024

@jainankitk -- most of the context comes up in the discussion on #15012

@harshavamsi
Copy link
Contributor Author

@harshavamsi - Can you provide some context for this change? Not completely clear from the linked issue as well

@jainankitk Users reported slowdowns when running terms queries against keyword fields. This was due to the introduction of the IndexOrDocValuesQuery. Underlying lucene cost estimation bug caused DocValuesQuery to be picked up in some cases, this reverts the original change but keeps the optimization under a cluster setting.

@jainankitk
Copy link
Collaborator

@harshavamsi - Can you provide some context for this change? Not completely clear from the linked issue as well

@jainankitk Users reported slowdowns when running terms queries against keyword fields. This was due to the introduction of the IndexOrDocValuesQuery. Underlying lucene cost estimation bug caused DocValuesQuery to be picked up in some cases, this reverts the original change but keeps the optimization under a cluster setting.

I am wondering if there cases where using DocValuesQuery is better than IndexOnlyQuery? If yes, aren't we concerned about regressing on the other side. Should the default setting behavior be IndexOrDocValuesQuery, and change to IndexOnlyQuery for workloads seeing regression from IndexOrDocValuesQuery?

harshavamsi added a commit to harshavamsi/OpenSearch that referenced this pull request Sep 4, 2024
…h-project#15637)

* Add new cluster setting for keyword indexordocvalues query

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

* Fix tests

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

---------

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
(cherry picked from commit d64baa6)
@msfroh
Copy link
Collaborator

msfroh commented Sep 4, 2024

#15012 (comment)

@msfroh
Copy link
Collaborator

msfroh commented Sep 4, 2024

This was all discussed on #15012

@jainankitk
Copy link
Collaborator

#15012 (comment)

Thanks for digging out the relevant discussion!

harshavamsi added a commit to harshavamsi/OpenSearch that referenced this pull request Sep 5, 2024
…h-project#15637)

* Add new cluster setting for keyword indexordocvalues query

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

* Fix tests

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

---------

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
(cherry picked from commit d64baa6)
harshavamsi added a commit to harshavamsi/OpenSearch that referenced this pull request Sep 5, 2024
…h-project#15637)

* Add new cluster setting for keyword indexordocvalues query

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

* Fix tests

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

---------

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
(cherry picked from commit d64baa6)
msfroh pushed a commit that referenced this pull request Sep 5, 2024
…uery (#15637) (#15703)


---------

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
msfroh pushed a commit that referenced this pull request Sep 5, 2024
…query (#15637) (#15751)



---------

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
akolarkunnu pushed a commit to akolarkunnu/OpenSearch that referenced this pull request Sep 10, 2024
…h-project#15637)

* Add new cluster setting for keyword indexordocvalues query

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

* Fix tests

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

---------

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 16, 2024
…h-project#15637)

* Add new cluster setting for keyword indexordocvalues query

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

* Fix tests

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

---------

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 17, 2024
…h-project#15637)

* Add new cluster setting for keyword indexordocvalues query

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

* Fix tests

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

---------

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 21, 2024
…h-project#15637)

* Add new cluster setting for keyword indexordocvalues query

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

* Fix tests

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>

---------

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed Search Search query, autocomplete ...etc v2.17.0 v3.0.0 Issues and PRs related to version 3.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Keyword field slowdown while using IndexOrDocValues
3 participants