Adding access to noSubMatches and noOverlappingMatches in Hyphenation… #13895
❌ Gradle check result for 5abc8ec: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for 7b2142e: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
@hasnain2808 - It seems the spotless check is failing. Can you fix it?
Done
❌ Gradle check result for 6a88bb0: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for 8752b76: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
I cannot merge even after approval 😢
#13895)

* Adding access to noSubMatches and noOverlappingMatches in HyphenationCompoundWordTokenFilter
  Signed-off-by: Evan Kielley <evankielley@gmail.com>
* Add Changelog Entry
  Signed-off-by: Mohammad Hasnain Mohsin Rajan <hasnain2808@gmail.com>
* test: add hyphenation decompounder tests
  Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
* test: refactor tests
  Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
* test: reformat test files
  Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
* chore: add changelog entry for 2.X
  Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
* chore: remove 3.x changelog
  Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
* chore: commonify settingsarr
  Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
* chore: commonify settingsarr
  Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
* chore: linting
  Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>

---------

Signed-off-by: Evan Kielley <evankielley@gmail.com>
Signed-off-by: Mohammad Hasnain Mohsin Rajan <hasnain2808@gmail.com>
Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
Co-authored-by: Evan Kielley <evankielley@gmail.com>
(cherry picked from commit ce64fac)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Merged! :)
#13895) (#15329)

* Adding access to noSubMatches and noOverlappingMatches in HyphenationCompoundWordTokenFilter
* Add Changelog Entry
* test: add hyphenation decompounder tests
* test: refactor tests
* test: reformat test files
* chore: add changelog entry for 2.X
* chore: remove 3.x changelog
* chore: commonify settingsarr
* chore: commonify settingsarr
* chore: linting

---------

(cherry picked from commit ce64fac)
Signed-off-by: Evan Kielley <evankielley@gmail.com>
Signed-off-by: Mohammad Hasnain Mohsin Rajan <hasnain2808@gmail.com>
Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Evan Kielley <evankielley@gmail.com>
Description
This change exposes two settings, noSubMatches and noOverlappingMatches, that were added to Lucene's HyphenationCompoundWordTokenFilter class.
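As a rough illustration of how the two flags might appear in a filter definition, the sketch below builds the settings as a plain Java map. It is not code from this PR. The keys `no_sub_matches` and `no_overlapping_matches` are assumed snake_case spellings of the Lucene parameter names, and the class name, patterns path, and word list are hypothetical placeholders.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a plain map stands in for the index settings of a
// hyphenation_decompounder token filter. Nothing here calls OpenSearch or Lucene.
class HyphenationSettingsSketch {

    // Builds a filter definition with the two flags this PR exposes.
    static Map<String, Object> filterSettings(boolean noSubMatches, boolean noOverlappingMatches) {
        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put("type", "hyphenation_decompounder");
        // Hypothetical path and word list, chosen only for the example.
        settings.put("hyphenation_patterns_path", "analysis/hyphenation_patterns.xml");
        settings.put("word_list", List.of("kaffee", "tasse"));
        // Assumed setting keys mirroring Lucene's noSubMatches / noOverlappingMatches.
        settings.put("no_sub_matches", noSubMatches);
        settings.put("no_overlapping_matches", noOverlappingMatches);
        return settings;
    }

    public static void main(String[] args) {
        // Prints the map so the resulting definition can be inspected.
        System.out.println(filterSettings(true, false));
    }
}
```

Both flags are passed explicitly so that a test can exercise each combination, in the spirit of the decompounder tests added in CompoundAnalysisTests.java.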
Related Issues
Resolves #8796
Based on #10765
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.