Added Score Normalization and Combination feature #241

Merged

@martin-gaievski martin-gaievski commented Aug 3, 2023

Description

Adds the Score Normalization and Combination feature, which includes the Hybrid Query and the Normalization processor for search results. This is a merge PR that includes the following PRs from the feature branch:

Issues Resolved

#123

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.

* Add main classes for Query along with basic unit tests

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
* Add integ and unit test for query

Signed-off-by: Martin Gaievski <gaievski@amazon.com>

---------

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
…ect#198)

* Add doc collector

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
* Add query phase searcher and basic tests

Signed-off-by: Martin Gaievski <gaievski@amazon.com>

---------

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
* Adding hybrid_search_enabled settings

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
…search-project#227)

* Adding search processor for score normalization and combination

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
* Adding weights param for combination technique

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
* Adding L2 norm technique

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
* Add harmonic mean combination

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
* Add geometric mean normalization for scores

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
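The commits above name several techniques (min-max and L2 normalization; weighted arithmetic, harmonic, and geometric mean combination). The following is a minimal sketch of the textbook formulas behind those names, not the code merged in this PR. The class name `ScoreMathSketch`, all method names, and the handling of edge cases (identical scores, zero or negative scores, zero total weight) are assumptions made for illustration.

```java
import java.util.Arrays;

/** Illustrative sketch only; names are hypothetical and not taken from the plugin. */
public final class ScoreMathSketch {

    private ScoreMathSketch() {}

    /** Min-max normalization: (s - min) / (max - min), which maps scores into [0, 1]. */
    public static double[] minMax(double[] scores) {
        double min = Arrays.stream(scores).min().orElse(0.0);
        double max = Arrays.stream(scores).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            // Assumption: if every score is identical, report 1.0 for each.
            out[i] = range == 0.0 ? 1.0 : (scores[i] - min) / range;
        }
        return out;
    }

    /** L2 normalization: s / sqrt(sum of squared scores). */
    public static double[] l2(double[] scores) {
        double sumOfSquares = 0.0;
        for (double s : scores) {
            sumOfSquares += s * s;
        }
        double norm = Math.sqrt(sumOfSquares);
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            // Assumption: an all-zero input stays all zero.
            out[i] = norm == 0.0 ? 0.0 : scores[i] / norm;
        }
        return out;
    }

    /** Weighted arithmetic mean: sum(w * s) / sum(w). */
    public static double arithmeticMean(double[] scores, double[] weights) {
        double weighted = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < scores.length; i++) {
            weighted += weights[i] * scores[i];
            totalWeight += weights[i];
        }
        return totalWeight == 0.0 ? 0.0 : weighted / totalWeight;
    }

    /** Weighted harmonic mean: sum(w) / sum(w / s). */
    public static double harmonicMean(double[] scores, double[] weights) {
        double totalWeight = 0.0;
        double weightedReciprocals = 0.0;
        for (int i = 0; i < scores.length; i++) {
            if (scores[i] <= 0.0) {
                continue; // Assumption: non-positive scores are skipped.
            }
            totalWeight += weights[i];
            weightedReciprocals += weights[i] / scores[i];
        }
        return weightedReciprocals == 0.0 ? 0.0 : totalWeight / weightedReciprocals;
    }

    /** Weighted geometric mean: exp(sum(w * ln s) / sum(w)). */
    public static double geometricMean(double[] scores, double[] weights) {
        double totalWeight = 0.0;
        double weightedLogs = 0.0;
        for (int i = 0; i < scores.length; i++) {
            if (scores[i] <= 0.0) {
                continue; // Assumption: non-positive scores are skipped.
            }
            totalWeight += weights[i];
            weightedLogs += weights[i] * Math.log(scores[i]);
        }
        return totalWeight == 0.0 ? 0.0 : Math.exp(weightedLogs / totalWeight);
    }

    public static void main(String[] args) {
        // Made-up raw scores from one sub-query, normalized before combination.
        double[] normalized = minMax(new double[] {12.0, 7.5, 3.0});
        System.out.println(Arrays.toString(normalized));

        // Combine one document's normalized scores from two sub-queries.
        double[] perQueryScores = {normalized[1], 0.8};
        double[] weights = {0.3, 0.7};
        System.out.println(arithmeticMean(perQueryScores, weights));
        System.out.println(harmonicMean(perQueryScores, weights));
        System.out.println(geometricMean(perQueryScores, weights));
    }
}
```

Normalization runs per sub-query so that scores on different scales become comparable; combination then merges one document's normalized scores into a single value, with the weights controlling how much each sub-query contributes.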
@martin-gaievski martin-gaievski changed the title Added Hybrid Search feature Added Score Normalization and Combination feature Aug 3, 2023
@martin-gaievski martin-gaievski added Features Introduces a new unit of functionality that satisfies a requirement v2.10.0 Issues targeting release v2.10.0 backport 2.x Label will add auto workflow to backport PR to 2.x branch labels Aug 3, 2023
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
codecov bot commented Aug 3, 2023

Codecov Report

Merging #241 (2a38964) into main (2ff58a9) will decrease coverage by 3.32%.
The diff coverage is 83.95%.

@@             Coverage Diff              @@
##               main     #241      +/-   ##
============================================
- Coverage     89.55%   86.23%   -3.32%     
- Complexity      103      337     +234     
============================================
  Files             7       28      +21     
  Lines           316      981     +665     
  Branches         52      153     +101     
============================================
+ Hits            283      846     +563     
- Misses           16       69      +53     
- Partials         17       66      +49     
Files Changed                                         Coverage Δ
...nsearch/neuralsearch/query/HybridQueryBuilder.java 72.64% <72.64%> (ø)
...lsearch/search/query/HybridQueryPhaseSearcher.java 73.58% <73.58%> (ø)
...ensearch/neuralsearch/query/HybridQueryWeight.java 76.00% <76.00%> (ø)
...euralsearch/search/HybridTopScoreDocCollector.java 79.62% <79.62%> (ø)
...earch/processor/normalization/ScoreNormalizer.java 80.00% <80.00%> (ø)
...ensearch/neuralsearch/query/HybridQueryScorer.java 81.25% <81.25%> (ø)
...neuralsearch/processor/NormalizationProcessor.java 83.33% <83.33%> (ø)
...r/normalization/L2ScoreNormalizationTechnique.java 83.33% <83.33%> (ø)
...rmalization/MinMaxScoreNormalizationTechnique.java 84.61% <84.61%> (ø)
...arch/processor/NormalizationProcessorWorkflow.java 85.00% <85.00%> (ø)
... and 12 more

... and 1 file with indirect coverage changes

@martin-gaievski martin-gaievski merged commit 61e6e98 into opensearch-project:main Aug 3, 2023
@opensearch-trigger-bot

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-241-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 61e6e98bb6377d95d1043e1c4adcc0d4309b99ac
# Push it to GitHub
git push --set-upstream origin backport/backport-241-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-241-to-2.x.

martin-gaievski added a commit that referenced this pull request Aug 3, 2023
* Added Score Normalization and Combination feature

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
(cherry picked from commit 61e6e98)
martin-gaievski added a commit to martin-gaievski/neural-search that referenced this pull request Aug 25, 2023
…#241)

* Added Score Normalization and Combination feature

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
(cherry picked from commit 61e6e98)
martin-gaievski added a commit that referenced this pull request Aug 25, 2023
…ual backport (#263)

* Added Score Normalization and Combination feature (#241)

Signed-off-by: Martin Gaievski <gaievski@amazon.com>