RRF duplicate bug fix: remove replica number from hit ID check - 2.11 #952
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
Bug fix
What is the current behavior? (You can also link to an open issue here)
Duplicates in RRF search due to hits from different replicas having different vespa hit IDs.
What is the new behavior (if this is a feature change)?
Duplicates have been removed, only doc ID (last text after the slash) is used for comparing if tensor/lexical results are unique.
Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
No
Have unit tests been run against this PR? (Has there also been any additional testing?)
In progress
Related Python client changes (link commit/PR here)
Related documentation changes (link commit/PR here)
Other information:
Please check if the PR fulfills these requirements