Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Fix PostingsFormat in KNN Codec #236

Merged

Conversation

vamshin
Copy link
Member

@vamshin vamshin commented Sep 24, 2020

Issue #, if available:
#227

Description of changes:
Elasticsearch initiates PostingsFormat based on the MapperService. For KNNCodec, we do not pass MapperService and the PostingsFormat deviates from underlying format for the current Lucene version. Because of this some of the terms that use PostingsFormat could have incompatibility. For example #227.
Updated KNNCodec to rely on the right PostingsFormat used by underlying Codec of Elasticsearch.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@vamshin vamshin requested a review from jmazanec15 September 24, 2020 07:11
@codecov
Copy link

codecov bot commented Sep 24, 2020

Codecov Report

❗ No coverage uploaded for pull request base (master@76bdfaa). Click here to learn what that means.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #236   +/-   ##
=========================================
  Coverage          ?   78.77%           
  Complexity        ?      339           
=========================================
  Files             ?       53           
  Lines             ?     1343           
  Branches          ?      122           
=========================================
  Hits              ?     1058           
  Misses            ?      232           
  Partials          ?       53           
Impacted Files Coverage Δ Complexity Δ
...csearch/knn/index/codec/KNN86Codec/KNN86Codec.java 96.00% <100.00%> (ø) 15.00 <3.00> (?)
...roforelasticsearch/knn/plugin/KNNCodecService.java 75.00% <100.00%> (ø) 3.00 <1.00> (?)
...oforelasticsearch/knn/plugin/KNNEngineFactory.java 100.00% <100.00%> (ø) 3.00 <2.00> (?)
...elasticsearch/knn/plugin/stats/KNNStatsConfig.java 95.65% <0.00%> (ø) 1.00% <0.00%> (?%)
...search/knn/plugin/script/KNNVectorScoreScript.java 77.33% <0.00%> (ø) 6.00% <0.00%> (?%)
...forelasticsearch/knn/index/util/NmsLibVersion.java 100.00% <0.00%> (ø) 2.00% <0.00%> (?%)
...oforelasticsearch/knn/index/KNNCircuitBreaker.java 58.97% <0.00%> (ø) 6.00% <0.00%> (?%)
...forelasticsearch/knn/index/codec/KNNCodecUtil.java 83.33% <0.00%> (ø) 2.00% <0.00%> (?%)
...roforelasticsearch/knn/plugin/stats/StatNames.java 85.18% <0.00%> (ø) 3.00% <0.00%> (?%)
...csearch/knn/index/codec/KNN84Codec/KNN84Codec.java 30.00% <0.00%> (ø) 2.00% <0.00%> (?%)
... and 46 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 76bdfaa...8bd8382. Read the comment docs.

Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add test case confirming the fix works? Other than the suggests field, what other functionality does this bug break?

"Elasticsearch initiates PostingsFormat based on the MapperService. For KNNCodec, we do not pass MapperService"

Do you have code link for where Elasticsearch initiates PostingsFormat based on MapperService? Also, why can't we pass MapperService to KNNCodec?

@@ -82,9 +82,17 @@ public DocValuesFormat docValuesFormat() {
* approach of manually overriding.
*/

PostingsFormat postingsFormat = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why declare is here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -45,4 +47,8 @@ public Codec codec(String name) {
}
return codec;
}

public void setPostingsFormat(PostingsFormat postingsFormat) {
((KNN86Codec)codec("")).setPostingsFormat(postingsFormat);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this string empty?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codec function does not use this String. We could pass any string here. Idea is to get the KNNCodec from this function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense

@vamshin vamshin requested a review from jmazanec15 October 8, 2020 20:41
@vamshin
Copy link
Member Author

vamshin commented Oct 14, 2020

Could we add test case confirming the fix works? Other than the suggests field, what other functionality does this bug break?

"Elasticsearch initiates PostingsFormat based on the MapperService. For KNNCodec, we do not pass MapperService"

Do you have code link for where Elasticsearch initiates PostingsFormat based on MapperService? Also, why can't we pass MapperService to KNNCodec?

  1. Manually verified the fix. I will add test case part of this PR Add test case for PostingsFormat fix in KNNCodec #245

  2. Sure. This code is part of CodecService. Please refer this link https://github.com/elastic/elasticsearch/blob/v7.9.1/server/src/main/java/org/elasticsearch/index/codec/CodecService.java#L49-L56

Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Bug Fixes Change that fixes a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants