Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor knn type and codecs #439

Conversation

martin-gaievski
Copy link
Member

@martin-gaievski martin-gaievski commented Jul 12, 2022

Signed-off-by: Martin Gaievski gaievski@amazon.com

Description

Refactor few things in plugin in order to open possibilities for future extensions required for #39

Issues Resolved

building block for #39

Check List

  • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@martin-gaievski martin-gaievski added Refactoring Improve the design, structure, and implementation while preserving its functionality 2.2.0 labels Jul 12, 2022
@martin-gaievski martin-gaievski self-assigned this Jul 12, 2022
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
@martin-gaievski martin-gaievski force-pushed the refactor-knn-method-context branch from 5e7cfff to 053bd6e Compare July 12, 2022 04:59
@codecov-commenter
Copy link

codecov-commenter commented Jul 12, 2022

Codecov Report

Merging #439 (34119de) into main (03f0049) will decrease coverage by 0.04%.
The diff coverage is 100.00%.

❗ Current head 34119de differs from pull request most recent head 4d4e813. Consider uploading reports for the commit 4d4e813 to get more accurate results

@@             Coverage Diff              @@
##               main     #439      +/-   ##
============================================
- Coverage     84.02%   83.97%   -0.05%     
+ Complexity      911      902       -9     
============================================
  Files           130      131       +1     
  Lines          3880     3863      -17     
  Branches        359      359              
============================================
- Hits           3260     3244      -16     
+ Misses          458      457       -1     
  Partials        162      162              
Impacted Files Coverage Δ
.../main/java/org/opensearch/knn/index/KNNMethod.java 100.00% <ø> (ø)
...ava/org/opensearch/knn/index/KNNMethodContext.java 92.50% <ø> (-0.76%) ⬇️
...java/org/opensearch/knn/index/MethodComponent.java 88.88% <ø> (-0.25%) ⬇️
...g/opensearch/knn/index/MethodComponentContext.java 91.13% <ø> (-0.63%) ⬇️
...org/opensearch/knn/index/KNNVectorFieldMapper.java 83.26% <100.00%> (+0.07%) ⬆️
...rg/opensearch/knn/index/codec/KNNCodecFactory.java 78.57% <100.00%> (+7.14%) ⬆️
...rg/opensearch/knn/index/codec/KNNCodecService.java 100.00% <100.00%> (ø)
.../opensearch/knn/index/codec/util/CodecBuilder.java 100.00% <100.00%> (ø)
.../java/org/opensearch/knn/training/TrainingJob.java 87.50% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 03f0049...4d4e813. Read the comment docs.

Comment on lines 34 to 35
@Getter private final MethodComponent methodComponent;
@Getter private final Set<SpaceType> spaces;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this on the top of the class

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Comment on lines 59 to 63
@Getter
private final KNNEngine knnEngine;
@Getter
private final SpaceType spaceType;
@Getter
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Comment on lines +39 to +44
private final MapperService mapperService;

@Override
public Codec build() {
return new KNN910Codec(userCodec);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what is the use of the parameter MapperService here? Are we expecting when the dense_vector field we come we will use the mapper service to find out what exact codec to use?

Please improve the documentation with what we want achieve here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR I'm making it available for codec and codec factory. In next PR I'm going to change the codec and extend knnVector format through the PerFieldMappingPostingFormatCodec mechanism, similar to what we did for core in dense_vector PR
https://github.com/opensearch-project/OpenSearch/pull/3659/files#r915126068. Let me update the description to reflect exact purpose of mapping service param

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated PR description

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just make sure that you are providing enough documentation and reason behind the changes in the Java doc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@martin-gaievski martin-gaievski force-pushed the refactor-knn-method-context branch 2 times, most recently from 5487268 to 34119de Compare July 12, 2022 17:20
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
@martin-gaievski martin-gaievski force-pushed the refactor-knn-method-context branch from 34119de to 4d4e813 Compare July 12, 2022 17:26
@martin-gaievski martin-gaievski marked this pull request as ready for review July 12, 2022 17:29
@martin-gaievski martin-gaievski requested a review from a team July 12, 2022 17:29
int dimension;
String modelId;
KNNMethodContext knnMethodContext;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will changing this impact BWC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't, the only case I can think of is if do older clusters after migration this field is null. I can add extra checks in next PR when we add lucene engine. Can we run any specific bwc test to make sure there are no side-effects?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think it will be a problem. When mapping reaches new node, the parser will parse it. Since we arent changing the parser, it should be okay. The MappedFieldType is not serialized, so it will be okay.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, exactly

@martin-gaievski martin-gaievski merged commit 2fc09ba into opensearch-project:main Jul 12, 2022
martin-gaievski added a commit to martin-gaievski/k-NN that referenced this pull request Jul 12, 2022
* Refactor knn type and codecs

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
martin-gaievski added a commit that referenced this pull request Jul 13, 2022
* Refactor knn type and codecs

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-439-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 2fc09ba7a7fd8f2cf38678ee38058530994a8b12
# Push it to GitHub
git push --set-upstream origin backport/backport-439-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-439-to-2.x.

@heemin32 heemin32 added v2.2.0 and removed 2.2.0 labels Nov 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Refactoring Improve the design, structure, and implementation while preserving its functionality v2.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants