-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Support for Default Model Id #337
[Feature] Support for Default Model Id #337
Conversation
Signed-off-by: Varun Jain <varunudr@amazon.com>
Signed-off-by: Varun Jain <varunudr@amazon.com>
Signed-off-by: Varun Jain <varunudr@amazon.com>
Signed-off-by: Varun Jain <varunudr@amazon.com>
Signed-off-by: Varun Jain <varunudr@amazon.com>
Signed-off-by: Varun Jain <varunudr@amazon.com>
Codecov Report
@@ Coverage Diff @@
## main #337 +/- ##
============================================
+ Coverage 84.56% 84.62% +0.05%
- Complexity 427 445 +18
============================================
Files 35 38 +3
Lines 1289 1333 +44
Branches 189 199 +10
============================================
+ Hits 1090 1128 +38
- Misses 118 119 +1
- Partials 81 86 +5
|
@vibrantvarun remove the skip changelog tag and put the info in change log. |
@vibrantvarun fill the checklist too |
@vibrantvarun add documentation on all public classes and public functions |
src/main/java/org/opensearch/neuralsearch/query/visitor/NeuralSearchQueryVisitor.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/query/visitor/NeuralSearchQueryVisitor.java
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/util/NeuralSearchClusterUtil.java
Show resolved
Hide resolved
Signed-off-by: Varun Jain <varunudr@amazon.com>
Signed-off-by: Varun Jain <varunudr@amazon.com>
Signed-off-by: Varun Jain <varunudr@amazon.com>
@vibrantvarun can we fix the github action: codecov/patch and codecov/project seems like the coverage has dropped |
Signed-off-by: Varun Jain <varunudr@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few minor things, over looks good
|
||
public class NeuralQueryProcessorTests extends OpenSearchTestCase { | ||
|
||
public void testFactory() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use naming convention from other tests - whatYouTesting_conditions_expectation, I think may be optional sometimes, when it's obvious, but other two are must
assertEquals("vasdcvkcjkbldbjkd", processor.getModelId()); | ||
assertEquals("bahbkcdkacb", processor.getNeuralFieldDefaultIdMap().get("fieldName").toString()); | ||
|
||
// Missing "query" parameter: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check exception
expectThrows(IllegalArgumentException.class, () -> factory.create(Collections.emptyMap(), null, null, false, configMap, null)); | ||
} | ||
|
||
public void testProcessRequest() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can use @SneakyThrow for exceptions in tests
*/ | ||
@Setter | ||
@Getter | ||
public class NeuralQueryProcessor extends AbstractProcessor implements SearchRequestProcessor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename class to something closer to a processor type name, although I left the final call to you
Signed-off-by: Varun Jain <varunudr@amazon.com>
/** | ||
* Key to reference this processor type from a search pipeline. | ||
*/ | ||
public static final String TYPE = "enriching_query_defaults"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
neural_query_enricher name seems better here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martin-gaievski I am finally going with neural_query_enricher
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, that sounds reasonable, please update the class with that new name
) throws IllegalArgumentException { | ||
String modelId; | ||
try { | ||
modelId = (String) config.remove(DEFAULT_MODEL_ID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why config.remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took reference from OS core
) throws IllegalArgumentException { | ||
String modelId; | ||
try { | ||
modelId = (String) config.remove(DEFAULT_MODEL_ID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a better way to get the string. Check this: https://github.com/opensearch-project/neural-search/blob/main/src/main/java/org/opensearch/neuralsearch/processor/factory/TextEmbeddingProcessorFactory.java#L40C9-L40C89
@@ -80,6 +80,7 @@ public Collection<Object> createComponents( | |||
final IndexNameExpressionResolver indexNameExpressionResolver, | |||
final Supplier<RepositoriesService> repositoriesServiceSupplier | |||
) { | |||
NeuralSearchClusterUtil.instance().initialize(clusterService); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should start getting rid of these kind of inits and move towards singleton pattern without this kind of inits.
May be an AI for maintainers
src/test/java/org/opensearch/neuralsearch/processor/EnrichingQueryDefaultProcessorIT.java
Outdated
Show resolved
Hide resolved
}, | ||
"passage_embedding": { | ||
"type": "knn_vector", | ||
"dimension": 768 | ||
}, | ||
"passage_text": { | ||
"type": "text" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we need these new fields? why we cannot use older fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because they don't have text matching
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need text matching?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i wrote a test case where I am passing querytext as hello world. If I don't add this field here then it will fail
Signed-off-by: Varun Jain <varunudr@amazon.com>
Signed-off-by: Varun Jain <varunudr@amazon.com>
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-337-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9c37b0e9dbaac0c4607385c728196f36164f705d
# Push it to GitHub
git push --set-upstream origin backport/backport-337-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x Then, create a pull request where the |
* Support for default Model Id Signed-off-by: Varun Jain <varunudr@amazon.com> * Support for Default Model id Signed-off-by: Varun Jain <varunudr@amazon.com> * Support for default model Id Signed-off-by: Varun Jain <varunudr@amazon.com> * Removing wildcard Imports Signed-off-by: Varun Jain <varunudr@amazon.com> * Typo fix Signed-off-by: Varun Jain <varunudr@amazon.com> * Integ test cases Signed-off-by: Varun Jain <varunudr@amazon.com> * Fixing Integ Test case Signed-off-by: Varun Jain <varunudr@amazon.com> * Addressing Comments Signed-off-by: Varun Jain <varunudr@amazon.com> * Added Visitor test cases and addressed comments Signed-off-by: Varun Jain <varunudr@amazon.com> * Comments Addressed of Jack Signed-off-by: Varun Jain <varunudr@amazon.com> * Addressed changes requested by Martin Signed-off-by: Varun Jain <varunudr@amazon.com> * Addressed changes requested by Martin Signed-off-by: Varun Jain <varunudr@amazon.com> * Fixing test cases Signed-off-by: Varun Jain <varunudr@amazon.com> * Increasing test coverage Signed-off-by: Varun Jain <varunudr@amazon.com> * Renaming and addressing comments of Martin Signed-off-by: Varun Jain <varunudr@amazon.com> * Addressing Comments of Navneet Signed-off-by: Varun Jain <varunudr@amazon.com> * Updating tests Signed-off-by: Varun Jain <varunudr@amazon.com> --------- Signed-off-by: Varun Jain <varunudr@amazon.com>
Signed-off-by: Varun Jain <varunudr@amazon.com>
* [Feature] Support for Default Model Id (#337) Signed-off-by: Varun Jain <varunudr@amazon.com>
}, | ||
"passage_embedding": { | ||
"type": "knn_vector", | ||
"dimension": 768 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this need to have a lucene
as engine, otherwise it uses default which is nmslib
and that fails, as nmslib is not part of the knn zip fopr integ tests @vibrantvarun
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this field used by the way?
Description
This PR provides a functionality to client to skip passing model id in the neural search query request.
Issues Resolved
70
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.