-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add batch_size
param for text_embedding
processor
#1298
Merged
Xtansia
merged 8 commits into
opensearch-project:main
from
YeonghyeonKO:feat/batchSize-in-TextEmbeddingProcessor
Nov 18, 2024
Merged
Add batch_size
param for text_embedding
processor
#1298
Xtansia
merged 8 commits into
opensearch-project:main
from
YeonghyeonKO:feat/batchSize-in-TextEmbeddingProcessor
Nov 18, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
YeonghyeonKO
requested review from
reta,
Bukhtawar,
dblock,
szczepanczykd,
madhusudhankonda,
saratvemulapalli,
VachaShah and
Xtansia
as code owners
November 17, 2024 03:45
Xtansia
reviewed
Nov 18, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution @YeonghyeonKO!
Just a few things to address:
- Please add an entry to the CHANGELOG under the
### Added
section below## [Unreleased 2.x]
- Please sign off your commits to pass DCO, see the info here on how to do this: https://github.com/opensearch-project/opensearch-java/pull/1298/checks?check_run_id=33093178665
- Please run
./gradlew spotlessApply
to fix code formatting.
...client/src/test/java/org/opensearch/client/opensearch/ingest/TextEmbeddingProcessorTest.java
Outdated
Show resolved
Hide resolved
...client/src/test/java/org/opensearch/client/opensearch/ingest/TextEmbeddingProcessorTest.java
Outdated
Show resolved
Hide resolved
Signed-off-by: YeonghyeonKO <dk02315@gmail.com>
Signed-off-by: YeonghyeonKO <dk02315@gmail.com>
Signed-off-by: YeonghyeonKO <dk02315@gmail.com>
Signed-off-by: YeonghyeonKO <dk02315@gmail.com>
YeonghyeonKO
force-pushed
the
feat/batchSize-in-TextEmbeddingProcessor
branch
from
November 18, 2024 12:55
90b81ab
to
7f2aaa3
Compare
Signed-off-by: YeonghyeonKO <dk02315@gmail.com>
Signed-off-by: YeonghyeonKO <dk02315@gmail.com>
Signed-off-by: YeonghyeonKO <dk02315@gmail.com>
Signed-off-by: Thomas Farr <tsfarr@amazon.com>
Xtansia
approved these changes
Nov 18, 2024
opensearch-trigger-bot bot
pushed a commit
that referenced
this pull request
Nov 18, 2024
* Add batchSize parameter for text_embedding processor Signed-off-by: YeonghyeonKO <dk02315@gmail.com> * throw IllegalArgumentException when batchSize is not a positive integer Signed-off-by: YeonghyeonKO <dk02315@gmail.com> * test: add test cases for BatchSize param Signed-off-by: YeonghyeonKO <dk02315@gmail.com> * test: exception when batchSize is zero or negative integer Signed-off-by: YeonghyeonKO <dk02315@gmail.com> * refactor: use assertNotNull for readability & convention Signed-off-by: YeonghyeonKO <dk02315@gmail.com> * update CHANGELOG about #1298 PR Signed-off-by: YeonghyeonKO <dk02315@gmail.com> * apply code convention to keep the codes spotless Signed-off-by: YeonghyeonKO <dk02315@gmail.com> --------- Signed-off-by: YeonghyeonKO <dk02315@gmail.com> Signed-off-by: Thomas Farr <tsfarr@amazon.com> Co-authored-by: Thomas Farr <tsfarr@amazon.com> (cherry picked from commit 6c3e68f) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Xtansia
added a commit
that referenced
this pull request
Nov 18, 2024
* Add batchSize parameter for text_embedding processor * throw IllegalArgumentException when batchSize is not a positive integer * test: add test cases for BatchSize param * test: exception when batchSize is zero or negative integer * refactor: use assertNotNull for readability & convention * update CHANGELOG about #1298 PR * apply code convention to keep the codes spotless --------- (cherry picked from commit 6c3e68f) Signed-off-by: YeonghyeonKO <dk02315@gmail.com> Signed-off-by: Thomas Farr <tsfarr@amazon.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Thomas Farr <tsfarr@amazon.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Unlike the document from text_embedding processor in ingest-pipelines describes
batch_size
parameter, interface in opensearch-java client doesn't include it. From OpenSearch 2.16, it's possible to add batch inference support in ingest processors which inherits from AbstractBatchingProcessor(opensearch-project/neural-search#820). From now on, OpenSearch java client doesn't support batch_size parameter(optional) when defining text_embedding processor.Since @miguel-vila's contribution by adding TextEmbeddingProcessor has been merged, there was another big change in opensearch-project/neural-search(opensearch-project/neural-search#820). In line with this change, I attempted to modify the code of the text_embedding processor of the opensearch-java client.
Issues Resolved
This PR is related to #1297
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.