Add vector search workload with no train procedure as default #144
This PR contains the benchmark workload that was previously added to the knn repository. It includes only the indexing and search components. Other features, such as training, will be added in a subsequent PR.
@rishabh6788 To run this workload, we depend on libraries such as numpy and h5py. Should these dependencies be added to this workload or to the opensearch-benchmark repository? Adding them to this repository is fine, provided that the requirements are also installed when this repository is checked out.
When using default num of segments
LGTM!
Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>
@VijayanB I understand now. To clarify this, could you add a section in the README stating that the files |
@IanHoang Are there any pending comments that need to be addressed?
@gkamat Can you take a look at this PR? Thanks.
vectorsearch/README.md
Currently, we support one test procedures for the vector search workload:
no-train-test that does not have steps to train a model included in the
support only one test procedure for the vector search workload. This is named no-train-test
and does not include the steps required to train the model being used.
Please indicate how the training steps are supposed to be carried out. Or if the expectation is that the workload is to be run on an untrained system, please clarify.
@gkamat I will update the text and, in the future, add a new procedure that can use a model. Do you recommend mentioning this future work in the README?
Yes, that would be ideal. Subsequently, you can update the writeup when the new procedure gets added.
Please confirm this is intended for backport to both the 1 and 2 branches.
Please confirm the backport labels are set correctly before merging. Thanks.
@gkamat Yes, this is supported for both OpenSearch 1.x and 2.x.
* Add knnvector as new workload
  Create new workload to benchmark performance of knn_vector field type. Added unit test and procedure for notrain.
* Update README
  Update readme to include how to execute this workload.
* Add new param file for faiss engine
  Added new param file to index/search vector search using faiss as engine type.
* Rename knnvector to vectorsearch
* Add lucene engine
* Fix code review comments

Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>
(cherry picked from commit bdbd4bb)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
Add vector search workload to benchmark performance of indexing and search using knn_vector as field type.
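For readers unfamiliar with the field type, here is a rough sketch of what an index body and a search body for a knn_vector field can look like. It is not taken from the workload files in this PR: the helper names, the field name `target_field`, the dimension, and the defaults (`hnsw`, `l2`) are illustrative assumptions. Only the engine names (faiss, lucene) come from the commits above.

```python
import json


def build_knn_index_body(field_name, dimension, engine="faiss", space_type="l2"):
    """Build an index-creation body with a single knn_vector field.

    Hypothetical helper for illustration; the workload's own index
    definition may use different settings and parameters.
    """
    return {
        # Enable k-NN on the index so the knn_vector field is searchable.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field_name: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",          # assumed algorithm
                        "space_type": space_type,  # assumed distance metric
                        "engine": engine,        # e.g. "faiss" or "lucene"
                    },
                }
            }
        },
    }


def build_knn_query(field_name, vector, k):
    """Build a k-NN search body that returns the k nearest neighbours."""
    return {
        "size": k,
        "query": {"knn": {field_name: {"vector": list(vector), "k": k}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_knn_index_body("target_field", 128), indent=2))
    print(json.dumps(build_knn_query("target_field", [0.1] * 128, 10), indent=2))
```

Passing `engine="lucene"` produces the same structure with the other engine mentioned in the commits. Whether a given method, space type, and engine combination is accepted depends on the OpenSearch version in use.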
Issues Resolved
#140
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.