-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reduce k value to result count #219
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change results in a failing unit test: https://github.com/marqo-ai/marqo/actions/runs/3671015792/jobs/6205945452#step:8:3873
This unit test is regarding protections for when result_count gets too big (over 10k).
This hits the k limit of Marqo-os.
Suggestion:
- Correct the test (perhaps by mocking Marqo-os via mocking a GET request)
- Set the default for this limit to the Marqo-os limit (for the best user experience):
marqo/src/marqo/tensor_search/configs.py
Line 35 in bb54a78
EnvVars.MARQO_MAX_RETRIEVABLE_DOCS: None, |
Also, what are the implications for a large result_count (result_count > 500)? For small Ks this change would result in a speed-up, but latency is probably degraded for these large cases
Default max docs set to 10k in line with marqo-os This is because Marqo-os requires k>= 0.
Tested for missing inner hits. None retrieved |
Ran test suite, all passed |
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
we reduce k to equal the result count, which vastly improves search latency. Since we have moved to lucene we no longer see the error with inner hits not being returned
What is the current behavior? (You can also link to an open issue here)
Search latency is very high due to a high k - this is a problem since we moved from NMSLIB to lucene based search.
What is the new behavior (if this is a feature change)?
Reduces search lateny on large indexes by approx 20x.
Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
no
Have unit tests been run against this PR? (Has there also been any additional testing?)
Related Python client changes (link commit/PR here)
Related documentation changes (link commit/PR here)
Other information:
Please check if the PR fulfills these requirements