Expose phrase-prefix queries via the built-in query parser #2044
@adamreichold is this something that you need?
I would say that it is something that I want: I have an internal UI for testing purposes which just uses the built-in query parser. Hence, if we want to try out some functionality before we commit to integrating it, it is easiest to expose it in the built-in query parser so the data people can play with it. I also thought that it is a rather straightforward extension of the existing support for phrase queries and slop that does not really introduce anything new conceptually. So the maintenance effort should be limited, making this a reasonable trade-off?
Codecov Report
@@ Coverage Diff @@
## main #2044 +/- ##
=======================================
Coverage 94.43% 94.43%
=======================================
Files 319 319
Lines 59689 59750 +61
=======================================
+ Hits 56365 56425 +60
- Misses 3324 3325 +1
@adamreichold that makes sense... Let's see if we can slip that into the next tantivy release.
@PSeitz can you take this review?
@adamreichold Can you add a test with data? What happens for
I believe those would not be added to the query parser tests but rather to the implementation of phrase-prefix weight (as with the other query types). But that weight already has tests using an index with data. Is there anything specific missing from those tests that should be added?
Nice catch! I added an explicit error for that case as phrase-prefix queries only really make sense with at least two terms (in contrast to phrase queries which sensibly "downgrade" to term queries).
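To illustrate the "at least two terms" rule, here is a minimal standalone sketch of splitting the proposed `field:"phrase ter"*` syntax into its phrase terms and trailing prefix. It is not tantivy's grammar or the code from this PR: `parse_phrase_prefix`, `PhrasePrefix` and `ParseError` are hypothetical names, and the real parser handles escaping, default fields and tokenization that this toy ignores.

```rust
// Hypothetical toy parser, NOT tantivy's query grammar.
#[derive(Debug, PartialEq)]
enum ParseError {
    /// Input is not of the form field:"..."*
    NotPhrasePrefix,
    /// Fewer than two terms inside the quotes.
    TooFewTerms,
}

#[derive(Debug, PartialEq)]
struct PhrasePrefix {
    field: String,
    /// Terms that must match exactly, in order.
    phrase: Vec<String>,
    /// Last term, treated as a prefix.
    prefix: String,
}

fn parse_phrase_prefix(input: &str) -> Result<PhrasePrefix, ParseError> {
    // Split `field` from the quoted part.
    let (field, rest) = input.split_once(':').ok_or(ParseError::NotPhrasePrefix)?;
    // Require the closing quote followed by `*`, then the opening quote.
    let quoted = rest
        .strip_suffix("\"*")
        .and_then(|r| r.strip_prefix('"'))
        .ok_or(ParseError::NotPhrasePrefix)?;
    let mut terms: Vec<String> = quoted.split_whitespace().map(str::to_owned).collect();
    // A single term leaves no phrase for the prefix to follow, so reject it
    // explicitly instead of silently downgrading to another query type.
    if terms.len() < 2 {
        return Err(ParseError::TooFewTerms);
    }
    let prefix = terms.pop().expect("checked above: at least two terms");
    Ok(PhrasePrefix { field: field.to_string(), phrase: terms, prefix })
}

fn main() {
    println!("{:?}", parse_phrase_prefix("field:\"phrase ter\"*"));
    println!("{:?}", parse_phrase_prefix("field:\"ter\"*"));
}
```

The first call yields a `PhrasePrefix` with `phrase` as the only exact term and `ter` as the prefix; the second is rejected with `TooFewTerms`.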
@PSeitz Friendly ping on the above question.
I was thinking of an end-to-end test that makes sure everything is correctly connected. It's also a form of documentation and helps with discovering features.
This proposes the less-than-imaginative syntax `field:"phrase ter"*` to perform a phrase prefix query against `field` using `phrase` and `ter` as the terms. The aim of this is to make this type of query more discoverable and simplify manual testing. I did consider exposing the `max_expansions` parameter similar to how slop is handled, but I think that this is rather something that should be configured via the query parser (similar to `set_field_boost` and `set_field_fuzzy`), as choosing it requires rather intimate knowledge of the backing index.
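For readers unfamiliar with the query type, the matching semantics can be sketched in a few lines of plain Rust. This is a toy model over an already tokenized document, not tantivy's phrase-prefix weight: `matches_phrase_prefix` is a hypothetical helper, and it ignores slop, `max_expansions` and the inverted index entirely.

```rust
/// Toy model (hypothetical helper, not tantivy code): returns true if
/// `tokens` contains all `phrase` terms consecutively, immediately followed
/// by a token that starts with `prefix`.
fn matches_phrase_prefix(tokens: &[&str], phrase: &[&str], prefix: &str) -> bool {
    // Slide a window covering the exact terms plus the prefixed term.
    tokens.windows(phrase.len() + 1).any(|w| {
        w[..phrase.len()] == *phrase && w[phrase.len()].starts_with(prefix)
    })
}

fn main() {
    let doc: Vec<&str> = "the quick brown fox".split_whitespace().collect();
    // Corresponds to a query such as body:"quick bro"*
    println!("{}", matches_phrase_prefix(&doc, &["quick"], "bro"));
}
```

In this model, `field:"phrase ter"*` corresponds to the exact term `phrase` followed directly by any token beginning with `ter`. A real index has to expand the prefix into concrete terms from its dictionary, and `max_expansions` caps that expansion, which is why a good value depends on the index contents.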
With this motivation in mind, I added a dedicated example, as the discoverability of unit tests is limited and adding a doc test to
…ture discoverability.
I replaced the variable name
LGTM Thanks!