Searching in a json field path1.path2.path3:value should return a result
#2312
Comments
I agree it is tempting to have
I tried to escape the dot but... it does not seem to work. When looking at tantivy code, I saw this
we can call it a bug then :)
Fix in quickwit-oss/tantivy#1682
* Update tantivy to fix json path search escaping '.'
  Also updated documentation, to explain how nested structure can be searched. Closes #2312
* Update docs/reference/query-language.md

Co-authored-by: François Massot <francois.massot@gmail.com>
Currently this does not work as tantivy is doing something slightly different when writing and when searching.
Let's take a document to index that is very common in the OpenTelemetry world:
The doc_mapping is as follows:
When writing the index, tantivy will interpret `k8s.container.name` as the field name and will store it as is, along with the value (and the type).

When searching for documents matching the query `resource.k8s.container.name:prometheus`, quickwit will remove the `resource` part and give tantivy this term to match: `k8s.container.name:prometheus`. The issue is that tantivy will interpret the dots as separators and build a term with internal separators, which won't match what was written previously.

I suggest slightly modifying how tantivy writes JSON terms, by using the dots in field names to define the path segments.
This approach is not perfect, as it will allow mixing `{"resource": {"k8s.container.name": "prometheus"}}` and `{"resource": {"k8s": {"container": {"name": "prometheus"}}}}`, but having dots in field names is very common in the log world and it would be very painful to escape them.

@fulmicoton what do you think?
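To make the mismatch concrete, here is a rough sketch of the two code paths described above. It is not tantivy's actual code: the `SEGMENT_SEP` value and the three function names are hypothetical and only stand in for whatever tantivy does internally to join path segments into a term.

```rust
// Hypothetical separator placed between path segments inside an indexed term.
// (Assumption for illustration only; the real encoding lives inside tantivy.)
const SEGMENT_SEP: &str = "\u{1}";

/// Write side, behaviour described in this issue: each JSON key is exactly
/// one segment, so a key that contains dots is stored untouched.
fn write_path_keep_dots(keys: &[&str]) -> String {
    keys.join(SEGMENT_SEP)
}

/// Query side: every dot in the user-supplied path is treated as a
/// segment separator.
fn query_path(path: &str) -> String {
    path.split('.').collect::<Vec<_>>().join(SEGMENT_SEP)
}

/// Write side, suggested behaviour: dots inside a key also split segments,
/// so the written path and the queried path are built the same way.
fn write_path_split_dots(keys: &[&str]) -> String {
    keys.iter()
        .flat_map(|key| key.split('.'))
        .collect::<Vec<_>>()
        .join(SEGMENT_SEP)
}
```

With these helpers, `write_path_keep_dots(&["k8s.container.name"])` differs from `query_path("k8s.container.name")`, which is the bug, while `write_path_split_dots(&["k8s.container.name"])` is equal to it. The downside mentioned above also shows up: `write_path_split_dots(&["k8s", "container", "name"])` produces the same path, so the flat and nested documents become indistinguishable.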