Filter bar value suggestions - filter by time #15887
Comments
We chose to do this intentionally. Just because a field value isn't in the current result set doesn't always mean a user isn't interested in filtering on it. Finding out that no docs match a filter in a certain time range might be just as important as finding out that docs do match. We made a number of optimizations to the suggestions request so that it should be performant enough. Are you experiencing actual issues related to these requests? If so, we can see if there are any further optimizations that might help your case. We could add an advanced option allowing admins to choose whether the time filter applies to these suggestions, but I'd prefer to leave that as a last resort. What do you think, @lukasolson?
Just as @Bargs said, this was an intentional decision. I'd also prefer not to introduce an additional advanced setting for this. I'd be interested to hear more of the reasoning behind wanting to limit the query... Is it performance related, or related to the accuracy of the results? |
It is related to both performance and accuracy :) We have a cluster with daily indices, and whenever a user starts to filter, the query is run over around 4000 shards, with more added every day. Our data is based on sports events, and the bigger the event, the more docs you have during that event time. In the end only 10 results are returned, which might not be what the user is looking for. In fact, now that I think of it, the dashboard filters/query should also be added to the suggestions query.
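To make the shard arithmetic above concrete, here is a minimal sketch that lists which daily indices a given time window overlaps. The index naming scheme (`events-YYYY.MM.DD`, one index per UTC day) and the function name are assumptions for illustration, not something taken from this cluster.

```typescript
// Hypothetical naming scheme: one index per UTC day, e.g. "events-2018.01.05".
function dailyIndicesInRange(prefix: string, from: Date, to: Date): string[] {
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  const names: string[] = [];
  // Start at midnight UTC of the first day touched by the range.
  const day = new Date(
    Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate())
  );
  while (day.getTime() <= to.getTime()) {
    names.push(
      `${prefix}-${day.getUTCFullYear()}.${pad(day.getUTCMonth() + 1)}.${pad(day.getUTCDate())}`
    );
    // Advance by one calendar day in UTC.
    day.setUTCDate(day.getUTCDate() + 1);
  }
  return names;
}

// A 24h window that crosses midnight overlaps two daily indices.
const touched = dailyIndicesInRange(
  "events",
  new Date("2018-01-05T18:00:00Z"),
  new Date("2018-01-06T18:00:00Z")
);
```

Under this assumed scheme, a 24h dashboard window overlaps at most two daily indices, which is the gap the comment above is pointing at when it mentions roughly 4000 shards.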
Another thing is I don't understand why you are setting Wouldn't it be much faster to remove both the |
A checkbox to "use current filters" in the suggestions would be pretty cool and useful.
It's not always a match all, we use what the user has typed into the box to filter down the results. In the example you provided above where a user doesn't see the value they want in the first 10 results, I'd expect them to start typing in part of the value to get more targeted suggestions, just like any autocomplete implementation. The opposite scenario doesn't work as well. If we start with the time range applied and the result set is too narrow, the user has no way to get more suggestions other than expanding the range in the time picker which won't be at all obvious to most people. This is why I'd prefer to cast a net that's too wide rather than too narrow. As for performance, could you collect some timings of the suggestions request with and without a date range applied? You should be able to do that in Console. I'd be surprised if there's a big difference since we're using |
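As a rough illustration of narrowing suggestions by what the user has typed, the sketch below builds a terms aggregation whose `include` pattern is the typed text followed by `.*`. This is not necessarily the request Kibana sends; the function names, the field name in the usage line, and the escaped character set are assumptions. The limit of 10 mirrors the number of suggestions mentioned earlier in the thread.

```typescript
// Escape characters that are special in Lucene-style regular expressions
// (the exact set here is an assumption).
function escapeRegex(text: string): string {
  return text.replace(/[.?+*|{}[\]()"\\#@&<>~]/g, "\\$&");
}

interface SuggestionsBody {
  size: number;
  aggs: {
    suggestions: { terms: { field: string; include: string; size: number } };
  };
}

// Hypothetical helper: ask only for field values that start with `typed`.
function buildSuggestionsBody(
  field: string,
  typed: string,
  maxSuggestions: number = 10
): SuggestionsBody {
  return {
    size: 0, // no hits needed, only the aggregation buckets
    aggs: {
      suggestions: {
        terms: { field, include: `${escapeRegex(typed)}.*`, size: maxSuggestions },
      },
    },
  };
}

const body = buildSuggestionsBody("team.name", "real m");
```

With an empty `typed` string the pattern degenerates to `.*`, which matches every value; that is the "match all" case discussed above.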
Regarding the performance benchmark: Without a time filter, the filter bar suggestions are useless (at least in our case). Here are the detailed benchmark results: Querying for the
Results:
With a 24h time filter:
Results:
We've patched Kibana to add the timespan to the suggestions queries, both because it makes them a bit faster for our use case and because, when you are filtering, it makes more sense to only show users values that will actually match data. Showing values outside of the existing timespan just means that they get no results, which is confusing to them since the filter was auto-suggested. I like @shaharmor's idea that the existing filter/query should also be used to give even more targeted suggestions.
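The kind of patch described above could look roughly like the following sketch, which wraps the dashboard time range and any existing dashboard filters into a `bool` query with a `filter` clause. This is not the actual patch; the function name, the time field, and the example filter are all hypothetical.

```typescript
type Query = Record<string, unknown>;

interface TimeRange {
  gte: string;
  lte: string;
}

// Hypothetical helper: scope a suggestions request to the dashboard context.
function scopeToDashboard(
  timeField: string,
  range: TimeRange,
  dashboardFilters: Query[] = []
): Query {
  return {
    bool: {
      // Filter clauses restrict the matched docs without affecting scoring.
      filter: [
        { range: { [timeField]: { gte: range.gte, lte: range.lte } } },
        ...dashboardFilters,
      ],
    },
  };
}

// Usage: restrict suggestions to the last 24h and one existing filter.
const scopedQuery = scopeToDashboard(
  "@timestamp",
  { gte: "now-24h", lte: "now" },
  [{ term: { sport: "football" } }]
);
```

The result would be placed under the `query` key of the suggestions request, so that the aggregation only sees documents the dashboard itself would show.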
@Bargs having a checkbox in that filter (preset on) "use current filters and time selection" would provide the benefits to both audiences while maintaining clarity. |
It should also be configurable at the Kibana level for all filters at once |
Hmm... Interesting that even though the request includes |
It's very hard to configure a query so that latency is below a given threshold, so Elasticsearch only stops processing more documents after the
This would suggest these steps take close to 19 seconds. Is the response time consistently reproducible? If yes, could we try to capture hot threads a couple times while the query is running? |
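One way to spot this situation from the client side is to compare the `took` value of the search response against the latency budget and to check the `timed_out` and `terminated_early` flags. The sketch below assumes a 1-second budget purely for illustration; the helper and type names are hypothetical, and the 19-second figure comes from the discussion above.

```typescript
// Subset of the search response fields relevant to latency checks.
interface SearchFlags {
  took: number; // milliseconds reported by Elasticsearch
  timed_out: boolean;
  terminated_early?: boolean;
}

interface BudgetReport {
  partial: boolean; // results may be incomplete
  overBudgetMs: number; // how far `took` exceeded the budget
}

// Hypothetical helper: report whether a response was partial and by how
// much it overshot the latency budget.
function checkBudget(res: SearchFlags, budgetMs: number): BudgetReport {
  return {
    partial: res.timed_out || res.terminated_early === true,
    overBudgetMs: Math.max(0, res.took - budgetMs),
  };
}

// A request that took about 19 seconds against an assumed 1-second budget.
const report = checkBudget({ took: 19000, timed_out: true }, 1000);
```

A large `overBudgetMs` together with `partial: true` would be consistent with the time being spent in steps that the timeout does not interrupt.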
@jpountz per your request: I re-ran the same test again, same results: 17s - 19s response time with no cache. I ran the query and while it was running I ran the hot_threads command 3 times, each with There are 3 servers that hold the shards in question, so the hot_threads command was run with a filter on those 3 servers alone. (Each hot_threads log contains all 3 servers) If there are more details you need let me know. Here are the results: |
Argh I had lost track of this discussion and now the pastebins have expired. Sorry for that. @shaharmor Do you still have them by any chance? |
Unfortunately no, but I will try to run it again |
There is another instance of a user having issues with the pressure created by the KQL queries here: https://discuss.elastic.co/t/kql-value-suggestions-are-killing-my-cluster/196556/7 @lukasolson @Bargs @stacey-gammon it seems like it might be worth bringing this discussion up again since KQL is now the default. Trevan raised a good point about limiting the suggestions to the current timespan, since a suggested value outside it would return no results. In addition, the inability to scale due to the autocomplete query hitting every shard is concerning.
Seeing as how we now allow configuring things like |
Pinging @elastic/kibana-app-arch (Team:AppArch) |
I'm very interested in this feature!
I believe this is resolved by #81515. |
@lukasolson I actually changed it for the suggestions in the search bar, but not for the filters. |
Kibana version: 6.0
Elasticsearch version: 6.0
Description of the problem including expected versus actual behavior:
When using the filter value suggestions feature, the query that Kibana makes is not filtered by the time filter of the dashboard itself, which can cause a big load in ES.
There are some limitations in place to help mitigate it, but I see no reason to query indices that are out of the range of the dashboard.
In fact, I'm pretty sure it can be harmful, because it might show field values that don't have any data for the selected range.
Steps to reproduce:
Edit by @lizozom: This will be available and enabled by default starting in 7.15.0. However, due to #100174, you could also consider turning this off (Advanced Settings > autocomplete:useTimeRange) and get autocomplete suggestions from your entire data set with much smaller performance implications.