-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shard request cache and script queries/aggregations #49321
Comments
Pinging @elastic/es-search (:Search/Search) |
I'm seeing scripted queries/aggs as a way to trade performance for flexibility, as they allow to do things that had not been planned at index time. Since these are already trading performance for something else, it doesn't feel right to me to now trade correctness for performance by enabling caching when the user declares it is safe. Maybe tell us more about your usage of scripts? I wonder that you might be using scripts as a workaround to a missing aggregation feature? |
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
I'm personally a heavy script user and my usage patterns certainly shouldn't be taken as representative :) but for the purposes of discussion, my uses of scripts include:
So it could be summarized as a mix of "missing aggregation features", (related) "trading off performance to provide (query-time) flexibility". and to a lesser extent "trading off performance to keep all logic in one place" In all cases I'm not so much trading off "correctness for performance" with cache, I'm trading off memory for performance (based on the knowledge/expectation that there will be a large number of queries with the same results in a given time period) |
Thanks for the detailed reply!
This is exactly the reason why we have this cache. :)
That one sounds interesting to me. Do I understand correctly that instead of sorting terms by doc_count descending, you want to sort them by descending weight? Or maybe even descending weight*doc_count? Can you tell me more about the higher-level use-case, is it something like a rollup? To be clear I'm not against caching scripted queries or aggs, but I'm worried about allowing users to cache data that is not cacheable. My preferred way of fixing this would be by enabling Painless to tell us when a script is deterministic or not, so that we could make caching decisions accordingly. @jdconrad @stu-elastic Do you think it'd be doable? |
This caught my eye as well, would love to know more. We've talked about making (although it would have different semantics since it's only sorting the final list of buckets, instead of all the buckets at runtime). |
So, I think we could make this possible through Painless for which scripts are deterministic, but I don't think it would be all that useful unless we are safe to assume that any access to docvalues (or _source) or whatever else the user is doing would be flushed from the cache upon changes. And if anything is done from user-defined params (are weights done this way or is a new script created every time with constants?) then it's also not deterministic as we explicitly expect those to be changed throughout a script's life. The other thing is right now Painless isn't really aware of something like Edit: After thinking about this I realized that all those values are deterministic because otherwise the cache wouldn't work. (Oops.) I think Painless only has one non-deterministic methods right now in randomUUID. |
Fixed by the following changes:
|
Support for caching queries including scripts:
Although the documentation for the shard request query currently says:
In practice the cache is skipped whenever
ScriptService
is usedThis is intentional, per @jimczi:
An alternative (which per the docs seems consistent with how some other scenarios are handled) would be to default to skipping the cache in such cases but allow use of the existing
request_cache=true
param for clients to declare their script is deterministic and can be cachedNote that scripted aggregations are often very expensive and therefore great candidates to be cached!
The text was updated successfully, but these errors were encountered: