Query Limits #6024
Comments
Some initial thoughts on the feasibility of each of these before starting to work on any of them.
I think this would be possible. We have a function that is called for raw field queries that already has this information. Do we want to restrict how many shards a query touches? Restricting the number of series returned and restricting the number of shards touched are two different limits.
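For illustration only, here is a minimal sketch of what a planning-time series-count check could look like; the function and limit names are invented for the example and are not the existing raw-field-query code referenced above.

```go
package main

import "fmt"

// checkSeriesLimit is a hypothetical planning-time check. Since the planner
// already knows how many series a raw field query will touch, it can reject
// the query before any data is read. A limit of 0 or less means "no limit".
func checkSeriesLimit(seriesN, maxSelectSeries int) error {
	if maxSelectSeries > 0 && seriesN > maxSelectSeries {
		return fmt.Errorf("query selects %d series, exceeding the limit of %d",
			seriesN, maxSelectSeries)
	}
	return nil
}

func main() {
	// Example: a query that would touch 120,000 series against a 100,000 limit.
	if err := checkSeriesLimit(120000, 100000); err != nil {
		fmt.Println("rejected:", err)
	}
}
```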
This one we probably won't be able to do before running the query, but it's definitely possible.
This would be relatively easy too. We just need to set a maximum number of buckets that can be returned, and the check can happen while calculating the time range.
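A rough sketch of how such a check could be performed while calculating the time range; the function name and limit value are made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// maxGroupByBuckets is a hypothetical configurable limit.
const maxGroupByBuckets = 100000

// checkBucketLimit estimates how many group by time() buckets a query would
// produce from its time range and interval, and rejects it before execution
// if the count exceeds the limit.
func checkBucketLimit(start, end time.Time, interval time.Duration) error {
	if interval <= 0 {
		return fmt.Errorf("invalid group by interval: %s", interval)
	}
	buckets := int64(end.Sub(start) / interval)
	if buckets > maxGroupByBuckets {
		return fmt.Errorf("query would create %d buckets, exceeding the limit of %d",
			buckets, maxGroupByBuckets)
	}
	return nil
}

func main() {
	start := time.Now().AddDate(-1, 0, 0) // one year of data
	end := time.Now()
	// group by time(1s) over 1y is roughly 31.5 million buckets, so it is rejected.
	if err := checkBucketLimit(start, end, time.Second); err != nil {
		fmt.Println("rejected:", err)
	}
}
```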
With the addition of the query manager, this is pretty easy. When trying to start a query though, should the query block for some timeout until it can be started? If we have a maximum of 5 queries and a 6th query comes in, should it wait for, say, 5 seconds before returning an error? While maybe more user-friendly, that could leave us vulnerable to a denial-of-service attack. We could also lower the wait time when more queries are waiting, to help servers that are under far more load than they can handle.
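To make the fail-fast vs. wait-with-timeout trade-off concrete, here is a rough sketch of a limiter built on a buffered channel used as a semaphore; the types and names are hypothetical and not the actual query manager code. A zero timeout gives the fail-immediately behavior, a positive timeout gives the bounded wait.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var ErrTooManyQueries = errors.New("max concurrent queries reached")

// queryLimiter caps concurrent queries with a buffered channel. Attach waits
// up to the given timeout for a free slot; a zero timeout fails immediately.
type queryLimiter struct {
	slots chan struct{}
}

func newQueryLimiter(max int) *queryLimiter {
	return &queryLimiter{slots: make(chan struct{}, max)}
}

// Attach acquires a slot and returns a release function the caller must
// invoke when the query finishes.
func (l *queryLimiter) Attach(timeout time.Duration) (release func(), err error) {
	release = func() { <-l.slots }
	select {
	case l.slots <- struct{}{}:
		return release, nil
	default:
	}
	if timeout <= 0 {
		return nil, ErrTooManyQueries
	}
	select {
	case l.slots <- struct{}{}:
		return release, nil
	case <-time.After(timeout):
		return nil, ErrTooManyQueries
	}
}

func main() {
	l := newQueryLimiter(5)
	for i := 1; i <= 6; i++ {
		if _, err := l.Attach(0); err != nil {
			// With a limit of 5, the 6th query is rejected immediately.
			fmt.Println("query", i, "rejected:", err)
		}
	}
}
```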
This is the only one I'm not sure we can do. I haven't found a way in Go to see the memory used by a specific goroutine.
This one is easily possible as either part of the query manager or part of an iterator that wraps other iterators. I would say most of these are possible for 0.13, although we can start working on some of the low-hanging fruit and see if we can get some of these into 0.12.
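As a sketch of the wrapping-iterator idea, here is a toy iterator that aborts once it has read more than a configured number of points (tying into the points limit discussed below); the Point and Iterator types are simplified stand-ins, not the engine's real iterator interfaces.

```go
package main

import (
	"errors"
	"fmt"
)

// Point and Iterator are simplified stand-ins for the engine's iterator types.
type Point struct {
	Time  int64
	Value float64
}

type Iterator interface {
	Next() (*Point, error)
}

var ErrMaxPointsReached = errors.New("max points limit reached, query aborted")

// limitIterator wraps another iterator and aborts the query once it has read
// more than max points, so a runaway query is stopped mid-execution.
type limitIterator struct {
	input Iterator
	max   int
	read  int
}

func (itr *limitIterator) Next() (*Point, error) {
	p, err := itr.input.Next()
	if p == nil || err != nil {
		return p, err
	}
	itr.read++
	if itr.read > itr.max {
		return nil, ErrMaxPointsReached
	}
	return p, nil
}

// sliceIterator is a trivial in-memory iterator used only for this example.
type sliceIterator struct {
	points []Point
	i      int
}

func (s *sliceIterator) Next() (*Point, error) {
	if s.i >= len(s.points) {
		return nil, nil
	}
	p := &s.points[s.i]
	s.i++
	return p, nil
}

func main() {
	src := &sliceIterator{points: make([]Point, 10)}
	itr := &limitIterator{input: src, max: 5}
	for {
		p, err := itr.Next()
		if err != nil {
			fmt.Println("error:", err) // aborts after 5 points
			return
		}
		if p == nil {
			return
		}
	}
}
```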
@jsternberg For the concurrent queries, I think it should just return an error immediately. The user would need to retry. If the user is getting that error too frequently, they would need to adjust the limit. The points limit idea was to kill a query after it read a certain number of points, to help prevent a bad query from overloading the system.
What if max queries could be set globally and per user? Then a user could execute queries even if the database was getting crushed by queries from another user.
That's possible. The current version of the query manager that I have up in a PR doesn't support setting limits per user, but I think it's possible to allow that.
I'm adding stats to the iterators so we can determine this information immediately after planning, and then the stats can be checked periodically during execution. Part of the issue is that for distributed queries we'll need to send stats updates within the iterator stream, so we won't have true real-time stats for remote nodes. However, I think that's sufficient for this. Re: memory limits, I think we can estimate memory usage based on iterator types. It won't be perfect, but it should be close.
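As a toy illustration of estimating memory from iterator stats, here is a sketch; the stats fields and per-point byte sizes are invented for the example and are not measured values or the actual stats structure.

```go
package main

import "fmt"

// IteratorStats is a hypothetical stats payload attached to each iterator at
// planning time and refreshed periodically during execution.
type IteratorStats struct {
	SeriesN int // number of series the iterator touches
	PointN  int // number of points read so far
}

// estimateMemory makes a rough per-iterator memory estimate based on the
// iterator's type; the per-point sizes are illustrative guesses.
func estimateMemory(itrType string, stats IteratorStats) int64 {
	var bytesPerPoint int64
	switch itrType {
	case "float":
		bytesPerPoint = 16 // timestamp + float64 value
	case "string":
		bytesPerPoint = 64 // timestamp + average string payload
	default:
		bytesPerPoint = 32
	}
	return int64(stats.PointN) * bytesPerPoint
}

func main() {
	stats := IteratorStats{SeriesN: 100, PointN: 1000000}
	fmt.Printf("estimated memory: %d bytes\n", estimateMemory("float", stats))
}
```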
I personally think that defaults on a database would help resolve this:

per query:
I've created all of the relevant issues so the tasks can be divided and worked on separately.
Closing this since the remaining tasks have been converted to separate issues.
There is currently no way for an administrator to prevent bad queries from overloading the system or OOMing the process. The database should have some configurable options to specify limits on queries and automatically kill them before they consume too many resources.
The following limits may be useful:
- Max group by buckets - Prevent queries that would create too many buckets (e.g. group by time(1s) over 1y of points) (#6078)

Some of these could return an error at query planning time (series count) and others may not be possible until the limit is hit while running the query (point count).
May depend on #5950
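For illustration only, the configurable options could end up looking something like the following Go struct; every field name here is hypothetical and not taken from any shipped configuration.

```go
package main

import (
	"fmt"
	"time"
)

// QueryLimits is an illustrative sketch of configurable query limits.
// All field names are hypothetical.
type QueryLimits struct {
	MaxSelectSeries      int           // planning time: reject queries that select more series than this
	MaxSelectBuckets     int           // planning time: reject queries that create more group by time() buckets
	MaxSelectPoints      int           // run time: kill a query after it reads this many points
	MaxConcurrentQueries int           // reject new queries when this many are already running
	MaxQueryMemory       int64         // run time: kill a query when its estimated memory use exceeds this (bytes)
	QueryTimeout         time.Duration // run time: kill queries that run longer than this
}

func main() {
	limits := QueryLimits{
		MaxSelectSeries:      100000,
		MaxConcurrentQueries: 5,
		QueryTimeout:         time.Minute,
	}
	fmt.Printf("%+v\n", limits)
}
```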