
Query Limits #6024

Closed
jwilder opened this issue Mar 16, 2016 · 9 comments

jwilder (Contributor) commented Mar 16, 2016

There is currently no way for an administrator to prevent bad queries from overloading the system or OOMing the process. The database should have some configurable options to specify limits on queries and automatically kill them before they consume too many resources.

The following limits may be useful:

- Number of series - Kill queries that would access too many series at once
- Number of points - Kill queries that access too many points
- Max group by buckets - Prevent queries that would create too many buckets (e.g. group by time(1s) over 1y of points)
- Max concurrent queries - Limit the number of concurrent queries that can be run at once
- Max memory - Kill queries that allocate more than a certain amount of RAM
- Max query time - Kill queries that run for longer than a certain duration

Some of these could return an error at query planning time (series count) and others may not be possible until the limit is hit while running the query (point count).

May depend on #5950
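
For illustration, the configurable limits could be grouped into something like the following Go sketch. The field names here are made up for this example and are not the actual InfluxDB configuration options:

```go
package main

import (
	"fmt"
	"time"
)

// QueryLimits sketches the kinds of knobs an administrator could set.
// Zero values mean "unlimited". These names are illustrative only.
type QueryLimits struct {
	MaxSelectSeriesN  int           // series a single query may touch
	MaxSelectPointN   int           // points a single query may read
	MaxSelectBucketsN int           // GROUP BY time() buckets allowed
	MaxConcurrentN    int           // queries allowed to run at once
	MaxQueryTime      time.Duration // wall-clock limit per query
}

func main() {
	limits := QueryLimits{
		MaxSelectSeriesN: 100000,
		MaxConcurrentN:   10,
		MaxQueryTime:     30 * time.Second,
	}
	fmt.Printf("%+v\n", limits)
}
```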

@jwilder jwilder added this to the 0.12.0 milestone Mar 16, 2016
jwilder (Contributor, Author) commented Mar 16, 2016

cc @benbjohnson @jsternberg @pauldix

jsternberg (Contributor):

Some initial thoughts on the feasibility of each of these, before starting work on any of them.

Number of series - Kill queries that would access too many series at once

I think this would be possible. We have a function that is called for raw field queries that already has this information. Do we want to restrict how many shards a query touches? Restricting the number of series returned and restricting the number of shards touched are two different things.
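
A minimal sketch of what a plan-time check could look like, assuming the planner can already report the matched-series count (the function and parameter names below are hypothetical):

```go
package main

import "fmt"

// checkSeriesLimit rejects a query at planning time when it would touch
// more series than allowed. seriesN stands in for the count the planner's
// existing raw-field-query code already computes.
func checkSeriesLimit(seriesN, maxSeries int) error {
	if maxSeries > 0 && seriesN > maxSeries {
		return fmt.Errorf("query would select %d series; limit is %d", seriesN, maxSeries)
	}
	return nil
}

func main() {
	fmt.Println(checkSeriesLimit(250000, 100000))
}
```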

Number of points - Kill queries that access too many points

This one we probably won't be able to do before running the query, but it's definitely possible.
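
Since the total is only known while the query runs, one approach is a counting wrapper around whatever streams the points. A rough sketch with a deliberately simplified iterator interface (not the real one):

```go
package main

import (
	"errors"
	"fmt"
)

var errTooManyPoints = errors.New("max point limit exceeded")

// pointIterator is a simplified stand-in for the engine's point iterators.
type pointIterator interface {
	Next() (float64, bool)
}

// limitIterator counts points as they stream through and aborts the query
// once the configured maximum has been read.
type limitIterator struct {
	inner pointIterator
	max   int
	n     int
	err   error
}

func (it *limitIterator) Next() (float64, bool) {
	p, ok := it.inner.Next()
	if !ok {
		return 0, false
	}
	it.n++
	if it.max > 0 && it.n > it.max {
		it.err = errTooManyPoints
		return 0, false
	}
	return p, true
}

// sliceIterator lets the example run without a storage engine.
type sliceIterator struct{ points []float64 }

func (s *sliceIterator) Next() (float64, bool) {
	if len(s.points) == 0 {
		return 0, false
	}
	p := s.points[0]
	s.points = s.points[1:]
	return p, true
}

func main() {
	it := &limitIterator{inner: &sliceIterator{points: []float64{1, 2, 3, 4, 5}}, max: 3}
	for {
		if _, ok := it.Next(); !ok {
			break
		}
	}
	fmt.Println(it.err) // max point limit exceeded
}
```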

Max group by buckets - Prevent queries that would create too many buckets (e.g. group by time(1s) over 1y of points)

This would be relatively easy too. We just need to set a maximum number of buckets that can be returned and this can be prevented when calculating the time range.
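
The bucket count falls straight out of the query's time range and GROUP BY interval, so the check can run before any data is read. A small sketch (names are illustrative):

```go
package main

import (
	"fmt"
	"time"
)

// checkBucketLimit estimates how many GROUP BY time() buckets a query would
// produce and rejects it if that exceeds the limit. For example,
// group by time(1s) over one year is roughly 31.5 million buckets.
func checkBucketLimit(start, end time.Time, interval time.Duration, maxBuckets int) error {
	if interval <= 0 || maxBuckets <= 0 {
		return nil
	}
	buckets := int(end.Sub(start)/interval) + 1
	if buckets > maxBuckets {
		return fmt.Errorf("query would create %d buckets; limit is %d", buckets, maxBuckets)
	}
	return nil
}

func main() {
	end := time.Now()
	start := end.AddDate(-1, 0, 0) // one year of data
	fmt.Println(checkBucketLimit(start, end, time.Second, 100000))
}
```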

Max concurrent queries - Limit the number of concurrent queries that can be run at once

With the addition of the query manager, this is pretty easy. When trying to start a query though, should the query block for some timeout until it can be started? If we have a maximum of 5 queries and a 6th query arrives, should it wait up to 5 seconds before returning an error? While maybe more user friendly, it could become vulnerable to a denial of service attack. We could also lower the wait time when more queries are waiting, to help servers that are under far more load than they can handle.
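
A sketch of the counting-semaphore approach, using the fail-immediately behaviour (a blocking variant with a timeout would be the friendlier but DoS-prone alternative discussed above). The types here are illustrative, not the actual query manager:

```go
package main

import (
	"errors"
	"fmt"
)

var errTooManyQueries = errors.New("max concurrent queries reached")

// queryGate limits how many queries may run at once using a buffered
// channel as a counting semaphore.
type queryGate struct {
	slots chan struct{}
}

func newQueryGate(maxConcurrent int) *queryGate {
	return &queryGate{slots: make(chan struct{}, maxConcurrent)}
}

// acquire returns an error immediately if all slots are taken; a variant
// could instead block with a timeout, at the cost of the DoS concern above.
func (g *queryGate) acquire() error {
	select {
	case g.slots <- struct{}{}:
		return nil
	default:
		return errTooManyQueries
	}
}

// release frees a slot when the query finishes.
func (g *queryGate) release() { <-g.slots }

func main() {
	gate := newQueryGate(2)
	for i := 1; i <= 3; i++ {
		if err := gate.acquire(); err != nil {
			fmt.Printf("query %d rejected: %v\n", i, err)
			continue
		}
		fmt.Printf("query %d running\n", i)
	}
}
```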

Max memory - Kill queries that allocate more than a certain amount of RAM

This is the only one I'm not sure we can do. I haven't found a way in Go to see the used memory for a specific goroutine.

Max query time - Kill queries that run for longer than a certain duration

This one is easily possible as either part of the query manager or part of an iterator that wraps other iterators.
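
A sketch of the wall-clock limit using a context deadline around the query; whether this lives in the query manager or in a wrapping iterator, the shape is roughly the same (names are made up):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runWithTimeout executes a query function with a wall-clock limit and
// aborts it when the deadline is exceeded. The query function is expected
// to check ctx periodically (e.g. once per batch of points).
func runWithTimeout(maxQueryTime time.Duration, query func(ctx context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), maxQueryTime)
	defer cancel()

	done := make(chan error, 1)
	go func() { done <- query(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return fmt.Errorf("query killed: exceeded max query time of %s", maxQueryTime)
	}
}

func main() {
	err := runWithTimeout(100*time.Millisecond, func(ctx context.Context) error {
		select {
		case <-time.After(time.Second): // simulate a slow query
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	})
	fmt.Println(err)
}
```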

I would say most of these are possible for 0.13, although we can start working on some of the low hanging fruit and see if we can get some of these into 0.12.

jwilder (Contributor, Author) commented Mar 17, 2016

@jsternberg For the concurrent queries, I think it should just return an error immediately. The user would need to retry. If the user is getting that error too frequently, they would need to adjust the limit. The points limit idea was to kill a query after it has read a certain number of points, to help prevent a bad query from overloading the system.

gunnaraasen (Contributor):

While maybe more user friendly, it could become vulnerable to a denial of service attack.

What if max queries could be set globally and per user? Then a user could execute queries even if the database was getting crushed by queries from another user.
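
A rough sketch of combining a global cap with per-user caps, so one user's flood of queries cannot lock everyone else out (purely illustrative, not the query manager's actual API):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// userGate enforces both a global and a per-user cap on concurrently
// running queries.
type userGate struct {
	mu        sync.Mutex
	globalMax int
	perUser   int
	total     int
	byUser    map[string]int
}

func newUserGate(globalMax, perUser int) *userGate {
	return &userGate{globalMax: globalMax, perUser: perUser, byUser: map[string]int{}}
}

// acquire admits a query only if both the global and the user's own limit
// still have room.
func (g *userGate) acquire(user string) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.globalMax > 0 && g.total >= g.globalMax {
		return errors.New("global query limit reached")
	}
	if g.perUser > 0 && g.byUser[user] >= g.perUser {
		return fmt.Errorf("query limit reached for user %q", user)
	}
	g.total++
	g.byUser[user]++
	return nil
}

// release frees the slots when the user's query finishes.
func (g *userGate) release(user string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.total--
	g.byUser[user]--
}

func main() {
	gate := newUserGate(10, 2)
	fmt.Println(gate.acquire("alice")) // <nil>
	fmt.Println(gate.acquire("alice")) // <nil>
	fmt.Println(gate.acquire("alice")) // per-user limit reached
}
```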

jsternberg (Contributor):

That's possible. The current version of the query manager that I have up in a PR doesn't support setting limits per user, but I think it's possible to allow that.

benbjohnson (Contributor):

I'm adding stats to the iterators so we can determine this information immediately after planning and then the stats can be checked periodically during execution. Part of the issue is that for distributed queries we'll need to send stats updates within the iterator stream and we won't have true realtime stats for remote nodes. However, I think that's sufficient for this.

re: memory limits, I think we can estimate memory usage based on iterator types. It won't be perfect but it should be close.
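
As a sketch of the periodic-check idea: a running stat (here just a point count; a per-iterator memory estimate would plug in the same way) is polled on an interval and the query is killed once it crosses the limit. Everything below is illustrative, not the actual iterator stats API:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// pointN is an illustrative stand-in for a per-iterator stat that is
// updated as the query reads points; plan-time stats (e.g. series count)
// could be checked once, while running totals are polled periodically.
var pointN atomic.Int64

// monitor polls the stats on an interval and signals a kill once the
// limit is exceeded. For distributed queries the remote nodes would push
// stat updates into the iterator stream, so the numbers may lag slightly.
func monitor(maxPoints int64, interval time.Duration, kill chan<- string) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C {
		if n := pointN.Load(); maxPoints > 0 && n > maxPoints {
			kill <- fmt.Sprintf("query exceeded max points: %d > %d", n, maxPoints)
			return
		}
	}
}

func main() {
	kill := make(chan string, 1)
	go monitor(100, 10*time.Millisecond, kill)

	for i := 0; i < 10000; i++ { // simulate the query reading points
		select {
		case msg := <-kill:
			fmt.Println(msg)
			return
		default:
		}
		pointN.Add(1)
		time.Sleep(time.Millisecond)
	}
}
```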

steverweber:

I personally consider Max query time the most important of the proposed limits.
It would be nice to set a default on a database and override it per query.

A default on the database would help with:

noobs using Grafana and zooming out to view too much data, causing crashes.

A per-query override would help with:

power users that need to exceed the database's query time limit on large queries.

jsternberg (Contributor):

I've created all of the relevant issues so the tasks can be divided and worked on separately.

jwilder (Contributor, Author) commented Mar 31, 2016

Closing this since the remaining tasks have been converted to separate issues.
