
Customize query timeouts #249

Closed
begriffs opened this issue Aug 2, 2015 · 8 comments · Fixed by #2742
Labels: enhancement (a feature, ready for implementation), QOS

Comments

@begriffs
Member

begriffs commented Aug 2, 2015

Either add a server command-line argument to specify the timeout for all connections, or establish it per user.

For the first option we can make another middleware that composes with authenticated:
https://github.com/begriffs/postgrest/blob/master/src/PostgREST/Main.hs#L83

For the second option (per user) we can add another column to our auth table and update setRole:
https://github.com/begriffs/postgrest/blob/master/src/PostgREST/Auth.hs#L56

We'll use the command `SET SESSION STATEMENT_TIMEOUT TO [ms];`
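
For reference, a minimal sketch of the two options in SQL (the webuser role and the millisecond values are only placeholders):

```sql
-- option 1: one global timeout, issued once per connection
SET SESSION statement_timeout TO 5000;

-- option 2: a per-user value (looked up from the auth table) issued after SET ROLE
SET ROLE webuser;
SET statement_timeout TO 2500;
```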

/cc @smilliken

@steve-chavez
Member

steve-chavez commented Aug 16, 2017

I see this as a requirement for #286; I think we need to restrict the clients' API usage for that case.

I think this kind of measure would be better enabled at the database level than at the API level, say the dba needs to set the timeout for each user of the API, as in:

ALTER ROLE webuser SET statement_timeout = 30000

Just gonna throw an idea here: maybe PostgREST could have a force-timeout config param (true by default) that would make it query the user's current_setting('statement_timeout') for each transaction and give an error response if it is undefined or not > 0.
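
A rough sketch of what that per-transaction check could boil down to (role name and values are only illustrative):

```sql
-- timeout configured by the dba for an API role
ALTER ROLE webuser SET statement_timeout = 30000;

-- what PostgREST could run inside each transaction; a result of '0'
-- (meaning no timeout is in force) would turn into an error response
SELECT current_setting('statement_timeout');
```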

Now another option I've seen is to enforce a statement_cost_limit, because, as mentioned in this article:

PostgreSQL has had statement_timeout for a while, which can be set on a per-user basis (or other places) to prevent application errors from running queries for hours. However, this doesn't really solve the "overload" issue, because the query runs for that length of time, gobbling resources until it's terminated. What you really want to do is return an error immediately if a query is going to be too costly.

It seems like a safer option, but it relies on a loadable module and it may be inconvenient for the user to set up, so for now maybe we could discuss the timeout as an alternative.

@steve-chavez
Member

More thoughts on this: I think restricting the statement cost is the best choice, so I've been testing the plan_filter module. Originally I thought it would be enough to set up each role with its own cost in the db, without changing PostgREST code, but that doesn't work.

Say we enable the plan_filter module and then do ALTER ROLE .. SET plan_filter.statement_cost_limit = .. for some roles, then we have:

| Role          | plan_filter.statement_cost_limit |
|---------------|----------------------------------|
| authenticator | 4000 (has to fill dbStructure)   |
| anonymous     | 500 (can only login)             |
| privileged    | 10000                            |

When doing a POST /rpc/login I was expecting to have statement_cost_limit = 500, and after the successful login, when doing a GET /tasks?select=id,users{id}, I expected to have statement_cost_limit = 10000. But no, the transaction always had statement_cost_limit = 4000 (the limit of the authenticator).

This is because, according to the pg docs, executing SET ROLE does not cause new configuration values to be set; that only happens at login time. I have an example of this here.

So for the statement_cost_limit to follow the role settings, PostgREST would have to do a set local plan_filter.statement_cost_limit itself and query the setting from pg_roles/pg_db_role_setting (this could be cached in dbStructure).
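
A hedged sketch of the lookup and the per-transaction set local (the catalog names are standard; how PostgREST would cache the values in dbStructure is left out):

```sql
-- role settings created with ALTER ROLE ... SET end up in pg_db_role_setting;
-- setconfig holds entries like 'plan_filter.statement_cost_limit=10000'
SELECT r.rolname, s.setconfig
FROM pg_db_role_setting s
JOIN pg_roles r ON r.oid = s.setrole;

-- inside each request's transaction, before the main query,
-- apply the cached value for the current role
SET LOCAL plan_filter.statement_cost_limit = 10000;
```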

@steve-chavez
Member

I think it's important to solve this issue. Though PostgREST now doesn't directly expose expensive operations, users can still do a:

GET /tasks?select=*,users{*,tasks{*,users{*,tasks{*,users{*,tasks{*}}}}}}

That gives a cost of 1627324.37..1627324.39; relying on pg_plan_filter avoids these kinds of queries.

I propose to have this feature as experimental; also, it would only work for Postgres instances where you can load the pg_plan_filter module.

The feature would work by having an enforce_cost_limit config param, false by default. If it's set to true, PostgREST does the following (a sketch of the SQL involved comes after the list):

  • Check that the plan_filter module is loaded at start time with `select current_setting('plan_filter.module_loaded') = 'on';`.
  • Enforce at start time that the global statement_cost_limit is > 0 (0 in plan_filter means no cost limit).
  • Query and cache the pg_roles table with its settings in dbStructure.
  • Do a `set local plan_filter.statement_cost_limit` according to the current role in Middleware.hs.
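
A sketch of the start-time checks from the list above (error handling on the PostgREST side is omitted):

```sql
-- fail fast if the plan_filter module is not loaded
SELECT current_setting('plan_filter.module_loaded') = 'on';

-- make sure a global cost limit is actually in force (0 disables the limit)
SELECT current_setting('plan_filter.statement_cost_limit')::float8 > 0;
```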

Also, by having this config param, we could have a way to safely enable more expensive operations such as distinct: we would say that to enable those operations you need to have enforce_cost_limit set to true.

@begriffs
Member Author

@steve-chavez thanks for the research on this one. I was looking through the PostgreSQL source to try and see if there is a hook for SET ROLE. I was thinking that pg_plan_filter is pretty simple inside and could listen for the change to reload its variable. However, it doesn't look like that's possible, so your suggestion seems like our best option.

@steve-chavez
Member

For me the ideal solution would still be to integrate with the pg_plan_filter module. However, for now, to restrict the number of embeds, a good enough solution would be to limit the number of nodes or the whole tree depth. Recently a Depth attribute was added to the Node Type, we could take advantage of that.

@FGRibreau
Contributor

FGRibreau commented Jun 11, 2018

statement_cost_limit indeed seems the best way to go. I always get that question when talking about PostgREST/SubZero/PostgraphQL/Prisma/..., so if PostgREST could implement it, it would ease the transition to this stack :) (and it would give me better confidence in PostgREST).

@steve-chavez
Member

@FGRibreau Bear in mind that you can use the pg_plan_filter module now, without explicit PostgREST support: you could set a statement_cost_limit high enough (depends on your db workload) on your authenticator user and that would be enforced for the rest of your roles; if any query surpasses the cost limit, PostgREST should respond with the plan cost limit exceeded error coming from PostgreSQL.

My proposal here was more about providing finer-grained control and enforcing that the cost limit is present. Now that I think about it, the cost limit on the authenticator might be enough for most cases; I'll see if I can add this idea to our Hardening PostgREST section in the docs.
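
For reference, a minimal sketch of that authenticator-level setup (the 100000 limit is only an example and assumes the plan_filter module is already loaded on the server):

```sql
-- applied at login time, so it covers every request PostgREST serves,
-- regardless of which role it later switches to with SET ROLE
ALTER ROLE authenticator SET plan_filter.statement_cost_limit = 100000;
```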

@steve-chavez
Member

Doing a set local statement_timeout = 1000 (as suggested in #539 (comment)) in each request would also be a good safeguard. Still, the cost limit would be better.
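
A minimal sketch of that per-request safeguard, using the 1000 ms value from the linked comment:

```sql
BEGIN;
SET LOCAL statement_timeout = 1000;  -- reverts automatically at COMMIT/ROLLBACK
-- ... the request's actual query would run here ...
COMMIT;
```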
