You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When using PostgreSQL and Flask Restless, GET_MANY requests for large tables (>100K rows) can take up to a minute to return a response.
This is because Flask Restless performs a COUNT query at some point while obtaining the results. In PostgreSQL, due to multi-version concurrency control, each COUNT must (usually) scan all rows of the table, leading to consistently long response times.
By comparison, a simple
SELECT*FROM<table> OFFSET <x>LIMIT<y>
is much faster. But even when turning pagination off (I assumed the COUNT was needed for the num_results field in the JSON response), the COUNT query is still made.
Is there any way that the code can be modified to avoid a COUNT query here?
Edit: PostgreSQL 9.3, flask-restless 0.14.2
The text was updated successfully, but these errors were encountered:
Well if the JSON response should include the total number of rows in a paginated response (which it really should, for the sake of making metadata available to the client), it's going to need to make a count at some point, unless there's some solution I'm missing.
On the other hand, when you say "turning pagination off", I assume you mean setting results_per_page to be 0, and in this case, the way the code exists now, it doesn't really "disable" pagination, but it just seems to assume there should be one gigantic "page" containing everything (see the API._paginated() function). Are you suggesting that we should change the behavior of pagination so that if the user specifies results_per_page=0, then no count should be made (perhaps replacing a count with a len() after getting the resulting list)?
I have implemented a partial solution to this problem for the case when pagination is disabled: 6d9ee97.
To summarize, the problem is that we issue a COUNT in order to compute the pagination links (both for the Link headers and for the links element in the JSON API document). One solution is to provide the user with the ability to disable the generation of those pagination links, thereby allowing Flask-Restless to skip the counting step. So I see this now as a feature request: allow the user to disable pagination links in order to allow Flask-Restless to avoid issuing a COUNT.
When using PostgreSQL and Flask Restless, GET_MANY requests for large tables (>100K rows) can take up to a minute to return a response.
This is because Flask Restless performs a COUNT query at some point while obtaining the results. In PostgreSQL, due to multi-version concurrency control, each COUNT must (usually) scan all rows of the table, leading to consistently long response times.
By comparison, a simple
is much faster. But even when turning pagination off (I assumed the COUNT was needed for the num_results field in the JSON response), the COUNT query is still made.
Is there any way that the code can be modified to avoid a COUNT query here?
Edit: PostgreSQL 9.3, flask-restless 0.14.2
The text was updated successfully, but these errors were encountered: