Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incredibly slow performance with GET_MANY on large tables (PostgreSQL) #363

Open
LordSputnik opened this issue Oct 15, 2014 · 2 comments
Open

Comments

@LordSputnik
Copy link

LordSputnik commented Oct 15, 2014

When using PostgreSQL and Flask Restless, GET_MANY requests for large tables (>100K rows) can take up to a minute to return a response.

This is because Flask Restless performs a COUNT query at some point while obtaining the results. In PostgreSQL, due to multi-version concurrency control, each COUNT must (usually) scan all rows of the table, leading to consistently long response times.

By comparison, a simple

SELECT * FROM <table> OFFSET <x> LIMIT <y>

is much faster. But even when turning pagination off (I assumed the COUNT was needed for the num_results field in the JSON response), the COUNT query is still made.

Is there any way that the code can be modified to avoid a COUNT query here?

Edit: PostgreSQL 9.3, flask-restless 0.14.2

@jfinkels
Copy link
Owner

Well if the JSON response should include the total number of rows in a paginated response (which it really should, for the sake of making metadata available to the client), it's going to need to make a count at some point, unless there's some solution I'm missing.

On the other hand, when you say "turning pagination off", I assume you mean setting results_per_page to be 0, and in this case, the way the code exists now, it doesn't really "disable" pagination, but it just seems to assume there should be one gigantic "page" containing everything (see the API._paginated() function). Are you suggesting that we should change the behavior of pagination so that if the user specifies results_per_page=0, then no count should be made (perhaps replacing a count with a len() after getting the resulting list)?

@jfinkels jfinkels added the bug label Oct 22, 2014
@jfinkels
Copy link
Owner

I have implemented a partial solution to this problem for the case when pagination is disabled: 6d9ee97.

To summarize, the problem is that we issue a COUNT in order to compute the pagination links (both for the Link headers and for the links element in the JSON API document). One solution is to provide the user with the ability to disable the generation of those pagination links, thereby allowing Flask-Restless to skip the counting step. So I see this now as a feature request: allow the user to disable pagination links in order to allow Flask-Restless to avoid issuing a COUNT.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants