BigQuery: DB-API is very slow #9185
Peter, can you look into this? One minute versus one second is quite the difference! I don't know why this would be the case, as the DB-API should be creating a QueryJob behind the scenes, but maybe there's something we're doing wrong when waiting for results (such as sleeping between requests)?
This is indeed quite a difference, will check. I confirm that the issue is reproducible.

Update: The reason is that results are requested one at a time, because the page size is set to 1, meaning that 100 requests are made to the backend.
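To make the cost concrete, here is a small simulation (hypothetical, not the BigQuery client itself) of how the number of backend requests scales with page size: fetching 100 rows with a page size of 1 means one round trip per row, while a page size of 100 needs a single request.

```python
def count_requests(total_rows: int, page_size: int) -> int:
    """Count how many paged API calls are needed to fetch all rows."""
    requests = 0
    fetched = 0
    while fetched < total_rows:
        requests += 1  # one backend round trip per page
        fetched += min(page_size, total_rows - fetched)
    return requests

print(count_requests(100, 1))    # → 100 requests
print(count_requests(100, 100))  # → 1 request
```

Each of those 100 requests pays full network latency, which explains the minute-versus-second gap observed above.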
If using a cursor directly, one should set the `arraysize` attribute:

```python
curr.execute(QUERY)
curr.arraysize = 100  # <-- THIS
result = curr.fetchall()
```

The default value is 1, as specified in PEP 249, meaning that if the attribute is not explicitly set, only one row at a time is fetched by default. There is also a note on this in PEP 249, which also specifies a fetchmany() method with an optional size argument.

@tswast Do you know the reason why the

Also, the
I don't recall the reason. Possibly I just didn't see the size parameter?
Oh, now I think I remember. I think it's because we call
As discussed on the PR, setting the default page size to
DB-API is very slow.
google-cloud-bigquery version: 1.19.0
Output