BigQuery: load_table_from_dataframe should use a temporary file #7543

Closed
tswast opened this issue Mar 22, 2019 · 0 comments · Fixed by #7545
Labels
api: bigquery Issues related to the BigQuery API. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@tswast
Contributor

tswast commented Mar 22, 2019

load_table_from_dataframe currently serializes the pandas DataFrame to parquet in a BytesIO buffer before uploading it via a load job. This violates the contract for to_parquet, which requires a file path. BytesIO happens to work when pyarrow is the parquet engine, but not with fastparquet. A more minor reason to serialize to disk is that a DataFrame can be quite large, so spilling to disk is preferable to filling up memory. Note: the function should clean up after itself by removing the temporary file after the load job completes.
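
The pattern described above could be sketched roughly as follows. This is not the library's implementation, only a minimal illustration using the standard library. The helper name load_via_temp_file and its two callable parameters (write_parquet, upload) are hypothetical, introduced here so the sketch does not depend on pandas or a BigQuery client being installed.

```python
import os
import tempfile


def load_via_temp_file(write_parquet, upload, suffix=".parquet"):
    """Serialize to a temporary file, upload it, and always remove the file.

    Both arguments are hypothetical callables used for illustration:
      write_parquet(path)  writes the serialized data to the given file path
      upload(file_obj)     uploads from a binary file object, returns a result
    """
    # mkstemp returns an open descriptor plus a real path on disk; close the
    # descriptor because the writer will reopen the file by path.
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    try:
        write_parquet(tmp_path)
        with open(tmp_path, "rb") as parquet_file:
            return upload(parquet_file)
    finally:
        # Clean up even if serialization or the upload raised.
        os.remove(tmp_path)
```

In real use, write_parquet might wrap dataframe.to_parquet(path) and upload might wrap client.load_table_from_file(file_obj, destination, job_config=job_config) with a job config whose source format is parquet. That wiring is an assumption, not something this issue specifies. Passing a path rather than a buffer is what keeps the call within the to_parquet contract for both pyarrow and fastparquet.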

@tswast tswast self-assigned this Mar 22, 2019
@tswast tswast added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. api: bigquery Issues related to the BigQuery API. labels Mar 22, 2019
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Mar 23, 2019
@tseaver tseaver removed the triage me I really want to be triaged. label Apr 29, 2019

3 participants