BigQuery: Autofetch table schema on load if not provided #9108

Merged: plamut merged 7 commits into googleapis:master from iss-8142 on Sep 4, 2019

plamut commented Aug 27, 2019

Closes #8142.

This PR makes load_table_from_dataframe() fetch the destination table's schema automatically when no schema is provided, which improves automatic schema detection.
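
For readers skimming the thread, here is a rough sketch of the decision flow as described in this PR's commit messages. It is not the code from client.py: `FakeClient`, `resolve_schema`, and the tuple-based schema are hypothetical stand-ins, `NotFound` imitates `google.api_core.exceptions.NotFound`, and filtering the fetched schema down to the dataframe's column names is an assumption.

```python
class NotFound(Exception):
    """Stand-in for google.api_core.exceptions.NotFound (assumption)."""


class FakeClient:
    """Hypothetical client that only knows how to look up table schemas."""

    def __init__(self, tables):
        # Maps a table ID to its schema, a list of (column name, type) pairs.
        self._tables = tables

    def get_table(self, table_id):
        try:
            return self._tables[table_id]
        except KeyError:
            raise NotFound(table_id)


def resolve_schema(client, table_id, column_names, schema=None,
                   write_disposition="WRITE_APPEND"):
    """Return the schema to send with a load job, or None to fall back."""
    if schema:
        # An explicitly provided schema always wins; nothing is fetched.
        return list(schema)
    if write_disposition == "WRITE_TRUNCATE":
        # The table is being replaced, so its current schema is not fetched.
        return None
    try:
        table_schema = client.get_table(table_id)
    except NotFound:
        # No existing table: the caller falls back to other schema detection.
        return None
    # Keep only the fields that match the dataframe's columns (assumption).
    wanted = set(column_names)
    return [field for field in table_schema if field[0] in wanted]


client = FakeClient(
    {"ds.people": [("name", "STRING"), ("age", "INTEGER"), ("city", "STRING")]}
)
fetched = resolve_schema(client, "ds.people", ["name", "age"])
```

With these stand-ins, `fetched` contains only the `name` and `age` fields, while a WRITE_TRUNCATE job or a missing table yields `None`.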

No additional system tests were added, because a "schema autodetect" test already exists.

How to test

Check that the code is in line with the ticket specs.

plamut added the "api: bigquery" label Aug 27, 2019
plamut requested a review from a team August 27, 2019 09:22
googlebot added the "cla: yes" label Aug 27, 2019
plamut added the "kokoro:force-run" label Aug 27, 2019
yoshi-kokoro removed the "kokoro:force-run" label Aug 27, 2019

plamut commented Aug 27, 2019

Unit tests passing locally, but failing on Kokoro, investigating...

Update: Found a non-essential lint issue (unused import), but the unit tests also fail on the latest master, even when going quite a long way back in the commit history.

Update 2: Submitted a fix in #9112.

Review thread on bigquery/google/cloud/bigquery/client.py (outdated, resolved)
Review thread on bigquery/google/cloud/bigquery/client.py (outdated, resolved)
plamut added 3 commits August 28, 2019 13:16
A similar check is already performed on the server, and server-side
errors are preferred to client errors.
plamut requested a review from tswast August 28, 2019 13:34
Review thread on bigquery/tests/unit/test_client.py (resolved)
Review thread on bigquery/tests/unit/test_client.py (outdated, resolved)
A mock should raise this error instead of returning a table to
trigger schema generation from Pandas dtypes.
plamut requested a review from tswast August 29, 2019 12:00
plamut added the "kokoro:force-run" label Aug 29, 2019
yoshi-kokoro removed the "kokoro:force-run" label Aug 29, 2019

plamut commented Aug 29, 2019

Restarted the build due to failed snippets tests (500 internal server error); see the Kokoro log.

plamut added the "kokoro:force-run" label Aug 29, 2019
yoshi-kokoro removed the "kokoro:force-run" label Aug 29, 2019

plamut commented Aug 30, 2019

The test_copy_table_cmek snippets test fails on the latest master, too (500 server error).

plamut added the "kokoro:force-run" label Sep 2, 2019
yoshi-kokoro removed the "kokoro:force-run" label Sep 2, 2019

plamut commented Sep 4, 2019

@tswast Is there anything to add here? The PR has been unblocked now that we merged #9156.

Review thread on bigquery/google/cloud/bigquery/client.py (outdated, resolved)
plamut requested a review from tswast September 4, 2019 19:13
plamut merged commit dce1326 into googleapis:master Sep 4, 2019
plamut deleted the iss-8142 branch September 4, 2019 20:12
emar-kar pushed a commit to MaxxleLLC/google-cloud-python that referenced this pull request Sep 11, 2019
…9108)

* Autofetch table schema on load if not provided

* Avoid fetching table schema if WRITE_TRUNCATE job

* Skip dataframe columns list check

A similar check is already performed on the server, and server-side
errors are preferred to client errors.

* Raise table NotFound in auto Pandas schema tests

A mock should raise this error instead of returning a table to
trigger schema generation from Pandas dtypes.

* Use list_columns_and_indexes() for names list
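
The "Raise table NotFound" commit is easier to follow with a small test-style sketch. It is only illustrative: `schema_or_dtype_fallback`, the dtype mapping, and the local `NotFound` class are hypothetical stand-ins and not the helpers used in test_client.py. The point is that a mocked `get_table()` has to raise, not return a table, for the dtype-based path to run.

```python
from unittest import mock


class NotFound(Exception):
    """Stand-in for google.api_core.exceptions.NotFound (assumption)."""


# Illustrative dtype-to-type mapping, not the library's actual table.
DTYPE_TO_TYPE = {"int64": "INTEGER", "float64": "FLOAT", "bool": "BOOLEAN"}


def schema_or_dtype_fallback(client, table_id, dtypes):
    """Hypothetical helper: prefer the table's schema, else derive from dtypes."""
    try:
        return list(client.get_table(table_id))
    except NotFound:
        # Only reached when the table lookup raises, as in the updated tests.
        return [(name, DTYPE_TO_TYPE.get(dtype)) for name, dtype in dtypes.items()]


dtypes = {"id": "int64", "score": "float64"}

# A mock that raises NotFound triggers schema generation from the dtypes.
missing_client = mock.Mock()
missing_client.get_table.side_effect = NotFound("table missing")
fallback_schema = schema_or_dtype_fallback(missing_client, "ds.scores", dtypes)

# A mock that returns a table short-circuits the fallback entirely.
existing_client = mock.Mock()
existing_client.get_table.return_value = [("id", "STRING")]
fetched_schema = schema_or_dtype_fallback(existing_client, "ds.scores", dtypes)
```

If the mock returned a table, as in the second case, the dtype mapping would never be exercised, which is why the tests had to raise the error.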
emar-kar pushed a commit to MaxxleLLC/google-cloud-python that referenced this pull request Sep 18, 2019
…9108)

Labels: api: bigquery, cla: yes
Linked issue: BigQuery: get table schema if not supplied (and have pyarrow) in load_table_from_dataframe
4 participants