ENH: Add ability to create datasets using the gbq module #11121
59b10cf
to
72ea8be
Compare
The build failed on Travis. I noticed an unusual error in the build log.
@parthea I restarted the build. Maybe it was a one-off failure.
4cf7ad3
to
a846076
Compare
Ready for review. All tests passed in my local environment,
@parthea Travis is fixed. Go ahead and rebase.
a846076
to
2487a45
Compare
What does create_dataset do? What is the difference between this and a table? list_table is OK.
Why is this needed for #11110?
A table belongs to a dataset. In order to create a table, you must either have an existing dataset or create a new dataset. Tables Datasets From https://cloud.google.com/bigquery/what-is-bigquery#tables, |
The |
Can you simply do this via the web interface? We are adding all of these functions for things that should be done outside pandas.
separately I think we need to make the API's more explicit, e.g.
should these be
? |
Yes, it can be done through the web interface. From an integration-testing point of view, though, it is easier to create datasets programmatically. Each time a test starts, a newly created dataset is used for testing, and that dataset is deleted programmatically after the test. One potential change would be to use the create and delete functions for unit testing only, without exposing the functionality globally.
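The per-test dataset lifecycle described in this comment could be sketched roughly as follows. This is only an illustration: `FakeDatasetClient` is a hypothetical in-memory stand-in, not the gbq module's API, and the method names `create`, `delete`, and `exists` are assumptions chosen for the example.

```python
import unittest
import uuid


class FakeDatasetClient(object):
    """Hypothetical in-memory stand-in for a BigQuery dataset client."""

    def __init__(self):
        self._datasets = set()

    def create(self, dataset_id):
        if dataset_id in self._datasets:
            raise ValueError("dataset {0} already exists".format(dataset_id))
        self._datasets.add(dataset_id)

    def delete(self, dataset_id):
        self._datasets.remove(dataset_id)

    def exists(self, dataset_id):
        return dataset_id in self._datasets


class DatasetLifecycleTest(unittest.TestCase):
    # One client shared by all tests in this class (assumed for the sketch).
    client = FakeDatasetClient()

    def setUp(self):
        # A fresh, uniquely named dataset for every test.
        self.dataset_id = "pandas_test_" + uuid.uuid4().hex
        self.client.create(self.dataset_id)

    def tearDown(self):
        # Remove the dataset again so nothing leaks between tests.
        self.client.delete(self.dataset_id)

    def test_dataset_exists_during_test(self):
        self.assertTrue(self.client.exists(self.dataset_id))
```

Keeping creation and deletion in `setUp` and `tearDown` means the helpers are exercised by the tests without having to be part of the public API.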
Yes, I agree this is much better. I will commit a new version.
why don't we just advertise the I would actually rename this to e.g.
|
Sounds good! I'll commit a new version soon.
2487a45
to
8a4ff80
Compare
8a4ff80
to
fca6876
Compare
Ready for review. All tests passed.
""" | ||
|
||
try: | ||
list_dataset_response = self.service.datasets().list(projectId=self.project_id).execute().get('datasets', |
use PEP on this, e.g.
...... = self.service.datasets().list(
.......).execute(......)
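A possible shape for the wrapped call, as a sketch only. The stub classes below are hypothetical stand-ins so the snippet runs without credentials, and the `None` default passed to `.get()` is assumed by analogy with the tables call in the same diff.

```python
class _StubRequest(object):
    """Hypothetical stand-in for an API request object."""

    def __init__(self, payload):
        self._payload = payload

    def execute(self):
        return self._payload


class _StubDatasets(object):
    def list(self, projectId):
        return _StubRequest({'datasets': [{'id': projectId + ':demo'}]})


class _StubService(object):
    def datasets(self):
        return _StubDatasets()


service = _StubService()
project_id = 'my-project'

# Break after the opening parenthesis so each line stays under 79 columns.
list_dataset_response = service.datasets().list(
    projectId=project_id).execute().get('datasets', None)
```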
projectId=self.project_id, | ||
datasetId=dataset_id).execute().get('tables', None) | ||
|
||
if not list_table_response: |
return []
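In other words, an empty or missing response would yield an empty list instead of `None`, so callers can always iterate over the result. A minimal sketch, assuming a hypothetical helper `extract_table_ids` and assuming each entry carries a `tableReference` with a `tableId`, as in the BigQuery `tables.list` response:

```python
def extract_table_ids(list_table_response):
    """Hypothetical helper: return table ids, or [] when there are none."""
    if not list_table_response:
        # Covers both None (key missing) and an empty list.
        return []
    return [table['tableReference']['tableId']
            for table in list_table_response]
```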
@parthea OK, small API change and a doc check. Ping when green. (Also confirm that this passes locally for you.)
c945510
to
7b00d8b
Compare
@jreback made the requested changes except for the generator, I don't want to rock the boat too much at the moment :) The indentation in api.rst looks OK on my end. My only concern at this point is that some public API was removed: I've just made a commit for some fixups for some broken tests, as well. Running all the gbq tests locally yields:
Ping me if there's anything else needed. cc: @parthea
@@ -111,9 +111,7 @@ Google BigQuery | |||
read_gbq | |||
to_gbq | |||
generate_bq_schema |
I think we should rename this as well. Can you deprecate generate_bq_schema, with a rename to generate_schema? (Rename the function and provide the original as a wrapper with the warning, which then calls the renamed function.)
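The rename-plus-wrapper pattern suggested here could look something like this. It is a sketch, not the code that was merged: the body of `generate_schema` is a placeholder, its single `columns` argument is an assumption, and the choice of `FutureWarning` is likewise assumed.

```python
import warnings


def generate_schema(columns):
    """Placeholder body (hypothetical): map column names to STRING fields."""
    return {'fields': [{'name': name, 'type': 'STRING'} for name in columns]}


def generate_bq_schema(columns):
    """Deprecated alias kept so existing callers keep working."""
    warnings.warn(
        "generate_bq_schema is deprecated; use generate_schema instead",
        FutureWarning, stacklevel=2)
    # Delegate to the renamed function so there is one implementation.
    return generate_schema(columns)
```

`stacklevel=2` makes the warning point at the caller's line, not at the wrapper itself.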
deprecation is not needed as I think this is not yet in a release? (only added for 0.17.0)
But I think this function is only useful in combination with a create_table function?
sorry, it is from 0.15.2 (it was just not yet included in the stable api docs)
Since we are making changes to generate_bq_schema, I was thinking generate_bq_schema could be moved to a private function and removed from the docs. I'm not sure it will be used by many users, and removing it would simplify the API.
sure. let's just deprecate it then (put a note in the gbq section) & remove from API.rst
These only existed in master. They have not been released, so this is ok. Just make the small change in the naming of |
@@ -231,7 +240,8 @@ def run_query(self, query, verbose=True): | |||
page_token = query_reply.get('pageToken', None) | |||
|
|||
if not page_token and current_row < total_rows: | |||
raise InvalidPageToken("Required pageToken was missing. Recieved {0} of {1} rows".format(current_row, total_rows)) | |||
raise InvalidPageToken("Required pageToken was missing. Recieved {0} of {1} rows".format(current_row, | |||
total_rows)) |
try to wrap this line
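One way to wrap it, sketched under assumptions: `InvalidPageToken` is redefined here as a plain exception and `check_page_token` is a hypothetical helper, both only so the snippet runs on its own. Adjacent string literals are joined before `.format` is applied, so the message is unchanged by the split.

```python
class InvalidPageToken(Exception):
    """Stand-in for the gbq exception of the same name."""


def check_page_token(page_token, current_row, total_rows):
    """Hypothetical helper wrapping the check from run_query."""
    if not page_token and current_row < total_rows:
        raise InvalidPageToken(
            "Required pageToken was missing. "
            "Received {0} of {1} rows".format(current_row, total_rows))
```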
d563645
to
008a9c9
Compare
@jreback all 53 tests pass locally again. Ping me if you'd like me to squash these commits. Will ping when Travis is green.
Got your comment about deprecating & removing after I had made the rename of |
gr8. pls squash as well. |
…1121 CLN: Make new API objects in the gbq module private and remove from documentation pandas-dev#11121 BUG: Handle GBQ datasets being empty and fix test_list_table test pandas-dev#11121 CLN: Deprecated generate_bq_schema in gbq module in favor of generate_schema pandas-dev#11121
008a9c9
to
5ba5375
Compare
All tests passing locally after squash. Will ping when Travis is done.
Merged via a5276cf
Deprecated since 0.17.0 xref pandas-devgh-11121
Removed the bq command line module from test_gbq.py, as it doesn't support Python 3. In order to do this, I had to create functions for create_dataset(), delete_dataset() and dataset_exists(). This change is required for #11110.
At the same time, I also implemented the following list functions: list_dataset() and list_table().