to_gbq: Allow creation of new tables from DataFrame (and generate schema) #8325
Comments
cc @jacobschaer - I think you're the main one to ask on this?
@jacobschaer for context - there's a Bloomberg Hackathon happening next Saturday, and I'm thinking this could be a good project for someone who uses BigQuery and/or pandas.
Sounds like something someone asked a while ago on Stack Overflow. See http://stackoverflow.com/questions/21886742/convert-pandas-dtypes-to-bigquery-type-representation
Yeah, it's certainly not that complicated, it just would make it easier for ...
I'm currently using pandas for a project I'm working on and would really like to see a new feature that allows users to create new tables in Google BigQuery. I would like to try developing this feature if no one else is working on it.
ENH: #8325 Add ability to create tables using the gbq module.
Small extension on top of `to_gbq` so that you can actually create new tables given only an existing DataFrame. Given an arbitrary `DataFrame` with a non-hierarchical index, create a schema from it. For now, we'd likely assume that `object` dtype columns are strings, and maybe allow specifying some or all columns of the schema so that int columns with nulls come out correctly (otherwise they'd be coerced to float columns because of NaN handling). E.g.:
Then you could do something like:
and with a named index, that could be added to the schema as well. For now, we could stick to requiring a non-hierarchical index (no `MultiIndex`), but maybe we could use record types for a `MultiIndex` in the future?
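The schema generation described above could be sketched roughly as follows. This is only an illustration, not the code that was eventually merged: the function name `sketch_bq_schema`, the `overrides` parameter, and the dtype-to-type mapping are all assumptions made for the example. It treats `object` columns as strings, lets the caller override individual columns (for instance an int column that became float because of NaN), includes a named index, and rejects a `MultiIndex`.

```python
import numpy as np
import pandas as pd

# Assumed mapping from NumPy dtype "kind" codes to BigQuery column types.
_KIND_TO_BQ = {
    "i": "INTEGER",
    "u": "INTEGER",
    "f": "FLOAT",
    "b": "BOOLEAN",
    "M": "TIMESTAMP",
    "O": "STRING",
}


def sketch_bq_schema(df, overrides=None):
    """Hypothetical helper: build a BigQuery-style schema dict from a DataFrame."""
    if isinstance(df.index, pd.MultiIndex):
        raise ValueError("MultiIndex is not supported in this sketch")
    overrides = overrides or {}
    # A named index becomes the leading column of the schema.
    frame = df.reset_index() if df.index.name is not None else df
    fields = []
    for name, dtype in frame.dtypes.items():
        # Caller-supplied types win; unknown dtypes fall back to STRING.
        bq_type = overrides.get(name) or _KIND_TO_BQ.get(dtype.kind, "STRING")
        fields.append({"name": str(name), "type": bq_type})
    return {"fields": fields}


df = pd.DataFrame(
    {"name": ["a", "b"], "count": [1.0, np.nan], "score": [0.5, 1.5]},
    index=pd.Index([10, 20], name="id"),
)
# "count" holds integers with a null, so pandas stores it as float;
# the override restores the intended INTEGER type.
schema = sketch_bq_schema(df, overrides={"count": "INTEGER"})
```

Passing only the columns that need correcting keeps the common case automatic while still letting nullable int columns be declared explicitly.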