to_gbq: Allow creation of new tables from DataFrame (and generate schema)

A small extension on top of `to_gbq` so that you can create new tables given only an existing DataFrame. Given an arbitrary `DataFrame` with a _non_-hierarchical index, generate a schema from it. For now, we'd likely assume that `object` dtype columns are strings, and maybe allow specifying the type of some or all columns in the schema so that int columns with nulls come out correctly (otherwise they'd be coerced to float columns, because pandas represents the missing values as NaN, which is a float).

E.g.:

``` python
In [6]: import pandas as pd

In [7]: import pandas.util.testing as testing

In [8]: df = testing.makeMixedDataFrame()

In [9]: df
Out[9]:
   A  B     C          D
0  0  0  foo1 2009-01-01
1  1  1  foo2 2009-01-02
2  2  0  foo3 2009-01-05
3  3  1  foo4 2009-01-06
4  4  0  foo5 2009-01-07

In [10]: df.dtypes
Out[10]:
A           float64
B           float64
C            object
D    datetime64[ns]
dtype: object
```

Then you could do something like:

``` python
In [11]: generate_bq_schema(df)
Out[11]:
{'fields': [{'name': 'A', 'type': 'FLOAT'},
  {'name': 'B', 'type': 'FLOAT'},
  {'name': 'C', 'type': 'STRING'},
  {'name': 'D', 'type': 'TIMESTAMP'}]}
```
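A minimal sketch of how such a function might look, assuming a simple lookup on the NumPy dtype `kind` code. The mapping table, the `overrides` parameter, and the `default_type` fallback are illustrative assumptions, not an existing pandas API:

``` python
import numpy as np
import pandas as pd

# Assumed mapping from NumPy dtype "kind" codes to BigQuery column types.
_KIND_TO_BQ_TYPE = {
    "i": "INTEGER",    # signed integers
    "u": "INTEGER",    # unsigned integers
    "b": "BOOLEAN",
    "f": "FLOAT",
    "O": "STRING",     # object columns are assumed to hold strings
    "M": "TIMESTAMP",  # datetime64
}


def generate_bq_schema(df, overrides=None, default_type="STRING"):
    """Build a BigQuery-style schema dict from the column dtypes of df.

    overrides (hypothetical parameter) maps column name -> BigQuery type and
    takes precedence over the inferred type, e.g. for int columns with nulls.
    """
    overrides = overrides or {}
    fields = []
    for name, dtype in df.dtypes.items():
        inferred = _KIND_TO_BQ_TYPE.get(dtype.kind, default_type)
        fields.append({"name": str(name), "type": overrides.get(name, inferred)})
    return {"fields": fields}


# A frame shaped like the one above, built by hand.
df = pd.DataFrame({
    "A": [0.0, 1.0, 2.0],
    "B": [0.0, 1.0, 0.0],
    "C": ["foo1", "foo2", "foo3"],
    "D": pd.to_datetime(["2009-01-01", "2009-01-02", "2009-01-05"]),
})
schema = generate_bq_schema(df)

# An int column with a missing value is stored as float64, so it is
# inferred as FLOAT unless the caller overrides it.
nullable = pd.DataFrame({"n": [1, np.nan, 3]})
inferred_schema = generate_bq_schema(nullable)
forced_schema = generate_bq_schema(nullable, overrides={"n": "INTEGER"})
```

Looking at the dtype alone keeps the sketch cheap (no scan of the data), at the cost of treating every `object` column as a string.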

and with a named index, that could be added to the schema as well. For now, we could stick to requiring a non-hierarchical index (i.e. no MultiIndex), but maybe we could use record types for a MultiIndex in the future?
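One way the index handling could be sketched, assuming a named index is simply promoted to a regular column before the schema is generated. The helper name `frame_with_index_columns` is hypothetical:

``` python
import pandas as pd


def frame_with_index_columns(df):
    """Return df with a named, flat index moved into a regular column.

    Hypothetical helper: a MultiIndex is rejected for now, and an unnamed
    index is left alone (it would not appear in the schema).
    """
    if isinstance(df.index, pd.MultiIndex):
        raise NotImplementedError("MultiIndex is not supported yet")
    if df.index.name is not None:
        # reset_index() inserts the index as the first column.
        return df.reset_index()
    return df


named = pd.DataFrame({"A": [0.0, 1.0]}, index=pd.Index(["x", "y"], name="key"))
flat = frame_with_index_columns(named)
flat_columns = list(flat.columns)

unnamed = pd.DataFrame({"A": [0.0, 1.0]})
unnamed_columns = list(frame_with_index_columns(unnamed).columns)
```

The resulting frame could then be passed to whatever schema generator is used, so the index column gets a type the same way as any other column. Note that `reset_index()` raises if the index name collides with an existing column name, which a real implementation would need to handle.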


to_gbq: Allow creation of new tables from DataFrame (and generate schema) #8325
