@@ -38,6 +38,7 @@ object.
 * ``read_json``
 * ``read_msgpack`` (experimental)
 * ``read_html``
+* ``read_gbq`` (experimental)
 * ``read_stata``
 * ``read_clipboard``
 * ``read_pickle``
@@ -51,6 +52,7 @@ The corresponding ``writer`` functions are object methods that are accessed like
 * ``to_json``
 * ``to_msgpack`` (experimental)
 * ``to_html``
+* ``to_gbq`` (experimental)
 * ``to_stata``
 * ``to_clipboard``
 * ``to_pickle``
@@ -2905,7 +2907,70 @@ There are a few other available functions:
 For now, writing your DataFrame into a database works only with
 **SQLite**. Moreover, the **index** will currently be **dropped**.
 
+Google BigQuery (Experimental)
+------------------------------
 
+The :mod:`pandas.io.gbq` module provides a wrapper for Google's BigQuery
+analytics web service to simplify retrieving results from BigQuery tables
+using SQL-like queries. Result sets are parsed into a pandas DataFrame
+with a shape derived from the source table. Additionally, DataFrames can
+be uploaded into BigQuery datasets as tables if the source datatypes are
+compatible with BigQuery ones. The general structure of this module and
+its provided functions are based loosely on those in :mod:`pandas.io.sql`.
+
+For specifics on the service itself, see https://developers.google.com/bigquery/
2923+
+As an example, suppose you want to load all data from an existing BigQuery
+table, ``test_dataset.test_table``, into a DataFrame.
2927+
+.. code-block:: python
+
+    from pandas.io import gbq
+    data_frame = gbq.read_gbq('SELECT * FROM test_dataset.test_table')
+
+The user will then be authenticated by the ``bq`` command line client -
+this usually involves the default browser opening to a login page,
+though the process can be done entirely from the command line if
+necessary. Datasets and additional parameters can be either configured
+with ``bq``, passed in as options to ``read_gbq``, or set using Google's
+gflags (this is not officially supported by this module, though care was
+taken to ensure that they should be followed regardless of how you call
+the method).
2941+
+Additionally, you can define which column to use as an index as well as
+a preferred column order as follows:
2943+
+.. code-block:: python
+
+    data_frame = gbq.read_gbq('SELECT * FROM test_dataset.test_table',
+                              index_col='index_column_name',
+                              col_order=['col1', 'col2', 'col3', ...])
+
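The effect of ``index_col`` and ``col_order`` on the returned frame can be illustrated locally with plain pandas, with no BigQuery call (the column names below are placeholders):

```python
import pandas as pd

# Stand-in for a result set as parsed from BigQuery (placeholder names).
raw = pd.DataFrame({'index_column_name': [0, 1],
                    'col1': ['a', 'b'],
                    'col2': [1.5, 2.5]})

# Roughly what index_col and col_order do to the parsed result:
# one column becomes the index, the rest are reordered as requested.
shaped = raw.set_index('index_column_name')[['col2', 'col1']]
```
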
+Finally, if you would like to create a BigQuery table, ``my_dataset.my_table``,
+from the rows of a DataFrame, ``df``:
+
+.. code-block:: python
+
+    df = pandas.DataFrame({'string_col_name': ['hello'],
+                           'integer_col_name': [1],
+                           'boolean_col_name': [True]})
+    schema = ['STRING', 'INTEGER', 'BOOLEAN']
+    data_frame = gbq.to_gbq(df, 'my_dataset.my_table',
+                            if_exists='fail', schema=schema)
+
+To add more rows to this, simply:
+
+.. code-block:: python
+
+    df2 = pandas.DataFrame({'string_col_name': ['hello2'],
+                            'integer_col_name': [2],
+                            'boolean_col_name': [False]})
+    data_frame = gbq.to_gbq(df2, 'my_dataset.my_table', if_exists='append')
+
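The ``schema`` list must line up with the DataFrame's columns, one type per column in order. As a rough sketch of how one could be derived from the dtypes (the ``infer_gbq_schema`` helper below is hypothetical, not part of :mod:`pandas.io.gbq`):

```python
import pandas as pd

# Map NumPy dtype kind codes to the BigQuery type names used in the
# ``schema`` argument.  This covers only the simple cases.
_KIND_TO_BQ = {'O': 'STRING',   # object columns typically hold strings
               'i': 'INTEGER', 'u': 'INTEGER',
               'f': 'FLOAT', 'b': 'BOOLEAN'}

def infer_gbq_schema(df):
    """Return a BigQuery schema list, one entry per column, in order."""
    return [_KIND_TO_BQ[df[col].dtype.kind] for col in df.columns]

df = pd.DataFrame({'string_col_name': ['hello'],
                   'integer_col_name': [1],
                   'boolean_col_name': [True]})
schema = infer_gbq_schema(df)
```
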
+
+.. note::
+
+   There is a hard cap on BigQuery result sets of 128MB compressed. Also,
+   the BigQuery SQL query language has some oddities; see
+   https://developers.google.com/bigquery/query-reference
+
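One way to stay under that cap is to push aggregation into the query itself rather than pulling the raw table; a minimal sketch, reusing the placeholder table name from above:

```python
# Aggregate server-side so only the summary crosses the wire; the
# dataset, table, and column names here are placeholders.
query = ("SELECT col1, COUNT(*) AS n "
         "FROM test_dataset.test_table "
         "GROUP BY col1")
# data_frame = gbq.read_gbq(query)  # requires BigQuery credentials to run
```
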
 STATA Format
 ------------
 