Support INTERVAL data type #826
Comments
Full pandas / arrow support may have to wait. Getting an error from the BQ Storage API when I added
to the |
Query parameter support doesn't appear to be implemented yet in the backend. I've filed internal issue 195050789. |
I've filed internal issue 195051077 to support ISO 8601 Duration. |
Issue 195051077 is now resolved. Both |
Hi, @tswast - would you be able to summarize the status of this request? Are there outstanding blockers or is it just a question of prioritization? My team would love to be able to support |
@tboddyspargo Last I checked, the BQ Storage API now sends back data of this type: https://arrow.apache.org/docs/python/generated/pyarrow.month_day_nano_interval.html, which accounts for the calendar-based intervals that BigQuery supports. The easiest thing for pandas-gbq and this package to do would be to wrap that type in a pandas ArrowDtype. Would love to see a PR for that at some point.
The main tricky part is that we need to transform the response from the REST API into that same arrow type. I believe that requires updating the type mappings here: https://github.com/googleapis/python-bigquery/blob/main/google/cloud/bigquery/_pyarrow_helpers.py and the type mappers here: https://github.com/googleapis/python-bigquery/blob/main/google/cloud/bigquery/_pandas_helpers.py
For context: pandas is making a lot of progress making the ArrowDtype type act like other timestamp/timedelta operations, but I'm not aware of any work in pandas to support |
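To illustrate the REST-to-Arrow transformation mentioned above, here is a minimal sketch of folding interval components into the months / days / nanoseconds layout that `pyarrow.month_day_nano_interval` stores. The names `MonthDayNanoTriple` and `to_month_day_nano` are hypothetical and are not part of python-bigquery or pyarrow. The sketch assumes each component already carries its own sign.

```python
from typing import NamedTuple


class MonthDayNanoTriple(NamedTuple):
    """Hypothetical stand-in for the (months, days, nanoseconds) layout."""

    months: int
    days: int
    nanoseconds: int


def to_month_day_nano(
    years: int = 0,
    months: int = 0,
    days: int = 0,
    hours: int = 0,
    minutes: int = 0,
    seconds: int = 0,
    microseconds: int = 0,
) -> MonthDayNanoTriple:
    """Fold signed interval components into three independent fields.

    Months, days, and sub-day time are never merged into one another,
    because a month or a day has no fixed length in smaller units.
    """
    total_months = years * 12 + months
    total_seconds = (hours * 60 + minutes) * 60 + seconds
    nanoseconds = total_seconds * 1_000_000_000 + microseconds * 1_000
    return MonthDayNanoTriple(total_months, days, nanoseconds)


triple = to_month_day_nano(
    years=1, months=2, days=3, hours=4, minutes=5, seconds=6, microseconds=789_000
)
print(triple)
```

Keeping the three fields separate mirrors the calendar-based semantics described in the comment: only years fold into months, and only hours, minutes, and seconds fold into the sub-day field.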
For the write side, we might need some backend changes. I'm not seeing INTERVAL listed in the parquet data types here: https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-parquet#parquet_conversions |
The work to be done to get closer to interval support here:
I plan to complete this work in the 3 weeks before the end of my internship. If I am unable to, this work can be taken up by anyone else interested in adding this support. |
"BigQuery now supports the INTERVAL type, which represents a duration or an amount of time. This type is in Preview."
https://cloud.google.com/bigquery/docs/release-notes#July_27_2021
An INTERVAL object represents duration or amount of time. Interval is composed of three independent parts:
[sign]Y-M: Years and Months
[sign]D: Days
[sign]H:M:S.F: Hours, Minutes, Seconds and Subseconds.

Canonical format
Y: Year
M: Month
D: Day
H: Hour
M: Minute
S: Second
[.F]: Up to six fractional digits (microsecond precision)

TODO:
list_rows() (feat: add support for INTERVAL data type to list_rows #840)
insert_rows
Edit: Removed pandas, arrow, db-api support in favor of #836, as those implementations are currently blocked on Arrow and the BQ Storage API.
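As a rough illustration of the three-part format described above, the following sketch parses a string of the form `[sign]Y-M [sign]D [sign]H:M:S[.F]`. That single-string layout is an assumption based on the BigQuery documentation, since the canonical format line itself is not reproduced in this issue. `ParsedInterval` and `parse_canonical_interval` are hypothetical names, not the implementation in #840.

```python
import re
from typing import NamedTuple


class ParsedInterval(NamedTuple):
    """Hypothetical container: each part keeps its own sign."""

    months: int        # years * 12 + months
    days: int
    microseconds: int  # hours, minutes, seconds, and fraction combined


# Assumed layout: [sign]Y-M [sign]D [sign]H:M:S[.F], F up to six digits.
_INTERVAL_RE = re.compile(
    r"^(?P<ym_sign>[+-]?)(?P<years>\d+)-(?P<months>\d+)"
    r" (?P<d_sign>[+-]?)(?P<days>\d+)"
    r" (?P<t_sign>[+-]?)(?P<hours>\d+):(?P<minutes>\d+):(?P<seconds>\d+)"
    r"(?:\.(?P<fraction>\d{1,6}))?$"
)


def parse_canonical_interval(text: str) -> ParsedInterval:
    match = _INTERVAL_RE.match(text)
    if match is None:
        raise ValueError(f"not a canonical interval string: {text!r}")

    def sign(group: str) -> int:
        return -1 if match[group] == "-" else 1

    months = sign("ym_sign") * (int(match["years"]) * 12 + int(match["months"]))
    days = sign("d_sign") * int(match["days"])

    # Right-pad the fraction so that ".5" means 500000 microseconds.
    fraction = (match["fraction"] or "").ljust(6, "0")
    whole_seconds = (
        int(match["hours"]) * 60 + int(match["minutes"])
    ) * 60 + int(match["seconds"])
    microseconds = sign("t_sign") * (whole_seconds * 1_000_000 + int(fraction))

    return ParsedInterval(months, days, microseconds)


print(parse_canonical_interval("1-2 3 4:5:6.789"))
```

The sketch keeps the year-month, day, and time parts separate rather than collapsing them into a single `datetime.timedelta`, because the parts are independent and a month has no fixed number of days.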