convert time columns to dbtime by default in to_dataframe
#862
Comments
Why Do people use BQ TIME columns to represent time deltas? If we're looking for a way to express timedeltas, I wonder if it might be better to leverage structs somehow. |
Is there a
|
Note: this is related to the expected dtypes in python-bigquery/tests/unit/test_table_pandas.py, lines 99 to 111 at eed311e
|
Looking at that, I think That's what #861 is about -- picking the right dtype for date columns by default. |
Well, I was thinking of |
Sure, but would BQ users expect
🤷 In Python, you can't add |
A datetime/timestamp represents an instantaneous point in time. A date represents a time interval (typically from midnight to midnight). Mapping a date to a datetime is similar, if less egregious :), to mapping a month or year to a datetime.
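The instant-versus-interval distinction is visible in Python's standard `datetime` module. The sketch below is only an illustration of stdlib behavior, not code from this repository, and the sample values are made up: a `date` and a `time` cannot be added with `+`, they have to be combined explicitly, and adding a `timedelta` to a `date` ignores the sub-day part.

```python
import datetime

d = datetime.date(2021, 8, 9)   # a calendar day (sample value)
t = datetime.time(1, 2, 3)      # a time of day (sample value)

# date + time is not defined in the standard library and raises TypeError.
try:
    d + t
    date_plus_time_works = True
except TypeError:
    date_plus_time_works = False

# The explicit way to build a point in time from the two parts.
combined = datetime.datetime.combine(d, t)

# date + timedelta returns a date; hours, seconds and microseconds are ignored.
same_day = d + datetime.timedelta(hours=5)
next_day = d + datetime.timedelta(days=1, hours=5)
```

Only `datetime.datetime.combine` yields a full datetime here, which is why the choice of type for a TIME column affects how easily it can be joined with a DATE column.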
Sure, but "right" is somewhat relative, and sometimes hard to pin down. Is rightness judged by accuracy or convenience? |
Anyway, I'll implement whatever you wish. :) |
@tswast at your convenience, please let me know what you want me to do or if you'd like to discuss. |
Some thoughts on the pandas connector:
In favor of timedelta:
In favor of datetime.time (object dtype):
Confusing parts:
Conclusion
Longer-term, we should probably make a native |
Unless it turns out to be relatively easy to make a |
Pro timedelta: pandas-dev/pandas#10329 -- folks do want to add them... |
I think adding a time dtype will be straightforward using the newish extension types ( |
In that issue, they acknowledge that "I should have converted it to a timedelta" and are mainly complaining about the error message. Just because someone expects something to work doesn't mean it should. :) IMO, it would be most natural to convert interval data to timedelta; however, see #949 and #950. IMO it should also be possible to combine date and time values to create datetime values. |
Reopening, as I notice that when I check out v3 locally, I don't always get
|
Good idea. I'll give that a try. |
I get the same error with 3d1af95. Seeing as we have a test for this dtype at https://github.com/googleapis/python-bigquery/blob/v3/tests/system/test_pandas.py#L1033, I'll look more closely in googleapis/python-bigquery-pandas#444 at why I'm getting this. Possibly pandas-gbq is doing something to cast |
We are explicitly casting to |
Currently TIME columns are just exposed as string objects. This would be a better experience and align better with the expectations for working with time series in pandas: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html
Presumably one could combine a date column with a time column to create a datetime by adding them.
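A minimal sketch of that idea in plain pandas, assuming the DATE column arrives as `datetime64` and the TIME column as `timedelta64`. The column names and values are hypothetical, and this does not exercise the BigQuery client or the `dbtime` dtype named in the title.

```python
import pandas as pd

# Hypothetical frame standing in for a query result with a DATE and a TIME column.
df = pd.DataFrame(
    {
        "event_date": pd.to_datetime(["2021-08-09", "2021-08-10"]),
        "event_time": pd.to_timedelta(["01:02:03", "23:59:59"]),
    }
)

# datetime64 + timedelta64 is defined element-wise and yields datetime64.
df["event_datetime"] = df["event_date"] + df["event_time"]
```

With a TIME column held as `datetime.time` objects (object dtype), the same `+` would not work, so the values would first need converting to timedeltas.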