to_dataframe dtypes argument could allow functions that take pandas.Series-like object and return new series #807

Closed
tswast opened this issue Jul 23, 2021 · 2 comments
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@tswast
Contributor

tswast commented Jul 23, 2021

Is your feature request related to a problem? Please describe.

Some timestamp-related methods don't quite work with the current string/dtype object logic. For example, I'm having a hard time getting TIME data types to parse as the timedelta64[ns] dtype.

I also think this feature would be useful for extension dtypes like those from Fletcher (https://github.com/xhochy/fletcher) and GeoPandas (https://geopandas.org/).

Describe the solution you'd like

Look at the dtypes dictionary. If any values are functions / callables, save those for later. Create the initial DataFrame using the other dtypes, then follow up by calling the saved transformation functions on their columns.
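A rough sketch of how that two-step flow might look, using plain pandas rather than the library itself. `apply_dtypes` and `time_to_timedelta` are hypothetical names made up for illustration, and the sketch assumes TIME values arrive as non-null `datetime.time` objects. One wrinkle it surfaces: classes such as `float` are themselves callable, so a bare `callable()` check would misclassify ordinary dtypes.

```python
import datetime

import pandas as pd


def apply_dtypes(frame, dtypes):
    """Hypothetical helper, not part of google-cloud-bigquery."""
    # Classes such as `float` are callable too, so treat only non-class
    # callables as transformation functions.
    transforms = {
        name: value
        for name, value in dtypes.items()
        if callable(value) and not isinstance(value, type)
    }
    plain = {name: value for name, value in dtypes.items() if name not in transforms}

    # Step 1: build the frame using only the ordinary dtypes.
    result = frame.astype(plain) if plain else frame.copy()

    # Step 2: apply each saved callable to its column.
    for name, func in transforms.items():
        result[name] = func(result[name])
    return result


def time_to_timedelta(series):
    # Assumes non-null datetime.time values; str() gives "HH:MM:SS[.ffffff]".
    return pd.to_timedelta(series.astype(str))


frame = pd.DataFrame(
    {"n": [1, 2], "t": [datetime.time(1, 0), datetime.time(0, 30)]}
)
converted = apply_dtypes(frame, {"n": "float64", "t": time_to_timedelta})
```

Here `converted["t"]` ends up with a timedelta dtype while `converted["n"]` goes through the usual `astype` path.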

Describe alternatives you've considered

Keep dtypes a simple pass-through to pandas and either:

  • Let the developer call their own transformation functions.
  • Have a second argument for transformations.

Additional context

Possibly useful when setting up default dtypes #786 #793

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Jul 23, 2021
@tswast tswast self-assigned this Jul 23, 2021
@tswast tswast added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Jul 23, 2021
@tswast
Contributor Author

tswast commented Jul 23, 2021

On second thought, I think keeping dtypes simple is more important. We can have some custom transformations for default dtypes without this feature.

@tswast tswast closed this as completed Jul 23, 2021
@tswast
Contributor Author

tswast commented Jul 23, 2021

On third thought, for Fletcher something like this could be useful, but the transformation function should take the Arrow buffer(s) and return the pandas Series. Definitely shouldn't use dtypes for this, though.
