DEPR: rename 'dtype_backend' #58214
Comments
I like
Is there evidence that users would not be confused if it was called e.g. I feel like this is something that would happen eventually as long as the numpy/arrow dtypes shared names (e.g. "int64" vs "int64[pyarrow]").
I don't understand the question. We haven't used any other terms... "backend" has connotations of swappability and an invariant frontend that wouldn't apply to other terms.
I'm asking since renaming a parameter causes a lot of code churn. For me, personally, it is not clear what a dtype family or flavor is, while dtype backend gives me the understanding that the underlying arrays backing my Series/DataFrame are arrow/numpy/whatever. So, IMO, dtype_backend is clearer than the other terms.
I guess the [citation needed] part was what I was asking for in my previous question. If you could dig that up, that'd be really helpful.
Totally reasonable concern. My thought is that ATM this is used relatively little, so is easier to change than it would be after #58141 and related.
Also fair. I think there was a lot of confusion surfaced in https://www.reddit.com/r/Python/comments/11fio85/we_are_the_developers_behind_pandas_currently/ about what "backend" means. I remember other things on Hacker News that I'm not inclined to dig up. Searching our issues for "backend", I see #53154 has a user expecting identical behavior. I'll update this as I find more of these, as I think "incorrectly expecting identical behavior" is a common complaint.
I also initially agree with @lithomas1's question here. I'm not fully convinced (yet) that renaming a keyword argument would be able to convey "pick a dtype implementation that is not fully equivalent to the other options". I am open to there being a better term though.
#58307 is another case of incorrectly expecting identical behavior.
Personally, I think this is actually the correct impression. It's how I think most users should think about the backends (so in that sense I don't have a problem with the current naming). I know that in practice this is of course not correct in all cases right now, but it could be what we want it to be eventually. And so whenever we get a report about different behaviours, it might be something we should fix. It's something that we should discuss and spell out, though: what we generally think the expectations should be about those different backends (maybe as part of the PDEP discussion in #58455).
Reading the room, I'm going to learn to live with users continuing to be confused by this name. Closing.
This came up in #58141. Discussed briefly at the sprint in August.
I've seen some user confusion [citation needed] stemming from the term "backend" in the "dtype_backend" parameter. It gives the incorrect impression that behaviors are the same across backends, just with different implementations or performance characteristics.
I think we should move away from "backend", renaming the dtype_backend parameter where applicable (with a deprecation cycle where appropriate). Maybe dtype "family"?
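For readers unfamiliar with the parameter, here is a minimal sketch of the kind of behavior difference being described. It is not taken from the thread: the CSV contents and variable names are made up for illustration, and it assumes pandas 2.0 or later. It uses the "numpy_nullable" option rather than "pyarrow" only so that it runs without pyarrow installed.

```python
import io

import pandas as pd

# Hypothetical sample data: column "a" has a missing value in the second row.
CSV = "a,b\n1,x\n,y\n3,z\n"

# Default reader: NumPy dtypes. The missing integer forces float64 with NaN.
default = pd.read_csv(io.StringIO(CSV))

# Same data, same call, different dtype_backend: nullable extension dtypes.
nullable = pd.read_csv(io.StringIO(CSV), dtype_backend="numpy_nullable")

print(default["a"].dtype)   # float64
print(nullable["a"].dtype)  # Int64

# The difference is not only in storage: comparisons treat the missing value
# differently. NaN compares as False, while pd.NA propagates as <NA>.
default_eq = default["a"] == 1
nullable_eq = nullable["a"] == 1
print(default_eq.dtype, default_eq.tolist())
print(nullable_eq.dtype, nullable_eq.tolist())
```

The two frames come from identical input, yet a filter such as `df[df["a"] == 1]` goes through a plain bool mask in one case and a nullable boolean mask in the other, which is the sort of divergence a user expecting a swappable "backend" might not anticipate.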