Skip to content

ENH: include conversion to nullable float in convert_dtypes() #38117

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Nov 29, 2020

Conversation

jorisvandenbossche
Copy link
Member

xref #38110

There is one potentially corner case: what with floats that are all "integer"-like? I think we want to keep returning nullable int for that, at least by default, and that is what I did now in this PR But we might want to add a parameter controlling that behaviour? (but that can also be added later on if there is demand for it)

@jorisvandenbossche
Copy link
Member Author

jorisvandenbossche commented Nov 27, 2020

Also, note, this builds upon #38113. Once that is merged, the changes I needed to do to the test cases will become clear. #38113 is merged

@jorisvandenbossche jorisvandenbossche added this to the 1.2 milestone Nov 27, 2020
@jorisvandenbossche jorisvandenbossche added the NA - MaskedArrays Related to pd.NA and nullable extension arrays label Nov 28, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is after the feature freeze, so moving off 1.2

@jreback jreback removed this from the 1.2 milestone Nov 28, 2020
@jorisvandenbossche
Copy link
Member Author

liking less and less the fact that we have many options here.

would change to an include=, exclude interface. (might involve deprecating the existing options). am -1 on doing this for 1.2 for this reason.

We had a long discussion about this in the original PR (#30929), and I want to remind that it was actually you who asked for the separate options .. (I personally think we should not have added ay of those options in the first place)

If we want to change the interface of convert_dtypes, we can do that later as well. I don't think that needs to block this PR. I could also leave out the option for now (and only have the default behaviour of True)

@jreback
Copy link
Contributor

jreback commented Nov 28, 2020

and am happy to revisit but not at the 11hour

@jorisvandenbossche
Copy link
Member Author

I am happy with leaving out the option, but I think we should still include the actual functionality in 1.2. The float EA array already is included in 1.2, this just enables it in the convert_dtypes (as is documented that this function will change when new nullable dtypes get added).

@jorisvandenbossche jorisvandenbossche added this to the 1.2 milestone Nov 28, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok this is fine (we should change the api but that's a future issue, do we have one?)

is there a test for convert_floating=True & convert_integer=True and it converts floats to ints?

otherwise lgtm

@@ -6173,7 +6182,7 @@ def convert_dtypes(
>>> dfn = df.convert_dtypes()
>>> dfn
a b c d e f
0 1 x True h 10 NaN
0 1 x True h 10 <NA>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth mentioning here that this changed in 1.2

@jorisvandenbossche
Copy link
Member Author

we should change the api but that's a future issue, do we have one?

No, but I will open one.

is there a test for convert_floating=True & convert_integer=True and it converts floats to ints?

Yes, all possible combo's of all parameters are tested, and we have cases of both integer-like floats as well as actual floats that can't be casted to int

@jreback jreback merged commit 4a35f2d into pandas-dev:master Nov 29, 2020
@jreback
Copy link
Contributor

jreback commented Nov 29, 2020

thanks @jorisvandenbossche

@jorisvandenbossche jorisvandenbossche deleted the floating-convert-dtypes branch November 29, 2020 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NA - MaskedArrays Related to pd.NA and nullable extension arrays
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants