ENH: include conversion to nullable float in convert_dtypes() #38117

jorisvandenbossche · 2020-11-27T17:07:08Z

There is one potential corner case: what to do with floats that are all "integer"-like? I think we want to keep returning nullable int for those, at least by default, and that is what this PR does now. We might want to add a parameter to control that behaviour, but that can also be added later if there is demand for it.
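To make the corner case concrete, here is a minimal sketch assuming pandas 1.2 or later (where this PR landed). The series names `whole` and `fractional` are illustrative only and not taken from the PR.

```python
import pandas as pd

# float64 input whose non-missing values are all whole numbers
whole = pd.Series([1.0, 2.0, None])
# float64 input containing a value with a fractional part
fractional = pd.Series([1.5, 2.0, None])

# With the defaults, integer-like floats still become nullable integers,
# while floats with a fractional part become nullable floats.
whole_dtype = str(whole.convert_dtypes().dtype)
fractional_dtype = str(fractional.convert_dtypes().dtype)

print(whole_dtype)       # Int64
print(fractional_dtype)  # Float64
```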

jorisvandenbossche · 2020-11-27T17:07:51Z

~~Also, note, this builds upon #38113. Once that is merged, the changes I needed to do to the test cases will become clear.~~ #38113 is merged

jreback

this is after the feature freeze, so moving off 1.2

pandas/core/dtypes/cast.py

jorisvandenbossche · 2020-11-28T21:13:48Z

> liking less and less the fact that we have many options here.
>
> would change to an include=, exclude interface. (might involve deprecating the existing options). am -1 on doing this for 1.2 for this reason.

We had a long discussion about this in the original PR (#30929), and I want to remind you that it was actually you who asked for the separate options. (I personally think we should not have added any of those options in the first place.)

If we want to change the interface of convert_dtypes, we can do that later as well. I don't think that needs to block this PR. I could also leave out the option for now (and only have the default behaviour of True)

jreback · 2020-11-28T21:18:07Z

I am happy to revisit this, but not at the 11th hour.

jorisvandenbossche · 2020-11-28T22:08:45Z

I am happy with leaving out the option, but I think we should still include the actual functionality in 1.2. The nullable float extension array is already included in 1.2; this just enables it in convert_dtypes (it is documented that this function will change when new nullable dtypes get added).

jreback

ok this is fine (we should change the api but that's a future issue, do we have one?)

is there a test for convert_floating=True & convert_integer=True and it converts floats to ints?

otherwise lgtm

jreback · 2020-11-29T17:19:34Z

pandas/core/generic.py

@@ -6173,7 +6182,7 @@ def convert_dtypes(
        >>> dfn = df.convert_dtypes()
        >>> dfn
           a  b      c     d     e      f
-        0  1  x   True     h    10    NaN
+        0  1  x   True     h    10   <NA>


might be worth mentioning here that this changed in 1.2
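For readers comparing the two docstring outputs, the change can be reproduced with a sketch like the following, assuming pandas 1.2 or later. The two-column frame is a cut-down stand-in for the docstring example, not a copy of it.

```python
import numpy as np
import pandas as pd

# "e" holds integer-like floats, "f" holds floats with a fractional part;
# both columns contain a missing value.
df = pd.DataFrame({"e": [10, np.nan, 20], "f": [np.nan, 100.5, 200.0]})
dfn = df.convert_dtypes()

# Before conversion the missing value in "f" is a float NaN; afterwards the
# column uses the nullable Float64 dtype and the missing value is pd.NA.
print(df.dtypes)
print(dfn.dtypes)
print(dfn)
```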

jorisvandenbossche · 2020-11-29T17:28:01Z

> we should change the api but that's a future issue, do we have one?

No, but I will open one.

> is there a test for convert_floating=True & convert_integer=True and it converts floats to ints?

Yes, all possible combinations of the parameters are tested, and we have cases of both integer-like floats and actual floats that can't be cast to int.
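As a rough illustration of how the two flags interact (this is not the test code from the PR), the sketch below loops over the four combinations of `convert_integer` and `convert_floating` for an integer-like and a fractional float series. It assumes pandas 1.2 or later; the `results` dict and the series names are made up for this example.

```python
import itertools

import pandas as pd

whole = pd.Series([1.0, 2.0, None])       # integer-like floats
fractional = pd.Series([1.5, 2.0, None])  # floats that cannot be cast to int

# Map (convert_integer, convert_floating) -> (dtype for whole, dtype for fractional)
results = {}
for convert_integer, convert_floating in itertools.product([True, False], repeat=2):
    kwargs = {"convert_integer": convert_integer, "convert_floating": convert_floating}
    results[(convert_integer, convert_floating)] = (
        str(whole.convert_dtypes(**kwargs).dtype),
        str(fractional.convert_dtypes(**kwargs).dtype),
    )

for flags, dtypes in results.items():
    print(flags, dtypes)
```

With both flags set to True, the integer-like series should come back as `Int64` and the fractional one as `Float64`; turning off `convert_integer` keeps integer-like floats as `Float64`.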

pandas/tests/series/methods/test_convert_dtypes.py

jreback · 2020-11-29T19:12:58Z

thanks @jorisvandenbossche

jorisvandenbossche added 8 commits November 27, 2020 15:25

add float to convert_dtypes

90bcb4d

fix convert_floating

96b9b07

TST: rewrite convert_dtypes test to make it easier extendable

bf87361

formatting

e2e6bdc

update tests

e411e65

fix implementation for integer-like floats

031acaa

update docs

bf95155

test + fix case of float32 as input

3106644

jorisvandenbossche added this to the 1.2 milestone Nov 27, 2020

Merge remote-tracking branch 'upstream/master' into floating-convert-…

80f8ee9

…dtypes

jorisvandenbossche added the NA - MaskedArrays Related to pd.NA and nullable extension arrays label Nov 28, 2020

jreback requested changes Nov 28, 2020

pandas/core/dtypes/cast.py

jreback removed this from the 1.2 milestone Nov 28, 2020

jorisvandenbossche added this to the 1.2 milestone Nov 28, 2020

jreback requested changes Nov 29, 2020

jreback reviewed Nov 29, 2020

jorisvandenbossche commented Nov 29, 2020

pandas/tests/series/methods/test_convert_dtypes.py

add note about changed behaviour

693a0a4

jreback approved these changes Nov 29, 2020

jreback merged commit 4a35f2d into pandas-dev:master Nov 29, 2020

jorisvandenbossche deleted the floating-convert-dtypes branch November 29, 2020 19:29

jorisvandenbossche mentioned this pull request Nov 30, 2020

Follow-up on basic FloatingArray implementation #38110

Open

10 tasks
