-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
fix column_arrays for array manager #45001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -794,7 +794,14 @@ def column_arrays(self) -> list[ArrayLike]: | |||
""" | |||
Used in the JSON C code to access column arrays. | |||
""" | |||
return self.arrays | |||
|
|||
def convert_array(arr: ArrayLike) -> ArrayLike: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract_array ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't work to convert the ExtensionArray
to a numpy array:
In [1]: import pandas as pd
In [2]: pd.__version__
Out[2]: '1.4.0.dev0+1440.g5ee2c0229b.dirty'
In [3]: from pandas.core.construction import extract_array
In [4]: a=pd.array([1,2,3])
In [5]: a
Out[5]:
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
In [6]: extract_array(a, True)
Out[6]:
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
That may be a bug in the implementation of extract_array()
. Should I fix that and use it instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That may be a bug in the implementation of extract_array() . Should I fix that and use it instead?
No, that is behaving as intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That may be a bug in the implementation of extract_array() . Should I fix that and use it instead?
No, that is behaving as intended.
OK, the docs aren't clear, because the parameter extract_numpy=True
then only works for numpy backed extension arrays, otherwise it is ignored for other EA types (e.g., Categorical
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numpy backed extension arrays
yep, that is poor wording on the 'obj' arg; the 'extract_numpy' part of the docstring gets it right. ill write myself a note to fix the wording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why isn't np.asarray
the right answer here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why isn't np.asarray the right answer here?
That would work, yes.
@jreback I think this is good to go - all failures are related to the |
@Dr-Irv can you merge master |
Done. One of the tests timed out - should I try again? |
umm can you just use |
An |
ok can u see if we repeat this code elsewhere iirc o have seen this a few times and don't want to reinvent the wheel yet again |
There is one place:
So a solution would be to move |
yep let's do it, though I think you can leave the convert_array function in |
I will await the suggested location from @jbrockmendel before making that change |
ExtensionBlock.values_for_json just uses Just use np.asarray and be done with it. |
OK that is in commit 178b2dd |
@jbrockmendel @jreback this is now all green and just uses |
thanks @Dr-Irv |
Note - I was doing this in parallel with @phofl skipping the JSON tests for the array manager. So I undid those changes so that the JSON code tests with array manager are done.