DEPR: Deprecate NDFrame.as_matrix #18458
d981cb8 to 7fd4b71
Codecov Report
@@            Coverage Diff             @@
##           master   #18458      +/-   ##
==========================================
+ Coverage    91.3%   91.32%   +0.02%
==========================================
  Files         163      163
  Lines       49781    49783       +2
==========================================
+ Hits        45451    45464      +13
+ Misses       4330     4319      -11
Continue to review full report at Codecov.
Thanks for the PR! Left some comments
doc/source/whatsnew/v0.22.0.txt
Outdated
@@ -82,7 +82,7 @@ Deprecations
~~~~~~~~~~~~

- ``Series.from_array`` and ``SparseSeries.from_array`` are deprecated. Use the normal constructor ``Series(..)`` and ``SparseSeries(..)`` instead (:issue:`18213`).
-
- ``NDFrame.as_matrix`` is deprecated. Use ``NDFrame.values`` instead (:issue:`18458`).
Can you use `DataFrame` instead of `NDFrame`? (Basically, `NDFrame` should never show up in the docs; it is an internal implementation detail, although there are some places where it leaks through.)
pandas/core/generic.py
Outdated
@@ -3735,6 +3735,9 @@ def _get_bool_data(self):

def as_matrix(self, columns=None):
"""
DEPRECATED: This method will be removed in a future version.
Use :meth:`NDFrame.values` instead.
same comment here
pandas/core/generic.py
Outdated
@@ -3770,6 +3773,8 @@ def as_matrix(self, columns=None):
--------
pandas.DataFrame.values
"""
warnings.warn("This method will be removed in a future version. "
"Use ``.values`` instead.")
This needs to be a `FutureWarning`, and you also need to set a stacklevel (this is a top-level method, so `stacklevel=2` should be the correct one).
Can you also mention `as_matrix` explicitly in the message? That is clearer, as you don't always directly see which line, or which method on a given line, is causing the warning.
Please also follow the typical message format, like "'as_matrix' is deprecated and will be removed in a future version. Use .."
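The two review comments above ask for a `FutureWarning`, `stacklevel=2`, and a message that names the deprecated method. A minimal sketch of that pattern follows. The `Frame` class and the `caller` function are hypothetical stand-ins written for illustration (they are not pandas code), and the tail of the message is taken from the original `warnings.warn` call in the diff.

```python
import warnings


class Frame:
    """Hypothetical stand-in for a DataFrame; not the pandas implementation."""

    def __init__(self, rows):
        self._rows = [list(row) for row in rows]

    @property
    def values(self):
        # Return a copy of the underlying rows.
        return [list(row) for row in self._rows]

    def as_matrix(self):
        warnings.warn(
            "'as_matrix' is deprecated and will be removed in a future "
            "version. Use .values instead.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.values


def caller():
    # With stacklevel=2 the warning is reported against this line,
    # not against the warnings.warn call inside as_matrix.
    return Frame([[1, 2], [3, 4]]).as_matrix()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = caller()
```

With `stacklevel=1` (the default) the reported location would be inside `as_matrix` itself, which tells the user nothing about which of their own lines needs changing.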
@@ -243,9 +243,9 @@ def test_itertuples(self):
def test_len(self):
assert len(self.frame) == len(self.frame.index)

def test_as_matrix(self):
def test_values(self):
Can you leave one test (or add a `test_as_matrix_deprecated` that copies e.g. the first case of this test) that uses `as_matrix` and asserts that it raises a warning and returns the same result as `.values`?
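The requested test can be sketched roughly as follows. This is an illustration under assumptions: `Table` is a hypothetical stand-in for a DataFrame, and the standard-library `warnings.catch_warnings(record=True)` is used here in place of the pandas test helper `tm.assert_produces_warning` that the PR itself uses.

```python
import warnings


class Table:
    """Hypothetical stand-in for a DataFrame; not pandas code."""

    def __init__(self, rows):
        self._rows = [list(row) for row in rows]

    @property
    def values(self):
        return [list(row) for row in self._rows]

    def as_matrix(self):
        warnings.warn(
            "'as_matrix' is deprecated and will be removed in a future "
            "version. Use .values instead.",
            FutureWarning,
            stacklevel=2,
        )
        return self.values


def test_as_matrix_deprecated():
    table = Table([[1, 2], [3, 4]])
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = table.as_matrix()
    # Exactly one FutureWarning should have been raised ...
    assert len(caught) == 1
    assert issubclass(caught[0].category, FutureWarning)
    # ... and the deprecated method must still agree with .values.
    assert result == table.values


test_as_matrix_deprecated()
```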
Ok. I've added it below `test_repr_with_mi_nat`.
General question: I am wondering if we shouldn't rather recommend to do |
c5ef79b to ae8dc5a
Interesting, does it conform to Anyway, I suggest that that will be a separate issue from this one. |
I've changed the PR according to the comments. Is this ok?
@@ -3791,7 +3796,10 @@ def values(self):
int32. By numpy.find_common_type convention, mixing int64 and uint64
will result in a flot64 dtype.
"""
return self.as_matrix()
just return .values
This is the def of .values
oh, then we should push this entirely down to the BlockManager, so pass the axis as well. The BM can then do the transpose, we don't like to do things in user code like this.
We should actually deprecate `.values` as well, in favor of `.to_numpy()`, which is the future spelling anyhow.
This is already almost entirely pushed down, since it directly calls `BlockManager.as_matrix` (but OK, the transpose could be done inside `BlockManager.as_matrix`). The `consolidate_inplace` needs to happen in `NDFrame`, I think.
Deprecating `.values` is a no-go IMO, but let's take that up in another issue if you want.
> oh, then we should push this entirely down to the BlockManager, so pass the axis as well. The BM can then do the transpose, we don't like to do things in user code like this.

I'm not too familiar with the BlockManager, except for accessing it with `._data`. How do I pass the axis to it, and how do I transpose in a BlockManager? E.g. it doesn't have a `.transpose` method.
Look at `_take` in `pandas/core/generic.py`; you just pass the `get_block_manager_axis` to it. Then add `axis=` to `def as_matrix` in `pandas/core/internals` and do the transpose there.
The reason for pushing this to internals is so that the wrapping layers (e.g. frame/series, etc.) don't need to know about internal details like this.
`as_matrix` doesn't have an axis parameter. Do you mean adding a new axis parameter there and transposing inside if `axis=0` (where I assume, from my limited knowledge, that `BlockManager.axes[0]` is always the same as `dataFrame.axes[1]`, correct)?
Hi @jreback, I've tried quite a bit today and I can't push this down while getting the tests to pass.
Could you look at my latest commit (not passing ATM) and give some advice?
pandas/tests/frame/test_api.py
Outdated
@@ -369,6 +369,12 @@ def test_values(self):
self.frame.values[:, 0] = 5.
assert (self.frame.values[:, 0] == 5).all()

def test_as_matrix_deprecated(self):
with tm.assert_produces_warning(FutureWarning):
add the issue number
added.
ae8dc5a to 9255589
61c5a2d to 579d128
pandas/core/internals.py
Outdated
if len(self.blocks) == 0:
return np.empty(self.shape, dtype=float)

other_axis = abs(axis-1)
don't do this
pandas/core/internals.py
Outdated
if items is not None:
mgr = self.reindex_axis(items, axis=0)
mgr = self.reindex_axis(items, axis=other_axis)
leave this alone
pandas/core/generic.py
Outdated
return self.as_matrix()
self._consolidate_inplace()
bm_axis = self._get_block_manager_axis(axis=1)
return self._data.as_matrix(axis=bm_axis)
Instead of this, just pass `transpose = self._AXIS_REVERSED` in.
But `_data.as_matrix` has no `transpose` parameter. Do you mean `axis=self._AXIS_REVERSED`?
No, I mean add the argument `transpose=False` rather than `axis`; prob simpler for now.
else:
mgr = self

if self._is_single_block or not self.is_mixed_type:
return mgr.blocks[0].get_values()
arr = mgr.blocks[0].get_values()
else:
Change this to `transpose` instead. Should be straightforward from here.
Thanks a lot, `transpose` is of course much better than `axis`.
The issue was actually in the `if len(self.blocks) == 0:` block, as the empty array also must be transposed.
Everything is green now locally and I've pushed that upstream.
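The design settled on in this thread (the user-facing object passes `transpose=self._AXIS_REVERSED` and the internal manager does the transposing, including in the empty case) can be sketched as below. This is a simplified pure-Python illustration, not the pandas implementation: `ToyManager` and `ToyFrame` are hypothetical names, nested lists stand in for NumPy arrays, and the `n_rows` argument is an assumption made so the empty case has a shape to work with.

```python
class ToyManager:
    """Hypothetical stand-in for the internal BlockManager (not pandas code).

    Data is stored items-major: one inner list per column.
    """

    def __init__(self, columns, n_rows):
        self._columns = [list(col) for col in columns]
        self._n_rows = n_rows

    def as_array(self, transpose=False):
        if not self._columns:
            # Empty case: the untransposed shape is (0, n_rows), so the
            # transposed result must have shape (n_rows, 0).
            return [[] for _ in range(self._n_rows)] if transpose else []
        if transpose:
            # Swap axes: one inner list per row instead of per column.
            return [list(row) for row in zip(*self._columns)]
        return [list(col) for col in self._columns]


class ToyFrame:
    """Hypothetical user-facing wrapper; it knows nothing about the layout."""

    # The frame's axes are reversed relative to the manager's axes.
    _AXIS_REVERSED = True

    def __init__(self, columns, n_rows):
        self._data = ToyManager(columns, n_rows)

    @property
    def values(self):
        # The wrapper only forwards a flag; the manager does the transpose.
        return self._data.as_array(transpose=self._AXIS_REVERSED)


full = ToyFrame([[1, 2, 3], [4, 5, 6]], n_rows=3).values
empty = ToyFrame([], n_rows=3).values
```

The empty branch is the one that tripped up the earlier attempt: returning the untransposed empty result would give the wrong shape even though there is no data to move.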
579d128 to eed2fb6
A thought: Should I not just change it to |
@topper-123 ok this lgtm. ping on green.
this is completely private and internal. ping when ready.
eed2fb6 to 0d47e46
pandas/core/generic.py
Outdated
self._consolidate_inplace()
if self._AXIS_REVERSED:
return self._data.as_matrix(columns).T
this should just return self.values
Returning `self.values` implies that `columns=None`, which isn't necessarily true for user code.
pandas/core/generic.py
Outdated
@@ -3842,7 +3848,7 @@ def as_blocks(self, copy=True):
.. deprecated:: 0.21.0

NOTE: the dtypes of the blocks WILL BE PRESERVED HERE (unlike in
as_matrix)
as_array)
leave this spelling, this is a top-level method name
ok.
pandas/core/internals.py
Outdated
@@ -3670,19 +3670,22 @@ def copy(self, deep=True, mgr=None):
return self.apply('copy', axes=new_axes, deep=deep,
do_integrity_check=False)

def as_matrix(self, items=None):
def as_array(self, transpose=False, items=None):
if len(self.blocks) == 0:
can you add a doc-string
done
0816d86 to 6ad82de
6ad82de to e48ea09
pandas/core/generic.py
Outdated
@@ -3770,10 +3773,12 @@ def as_matrix(self, columns=None):
--------
pandas.DataFrame.values
"""
warnings.warn("method ``as_matrix`` will be removed in a future version. "
"Use ``values`` instead.", FutureWarning, stacklevel=2)
self._consolidate_inplace()
if self._AXIS_REVERSED:
OK, so this should be the same as below (e.g. passing `transpose=`). Make sure we still have a test on this (e.g. one that passes columns).
Ok, and `test_as_matrix_deprecated` has been modified to take a `columns` param.
eb0b82d to e1066d2
pandas/core/generic.py
Outdated
@@ -3770,10 +3773,11 @@ def as_matrix(self, columns=None):
--------
pandas.DataFrame.values
"""
warnings.warn("method ``as_matrix`` will be removed in a future version. "
"method method .as_matrix()...."
"Use .values instead"
I'm not sure I understand; I assume you want the backticks gone. Change uploaded.
d8b4d0b to a343fe3
a343fe3 to 48e1fc8
All green, @jreback
thanks @topper-123
as_matrix() got removed with Pandas 1.0.0 (pandas-dev/pandas#18458), replaced with values
See GH18458(pandas-dev/pandas#18458). Since as_matrix is deprecated for pd.DataFrame and pd.Series, use data.values instead.
Pandas.DataFrame.as_matrix is deprecated since version 0.23.0 (see docs and PR 18458). According to the documentation for Pandas 0.25.1, the recommended function is DataFrame.to_numpy() in place of DataFrame.values or DataFrame.as_matrix(). pandas-dev/pandas#18458 https://pandas.pydata.org/pandas-docs/version/0.23/generated/pandas.DataFrame.as_matrix.html https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.values.html
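For code that still calls `as_matrix`, the migration described in the references above might look like the sketch below. It assumes a pandas version that provides `DataFrame.to_numpy()`; the example frame and its column names are made up for illustration. Column selection, which `as_matrix(columns=...)` used to do in one call, is written as ordinary indexing before the conversion.

```python
import pandas as pd

# Illustrative data only; the column names are assumptions.
df = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0], "c": ["x", "y"]})

# Before (deprecated, later removed): df.as_matrix(columns=["a", "b"])
subset = df[["a", "b"]].to_numpy()

# Before: df.as_matrix() or df.values
everything = df.to_numpy()
```

Selecting only the numeric columns keeps the result a float array; converting the whole mixed-type frame has to fall back to a common dtype that can hold both numbers and strings.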
git diff upstream/master -u -- "*.py" | flake8 --diff
Deprecating `NDFrame.as_matrix` as per discussion in #18262.