BUG: Allow using numpy in `DataFrame.eval` and `DataFrame.query` via @-notation #58057

domsmrz · 2024-03-28T23:01:38Z

closes BUG: Numpy not directly usable in DataFrame.eval and DataFrame.query #58041
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
~~Added type annotations to new arguments/methods/functions.~~ (not applicable)
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Allow @np to be used within df.eval and df.query

Add an additional check to make sure that the ndim of the provided variable is an int. This ensures that the ndim guard doesn't trigger when ndim is something else (e.g., a function, as in the case of the numpy module). This allows using @np in the df.eval and df.query methods.
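The idea behind the check can be sketched in isolation. The snippet below is only an illustration under stated assumptions, not the actual pandas code: `exceeds_supported_ndim`, `fake_module`, `matrix`, and `cube` are hypothetical names, and `fake_module` merely stands in for a module-like object (such as numpy) whose `ndim` attribute is a function rather than an integer.

```python
import math
import types


def exceeds_supported_ndim(value, max_ndim=2):
    """Return True only when `value` reports an integer ndim above `max_ndim`.

    Hypothetical helper for illustration; not part of pandas.
    """
    ndim = getattr(value, "ndim", None)
    # A module-like object may expose `ndim` as a function, not an int.
    # Comparing a function with an int would raise TypeError, so the
    # isinstance check makes the guard skip such objects.
    return isinstance(ndim, int) and ndim > max_ndim


# Stand-in for a module-like object whose `ndim` attribute is a function.
fake_module = types.SimpleNamespace(ndim=lambda obj: 0, floor=math.floor)
# Stand-ins for array-like objects that report an integer `ndim`.
matrix = types.SimpleNamespace(ndim=2)
cube = types.SimpleNamespace(ndim=3)

print(exceeds_supported_ndim(fake_module))  # False: ndim is not an int
print(exceeds_supported_ndim(matrix))       # False: 2 is within the limit
print(exceeds_supported_ndim(cube))         # True: 3 exceeds the limit
```

Without the `isinstance` test, the first call would attempt to compare a function with an integer and fail before the expression could be evaluated.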

doc/source/whatsnew/v3.0.0.rst

Aloqeely · 2024-03-29T01:16:49Z

Can you have a look at the CI errors, please? Your test might be failing.

domsmrz · 2024-03-29T10:16:34Z

Thanks for the review, @Aloqeely. I've managed to make all the CI checks green. Unfortunately, I encountered a different bug while doing so, filed as #58069, which is why I had to skip checking the series name in the tests. Strictly speaking, this PR and #58069 are unrelated: one can easily hit the bug on current main even without this PR, as explained in the bug description. That said, I can see how accepting this PR may nudge users to hit the bug a bit more often. Let me know what you would like to do with this PR.

rhshadrach

Thanks for the PR! A few minor requests.

rhshadrach · 2024-04-02T02:32:22Z

pandas/tests/computation/test_eval.py

+        expected = np.floor(df["a"])
+        tm.assert_series_equal(expected, res, check_names=False)


Can you set the proper name on expected and then remove check_names=False?

Unfortunately, I can't. The issue is that df.eval("@np.floor(a)") returns a series with a different name in the Linux unit tests than in the Windows and macOS unit tests. That means if I just remove check_names and:

leave as-is, the Linux unittests will fail

change the name of expected to "a", the Win & MacOS unittests will fail

Please see my previous comment and/or #58069 for a few more details.

On further thought, we can more or less work around the issue by explicitly specifying the engine, which is arguably something we should do anyway. I've updated the PR accordingly, and also moved the test into a more fitting test file. Please take a look and let me know what you think.

rhshadrach · 2024-04-02T02:34:11Z

doc/source/whatsnew/v3.0.0.rst

@@ -325,6 +325,7 @@ Bug fixes
 - Fixed bug in :class:`SparseDtype` for equal comparison with na fill value. (:issue:`54770`)
 - Fixed bug in :meth:`.DataFrameGroupBy.median` where nat values gave an incorrect result. (:issue:`57926`)
 - Fixed bug in :meth:`DataFrame.cumsum` which was raising ``IndexError`` if dtype is ``timedelta64[ns]`` (:issue:`57956`)
+- Fixed bug in :meth:`DataFrame.eval` and :meth:`DataFrame.query` which caused an exception when using numpy via ``@`` notation. (:issue:`58041`)


I'm not sure if "when using numpy" is clear - what do you think of doing something like

when using NumPy attributes via @ notation, e.g. df.eval("@np.floor(a)")

Updated as per suggestion, thanks

rhshadrach

lgtm

mroeschke · 2024-04-08T16:48:37Z

Thanks @domsmrz

…@-notation (pandas-dev#58057)

* Allow @np to be used within df.eval and df.query

Add additional check to make sure that ndim of the provided variable is an int. This makes sure that the ndim guard doesn't trigger if it is something else (e.g., a function in case of a numpy). This allow for using @np in df.eval and df.query methods.

* Add whatsnew

* Fix typo

Co-authored-by: Abdulaziz Aloqeely <52792999+Aloqeely@users.noreply.github.com>

* Test: skip checking names due to inconsistencies between OSes

* Elaborate futher on whatsnew message

* Fix the test by explicitly specifing engine.

Also move the test to more appropriate file.

---------

Co-authored-by: Abdulaziz Aloqeely <52792999+Aloqeely@users.noreply.github.com>

domsmrz and others added 3 commits March 28, 2024 23:45

Add whatsnew

4051d36

Merge branch 'main' into np-eval-fix

fd47a93

domsmrz changed the title ~~Np eval fix~~ BUG: Allow using numpy in DataFrame.eval and DataFrame.query via @-notation Mar 28, 2024

Aloqeely reviewed Mar 29, 2024

Fix typo

e0b1e72

Co-authored-by: Abdulaziz Aloqeely <52792999+Aloqeely@users.noreply.github.com>

domsmrz force-pushed the np-eval-fix branch from 4de0c2b to e0b1e72 Compare March 29, 2024 08:28

Test: skip checking names due to inconsistencies between OSes

81d9c7f

mroeschke requested a review from rhshadrach April 1, 2024 17:55

mroeschke added the expressions pd.eval, query label Apr 1, 2024

rhshadrach requested changes Apr 2, 2024

Elaborate futher on whatsnew message

1642cdb

domsmrz force-pushed the np-eval-fix branch from db57aab to 540cae5 Compare April 2, 2024 18:52

Fix the test by explicitly specifing engine.

dee12b3

Also move the test to more appropriate file.

domsmrz force-pushed the np-eval-fix branch from 540cae5 to dee12b3 Compare April 2, 2024 19:18

domsmrz requested a review from rhshadrach April 5, 2024 13:12

rhshadrach approved these changes Apr 8, 2024

mroeschke approved these changes Apr 8, 2024

mroeschke added this to the 3.0 milestone Apr 8, 2024

mroeschke merged commit f3ddd2b into pandas-dev:main Apr 8, 2024

domsmrz deleted the np-eval-fix branch April 9, 2024 08:35
