fix(python): Give priority to pycapsule interface in from_dataframe #21377
    with pytest.raises(
        CopyNotAllowedError,
        match="byte-packed boolean buffer must be converted to bit-packed boolean",
    ):
        result = pl.from_dataframe(df, allow_copy=False)
This no longer raises with allow_copy=False. I don't know how strict (if at all) the pycapsule interface is about copies; I'll take a look.
I checked, and none of the examples in https://github.com/search?q=%22pl.from_dataframe%22&type=code use allow_copy=False, so I'm not sure how much we should be concerned.
I'd suggest just deprecating the argument, to be honest.
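For illustration only, here is a minimal sketch of what deprecating the argument could look like. It is not the Polars implementation: `from_dataframe_sketch` and the `_UNSET` sentinel are hypothetical names, and the function body is a placeholder that just returns its input. The point is only that a sentinel default lets the function warn when a caller passes `allow_copy` explicitly, while staying silent for everyone else.

```python
import warnings

# Hypothetical sentinel: distinguishes "argument not passed" from an explicit value.
_UNSET = object()


def from_dataframe_sketch(df, allow_copy=_UNSET):
    """Placeholder conversion function that warns if `allow_copy` is passed."""
    if allow_copy is not _UNSET:
        warnings.warn(
            "`allow_copy` is deprecated and may be ignored",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this function
        )
    # A real implementation would convert `df` here; the sketch returns it unchanged.
    return df
```

Callers that never pass `allow_copy` see no change, which matches the observation that the argument appears to be rarely used.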
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #21377      +/-   ##
==========================================
+ Coverage   79.84%   79.95%   +0.10%
==========================================
  Files        1597     1597
  Lines      228911   228956      +45
  Branches     2618     2621       +3
==========================================
+ Hits       182785   183051     +266
+ Misses      45527    45300     -227
- Partials      599      605       +6
This reverts commit c0e59dd.
@@ -123,7 +126,7 @@ def test_to_dataframe_pandas_zero_copy_parametric(df: pl.DataFrame) -> None:
 )
 def test_from_dataframe_pyarrow_parametric(df: pl.DataFrame) -> None:
     df_pa = df.to_arrow()
     result = pl.from_dataframe(df_pa)
These tests are very specific to the interchange protocol, so I've kept them exercising that protocol (rather than letting the pycapsule interface take precedence).
Closes #20316. To clarify:
As suggested in #20065 (comment), this gives priority to the pycapsule interface over the interchange protocol. I did the same in pandas: pandas-dev/pandas#60739
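As a rough sketch of the priority order (not the actual code in this PR), the dispatch can be pictured as a check for the Arrow PyCapsule method `__arrow_c_stream__` before falling back to the interchange protocol's `__dataframe__`. The function `pick_import_path` and the three dummy classes are hypothetical and exist only to make the example runnable without Polars installed.

```python
def pick_import_path(obj: object) -> str:
    """Return which protocol a from_dataframe-style function would use."""
    # PyCapsule interface takes priority when the object exposes it.
    if hasattr(obj, "__arrow_c_stream__"):
        return "pycapsule"
    # Otherwise fall back to the dataframe interchange protocol.
    if hasattr(obj, "__dataframe__"):
        return "interchange"
    raise TypeError(f"{type(obj).__name__} supports neither protocol")


class CapsuleOnly:
    """Hypothetical object exposing only the PyCapsule interface."""

    def __arrow_c_stream__(self, requested_schema=None):
        raise NotImplementedError


class InterchangeOnly:
    """Hypothetical object exposing only the interchange protocol."""

    def __dataframe__(self, nan_as_null=False, allow_copy=True):
        raise NotImplementedError


class Both(CapsuleOnly, InterchangeOnly):
    """Hypothetical object exposing both; the PyCapsule path should win."""
```

With this ordering, an object that implements both protocols (as many dataframe libraries do) goes through the PyCapsule path, and the interchange protocol is only used for objects that offer nothing else.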
"pl.from_dataframe" shows up a scary number of times in https://github.com/search?q=%22pl.from_dataframe%22&type=code 🥶