Inefficiency of over expression #10063
Comments
One suggestion for improving the `over` performance: `(pl.col(f"v{i}") - pl.col(f"v{i}").mean().truediv(pl.col(f"v{i}").std()).over(["id", "id2"]))`. In my quick testing this reduces the difference but doesn't eliminate it.
Even when the denominator is removed entirely from both approaches, there is still about a 2x performance difference in my benchmarking.
There was a @cbilot answer on SO that discussed/benchmarked the overhead of window expressions, but I cannot seem to find it anymore. While searching, I did find:
https://stackoverflow.com/a/71554447/
But I'm not sure if this statement is in relation to a comparison against an equivalent `groupby`.
Perhaps another question to ask is: can Polars rewrite the … I'm not sure on the technical details, so maybe this is already happening internally.
Then choose an … We cannot expect different queries that hit different code paths to have equal performance, especially not if the constraints are different.
Is it possible to optimize when multiple expressions use the same …
Research
I have searched the above Polars tags on Stack Overflow for similar questions.
I have asked my usage related question on Stack Overflow.
Link to question on Stack Overflow
https://stackoverflow.com/questions/76757987/polars-inefficiency-of-over-expression
Question about Polars
I found out that, at least for the scenario below, using `over` is much slower (2~3x) than using `groupby`/`agg` + `explode`, and the results are exactly the same.
Based on this finding, I have the following questions:
- … (`groupby`/`agg` + `explode`) instead of using `over` directly?
- … `over`?