Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(rust): Use defunctionalization in polars-core scalar.rs in order to reduce code duplication #20377

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

burakemir
Copy link

This change reduces the number of copies of the bitonic_map function from 144 to 12. It should help with build times and code size, without any change in performance.

The use of capturing closures when calling bitonic_mask leads
to unnecessary code duplication, because Rust will create one
new type per closure, and then make one copy of bitonic_mask
per call site.
With this change, these copies are avoided, which should improve
build speed noticeably. cargo llvm-lines shows that
the number of copies is reduced from 144 to 12.
@burakemir burakemir changed the title refactor(rust): polars-core Scalar, defunctionalized refactor(rust): polars-core scalar.rs, use defunctionalization to reduce code duplication Dec 19, 2024
@burakemir burakemir changed the title refactor(rust): polars-core scalar.rs, use defunctionalization to reduce code duplication refactor(rust): Use defunctionalization in polars-core scalar.rs in order to reduce code duplication Dec 19, 2024
@github-actions github-actions bot added internal An internal refactor or improvement rust Related to Rust Polars and removed title needs formatting labels Dec 19, 2024
Copy link

codecov bot commented Dec 19, 2024

Codecov Report

Attention: Patch coverage is 92.68293% with 3 lines in your changes missing coverage. Please review.

Project coverage is 79.10%. Comparing base (db1684d) to head (d21f75d).

Files with missing lines Patch % Lines
...polars-core/src/chunked_array/comparison/scalar.rs 92.68% 3 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #20377   +/-   ##
=======================================
  Coverage   79.10%   79.10%           
=======================================
  Files        1572     1572           
  Lines      219958   219963    +5     
  Branches     2465     2465           
=======================================
+ Hits       173991   174006   +15     
+ Misses      45399    45389   -10     
  Partials      568      568           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@orlp
Copy link
Collaborator

orlp commented Dec 19, 2024

While better compile times are good, and it probably doesn't matter since it's only called a logarithmic number of times it still feels bad to me we do a match on every single comparison now.

@burakemir
Copy link
Author

burakemir commented Dec 20, 2024

The apply method should in theory be inlined and the match disappear.

I have an alternative approach that uses a custom closure type that achieves the same reduction, would you prefer a PR for that instead?
branch: https://github.com/burakemir/polars/tree/scalar_customclosure
convenient link to scalar.rs (only file that changes, like in this PR): https://github.com/burakemir/polars/blob/scalar_customclosure/crates/polars-core/src/chunked_array/comparison/scalar.rs

@burakemir burakemir force-pushed the scalar_defunctionalized branch from eb8d01b to d21f75d Compare December 20, 2024 06:53
@orlp
Copy link
Collaborator

orlp commented Dec 20, 2024

@burakemir If it were inlined it would not reduce the amount of code generated after all, no?

@burakemir
Copy link
Author

The problem addressed is the needless duplication because of monorphization: the FA and FD generic parameters get instantiated with a special (per-call site) closure type. This leads to the 144 copies of bitonic_mask, which is reduced to 12 by either this or the alternative approach.

You are of course correct that inlining can in general inlining will cause code size increases. In case of the match statement with a statically known scrutinee, the whole match statement should be removed, which then ideally gives rise to further inlining of the actual comparison method (though I have not checked).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal An internal refactor or improvement rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants