Improve performance of rem_scalar/div_scalar for integer types (4x-10x) #275
Codecov Report
@@            Coverage Diff             @@
##             main     #275      +/-   ##
==========================================
+ Coverage   77.14%   77.24%   +0.10%
==========================================
  Files         254      254
  Lines       20648    20733      +85
==========================================
+ Hits        15929    16016      +87
+ Misses       4719     4717       -2
Continue to review full report at Codecov.
Overall looks great!
I think that for now we should live with the transmute. I think that the new scalar API will enable us to write this function as div_scalar(&dyn Array, &dyn Scalar) -> Box<dyn Array>, which will allow us to avoid this transmute and specialize directly based on DataType.
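To make the suggestion above concrete, here is a minimal sketch of what dispatching on a data type without a transmute could look like. Every type in it (DataType, Array, Scalar, PrimitiveArray, PrimitiveScalar) is a simplified, hypothetical stand-in written for this example, not the crate's actual API, and the sketch returns an Option rather than the bare Box<dyn Array> mentioned above so that a type mismatch can be reported.

```rust
use std::any::Any;

// Hypothetical, heavily simplified stand-ins for the crate's types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    UInt32,
    UInt64,
}

trait Array {
    fn data_type(&self) -> DataType;
    fn as_any(&self) -> &dyn Any;
}

trait Scalar {
    fn data_type(&self) -> DataType;
    fn as_any(&self) -> &dyn Any;
}

struct PrimitiveArray<T>(Vec<T>);
struct PrimitiveScalar<T>(T);

// Implement both traits for one concrete native type.
macro_rules! impl_primitive {
    ($ty:ty, $dt:expr) => {
        impl Array for PrimitiveArray<$ty> {
            fn data_type(&self) -> DataType { $dt }
            fn as_any(&self) -> &dyn Any { self }
        }
        impl Scalar for PrimitiveScalar<$ty> {
            fn data_type(&self) -> DataType { $dt }
            fn as_any(&self) -> &dyn Any { self }
        }
    };
}
impl_primitive!(u32, DataType::UInt32);
impl_primitive!(u64, DataType::UInt64);

// Downcast both operands to one concrete type and divide element-wise.
// Dividing by zero panics, as plain integer division does.
macro_rules! div_as {
    ($ty:ty, $array:expr, $scalar:expr) => {{
        let a = $array.as_any().downcast_ref::<PrimitiveArray<$ty>>()?;
        let s = $scalar.as_any().downcast_ref::<PrimitiveScalar<$ty>>()?;
        let values: Vec<$ty> = a.0.iter().map(|v| v / s.0).collect();
        let out: Box<dyn Array> = Box::new(PrimitiveArray(values));
        Some(out)
    }};
}

/// Returns None when the array and the scalar have different data types.
fn div_scalar(array: &dyn Array, scalar: &dyn Scalar) -> Option<Box<dyn Array>> {
    if array.data_type() != scalar.data_type() {
        return None;
    }
    // The match is where a type-specific fast path could be selected.
    match array.data_type() {
        DataType::UInt32 => div_as!(u32, array, scalar),
        DataType::UInt64 => div_as!(u64, array, scalar),
    }
}
```

Because the concrete type is recovered by a checked downcast inside each match arm, no unsafe code is needed in this sketch; the cost is one runtime type check per call rather than per element.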
u16 and u32 are also supported by strength reduction.
Great! Then I will implement the other types as well.
b603528 to 4eb3adf
@jorgecarleitao I updated the PR. It is now implemented for … There is a bit of repetition in the code. Because the branches work on different concrete types, it does not seem possible to write a generic function for that. A macro would be possible, but given the usage of …
Do we have a test hitting the branch?
Will write that tomorrow.
4eb3adf to 7a6f439
@jorgecarleitao all branches are run in tests now. I think it's good to go.
I am opening this PR already to start the discussion. I implemented the strength reduction algorithm discussed in #259 for division of u64 arrays, and the performance increase is almost 10x! I benchmarked a division by 4 (which is a single bit shift) and a division by a prime number.
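For readers unfamiliar with the technique, the idea behind strength reduction can be sketched as follows. This is an illustration only, not the code in this PR: it is hand-written for u32 rather than u64 to keep the arithmetic short, and the names ReducedU32 and div_scalar_u32 are hypothetical. The divisor is pre-processed once; each element is then divided with a shift (power-of-two divisor) or a multiply-and-shift (any other divisor) instead of a hardware division.

```rust
/// Hypothetical helper: a divisor that has been pre-processed once so that
/// many divisions by it avoid the hardware divide instruction.
#[derive(Clone, Copy)]
enum ReducedU32 {
    /// Divisor is 2^k: divide with a right shift by k.
    Shift(u32),
    /// Any other divisor d: multiply by ceil(2^64 / d), keep the high 64 bits.
    Multiply(u64),
}

impl ReducedU32 {
    fn new(divisor: u32) -> Self {
        assert!(divisor != 0, "division by zero");
        if divisor.is_power_of_two() {
            ReducedU32::Shift(divisor.trailing_zeros())
        } else {
            // d does not divide 2^64 here, so floor((2^64 - 1) / d) + 1
            // equals ceil(2^64 / d) and fits in a u64.
            ReducedU32::Multiply(u64::MAX / u64::from(divisor) + 1)
        }
    }

    fn divide(self, n: u32) -> u32 {
        match self {
            ReducedU32::Shift(k) => n >> k,
            // The product is below 2^96, so it cannot overflow a u128.
            ReducedU32::Multiply(m) => ((u128::from(m) * u128::from(n)) >> 64) as u32,
        }
    }
}

/// Divide every element of `values` by the same scalar divisor.
fn div_scalar_u32(values: &[u32], divisor: u32) -> Vec<u32> {
    // The pre-processing cost is paid once and amortized over the array.
    let reduced = ReducedU32::new(divisor);
    values.iter().map(|&v| reduced.divide(v)).collect()
}
```

The two benchmark cases mentioned above correspond to the two variants: a division by 4 takes the Shift path, and a division by a prime takes the Multiply path.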
@jorgecarleitao, I used some unsafe to convince the compiler of the proper types. How do you want to go about this? We could also create some extra traits to do the conversion.
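One possible shape for the "extra traits" option is sketched below. The trait name DivScalar and its layout are assumptions made for this example, not an existing trait in the crate. The generic entry point forwards to a trait method, so the u64 implementation sees concrete u64 values and can take a fast path without any unsafe code.

```rust
use std::ops::Div;

/// Hypothetical trait: each native type chooses how to divide a slice by a scalar.
trait DivScalar: Copy + Div<Output = Self> {
    /// Default path: plain element-wise division.
    fn div_scalar(values: &[Self], divisor: Self) -> Vec<Self> {
        values.iter().map(|&v| v / divisor).collect()
    }
}

// Types without a specialized path just use the default.
impl DivScalar for i32 {}
impl DivScalar for f64 {}

// Inside this impl the compiler already knows the element type is u64,
// so a type-specific fast path needs no transmute.
impl DivScalar for u64 {
    fn div_scalar(values: &[u64], divisor: u64) -> Vec<u64> {
        if divisor.is_power_of_two() {
            // Stand-in for a strength-reduced path: shift instead of divide.
            let shift = divisor.trailing_zeros();
            values.iter().map(|&v| v >> shift).collect()
        } else {
            // A real fast path would use a precomputed multiplier here;
            // a zero divisor panics, as plain integer division does.
            values.iter().map(|&v| v / divisor).collect()
        }
    }
}

/// Generic entry point: dispatch is resolved at compile time through the trait.
fn div_scalar<T: DivScalar>(values: &[T], divisor: T) -> Vec<T> {
    T::div_scalar(values, divisor)
}
```

The trade-off is one trait implementation per native type, which is more boilerplate than a single generic function but keeps the specialization in safe code.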