Improve performance of DataFrame binary comparison operations #6869

Merged
merged 3 commits into from
Oct 23, 2023

asmirnov82 (Contributor) commented Oct 19, 2023

The goal of this PR is to improve the performance of comparison operations. It is the next step in aligning the DataFrame arithmetic API with the new TensorPrimitives API.

In DataFrame 0.20.1:

| Method | Mean | Error | StdDev |
|---|---|---|---|
| ElementwiseEquals_Int32_Int32 | 38.00 ms | 0.145 ms | 0.121 ms |
| ElementwiseEquals_Int16_Int16 | 39.55 ms | 0.291 ms | 0.258 ms |
| ElementwiseEquals_Double_Double | 40.28 ms | 0.367 ms | 0.343 ms |
| ElementwiseEquals_Float_Float | 41.18 ms | 0.805 ms | 1.074 ms |

After this PR:

| Method | Mean | Error | StdDev |
|---|---|---|---|
| ElementwiseEquals_Int32_Int32 | 1.171 ms | 0.0228 ms | 0.0263 ms |
| ElementwiseEquals_Int16_Int16 | 1.090 ms | 0.0569 ms | 0.0475 ms |
| ElementwiseEquals_Double_Double | 1.388 ms | 0.0264 ms | 0.0247 ms |
| ElementwiseEquals_Float_Float | 1.250 ms | 0.0215 ms | 0.0190 ms |

Other comparison operations show the same performance boost.
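For a rough sense of scale, the speedup per benchmark can be derived from the mean times in the two tables above. The following is only an illustrative sketch in Python: the dictionary names and the `speedups` helper are hypothetical, and the values are copied from the reported means. It works out to roughly 29x to 36x across the four benchmarks.

```python
# Mean times in milliseconds, copied from the benchmark tables in this PR.
# The variable and function names are illustrative, not part of the PR.
before_ms = {
    "ElementwiseEquals_Int32_Int32": 38.00,
    "ElementwiseEquals_Int16_Int16": 39.55,
    "ElementwiseEquals_Double_Double": 40.28,
    "ElementwiseEquals_Float_Float": 41.18,
}
after_ms = {
    "ElementwiseEquals_Int32_Int32": 1.171,
    "ElementwiseEquals_Int16_Int16": 1.090,
    "ElementwiseEquals_Double_Double": 1.388,
    "ElementwiseEquals_Float_Float": 1.250,
}


def speedups(before, after):
    """Return the ratio of old mean time to new mean time for each benchmark."""
    return {name: before[name] / after[name] for name in before}


for name, factor in speedups(before_ms, after_ms).items():
    print(f"{name}: {factor:.1f}x faster")
```

These ratios compare mean times only; they do not account for the reported error or standard deviation.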

asmirnov82 (Contributor, Author) commented

@JakeRadMSFT could you please review?

codecov bot commented Oct 19, 2023

Codecov Report

Merging #6869 (12a296a) into main (766569b) will decrease coverage by 0.02%.
Report is 2 commits behind head on main.
The diff coverage is 62.00%.

@@            Coverage Diff             @@
##             main    #6869      +/-   ##
==========================================
- Coverage   69.40%   69.39%   -0.02%     
==========================================
  Files        1238     1238              
  Lines      249441   249462      +21     
  Branches    25522    25522              
==========================================
- Hits       173130   173113      -17     
+ Misses      69692    69599      -93     
- Partials     6619     6750     +131     
| Flag | Coverage Δ | |
|---|---|---|
| Debug | 69.39% <62.00%> (-0.02%) | ⬇️ |
| production | 63.91% <62.00%> (-0.02%) | ⬇️ |
| test | 88.90% <ø> (ø) | |

Flags with carried forward coverage won't be shown.

| Files | Coverage Δ | |
|---|---|---|
| src/Microsoft.Data.Analysis/BitUtility.cs | 84.78% <100.00%> (ø) | |
| ...rosoft.Data.Analysis/ArrowStringDataFrameColumn.cs | 63.54% <0.00%> (ø) | |
| src/Microsoft.Data.Analysis/DataFrameBuffer.cs | 84.90% <50.00%> (ø) | |
| ...icrosoft.Data.Analysis/PrimitiveColumnContainer.cs | 86.08% <92.30%> (ø) | |
| ...c/Microsoft.Data.Analysis/StringDataFrameColumn.cs | 71.42% <0.00%> (ø) | |
| ...icrosoft.Data.Analysis/PrimitiveDataFrameColumn.cs | 73.19% <66.66%> (ø) | |
| src/Microsoft.Data.Analysis/Strings.Designer.cs | 42.36% <0.00%> (ø) | |
| .../Microsoft.Data.Analysis/VBufferDataFrameColumn.cs | 45.77% <56.25%> (-1.41%) | ⬇️ |
| ...Microsoft.Data.Analysis/Computations/Arithmetic.cs | 62.06% <50.00%> (ø) | |
| ...lysis/PrimitiveColumnContainer.BinaryOperations.cs | 89.83% <57.14%> (-10.17%) | ⬇️ |

... and 1 more

... and 8 files with indirect coverage changes

JakeRadMSFT merged commit 796cb35 into dotnet:main Oct 23, 2023
asmirnov82 deleted the improve_filtering_and_comparison_operations branch October 23, 2023 11:07
ghost locked as resolved and limited conversation to collaborators Nov 22, 2023