forder3 #3162

Merged
merged 38 commits into from
Nov 28, 2018
mattdowle (Member) commented Nov 28, 2018

Fixes the slowdown in dev when ordering all-unique input; thanks to @arunsrinivasan's testing.

library(data.table)
forderv = data.table:::forderv      # assumed internal (not exported), hence :::
N = 1e8
set.seed(1)
x = sample(N)                       # v1.11.8  master     this PR
system.time(o <- forderv(x))        # 4.5s     5.9s-7.0s  1.2s
!base::is.unsorted(x[o])            # TRUE

x = sample(N/10, N, replace=TRUE)
system.time(o <- forderv(x))        # 2.6s     1.3s       0.8s

Removed the long-standing flip-flop group sizes; the sort is now recursive and writes directly to the final group size stack via thread buffers.
Removed the nested parallelism that was recently added; skew is still handled.
timetaken() in verbose mode now displays both elapsed and CPU time.
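
To make the first point above concrete, here is a minimal single-threaded sketch of a recursive most-significant-byte radix sort that pushes each final group size straight onto one stack as it finishes a bucket. It is not the code in src/forder.c: all names (`GroupStack`, `radix_r`, `order_with_groups`) are hypothetical, and the per-thread buffers and skew handling mentioned above are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch only; names and layout are NOT taken from src/forder.c. */
typedef struct {
    int *sizes;  /* final group-size stack, filled in sorted order */
    int  n;      /* number of groups pushed so far */
} GroupStack;

static void push_group(GroupStack *gs, int size) { gs->sizes[gs->n++] = size; }

/* Stable counting sort of o[0..n-1] on one byte of key (3 = most significant),
 * then recurse into each non-empty bucket on the next byte. */
static void radix_r(const uint32_t *key, int *o, int *tmp, int n, int byte,
                    GroupStack *gs)
{
    if (n == 1) { push_group(gs, 1); return; }

    int counts[256] = {0};
    for (int i = 0; i < n; i++) counts[(key[o[i]] >> (8 * byte)) & 0xFF]++;

    int next[256], sum = 0;              /* next free slot of each bucket */
    for (int j = 0; j < 256; j++) { next[j] = sum; sum += counts[j]; }

    for (int i = 0; i < n; i++)          /* stable scatter into scratch */
        tmp[next[(key[o[i]] >> (8 * byte)) & 0xFF]++] = o[i];
    memcpy(o, tmp, (size_t)n * sizeof(int));

    int from = 0;
    for (int j = 0; j < 256; j++) {
        int c = counts[j];
        if (c == 0) continue;
        if (byte == 0) push_group(gs, c);   /* all four bytes equal: one group */
        else radix_r(key, o + from, tmp + from, c, byte - 1, gs);
        from += c;
    }
}

/* Fills o with the 0-based ordering of key and sizes with the group sizes.
 * o, tmp and sizes must each have room for n ints. Returns the group count. */
static int order_with_groups(const uint32_t *key, int n, int *o, int *tmp,
                             int *sizes)
{
    assert(n >= 0);
    GroupStack gs = { sizes, 0 };
    for (int i = 0; i < n; i++) o[i] = i;
    if (n > 0) radix_r(key, o, tmp, n, 3, &gs);
    return gs.n;
}
```

For the keys {70000, 3, 70000, 1, 3, 3, 500} this gives the 0-based order 3, 1, 4, 5, 6, 0, 2 and the group sizes 1, 3, 1, 2. Because buckets are visited left to right, the sizes land on the stack already in final order, so no second pass or alternating buffer is needed in this sketch.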

It looks like AppVeyor timings on this PR are back to normal at about 5 minutes (they were over 10 minutes on master), so that's a good sign.

This needs to be in master to be picked up by db-bench, as I've only tested at laptop size (N=1e8).

@mattdowle mattdowle added this to the 1.12.0 milestone Nov 28, 2018
codecov bot commented Nov 28, 2018

Codecov Report

Merging #3162 into master will increase coverage by 0.13%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3162      +/-   ##
==========================================
+ Coverage   92.05%   92.18%   +0.13%     
==========================================
  Files          61       61              
  Lines       11450    11582     +132     
==========================================
+ Hits        10540    10677     +137     
+ Misses        910      905       -5

Impacted Files    Coverage Δ
R/data.table.R    92.74% <ø> (ø) ⬆️
src/forder.c      99.71% <100%> (+0.77%) ⬆️
R/timetaken.R     100% <100%> (+9.09%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 55168ed...de2855a.

@mattdowle mattdowle merged commit 05c0d45 into master Nov 28, 2018
@mattdowle mattdowle deleted the forder3 branch November 28, 2018 09:52
jangorecki (Member) commented

Those are incredible speedups! Run time is 3-4 times lower than in 1.11.8.

mattdowle (Member, Author) commented Nov 28, 2018

@jangorecki Yep, fingers crossed. It needs battle testing. Could you set off a db-bench rerun please (just data.table if possible) to see how it does on the server, as I missed the 11pm start time? I'm hoping this will also fix the oddness of the 2nd run being slower for the first test.

@jangorecki jangorecki mentioned this pull request Mar 8, 2019