Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gmedian retain class #3564

Merged
merged 3 commits into from
May 15, 2019
Merged

gmedian retain class #3564

merged 3 commits into from
May 15, 2019

Conversation

mattdowle
Copy link
Member

@mattdowle mattdowle commented May 15, 2019

Closes #3079
Follow up to #3546
gmedian now retains the class, other than for integer64 since it always returns double.

Also reduced lines of code in gmedian by adding an integer64 case to quickselect.c, moving the 2nd-half minimum loops into quickselect.c, and finding location of median inside quickselect rather than passing that in. Existing test 1579 covered int, double, integer64 and na.rm cases nicely.

R API raised outside loops in gmedian (since altrep R 3.5 greater overhead).

A few trailing spaces removed (my editor does that on save).

A few unrelated tests tweaked to pass in dev mode when data.table is not installed.

Checked all gforce function retain attributes. Added comment to the only one that doesn't (gvarsd1) since class unlikely applicable for sd/var.

@mattdowle mattdowle added this to the 1.12.4 milestone May 15, 2019
@codecov
Copy link

codecov bot commented May 15, 2019

Codecov Report

Merging #3564 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3564      +/-   ##
==========================================
+ Coverage   97.45%   97.46%   +<.01%     
==========================================
  Files          66       66              
  Lines       12806    12678     -128     
==========================================
- Hits        12480    12356     -124     
+ Misses        326      322       -4
Impacted Files Coverage Δ
R/fwrite.R 98.75% <100%> (ø) ⬆️
src/gsumm.c 96.9% <100%> (+0.16%) ⬆️
src/quickselect.c 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 93f50f7...2533c90. Read the comment docs.

@mattdowle mattdowle merged commit 9848004 into master May 15, 2019
@mattdowle mattdowle deleted the gmedian_retain_class branch May 15, 2019 20:33
@mattdowle
Copy link
Member Author

I declared isInt64 as const bool so that branch prediction should be very good. Interesting to test but, thanks to branch prediction, I doubt there will be a measurable difference in this case.
It is probably worth creating totally branch free cases; i.e. taking isunsorted and irowslen outside too. But that would complicate/expand the code. The move of the R API outside the loops should have already had a greater benefit.
However, I knew that when gather() is applied this serial gather will go away. Then gmedian can apply the quickselect in parallel. That's what should be done next.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Date and POSIXct coerced to numeric when calculating median by group
2 participants