is.sorted is slow for determining whether jval is sorted by key #4498

Closed
MichaelChirico opened this issue May 26, 2020 · 2 comments · Fixed by #4501
Labels
breaking-change (issues whose solution would require breaking existing behavior), High
Comments

@MichaelChirico
Member

From SO:

https://stackoverflow.com/questions/62019120/why-does-data-table-notation-for-column-retrieval-affect-speed/62028864#62028864

library(data.table)
library(tictoc)  # provides tic() and toc()

x <- as.data.table(as.character(rnorm(20000000, 1, 0.5)))
setkey(x, V1)

tic(); x[, .(V1)]; toc()
# 25.08 sec elapsed

(timing is even worse on my machine)

The bottleneck appears to be this line:

if (haskey(x) && all(key(x) %chin% names(jval)) && suppressWarnings(is.sorted(jval, by=key(x)))) # TO DO: perhaps this usage of is.sorted should be allowed internally then (tidy up and make efficient)

If I'm not mistaken, we can tell the output is sorted because V1 is the key and it appears as a name in jval, so there is no need to compute the sort order all over again.
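To make the reasoning concrete, here is a minimal sketch on a tiny table. The helper `key_cols_present()` is hypothetical (it is not part of data.table) and only reproduces the cheap part of the condition quoted above; the lists `jval_plain` and `jval_rev` are stand-ins for the internal `jval`, not the real object. The second case is a caveat I am assuming matters here: a j expression can keep the column name while changing the row order.

```r
library(data.table)

# Small keyed table standing in for the 20M-row example above.
x <- data.table(V1 = c(3L, 1L, 2L))
setkey(x, V1)

# Hypothetical helper (not part of data.table): the cheap part of the check,
# i.e. x is keyed and every key column appears by name in the j result.
key_cols_present <- function(x, jval) {
  haskey(x) && all(key(x) %chin% names(jval))
}

# Plain column selection: the result column is the key column itself,
# so it is already in key order and re-checking the order is redundant.
jval_plain <- list(V1 = x$V1)
plain_ok <- key_cols_present(x, jval_plain) && identical(jval_plain$V1, x$V1)

# Caveat: a j expression can reuse the name but change the order,
# so name presence alone does not prove sortedness in general.
jval_rev <- list(V1 = rev(x$V1))
rev_named <- key_cols_present(x, jval_rev)
rev_unsorted <- is.unsorted(jval_rev$V1)
```

In the plain-selection case the name check alone is enough to know the result is in key order. The reversed case suggests why a shortcut would probably need to recognise that j is a bare selection of columns rather than rely on names only.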

@jangorecki
Member

jangorecki commented May 26, 2020

Could you check whether recomputing the key in this case will be resolved by #4386?
It won't, because jval does not have a key or indices anymore.
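As a rough illustration of that point: data.table records the key in the `sorted` attribute of the table itself, so a plain list built from its columns carries no key information. The list `jval_like` below is only a stand-in for the internal `jval`, which I am assuming is an ordinary list at this stage.

```r
library(data.table)

x <- data.table(V1 = c("b", "a", "c"))
setkey(x, V1)

# The key lives on the data.table as the "sorted" attribute.
key_on_x <- attr(x, "sorted")

# A list assembled from the column values has no such attribute,
# so the sort order cannot be read back from it.
jval_like <- list(V1 = x$V1)
key_on_jval <- attr(jval_like, "sorted")
```

Here `key_on_x` holds the key column name while `key_on_jval` is `NULL`, which is why the order has to be either re-checked or inferred from how j was written.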

@jangorecki
Member

I think we can move this issue to the next release. is.sorted is now optimized, so the issue is at least less painful now.

@jangorecki jangorecki modified the milestones: 1.12.9, 1.12.11 Jun 15, 2020
@jangorecki jangorecki added the High and breaking-change (issues whose solution would require breaking existing behavior) labels Jun 20, 2020
@mattdowle mattdowle modified the milestones: 1.13.1, 1.13.3 Oct 17, 2020
@jangorecki jangorecki modified the milestones: 1.14.3, 1.14.5 Jul 19, 2022
@jangorecki jangorecki modified the milestones: 1.14.11, 1.15.1 Oct 29, 2023