[Request] Add note to ?setorder that character ordering sorts case, then alphabetically #1996

Closed
HughParsonage opened this issue Jan 20, 2017 · 2 comments · Fixed by #2387

HughParsonage commented Jan 20, 2017

I noticed that setorder is case-sensitive in a different way from base::order:

library(data.table)
DT <- data.table(x = c("a", "A", "B", "b"), y = 1:4)
DF <- as.data.frame(DT)

# ordered by case first (capital letters before lower-case letters), then alphabetically
DT[order(x)]
#    x y
# 1: A 2
# 2: B 3
# 3: a 1
# 4: b 4

# in alphabetical order first, then case-sensitively (and lower-case first)
DF[order(DF$x), ]
#   x y
# 3 a 1
# 1 A 2
# 4 b 4
# 2 B 3

If this is intended behaviour, I believe it should be added to the documentation for ?setorder.


MichaelChirico commented Jan 20, 2017

I believe this is documented, though perhaps it could be stated more clearly:

data.table always reorders in C-locale. To sort by session locale, use x[base::order(.)] instead.

Also, potential duplicate of #565? But definite duplicate of #1103.
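
To make the C-locale note above concrete, here is a minimal sketch. It assumes that base R's `order(x, method = "radix")`, which is documented to sort character vectors in the C-locale, is a fair stand-in for what `setorder` does. The variable names are illustrative only, and the data.table part runs only if the package is installed.

```r
# Sketch only: base R's radix method is used as a stand-in for C-locale ordering.
x <- c("a", "A", "B", "b")

# In the C-locale, strings compare by byte value, so every upper-case ASCII
# letter sorts before every lower-case one.
code_points <- vapply(x, utf8ToInt, integer(1))  # A = 65, B = 66, a = 97, b = 98

c_locale_order <- order(x, method = "radix")
c_sorted <- x[c_locale_order]                    # "A" "B" "a" "b"

# The default method for character vectors follows the session's collation,
# so this result varies between machines and locales.
locale_sorted <- x[order(x)]

if (requireNamespace("data.table", quietly = TRUE)) {
  DT <- data.table::data.table(x = x, y = 1:4)
  DT_locale <- DT[base::order(x)]  # session-locale order, returns a new table
  data.table::setorder(DT, x)      # reorders DT by reference, in C-locale
  dt_y <- DT$y                     # 2 3 1 4
}
```

The `base::` prefix matters here: it is the form the documentation recommends for getting a session-locale sort from inside a data.table.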


HughParsonage commented Jan 20, 2017

FWIW, I saw "data.table always reorders in C-locale" in the documentation (before raising this issue) but didn't realize it had that effect. (I thought it was just a comment about performance.)

HughParsonage added a commit to HughParsonage/data.table that referenced this issue Sep 26, 2017