Unexpected result when creating a factor column with 'by=' present #2522

Closed
ben519 opened this issue Dec 13, 2017 · 2 comments · Fixed by #3906

ben519 commented Dec 13, 2017

For example,

library(data.table)

dt <- data.table(
  id = 1:9,
  grp = rep(1:3, each = 3),
  val = c("a","b","c",   "a","b","c",   "a","b","c")
)

dt[, valfactor1 := factor(val), by = grp]  # fine and dandy
dt[, valfactor2 := factor(val), by = id]   # breaks
dt
   id grp val valfactor1 valfactor2
1:  1   1   a          a          c
2:  2   1   b          b          c
3:  3   1   c          c          c
4:  4   2   a          a          c
5:  5   2   b          b          c
6:  6   2   c          c          c
7:  7   3   a          a          c
8:  8   3   b          b          c
9:  9   3   c          c          c

I suspect this has come up before, but I couldn't find an existing report. I'm also not sure what reasonable behavior would be if I were trying to assign factors by group with ordered levels. At the least, I think a warning in these cases would be helpful. Right now valfactor2 silently comes out as "c" in every row.
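A minimal base R sketch of one way to read the output above. This is an assumption about the mechanism, not something confirmed in this thread: a factor built from a single value always has integer code 1, so if the per-group codes are kept but only one group's levels end up on the column, every row prints as that one level. The names `vals`, `codes` and `f` are made up for the illustration.

```r
# Each "group" here is a single value, as with by = id.
vals <- c("a", "b", "c")

# factor() on a single value has exactly one level, so its integer code is 1.
codes <- vapply(vals, function(v) as.integer(factor(v)), integer(1),
                USE.NAMES = FALSE)
print(codes)

# Assumption for illustration: only the last group's levels ("c") survive.
# Attaching that single level to the codes makes every element print as "c".
f <- structure(codes, levels = "c", class = "factor")
print(f)
```

With by = grp every group has the same levels (a, b, c), which would explain why valfactor1 looks fine.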

Thanks!

sessionInfo()

R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.10.5

loaded via a namespace (and not attached):
[1] compiler_3.4.2 tools_3.4.2    yaml_2.1.15

franknarf1 (Contributor) commented

Seems related to #2199

mattdowle added this to the 1.12.4 milestone Mar 20, 2019

Henrik-P commented

I assume this is the same underlying issue, but I thought I might as well post it here for reference. I stumbled on it when I needed a factor version of .GRP for plotting purposes.

d <- data.table(x = rep(letters[c(3, 1, 2)], each = 2))
d[ , `:=`(
  g = .GRP,
  f = factor(.GRP)),
  by = x]

#    x g f
# 1: c 1 3
# 2: c 1 3
# 3: a 2 3
# 4: a 2 3
# 5: b 3 3
# 6: b 3 3
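
A possible workaround sketch for versions affected by this issue, assuming the goal is just a factor version of the group index: assign the plain integer `.GRP` inside `by=`, then convert to a factor in a second, ungrouped step so the levels are computed once over the whole column.

```r
library(data.table)

d <- data.table(x = rep(letters[c(3, 1, 2)], each = 2))

# Step 1: integer group index, assigned by group.
# Step 2: convert the whole column to a factor without by=,
#         so all rows share one set of levels.
d[, g := .GRP, by = x][, f := factor(g)]

print(d)
print(levels(d$f))
```

The same two-step pattern (build a non-factor column by group, then call factor() on it without `by=`) should also apply to the valfactor2 example in the original report.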
