Inconsistently dropping class/attributes of column vector #1160

Closed
renkun-ken opened this issue May 23, 2015 · 2 comments

@renkun-ken
Member

I'm creating a package for making formattable data. It works nicely with data.table in most cases, but in the following cases data.table behaves inconsistently:

library(data.table)
library(formattable)
p <- data.table(
  id = c(1, 2, 3, 4, 5), 
  name = c("A1", "A1", "B1", "B1", "C1"),
  balance = accounting(c(52500, 36150, 25000, 18300, 7600), format = "d"),
  growth = percent(c(0.3, 0.3, 0.1, 0.15, 0.15), format = "d"),
  ready = formattable(c(TRUE, TRUE, FALSE, FALSE, TRUE), "yes", "no"))

Then p looks like this:

   id name balance growth ready
1:  1   A1  52,500    30%   yes
2:  2   A1  36,150    30%   yes
3:  3   B1  25,000    10%    no
4:  4   B1  18,300    15%    no
5:  5   C1   7,600    15%   yes

Then I do some aggregation:

> p[, .(balance = mean(balance)), by = .(name)]
   name balance
1:   A1  44,325
2:   B1  21,650
3:   C1   7,600
> p[, .(balance = mean(balance), growth = min(growth)), by = .(name)]
   name balance growth
1:   A1  44,325    30%
2:   B1  21,650    10%
3:   C1   7,600    15%
> p[, .(balance = mean(balance), growth = min(growth), ready = all(ready)), by = .(name)]
   name balance growth ready
1:   A1   44325    30%   yes
2:   B1   21650    10%    no
3:   C1    7600    15%   yes
> p[, .(balance = last(balance), growth = min(growth), ready = all(ready)), by = .(name)]
   name balance growth ready
1:   A1  36,150    30%   yes
2:   B1  18,300    10%    no
3:   C1   7,600    15%   yes
> class(p[, .(balance = mean(balance), growth = min(growth), ready = all(ready)), by = .(name)]$balance) ## should be c("formattable", "numeric")
[1] "numeric"

In the first two cases and the last case, the formattable class and attributes are preserved. In the third case, however, the balance column derived from mean(balance) loses its formattable numeric class, unlike in the other cases. I'm not sure whether this is intended for some reason, but it looks very much like a bug.

I'm using the latest commit data.table@b6ea972 under R 3.2.0 and Win8.1 x64.

@renkun-ken
Member Author

I ran some experiments and found that as soon as a non-numeric calculation appears in j, the class/attributes are no longer preserved correctly.

> p[, .(balance = mean(balance), growth = mean(growth)), by = .(name)]
   name balance growth
1:   A1  44,325    30%
2:   B1  21,650    12%
3:   C1   7,600    15%
> p[, .(balance = mean(balance), growth = mean(growth), ready = all(ready)), by = .(name)]
   name balance growth ready
1:   A1   44325  0.300   yes
2:   B1   21650  0.125    no
3:   C1    7600  0.150   yes
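
The comparison above can also be done mechanically rather than by reading the printed output. Below is a minimal sketch in base R, assuming two hypothetical helpers, `column_classes` and `class_mismatches`, that belong to neither data.table nor formattable. The stand-in data uses plain lists and a made-up S3 class `money` in place of formattable, so it runs without either package. It only detects a dropped class; it says nothing about why data.table drops it.

```r
# Hypothetical helper: collapse each column's class vector into one string,
# e.g. "formattable/numeric", so two results can be compared column by column.
column_classes <- function(x) {
  vapply(as.list(x),
         function(col) paste(class(col), collapse = "/"),
         character(1))
}

# Hypothetical helper: names of the shared columns whose class differs
# between an expected result and an actual result.
class_mismatches <- function(expected, actual) {
  common <- intersect(names(expected), names(actual))
  e <- column_classes(as.list(expected)[common])
  a <- column_classes(as.list(actual)[common])
  common[e != a]
}

# Stand-in data: "money" is a made-up S3 class playing the role of formattable.
tagged  <- structure(c(44325, 21650, 7600), class = c("money", "numeric"))
kept    <- list(name = c("A1", "B1", "C1"), balance = tagged)
dropped <- list(name = c("A1", "B1", "C1"), balance = unclass(tagged))

column_classes(kept)             # class string for every column
class_mismatches(kept, dropped)  # columns that lost or changed their class
```

With data.table loaded, the same helpers could be given the two grouped results from this comment (the one without `ready = all(ready)` as `expected`, the one with it as `actual`) to list which columns lost their class.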

renkun-ken changed the title from "Inconsistently droping class/attributes of column vector" to "Inconsistently dropping class/attributes of column vector" on May 24, 2015
arunsrinivasan added this to the v1.9.6 milestone on May 28, 2015
arunsrinivasan self-assigned this on May 28, 2015
@renkun-ken
Member Author

Thanks!
