Skip to content

Commit

Permalink
Closes #705, rbindlist errors on non-identical non-factor class attri…
Browse files Browse the repository at this point in the history
…butes.
  • Loading branch information
arunsrinivasan committed Mar 1, 2015
1 parent ee92c58 commit a221865
Show file tree
Hide file tree
Showing 3 changed files with 18 additions and 1 deletion.
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,8 @@

42. When `by` expression is, for example, `by = x %% 2`, `data.table` tries to automatically extracts meaningful column names from the expression. In this case it would be `x`. However, if the `j-expression` also contains `x`, for example, `DT[, last(x), by= x %% 2]`, the original `x` got masked by the expression in `by`. This is now fixed; by-expressions are not simplified in column names for these cases. Closes [#497](https://github.com/Rdatatable/data.table/issues/497). Thanks to @GSee for the report.

43. `rbindlist` now errors when columns have non-identical class attributes and are not `factor`s, e.g., binding column of class `Date` with `POSIXct`. Previously this returned incorrect results. Closes [#705](https://github.com/Rdatatable/data.table/issues/705). Thanks to @ecoRoland for the minimal report.

#### NOTES

1. Clearer explanation of what `duplicated()` does (borrowed from base). Thanks to @matthieugomez for pointing out. Closes [#872](https://github.com/Rdatatable/data.table/issues/872).
Expand Down
6 changes: 6 additions & 0 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -6094,6 +6094,12 @@ test(1492, ans1, ans2)
dt = data.table(x=1:10, y=11:20)
test(1493, dt[, .(x=sum(x)),by= x %% 2, verbose=TRUE], data.table(`x%%2`=c(1,0), x=c(25L,30L)), output="by-expression 'x%%2' is not named")

# Fix for #705
DT1 = data.table(date=as.POSIXct("2014-06-22", format="%Y-%m-%d", tz="GMT"))
DT2 = data.table(date=as.Date("2014-06-23"))
test(1494.1, rbind(DT1, DT2), error="Class attributes at column")
test(1494.2, rbind(DT2, DT1), error="Class attributes at column")

##########################


Expand Down
11 changes: 10 additions & 1 deletion src/rbindlist.c
Original file line number Diff line number Diff line change
Expand Up @@ -497,7 +497,7 @@ static SEXP match_names(SEXP v) {
static void preprocess(SEXP l, Rboolean usenames, Rboolean fill, struct preprocessData *data) {

R_len_t i, j, idx;
SEXP li, lnames=R_NilValue, fnames, findices=R_NilValue, f_ind=R_NilValue, thiscol, col_name=R_NilValue;
SEXP li, lnames=R_NilValue, fnames, findices=R_NilValue, f_ind=R_NilValue, thiscol, col_name=R_NilValue, thisClass = R_NilValue;
SEXPTYPE type;

data->first = -1; data->lcount = 0; data->n_rows = 0; data->n_cols = 0; data->protecti = 0;
Expand Down Expand Up @@ -568,17 +568,26 @@ static void preprocess(SEXP l, Rboolean usenames, Rboolean fill, struct preproce
data->max_type = Calloc(data->n_cols, SEXPTYPE);
data->is_factor = Calloc(data->n_cols, int);
for (i = 0; i< data->n_cols; i++) {
thisClass = R_NilValue;
if (usenames) f_ind = VECTOR_ELT(findices, i);
for (j=data->first; j<LENGTH(l); j++) {
if (data->is_factor[i] == 2) break;
idx = (usenames) ? INTEGER(f_ind)[j] : i;
li = VECTOR_ELT(l, j);
if (isNull(li) || !LENGTH(li) || idx < 0) continue;
thiscol = VECTOR_ELT(li, idx);
// Fix for #705, check attributes
if (j == data->first)
thisClass = getAttrib(thiscol, R_ClassSymbol);
if (isFactor(thiscol)) {
data->is_factor[i] = (isOrdered(thiscol)) ? 2 : 1;
data->max_type[i] = STRSXP;
} else {
// Fix for #705, check attributes and error if non-factor class and not identical
if (!data->is_factor[i] &&
!R_compute_identical(thisClass, getAttrib(thiscol, R_ClassSymbol), 0)) {
error("Class attributes at column %d of input list at position %d does not match with column %d of input list at position %d. Coercion of objects of class 'factor' alone is handled internally by rbind/rbindlist at the moment.", i+1, j+1, i+1, data->first+1);
}
type = TYPEOF(thiscol);
if (type > data->max_type[i]) data->max_type[i] = type;
}
Expand Down

0 comments on commit a221865

Please sign in to comment.