Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed typeorder coercion rules for RAWSXP and LGLSXP to match base R #4195

Closed
wants to merge 7 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,8 @@ unit = "s")

6. `all.equal(DT, y)` no longer errors when `y` is not a data.table, [#4042](https://github.com/Rdatatable/data.table/issues/4042). Thanks to @d-sci for reporting and the PR.

7. `rbindlist()` now correctly coerces raw to logical instead of vice-versa [#4172](https://github.com/Rdatatable/data.table/issues/4172), making it consistent with base R's coercion rules. Thanks to @sritchie73 for reporting and fixing.

## NOTES

1. `as.IDate`, `as.ITime`, `second`, `minute`, and `hour` now recognize UTC equivalents for speed: GMT, GMT-0, GMT+0, GMT0, Etc/GMT, and Etc/UTC, [#4116](https://github.com/Rdatatable/data.table/issues/4116).
Expand Down
8 changes: 7 additions & 1 deletion inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -16770,7 +16770,13 @@ test(2132.2, fifelse(TRUE, 1, s2), error = "S4 class objects (except nanot
test(2132.3, fcase(TRUE, s1, FALSE, s2), error = "S4 class objects (except nanotime) are not supported. Please see https://github.com/Rdatatable/data.table/issues/4131.")
rm(s1, s2, class2132)

# Check rbindlist coercion rules for raw match base R (e.g. using c()) #4172
DT1 = data.table(a=as.raw(1), b=as.raw(2), c=as.raw(3), d=as.raw(4), e=as.raw(5), f=as.raw(6), g=as.raw(7), h=as.raw(8))
DT2 = data.table(a=as.raw(1), b=TRUE, c=1L, d=1.5, e=complex(real=3, imaginary=1), f="a", g=list(3:5), h=expression(1+1))
DT3 = setDT(lapply(names(DT1), function(j) c(DT1[[j]], DT2[[j]])))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting merge conflict for this test with #4196.

To resolve, replace this line with:

DT3 = data.table(a=c(DT1$a, DT2$a), b=c(DT1$b, DT2$b), c=c(DT1$c, DT2$c),
                 d=c(DT1$d, DT2$d), e=c(DT1$e, DT2$e), f=c(DT1$f, DT2$f),
                 g=c(DT1$g, DT2$g), h=c(DT1$h, list(DT2$h)))

(i.e. because in this current PR coercion of any type with expression leads to an expression vector, while PR #4196 expects that column to be coerced to a list)

setnames(DT3, names(DT1))
test(2133, rbind(DT1, DT2), DT3, warning="Column 2 of item 1: 2 (type 'raw') at RHS position 1 taken as TRUE when assigning to type 'logical' (column 2 named 'b')") # warning expected due to loss of precision when coercing raw to logical

########################
# Add new tests here #
########################
########################
4 changes: 2 additions & 2 deletions src/init.c
Original file line number Diff line number Diff line change
Expand Up @@ -223,8 +223,8 @@ R_ExternalMethodDef externalMethods[] = {
static void setSizes() {
for (int i=0; i<100; ++i) { sizes[i]=0; typeorder[i]=0; }
// only these types are currently allowed as column types :
sizes[LGLSXP] = sizeof(int); typeorder[LGLSXP] = 0;
sizes[RAWSXP] = sizeof(Rbyte); typeorder[RAWSXP] = 1;
sizes[RAWSXP] = sizeof(Rbyte); typeorder[RAWSXP] = 0;
sizes[LGLSXP] = sizeof(int); typeorder[LGLSXP] = 1;
sizes[INTSXP] = sizeof(int); typeorder[INTSXP] = 2; // integer and factor
sizes[REALSXP] = sizeof(double); typeorder[REALSXP] = 3; // numeric and integer64
sizes[CPLXSXP] = sizeof(Rcomplex); typeorder[CPLXSXP] = 4;
Expand Down
7 changes: 4 additions & 3 deletions src/rbindlist.c
Original file line number Diff line number Diff line change
Expand Up @@ -271,7 +271,7 @@ SEXP rbindlist(SEXP l, SEXP usenamesArg, SEXP fillArg, SEXP idcolArg)

SEXP coercedForFactor = NULL;
for(int j=0; j<ncol; ++j) {
int maxType=LGLSXP; // initialize with LGLSXP for test 2002.3 which has col x NULL in both lists to be filled with NA for #1871
int maxType=-1; // Initialize to -1, which is not a SEXPTYPE, to detect when both columns are NULL to be filled with NA_logical_
bool factor=false, orderedFactor=false; // ordered factor is class c("ordered","factor"). isFactor() is true when isOrdered() is true.
int longestLen=0, longestW=-1, longestI=-1; // just for ordered factor
SEXP longestLevels=R_NilValue; // just for ordered factor
Expand All @@ -292,8 +292,8 @@ SEXP rbindlist(SEXP l, SEXP usenamesArg, SEXP fillArg, SEXP idcolArg)
}
SEXP thisCol = VECTOR_ELT(li, w);
int thisType = TYPEOF(thisCol);
// Use >= for #546 -- TYPEORDER=0 for both LGLSXP and EXPRSXP (but also NULL)
if (TYPEORDER(thisType)>=TYPEORDER(maxType) && !isNull(thisCol)) maxType=thisType;
// Use >= for #546 -- TYPEORDER=0 for both RAWSXP and EXPRSXP. For NULL, keep maxType as -1
if (!isNull(thisCol) && (maxType == -1 || TYPEORDER(thisType)>=TYPEORDER(maxType))) maxType=thisType;
if (isFactor(thisCol)) {
if (isNull(getAttrib(thisCol,R_LevelsSymbol))) error(_("Column %d of item %d has type 'factor' but has no levels; i.e. malformed."), w+1, i+1);
factor = true;
Expand All @@ -320,6 +320,7 @@ SEXP rbindlist(SEXP l, SEXP usenamesArg, SEXP fillArg, SEXP idcolArg)
}
}

if (maxType == -1) maxType=LGLSXP; // col X is NULL in both lists then column to be filled with NA_logical_ #1871, test 2002.03
if (!foundName) { static char buff[12]; sprintf(buff,"V%d",j+1), SET_STRING_ELT(ansNames, idcol+j, mkChar(buff)); foundName=buff; }
if (factor) maxType=INTSXP; // if any items are factors then a factor is created (could be an option)
if (int64 && maxType!=REALSXP)
Expand Down