From 2c13fb83cf041f868a601fb03722ae7f3f6050a2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?V=C3=A1clav=20Tlap=C3=A1k?= <55213630+tlapak@users.noreply.github.com> Date: Thu, 2 Apr 2020 12:49:25 +0200 Subject: [PATCH 1/9] Prevent replacing list column with vector of wrong length --- NEWS.md | 2 ++ inst/tests/tests.Rraw | 4 ++++ src/assign.c | 5 +++-- 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/NEWS.md b/NEWS.md index 71fd76aa6..5cb18584e 100644 --- a/NEWS.md +++ b/NEWS.md @@ -107,6 +107,8 @@ unit = "s") 12. `rbindlist` no longer errors when coercing complex vectors to character vectors, [#4202](https://github.com/Rdatatable/data.table/issues/4202). Thanks to @sritchie73 for reporting and the PR. +13. Attempting to replace an existing list column with a vector of the wrong length no longer leads to a segfault later, [#4166](https://github.com/Rdatatable/data.table/issues/4166). Thanks to @fklirono for reporting and to @tlapak for the PR. + ## NOTES 0. Retrospective license change permission was sought from and granted by 4 contributors who were missed in [PR#2456](https://github.com/Rdatatable/data.table/pull/2456), [#4140](https://github.com/Rdatatable/data.table/pull/4140). We had used [GitHub's contributor page](https://github.com/Rdatatable/data.table/graphs/contributors) which omits 3 of these due to invalid email addresses, unlike GitLab's contributor page which includes the ids. The 4th omission was a PR to a script which should not have been excluded; a script is code too. We are sorry these contributors were not properly credited before. They have now been added to the contributors list as displayed on CRAN. All the contributors of code to data.table hold its copyright jointly; your contributions belong to you. You contributed to data.table when it had a particular license at that time, and you contributed on that basis. This is why in the last license change, all contributors of code were consulted and each had a veto. diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw index 7cc6819e8..670cbc7e7 100644 --- a/inst/tests/tests.Rraw +++ b/inst/tests/tests.Rraw @@ -16846,3 +16846,7 @@ A = data.table(A=c(complex(real = 1:3, imaginary=c(0, -1, 1)), NaN)) test(2138.3, rbind(A,B), data.table(A=c(as.character(A$A), B$A))) A = data.table(A=as.complex(rep(NA, 5))) test(2138.4, rbind(A,B), data.table(A=c(as.character(A$A), B$A))) + +# Used to be uncaught which lead to a segfault down the line, #4116 +dt = data.table(a=list(1:2, 3)) +test(2139, dt[, a:=1:3], error="Supplied 3 items to be assigned to 2 items of column 'a'.*please use rep") \ No newline at end of file diff --git a/src/assign.c b/src/assign.c index 1392079e7..4e2573809 100644 --- a/src/assign.c +++ b/src/assign.c @@ -432,8 +432,9 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values) error(_("Can't assign to column '%s' (type 'factor') a value of type '%s' (not character, factor, integer or numeric)"), CHAR(STRING_ELT(names,coln)),type2char(TYPEOF(thisvalue))); } - if (nrow>0 && targetlen>0 && vlen>1 && vlen!=targetlen && (TYPEOF(existing)!=VECSXP || TYPEOF(thisvalue)==VECSXP)) { - // note that isNewList(R_NilValue) is true so it needs to be TYPEOF(existing)!=VECSXP above + if (nrow>0 && targetlen>0 && vlen>1 && vlen!=targetlen && !(TYPEOF(existing)==VECSXP && targetlen==1)) { + // We allow assigning objects of arbitrary to single items of list columns for convenience. + // Note that isNewList(R_NilValue) is true so it needs to be !(TYPEOF(existing)==VECSXP) above error(_("Supplied %d items to be assigned to %d items of column '%s'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code."), vlen, targetlen, CHAR(colnam)); } } From f0da4ec2fb38785eb4bab2e1323944a5b9a7774a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?V=C3=A1clav=20Tlap=C3=A1k?= <55213630+tlapak@users.noreply.github.com> Date: Thu, 2 Apr 2020 13:21:54 +0200 Subject: [PATCH 2/9] Corrected issue ref --- inst/tests/tests.Rraw | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw index 670cbc7e7..5ba3ca5c2 100644 --- a/inst/tests/tests.Rraw +++ b/inst/tests/tests.Rraw @@ -16847,6 +16847,6 @@ test(2138.3, rbind(A,B), data.table(A=c(as.character(A$A), B$A))) A = data.table(A=as.complex(rep(NA, 5))) test(2138.4, rbind(A,B), data.table(A=c(as.character(A$A), B$A))) -# Used to be uncaught which lead to a segfault down the line, #4116 +# Used to be uncaught which lead to a segfault down the line, #4166 dt = data.table(a=list(1:2, 3)) test(2139, dt[, a:=1:3], error="Supplied 3 items to be assigned to 2 items of column 'a'.*please use rep") \ No newline at end of file From 37691738b2e3239e633bebbcc8ed5de0ca498ee0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?V=C3=A1clav=20Tlap=C3=A1k?= <55213630+tlapak@users.noreply.github.com> Date: Fri, 3 Apr 2020 22:55:10 +0200 Subject: [PATCH 3/9] Fix segfault when subassigning to list col by wrapping elements of RHS in list() --- NEWS.md | 2 +- inst/tests/tests.Rraw | 11 +++++++++-- src/assign.c | 38 +++++++++++++++++++++++++++++++++----- 3 files changed, 43 insertions(+), 8 deletions(-) diff --git a/NEWS.md b/NEWS.md index 5cb18584e..2cbd6239c 100644 --- a/NEWS.md +++ b/NEWS.md @@ -107,7 +107,7 @@ unit = "s") 12. `rbindlist` no longer errors when coercing complex vectors to character vectors, [#4202](https://github.com/Rdatatable/data.table/issues/4202). Thanks to @sritchie73 for reporting and the PR. -13. Attempting to replace an existing list column with a vector of the wrong length no longer leads to a segfault later, [#4166](https://github.com/Rdatatable/data.table/issues/4166). Thanks to @fklirono for reporting and to @tlapak for the PR. +13. Attempting to replace an existing list column with a vector of the wrong length no longer leads to a segfault later, [#4166](https://github.com/Rdatatable/data.table/issues/4166). Subassigning a non-list vector would also lead to a segfault. The elements are now 'coerced' by individually wrapping them in `list()`. Thanks to @fklirono for the initial report and to @tlapak for the PR. ## NOTES diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw index 5ba3ca5c2..ed3ae5939 100644 --- a/inst/tests/tests.Rraw +++ b/inst/tests/tests.Rraw @@ -16848,5 +16848,12 @@ A = data.table(A=as.complex(rep(NA, 5))) test(2138.4, rbind(A,B), data.table(A=c(as.character(A$A), B$A))) # Used to be uncaught which lead to a segfault down the line, #4166 -dt = data.table(a=list(1:2, 3)) -test(2139, dt[, a:=1:3], error="Supplied 3 items to be assigned to 2 items of column 'a'.*please use rep") \ No newline at end of file +dt = data.table(a=list(1:2, 3, 4)) +test(2139.1, dt[, a:=1:4], error="Supplied 4 items to be assigned to 3 items of column 'a'.*please use rep") +# Subassigning into a list column with a non-list object of matching length would also segfault, now wrapped in list +test(2139.2, dt[1:2, a:=1:2 ]$a, list(1L, 2L, 4)) +test(2139.3, dt[1:2, a:=as.numeric(1:2)]$a, list(1, 2, 4)) +test(2139.4, dt[1:2, a:=c(TRUE, FALSE) ]$a, list(TRUE, FALSE, 4)) +test(2139.5, dt[1:2, a:=c('a', 'b') ]$a , list('a', 'b', 4)) +# Preserve attributes +test(2139.6, dt[1:2, a:=as.IDate(1:2)]$a, list(as.IDate(1), as.IDate(2), 4)) diff --git a/src/assign.c b/src/assign.c index 4e2573809..5b0edd4dd 100644 --- a/src/assign.c +++ b/src/assign.c @@ -1068,11 +1068,39 @@ const char *memrecycle(const SEXP target, const SEXP where, const int start, con BODY(SEXP, STRING_PTR, SEXP, val, SET_STRING_ELT(target, off+i, cval)) } case VECSXP : - case EXPRSXP : // #546 - if (TYPEOF(source)!=VECSXP && TYPEOF(source)!=EXPRSXP) - BODY(SEXP, &, SEXP, val, SET_VECTOR_ELT(target, off+i, cval)) - else - BODY(SEXP, SEXPPTR_RO, SEXP, val, SET_VECTOR_ELT(target, off+i, cval)) + case EXPRSXP : { // #546 + if (len == 1 && TYPEOF(source)!=VECSXP && TYPEOF(source)!=EXPRSXP) { + BODY(SEXP, &, SEXP, val, SET_VECTOR_ELT(target, off+i, cval)) + } else { + switch (TYPEOF(source)) { + case RAWSXP: BODY(Rbyte, RAW, SEXP, PROTECT(ScalarRaw(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) + case LGLSXP: BODY(int, INTEGER, SEXP, PROTECT(ScalarLogical(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) + case INTSXP: BODY(int, INTEGER, SEXP, PROTECT(ScalarInteger(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) + case REALSXP: + if (sourceIsI64) BODY(int64_t, REAL, SEXP, PROTECT(ScalarReal(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) + else BODY(double, REAL, SEXP, PROTECT(ScalarReal(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) + case CPLXSXP: BODY(Rcomplex, COMPLEX, SEXP, PROTECT(ScalarComplex(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) + case STRSXP: BODY(SEXP, STRING_PTR, SEXP, PROTECT(ScalarString(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) + case VECSXP: + case EXPRSXP: BODY(SEXP, SEXPPTR_RO, SEXP, val, SET_VECTOR_ELT(target, off+i, cval)) + default: COERCE_ERROR("list"); + } + } + } break; default : error(_("Unsupported column type in assign.c:memrecycle '%s'"), type2char(TYPEOF(target))); // # nocov } From e4c35a886febbbe09801cc1f5b4cc1c5f6e67550 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?V=C3=A1clav=20Tlap=C3=A1k?= <55213630+tlapak@users.noreply.github.com> Date: Sun, 12 Apr 2020 22:29:13 +0200 Subject: [PATCH 4/9] Added unit tests, fixed integer64 bug --- inst/tests/tests.Rraw | 21 ++++++++++++++------- src/assign.c | 25 +++++++++++-------------- 2 files changed, 25 insertions(+), 21 deletions(-) diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw index ed3ae5939..5d3a00679 100644 --- a/inst/tests/tests.Rraw +++ b/inst/tests/tests.Rraw @@ -16850,10 +16850,17 @@ test(2138.4, rbind(A,B), data.table(A=c(as.character(A$A), B$A))) # Used to be uncaught which lead to a segfault down the line, #4166 dt = data.table(a=list(1:2, 3, 4)) test(2139.1, dt[, a:=1:4], error="Supplied 4 items to be assigned to 3 items of column 'a'.*please use rep") -# Subassigning into a list column with a non-list object of matching length would also segfault, now wrapped in list -test(2139.2, dt[1:2, a:=1:2 ]$a, list(1L, 2L, 4)) -test(2139.3, dt[1:2, a:=as.numeric(1:2)]$a, list(1, 2, 4)) -test(2139.4, dt[1:2, a:=c(TRUE, FALSE) ]$a, list(TRUE, FALSE, 4)) -test(2139.5, dt[1:2, a:=c('a', 'b') ]$a , list('a', 'b', 4)) -# Preserve attributes -test(2139.6, dt[1:2, a:=as.IDate(1:2)]$a, list(as.IDate(1), as.IDate(2), 4)) +# Subassigning into a list column with a non-list object of matching length would also segfault, +# now wrapped in list, preserve attributes +test(2139.2, dt[1:2, a:=structure(c(1L, 2L), att='t') ]$a, list(structure(1L, att='t'), structure(2L, att='t'), 4)) +test(2139.3, dt[1:2, a:=structure(c(1, 2), att='t') ]$a, list(structure(1, att='t'), structure(2, att='t'), 4)) +test(2139.4, dt[1:2, a:=structure(as.raw(c(1, 2)), att='t') ]$a, list(structure(as.raw(1), att='t'), structure(as.raw(2), att='t'), 4)) +test(2139.5, dt[1:2, a:=structure(as.complex(c(1, 2)), att='t')]$a, list(structure(as.complex(1), att='t'), structure(as.complex(2), att='t'), 4)) +test(2139.6, dt[1:2, a:=structure(c(TRUE, FALSE), att='t') ]$a, list(structure(TRUE, att='t'), structure(FALSE, att='t'), 4)) +test(2139.7, dt[1:2, a:=structure(c('a', 'b'), att='t') ]$a, list(structure('a', att='t'), structure('b', att='t'), 4)) +# integer64 is a double decorated with class 'integer64' +if (test_bit64) { + test(2139.8, dt[1:2, a:=as.integer64(1:2) ]$a, list(as.integer64(1), as.integer64(2), 4)) +} +# Non-supported type +test(2139.9, dt[1:2, a:=call('sum', 1)], error="type 'language' cannot be coerced to 'list'") diff --git a/src/assign.c b/src/assign.c index 5b0edd4dd..ffbc1c638 100644 --- a/src/assign.c +++ b/src/assign.c @@ -1073,30 +1073,27 @@ const char *memrecycle(const SEXP target, const SEXP where, const int start, con BODY(SEXP, &, SEXP, val, SET_VECTOR_ELT(target, off+i, cval)) } else { switch (TYPEOF(source)) { - case RAWSXP: BODY(Rbyte, RAW, SEXP, PROTECT(ScalarRaw(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) + case RAWSXP: BODY(Rbyte, RAW, SEXP, PROTECT(ScalarRaw(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) case LGLSXP: BODY(int, INTEGER, SEXP, PROTECT(ScalarLogical(val)); protecti++, copyMostAttrib(source, cval); SET_VECTOR_ELT(target, off+i, cval)) case INTSXP: BODY(int, INTEGER, SEXP, PROTECT(ScalarInteger(val)); protecti++, copyMostAttrib(source, cval); SET_VECTOR_ELT(target, off+i, cval)) - case REALSXP: - if (sourceIsI64) BODY(int64_t, REAL, SEXP, PROTECT(ScalarReal(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) - else BODY(double, REAL, SEXP, PROTECT(ScalarReal(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) - case CPLXSXP: BODY(Rcomplex, COMPLEX, SEXP, PROTECT(ScalarComplex(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) + case REALSXP: BODY(double, REAL, SEXP, PROTECT(ScalarReal(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) + case CPLXSXP: BODY(Rcomplex, COMPLEX, SEXP, PROTECT(ScalarComplex(val)); protecti++, + copyMostAttrib(source, cval); + SET_VECTOR_ELT(target, off+i, cval)) case STRSXP: BODY(SEXP, STRING_PTR, SEXP, PROTECT(ScalarString(val)); protecti++, copyMostAttrib(source, cval); SET_VECTOR_ELT(target, off+i, cval)) case VECSXP: - case EXPRSXP: BODY(SEXP, SEXPPTR_RO, SEXP, val, SET_VECTOR_ELT(target, off+i, cval)) + case EXPRSXP: BODY(SEXP, SEXPPTR_RO, SEXP, val, + SET_VECTOR_ELT(target, off+i, cval)) default: COERCE_ERROR("list"); } } From 20f3d238f5339e97081b90bdcd5561134ad444a2 Mon Sep 17 00:00:00 2001 From: Matt Dowle Date: Tue, 18 May 2021 00:13:40 -0600 Subject: [PATCH 5/9] moved news item up and tweaked --- NEWS.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/NEWS.md b/NEWS.md index fd41a80cd..958a86fb6 100644 --- a/NEWS.md +++ b/NEWS.md @@ -127,6 +127,8 @@ 16. `DT[, min(colB), by=colA]` when `colB` is type `character` would miss blank strings (`""`) at the beginning of a group and return the smallest non-blank instead of blank, [#4848](https://github.com/Rdatatable/data.table/issues/4848). Thanks to Vadim Khotilovich for reporting and for the PR fixing it. +17. Assigning a wrong-length or non-list vector to a list column could segfault, [#4166](https://github.com/Rdatatable/data.table/issues/4166). Thanks to @fklirono for reporting, and to @tlapak for the PR. + ## NOTES 1. New feature 29 in v1.12.4 (Oct 2019) introduced zero-copy coercion. Our thinking is that requiring you to get the type right in the case of `0` (type double) vs `0L` (type integer) is too inconvenient for you the user. So such coercions happen in `data.table` automatically without warning. Thanks to zero-copy coercion there is no speed penalty, even when calling `set()` many times in a loop, so there's no speed penalty to warn you about either. However, we believe that assigning a character value such as `"2"` into an integer column is more likely to be a user mistake that you would like to be warned about. The type difference (character vs integer) may be the only clue that you have selected the wrong column, or typed the wrong variable to be assigned to that column. For this reason we view character to numeric-like coercion differently and will warn about it. If it is correct, then the warning is intended to nudge you to wrap the RHS with `as.()` so that it is clear to readers of your code that a coercion from character to that type is intended. For example : @@ -368,8 +370,6 @@ has a better chance of working on Mac. 20. The environment variable `R_DATATABLE_NUM_THREADS` was being limited by `R_DATATABLE_NUM_PROCS_PERCENT` (by default 50%), [#4514](https://github.com/Rdatatable/data.table/issues/4514). It is now consistent with `setDTthreads()` and only limited by the full number of logical CPUs. For example, on a machine with 8 logical CPUs, `R_DATATABLE_NUM_THREADS=6` now results in 6 threads rather than 4 (50% of 8).r -13. Attempting to replace an existing list column with a vector of the wrong length no longer leads to a segfault later, [#4166](https://github.com/Rdatatable/data.table/issues/4166). Subassigning a non-list vector would also lead to a segfault. The elements are now 'coerced' by individually wrapping them in `list()`. Thanks to @fklirono for the initial report and to @tlapak for the PR. - ## NOTES 0. Retrospective license change permission was sought from and granted by 4 contributors who were missed in [PR#2456](https://github.com/Rdatatable/data.table/pull/2456), [#4140](https://github.com/Rdatatable/data.table/pull/4140). We had used [GitHub's contributor page](https://github.com/Rdatatable/data.table/graphs/contributors) which omits 3 of these due to invalid email addresses, unlike GitLab's contributor page which includes the ids. The 4th omission was a PR to a script which should not have been excluded; a script is code too. We are sorry these contributors were not properly credited before. They have now been added to the contributors list as displayed on CRAN. All the contributors of code to data.table hold its copyright jointly; your contributions belong to you. You contributed to data.table when it had a particular license at that time, and you contributed on that basis. This is why in the last license change, all contributors of code were consulted and each had a veto. From 1834b72d9873e53b36ddb56dbdcf70986bab334a Mon Sep 17 00:00:00 2001 From: Matt Dowle Date: Tue, 18 May 2021 01:37:33 -0600 Subject: [PATCH 6/9] moved copyMostAttrib into CAST outside loop, no protect, and aligned vertically --- src/assign.c | 35 ++++++++++++----------------------- 1 file changed, 12 insertions(+), 23 deletions(-) diff --git a/src/assign.c b/src/assign.c index d7ccfb9ef..c87d99bdb 100644 --- a/src/assign.c +++ b/src/assign.c @@ -434,7 +434,7 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values) } // RHS of assignment to new column is zero length but we'll use its type to create all-NA column of that type } - { + { int j; if (isMatrix(thisvalue) && (j=INTEGER(getAttrib(thisvalue, R_DimSymbol))[1]) > 1) // matrix passes above (considered atomic vector) warning(_("%d column matrix RHS of := will be treated as one vector"), j); @@ -1066,35 +1066,24 @@ const char *memrecycle(const SEXP target, const SEXP where, const int start, con BODY(SEXP, STRING_PTR, SEXP, val, SET_STRING_ELT(target, off+i, cval)) } case VECSXP : - case EXPRSXP : { // #546 + case EXPRSXP : { // #546 #4350 if (len == 1 && TYPEOF(source)!=VECSXP && TYPEOF(source)!=EXPRSXP) { BODY(SEXP, &, SEXP, val, SET_VECTOR_ELT(target, off+i, cval)) } else { switch (TYPEOF(source)) { - case RAWSXP: BODY(Rbyte, RAW, SEXP, PROTECT(ScalarRaw(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) - case LGLSXP: BODY(int, INTEGER, SEXP, PROTECT(ScalarLogical(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) - case INTSXP: BODY(int, INTEGER, SEXP, PROTECT(ScalarInteger(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) - case REALSXP: BODY(double, REAL, SEXP, PROTECT(ScalarReal(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) - case CPLXSXP: BODY(Rcomplex, COMPLEX, SEXP, PROTECT(ScalarComplex(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) - case STRSXP: BODY(SEXP, STRING_PTR, SEXP, PROTECT(ScalarString(val)); protecti++, - copyMostAttrib(source, cval); - SET_VECTOR_ELT(target, off+i, cval)) + // no protect of CAST needed because SET_VECTOR_ELT protects it, and it can't get released by copyMostAttrib or anything else inside BODY + // copyMostAttrib is appended to CAST so as to be outside loop + case RAWSXP: BODY(Rbyte, RAW, SEXP, ScalarRaw(val); copyMostAttrib(source,cval), SET_VECTOR_ELT(target,off+i,cval)) + case LGLSXP: BODY(int, INTEGER, SEXP, ScalarLogical(val);copyMostAttrib(source,cval), SET_VECTOR_ELT(target,off+i,cval)) + case INTSXP: BODY(int, INTEGER, SEXP, ScalarInteger(val);copyMostAttrib(source,cval), SET_VECTOR_ELT(target,off+i,cval)) + case REALSXP: BODY(double, REAL, SEXP, ScalarReal(val); copyMostAttrib(source,cval), SET_VECTOR_ELT(target,off+i,cval)) + case CPLXSXP: BODY(Rcomplex, COMPLEX, SEXP, ScalarComplex(val);copyMostAttrib(source,cval), SET_VECTOR_ELT(target,off+i,cval)) + case STRSXP: BODY(SEXP, STRING_PTR, SEXP, ScalarString(val); copyMostAttrib(source,cval), SET_VECTOR_ELT(target,off+i,cval)) case VECSXP: - case EXPRSXP: BODY(SEXP, SEXPPTR_RO, SEXP, val, - SET_VECTOR_ELT(target, off+i, cval)) + case EXPRSXP: BODY(SEXP, SEXPPTR_RO, SEXP, val, SET_VECTOR_ELT(target,off+i,cval)) default: COERCE_ERROR("list"); } - } + } } break; default : error(_("Unsupported column type in assign.c:memrecycle '%s'"), type2char(TYPEOF(target))); // # nocov From 40e0a8902f38d8187fcac16904eb1734428764eb Mon Sep 17 00:00:00 2001 From: Matt Dowle Date: Tue, 18 May 2021 01:43:43 -0600 Subject: [PATCH 7/9] dt => DT --- inst/tests/tests.Rraw | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw index d75d1084a..29a76866c 100644 --- a/inst/tests/tests.Rraw +++ b/inst/tests/tests.Rraw @@ -17634,24 +17634,24 @@ test(2188.15, fifelse(TRUE, NA, factor('a'), factor('a', levels = c('a','b'))), test(2188.16, fifelse(c(NA, NA), 1L, 2L, NULL), c(NA_integer_, NA_integer_)) # NULL `na` is treated as NA # rolling join expected output on non-matching join column has been fixed #1913 -dt = data.table(ID=1:5, A=c(1.3, 1.7, 2.4, 0.9, 0.6)) +DT = data.table(ID=1:5, A=c(1.3, 1.7, 2.4, 0.9, 0.6)) buckets = data.table(BucketID=1:4, BinA=1:4) -dt[, A.copy := A] -test(2189.1, buckets[dt, on=c("BinA"="A"), roll=-Inf], data.table(BucketID = c(2L, 2L, 3L, 1L, 1L), BinA = c(1.3, 1.7, 2.4, 0.9, 0.6), ID = 1:5, A.copy = c(1.3, 1.7, 2.4, 0.9, 0.6))) +DT[, A.copy := A] +test(2189.1, buckets[DT, on=c("BinA"="A"), roll=-Inf], data.table(BucketID = c(2L, 2L, 3L, 1L, 1L), BinA = c(1.3, 1.7, 2.4, 0.9, 0.6), ID = 1:5, A.copy = c(1.3, 1.7, 2.4, 0.9, 0.6))) buckets[, BinA := as.numeric(BinA)] -test(2189.2, buckets[dt, on=c("BinA"="A"), roll=-Inf], data.table(BucketID = c(2L, 2L, 3L, 1L, 1L), BinA = c(1.3, 1.7, 2.4, 0.9, 0.6), ID = 1:5, A.copy = c(1.3, 1.7, 2.4, 0.9, 0.6))) +test(2189.2, buckets[DT, on=c("BinA"="A"), roll=-Inf], data.table(BucketID = c(2L, 2L, 3L, 1L, 1L), BinA = c(1.3, 1.7, 2.4, 0.9, 0.6), ID = 1:5, A.copy = c(1.3, 1.7, 2.4, 0.9, 0.6))) # segfault subassigning non-list type to list column, #4166 -dt = data.table(a=list(1:2, 3, 4)) -test(2190.1, dt[, a:=1:4], error="Supplied 4 items to be assigned to 3 items of column 'a'.*please use rep") -test(2190.2, dt[1:2, a:=structure(c(1L, 2L), att='t') ]$a, list(structure(1L, att='t'), structure(2L, att='t'), 4)) -test(2190.3, dt[1:2, a:=structure(c(1, 2), att='t') ]$a, list(structure(1, att='t'), structure(2, att='t'), 4)) -test(2190.4, dt[1:2, a:=structure(as.raw(c(1, 2)), att='t') ]$a, list(structure(as.raw(1), att='t'), structure(as.raw(2), att='t'), 4)) -test(2190.5, dt[1:2, a:=structure(as.complex(c(1, 2)), att='t')]$a, list(structure(as.complex(1), att='t'), structure(as.complex(2), att='t'), 4)) -test(2190.6, dt[1:2, a:=structure(c(TRUE, FALSE), att='t') ]$a, list(structure(TRUE, att='t'), structure(FALSE, att='t'), 4)) -test(2190.7, dt[1:2, a:=structure(c('a', 'b'), att='t') ]$a, list(structure('a', att='t'), structure('b', att='t'), 4)) +DT = data.table(a=list(1:2, 3, 4)) +test(2190.1, DT[, a:=1:4], error="Supplied 4 items to be assigned to 3 items of column 'a'.*please use rep") +test(2190.2, DT[1:2, a:=structure(c(1L, 2L), att='t') ]$a, list(structure(1L, att='t'), structure(2L, att='t'), 4)) +test(2190.3, DT[1:2, a:=structure(c(1, 2), att='t') ]$a, list(structure(1, att='t'), structure(2, att='t'), 4)) +test(2190.4, DT[1:2, a:=structure(as.raw(c(1, 2)), att='t') ]$a, list(structure(as.raw(1), att='t'), structure(as.raw(2), att='t'), 4)) +test(2190.5, DT[1:2, a:=structure(as.complex(c(1, 2)), att='t')]$a, list(structure(as.complex(1), att='t'), structure(as.complex(2), att='t'), 4)) +test(2190.6, DT[1:2, a:=structure(c(TRUE, FALSE), att='t') ]$a, list(structure(TRUE, att='t'), structure(FALSE, att='t'), 4)) +test(2190.7, DT[1:2, a:=structure(c('a', 'b'), att='t') ]$a, list(structure('a', att='t'), structure('b', att='t'), 4)) if (test_bit64) { - test(2190.8, dt[1:2, a:=as.integer64(1:2) ]$a, list(as.integer64(1), as.integer64(2), 4)) + test(2190.8, DT[1:2, a:=as.integer64(1:2) ]$a, list(as.integer64(1), as.integer64(2), 4)) } -test(2190.9, dt[1:2, a:=call('sum', 1)], error="type 'language' cannot be coerced to 'list'") +test(2190.9, DT[1:2, a:=call('sum', 1)], error="type 'language' cannot be coerced to 'list'") From 54333e5d6d75160e780524136a7e224187dd46b8 Mon Sep 17 00:00:00 2001 From: Matt Dowle Date: Tue, 18 May 2021 02:02:52 -0600 Subject: [PATCH 8/9] news item link other issues --- NEWS.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/NEWS.md b/NEWS.md index 958a86fb6..9d092fdee 100644 --- a/NEWS.md +++ b/NEWS.md @@ -127,7 +127,7 @@ 16. `DT[, min(colB), by=colA]` when `colB` is type `character` would miss blank strings (`""`) at the beginning of a group and return the smallest non-blank instead of blank, [#4848](https://github.com/Rdatatable/data.table/issues/4848). Thanks to Vadim Khotilovich for reporting and for the PR fixing it. -17. Assigning a wrong-length or non-list vector to a list column could segfault, [#4166](https://github.com/Rdatatable/data.table/issues/4166). Thanks to @fklirono for reporting, and to @tlapak for the PR. +17. Assigning a wrong-length or non-list vector to a list column could segfault, [#4166](https://github.com/Rdatatable/data.table/issues/4166) [#4678](https://github.com/Rdatatable/data.table/issues/4678) [#4729](https://github.com/Rdatatable/data.table/issues/4729). Thanks to @fklirono, @kevinvzandvoort and @peterlittlejohn for reporting, and to Václav Tlapák for the PR. ## NOTES From 5a811d10db39660a6903094908f0cb81a7583adf Mon Sep 17 00:00:00 2001 From: Matt Dowle Date: Tue, 18 May 2021 02:10:32 -0600 Subject: [PATCH 9/9] news item another linked issue --- NEWS.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/NEWS.md b/NEWS.md index 9d092fdee..aae123455 100644 --- a/NEWS.md +++ b/NEWS.md @@ -127,7 +127,7 @@ 16. `DT[, min(colB), by=colA]` when `colB` is type `character` would miss blank strings (`""`) at the beginning of a group and return the smallest non-blank instead of blank, [#4848](https://github.com/Rdatatable/data.table/issues/4848). Thanks to Vadim Khotilovich for reporting and for the PR fixing it. -17. Assigning a wrong-length or non-list vector to a list column could segfault, [#4166](https://github.com/Rdatatable/data.table/issues/4166) [#4678](https://github.com/Rdatatable/data.table/issues/4678) [#4729](https://github.com/Rdatatable/data.table/issues/4729). Thanks to @fklirono, @kevinvzandvoort and @peterlittlejohn for reporting, and to Václav Tlapák for the PR. +17. Assigning a wrong-length or non-list vector to a list column could segfault, [#4166](https://github.com/Rdatatable/data.table/issues/4166) [#4667](https://github.com/Rdatatable/data.table/issues/4667) [#4678](https://github.com/Rdatatable/data.table/issues/4678) [#4729](https://github.com/Rdatatable/data.table/issues/4729). Thanks to @fklirono, Kun Ren, @kevinvzandvoort and @peterlittlejohn for reporting, and to Václav Tlapák for the PR. ## NOTES