Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CJ fails with proper error msg #2641

Merged
merged 7 commits into from
Feb 28, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -141,7 +141,9 @@ Thanks to @MichaelChirico for reporting and to @MarkusBonsch for the implementat

34. Fixed bug where result of `merge()` would contain duplicate column names if `by.x` was also in `names(y)`.
Where there are duplicate column names (i.e. `suffixes = c("", "")`) `merge()` will throw a warning to match
the behaviour of `base:::merge.data.frame()`. Thanks to @sritchie73 for reporting and fixing [PR#2631](https://github.com/Rdatatable/data.table/pull/2631).
the behaviour of `base:::merge.data.frame()`. Thanks to @sritchie73 for reporting and fixing [PR#2631](https://github.com/Rdatatable/data.table/pull/2631).

35. `CJ()` now fails with proper error message when results would exceed max integer, [#2636](https://github.com/Rdatatable/data.table/issues/2636).

#### NOTES

Expand All @@ -167,7 +169,6 @@ the behaviour of `base:::merge.data.frame()`. Thanks to @sritchie73 for reportin

10. As warned in v1.9.8 release notes below in this file (on CRAN 25 Nov 2016) it has been 1 year since then and so use of `options(datatable.old.unique.by.key=TRUE)` to restore the old default is now deprecated with warning. The new warning states that this option still works and repeats the request to pass `by=key(DT)` explicitly to `unique()`, `duplicated()`, `uniqueN()` and `anyDuplicated()` and to stop using this option. In another year, this warning will become error. Another year after that the option will be removed.


### Changes in v1.10.4-3 (on CRAN 20 Oct 2017)

1. Fixed crash/hang on MacOS when `parallel::mclapply` is used and data.table is merely loaded, [#2418](https://github.com/Rdatatable/data.table/issues/2418). Oddly, all tests including test 1705 (which tests `mclapply` with data.table) passed fine on CRAN. It appears to be some versions of MacOS or some versions of libraries on MacOS, perhaps. Many thanks to Martin Morgan for reporting and confirming this fix works. Thanks also to @asenabouth, Joe Thorley and Danton Noriega for testing, debugging and confirming that automatic parallelism inside data.table (such as `fwrite`) works well even on these MacOS installations. See also news items below for 1.10.4-1 and 1.10.4-2.
Expand Down
5 changes: 4 additions & 1 deletion R/setkey.R
Original file line number Diff line number Diff line change
Expand Up @@ -354,8 +354,11 @@ CJ <- function(..., sorted = TRUE, unique = FALSE)
else if (length(l) > 1L && !emptyList) {
# using rep.int instead of rep speeds things up considerably (but attributes are dropped).
attribs = lapply(l, attributes) # remember attributes for resetting after rep.int
n = vapply(l, length, 0L)
n = vapply(l, length, 0L) #lengths(l) will work from R 3.2.0
nrow = prod(n)
if (nrow > .Machine$integer.max) {
stop("Cross product of elements provided to CJ() would result in ",nrow," rows which exceeds .Machine$integer.max == ",.Machine$integer.max)
}
x = c(rev(take(cumprod(rev(n)))), 1L)
for (i in seq_along(x)) {
y = l[[i]]
Expand Down
6 changes: 6 additions & 0 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -11750,6 +11750,12 @@ test(1881.3, fread(f), DT, warning=c('Discarded single-line footer: <<(10000 row
'Found and resolved improper quoting. First healed line 111: <<"a"b,def>>'))
unlink(f)

# CJ will should fail with proper error message, #2636
test(1882.1, .Machine$integer.max, 2147483647L) # same on all platforms and very unlikely to change in R (which is good)
test(1882.2, ceiling(.Machine$integer.max^(1/3)), 1291)
v = seq_len(1291L)
test(1882.3, CJ(v, v, v), error="Cross product of elements provided to CJ() would result in 2151685171 rows which exceeds .Machine$integer.max == 2147483647")


###################################
# Add new tests above this line #
Expand Down