dcast on missing int64 value generates 9218868437227407266 instead of NA #4561

Closed
emallickhossain opened this issue Jun 19, 2020 · 4 comments · Fixed by #4586
@emallickhossain
Contributor

When I dcast a data.table that contains an int64 column, it fills in 9218868437227407266 for any missing values instead of the usual NA.

This is related to issues #488, #1385, and #3723, but none of those fixes prevents the problem when using dcast(). It has been asked on StackOverflow twice: once with no MWE and no response, and once where the answer simply proposed a workaround; it should have been filed as an issue.

# Minimal reproducible example

library(data.table)
apple <- fread("id, time, y
               a, 1, 12345678901234
               b, 1, 70
               b, 2, 20")
dcast(apple, id ~ time, value.var = "y")
#    id              1                   2
# 1:  a 12345678901234 9218868437227407266
# 2:  b             70                  20
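
The odd number in the output above is not random. A minimal sketch of one likely explanation, assuming the default fill is a double NA whose bytes end up read as a 64-bit integer (this is an inference from the value itself, not something confirmed in this thread): bit64's integer64 keeps its values in the 8 bytes of a double, and the bytes of NA_real_ read as a signed 64-bit integer are exactly 9218868437227407266.

```r
# Bytes of R's double NA, most significant byte first.
na_bytes <- writeBin(NA_real_, raw(), endian = "big")
na_hex <- paste(as.character(na_bytes), collapse = "")
print(na_hex)
# "7ff00000000007a2"
# 0x7ff0000000000000 = 9218868437227405312, plus 0x7a2 = 1954,
# gives 9218868437227407266, the value seen in the dcast() output.

# Optional check with bit64 (only runs if the package is installed):
# tagging the double NA as integer64 reinterprets the same 8 bytes.
if (requireNamespace("bit64", quietly = TRUE)) {
  reinterpreted <- structure(NA_real_, class = "integer64")
  print(as.character(reinterpreted))
}
```

By contrast, bit64::as.integer64(NA) uses a different bit pattern for its missing value, so a double NA placed into an integer64 column is not recognised as missing.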

# Output of sessionInfo()

R version 4.0.1 (2020-06-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] data.table_1.12.9 RPostgres_1.2.0

loaded via a namespace (and not attached):
[1] compiler_4.0.1 bit_1.1-15.2 hms_0.5.3 tools_4.0.1 DBI_1.1.0
[6] Rcpp_1.0.4.6 bit64_0.9-7 vctrs_0.3.1 blob_1.2.1 pkgconfig_2.0.3
[11] rlang_0.4.6

@jangorecki jangorecki added the reshape dcast melt label Jun 19, 2020
@MichaelChirico
Member

MichaelChirico commented Jun 30, 2020

Hi @emallickhossain :)

Working on a fix.

As a workaround, you can explicitly provide NA-int64 as fill [just noticed this is included in the referenced SO Q&A as well]:

dcast(apple, id ~ time, value.var = "y", fill = bit64::as.integer64(NA))
#        id              1     2
# 1:      a 12345678901234  <NA>
# 2:      b             70    20

@jangorecki
Member

This can be more nicely resolved after #4491

@MichaelChirico
Member

Indeed I was starting to walk down a path of recreating coerceAs incidentally. Following up on that PR.

@emallickhossain
Contributor Author

Thanks! My workaround was just coercing to a numeric before reshaping.
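
For anyone landing here, a rough sketch of that coerce-first workaround, assuming the same toy data as in the report (the table is rebuilt by hand here rather than with fread, and the name wide is only for illustration):

```r
library(data.table)
library(bit64)

# Same data as the report, with y built explicitly as integer64.
apple <- data.table(
  id   = c("a", "b", "b"),
  time = c(1L, 1L, 2L),
  y    = as.integer64(c("12345678901234", "70", "20"))
)

# Convert the integer64 column to double before reshaping, so that the
# default fill (a double NA) matches the column type.
apple[, y := as.numeric(y)]

wide <- dcast(apple, id ~ time, value.var = "y")
print(wide)
```

Note that a double only represents integers exactly up to 2^53, so this is safe for values like the ones above but can silently lose precision for larger IDs; in that case passing fill = bit64::as.integer64(NA) is the safer route.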
