Submitted by: James Sams; Assigned to: Nobody; R-Forge link
TL;DR: `dim(unique(..., by=c(A, B)))` reports MORE rows than `dim(unique(..., by=c(A, B, C)))`. Affects `duplicated()` and `merge()`. I see this in 1.8.11, not 1.8.10.
I discovered this when a merge that previously worked started failing because it believed itself to be a cartesian join, so the affected code is used by `merge()` as well. However, I think the problem is clearer with `unique()`. I have a `data.table` with 3 columns (double, integer, integer). The double column, when read by `fread`, is `integer64`. However, I've found `integer64` to be unreliable, so I stick to double/numeric. The values are up to 12 digits, all positive, and always integral. I've also reproduced the problem after coercing the other columns to double, and when reading with `read.delim` and coercing to `data.table`.
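The invariant being violated can be sketched as follows. This is only a minimal illustration on made-up data: the column names `A`, `B`, `C` and the values are hypothetical stand-ins, not the data from this report, and it assumes a data.table version whose `unique()` accepts a `by` argument. On correct behavior the final check passes; the report above is that on the real data the first count comes out larger than the second.

```r
library(data.table)

# Hypothetical stand-in data: A is a large integral double (12 digits),
# B and C are integers, mirroring the column types described above.
DT <- data.table(
  A = c(100000000001, 100000000001, 100000000002, 100000000002),
  B = c(1L, 1L, 2L, 2L),
  C = c(1L, 2L, 1L, 1L)
)

# Count distinct rows when grouping by two columns, then by three.
n_ab  <- nrow(unique(DT, by = c("A", "B")))
n_abc <- nrow(unique(DT, by = c("A", "B", "C")))

# Adding a column to `by` can only split groups, never merge them,
# so the two-column count must not exceed the three-column count.
stopifnot(n_ab <= n_abc)
```

Running the same two `unique()` calls on the affected data and comparing the row counts is enough to show the problem without involving `merge()`.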
THIS is where things go wrong. Notice that adding the rows:
There are no NA's or similar in the data: