malformed factor resulting from 'by' expression when using melt.data.table in 'j' expression #2199
Comments
In my opinion, combining vectors with inconsistent attributes is a bad idea, and it would be reasonable to leave the responsibility for handling that with the user. Some other examples:
On the other hand, rbind/rbindlist apparently has special handling for factors: |
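For illustration, a minimal sketch of the factor handling in rbindlist mentioned above. The table and column names (a, b, f) are made up for this example and do not come from the thread.

```r
library(data.table)

# Two single-column tables whose factor columns have different level sets
a <- data.table(f = factor(c("a", "b")))
b <- data.table(f = factor(c("b", "c")))

# rbindlist merges the level sets instead of reusing the codes of the first table
combined <- rbindlist(list(a, b))

levels(combined$f)        # levels from both inputs
as.character(combined$f)  # labels still match the inputs
```

This is the behavior one might expect the grouped j expression to have as well.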
I somewhat agree about the responsibility (I usually avoid factors altogether), but for an ordinary R user I think this is unexpected behavior in two ways: a) the error message about corrupted factors comes only when printing the data.table. Only after I tried |
So this crashes my R/RStudio session
I was about to post the same issue. In addition to the arguments above, I think the issue should be addressed because it can lead to silent errors in data analysis, with no error or warning message of any kind. I found this behaviour because it gave me strange results in a study. As factor variables are one of the fundamental data types in R and one of the first things new R users encounter, many people using them won't even know that such things as "attributes" exist. Therefore they can't be held responsible for problems caused by them. As for the other poster, this issue can also crash R with a fatal error on my computer.
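Until this is fixed, a sanity check along the following lines might catch part of the problem. It is only a sketch: factor_codes_ok is a hypothetical helper, not part of data.table or base R, and the bad vector is hand-built to mimic a malformed factor. It only detects integer codes that fall outside the level set; it cannot detect the silent case where the codes are in range but point at the wrong labels.

```r
# Hypothetical helper: TRUE if every non-NA integer code of a factor
# refers to an existing level
factor_codes_ok <- function(x) {
  stopifnot(is.factor(x))
  codes <- as.integer(x)  # underlying integer codes, attributes dropped
  all(is.na(codes) | (codes >= 1L & codes <= nlevels(x)))
}

# Hand-built malformed factor: code 3 has no matching level
bad  <- structure(c(1L, 2L, 1L, 2L, 3L), levels = c("a", "b"), class = "factor")
good <- factor(c("a", "b", "a"))

factor_codes_ok(bad)
factor_codes_ok(good)
```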
Further example to Frank's (showed it isn't due to
> DT = data.table(A=1:2)
> g = function(x) { if (x==1L) factor(c("a","b")) else factor(c("b","c")) }
> ans = DT[,g(.GRP),by=A]
> ans
       A     V1
   <int> <fctr>
1:     1      a
2:     1      b
3:     2      a   # wrong silently
4:     2      b   # wrong silently
> g = function(x) { if (x==1L) factor(c("a","b")) else factor(c("a","b","c")) }
> ans = DT[,g(.GRP),by=A]
> ans
Error in as.character.factor(x) : malformed factor
> unclass(ans$V1)
[1] 1 2 1 2 3
attr(,"levels")
[1] "a" "b"
> |
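One way to sidestep the problem in the session above, sketched under the assumption that the grouped result may be built as character first: return character from j and convert to factor once, after grouping. This is a workaround on the user side, not a description of how data.table handles or fixes the issue.

```r
library(data.table)

DT <- data.table(A = 1:2)
g <- function(x) if (x == 1L) factor(c("a", "b")) else factor(c("b", "c"))

# Return character from j so no per-group level sets have to be combined
ans <- DT[, .(V1 = as.character(g(.GRP))), by = A]

# Convert to factor once, over the whole column
ans[, V1 := factor(V1)]

as.character(ans$V1)
levels(ans$V1)
```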
I was facing similar problem in |
I get an unprintable data.table object whose printing results in an error
but which in some cases (large dataset) also crashes R.
My code looks like this
In my case I forgot to put the usual variable.factor = FALSE into the melt.data.table() call, which makes it work OK. But this behavior surprises me. It only appears when the two sets of factors differ (for date=1 and date=2 the ids are different sets), so if I skip the line it works alright.
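A minimal sketch of the variable.factor = FALSE workaround in a grouped melt. The original code is not shown above, so the table and its columns (date, id, x, y) are invented for illustration and are not the reporter's data.

```r
library(data.table)

# Invented example data: two dates with different sets of ids
DT <- data.table(
  date = c(1, 1, 2, 2),
  id   = c(1L, 2L, 3L, 4L),
  x    = c(10, 20, 30, 40),
  y    = c(50, 60, 70, 80)
)

# With variable.factor = FALSE the 'variable' column is character,
# so no factor levels have to be reconciled across groups
long <- DT[, melt(.SD, id.vars = "id", variable.factor = FALSE), by = date]

class(long$variable)
```

If a factor is needed afterwards, it can be created once on the combined result, for example with long[, variable := factor(variable)].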
I am on the latest stable version of data.table.
My sessionInfo()