as.matrix.data.table incorrect results when nrows == 1 #2930
Comments
Doesn't seem to be a bug. (Did you mean |
@HughParsonage - I've clarified the original issue. I think it IS a bug. I don't think I have privs to re-open per https://stackoverflow.com/questions/21333654/how-to-re-open-an-issue-in-github - can you re-open for me if you agree? |
Thanks @malcook, good edit. I think this is a documentation issue: essentially |
I think it is NOT a documentation issue. Try passing rownames='name' and we don't get the documented behavior. IMO, the mode and dimension of In fact... I'm guessing that under the hood (in the data.table source code) it will prove to be exactly this underlying issue - and the fix will be to add drop=FALSE to some call to |
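Whether the data.table source actually lacks a `drop = FALSE` is only the poster's guess and is not checked here. The base R behaviour the guess refers to can be sketched on a plain matrix (the values and the second row are made up for illustration):

```r
# Illustrative matrix; only "bob", 10 and 130 come from the thread.
m <- matrix(c(10, 12, 130, 125), nrow = 2,
            dimnames = list(c("bob", "alice"), c("age", "IQ")))

# Default subsetting drops the row dimension: the result is a named vector.
dropped <- m[1, ]
is.null(dim(dropped))

# drop = FALSE keeps a one-row matrix, including its row name.
kept <- m[1, , drop = FALSE]
dim(kept)
rownames(kept)
```

With `drop = FALSE` the one-row case keeps the same mode and number of dimensions as the multi-row case, which is the property being asked for in this issue.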
It's definitely an issue.
as.matrix(d[1, ], "name")
#> name age IQ
#> name "bob" "10" "130"
but this is a documented result:
Since
data.frame(x = 1, y = 7, row.names = 2)
#> x y
#> 2 1 7
data.frame(x = 1:2, y = 7:8, row.names = 2)
#> x
#> 7 1
#> 8 2
Unfortunately, this means that forcing the column of a one-row data.frame to a row name is not possible, but that is the behaviour. I think even if we agree that this is documented behaviour, the documentation should be changed to emphasize that if |
Hmm. I think you're stretching. Try setkey(d, 'name') and then read the documentation again, and then guess what the result of passing rownames=TRUE to as.matrix will be. Also, decide what you wish/expect/hope the result of passing rownames=TRUE should be before trying it. Now try it. Surprised? I claim that if you are correct and this is a bug in the documentation, and you "fix" the documentation to reflect current behavior, then the documentation will only more clearly reflect a bug in the design. But I don't think there is a bug in the design, except perhaps for trying to overload rownames with multiple, possibly conflicting, interpretations. I hope the project finds a way to provide both behaviors and resolve the (presumably unintended) ambiguity. |
feel free to file a PR that solves the issue as you see it :) |
My suggestion would be to change the design in such a way that the mode and dimension of the result of as.matrix should never depend on the dimensionality of d, which it does now as implemented in this (arguably) edge case. So, any change would have to be a (possibly) breaking change. I'd be surprised if anyone discovered this and chose to depend upon it, but YMMV. I'm not sure who vets/approves pull requests. Is that you @MichaelChirico ? Anyway, if that person(s) agrees that this is a poor design, and should be fixed, then perhaps there is a chance that a pull request from me that fixes the design/implementation/code would be welcome. But I would understand that a reasonable position might be that this is not a bad design, or the badness is insufficient to merit a possible breaking change. Please advise, oh data.table gods, and I will try and respond accordingly. Thanks |
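The design goal stated above, a result whose mode and dimensions do not depend on the number of rows, can be sketched in base R on a plain data.frame. The helper name `as_matrix_by_column` is hypothetical and is not part of data.table; it always reads its second argument as a column name, so a one-row input is handled the same way as a multi-row one.

```r
# Hypothetical helper, not part of data.table: `col` is always treated as a
# column name, regardless of how many rows `df` has.
as_matrix_by_column <- function(df, col) {
  stopifnot(is.character(col), length(col) == 1L, col %in% names(df))
  rn <- as.character(df[[col]])
  # drop = FALSE keeps a data.frame even if only one column remains.
  m <- as.matrix(df[, setdiff(names(df), col), drop = FALSE])
  rownames(m) <- rn
  m
}

# Illustrative data; the second row is made up.
df <- data.frame(name = c("bob", "alice"), age = c(10, 12), IQ = c(130, 125),
                 stringsAsFactors = FALSE)

m_all <- as_matrix_by_column(df, "name")        # 2 x 2, numeric
m_one <- as_matrix_by_column(df[1, ], "name")   # 1 x 2, numeric
str(m_all)
str(m_one)
```

The trade-off is that the helper gives up the other reading of the argument (a vector of row names of length nrow), which is the overloading discussed earlier in the thread.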
I am guessing it is |
Yes – it is
|
and what if we want to apply same principle to |
There appears to be no data.table:::as.array.data.table so not much of an issue |
so maybe there should be... |
I don't really understand where you are leading, but feel it is not toward resolving the issue I've raised. Thanks for your interest, and please carry on if there is a relationship I am missing. |
If the result should not depend on dimensionality, then we should focus on an array method instead of the matrix method, which by name limits dimensionality. |
I'm going to bow out of this exchange I've started. Thanks all for hearing me out. If you want to "fix" the documentation so others do not fall down the same rabbit hole as I did, that is one way of resolving it. As always, thanks for data.table. I use it every day! |
Thanks for raising this @malcook , and apologies for not thinking of that edge case! |
Reopening because PR #2939 by @sritchie73 (thanks) fixes this in agreement with @malcook, about to be merged. |
In further agreement with @malcook, for even further clarity, safety and consistency, I've suggested a minor change to @sritchie73's PR here: #2939 (comment). I'll make the change to the PR unless there are any objections. |
# Minimal reproducible example
Created on 2018-06-13 by the reprex package (v0.2.0).
The problem is that the rownames argument to as.matrix.data.table is not being respected as documented:
We should expect the 2nd call above to yield a 1x2 matrix of numeric having the single rowname of "bob". Instead, it yields a 1x3 character matrix.
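The reprex code itself did not survive on this page. A minimal sketch of the kind of calls being described, assuming a table `d` with columns `name`, `age` and `IQ` (the column names and the "bob" row come from output quoted in the comments; the second row is invented for illustration):

```r
library(data.table)

# Hypothetical data: only "bob", 10 and 130 appear in the thread.
d <- data.table(name = c("bob", "alice"), age = c(10, 12), IQ = c(130, 125))

# Multi-row call: the "name" column becomes the row names and the
# remaining columns form a numeric matrix.
m_all <- as.matrix(d, rownames = "name")
str(m_all)

# One-row call, the case this issue is about. The report expects a 1x2
# numeric matrix with the row name "bob"; the actual result depends on
# the installed data.table version.
m_one <- as.matrix(d[1, ], rownames = "name")
str(m_one)
```

Comparing the two `str()` outputs shows whether the one-row result keeps the mode and column count of the multi-row result.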