Allow a single column to be used as rownames in as.matrix #2702
@@ -1881,17 +1881,58 @@ chmatch2 <- function(x, table, nomatch=NA_integer_) {
 # x
 #}

-as.matrix.data.table <- function(x,...)
-{
-  dm <- dim(x)
-  cn <- names(x)
+as.matrix.data.table <- function(x, rownames, ...) {
+  rn <- NULL
+  rnc <- NULL
+  if (!missing(rownames)) { # Convert rownames to a column index if possible
+    if (is.null(rownames)) {
+      warning("rownames is NULL, ignoring rownames")
+    } else if (length(rownames) != 1) {
+      stop("rownames must be a single column in x")
+    } else if (is.na(rownames)) {
+      warning("rownames is NA, ignoring rownames")
+    } else if (is.logical(rownames) && !isTRUE(rownames)) {
+      warning("rownames is FALSE, ignoring rownames")
+    } else if (!(is.logical(rownames) || is.character(rownames) || is.numeric(rownames))) {
+      # E.g. because rownames is some sort of object that cant be converted to a column index
+      stop("rownames must be TRUE, a column index, or a column name in x")
+    } else {
+      if (is.logical(rownames) && isTRUE(rownames)) {
You're right. I've changed this statement to
+        if (haskey(x)) {
+          rownames <- key(x)
+          if (length(rownames) > 1) {
+            warning("rownames is TRUE but multiple keys found in key(x), using first column instead")
+            rownames <- 1
+          }
+        } else {
+          rownames <- 1
+        }
+      }
+      if (is.character(rownames)) { # Handles cases where rownames is a column name, or key(x) from TRUE
+        rnc <- chmatch(rownames, names(x))
+        if (is.na(rnc)) stop(rownames, " is not a column of x")
+      } else { # rownames is an index already
+        if (rownames < 1 || rownames > ncol(x))
+          stop("rownames is ", rownames, " which is outside the column number range [1,ncol=", ncol(x), "]")
+        rnc <- rownames
+      }
+    }
+  }
+  if (!is.null(rnc)) { # If there are rownames, extract and drop that column
+    rn <- x[[rnc]]
+    dm <- dim(x) - c(0, 1)
+    cn <- names(x)[-rnc]
+    X <- x[, -rnc, with = FALSE]
I think
Thanks @HughParsonage - I like the
+  } else {
+    dm <- dim(x)
+    cn <- names(x)
+    X <- x
+  }
   if (any(dm == 0L))
-    return(array(NA, dim = dm, dimnames = list(NULL, cn)))
+    return(array(NA, dim = dm, dimnames = list(rn, cn)))
   p <- dm[2L]
   n <- dm[1L]
   collabs <- as.list(cn)
-  X <- x
   class(X) <- NULL
   non.numeric <- non.atomic <- FALSE
   all.logical <- TRUE

@@ -1936,7 +1977,7 @@ as.matrix.data.table <- function(x,...)
   }
   X <- unlist(X, recursive = FALSE, use.names = FALSE)
   dim(X) <- c(n, length(X)/n)
-  dimnames(X) <- list(NULL, unlist(collabs, use.names = FALSE))
+  dimnames(X) <- list(rn, unlist(collabs, use.names = FALSE))
   X
 }
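The argument handling in the hunk above can be read as a small decision procedure. The following is a minimal sketch of that procedure in base R, written for illustration only: `resolve_rownames_col` and its `key_cols` parameter are hypothetical names that do not appear in the PR, `match` stands in for `chmatch`, and the key is passed in explicitly instead of being read with `key(x)`.

```r
# Hypothetical helper (not part of data.table): maps a `rownames` argument to a
# column index, following the same branch order as the diff above.
# Returns NULL when the argument is ignored.
resolve_rownames_col <- function(rownames, col_names, key_cols = NULL) {
  if (is.null(rownames)) {
    warning("rownames is NULL, ignoring rownames")
    return(NULL)
  }
  if (length(rownames) != 1L) stop("rownames must be a single column in x")
  if (is.na(rownames)) {
    warning("rownames is NA, ignoring rownames")
    return(NULL)
  }
  if (is.logical(rownames) && !isTRUE(rownames)) {
    warning("rownames is FALSE, ignoring rownames")
    return(NULL)
  }
  if (!(is.logical(rownames) || is.character(rownames) || is.numeric(rownames)))
    stop("rownames must be TRUE, a column index, or a column name in x")

  if (isTRUE(rownames)) {
    # TRUE: use a single-column key if there is one, else the first column
    if (length(key_cols) == 1L) {
      rownames <- key_cols
    } else {
      if (length(key_cols) > 1L)
        warning("rownames is TRUE but multiple keys found, using first column instead")
      rownames <- 1L
    }
  }
  if (is.character(rownames)) {
    idx <- match(rownames, col_names)  # base-R stand-in for chmatch()
    if (is.na(idx)) stop(rownames, " is not a column of x")
    return(idx)
  }
  if (rownames < 1 || rownames > length(col_names))
    stop("rownames is ", rownames, " which is outside the column number range")
  as.integer(rownames)
}

cols <- c("A", "X", "Y")
resolve_rownames_col("X", cols)                  # column name
resolve_rownames_col(3, cols)                    # column index
resolve_rownames_col(TRUE, cols, key_cols = "Y") # single-column key
resolve_rownames_col(TRUE, cols)                 # no key: first column
```

Returning `NULL` for the ignored cases plays the role of leaving `rnc` as `NULL` in the diff, so the caller can fall through to the unchanged conversion path.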
@@ -0,0 +1,62 @@
+\name{as.matrix}
+\alias{as.matrix}
+\alias{as.matrix.data.table}
+\title{Convert a data.table to a matrix}
+\description{
+Converts a \code{data.table} into a \code{matrix}, optionally using one
+of the columns in the \code{data.table} as the \code{matrix} \code{rownames}.
+}
+\usage{
+\method{as.matrix}{data.table}(x, rownames, ...)}
+
+\arguments{
+\item{x}{a \code{data.table}}
+\item{rownames}{optional, a single column name or column index to use as
+the \code{rownames} in the returned \code{matrix}. If \code{TRUE} the
+\code{\link{key}} of the \code{data.table} will be used if it is a
+single column, otherwise the first column in the \code{data.table} will
+be used.}
+\item{\dots}{additional arguments to be passed to or from methods.}
+}
+
+\details{
+\code{\link{as.matrix}} is a generic function in base R. It dispatches to
+\code{as.matrix.data.table} if its \code{x} argument is a \code{data.table}.
+
+The method for \code{data.table}s will return a character matrix if there
+are only atomic columns and any non-(numeric/logical/complex) column,
+applying \code{\link{as.vector}} to factors and \code{\link{format}} to other
+non-character columns. Otherwise, the usual coercion hierarchy (logical <
+integer < double < complex) will be used, e.g., all-logical data frames
+will be coerced to a logical matrix, mixed logical-integer will give an
+integer matrix, etc.
+
+An additional argument \code{rownames} is provided for \code{as.matrix.data.table}
+to facilitate conversions to matrices where the \code{\link{rownames}} are stored
+in a single column of \code{x}, e.g. the first column after using
+\code{\link{dcast.data.table}}.
+}
+
+\value{
+A new \code{matrix} containing the contents of \code{x}.
+}
+
+\seealso{
+\code{\link{data.table}}, \code{\link{as.matrix}}, \code{\link{data.matrix}},
+\code{\link{array}}
+}
+
+\examples{
+(dt1 <- data.table(A = letters[1:10], X = 1:10, Y = 11:20))
+as.matrix(dt1) # character matrix
+as.matrix(dt1, rownames = "A")
+as.matrix(dt1, rownames = 1)
+as.matrix(dt1, rownames = TRUE)
+
+(dt1 <- data.table(A = letters[1:10], X = 1:10, Y = 11:20))
+setkey(dt1, A)
+as.matrix(dt1, rownames = TRUE)
+}
+
+\keyword{ array }
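The coercion rules quoted in the `\details` section follow base R's `as.matrix` for data frames. As a rough illustration, the sketch below uses plain `data.frame`s as stand-ins; that the `data.table` method gives the same result types for these inputs is an assumption taken from the documentation text, not something checked against the PR.

```r
# Base-R stand-ins: the documentation text says the data.table method
# follows the same coercion hierarchy (assumption, see lead-in).
all_lgl <- data.frame(a = c(TRUE, FALSE), b = c(FALSE, TRUE))
lgl_int <- data.frame(a = c(TRUE, FALSE), b = 1:2)
int_dbl <- data.frame(a = 1:2, b = c(0.5, 1.5))
has_chr <- data.frame(a = c("x", "y"), b = 1:2, stringsAsFactors = FALSE)

typeof(as.matrix(all_lgl))  # "logical"
typeof(as.matrix(lgl_int))  # "integer"
typeof(as.matrix(int_dbl))  # "double"
typeof(as.matrix(has_chr))  # "character": one non-numeric column converts everything
```

The last case is the one the `rownames` argument is meant to help with: a single label column otherwise forces the whole matrix to character.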
not sure we have a style guide on this, but I note that the corresponding CRAN cheat for `[.data.table` symbols are defined in the package environment rather than the function body: https://github.com/Rdatatable/data.table/blob/master/R/data.table.R#L11

they are defined there because they are exported. `rn` won't be used by user.

Yeap - `rn` is internal here, it will contain the vector of rownames to put in the matrix (after all the processing in `if (!missing(rownames)) {}`). `rnc` will contain the index of the column in `x` to be dropped.
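To make the roles of `rn` and `rnc` concrete, here is a minimal base-R sketch of the extract-and-drop step. It uses a plain `data.frame` instead of a `data.table` (an assumption made so the snippet runs without the package), and the data values are made up for illustration.

```r
# Illustrative data only; a data.frame stands in for the data.table `x`.
x <- data.frame(id = c("r1", "r2", "r3"),
                v1 = 1:3,
                v2 = c(0.5, 1.5, 2.5),
                stringsAsFactors = FALSE)

rnc <- 1L          # index of the column holding the row names
rn  <- x[[rnc]]    # the vector of row names to attach to the matrix

# Drop the rownames column before converting, then attach rn as rownames.
m <- as.matrix(x[, -rnc, drop = FALSE])
rownames(m) <- rn

m
typeof(m)             # numeric storage is kept because `id` was dropped first
typeof(as.matrix(x))  # converting with `id` included gives a character matrix
```

Dropping the column before conversion is what lets the remaining columns keep their numeric type instead of being formatted as strings.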