Implemented anywhere() and %anywhere%, convenient fn for range join. closes #679.
1 parent 672e6fd, commit 80ccd2f
Showing 5 changed files with 50 additions and 11 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,32 +1,51 @@ | ||
\name{between} | ||
\alias{between} | ||
\alias{\%between\%} | ||
\title{ Convenience function for range subset logic. } | ||
\alias{anywhere} | ||
\alias{\%anywhere\%} | ||
\title{ Convenience functions for range subsets. } | ||
\description{ | ||
Intended for use in \code{i} in \code{[.data.table}. From \code{v1.9.8}, \code{between} is vectorised. | ||
Intended for use in \code{i} in \code{[.data.table}. | ||
|
||
\code{between} answers the question: Is \code{x[i]} in between \code{lower[i]} and \code{upper[i]}? \code{lower} and \code{upper} are recycled if their lengths are not equal to \code{length(x)}. This is equivalent to \code{x >= lower & x <= upper} when \code{incbounds=TRUE}, and to \code{x > lower & x < upper} when \code{FALSE}. | ||
|
||
\code{anywhere}, on the other hand, answers the question: Is \code{x[i]} in between \emph{any of the intervals} specified by \code{lower, upper}? There is no need for recycling here. A \code{non-equi} join is performed internally in this case to determine whether \code{x[i]} is in between \emph{any} of the intervals in \code{lower, upper}. | ||
} | ||
\usage{ | ||
between(x,lower,upper,incbounds=TRUE) | ||
x \%between\% y | ||
anywhere(x,lower,upper,incbounds=TRUE) | ||
x \%anywhere\% y | ||
} | ||
\arguments{ | ||
\item{x}{ Any orderable vector, i.e., those with relevant methods for \code{`<=`}, such as \code{numeric}, \code{character}, \code{Date}, ... } | ||
\item{lower}{ Lower range bound. Usually of length=\code{1} or \code{length(x)}.} | ||
\item{upper}{ Upper range bound. Usually of same length as \code{lower}.} | ||
\item{lower}{ Lower range bound. Must be of same length as \code{upper}. Recycled to \code{length(x)} in case of \code{between}.} | ||
\item{upper}{ Upper range bound. Must be of same length as \code{lower}. Recycled to \code{length(x)} in case of \code{between}.} | ||
\item{y}{ A length-2 \code{vector} or \code{list}, with \code{y[[1]]} interpreted as \code{lower} and \code{y[[2]]} as \code{upper}.} | ||
\item{incbounds}{ \code{TRUE} means inclusive bounds, i.e., [lower,upper]. \code{FALSE} means exclusive bounds, i.e., (lower,upper). } | ||
} | ||
% \details{ | ||
% } | ||
\details{ | ||
When \code{lower} and \code{upper} are length-1 vectors, \code{between} and \code{anywhere} return the same result. In that case, \code{anywhere} is likely to be faster since it uses a \emph{binary search} based \code{non-equi} join instead of the \code{vector scan} used by \code{between}. | ||
} | ||
\value{ | ||
Logical vector of the same length as \code{x} with value \code{TRUE} for those that lie within the specified range. | ||
} | ||
\note{ Current implementation does not make use of ordered keys. \code{incbounds} is set to \code{TRUE} for the infix notation \code{\%between\%}. } | ||
\seealso{ \code{\link{data.table}}, \code{\link{like}} } | ||
\examples{ | ||
DT = data.table(x=1:5, y=6:10, z=c(5:1)) | ||
DT[y \%between\% c(7,9)] | ||
X = data.table(a=1:5, b=6:10, c=c(5:1)) | ||
X[b \%between\% c(7,9)] | ||
X[between(b, 7, 9)] # same as above | ||
# NEW feature in v1.9.8, vectorised between | ||
DT[z \%between\% list(x,y)] | ||
X[c \%between\% list(a,b)] | ||
X[between(c, a, b)] # same as above | ||
X[between(c, a, b, incbounds=FALSE)] # open interval | ||
|
||
# anywhere() | ||
Y = data.table(a=c(8,3,10,7,-10), val=runif(5)) | ||
range = data.table(start = 1:5, end = 6:10) | ||
Y[a \%anywhere\% range] | ||
Y[anywhere(a, range$start, range$end)] # same as above | ||
Y[anywhere(a, range$start, range$end, incbounds=FALSE)] # open interval | ||
} | ||
\keyword{ data } |
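
The difference between the two functions documented in this diff can be sketched in base R. The helper names `between_sketch` and `anywhere_sketch` are hypothetical and exist only for illustration; they are not part of data.table. The second helper scans every interval linearly, so it illustrates the semantics only and is not the binary-search non-equi join that the help page says `anywhere` uses internally.

```r
# Base-R sketch of the semantics described in the help text.
# Both function names are hypothetical, not data.table API.

between_sketch <- function(x, lower, upper, incbounds = TRUE) {
  # Element-wise test: x[i] against lower[i] and upper[i].
  # lower and upper are recycled by R's usual vector rules.
  if (incbounds) x >= lower & x <= upper else x > lower & x < upper
}

anywhere_sketch <- function(x, lower, upper, incbounds = TRUE) {
  stopifnot(length(lower) == length(upper))
  # For each x[i], test against ALL intervals and report whether any matches.
  vapply(x, function(xi) {
    if (incbounds) any(xi >= lower & xi <= upper)
    else any(xi > lower & xi < upper)
  }, logical(1))
}

# Same values as the anywhere() example in the help page.
x     <- c(8, 3, 10, 7, -10)
start <- 1:5
end   <- 6:10

b      <- between_sketch(x, start, end)    # x[i] vs. interval i only
a      <- anywhere_sketch(x, start, end)   # x[i] vs. every interval
a_open <- anywhere_sketch(x, start, end, incbounds = FALSE)

print(b)
print(a)
print(a_open)
```

Here `8` fails the element-wise test because it lies outside its own interval `[1, 6]`, but it passes the any-interval test because it lies inside `[3, 8]`. With open bounds, `10` no longer matches, since it sits only on the endpoint of `[5, 10]`.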