kde1d 1.1.0 (#59)
* update actions script

* no CXX restriction

* remove codecov badge

* update main

* Updated handling of discrete variables (#58)

* update cpp

* adapt interface

* better plots

* update cpp

* little fixes

* improve docs

* standardize bandwidth

* @importFrom graphics points

* check for boundary violations in cpp

* fix zi-case where prob0 = 1

* update cpp backend

* adjust to new C++ types

* fix bitwise comparisons

* reorder params in fit_kde1d_cpp() docs

---------

Co-authored-by: tnagler <thomas.nagler@tum.de>

* bump version

* update website

* make CRAN happy

* more CRAN happyness

---------

Co-authored-by: tnagler <thomas.nagler@tum.de>
tnagler and tnagler authored Jan 8, 2025
1 parent 7e7bb7f commit dd4b533
Showing 83 changed files with 23,022 additions and 1,357 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/R-CMD-check.yaml
@@ -30,7 +30,7 @@ jobs:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3

- uses: r-lib/actions/setup-r@v2
with:
16 changes: 8 additions & 8 deletions DESCRIPTION
@@ -1,17 +1,17 @@
Package: kde1d
Type: Package
Title: Univariate Kernel Density Estimation
Version: 1.0.6
Version: 1.1.0
Authors@R: c(
person("Thomas", "Nagler",, "mail@tnagler.com", role = c("aut", "cre")),
person("Thibault", "Vatter",, "thibault.vatter@gmail.com", role = c("aut"))
)
Description: Provides an efficient implementation of univariate local polynomial
kernel density estimators that can handle bounded and discrete data. See
Geenens (2014) <arXiv:1303.4121>,
Geenens and Wang (2018) <arXiv:1602.04862>,
Nagler (2018a) <arXiv:1704.07457>,
Nagler (2018b) <arXiv:1705.05431>.
Geenens (2014) <doi:10.48550/arXiv.1303.4121>,
Geenens and Wang (2018) <doi:10.48550/arXiv.1602.04862>,
Nagler (2018a) <doi:10.48550/arXiv.1704.07457>,
Nagler (2018b) <doi:10.48550/arXiv.1705.05431>.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
@@ -25,9 +25,9 @@ Imports:
randtoolbox,
stats,
utils
RoxygenNote: 7.1.2
Suggests:
testthat
URL: https://github.com/tnagler/kde1d
BugReports: https://github.com/tnagler/kde1d/issues
URL: https://tnagler.github.io/kde1d/
BugReports: https://github.com/tnagler/kde1d/issues/
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
23 changes: 2 additions & 21 deletions LICENSE
@@ -1,21 +1,2 @@
MIT License

Copyright (c) 2024 kde1d

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
YEAR: 2025
COPYRIGHT HOLDER: Thomas Nagler, Thibault Vatter
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -3,6 +3,7 @@
S3method(lines,kde1d)
S3method(logLik,kde1d)
S3method(plot,kde1d)
S3method(points,kde1d)
S3method(print,kde1d)
S3method(summary,kde1d)
export(dkde1d)
@@ -14,6 +15,7 @@ export(rkde1d)
importFrom(Rcpp,sourceCpp)
importFrom(graphics,lines)
importFrom(graphics,plot)
importFrom(graphics,points)
importFrom(randtoolbox,sobol)
importFrom(stats,logLik)
importFrom(stats,na.omit)
18 changes: 17 additions & 1 deletion NEWS.md
@@ -1,4 +1,20 @@
# kde1d 1.0.6
# kde1d 1.1.0

NEW FEATURES

* Added functionality for estimating zero-inflated discrete-continuous mixtures.

* New `kde1d(..., type = "...")` argument to specify the data type. Options are
{c, cont, continuous} for continuous variables, {d, disc, discrete} for
discrete integer variables, or {zi, zinfl, zero-inflated} for zero-inflated
variables.

BREAKING CHANGE

* New C++ API, making it easier to use stand-alone.
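
To make the zero-inflated case in the NEWS entry concrete: such a variable has a point mass at zero plus a continuous part away from zero. The sketch below illustrates only that decomposition, in base R. It is not the estimator used by kde1d (which fits local polynomials in its C++ backend); the helper name `zi_density_sketch` is hypothetical, and returning the point mass at `z == 0` is a convention assumed here to match the plotting example in this commit.

```r
# Conceptual sketch of a zero-inflated density estimate using base R only.
# NOT the kde1d algorithm; `zi_density_sketch` is a hypothetical helper.
zi_density_sketch <- function(x) {
  p0 <- mean(x == 0)                      # estimated point mass at zero
  pos <- x[x != 0]                        # continuous part of the sample
  kde <- stats::density(pos, from = 0)    # plain Gaussian KDE on the nonzero values
  # interpolate the KDE on its grid; zero outside the grid
  cont <- stats::approxfun(kde$x, kde$y, yleft = 0, yright = 0)
  list(
    prob0 = p0,
    # point mass at zero, continuous density scaled by (1 - p0) elsewhere
    density = function(z) ifelse(z == 0, p0, (1 - p0) * cont(z))
  )
}

set.seed(1)
x <- rexp(500, 0.5)             # continuous part
x[sample(1:500, 200)] <- 0      # set 200 of 500 observations to exactly zero
fit <- zi_density_sketch(x)
fit$prob0                       # share of exact zeros in the sample
fit$density(c(0, 1, 2))         # mass at zero, then scaled density values
```

With the package itself, the corresponding call shown in this commit's documentation is `kde1d(x, xmin = 0, type = "zi")`.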


# kde1d 1.0.7

DEPENDS

13 changes: 9 additions & 4 deletions R/RcppExports.R
@@ -5,18 +5,23 @@
#' freedom.
#' @param x vector of observations; categorical data must be converted to
#' non-negative integers.
#' @param nlevels the number of factor levels; 0 for continuous data.
#' @param bandwidth the bandwidth parameter.
#' @param xmin lower bound for the support of the density, `NaN` means no
#' boundary.
#' @param xmax upper bound for the support of the density, `NaN` means no
#' boundary.
#' @param type variable type; must be one of {c, cont, continuous} for
#' continuous variables, one of {d, disc, discrete} for discrete integer
#' variables, or one of {zi, zinfl, zero-inflated} for zero-inflated
#' variables.
#' @param bandwidth the bandwidth parameter.
#' @param mult positive bandwidth multiplier; the actual bandwidth used is
#' bw*mult.
#' @param degree order of the local polynomial.
#' @return `An Rcpp::List` containing the fitted density values on a grid and
#' additional information.
#' @noRd
fit_kde1d_cpp <- function(x, nlevels, bandwidth, mult, xmin, xmax, degree, weights) {
.Call('_kde1d_fit_kde1d_cpp', PACKAGE = 'kde1d', x, nlevels, bandwidth, mult, xmin, xmax, degree, weights)
fit_kde1d_cpp <- function(x, xmin, xmax, type, mult, bandwidth, degree, weights) {
.Call('_kde1d_fit_kde1d_cpp', PACKAGE = 'kde1d', x, xmin, xmax, type, mult, bandwidth, degree, weights)
}

#' computes the pdf of a kernel density estimate by interpolation.
113 changes: 70 additions & 43 deletions R/kde1d-methods.R
@@ -32,25 +32,15 @@
#' @export
dkde1d <- function(x, obj) {
x <- prep_eval_arg(x, obj)
sel <- is_below_support(x, obj) | is_above_support(x, obj)
x[sel] <- NA
d <- dkde1d_cpp(x, obj)
d[sel] <- 0
d
dkde1d_cpp(x, obj)
}

#' @param q vector of quantiles.
#' @rdname dkde1d
#' @export
pkde1d <- function(q, obj) {
q <- prep_eval_arg(q, obj)
below <- is_below_support(q, obj)
above <- is_above_support(q, obj)
q[below | above] <- NA
p <- pkde1d_cpp(q, obj)
p[below] <- 0
p[above] <- 1
p
pkde1d_cpp(q, obj)
}
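
The lines dropped from `dkde1d()` and `pkde1d()` above handled evaluation points outside the support on the R side; going by the commit message ("check for boundary violations in cpp"), that check presumably now lives in the C++ backend. As a rough sketch of what the removed CDF logic amounted to, assuming `is_below_support()` and `is_above_support()` simply compare against `xmin` and `xmax` (their bodies are not shown in this diff), and using the hypothetical names `clamp_cdf` and `raw_cdf`:

```r
# Sketch of the support handling that the removed R lines performed.
# `clamp_cdf` and `raw_cdf` are hypothetical stand-ins, not kde1d functions.
clamp_cdf <- function(q, raw_cdf, xmin = NaN, xmax = NaN) {
  below <- !is.nan(xmin) & q < xmin   # a NaN bound means "no boundary"
  above <- !is.nan(xmax) & q > xmax
  inside <- !(below | above)
  p <- rep(NA_real_, length(q))
  p[inside] <- raw_cdf(q[inside])     # evaluate only inside the support
  p[below] <- 0                       # no mass below the lower bound
  p[above] <- 1                       # all mass lies below points above the upper bound
  p
}

# Example with a uniform CDF on [0, 1] as the stand-in estimate
clamp_cdf(c(-1, 0.25, 2), raw_cdf = punif, xmin = 0, xmax = 1)
```

The density version is analogous, with 0 assigned on both sides of the support.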

#' @param p vector of probabilities.
@@ -115,26 +105,26 @@ rkde1d <- function(n, obj, quasi = FALSE) {
#' dpois(0:20, 3),
#' col = "red"
#' )
#'
#' ## zero-inflated data
#' x <- rexp(500, 0.5) # simulate data
#' x[sample(1:500, 200)] <- 0 # add zero-inflation
#' fit <- kde1d(x, xmin = 0, type = "zi") # estimate density
#' plot(fit) # plot the density estimate
#' lines( # add true density
#' seq(0, 20, l = 100),
#' 0.6 * dexp(seq(0, 20, l = 100), 0.5),
#' col = "red"
#' )
#' points(0, 0.4, col = "red")
#'
#' @importFrom graphics plot
#' @importFrom utils modifyList
#' @export
plot.kde1d <- function(x, ...) {
plot_type <- "l" # for continuous variables, use a line plot
if (is.ordered(x$x)) {
ev <- ordered(levels(x$x), levels(x$x))
plot_type <- "h" # for discrete variables, use a histrogram
} else {
# adjust grid if necessary
ev <- seq(min(x$grid_points), max(x$grid_points), l = 200)
if (!is.nan(x$xmin)) {
ev[1] <- x$xmin
}
if (!is.nan(x$xmax)) {
ev[length(ev)] <- x$xmax
}
}
ev <- make_plotting_grid(x)
vals <- dkde1d(ev, x)

plot_type <- ifelse(x$type == "discrete", "p", "l")
pars <- list(
x = ev,
y = vals,
@@ -143,8 +133,11 @@ plot.kde1d <- function(x, ...) {
ylab = "density",
ylim = c(0, 1.1 * max(x$values))
)

do.call(plot, modifyList(pars, list(...)))

if (x$type == "zero-inflated") {
points(0, dkde1d(0, x))
}
}

#' @method lines kde1d
@@ -154,20 +147,51 @@ plot.kde1d <- function(x, ...) {
#' @importFrom utils modifyList
#' @export
lines.kde1d <- function(x, ...) {
if (is.ordered(x$x)) {
stop("lines does not work for discrete estimates.")
}
ev <- seq(min(x$grid_points), max(x$grid_points), l = 200)
if (!is.nan(x$xmin)) {
ev[1] <- x$xmin
}
if (!is.nan(x$xmax)) {
ev[length(ev)] <- x$xmax
if (x$type == "discrete") {
points(x, ...)
}
ev <- make_plotting_grid(x)
vals <- dkde1d(ev, x)

pars <- list(x = ev, y = vals)
do.call(lines, modifyList(pars, list(...)))

if (x$type == "zero-inflated") {
points(0, dkde1d(0, x))
}
}

#' @method points kde1d
#'
#' @rdname plot.kde1d
#' @importFrom graphics points
#' @importFrom utils modifyList
#' @export
points.kde1d <- function(x, ...) {
ev <- make_plotting_grid(x)
vals <- dkde1d(ev, x)
pars <- list(x = ev, y = vals)
do.call(points, modifyList(pars, list(...)))
}

make_plotting_grid <- function(x) {
if (is.ordered(x$x)) {
ev <- ordered(levels(x$x), levels(x$x))
} else if (x$type == "discrete") {
ev <- seq.int(floor(min(x$grid_points)), ceiling(max(x$grid_points)))
} else {
# adjust grid if necessary
ev <- seq(min(x$grid_points), max(x$grid_points), l = 200)
if (!is.nan(x$xmin)) {
ev[1] <- x$xmin
}
if (!is.nan(x$xmax)) {
ev[length(ev)] <- x$xmax
}
if (x$type == "zero-inflated") {
ev <- setdiff(ev, 0)
}
}
ev
}

#' @importFrom stats logLik
@@ -181,8 +205,10 @@ logLik.kde1d <- function(object, ...) {
#' @method print kde1d
#' @export
print.kde1d <- function(x, ...) {
if (is.ordered(x$x)) {
if (x$type == "discrete") {
cat("(jittered) ")
} else if (x$type == "zero-inflated") {
cat("(zero-inflated) ")
}
cat("kernel density estimate ('kde1d')")
if (x$deg > 0) {
@@ -213,12 +239,13 @@ print.kde1d <- function(x, ...) {
#' @method summary kde1d
#' @export
summary.kde1d <- function(object, ...) {
df <- rep(NA, 4)
names(df) <- c("nobs", "bw", "loglik", "d.f.")
df <- rep(NA, 5)
names(df) <- c("nobs", "bw", "mult", "loglik", "d.f.")
df[1] <- object$nobs
df[2] <- object$bw
df[3] <- object$loglik
df[4] <- object$edf
df[3] <- object$mult
df[4] <- object$loglik
df[5] <- object$edf

print(object)
cat(strrep("-", 65), "\n", sep = "")
12 changes: 6 additions & 6 deletions R/kde1d-package.R
@@ -1,7 +1,8 @@
#' One-Dimensional Kernel Density Estimation
#'
#' Provides an efficient implementation of univariate local polynomial
#' kernel density estimators that can handle bounded and discrete data. The
#' kernel density estimators that can handle bounded, discrete, and
#' zero-inflated data. The
#' implementation utilizes spline interpolation to reduce memory usage and
#' computational demand for large data sets.
#'
@@ -12,21 +13,20 @@
#'
#' Geenens, G., Wang, C. (2018). *Local-likelihood transformation kernel
#' density estimation for positive random variables.* Journal of Computational
#' and Graphical Statistics, to appear,
#' and Graphical Statistics, 27(4), 822-835.
#' [arXiv:1602.04862](https://arxiv.org/abs/1602.04862)
#'
#' Nagler, T. (2018a). *A generic approach to nonparametric function
#' estimation with mixed data.* Statistics & Probability Letters, 137:326–330,
#' [arXiv:1704.07457](https://arxiv.org/abs/1704.07457)
#'
#' Nagler, T. (2018b). *Asymptotic analysis of the jittering kernel density
#' estimator.* Mathematical Methods of Statistics, in press,
#' estimator.* Mathematical Methods of Statistics, 27, 32-46.
#' [arXiv:1705.05431](https://arxiv.org/abs/1705.05431)
#'
#' @name kde1d-package
#' @docType package
NULL

#' @useDynLib kde1d
#' @importFrom Rcpp sourceCpp
NULL
"_PACKAGE"
