Bump dependencies on Delphi packages #12

Merged
merged 10 commits into from
Oct 2, 2024
6 changes: 6 additions & 0 deletions .Rprofile
@@ -1 +1,7 @@
source("renv/activate.R")

# Check if user .Rprofile exists
if (file.exists("~/.Rprofile")) {
# Source user .Rprofile
source("~/.Rprofile")
}
11 changes: 6 additions & 5 deletions DESCRIPTION
@@ -2,11 +2,12 @@ Package: delphitoolingbook
Title: Delphi Tooling
Version: 0.0.0.9999
Authors@R: c(
person("Daniel", "McDonald", "J.", "daniel@stat.ubc.ca", role = c("cre", "aut"),
person("Logan", "Brooks", role = c("cre","aut"),
person("Rachel", "Lobay", role = "aut"))
person("Ryan", "Tibshirani", "J.", "ryantibs@berkeley.edu", role = "aut"),
Description:
person("Daniel", "McDonald", "J.", "daniel@stat.ubc.ca", role = c("cre", "aut")),
person("Logan", "Brooks", role = c("cre","aut")),
person("Rachel", "Lobay", role = "aut"),
person("Ryan", "Tibshirani", "J.", "ryantibs@berkeley.edu", role = "aut")
)
Description:
| This book is a longform introduction to analysing and forecasting epidemiological data.
License: MIT + file LICENSE
Imports:
22 changes: 22 additions & 0 deletions README.md
@@ -0,0 +1,22 @@
# Delphi Tooling Book

The book is a collection of articles and tutorials on how to use the Delphi tooling effectively.

## Compiling the book

The book is written with [Quarto](https://quarto.org/docs/guide/) (which can be installed [here](https://quarto.org/docs/get-started/)). To compile the book, run the following commands:

```sh
# Install the R dependencies
R -e 'install.packages(c("pak", "rspm", "renv"))'
R -e 'renv::restore()'

# Compile the book and preview it
quarto preview
```

We use Quarto's freeze feature to re-render only the qmd files that have changed. To force a re-render of a page, run this command:

```sh
quarto render <name.qmd>
```
11 changes: 11 additions & 0 deletions _common.R
@@ -42,3 +42,14 @@ options(

ggplot2::theme_set(ggplot2::theme_bw())

# Workaround for interleaved `cat`s and `message`s (from `cli`) getting
# intercepted and not combined properly by `collapse: true`:
with_messages_cat_to_stdout <- function(code) {
withCallingHandlers(
code,
message = function(m) {
cat(m$message)
tryInvokeRestart("muffleMessage")
}
)
}
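
A minimal sketch of how the helper above might be used, assuming R 4.0.0 or later (for `tryInvokeRestart()`). The helper is repeated with indentation so the block is self-contained; the `captured` variable and the three-line body are illustrative only and are not part of the PR.

```r
# Same helper as in the diff above, re-indented for readability.
with_messages_cat_to_stdout <- function(code) {
  withCallingHandlers(
    code,
    message = function(m) {
      cat(m$message)                    # re-emit the message text on stdout
      tryInvokeRestart("muffleMessage") # suppress the default stderr output
    }
  )
}

# Illustrative usage (hypothetical): interleave cat() and message() calls and
# capture stdout to check that the message text arrives in call order.
captured <- capture.output(
  invisible(with_messages_cat_to_stdout({
    cat("first\n")
    message("second")
    cat("third\n")
  }))
)
print(captured)
```

Because the message text is written with `cat()`, it travels on the same stream as the other output, which is what lets a `collapse: true` chunk keep the lines in order.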
4 changes: 2 additions & 2 deletions _freeze/archive/execute-results/html.json

Large diffs are not rendered by default.

1,465 changes: 1,465 additions & 0 deletions _freeze/archive/figure-html/unnamed-chunk-8-1.svg
1,179 changes: 588 additions & 591 deletions _freeze/archive/figure-html/unnamed-chunk-9-1.svg
297 changes: 147 additions & 150 deletions _freeze/correlations/figure-html/unnamed-chunk-10-1.svg
345 changes: 171 additions & 174 deletions _freeze/correlations/figure-html/unnamed-chunk-4-1.svg
381 changes: 189 additions & 192 deletions _freeze/correlations/figure-html/unnamed-chunk-6-1.svg
271 changes: 134 additions & 137 deletions _freeze/correlations/figure-html/unnamed-chunk-8-1.svg
4 changes: 2 additions & 2 deletions _freeze/epidf/execute-results/html.json

Large diffs are not rendered by default.

944 changes: 474 additions & 470 deletions _freeze/epidf/figure-html/unnamed-chunk-11-1.svg
1,426 changes: 715 additions & 711 deletions _freeze/epidf/figure-html/unnamed-chunk-13-1.svg
3,992 changes: 1,998 additions & 1,994 deletions _freeze/epidf/figure-html/unnamed-chunk-15-1.svg
4 changes: 2 additions & 2 deletions _freeze/epipredict/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/flatline-forecaster/execute-results/html.json

Large diffs are not rendered by default.

691 changes: 344 additions & 347 deletions _freeze/flatline-forecaster/figure-html/unnamed-chunk-12-1.svg
813 changes: 349 additions & 464 deletions _freeze/flatline-forecaster/figure-html/unnamed-chunk-13-1.svg
875 changes: 436 additions & 439 deletions _freeze/flatline-forecaster/figure-html/unnamed-chunk-14-1.svg
879 changes: 879 additions & 0 deletions _freeze/flatline-forecaster/figure-html/unnamed-chunk-15-1.svg
4 changes: 2 additions & 2 deletions _freeze/forecast-framework/execute-results/html.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion _freeze/growth-rates/execute-results/html.json

Large diffs are not rendered by default.

599 changes: 298 additions & 301 deletions _freeze/growth-rates/figure-html/unnamed-chunk-11-1.svg
653 changes: 325 additions & 328 deletions _freeze/growth-rates/figure-html/unnamed-chunk-11-2.svg
3,799 changes: 1,888 additions & 1,911 deletions _freeze/growth-rates/figure-html/unnamed-chunk-4-1.svg
377 changes: 187 additions & 190 deletions _freeze/growth-rates/figure-html/unnamed-chunk-5-1.svg
611 changes: 304 additions & 307 deletions _freeze/growth-rates/figure-html/unnamed-chunk-7-1.svg
615 changes: 306 additions & 309 deletions _freeze/growth-rates/figure-html/unnamed-chunk-9-1.svg
4 changes: 2 additions & 2 deletions _freeze/index/execute-results/html.json

Large diffs are not rendered by default.

418 changes: 208 additions & 210 deletions _freeze/index/figure-html/unnamed-chunk-8-1.svg
4 changes: 2 additions & 2 deletions _freeze/outliers/execute-results/html.json

Large diffs are not rendered by default.

1,060 changes: 532 additions & 528 deletions _freeze/outliers/figure-html/unnamed-chunk-3-1.svg
2,286 changes: 1,244 additions & 1,042 deletions _freeze/outliers/figure-html/unnamed-chunk-7-1.svg
2,250 changes: 1,222 additions & 1,028 deletions _freeze/outliers/figure-html/unnamed-chunk-7-2.svg
1,072 changes: 538 additions & 534 deletions _freeze/outliers/figure-html/unnamed-chunk-9-1.svg
4 changes: 2 additions & 2 deletions _freeze/preprocessing-and-models/execute-results/html.json

Large diffs are not rendered by default.

481 changes: 239 additions & 242 deletions _freeze/preprocessing-and-models/figure-html/unnamed-chunk-9-1.svg
4 changes: 2 additions & 2 deletions _freeze/slide/execute-results/html.json

Large diffs are not rendered by default.

308 changes: 308 additions & 0 deletions _freeze/slide/figure-html/unnamed-chunk-10-1.svg

Large diffs are not rendered by default.

19,118 changes: 16,561 additions & 2,557 deletions _freeze/slide/figure-html/unnamed-chunk-12-1.svg

Large diffs are not rendered by default.

2,860 changes: 2,860 additions & 0 deletions _freeze/slide/figure-html/unnamed-chunk-16-1.svg

Large diffs are not rendered by default.

24,171 changes: 12,068 additions & 12,103 deletions _freeze/slide/figure-html/unnamed-chunk-8-1.svg

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/sliding-forecasters/execute-results/html.json

Large diffs are not rendered by default.

3,539 changes: 1,771 additions & 1,768 deletions _freeze/sliding-forecasters/figure-html/plot-ar-asof-1.svg

Large diffs are not rendered by default.

3,464 changes: 1,739 additions & 1,725 deletions _freeze/sliding-forecasters/figure-html/plot-arx-1.svg

Large diffs are not rendered by default.

9,701 changes: 4,993 additions & 4,708 deletions _freeze/sliding-forecasters/figure-html/plot-can-fc-boost-1.svg

Large diffs are not rendered by default.

9,445 changes: 4,969 additions & 4,476 deletions _freeze/sliding-forecasters/figure-html/plot-can-fc-lr-1.svg

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion _freeze/tidymodels-intro/execute-results/html.json

Large diffs are not rendered by default.

647 changes: 322 additions & 325 deletions _freeze/tidymodels-intro/figure-html/unnamed-chunk-23-1.svg

Large diffs are not rendered by default.

273 changes: 136 additions & 137 deletions _freeze/tidymodels-intro/figure-html/unnamed-chunk-26-1.svg

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion _freeze/tidymodels-regression/execute-results/html.json

Large diffs are not rendered by default.

5,234 changes: 2,614 additions & 2,620 deletions _freeze/tidymodels-regression/figure-html/unnamed-chunk-21-1.svg

Large diffs are not rendered by default.

5,180 changes: 2,584 additions & 2,596 deletions _freeze/tidymodels-regression/figure-html/unnamed-chunk-24-1.svg

Large diffs are not rendered by default.

130 changes: 51 additions & 79 deletions archive.qmd
@@ -16,18 +16,17 @@ claims, available through the [COVIDcast
API](https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html). This
signal is subject to very heavy and regular revision; you can read more about it
on its [API documentation
page](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/doctor-visits.html). We'll use the offline version stored in `{epidatasets}`.

page](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/doctor-visits.html).
We'll use the offline version stored in `{epidatasets}`.

```{r, include=FALSE}
source("_common.R")
```

## Getting data into `epi_archive` format

An `epi_archive` object
can be constructed from a data frame, data table, or tibble, provided that it
has (at least) the following columns:
An `epi_archive` object can be constructed from a data frame, data table, or
tibble, provided that it has (at least) the following columns:

* `geo_value`: the geographic value associated with each row of measurements.
* `time_value`: the time value associated with each row of measurements.
@@ -37,7 +36,7 @@ has (at least) the following columns:
the data for January 14, 2022 that were available one day later.

As we can see from the above, the data frame returned by
`epidatr::covidcast()` has the columns required for the `epi_archive`
`epidatr::pub_covidcast()` has the columns required for the `epi_archive`
format, so we use
`as_epi_archive()` to cast it into `epi_archive` format.[^1]

@@ -48,17 +47,17 @@ to the [compactify vignette](https://cmu-delphi.github.io/epiprocess/articles/co

```{r}
x <- archive_cases_dv_subset_dt %>%
select(geo_value, time_value, version, percent_cli) %>%
select(geo_value, time_value, version, percent_cli) %>%
as_epi_archive(compactify = TRUE)

class(x)
print(x)
```

An `epi_archive` is special kind of class called an R6 class. Its primary field
is a data table `DT`, which is of class `data.table` (from the `data.table`
package), and has columns `geo_value`, `time_value`, `version`, as well as any
number of additional columns.
An `epi_archive` is an S3 class. Its primary field is a data table `DT`, which
is of class `data.table` (from the `{data.table}` package), and has columns
`geo_value`, `time_value`, `version`, as well as any number of additional
columns.

```{r}
class(x$DT)
@@ -70,33 +69,18 @@ for the data table, as well as any other specified in the metadata (described
below). There can only be a single row per unique combination of key variables,
and therefore the key variables are critical for figuring out how to generate a
snapshot of data from the archive, as of a given version (also described below).

```{r, error=TRUE}
key(x$DT)
```

In general, the last version of each observation is carried forward (LOCF) to
fill in data between recorded versions. **A word of caution:** R6 objects,
unlike most other objects in R, have reference semantics. An important
consequence of this is that objects are not copied when modified.


```{r}
original_value <- x$DT$percent_cli[1]
y <- x # This DOES NOT make a copy of x
y$DT$percent_cli[1] = 0
head(y$DT)
head(x$DT)
x$DT$percent_cli[1] <- original_value
data.table::key(x$DT)
```

To make a copy, we can use the `clone()` method for an R6 class, as in `y <-
x$clone()`. You can read more about reference semantics in Hadley Wickham's
[Advanced R](https://adv-r.hadley.nz/r6.html#r6-semantics) book.
In general, the last version of each observation is carried forward (LOCF) to
fill in data between recorded versions.
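
As a rough illustration of LOCF between recorded versions, the base R sketch below looks up the value of a single observation as of an arbitrary version date. The `updates` table, its values, and the `value_as_of()` helper are all hypothetical; this is not how `{epiprocess}` implements the lookup.

```r
# Toy version history for one (geo_value, time_value) pair; values are made up.
updates <- data.frame(
  version = as.Date(c("2021-06-01", "2021-06-03", "2021-06-07")),
  percent_cli = c(2.1, 2.4, 2.3)
)

# Hypothetical helper: return the value as of `query_version`, carrying the
# most recent recorded version forward (LOCF).
value_as_of <- function(updates, query_version) {
  updates <- updates[order(updates$version), ]
  # Index of the last recorded version that is <= query_version (0 if none).
  idx <- findInterval(as.numeric(query_version), as.numeric(updates$version))
  if (idx == 0L) {
    return(NA_real_) # nothing had been published yet
  }
  updates$percent_cli[idx]
}

between_versions <- value_as_of(updates, as.Date("2021-06-05"))
before_first <- value_as_of(updates, as.Date("2021-05-31"))
print(between_versions)
print(before_first)
```

A query on June 5 falls between the June 3 and June 7 versions, so the June 3 value is carried forward; a query before the first version has nothing to carry forward and yields `NA`.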

## Some details on metadata

The following pieces of metadata are included as fields in an `epi_archive`
object:
object:

* `geo_type`: the type for the geo values.
* `time_type`: the type for the time values.
@@ -112,20 +96,18 @@ call (as it did in the case above).

A key method of an `epi_archive` class is `as_of()`, which generates a snapshot
of the archive in `epi_df` format. This represents the most up-to-date values of
the signal variables as of a given version. This can be accessed via `x$as_of()`
for an `epi_archive` object `x`, but the package also provides a simple wrapper
function `epix_as_of()` since this is likely a more familiar interface for users
not familiar with R6 (or object-oriented programming).
the signal variables as of a given version. This can be accessed via
`epix_as_of()`.

```{r}
x_snapshot <- epix_as_of(x, max_version = as.Date("2021-06-01"))
x_snapshot <- epix_as_of(x, version = as.Date("2021-06-01"))
class(x_snapshot)
x_snapshot
max(x_snapshot$time_value)
attributes(x_snapshot)$metadata$as_of
```

We can see that the max time value in the `epi_df` object `x_snapshot` that was
We can see that the max time value in the `epi_df` object `x_snapshot` that was
generated from the archive is May 29, 2021, even though the specified version
date was June 1, 2021. From this we can infer that the doctor's visits signal
was 2 days latent on June 1. Also, we can see that the metadata in the `epi_df`
@@ -134,65 +116,67 @@ object has the version date recorded in the `as_of` field.
By default, using the maximum of the `version` column in the underlying data table in an
`epi_archive` object itself generates a snapshot of the latest values of signal
variables in the entire archive. The `epix_as_of()` function issues a warning in
this case, since updates to the current version may still come in at a later
this case, since updates to the current version may still come in at a later
point in time, due to various reasons, such as synchronization issues.

```{r}
x_latest <- epix_as_of(x, max_version = max(x$DT$version))
x_latest <- epix_as_of(x, version = max(x$DT$version))
```
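
The snapshot idea can be sketched in base R, under the assumption that a snapshot keeps, for each `geo_value` and `time_value` pair, the row with the newest version not exceeding the requested one. The toy `archive` data frame and the `snapshot_as_of()` helper below are hypothetical and only illustrate the logic; `epix_as_of()` itself operates on the archive's keyed `data.table`.

```r
# Toy archive: revisions appear as extra rows with later versions (made-up values).
archive <- data.frame(
  geo_value = c("ca", "ca", "ca", "fl", "fl"),
  time_value = as.Date(c(
    "2021-05-28", "2021-05-28", "2021-05-29", "2021-05-28", "2021-05-28"
  )),
  version = as.Date(c(
    "2021-05-30", "2021-06-02", "2021-06-01", "2021-05-31", "2021-06-01"
  )),
  percent_cli = c(1.0, 1.5, 2.0, 3.0, 3.2)
)

# Hypothetical helper, not the epiprocess implementation.
snapshot_as_of <- function(archive, as_of_version) {
  # Drop everything published after the requested version.
  visible <- archive[archive$version <= as_of_version, ]
  # Sort so the newest version of each observation comes last in its group.
  visible <- visible[
    order(visible$geo_value, visible$time_value, visible$version),
  ]
  # Keep only the last (newest) row for each geo_value/time_value pair.
  keep <- !duplicated(visible[c("geo_value", "time_value")], fromLast = TRUE)
  snapshot <- visible[keep, c("geo_value", "time_value", "percent_cli")]
  rownames(snapshot) <- NULL
  snapshot
}

snap <- snapshot_as_of(archive, as.Date("2021-06-01"))
print(snap)
```

The June 2 revision for California is excluded because it was published after the requested version, while the June 1 revision for Florida replaces the earlier May 31 value.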

Below, we pull several snapshots from the archive, spaced one month apart. We
overlay the corresponding signal curves as colored lines, with the version dates
marked by dotted vertical lines, and draw the latest curve in black (from the
marked by dotted vertical lines, and draw the latest curve in black (from the
latest snapshot `x_latest` that the archive can provide).

```{r, fig.width = 8, fig.height = 7}
self_max <- max(x$DT$version)
versions <- seq(as.Date("2020-06-01"), self_max - 1, by = "1 month")
snapshots <- map(
versions,
function(v) {
epix_as_of(x, max_version = v) %>% mutate(version = v)
}) %>%
versions,
function(v) {
epix_as_of(x, version = v) %>% mutate(version = v)
}
) %>%
list_rbind() %>%
bind_rows(x_latest %>% mutate(version = self_max)) %>%
mutate(latest = version == self_max)
```

```{r, fig.height=7}
#| code-fold: true
ggplot(snapshots %>% filter(!latest),
aes(x = time_value, y = percent_cli)) +
geom_line(aes(color = factor(version)), na.rm = TRUE) +
ggplot(
snapshots %>% filter(!latest),
aes(x = time_value, y = percent_cli)
) +
geom_line(aes(color = factor(version)), na.rm = TRUE) +
geom_vline(aes(color = factor(version), xintercept = version), lty = 2) +
facet_wrap(~ geo_value, scales = "free_y", ncol = 1) +
facet_wrap(~geo_value, scales = "free_y", ncol = 1) +
scale_x_date(minor_breaks = "month", date_labels = "%b %Y") +
scale_color_viridis_d(option = "A", end = .9) +
labs(x = "Date", y = "% of doctor's visits with CLI") +
labs(x = "Date", y = "% of doctor's visits with CLI") +
theme(legend.position = "none") +
geom_line(data = snapshots %>% filter(latest),
aes(x = time_value, y = percent_cli),
inherit.aes = FALSE, color = "black", na.rm = TRUE)
geom_line(
data = snapshots %>% filter(latest),
aes(x = time_value, y = percent_cli),
inherit.aes = FALSE, color = "black", na.rm = TRUE
)
```

We can see some interesting and highly nontrivial revision behavior: at some
points in time the provisional data snapshots grossly underestimate the latest
curve (look in particular at Florida close to the end of 2021), and at others
they overestimate it (both states towards the beginning of 2021), though not
they overestimate it (both states towards the beginning of 2021), though not
quite as dramatically. Modeling the revision process, which is often called
*backfill modeling*, is an important statistical problem in and of itself.


## Merging `epi_archive` objects
## Merging `epi_archive` objects

Now we demonstrate how to merge two `epi_archive` objects together, e.g., so
that grabbing data from multiple sources as of a particular version can be
performed with a single `as_of` call. The `epi_archive` class provides a method
`merge()` precisely for this purpose. The wrapper function is called
`epix_merge()`; this wrapper avoids mutating its inputs, while `x$merge` will
mutate `x`. Below we merge the working `epi_archive` of versioned percentage CLI
from outpatient visits to another one of versioned COVID-19 case reporting data,
which we fetch the from the [COVIDcast
performed with a single `as_of` call. The `epiprocess` package provides
`epix_merge()` for this purpose. Below we merge the working `epi_archive` of
versioned percentage CLI from outpatient visits to another one of versioned
COVID-19 case reporting data, which we fetch from the [COVIDcast
API](https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html/), on the
rate scale (counts per 100,000 people in the population).

@@ -209,39 +193,27 @@ When merging archives, unless the archives have identical data release patterns,
the other).

```{r, message = FALSE, warning = FALSE,eval=FALSE}
# This code is for illustration and doesn't run.
# This code is for illustration and doesn't run.
# The result is saved/loaded in the (hidden) next chunk from `{epidatasets}`
y <- covidcast(
data_source = "jhu-csse",
y <- pub_covidcast(
source = "jhu-csse",
signals = "confirmed_7dav_incidence_prop",
time_type = "day",
geo_type = "state",
time_values = epirange(20200601, 20211201),
geo_values = "ca,fl,ny,tx",
issues = epirange(20200601, 20211201)
) %>%
fetch() %>%
select(geo_value, time_value, version = issue, case_rate_7d_av = value) %>%
as_epi_archive(compactify = TRUE)

x$merge(y, sync = "locf", compactify = FALSE)
x <- epix_merge(x, y, sync = "locf", compactify = FALSE)
print(x)
head(x$DT)
```

```{r, echo=FALSE}
x <- archive_cases_dv_subset
print(x)
head(x$DT)
```

Importantly, see that `x$merge` mutated `x` to hold the result of the merge. We
could also have used `xy = epix_merge(x, y)` to avoid mutating `x`. See the
documentation for either for more detailed descriptions of what mutation,
pointer aliasing, and pointer reseating is possible.

## Sliding version-aware computations

::: {.callout-note}
TODO: need a simple example here.
:::
:::