test parsnip "overhead" #1071
Conversation
tests/testthat/test_fit_interfaces.R

```r
time_parsnip_form <-
  timing(fit(parsnip::linear_reg(), mpg ~ ., mtcars))
time_parsnip_xy <-
  timing(fit_xy(parsnip::linear_reg(), mtcars[2:11], mtcars[1]))
```
The bracket subsetting does add some additional compute that's not "our fault," though it's a fraction of `time_engine`.

ubuntu-latest (devel) failure is due to unrelated #1074.
I feel a little weird trying to test these, but I agree with the sentiment. I totally agree about the skip on CRAN! Firstly, how variable are the ratios when you test them locally?
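For context, the kind of test under discussion might look like the following sketch. The threshold value here is hypothetical and deliberately permissive, and `skip_on_cran()` reflects the skip-on-CRAN suggestion above:

```r
library(testthat)
library(parsnip)

test_that("fitting via parsnip adds limited overhead", {
  skip_on_cran()

  # elapsed time for 100 evaluations of `expr`
  timing <- function(expr) {
    expr <- substitute(expr)
    system.time(replicate(100, eval(expr)))[["elapsed"]]
  }

  time_engine       <- timing(lm(mpg ~ ., mtcars))
  time_parsnip_form <- timing(fit(linear_reg(), mpg ~ ., mtcars))

  # hypothetical cutoff: flag the PR if parsnip is >10x slower than lm()
  expect_lt(time_parsnip_form / time_engine, 10)
})
```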
Heard. Here are the distributions of those ratios locally:

```r
library(parsnip)

timing <- function(expr) {
  expr <- substitute(expr)
  system.time(replicate(100, eval(expr)))[["elapsed"]]
}

time_engine <-
  replicate(100, timing(lm(mpg ~ ., mtcars)))
time_parsnip_form <-
  replicate(100, timing(fit(linear_reg(), mpg ~ ., mtcars)))
time_parsnip_xy <-
  replicate(100, timing(fit_xy(linear_reg(), mtcars[2:11], mtcars[1])))

hist(time_parsnip_form / time_engine)
hist(time_parsnip_xy / time_engine)
```

Created on 2024-02-26 with reprex v2.1.0
Given the current timings, we could even use 1000 replicates for each and this test would only take a couple seconds, if we wanted to cut down on that variability further. I've set the thresholds to be quite permissive generally, though.
That is not a bad idea. Up to you 😄
Maybe instead of summing the elapsed times for 100 reps, take the median of the reps (or mean) and do a ratio of those.
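That suggestion might be sketched as follows (the helper name is hypothetical). One caveat: at `system.time()`'s resolution, a single fast fit can report an elapsed time of zero, so each rep may still need to time a batch of fits:

```r
library(parsnip)

# One elapsed-time measurement per rep; `rep_timings` is a hypothetical helper
rep_timings <- function(expr, reps = 100) {
  expr <- substitute(expr)
  replicate(reps, system.time(eval(expr))[["elapsed"]])
}

time_engine       <- rep_timings(lm(mpg ~ ., mtcars))
time_parsnip_form <- rep_timings(fit(linear_reg(), mpg ~ ., mtcars))

# ratio of medians rather than a ratio of summed elapsed times
median(time_parsnip_form) / median(time_engine)
```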
I like the idea of using the median timing! The only issue is that individual fits go very fast, and …
You gotta bump them numbers up
The idea to use medians was spot on. Prioritizing the consistency of these tests (i.e. avoiding false failures), I propose we add a Suggests for bench in favor of …

```r
library(ggplot2)
library(parsnip)

bm <- function() {
  res <- bench::mark(
    time_engine = lm(mpg ~ ., mtcars),
    time_parsnip_form = fit(linear_reg(), mpg ~ ., mtcars),
    time_parsnip_xy = fit_xy(linear_reg(), mtcars[2:11], mtcars[1]),
    relative = TRUE,
    check = FALSE
  )

  c(form = res$median[2], xy = res$median[3])
}

ratios <- replicate(100, bm())
ratios <- data.frame(t(ratios))

ggplot(ratios) +
  aes(x = form) +
  geom_histogram() +
  geom_vline(xintercept = 3)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ggplot(ratios) +
  aes(x = xy) +
  geom_histogram() +
  geom_vline(xintercept = 3.5)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
```

Created on 2024-02-27 with reprex v2.1.0

The alternative is to test this in extratests to avoid the Suggests, but these aren't really integration tests, and moving them brings them farther away from our development cycle.
The decreased ratio cutoffs in the above comment are due to speedups merged upstream from PRs in the last two days!
Looks great with bench. Not much dependency overhead either.
Some minimal testing to alert us when we've allowed additional "overhead" to creep in. A truer test of overhead would compare differences rather than ratios, but those numbers are largely system-dependent.
My intent here is not that we'd automatically reject a PR that causes this test to fail, but to let us know when a PR changes these ratios and give us a chance to consider whether the slowdown is worth the benefit or ought to be addressed.
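Putting the pieces of the thread together, the bench-based test might look like this sketch. The cutoffs of 3 and 3.5 come from the vertical lines in the histograms above, but the final thresholds in the merged test are not stated here, so treat them as illustrative:

```r
library(testthat)
library(parsnip)

test_that("fitting via parsnip adds limited overhead", {
  skip_on_cran()
  skip_if_not_installed("bench")

  # relative = TRUE scales medians so time_engine's median is 1
  res <- bench::mark(
    time_engine = lm(mpg ~ ., mtcars),
    time_parsnip_form = fit(linear_reg(), mpg ~ ., mtcars),
    time_parsnip_xy = fit_xy(linear_reg(), mtcars[2:11], mtcars[1]),
    relative = TRUE,
    check = FALSE
  )

  # illustrative cutoffs taken from the histogram reference lines
  expect_lt(res$median[2], 3)
  expect_lt(res$median[3], 3.5)
})
```

A failure here isn't meant to auto-reject a PR, only to surface the ratio change for discussion.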