Handlers refactoring #185

hadley · 2024-06-26T21:40:40Z

This is the culmination of all the evaluate() refactoring I've been working on — we can now define the handlers once (instead of once per top-level expression) and evaluate_tle() becomes sufficiently simple that we can inline it, making the double-loop strategy more clear.

And correct what we're sending the output handler
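
As a rough illustration of the idea, here is a minimal base R sketch (not the package's code) in which the calling handlers are established once around a double loop: the outer loop walks over parser inputs and the inner loop over the expressions in each input. The function name `run_all()` and the choice to collect messages are assumptions made for the example.

```r
# Hypothetical sketch, base R only; not the evaluate implementation.
run_all <- function(inputs, env = new.env()) {
  log <- character()

  # The handler is set up once for the whole run,
  # instead of once per top-level expression.
  withCallingHandlers(
    for (input in inputs) {                          # outer loop: parser inputs
      for (expr in as.list(parse(text = input))) {   # inner loop: expressions
        eval(expr, env)
      }
    },
    message = function(cnd) {
      log <<- c(log, conditionMessage(cnd))
      invokeRestart("muffleMessage")
    }
  )

  log
}

run_all(c("message('a'); message('b')", "message('c')"))
```

Because the handler resumes through the `muffleMessage` restart, both loops keep running after each condition is recorded.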

hadley · 2024-06-26T21:42:01Z

NEWS.md

@@ -1,5 +1,7 @@
 # evaluate (development version)

+* The `source` output handler is now parsed the entire top-level expression, not just the first component.
+* `evaluate()` will now terminate on the first error in a top-level expression. This matches R's own behaviour more closely.


I think I might have actually changed this behaviour when I started using restarts, but it's now tested and documented. This is a bug that umpire works around in https://github.com/rstudio/umpire/blob/main/R/evaluate.R#L6-L11.
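
A minimal base R sketch of the behaviour described in the second bullet, assuming that an error abandons the remaining expressions of the current input and that evaluation then continues with the next input. `eval_inputs()` is a hypothetical helper, not part of evaluate.

```r
# Hypothetical sketch, base R only; not the evaluate implementation.
eval_inputs <- function(inputs, env = new.env()) {
  evaluated <- character()

  for (input in inputs) {
    # An error unwinds to this tryCatch(), so the rest of the
    # current input is skipped.
    tryCatch(
      for (expr in as.list(parse(text = input))) {
        eval(expr, env)
        evaluated <- c(evaluated, deparse(expr))
      },
      error = function(cnd) NULL
    )
  }

  evaluated
}

# 3 is never evaluated; 4 still is (under the assumption above).
eval_inputs(c("1; stop('boom'); 3", "4"))
```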

lionel- · 2024-06-27T07:15:56Z

I find this terminology quite confusing because to me 1\n2 and 1;2 both contain two top-level expressions. I.e. 1;2 is not a single expression.

parse(text = "1; 2")
#> expression(1, 2)
parse(text = "1\n 2")
#> expression(1, 2)

Can we improve the terms used for this? Maybe "parser inputs"? Parser inputs are broken down by line by the R REPL, so 1;2 is one input containing two TLEs and 1\n2 is two inputs each containing one TLE?

To put it another way, a top-level expression should correspond to one iteration of the evaluation loop rather than multiple iterations. Each TLE produces one piece of printed output.

hadley · 2024-06-28

I'd say 1;2 is one top-level expression consisting of two expressions. In your definition, what's the difference between a TLE and an expression?

I'd say each TLE generates one source statement.

lionel- · 2024-06-30

Whether an expression is top-level is a property of where it's evaluated, at the top-level evaluation loop. It certainly makes sense to call 1;2 "top-level" but I find it confusing to also call it an "expression" because it's not an R expression stricto sensu. An expression is something that can be evaluated and thus must be representable as an AST node or leaf.

You could argue that 1;2 is parsed as an EXPRSXP vector and that you can evaluate it with the R-level eval() function, but I think it's the C-level function that should guide meaning here. And for the C-level function, EXPRSXP is a literal.

From this point of view foo(bar) consists of two expressions with bar nested in foo(bar). Whereas 1; 2 is not an expression but a sequence of two expressions managed by a top-level evaluation loop.

> I'd say each TLE generates one source statement.

Sorry I'm not sure what that means.

hadley · 2024-07-01

Ok, I see where you're coming from. I'm going to merge this PR but I'll keep thinking about the vocab.
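
The distinction drawn in this thread can be checked directly in base R. The sketch below uses only base functions; the variable names are arbitrary.

```r
both <- list(
  semicolon = parse(text = "1; 2"),
  newline = parse(text = "1\n2")
)

# Both inputs parse to an expression vector (EXPRSXP) holding two elements.
types <- vapply(both, typeof, character(1))
lengths <- vapply(both, length, integer(1))

# R-level eval() accepts the whole vector: it evaluates each element
# in turn and returns the value of the last one.
last_value <- eval(both$semicolon)

# foo(bar) is a single call, with the symbol bar nested inside it.
cl <- quote(foo(bar))
is.call(cl)
is.symbol(cl[[2]])
```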

hadley · 2024-06-26T21:43:42Z

R/watcher.R

@@ -17,8 +17,23 @@ watchout <- function(handler = new_output_handler(),
  push <- function(value) {
    output[i] <<- list(value)
    i <<- i + 1
+
+    switch(output_type(value),


The watcher is now in charge of calling the handler when we push an output onto the stack.
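
A minimal sketch of that design, assuming a closure-based watcher. Only `push`, `output`, and `i` come from the diff above; `new_watcher()`, the `handlers` list, and dispatching on the first class of the value (rather than on `output_type()`) are assumptions for illustration.

```r
# Hypothetical sketch; not the watchout() implementation.
new_watcher <- function(handlers = list()) {
  output <- list()
  i <- 1

  push <- function(value) {
    # Record the output on the stack...
    output[i] <<- list(value)
    i <<- i + 1

    # ...then call the handler registered for this kind of output, if any.
    handler <- handlers[[class(value)[[1]]]]
    if (is.function(handler)) handler(value)
    invisible()
  }

  list(push = push, get = function() output)
}

seen <- character()
w <- new_watcher(list(character = function(x) seen <<- c(seen, x)))
w$push("hello")
w$push(1L)  # stored, but no handler is registered for integers
```

Keeping storage and dispatch in one place means callers only ever need to call `push()`.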

hadley · 2024-06-26T21:44:11Z

R/watcher.R

@@ -67,6 +80,22 @@ watchout <- function(handler = new_output_handler(),
    capture_output()
  }

+  print_value <- function(value, visible) {


It feels a little weird to have this here, but the watcher is the one object that has all the details to handle this correctly.
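
For context, visibility-aware printing in base R usually goes through `withVisible()`. The sketch below is a hypothetical stand-alone helper, `print_if_visible()`, not the `print_value()` defined in this PR.

```r
# Hypothetical sketch, base R only.
print_if_visible <- function(expr, env = new.env()) {
  # withVisible() reports whether the value would auto-print at top level.
  res <- withVisible(eval(expr, env))
  if (res$visible) print(res$value)
  invisible(res$value)
}

# A plain value is printed; an assignment is not.
out_visible <- capture.output(print_if_visible(quote(1 + 1)))
out_invisible <- capture.output(print_if_visible(quote(x <- 1 + 1)))
```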

lionel-

I love seeing this getting simpler and simpler!

hadley · 2024-06-29T14:53:36Z

A little info about speed. I estimate that the CRAN version of evaluate adds ~700µs of overhead to each TLE. (For reference, eval + parse takes about 3µs.) The current main branch brings that down to ~500µs, and this branch brings it down to ~400µs. Obviously unlikely to make much difference in practice, but it's nice that these changes also make evaluate a bit faster.

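library(evaluate)

# Estimate the per-TLE cost of evaluate() next to plain eval(parse()).
overhead <- function(x, n) {
  # Repeat the input so that each run evaluates n top-level expressions
  x <- rep(x, n)

  df <- bench::mark(
    evaluate(x),
    eval(parse(text = x)),
    time_unit = "us",  # report timings as plain numbers in microseconds
    check = FALSE      # the two calls return different objects
  )[1:3]               # keep the expression, min, and median columns
  # Convert total time into time per TLE
  df[2:3] <- df[2:3] / n
  df
}

overhead("1 + 1", 10)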

hadley added 9 commits June 26, 2024 09:26

Move handlers into their own file in conditions.R

bfb7f4e

Create handlers once per evaluate call

7dd9692

Another test

3342d3b

Make watcher responsible for calling the output handler

f04be87

Add news bullet

11bcdc1

Add print_value "method" to watcher; eliminate evaluate_tle

6df9b94

Move more setup to watcher

b8a8b88

Make watcher responsible for src too

df45c3f

Use watcher for failed parse

f694c61

And correct what we're sending the output handler

hadley requested review from lionel- and cderv June 26, 2024 21:40

lionel- approved these changes Jun 27, 2024

hadley added 5 commits June 28, 2024 15:07

Merge commit 'd0e5d98bd619dd17c1ed9a5ae1031afd705c7b88'

6601944

Fix typo

300f719

Clarify stop_on_error behaviour

14da7c4

More feedback from code review

a2aa015

Redocument

6b5dedb

cderv approved these changes Jul 1, 2024

hadley merged commit d8f00ea into main Jul 1, 2024
13 checks passed

hadley deleted the handlers-update branch July 1, 2024 13:59
