Handlers refactoring #185
And correct what we're sending the output handler
@@ -1,5 +1,7 @@
# evaluate (development version)

* The `source` output handler is now passed the entire top-level expression, not just the first component.
* `evaluate()` will now terminate on the first error in a top-level expression. This matches R's own behaviour more closely.
I think I might have actually changed this behaviour when I started using restarts, but it's now tested and documented. This is a bug that umpire works around in https://github.com/rstudio/umpire/blob/main/R/evaluate.R#L6-L11.
I find this terminology quite confusing because to me `1\n2` and `1;2` both contain two top-level expressions. I.e. `1;2` is not a single expression.
parse(text = "1; 2")
#> expression(1, 2)
parse(text = "1\n 2")
#> expression(1, 2)
Can we improve the terms used for this? Maybe "parser inputs"? Parser inputs are broken down by line by the R REPL, so `1;2` is one input containing two TLEs and `1\n2` is two inputs each containing one TLE?
To put it another way a top-level expression should correspond to one iteration of the evaluation loop rather than multiple iterations. Each TLE produces one piece of printed output.
I'd say `1;2` is one top-level expression consisting of two expressions. In your definition, what's the difference between a TLE and an expression?
I'd say each TLE generates one source statement.
Whether an expression is top-level is a property of where it's evaluated, at the top-level evaluation loop. It certainly makes sense to call `1;2` "top-level" but I find it confusing to also call it an "expression" because it's not an R expression stricto sensu. An expression is something that can be evaluated and thus must be representable as an AST node or leaf.
You could argue that `1;2` is parsed as an EXPRSXP vector and that you can evaluate it with the R-level `eval()` function, but I think it's the C-level function that should guide meaning here. And for the C-level function, EXPRSXP is a literal.
From this point of view `foo(bar)` consists of two expressions with `bar` nested in `foo(bar)`. Whereas `1; 2` is not an expression but a sequence of two expressions managed by a top-level evaluation loop.
> I'd say each TLE generates one source statement.
Sorry I'm not sure what that means.
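To make the distinction in this thread concrete, here is a small base R illustration (no evaluate internals involved). It only shows what `parse()` and the R-level `eval()` do with `1; 2`, and how that differs from a single nested call such as `foo(bar)`; the variable names are just for illustration.

```r
# Parsing "1; 2" yields an expression vector (EXPRSXP) of length 2,
# not a single call or constant.
exprs <- parse(text = "1; 2")
typeof(exprs)
length(exprs)

# The R-level eval() walks the vector and returns the last value.
last_value <- eval(exprs)

# By contrast, foo(bar) is one call with the symbol `bar` nested inside it.
nested <- quote(foo(bar))
typeof(nested)
typeof(nested[[2]])
```

So the parser output for `1; 2` is a container of two evaluable things, while `foo(bar)` is one evaluable thing with another inside it.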
Ok, I see where you're coming from. I'm going to merge this PR but I'll keep thinking about the vocab.
@@ -17,8 +17,23 @@ watchout <- function(handler = new_output_handler(),
  push <- function(value) {
    output[i] <<- list(value)
    i <<- i + 1

    switch(output_type(value),
The watcher is now in charge of calling the handler when we push an output onto the stack.
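A minimal sketch of that idea, assuming a simplified watcher: `push()` records the output and immediately dispatches it to the matching handler function. Everything here (`new_watcher()`, the `text`/`condition` handler fields, the type check) is hypothetical and only illustrates the pattern; it is not the package's actual implementation.

```r
# Hypothetical sketch: names below are illustrative, not evaluate's internals.
new_watcher <- function(handler) {
  output <- list()
  i <- 1

  push <- function(value) {
    # Record the output on the stack ...
    output[i] <<- list(value)
    i <<- i + 1

    # ... and hand it to the handler right away, based on its type.
    type <- if (inherits(value, "condition")) "condition" else "text"
    switch(type,
      condition = handler$condition(value),
      text = handler$text(value)
    )
    invisible()
  }

  list(push = push, get = function() output)
}

# Usage: a toy handler that logs what it was called with.
seen <- character()
handler <- list(
  text = function(x) seen <<- c(seen, paste0("text: ", x)),
  condition = function(x) seen <<- c(seen, paste0("cond: ", conditionMessage(x)))
)
w <- new_watcher(handler)
w$push("hello")
w$push(simpleWarning("careful"))
seen
```

Keeping the dispatch inside `push()` means callers only need to push outputs; they no longer have to remember to call the handler themselves.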
@@ -67,6 +80,22 @@ watchout <- function(handler = new_output_handler(),
    capture_output()
  }

  print_value <- function(value, visible) {
It feels a little weird to have this here, but the watcher is the one object that has all the details to handle this correctly.
I love seeing this getting simpler and simpler!
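A little info about speed: I estimate that the CRAN version of evaluate adds about 700µs of overhead to each TLE. For reference, the benchmark helper:
library(evaluate)

# Compare evaluate() with plain eval(parse()) on `n` copies of `x`,
# reporting per-TLE timings in microseconds.
overhead <- function(x, n) {
  x <- rep(x, n)
  df <- bench::mark(
    evaluate(x),
    eval(parse(text = x)),
    time_unit = "us",
    check = FALSE
  )[1:3]
  # Columns 2 and 3 hold the timings; divide by n to get per-TLE cost.
  df[2:3] <- df[2:3] / n
  df
}
overhead("1 + 1", 10)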
This is the culmination of all the `evaluate()` refactoring I've been working on: we can now define the handlers once (instead of once per top-level expression), and `evaluate_tle()` becomes sufficiently simple that we can inline it, making the double-loop strategy clearer.