Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support unscoped functions and library functions #452

Merged
merged 22 commits into from
Jul 10, 2021

Conversation

renkun-ken
Copy link
Member

@renkun-ken renkun-ken commented Jul 8, 2021

Closes #257
Closes #426
Closes #451

This PR introduces customizable unscoped parser hook functions and library functions so that the following code could be supported:

suppressPackageStartupMessages({
  pacman::p_load(dplyr)
})

The unscoped functions (e.g. try, suppressPackageStartupMessages, system.time) evaluate an expression in its calling environment as if the expr is directly evaluated but with some other things done.

The library functions (e.g. library, require) are a list of functions recognized as equivalent with library. An example is pacman::p_load.

User could add more functions via the option languageserver.unscoped_functions and languageserver.library_functions respectively in .Rprofile.

The parse_expr function is rewritten with unified parser hooks defined in the package and could be modified by user via languageserver.parser_hooks option.

A parser hook is a function of expr(input expression currently processing), and an action (a list of action functions). For example, the for loop is handled by the following parser hook:

"for" = function(expr, action) {
        if (is.symbol(e <- expr[[2L]])) {
            action$update(nonfuncts = as.character(e))
        }
        action$parse(expr[[4L]])
    }

In this function, it adds the iteration variable (expr[[2L]]) to nonfuncts as a non-function global variable, and continue to parse the for loop body (expr[[4L]]).

The pre-defined action functions include

  • update(...): update the character vector of functs, nonfuncts, packages, etc. For example, for (i in 1:10) { } adds i to nonfuncts.

  • assign(symbol, value, type): create an assignment which adds new symbol definition and associated comment lines as its documentation.

  • parse(expr): continue to parse the given expression.

  • parse_args(args): continue to parse the expressions of the given matched function arguments. For example, some functions (e.g. system.time, suppressPackageStartupMessages) or arguments (e.g. tryCatch(expr=, finally=)) evaluate expressions in the current scope so that we would like to capture these arguments of the function to continue to parse the expressions as being evaluated in the global scope.

  • Implementation

  • Test cases

@randy3k
Copy link
Member

randy3k commented Jul 9, 2021

I am not sure if I like how languageserver.unscoped_functions and languageserver.library_functions work. It seems that we should have some kind of more general hooks for parsing the doc. Maybe something like

parser_hooks <- list(
	"delayedAssign" = function(...) { # modify env$definitions},
	"require" = function(...) { # modify env$packages },
	"library" = function(...) { # modify env$packages },
	"suppressPackageStartupMessages" = function(...) { # parse the sub expression }
	.... # and many other predefined hooks for other calls
)

Insider the hook functions, we could modify the parse environment, redispatch parse_expr or do whatever relevant to the expression.

For specific usage, we also allow users to override them individually.

@renkun-ken
Copy link
Member Author

renkun-ken commented Jul 10, 2021

The ideal way is definitely a nice list of hook functions as you described. The assignment expressions, however, are somewhat special as they are only slightly different but do a lot of things in common. Let me see if I could make it look better.

@renkun-ken
Copy link
Member Author

renkun-ken commented Jul 10, 2021

If we make all functions equivalent in parser_hooks, then assignment functions do a lot of things in common and call a number of internal functions in the package, then it makes more sense to also let them call a function to do the shared part, or they are generated by a function generator. Either way, if the user wants to tweak them in .Rprofile, it would be ugly using the shared function or function generator or the internal functions in the package, which seems a bit of an over-generalization.

Another approach is to make the hook functions more descriptive by returning a structure that described what should be done: assignment, recall, or they could modify the input env directly. The assignment and recall behavior is pre-defined.

@randy3k
Copy link
Member

randy3k commented Jul 10, 2021

I guess it makes sense to expose to uses only a limited set of actions via some helper functions. Either we could pass those functions in a list via the hook functions or define a R6 class that contains all the relevant methods.

@renkun-ken
Copy link
Member Author

renkun-ken commented Jul 10, 2021

I come up with the following functions defined by default in parser_hooks. Each function should return NULL or a list of actions (e.g. assign or recall). recall= and assign=list(symbol=, value=, type=). recall=character vector means the behavior of match.call and recall with the expression at given arguments.

parser_hooks <- list(
    "{" = function(expr, env) {
        list(recall = as.list(expr)[-1L])
    },
    "(" = function(expr, env) {
        list(recall = as.list(expr)[-1L])
    },
    "if" = function(expr, env) {
        list(recall = as.list(expr)[2:4])
    },
    "for" = function(expr, env) {
        if (is.symbol(e <- expr[[2L]])) {
            env$nonfuncts <- c(env$nonfuncts, as.character(e))
        }
        list(recall = expr[[4L]])
    },
    "while" = function(expr, env) {
        list(recall = as.list(expr)[2:3])
    },
    "repeat" = function(expr, env) {
        list(recall = expr[[2L]])
    },
    "<-" = function(expr, env) {
        if (length(expr) == 3L && is.symbol(expr[[2L]])) {
            list(
                assign = list(symbol = as.character(expr[[2L]]), value = expr[[3L]]),
                recall = expr[[3L]]
            )
        }
    },
    "=" = function(expr, env) {
        if (length(expr) == 3L && is.symbol(expr[[2L]])) {
            list(
                assign = list(symbol = as.character(expr[[2L]]), value = expr[[3L]]),
                recall = expr[[3L]]
            )
        }
    },
    "assign" = function(expr, env) {
        call <- match.call(base::assign, as.call(expr))
        if (is.character(call$x) && is_top_level(call$pos, -1L, -1) && is_top_level(call$envir)) {
            list(
                assign = list(symbol = call$x, value = call$value),
                recall = call$value
            )
        }
    },
    "delayedAssign" = function(expr, env) {
        call <- match.call(base::delayedAssign, as.call(expr))
        if (is.character(call$x) && is_top_level(call$assign.env)) {
            list(
                assign = list(symbol = call$x, value = call$value),
                recall = call$value
            )
        }
    },
    "makeActiveBinding" = function(expr, env) {
        call <- match.call(base::makeActiveBinding, as.call(e))
        if (is.character(call$sym) && is_top_level(call$env)) {
            list(
                assign = list(symbol = call$sym, value = call$fun, type = "variable")
            )
        }
    },
    "library" = function(expr, env) {
        call <- match.call(base::library, expr)
        if (!isTRUE(call$character.only)) {
            env$packages <- union(env$packages, as.character(call$package))
        }
        NULL
    },
    "require" = function(expr, env) {
        call <- match.call(base::require, call)
        if (!isTRUE(call$character.only)) {
            env$packages <- union(env$packages, as.character(call$package))
        }
        NULL
    },
    "pacman::p_load" = function(expr, env) {
        fun <- if (requireNamespace("pacman")) pacman::p_load else
            function(..., char, install = TRUE,
                     update = getOption("pac_update"),
                     character.only = FALSE) NULL
        call <- match.call(fun, expr, expand.dots = FALSE)
        if (!isTRUE(call$character.only)) {
            env$packages <- union(
                env$packages,
                vapply(call[["..."]], as.character, character(1L))
            )
        }
        NULL
    },
    "system.time" = function(expr, env) list(recall = "expr"),
    "try" = function(expr, env) list(recall = "expr"),
    "tryCatch" = function(expr, env) list(recall = c("expr", "finally")),
    "withCallingHandlers" = function(expr, env) list(recall = "expr"),
    "withRestarts" = function(expr, env) list(recall = "expr"),
    "allowInterrupts" = function(expr, env) list(recall = "expr"),
    "suspendInterrupts" = function(expr, env) list(recall = "expr"),
    "suppressPackageStartupMessages" = function(expr, env) list(recall = "expr"),
    "suppressMessages" = function(expr, env) list(recall = "expr"),
    "suppressWarnings" = function(expr, env) list(recall = "expr")
)

Then we process expressions recursively using these hook functions and act according to the return value as a set of pre-defined behavior (e.g. assign or recall).

In this way, user could easily define new hooks or disable or overwriting existing one by setting it to NULL.

Does this make sense?

@randy3k
Copy link
Member

randy3k commented Jul 10, 2021

Thank you for the fast iteration. I like it better, but the return object still feels like a bit hard to understand and hard to extend, how about instead of returning a list, we pass an object action which contains the methods for assign and recall. I don't know actually if it will work with Recall though.

parser_hooks <- list(
    "{" = function(expr, env, action) {
        action$recall(as.list(expr)[-1L])
    },
    "assign" = function(expr, env, action) {
        call <- match.call(base::assign, as.call(expr))
        if (is.character(call$x) && is_top_level(call$pos, -1L, -1) && is_top_level(call$envir)) {
            action$assign(symbol = call$x, value = call$value)
            action$recall(call$value)
        }
    }
)

@randy3k
Copy link
Member

randy3k commented Jul 10, 2021

Actually, even better, we could have action$update_packages to avoid the need of env.

@renkun-ken
Copy link
Member Author

The action indeed look better. In fact, most fields in env is only accessed by assign action internally except for defines a local variable:

"for" = function(expr, env) {
        if (is.symbol(e <- expr[[2L]])) {
            env$nonfuncts <- c(env$nonfuncts, as.character(e))
        }
        list(recall = expr[[4L]])
    }

How should we include this using action? It will not look good if we put a update_* function for each variable in env. What about something like action$update("nonfuncts", a character vector) and action$update("packages", a character vector)?

@randy3k
Copy link
Member

randy3k commented Jul 10, 2021

maybe like this action$update(packages = .....) and action$update(funts = ....)?

@randy3k
Copy link
Member

randy3k commented Jul 10, 2021

Awesome. I like how it looks like now. Good stuff!

@randy3k
Copy link
Member

randy3k commented Jul 10, 2021

You might want to update the description for future users :).

@renkun-ken
Copy link
Member Author

Thanks for the great suggestions! Descriptions are updated.

@renkun-ken renkun-ken merged commit 2c53931 into REditorSupport:master Jul 10, 2021
@phineas-pta
Copy link

hello, i'm trying to add a custom parser hook to be able to use {import} package, but still no success

options(languageserver.parser_hooks = list("import::from" = function(expr, env) {
	call <- match.call(import::from, expr)
	env$packages <- union(env$packages, as.character(call$.from))
}))

do you know a way to show the log?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants