WISH: Way to call function with explicit "missing" arguments #17

Open
HenrikBengtsson opened this issue Mar 10, 2016 · 9 comments

@HenrikBengtsson
Owner

(adopted from Wiki entry)

Wish

A way to specify that an argument value should be considered "missing", e.g. foo(x, y=missing()).

Background / Issue

Some functions evaluate code conditionally on whether an argument was explicitly specified or not; if it was not, we say the argument is "missing". For example, base::sample() sets size to x or length(x) (depending on the value of x) if and only if size is missing:

> sample
function (x, size, replace = FALSE, prob = NULL)
{
    if (length(x) == 1L && is.numeric(x) && x >= 1) {
        if (missing(size))
            size <- x
        sample.int(x, size, replace, prob)
    }
    else {
        if (missing(size))
            size <- length(x)
        x[sample.int(length(x), size, replace, prob)]
    }
}
<bytecode: 0x000000000b909e88>
<environment: namespace:base>
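
To make the role of missing() concrete, here is a minimal sketch. The function describe_size() is hypothetical (it is not part of base R) and only illustrates that missing() reports whether the caller supplied the argument at all, independent of any value:

```r
# Hypothetical function, for illustration only.
describe_size <- function(x, size) {
  if (missing(size)) {
    # The caller supplied nothing for `size`: fall back to length(x)
    size <- length(x)
    origin <- "default"
  } else {
    origin <- "explicit"
  }
  list(size = size, origin = origin)
}

res_default  <- describe_size(1:5)
res_explicit <- describe_size(1:5, size = 2)
```

The fallback lives inside the function body, which is why a caller cannot trigger it by passing some particular value.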

If R supported explicit "missing" values, we could do things like:

my_sample <- function(x, size) {
  if (!missing(size)) size <- 2*size
  sample(x, size=size)
}

Instead, we have to write:

my_sample <- function(x, size) {
  if (missing(size)) {
    sample(x)
  } else {
    size <- 2*size
    sample(x, size=size)
  }
}

Comment

A common design pattern is to let NULL represent a "missing" value:

sample2 <- function(x, size = NULL, replace = FALSE, prob = NULL)
{
    if (length(x) == 1L && is.numeric(x) && x >= 1) {
        if (is.null(size))
            size <- x
        sample.int(x, size, replace, prob)
    }
    else {
        if (is.null(size))
            size <- length(x)
        x[sample.int(length(x), size, replace, prob)]
    }
}

Note that if the function being called follows this convention, we can do:

my_sample <- function(x, size=NULL) {
  if (!is.null(size)) size <- 2*size
  sample(x, size=size)
}
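
A self-contained sketch of the NULL convention may help. Both take_first() and wrap_take_first() are hypothetical names introduced only for this illustration; the point is that the wrapper can forward its argument unconditionally because the callee treats NULL as "not supplied":

```r
# Hypothetical callee that follows the NULL-means-missing convention.
take_first <- function(x, n = NULL) {
  if (is.null(n)) n <- length(x)   # NULL plays the role of "missing"
  x[seq_len(n)]
}

# Hypothetical wrapper: `n` is forwarded as-is, with no branching on the call.
wrap_take_first <- function(x, n = NULL) {
  if (!is.null(n)) n <- 2 * n
  take_first(x, n = n)
}

r1 <- wrap_take_first(1:10)          # n stays NULL, so all elements
r2 <- wrap_take_first(1:10, n = 2)   # n becomes 4
```

The trade-off is that NULL can then no longer be passed as an ordinary value for that argument.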

See also

@HenrikBengtsson
Owner Author

Here is the discussion from the old wiki entry (not sure if it's still relevant):

  • Explicitly specify the value of an argument as "missing". For instance, calling value <- missing() and foo(x=value) should resolve missing(x) as TRUE. Comment: See matrixStats discussion.
  • ...
    • This is already possible by doing foo(x=,). This works even if foo only takes one argument. /GB
      • Clarified my wish; the value needs to be passed explicitly, i.e. I need to be able to call foo(a=v1, b=v2, c=v3) where zero or more of v1, v2 and v3 should be missing, so sometimes foo(a=, b=v2, c=v3) and sometimes foo(a=v1, b=, c=). /HB
        • So long as you are in a call frame and the value is being passed from above, you can do this, e.g. the code below. Can you tell me more about the context of the use case here? /GB

f = function(a) f2(b = a)
f2 = function(b) missing(b)
f()
[1] TRUE
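
For the case where zero or more of several values should be missing, one workaround that exists today is to build the argument list and drop the entries that should be missing before calling do.call(). The sketch below is only an illustration: call_with_supplied() is a hypothetical helper, and it assumes NULL is used as the marker for "leave this one out":

```r
foo <- function(a, b, c) c(a = missing(a), b = missing(b), c = missing(c))

# Hypothetical helper: call `fn` with only the non-NULL entries of `args`.
# Assumption: NULL marks an argument that should be treated as missing.
call_with_supplied <- function(fn, args) {
  keep <- !vapply(args, is.null, logical(1))
  do.call(fn, args[keep])
}

res <- call_with_supplied(foo, list(a = 1, b = NULL, c = 3))
```

Here foo() is effectively called as foo(a=1, c=3), so only b is reported as missing.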

@daroczig

Following up on our Twitter conversation, I use match.call in such cases. Very basic example:

my_sample <- function(x, size) {

    ## capture original call
    mc <- match.call()

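    ## update size (if it's available); mc$size is an unevaluated
    ## expression, so build a new expression instead of multiplying it
    if (!is.null(mc$size)) mc$size <- bquote(.(mc$size) * 2)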

    ## update call to use base::sample
    mc[[1]] <- quote(sample)

    ## evaluate the updated call
    eval(mc, envir = parent.frame())

}
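
As a small illustration of what match.call() returns, consider the sketch below; show_call() is a hypothetical function used only here. It shows that an argument that was not supplied has no entry in the matched call, and that a supplied argument is stored as an unevaluated expression:

```r
# Hypothetical function that simply returns its own matched call.
show_call <- function(x, size) match.call()

mc1 <- show_call(1:3)          # `size` not supplied
mc2 <- show_call(1:3, n + 1)   # `size` supplied as an expression (never evaluated)

no_size <- is.null(mc1$size)   # no `size` element in the call
is_expr <- is.call(mc2$size)   # the unevaluated expression n + 1

# Doubling therefore means building a new expression around the old one.
doubled <- bquote(.(mc2$size) * 2)
value <- eval(doubled, list(n = 4))
```

Because mc2$size is language rather than a number, arithmetic applied to it directly only works when the caller happened to pass a literal constant.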

I think manipulating calls is a very powerful R feature and I really enjoy doing it. E.g. you can define a set of functions with the my_ prefix and reuse the simple example above to handle the missing issue in all of them, by replacing the static quote part with e.g. as.symbol(sub('^my_', '', deparse(mc[[1]]))). Something like:

my_head <- my_tail <- function(x, n) {

    ## capture original call
    mc <- match.call()

    ## update n (if it's available); mc$n is an unevaluated expression
    if (!is.null(mc$n)) mc$n <- bquote(.(mc$n) * 2)

    ## update call to use the function named without the my_ prefix
    mc[[1]] <- as.symbol(sub('^my_', '', deparse(mc[[1]])))

    ## evaluate the updated call
    eval(mc, envir = parent.frame())

}

@HenrikBengtsson
Owner Author

Maybe my my_sample() example wasn't the best; it wasn't meant to be a tweaked version of another function. It could basically be any function that calls another function of interest one or more times (slightly updated version below).

Even if match.call(), do.call() etc. work, they require you to compute on the language, which makes the code very hard to follow, not to mention hard to run static code validation on.

I'd still argue that

my_function <- function(x, size=NULL) {
  if (!is.null(size)) size <- 2*size
  y <- sample(x, size=size)
  mean(y)
}

is much more readable etc.

@HenrikBengtsson
Owner Author

Here's a sketch mockup (just to illustrate the idea) of a backward compatible missing() function that returns a place-holder for "missing" values if called without arguments. It's very recursive ;)

print.MISSING_VALUE <- function(x, ...) cat("<MISSING VALUE>\n")
str.MISSING_VALUE <- function(x, ...) cat(" <MISSING VALUE>\n")
as.character.MISSING_VALUE <- function(x, ...) character(0L)

missing <- local({
  UNIQUE_VALUE <- .Machine$integer.max-1L
  MISSING_VALUE <- structure(UNIQUE_VALUE, class=c("MISSING_VALUE"))
  function(x) {
    if (base::missing(x)) return(MISSING_VALUE)
    expr <- substitute(base::missing(x), list(x=substitute(x)))
    if (eval(expr, envir=parent.frame())) return(TRUE)
    if (identical(x, MISSING_VALUE)) return(TRUE)
    FALSE
  }
})

Examples

missing() without argument

> missing()
<MISSING VALUE>

> value <- missing()
> missing(value)
[1] TRUE

It's backward compatible

foo <- function(a=0, b=0) c(a=missing(a), b=missing(b))
print(m <- foo())
stopifnot(identical(m, c(a=TRUE, b=TRUE)))

print(m <- foo(1))
stopifnot(identical(m, c(a=FALSE, b=TRUE)))

print(m <- foo(b=2))
stopifnot(identical(m, c(a=TRUE, b=FALSE)))

print(m <- foo(1, 2))
stopifnot(identical(m, c(a=FALSE, b=FALSE)))

print(m <- foo(b=2, a=1))
stopifnot(identical(m, c(a=FALSE, b=FALSE)))

print(m <- foo(a=1, b=))
stopifnot(identical(m, c(a=FALSE, b=TRUE)))

print(m <- foo(a=, b=))
stopifnot(identical(m, c(a=TRUE, b=TRUE)))

It can be used to explicitly specify an argument to be missing

print(m <- foo(a=missing(), b=))
stopifnot(identical(m, c(a=TRUE, b=TRUE)))

print(m <- foo(a=missing(), b=missing()))
stopifnot(identical(m, c(a=TRUE, b=TRUE)))

Of course, it's not safe - requires low-level changes to R

bar <- function(a=1, b=2) { a + b }
print(m <- bar())
stopifnot(m == 3)
print(m <- bar(a=3))
stopifnot(m == 5)
print(m <- bar(a=3, b=))
stopifnot(m == 5)

but, not surprisingly, the placeholder is not replaced by the default value:

print(m <- bar(a=3, b=missing()))
stopifnot(m == 5)
Error: m == 5 is not TRUE
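
The failure arises because R's default-argument mechanism knows nothing about the placeholder. Without changes to R itself, each callee would have to check for it explicitly. The following is a rough sketch of that user-level workaround; MISSING_SENTINEL, is_absent() and bar2() are hypothetical names, and the sentinel merely stands in for the MISSING_VALUE placeholder of the mockup:

```r
# Hypothetical sentinel, standing in for the mockup's MISSING_VALUE.
MISSING_SENTINEL <- structure(list(), class = "MISSING_SENTINEL")
is_absent <- function(value) inherits(value, "MISSING_SENTINEL")

# bar2() opts in by checking for the sentinel itself; the defaults in the
# signature are not applied automatically, so they are repeated in the body.
bar2 <- function(a = 1, b = 2) {
  if (is_absent(a)) a <- 1
  if (is_absent(b)) b <- 2
  a + b
}

m_default  <- bar2()
m_sentinel <- bar2(a = 3, b = MISSING_SENTINEL)
```

Every function would need this boilerplate, which is the argument for solving the problem inside R rather than in user code.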

@hadley

hadley commented Mar 11, 2016

Wouldn't a simpler fix be to make missing() iterate through the stack of promises? I think an R based solution is inherently fragile - you need to solve this sort of problem in C because you can more easily inspect a promise without evaluating it.
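
To illustrate what following promises means in practice, here is a small sketch of how base::missing() behaves when an argument is forwarded. The function names are hypothetical. Missingness is propagated when the argument is passed on as a bare symbol, but not when it is wrapped in an expression:

```r
inner <- function(b) missing(b)

# Argument forwarded as a bare symbol: missingness is propagated.
outer_symbol <- function(a) inner(a)

# Argument wrapped in an expression: inner() receives a promise for
# `identity(a)`, and missing() does not look inside that expression.
outer_expr <- function(a) inner(identity(a))

via_symbol <- outer_symbol()
via_expr   <- outer_expr()
```

In the second case the promise is never forced, so no error is raised even though a has no value.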

@HenrikBengtsson
Owner Author

@hadley, yes, this needs to be implemented by R itself internally (i.e. in C), hopefully just in the code for setting up and executing function calls (wherever that is located?). My example in the comment was just to further illustrate the idea with a "runnable" example (I've updated it to say "sketch mockup").

Wouldn't a simpler fix be to make missing() iterate through the stack of promises?

This comment is a bit too sparse for me; is it referring to my mockup example or are you suggesting how base::missing() should be updated so it can be used as foo(a=missing())? No longer relevant?

@hadley

hadley commented Mar 11, 2016

You could implement it yourself in the same way that lazyeval does (to improve substitute()): https://github.com/hadley/lazyeval/blob/master/src/lazy.c#L17. It can be in a package, doesn't need to be in base R.

@jangorecki

One more approach not listed here is to pass substitute() into a call that is then evaluated:

my_sample <- function(x, size) {
  eval(call('sample', x, if(missing(size)) substitute() else size*2))
}

In such a simple example it isn't very useful.
I actually asked a related question on SO a while ago.
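
A related sketch, under the assumption that the arguments are collected in a list first: the empty symbol can be placed in an argument list and passed through do.call(), which makes the corresponding argument missing in the callee. The function foo() below mirrors the one used in the mockup examples:

```r
foo <- function(a = 0, b = 0) c(a = missing(a), b = missing(b))

# quote(expr = ) yields R's empty symbol. It is kept inside a list so that
# it is never evaluated as a variable on its own.
args <- list(a = 1, b = quote(expr = ))
res <- do.call(foo, args)
```

Assigning the empty symbol to a plain variable and then reading that variable raises an "argument is missing" error, so it is safest to create it directly inside the list.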

@lionel-

lionel- commented Apr 6, 2018

This can be:

my_sample <- function(x, size) {
  if (!missing(size)) {
    size <- size * 2
  }
  sample(x, size)
}


5 participants