
Faster code analysis #573

Closed · wlandau opened this issue Nov 4, 2018 · 14 comments

wlandau commented Nov 4, 2018

Related: #41. The code analysis for automatic dependency detection is the biggest bottleneck in initialization. Two goals:

  1. Detect true globals in the existing recursion so we can avoid codetools::findGlobals().
  2. Consider a custom C++ backend for the code analysis.

I think we should do (1) before (2).
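
For context, a minimal sketch of the behavior that goal (1) would replicate; the command and names below are made up:

library(codetools)
# findGlobals() analyzes a function, so wrap a made-up command in one.
f <- function() summarize(dataset, mean(x))
findGlobals(f, merge = FALSE)
# Roughly: $functions "mean" "summarize"; $variables "dataset" "x"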

wlandau commented Nov 5, 2018

So apparently, we can remove a lot of overhead just by removing calls to parse(). Here is a profvis profile for a really long command. I was using a custom is_parsable() to filter out bad symbols, and calling it too often created a lot of the overhead. Similarly, we could probably avoid calls to wide_deparse(), which calls deparse().

[profvis screenshot: bottleneck]
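
A rough, self-contained illustration of that parse() overhead (this is not drake's actual code path, just the shape of the problem):

syms <- paste0("target_", seq_len(1e4))
system.time(for (s in syms) parse(text = s)[[1]])  # full R parser per symbol
system.time(for (s in syms) as.symbol(s))          # direct construction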

wlandau commented Nov 5, 2018

After #574, the benchmarks look much better.

library(drake)
n <- 1000
# Build the call max(target_1, ..., target_1000) as an unevaluated expression.
x <- parse(
  text = paste0(
    "max(",
    paste0(paste0("target_", seq_len(n)), collapse = ", "),
    ")"
  )
)[[1]]
pkgconfig::set_config("drake::strings_in_dots" = "literals")
# Profile drake's dependency detection on that command.
profvis::profvis(drake:::command_dependencies(x))

Created on 2018-11-05 by the reprex package (v0.2.1)

wlandau commented Nov 5, 2018

Now, the bottleneck really is codetools.

wlandau changed the title from "Faster custom code analysis" to "Faster code analysis" Nov 5, 2018
wlandau commented Dec 9, 2018

I plan to emulate codetools::findGlobals() by inserting steps into drake's existing code_dependencies() function to check for local variables. Some relevant (internal) functions in codetools, with a sketch of the exported API after the list:

  • getCollectLocalsHandler()
  • collectUsageFun()
  • collectLocalsAssignHandler()
  • collectLocalsForHandler()
  • collectLocalsLocalHandler()

wlandau commented Dec 10, 2018

I learned a lot from the codetools source, particularly getCollectLocalsHandler(). We can modify code_dependencies() to catch locals (a sketch of the hash-table pieces follows this list):

  • Change the name of the globals argument of command_dependencies(). It should be called allowed_globals.
  • Add an exclude argument to walk() for the recursion. Start with the strings in the exclude argument to code_dependencies(). Use an environment-based hash table so the excluded variables carry over to the appropriate recursion steps that follow.
  • Turn the allowed_globals argument into its own hash table.
  • Before appending a variable to results$globals, verify that it is not in exclude and it is in allowed_globals. This is where hash tables could really speed things up.
  • Do not hold onto the locals found in function definitions (either the arguments or the body) or calls to local(). For these cases, make a deep copy of exclude before calling walk(). Creating this deep copy may or may not be slow enough to create a bottleneck. We need to check with profiling.
  • Preregister function argument names in this deep copy of exclude.
  • Walk the default values of function arguments as well as the body.
  • Record local variables assigned with =, <-, and ->.
  • If assign() or delayedAssign() is detected, do a match.call() and add the x argument to exclude.
  • Do not analyze calls to quote(), Quote(), and expression().
  • Add a ton of special tests to compare codetools to drake's code analysis. They should agree on the most important points. We can precompute the expected results so we can remove codetools from the stack entirely instead of putting it in Suggests:.
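
A minimal sketch of the environment-as-hash-table pieces above; the helpers ht_new(), ht_has(), and ht_clone() are hypothetical names, not drake's API:

# Hashed environment whose keys are variable names.
ht_new <- function(x = character(0)) {
  e <- new.env(hash = TRUE, size = max(29L, length(x)))
  for (s in x) assign(s, TRUE, envir = e)
  e
}
ht_has <- function(ht, x) exists(x, envir = ht, inherits = FALSE)
# The "deep copy" step for function definitions and calls to local().
ht_clone <- function(ht) list2env(as.list(ht), hash = TRUE)

exclude <- ht_new(c("i", "tmp"))
allowed_globals <- ht_new(c("target_1", "target_2"))
# The membership test before appending to results$globals:
is_global <- function(name) {
  !ht_has(exclude, name) && ht_has(allowed_globals, name)
}
is_global("target_1")  # TRUE
is_global("tmp")       # FALSE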

I will implement this in branch 573. The merge into master will be after tomorrow's CRAN release of drake 6.2.0. (I need to release 6.2.0 tomorrow because it ensures compatibility with development tibble, which in turn goes to CRAN on December 17.)

wlandau commented Dec 17, 2018

To close this issue, it is enough to:

wlandau commented Dec 18, 2018

In #625, the bottleneck seems to be appending to results$globals. Maybe we should use a hash table.

[profvis screenshot]
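
A self-contained illustration of the append bottleneck versus the hash-table alternative (base R only, not drake's code):

n <- 1e4
# Quadratic: growing the globals vector one element at a time.
slow <- character(0)
for (i in seq_len(n)) slow <- c(slow, paste0("target_", i))

# Near-linear: record names as keys in a hashed environment, collect once.
fast <- new.env(hash = TRUE)
for (i in seq_len(n)) assign(paste0("target_", i), TRUE, envir = fast)
globals <- ls(fast)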

wlandau commented Dec 18, 2018

The last thing I will try before closing this issue is to pre-compute an ignored_symbols_list, a list we can use to create and populate locals faster (via list2env(hash = TRUE)) at the beginning of analyze_code(). After that, we can work on improvements on a case-by-case basis.
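
Roughly what that precomputation could look like (the symbol set here is illustrative, not the real list):

# Computed once, e.g. at package load time:
ignored_symbols <- c("quote", "Quote", "expression")
ignored_symbols_list <- as.list(rep(TRUE, length(ignored_symbols)))
names(ignored_symbols_list) <- ignored_symbols

# Then cheaply, at the top of analyze_code():
locals <- list2env(ignored_symbols_list, hash = TRUE)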

wlandau commented Dec 19, 2018

As of 4cce238, I think I have exhausted the lowest-hanging fruit to speed up the code analysis. A lot more work needs to be done to speed up drake overall. Issues like this one are coming up (though I do not know if OP was using version 6.2.1, which is faster than its predecessors).

I am closing this issue in favor of more targeted ones that will come up later.

wlandau closed this as completed Dec 19, 2018

wlandau commented Dec 19, 2018

I will say that I could really use help with the profiling. Realistic test cases are usually too large for profvis to handle, and more targeted ones are not realistic enough to be useful. cc @bpbond.

bpbond commented Dec 19, 2018

You mentioned this previously, and I'm happy to take a hard look if useful.

wlandau commented Dec 20, 2018

Thanks, that would be fantastic! Profiling is going to be super important going forward.

I expect #630 will solve #572 and speed things up, but there will be more bottlenecks after that.

One thing I have noticed: profvis generates a ton of output, and even on my decently-powered Linux desktop, I cannot interact with the visuals generated by fully-fledged calls to make(). We may want to consider a static profiling tool. I have used gperftools + pprof for Rcpp projects at work, and I think it might be nice to find something similar for R. Maybe jointprof?

wlandau commented Dec 21, 2018

@bpbond, an update: #633 sped up drake quite a lot, and I am currently out of hypotheses about the next potential bottlenecks.

wlandau commented Dec 21, 2018

I think what we're after is Rprof() + the profile package. The overhead example may be a good place to start.
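
A minimal sketch of that workflow with base R's Rprof(); the conversion step assumes the profile package exposes read_rprof() and write_pprof(), which is my recollection of its API:

# Capture a profile of the code under test.
Rprof("prof.out", interval = 0.01)
# ... run the workload here, e.g. a call to drake::make() ...
Rprof(NULL)

# Base R summary:
head(summaryRprof("prof.out")$by.self)

# Hand off to pprof-style tooling:
ds <- profile::read_rprof("prof.out")
profile::write_pprof(ds, "prof.pb.gz")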
