
Faster code analysis #573

Closed · wlandau opened this issue Nov 4, 2018 · 14 comments

wlandau commented Nov 4, 2018

Related: #41. The code analysis for automatic dependency detection is the biggest bottleneck in initialization. Two goals:

  1. Detect true globals in the existing recursion so we can avoid codetools::findGlobals().
  2. Consider a custom C++ backend for the code analysis.

I think we should do (1) before (2).
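
For context, a minimal sketch of the behavior that goal (1) would replicate; the command and names below are made up:

library(codetools)
# findGlobals() analyzes a function, so wrap a made-up command in one.
f <- function() summarize(dataset, mean(x))
findGlobals(f, merge = FALSE)
# Roughly: $functions "mean" "summarize"; $variables "dataset" "x"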

wlandau commented Nov 5, 2018

So apparently, we can remove a lot of overhead just by removing calls to parse(). Here is a profvis profile for a really long command. I was using a custom is_parsable() to filter out bad symbols, and calling it too often created a lot of the overhead. Similarly, we could probably avoid calls to wide_deparse(), which calls deparse().

[profvis screenshot: bottleneck]
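
A rough, self-contained illustration of that parse() overhead (this is not drake's actual code path, just the shape of the problem):

syms <- paste0("target_", seq_len(1e4))
system.time(for (s in syms) parse(text = s)[[1]])  # full R parser per symbol
system.time(for (s in syms) as.symbol(s))          # direct construction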

wlandau commented Nov 5, 2018

After #574, the benchmarks look much better.

library(drake)
n <- 1000
# Build the call max(target_1, ..., target_1000) as an unevaluated expression.
x <- parse(
  text = paste0(
    "max(",
    paste0(paste0("target_", seq_len(n)), collapse = ", "),
    ")"
  )
)[[1]]
pkgconfig::set_config("drake::strings_in_dots" = "literals")
# Profile drake's dependency detection on that command.
profvis::profvis(drake:::command_dependencies(x))

Created on 2018-11-05 by the reprex package (v0.2.1)

wlandau commented Nov 5, 2018

Now, the bottleneck really is codetools.

wlandau changed the title from "Faster custom code analysis" to "Faster code analysis" Nov 5, 2018
wlandau commented Dec 9, 2018

I plan to emulate codetools::findGlobals() by inserting steps into drake's existing code_dependencies() function to check for local variables. Some relevant (internal) functions in codetools, with a sketch of the exported API after the list:

  • getCollectLocalsHandler()
  • collectUsageFun()
  • collectLocalsAssignHandler()
  • collectLocalsForHandler()
  • collectLocalsLocalHandler()

wlandau commented Dec 10, 2018

I learned a lot from the codetools source, particularly getCollectLocalsHandler(). We can modify code_dependencies() to catch locals (a sketch of the hash-table pieces follows this list):

  • Change the name of the globals argument of command_dependencies(). It should be called allowed_globals.
  • Add an exclude argument to walk() for the recursion. Start with the strings in the exclude argument to code_dependencies(). Use an environment-based hash table so the excluded variables carry over to the appropriate recursion steps that follow.
  • Turn the allowed_globals argument into its own hash table.
  • Before appending a variable to results$globals, verify that it is not in exclude and it is in allowed_globals. This is where hash tables could really speed things up.
  • Do not hold onto the locals found in function definitions (either the arguments or the body) or calls to local(). For these cases, make a deep copy of exclude before calling walk(). Creating this deep copy may or may not be slow enough to create a bottleneck. We need to check with profiling.
  • Preregister function argument names in this deep copy of exclude.
  • Walk the default values of function arguments as well as the body.
  • Record local variables assigned with =, <-, and ->.
  • If assign() or delayedAssign() is detected, do a match.call() and add the x argument to exclude.
  • Do not analyze calls to quote(), Quote(), and expression().
  • Add a ton of special tests to compare codetools to drake's code analysis. They should agree on the most important points. We can precompute the expected results so we can remove codetools from the stack entirely instead of putting it in Suggests:.
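
A minimal sketch of the environment-as-hash-table pieces above; the helpers ht_new(), ht_has(), and ht_clone() are hypothetical names, not drake's API:

# Hashed environment whose keys are variable names.
ht_new <- function(x = character(0)) {
  e <- new.env(hash = TRUE, size = max(29L, length(x)))
  for (s in x) assign(s, TRUE, envir = e)
  e
}
ht_has <- function(ht, x) exists(x, envir = ht, inherits = FALSE)
# The "deep copy" step for function definitions and calls to local().
ht_clone <- function(ht) list2env(as.list(ht), hash = TRUE)

exclude <- ht_new(c("i", "tmp"))
allowed_globals <- ht_new(c("target_1", "target_2"))
# The membership test before appending to results$globals:
is_global <- function(name) {
  !ht_has(exclude, name) && ht_has(allowed_globals, name)
}
is_global("target_1")  # TRUE
is_global("tmp")       # FALSE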

I will implement this in branch 573. The merge into master will be after tomorrow's CRAN release of drake 6.2.0. (I need to release 6.2.0 tomorrow because it ensures compatibility with development tibble, which in turn goes to CRAN on December 17.)

wlandau commented Dec 17, 2018

To close this issue, it is enough to:

wlandau commented Dec 18, 2018

In #625, the bottleneck seems to be appending to results$globals. Maybe we should use a hash table.

[profvis screenshot]
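
A self-contained illustration of the append bottleneck versus the hash-table alternative (base R only, not drake's code):

n <- 1e4
# Quadratic: growing the globals vector one element at a time.
slow <- character(0)
for (i in seq_len(n)) slow <- c(slow, paste0("target_", i))

# Near-linear: record names as keys in a hashed environment, collect once.
fast <- new.env(hash = TRUE)
for (i in seq_len(n)) assign(paste0("target_", i), TRUE, envir = fast)
globals <- ls(fast)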

wlandau commented Dec 18, 2018

The last thing I will try before closing this issue is to pre-compute an ignored_symbols_list, a list we can use to create and populate locals faster (via list2env(hash = TRUE)) at the beginning of analyze_code(). After that, we can work on improvements on a case-by-case basis.
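
Roughly what that precomputation could look like (the symbol set here is illustrative, not the real list):

# Computed once, e.g. at package load time:
ignored_symbols <- c("quote", "Quote", "expression")
ignored_symbols_list <- as.list(rep(TRUE, length(ignored_symbols)))
names(ignored_symbols_list) <- ignored_symbols

# Then cheaply, at the top of analyze_code():
locals <- list2env(ignored_symbols_list, hash = TRUE)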

wlandau commented Dec 19, 2018

As of 4cce238, I think I have exhausted the lowest-hanging fruit to speed up the code analysis. A lot more work needs to be done to speed up drake overall. Issues like this one are coming up (though I do not know if OP was using version 6.2.1, which is faster than its predecessors).

I am closing this issue in favor of more targeted ones that will come up later.

wlandau closed this as completed Dec 19, 2018

wlandau commented Dec 19, 2018

I will say that I could really use help with the profiling. Realistic test cases are usually too large for profvis to handle, and more targeted ones are not realistic enough to be useful. cc @bpbond.

bpbond commented Dec 19, 2018

You mentioned this previously, and I'm happy to take a hard look if useful.

wlandau commented Dec 20, 2018

Thanks, that would be fantastic! Profiling is going to be super important going forward.

I expect #630 will solve #572 and speed things up, but there will be more bottlenecks after that.

One thing I have noticed: profvis generates a ton of output, and even on my decently-powered Linux desktop, I cannot interact with the visuals generated by fully-fledged calls to make(). We may want to consider a static profiling tool. I have used gperftools + pprof for Rcpp projects at work, and I think it might be nice to find something similar for R. Maybe jointprof?

wlandau commented Dec 21, 2018

@bpbond, an update: #633 sped up drake quite a lot, and I am currently out of hypotheses about the next potential bottlenecks.

wlandau commented Dec 21, 2018

I think what we're after is Rprof() + the profile package. The overhead example may be a good place to start.
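
A minimal sketch of that workflow with base R's Rprof(); the conversion step assumes the profile package exposes read_rprof() and write_pprof(), which is my recollection of its API:

# Capture a profile of the code under test.
Rprof("prof.out", interval = 0.01)
# ... run the workload here, e.g. a call to drake::make() ...
Rprof(NULL)

# Base R summary:
head(summaryRprof("prof.out")$by.self)

# Hand off to pprof-style tooling:
ds <- profile::read_rprof("prof.out")
profile::write_pprof(ds, "prof.pb.gz")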
