Error: segfault from C stack overflow #9
Comments
If the problem arises only when ggtext is loaded then it's likely a gridtext issue. However, we will need a much more limited reproducible example. I'd like to ask you to try to whittle away at things until we have isolated exactly what causes the problem. Try to remove anything that you can remove without resolving the problem. Instead of making multiple different figures, see if you can make the same one over and over. See if the problem remains if you don't save and reload figures but instead generate all in one go. Etc. If this is a memory leak in gridtext, it should be possible to bring this down to one very simple figure that you just create over and over until the segfault. Alternatively, I wouldn't be surprised if it was somehow related to saving and reloading plots, which gridtext may not support. (ggplot plots contain a copy of their environment, and that may cause the trouble.) |
Actually, another simple thing to try: before saving any plots, zero out the environment they hold. You can do this as follows:
library(ggplot2)

# make a plot
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  facet_wrap(~Species)

# zero out the environment
p$plot_env <- rlang::new_environment()

# now save
... |
OK, I did the environment thing on all of the plots in the tibble. I saved that tibble as an RDS here:
processing file: PerTrialReportDev.Rmd
|............................. | 17%
|.......................................................... | 33%
|........................................................................................ | 50%
|..................................................................................................................... | 67%
|.................................................................................................................................................. | 83%
|...............................................................................................................................................................................| 100%
label: outputs (with options)
List of 7
$ message : logi FALSE
$ warning : logi FALSE
$ error : logi FALSE
$ fig.retina: logi TRUE
$ fig.height: num 7
$ fig.width : num 8
$ results : chr "asis"
Error: segfault from C stack overflow
To your comment about whittling this down, I've been doing that for nearly the whole week, much to my employer's chagrin. At first I thought it was because I was using development dplyr/vctrs/glue, which is why I started this adventure in the dplyr repo, but yesterday, through guessing and checking, I got to ggtext, and the problem is reproducible on two of my Macs. I'm sorry it's so large, but I've tried really hard to get to where I am. |
Too bad the environment is not the issue. It would have been an easy fix. |
OK. I got it with a simpler example. This takes mtcars and makes 1200 samples of it with a plot that uses ggtext in the caption. |
I'm glad you're able to cut this down, but please remove absolutely everything that is not needed. Make it the smallest possible example that causes the problem. Also, do you need the separate rscript? What if you just make 1000 plots in the .Rmd? What if you just make 1000 plots without using knitr at all? |
I am trying. I'm sorry if this is too piecemeal for you. I spent a couple of hours trying to get Valgrind and lldb to work as Thomas suggested. I downloaded Winston's Docker, but I've not used that stuff enough to get to the problem quickly. |
Just in general, I'm referring to things like this:
Minimal gridtext example:
library(grid)
library(gridtext)
g <- richtext_grob("Hello!")
grid.newpage()
grid.draw(g)
Created on 2020-05-15 by the reprex package (v0.3.0) |
You are getting a C stack overflow, so the likely cause is function calls nested too deeply. To figure out what is happening I would run under a debugger and look at the C call stack when this happens. For me on Linux this runs OK with my standard settings, which have an 8 MB C stack. If I drop that to 1 MB then I get stack overflows in R_ReleaseObject. The implementation there isn't ideal; if this is the issue, then for some reason a humongous number of objects is being put on the |
Shortest one yet! |
It's a stack overflow, not a memory error. So valgrind isn't likely to help. You need to catch the segfault in a debugger. Catalina has made that harder. On a mac or on Linux you can make this fail more quickly by starting a shell and reducing your stack size before running your example. The R extensions manual says It is still a horrible design (somewhere in the C code that is running) to push that many things into the preserved object list and rely on finalizers to eventually clean things up. |
It seems to require knitr to make the stack overflow. I suspect that the issue is really there, but is exacerbated by ggtext/gridtext. I would have never come across this if this particular experiment had better data structure (the outputs are fairly useless using my pipeline for this data set), but it all needs to go through it. |
@ltierney This may be a strange interaction of several different packages. The gridtext package is written with Rcpp, so it never explicitly calls |
Is it possible to work around the issue by simply increasing the stack size? |
In principle yes. On Linux the default size is 8M and that is enough for the example I ran. I forget what the default is on Macs and how much you can raise that without sudo level fiddling. |
It turns out it's 8M also on Mac and there's no simple way to make it bigger (only smaller). Making it bigger requires some special linking flags. https://developer.apple.com/library/archive/qa/qa1419/_index.html |
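To check what stack size your own R session actually sees, base R's `Cstack_info()` reports it in bytes. A minimal sketch follows; the variable names are mine, and `size` can be `NA` on platforms where R cannot determine it:

```r
# Query the C stack limits that this R session is running with.
info <- Cstack_info()

# 'size' and 'current' are reported in bytes; convert the size to MB.
stack_mb <- info[["size"]] / 1024^2
used_pct <- 100 * info[["current"]] / info[["size"]]

cat(sprintf("C stack size: %.1f MB (%.2f%% in use)\n", stack_mb, used_pct))
```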
It may be that on Linux I'm just barely avoiding the segfault. I've committed the changes to avoid deep recursion in It would still be good to understand why this list is getting so large, and why it is only being cleaned up by (eventually) getting finalizers to run. As a design this would be a bad idea, but it may be that there is a better design in place and something isn't working quite right. You can see one aspect of the problem if you run Brandon's example with To expand on this: If I call this function after sourcing Brandon's code it takes about 1000 seconds.
cleanup <- function() {
  old <- 0
  system.time(
    repeat {
      new <- gc()[[1]]
      if (old == new) return(old)
      else old <- new
    })
}
If I change the implementation of the preserved object set to a hash table it is 3 seconds. But with more complex and slightly non-portable code, so not somewhere I particularly want to go. |
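For readers wondering why a list and a hash table differ so much here, the following is a toy model in plain R, not the actual C implementation. It assumes new entries are added at the front of the list and that each release searches from the front, so releasing the oldest entries first is the worst case. The function names and the `visited` counter are hypothetical.

```r
# Toy model only: the real preserved-object list is a C-level structure in R.
# Assumption: new entries go to the front, and each release scans from the front.
n <- 2000
keys <- as.character(seq_len(n))

# List-like set: count how many entries a front-to-back scan would visit
# when everything is released oldest-first.
release_linear <- function(keys) {
  live <- rev(keys)                 # newest entry sits at the front
  visited <- 0
  for (k in keys) {
    pos <- match(k, live)           # position the scan would have to reach
    visited <- visited + pos
    live <- live[-pos]
  }
  list(remaining = length(live), visited = visited)
}

# Hashed set: each release is a single lookup in a hashed environment.
release_hashed <- function(keys) {
  live <- new.env(hash = TRUE)
  for (k in keys) assign(k, TRUE, envir = live)
  visited <- 0
  for (k in keys) {
    rm(list = k, envir = live)
    visited <- visited + 1
  }
  list(remaining = length(ls(live)), visited = visited)
}

linear <- release_linear(keys)
hashed <- release_hashed(keys)
cat("list scan visits:", linear$visited, "\n")
cat("hashed lookups:  ", hashed$visited, "\n")
```

In this model the list version visits n(n+1)/2 entries while the hashed version performs n lookups, which is the kind of gap that could turn seconds into many minutes at a few hundred thousand objects.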
@ltierney Thanks for committing a fix in My hunch is this will ultimately trace back to the fact that gridtext generates thousands of little strings and stores each in a separate R string vector. I had to do it this way to avoid encoding issues on Windows. Once we move to UTF throughout I can just store the strings as regular C++ objects and circumvent all the R memory allocation/release issues. |
But that doesn't explain why they have to end up in the preserve object list. I very rarely use that mechanism; when I do it is usually for some table I want to keep alive for an entire session without making it reachable by some other means, like an R variable. I'm definitely missing something about the usage here. |
Rcpp calls |
I think I'm beginning to see. It looks like Rcpp is misusing this feature. A better way would seem to be for Rcpp to maintain its own table of R objects it wants to keep alive, probably using weak references, and just register that table with |
Something like this seems to kill my R session reliably. For some reason I need the second loop.
library(ggplot2)
library(ggtext)
library(grid)

plot_grob <- function(df) {
  p <- ggplot(data.frame(x = 1, y = 1)) +
    geom_point(aes(x, y)) +
    labs(caption = "<br>Pink dots represent outliers and are removed from downstream analyses.<br>Error bars represent the standard error of the mean.<br>ANOVA model significance *p* < 0.05.<br>Treatments with the same letter are not significantly different at *α* = 0.05 according to Tukey's HSD. ") +
    theme(plot.caption = element_textbox_simple())
  ggplotGrob(p)
}

l <- list()
for (i in 1:1000) {
  cat(i, " ")
  g <- plot_grob()
  grid.newpage()
  grid.draw(g)
  l[[i]] <- g
}

l <- NULL
l <- list()
for (i in 1:1000) {
  cat(i, " ")
  g <- plot_grob()
  grid.newpage()
  grid.draw(g)
  l[[i]] <- g
} |
If I comment out the list assignments in the above code (i.e., this: @bhive01 Now thinking some more about this, the problem in your .Rmd may be that you're generating all the plots in one chunk. Depending on how knitr is written, it may hold on to all output from one chunk before processing it. If you can somehow rewrite your code so plot generation is broken up over several chunks this problem may go away. |
This is in R 4.0.0, not R-devel with my commit I assume? That makes sense -- it is putting about 300K objects into the preserved object list before it starts to clean up; if it does start to clean up during the second loop that takes a long time. |
Actually, for me it's R 3.6. I haven't upgraded to 4.0 yet. |
@clauswilke, you might be on to something there, and that might explain why knitr is connected to my version of the issue. I do know that it generates the PNG files all at once and then pulls them into the HTML document after encoding them. Is this something I should bring up with Yihui? Or is this likely to be "fixed" by Luke's edits? |
I think knitr behaves the only way it can. You could split up the figure generation over more chunks if Luke's patch doesn't solve the issue for you. Or you could write a separate script that generates the pngs and then just pull them into your report directly. |
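A rough sketch of the separate-script idea, assuming each figure can be wrapped in a function that draws it. The helper `save_plots()`, its arguments, and the file naming are hypothetical; base graphics stand in for the real plots, and a ggplot object `p` would be drawn with `function() print(p)` instead:

```r
# Hypothetical helper: draw each plot straight to its own PNG file so that
# no plot objects need to be kept around while the report is knitted.
save_plots <- function(draw_fns, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  paths <- file.path(out_dir, sprintf("plot-%04d.png", seq_along(draw_fns)))
  for (i in seq_along(draw_fns)) {
    png(paths[i], width = 800, height = 700)
    # Always close the device, even if drawing fails.
    tryCatch(draw_fns[[i]](), finally = dev.off())
  }
  paths
}

# Example with two trivial base-graphics plots.
draw_fns <- list(
  function() plot(mtcars$wt, mtcars$mpg),
  function() hist(mtcars$mpg)
)
paths <- save_plots(draw_fns, file.path(tempdir(), "figures"))
```

The report can then pull the files in with `knitr::include_graphics(paths)` rather than regenerating the figures inside a chunk.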
The stack overflow segfault will be gone but, depending on when the cleanups kick in, you may have to go for a cup of coffee or two while waiting for it to finish. |
See here for my initial look at the issue:
tidyverse/dplyr#5243
wilkelab/ggtext#34
I have continued to work on this all week and figured some things out. Essentially, all of the data in my example is required to reproduce the problem and so is ggtext.
You need 3 components:
RDS: https://drive.google.com/file/d/1lhz7kOTZTKT7Dt0D5bcmbwSldiOLwe6F/view?usp=sharing
RMD: https://gist.github.com/bhive01/6f68c3301dbfb75ba0063db2abbc7f64
Rscript: https://gist.github.com/bhive01/e293135e904745a129165794bd4c0032
The RDS is compressed and uncompresses to about 2 GB in RAM. It is a very large multilayered tibble with premade plots embedded. When I run the Rscript with the entire dataset and ggtext, I get:
This happens consistently for me (session_info's below). If you comment out library(ggtext), it runs without issue. I've tried it with only slices 1, 2, 3, 1&2, 1&3, and 2&3, and it works, but with all 3 rows it segfaults (with ggtext). In RStudio you get the bomb; I'm using R-gui.
When I run the script without ggtext:
session_info taken after running
without ggtext: https://gist.github.com/bhive01/491477484034ad109b9f6791d99afa75
with ggtext: https://gist.github.com/bhive01/43e7aa0f1e52e5e53d8f48801153d403