-
-
Notifications
You must be signed in to change notification settings - Fork 975
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FR] *.knit.md
should have a unique name to allow parallelization
#2454
Comments
This seems like a good idea. I wonder what it could break in other tools pipeline doing this 🤔 (also having Quarto in mind) But is renaming the In your testing does changing the knit output name was only thing to do ? Current possible small hack if you: the knit output extension can be changed for now using if hash is enough, we could do something like diff --git a/R/render.R b/R/render.R
index 9d73610d..60045dee 100644
--- a/R/render.R
+++ b/R/render.R
@@ -395,8 +395,12 @@ render <- function(input,
# as `.md~` will be ignored.
input <- basename(input)
knit_input <- input
+ # input_based meta
+ meta_ext_fun <- getOption("rmarkdown.knit.meta_ext_fun", function(input) "knit")
+ if (!is.function(meta_ext_fun))
+ stop2("'rmarkdown.knit.meta_ext_fun' has been provided but it should be a function with 'input' as argument")
knit_output <- intermediates_loc(
- file_with_meta_ext(input, "knit", getOption("rmarkdown.knit.ext", "md"))
+ file_with_meta_ext(input, meta_ext_fun(input), getOption("rmarkdown.knit.ext", "md"))
)
intermediates <- c(intermediates, knit_output) This way you could do Just an idea if really changing the knit output filename helps. |
I actually didn't test only renaming the
No (didn't test that as stated above).
Actually, I don't think using a hash that is based solely on the Thanks about the code suggestion with a new |
Ok - I agree with that and was suprised with the suggestion. Thanks for confirming it is not enough.
We could try that, but I believe all the intermediate directories generation should be changed. This is what #499 is about and not easy to do - this part is really sensible to breaking change. And also there is the specific case of LaTeX which expect a lot of its intermediates to be local to the input file. Though I agree this would be very interesting to solve issues with parallel rendering. Quite a challenge that requires some dedicated time on this specifically, and probably lots of testings. |
Well, instead of using a unique/isolated I.e. rendering No? |
Probably. I wonder how much we rely on the stem for the input file. Like but we could probably track every occurence and change this. I just don't know how LaTeX would behave with this. Probably ok as the this worth trying for sure. We just think this is no an easy change - @yihui has more experiece about path handling over the years in R Markdown |
rmarkdown / knitr / Pandoc can be slow when generating 1000s of PDFs, especially when using xelatex instead of the default pdflatex renderer. An easy and straightforward way to speed things up on multicore systems is to rely on the means provided by the awesome future framework, especially the furrr package, a drop-in replacement for purrr.
Now if one tries to combine
rmarkdown::render()
with sayfuture_pwalk()
and themultisession
backend, the rendering will most likely fail due to the individual workers interfering each other by trying to read and write from/to the same intermediary*.knit.md
file.This can be worked around by giving the
input
filename a unique name in the map/walk function before feeding it tormarkdown::render()
(or setting a differentintermediates_dir
; but varying the intermediates dir is discouraged as far as I understand and introduced additional bugs in my tests).A solution at the core of rmarkdown / knitr for this issue would be to
give the intermediateadd a sufficiently large random string to all intermediate output filenames (per*.knit.md
a unique filename from the beginning. Ideally one that is based on its actual content (a fast hash function like the ones from the xxHash family would be suitable I guess;rlang::hash_file()
provides XXH128).rmarkdown::render()
invocation).What do you think?
As I realized after submission, this issue could be considered a duplicate of #499.
The text was updated successfully, but these errors were encountered: