Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error: missing toolbar in self_contained gitbook with pandoc 2.7.3 #739

Closed
Pindar777 opened this issue Jul 9, 2019 · 25 comments · Fixed by #784
Closed

error: missing toolbar in self_contained gitbook with pandoc 2.7.3 #739

Pindar777 opened this issue Jul 9, 2019 · 25 comments · Fixed by #784
Labels
bug an unexpected problem or unintended behavior
Milestone

Comments

@Pindar777
Copy link
Contributor

After updating the RStudio-Version and at the same time the pandoc version my self_contained gitbook got knitted without toolbar. I could reproduce the error with pandoc 2.7.3 and visual studio code bookdown extension.

After downgrading to a pandoc version included in RStudio 1.14 the error was solved.

@yihui
Copy link
Member

yihui commented Jul 10, 2019

Could you provide a minimal reproducible example as required in the issue guide? It will be great if you can provide the HTML output, too (you can upload them as a zip archive here).

@Pindar777
Copy link
Contributor Author

Pindar777 commented Jul 10, 2019

bookdown-demo-master-html-selfcontained.zip

This is the reproducible example that I just made from scratch:

  1. install GNU R 3.6.1 for current Windows 10
  2. install RStudio 1.2.1335 for current Windows 10
  3. install packages devtools | tidyverse | bookdown
  4. download https://github.com/rstudio/bookdown-demo and get rid of first level folder
  5. add in _output.yml "self_contained: TRUE" (e.g. SLOC before "css: style.css")
  6. knit with missings toolbar

In the zip file the html files of the book are included.

@yihui
Copy link
Member

yihui commented Jul 10, 2019

Perfect. Could you also provide a copy of the output with the lower version of Pandoc so that I can compare it with the buggy output produced by Pandoc 2.7? Thanks!

@Pindar777
Copy link
Contributor Author

bookdown-demo-master-html-selfcontained_pandoc_old.zip

These are the (selfcontained) files generated by "pandoc_old" included in RStudio 1.1.4.
I hope, that it helps you @yihui to find the bug.

Cheers

@Pindar777
Copy link
Contributor Author

@yihui is there news? how could i help you to solve the issue?

@yihui
Copy link
Member

yihui commented Sep 3, 2019

Sorry, but I won't have time for this issue this year (I have to focus on another project). I would start from comparing the output .html files generated by the different versions of Pandoc and see what's going on.

@Pindar777
Copy link
Contributor Author

@yihui Thanks for the update. Fair enough. At the moment I'm fine with working on the old pandoc and looking forward to get rid of the obstacle to create a standalone version.
If I find a promising hint I check back too.

@RLesur
Copy link
Collaborator

RLesur commented Oct 10, 2019

I haven't seen this issue. I'll have a look.

@RLesur RLesur self-assigned this Oct 10, 2019
@RLesur
Copy link
Collaborator

RLesur commented Oct 10, 2019

@Pindar777 I can reproduce the issue for R 3.6.1, the latest CRAN versions of bookdown, rmarkdown et alii and Pandoc >= 2.0.6, see https://rlesur.gitlab.io/bookdown-issue739/.

But I am struggling to reproduce the self contained gitbook with "old Pandoc".
Please, could you indicate your session info and pandoc version used to produce bookdown-demo-master-html-selfcontained_pandoc_old.zip?

@Pindar777
Copy link
Contributor Author

@RLesur Thank you for your efforts.

Well, I think I know what you mean: it takes ages!
But it's working with this setting:

rmarkdown::pandoc_version()
[1] '1.19.2.1'

xfun::session_info("bookdown")
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)
Locale:
LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
LC_TIME=German_Germany.1252
Package version:
base64enc_0.1.3 bookdown_0.11 digest_0.6.20 evaluate_0.14
glue_1.3.1 graphics_3.6.1 grDevices_3.6.1 highr_0.8
htmltools_0.3.6 jsonlite_1.6 knitr_1.23 magrittr_1.5
markdown_1.0 methods_3.6.1 mime_0.7 Rcpp_1.0.2
rmarkdown_1.13 stats_3.6.1 stringi_1.4.3 stringr_1.4.0
tinytex_0.14 tools_3.6.1 utils_3.6.1 xfun_0.10
yaml_2.2.0 `

@RLesur
Copy link
Collaborator

RLesur commented Oct 12, 2019

@Pindar777 Thanks a lot! This is super useful. I can reproduce with "old Pandoc". It seems to be related to Pandoc 2. I haven't spotted the issue yet. Will dive into this.

@Pindar777
Copy link
Contributor Author

@RLesur Good to hear!
I also assumed that the major version switch in Pandoc is the problem.
Hopefully a minor tweak is sufficient to solve the issue. Good luck and thanks again!

@RLesur
Copy link
Collaborator

RLesur commented Oct 12, 2019

I think I've spotted the problem.

Before Pandoc 2, JS scripts (and CSS stylesheets) were encoded in base64 for self contained HTML documents. Pandoc 2 has a different behavior (see jgm/pandoc#3423).

bookdown::gitbook() post-processor parses the HTML produced by Pandoc to move the JS scripts after the body:

bookdown/R/gitbook.R

Lines 130 to 139 in 2227cd6

# gitbook JS scripts only work after the DOM has been loaded, so move them
# from head to foot
i = grep('^\\s*<script src=".+/gitbook([^/]+)?/js/[.a-z-]+[.]js"></script>\\s*$', head)
# it is probably a self-contained page, so look for base64 encoded scripts
if (length(i) == 0) i = grep(
'^\\s*<script src="data:application/x-javascript;base64,[^"]+"></script>\\s*$', head
)
s = head[i]; head[i] = ''
j = grep('<!--bookdown:config-->', foot)[1]
foot[j] = paste(c(s, foot[j]), collapse = '\n')

The trick used to detect JS scripts for self-contained documents does not work anymore for Pandoc 2.

Therefore, the JS scripts are not moved at the end of the HTML documents and the gitbook JS library fails.

@RLesur RLesur removed their assignment Oct 12, 2019
@RLesur RLesur added the bug an unexpected problem or unintended behavior label Oct 12, 2019
@Pindar777
Copy link
Contributor Author

@RLesur That's it!

@cderv
Copy link
Collaborator

cderv commented Oct 13, 2019

Thanks for the investigation @RLesur !
I could help modify the regex if you want, but we need to deal with older pandoc version and treat encoded and not encoded cases.

However, you the expert in JS, and you could have other idea than identify with a regex, and move at the end to deal with

gitbook JS scripts only work after the DOM has been loaded

What do you think ?

@RLesur
Copy link
Collaborator

RLesur commented Oct 13, 2019

I'm not a JS expert but I know that there exists some JS alternatives to ensure that the scripts run after the DOM is loaded.
For instance, each JS script could be embedded in $(document).ready(function(){...});.
I've locally tried this solution to debug this issue and I'ld be able to send a fix rather quickly.
However, this would be a quite different approach because bookdown internals mainly use regex.
I wonder whether there is some advantage to use regex over JS.

@cderv
Copy link
Collaborator

cderv commented Oct 13, 2019

I wonder whether there is some advantage to use regex over JS.

I continued to dig a bit on how to solve this with regex. Pandoc changes effects are that now js scripts are included as is inside <scripts> tags when self contained instead of being base64 encoding. This means that to move from header to footer those scripts part, it can't be done in one step regex as now because it can be multilines. It would mean to find the start of the script node, then the end and moving all those lines.
I guess in a way moving some nodes in the html would be easier using a parser like xml2, find the scripts node, and move them at the end. I would be an option here for this post rendering step.
I even wonder if it wouldn't be better if those post pandoc conversion where done during conversion using lua filters. However, pandoc <2 compatibility would not be possible.

I think there is a choice that has to be done, on how to deal with this the clever way for bookdown future.

@RLesur
Copy link
Collaborator

RLesur commented Oct 13, 2019

I even wonder if it wouldn't be better if those post pandoc conversion where done during conversion using lua filters.

Lua filters are useful to modify the AST. One also can modify metadata but I'm not sure that moving JS scripts from header-includes to after-body is straightforward.

cderv added a commit to cderv/bookdown that referenced this issue Oct 13, 2019
As investigated in rstudio#739, pandoc2 modified the way it handles js scripts in self contained document. They are not always base64 encoded by default, and are rather included in `<script>` tags as-is. This means that for gitbook post processing, multiline blocks must be identified and moved.
@cderv
Copy link
Collaborator

cderv commented Oct 13, 2019

This means that to move from header to footer those scripts part, it can't be done in one step regex as now because it can be multilines. It would mean to find the start of the script node, then the end and moving all those lines.

A quick solution using regex as of now in bookdown is quite easy to implement in fact. I'll work on that.

For the long run and prevent this type of issue, changing to another solution can be interesting but at the cost of rewriting some part of bookdown 😁

I'll open a PR when work is ready in my branch

@cderv
Copy link
Collaborator

cderv commented Oct 13, 2019

@Pindar777 you can test with the PR I just opened.

remotes::install_github("rstudio/bookdown#784")

In our tests with @RLesur, it seems to work well now

@yihui
Copy link
Member

yihui commented Oct 14, 2019

I didn't read this issue or the PR #784 carefully, but if we have to make a choice between Pandoc 1.19 and 2.x, I'll be totally happy to drop the support for Pandoc 1.19 in bookdown (I'd be more conservative only in the rmarkdown package).

@RLesur
Copy link
Collaborator

RLesur commented Oct 14, 2019

IMO, this issue does not require to make a choice between Pandoc 1.19 and 2.x. @cderv's PR works for both.

yihui pushed a commit that referenced this issue Oct 14, 2019
…document (#784)

allow to move multiline script node in html

As investigated in #739, Pandoc 2 modified the way it handles js scripts in self contained document. They are not always base64 encoded by default, and are rather included in `<script>` tags as-is. This means that for gitbook post processing, multiline blocks must be identified and moved.
@yihui
Copy link
Member

yihui commented Oct 14, 2019

Excellent! Just merged the PR. Thanks again for the wonderful help!

@yihui yihui added this to the v0.15 milestone Oct 14, 2019
@Pindar777
Copy link
Contributor Author

@cderv I just complied the standalone gitbook with pandoc 2.7.3: It is working perfectly and blazing fast! Thanks for the help.

@github-actions
Copy link

github-actions bot commented Nov 6, 2020

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants