Somehow make the spec searchable (e.g. by generating PDF version) #10218
Comments
Imported From: https://issues.scala-lang.org/browse/SI-10218?orig=1 |
@SethTisue said (edited on Mar 6, 2017 5:01:40 PM UTC): it would be wonderful if some volunteer tackled this. |
In an educational context, I think students learning Scala and also teachers designing Scala courses, in addition to a searchable document, would also greatly benefit from a pdf with the language spec readable off-line and printable from a paginated format. |
Hi, |
@atiqsayyed yes, you're more than welcome to work on this! |
I volunteered on gitter today, to make sure there wouldn't be a permanent record. |
@som-snytt sorry to have missed this issue, can we discuss it to make sure we understand what we have to do here? |
I took a glance but won't have time until a three-day weekend that is not US Labor Day. Halloween is on a Tuesday this year. |
It's very good to be able to search but also nice to be able to print it and view it in a paginated form in a pdf-viewer, so for my Scala teaching efforts here at Lund University, a pdf version would be really valuable. It would be really cool if you both could join forces and achieve some progress on this issue, @som-snytt @atiqsayyed |
I'm not sure, but I think there already exists a PDF version. At least I've used PDFs of previous Scala versions in the past. One solution to this problem would be https://www.algolia.com/. It's free for open-source projects. It would be cool for the rest of the docs too, not only the spec. But someone would need to step up to make it a reality. It wouldn't be difficult though, just:
|
I think a pdf version only exists for 2.11, which I think was written in LaTeX, but now it's Markdown or something. A pdf-generation infrastructure for the language spec of 2.12 (and 2.13 and Dotty etc) would be really nice. |
correct. the change happened several years ago, in 2014, between 2.10 and 2.11 |
@SethTisue I find myself needing this. How can we make such a thing happen? |
Seems something like this https://www.sitepoint.com/creating-pdfs-from-markdown-with-pandoc-and-latex/ could work. Is there someone out there that would like to contribute such a thing? |
i'd probably render markdown to html with some JS library and feed the thing to electron-pdf, athenapdf or maybe chrome (headless). the latter works really well in my experience. where can i find the markdown sources? i might give it a try... |
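As a rough illustration of the headless-Chrome route mentioned above, here is a minimal sketch. The function name `html_to_pdf` and the variables `CHROME_BIN` and `DRY_RUN` are made up for this example, and the browser binary name (`chromium` here) is an assumption that differs between systems. It presumes the Markdown has already been rendered to a single HTML file.

```shell
# Hypothetical wrapper: print an already-rendered HTML file to PDF with headless Chrome.
# CHROME_BIN and DRY_RUN are names introduced for this sketch only.
html_to_pdf() {
  html=$1
  pdf=$2
  chrome=${CHROME_BIN:-chromium}   # assumption: binary may be google-chrome, chromium-browser, ...
  # Headless Chrome takes a URL, so turn the path into an absolute file:// URL.
  case $html in
    /*) url="file://$html" ;;
    *)  url="file://$(pwd)/$html" ;;
  esac
  if [ "${DRY_RUN:-0}" = 1 ]; then
    # Only show the command that would be run.
    printf '%s\n' "$chrome --headless --disable-gpu --print-to-pdf=$pdf $url"
  else
    "$chrome" --headless --disable-gpu "--print-to-pdf=$pdf" "$url"
  fi
}
```

With `DRY_RUN=1` set, calling `html_to_pdf build/spec/all.html spec.pdf` only prints the command, which is handy for checking paths before a long render.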
@ritschwumm in the scala/scala repo under the |
@jvican can’t think what to add besides what’s already in the comments here, or in the linked past discussion |
@SethTisue Consider adding that next time folks update their will, they could include a small endowment or trust to ensure work on a ticket is funded. The resulting metric is the inverse bus factor, how many untimely deaths are required for features to progress. |
@ritschwumm https://github.com/scala/scala/tree/2.12.x/spec. Would be awesome if you give it a try. |
spent a few hours on it today - |
@ritschwumm if your attempt is abandoned, perhaps you could link to a wip branch that someone else could pick up...? |
I think just sticking this on it should work:
<form method="get" action="http://www.google.com/search">
<input type="search" name="q" placeholder="Google site search">
<input type="hidden" name="sitesearch" value="https://www.scala-lang.org/files/archive/spec/2.11/" />
<input type="submit" value="Go!" />
</form>
Also, you can use Algolia, like Play's docs. |
@SethTisue sorry, i don't have a branch - i refuse to sign a CLA, so that wouldn't make much sense. here's what i have so far:
#!/bin/bash
# remove leftovers from a previous run (-f: no error if they are missing)
rm -f spec/all.md
rm -f build/spec/all.html
rm -f test.pdf
# TODO index needs layout toc
chapters='
01-lexical-syntax
02-identifiers-names-and-scopes
03-types
04-basic-declarations-and-definitions
05-classes-and-objects
06-expressions
07-implicits
08-pattern-matching
09-top-level-definitions
10-xml-expressions-and-patterns
11-annotations
12-the-scala-standard-library
13-syntax-summary
14-references
15-changelog
'
# prefix chapters with a special anchor
(
#echo "---"
#echo "title: Scala Language Specification"
#echo "layout: default"
#echo "---"
#echo ""
for i in $chapters; do
echo >&2 "### $i"
echo '<a name="CHAPTER-'"$i"'"></a>'
cat "spec/$i.md"
#| tr '\n' '\0' | perl -pe 's/^---(.*?)---//' | tr '\0' '\n'
done
) |
# remove target page name from links to anchors
perl -pe "s/\[([^\]]+)\]\(\d\d-[a-z-]+\.html(#[^)]+)\)/[\1](\2)/g" |
# point links to chapters to the CHAPTER anchor
perl -pe "s/\[([^\]]+)\]\((\d\d-[a-z-]+)\.html\)/[\1](#CHAPTER-\2)/g" |
cat >spec/all.md
# TODO add a chapter-anchor
# \[ ([^\]]+) \]
# \( (\d\d-[a-z-]+\.html) \)
#[Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).
#<a name="pookie"></a>
bundle exec jekyll build -d build/spec/ -s spec/ --baseurl="."
docker run --security-opt seccomp:unconfined --rm -v "$(pwd):/converted/" arachnysdocker/athenapdf athenapdf -D 1000 build/spec/all.html test.pdf
evince test.pdf
|
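To make the two link-rewriting steps in the script above easier to follow, here is a small standalone sketch of the same idea using `sed -E` instead of `perl`. That swap is only for this illustration, and the function name `rewrite_links` is hypothetical. It is applied to the sample line quoted in the script's own comments.

```shell
# rewrite_links: hypothetical name for this sketch; reads Markdown on stdin.
rewrite_links() {
  sed -E \
    -e 's/\[([^]]+)\]\([0-9]{2}-[a-z-]+\.html(#[^)]+)\)/[\1](\2)/g' \
    -e 's/\[([^]]+)\]\(([0-9]{2}-[a-z-]+)\.html\)/[\1](#CHAPTER-\2)/g'
}
# 1st expression: [text](NN-chapter.html#anchor) -> [text](#anchor)
# 2nd expression: [text](NN-chapter.html)        -> [text](#CHAPTER-NN-chapter)

printf '%s\n' '[Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).' | rewrite_links
# prints: [Unicode escape](#CHAPTER-01-lexical-syntax) or by an [escape sequence](#escape-sequences).
```

Links that already point at an in-page anchor are left alone, so the rewrite only affects cross-chapter references once all chapters live in one file.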
Since you don't want to sign a CLA, could you clarify under which license you post this code? |
haha, good question :) |
@ritschwumm, thanks -- public domain (== WTFPL) is fine with me. Just looking to avoid any licensing issues for the project, which is ultimately what the CLA is about. |
Bjorn Regnell wrote:
It's very good to be able to search but also nice to be able to print it and view it in a paginated form in a pdf-viewer ...
I think it would also be useful if the snippets of BNF for the grammar were hyperlinked, e.g., in "x ::= y z", the "y" would be a link to the production defining y. (And maybe the "x" could be a link to a list of the things that refer to x, with each item in the list linked to the production containing the reference.)
For a hacked-together partial simulation of the forward linking, see https://dsbos.github.io/temp-scala-hyperlinked-spec/2016-11-13_2.12_output/09-top-level-definitions.html and sibling files.
Daniel
|
@ritschwumm I had to make some changes in your script to generate a valid pdf document, but that contribution is great, I wouldn't have been able to figure it out myself. Thank you. I also managed to create a mobi file out of the |
If I understand correctly, the objective here is to generate a PDF for one of the Scala websites (the one containing the Scala specification). |
I think it would be great if, as a first step, we get a whole html file (like the one made by @ritschwumm) that has all the chapters and which is readable. From there, we can easily convert to PDF and to ebook formats through athenapdf (or maybe pandoc too?) and kindlegen. |
Is it something like this we need: |
I agree with this:
There's already support for that in my hepek project. It uses headless Chrome via Selenium, waits for JS to load and snapshots its HTML (see example here). 😃 I'll try to tackle this over the weekend! The hardest part will probably be mapping the Markdown files to the corresponding hepek abstractions. |
how about a slightly different approach: if i remember correctly, the main obstacle was the irregular link structure of the original files. maybe we can just make them more regular somehow? apart from that i'm not convinced that regex search&replace is the way to go - manipulating meaningful data structures is so much easier... is there a simple way to have those - some parser, maybe? |
I'm more than happy for someone to rework the markdown sources if that makes generating pdf/html/mobi... easier! |
As I see it though, these are the two true challenges:
There's not a lot of value in changing the content of the markdown sources if these two problems are not tackled (and also I would favor the smallest possible diff that makes this possible 😄). As soon as we have a unified markdown file with all the chapters, we can use pandoc to turn the spec into an ebook or PDF. |
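A minimal sketch of the "unified markdown file, then pandoc" idea, under some assumptions: the chapter files follow the `NN-name.md` naming seen earlier in this thread, the helper name `concat_chapters` is invented for this example, and neither the per-chapter front matter nor the cross-chapter links are handled here. The pandoc calls are shown as comments because PDF output additionally needs a LaTeX engine installed.

```shell
# Hypothetical helper: concatenate numbered chapter files (NN-name.md) in order.
concat_chapters() {
  src_dir=$1
  out_file=$2
  : > "$out_file"                        # start from an empty output file
  for f in "$src_dir"/[0-9][0-9]-*.md; do
    [ -e "$f" ] || continue              # glob matched nothing
    cat "$f" >> "$out_file"
    printf '\n' >> "$out_file"           # blank line between chapters
  done
}

# Usage (assumes pandoc is installed; file names are illustrative):
#   concat_chapters spec build/all.md
#   pandoc build/all.md -o build/spec.pdf
#   pandoc build/all.md -o build/spec.epub
```

The glob relies on the shell's sorted pathname expansion, so the two-digit prefixes decide the chapter order, and unnumbered files such as an index page are skipped.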
@jvican how is mathjax problematic? |
Maybe it wasn't mathjax but whatever is being used for the notation of the language. In the PDF I generated a while ago, the notation was poorly displayed and it rendered most of the snippets explaining Scala's grammar unreadable. |
scala/scala#7432 is merged! So we can now generate a PDF locally. I'm not closing this ticket yet, though, because there is work left to do: I need to actually publish the PDF on our website. Soon! |
whoa, we're live! https://scala-lang.org/files/archive/spec/2.13/spec.pdf |
For those who like PDF versions of things, see also the discussion at scala/scala3#10767 (comment) about a PDF version of the Scala 3 Reference |
I can't imagine this is a new issue but I can't find an old one, so.
Please make the specification searchable (available as a single page as a quick fix?) or, better, figure out how to make a proper index. The language is complex enough that questions about its behavior come up a lot for users, and it's often quite hard to find the relevant section in the spec. For example, I had a question last night about import priority and found it in the introduction to identifiers, names and scopes in the 2.9 spec PDF and it happens to still be there (but not in the section on import statements, which I found via the TOC, which is where I looked first).