New role ghref for pointing to files on github #56

gares · 2021-08-10T14:49:50Z

I find myself adding links to stuff which is not a coqdoc generated html file, so I came up with this hack.

In my first rst comment I do:

.. role:: elpi-api(ghref)
   :pattern: ^(% \[$name|% $name|pred $name)

.. role:: lib(elpi-api)
   :src: https://github.com/LPCIC/coq-elpi/blob/master/elpi/coq-lib.elpi

.. role:: builtin(elpi-api)
   :src: https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi

.. role:: elpi-type(ghref)
   :pattern: ^kind $name

.. role:: type(elpi-type)
   :src: https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi

and then I can easily put links to my (builtin or defined) APIs or data type declarations.

The first one `Elpi Command hello.` sets the current program to hello.
Since it is declared as a `Command` some code is loaded automatically:

* APIs (eg :builtin:`coq.say`) and data types (eg Coq :type:`term` s) are loaded from
  `coq-builtin.elpi <https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi>`_
* some utilities, like :lib:`copy` or `whd1` are loaded from
  `elpi-command-template.elpi <https://github.com/LPCIC/coq-elpi/blob/master/elpi/elpi-command-template.elpi>`_

The pattern is used to locate the line (in the raw file) so that the href can point to $file#L$line, in turn GH highlights the line.

Todo:

permalinks (scrape for <a class="d-none js-permalink-shortcut" data-hotkey="y" href="/gares/alectryon/blob/f256fe49367d1cadbeeb4285e9786a3fc79f286a/CHANGES.rst">Permalink</a>)
shorter syntax for src, eg org + repo + path
permanent cache

Feedback is welcome.

Disclaimer: last time I wrote some python it was version 1 (yes, there were no objects) so my code is likely to suck.

cpitclaudel · 2021-08-11T06:18:13Z

The code looks good. Some high-level questions:

Do you use plain docutils or do you use Sphinx?
The cache isn't persisted, so this will re-fetch the github page every time the document is built, right? We'd need to be a bit careful (for example alectryon.el runs docutils in the background while you edit a file in Emacs, so it's not ideal if it every time it runs it connects to the internet)
Did you consider using Chrome's new "link to text" feature? (Firefox doesn't implement it yet, but they're planning to, at some point): For example here's a link to "coq.say" https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi#:~:text=coq.say

alectryon/docutils.py

gares · 2021-08-11T09:50:05Z

Thanks, I did update the code and implemented permalinks.

Do you use plain docutils or do you use Sphinx?

I'd like my tutorials to stay .v files, so I don't know ;-) I call alectryon... I'm pretty clueless on doc tools, I'm happy If I can stay ignorant on this matter ;-)

The cache isn't persisted, so this will re-fetch the github page every time the document is built, right? We'd need to be a bit careful (for example alectryon.el runs docutils in the background while you edit a file in Emacs, so it's not ideal if it every time it runs it connects to the internet)

Sure, but I have no clue on how to make it persistent. Any pointers?
It now contains the permalinks, so we can check for outdated entries.

Did you consider using Chrome's new "link to text" feature? (Firefox doesn't implement it yet, but they're planning to, at some point): For example here's a link to "coq.say" https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi#:~:text=coq.say

This looks like a very natural feature, I did google it but could not find it. I still see a few advantages in doing things this way:

I use firefox 😇
I can check that the pattern is found at doc preparation time, and error if not. This is quite important IMO.
GH #L also supports ranges, I was wondering about having a pattern_begin and pattern_end options

What you propose makes a lot of sense for non-GH links, IMO.

gares · 2021-08-20T11:13:40Z

@cpitclaudel I'd like some hints on the cache thing, and any other comment on this PR, so that I can finish it.

cpitclaudel · 2021-08-20T11:41:05Z

Yep, will get to it. It took me a while because I was working on the Pygments stuff first ^^

alectryon/docutils.py

cpitclaudel · 2021-08-22T04:23:49Z

Just had a new look, the code looks reasonable (though an API that involves eval is a bit scary).

On the cache: Sphinx has decent support for caching natively; for a more general solution you could just open a file and update it every time you try to resolve one of these links. If that's too bad for performance (opening and writing a file for every such link), then you would use a transform: you'd delay most of the work of resolving links and then resolve all links at once (along with updating a cache) in a "Transform". Not sure if it's worth the effort.

I'm expecting this code will evolve as the use-case gets refined, and I'm not entirely comfortable with running arbitrary code every time Alectryon compiles a document, so I'm not sure about merging this as-is. For your use case, if Alectryon had a --plugin flag that you could pass a Python file to, would it be enough? That's easy to add, and custom plugins are a pretty common use case (I originally thought people would use custom drivers built on top of Alectryon for things like that, but a --plugin option might be nicer?).

gares · 2021-08-22T06:34:05Z

Hum --plugin is fine, but I prefer to clean it up if it can be merged, eventually. You tell me.

About eval, I think a replace expression is more than enough, I just don't know which is the best way to pass something like s/this/that/.

About the cache, I can write to a file. I was just wondering if alectryon already had a cache (invalidation) mechanism to hook to.

gares · 2021-08-22T08:26:38Z

I should have addressed all issues. The cache is still to be done.

cpitclaudel · 2021-08-22T18:17:06Z

Hum --plugin is fine, but I prefer to clean it up if it can be merged, eventually. You tell me.

I'd prefer to keep this external for now, and if it catches on we can merge it. Let me know if that works for you.

About eval, I think a replace expression is more than enough, I just don't know which is the best way to pass something like s/this/that/.

Probably as an argument with that s/…/…/ syntax, parsed with a regexp.

About the cache, I can write to a file. I was just wondering if alectryon already had a cache (invalidation) mechanism to hook to.

There is, but it's only keeping track of the Coq parts, not of anything related to docutils. Sphinx has an extensible caching mechanism, but it doesn't work with plain Docutils. I think your best bet is a separate cache file.

gares · 2021-08-23T08:10:33Z

I'd prefer to keep this external for now, and if it catches on we can merge it. Let me know if that works for you.

OK, thanks for you reviews by the way.

I guess the --plugin will let me both register one extra role and monkey patch the pygment style, right?

gares · 2021-08-23T08:11:37Z

Probably as an argument with that s/…/…/ syntax, parsed with a regexp.

I changed that already into 2 options, the second one defaulting to the empty string which is what I currently need.

gares · 2021-08-23T18:08:50Z

I'd prefer to keep this external for now, and if it catches on we can merge it. Let me know if that works for you.

I guess the --plugin will let me both register one extra role and monkey patch the pygment style, right?

please have a look at the last commit to see what I mean

cpitclaudel · 2021-08-24T16:21:25Z

OK, I've looked into nested parser. I pushed an example in recipes/custom_driver.py, and I've improved the documentation for custom driver. Can you have a look? I tried to make the example as close as possible to your use case.

I think that's the most flexible solution, since with a plugin there's always the risk that things get imported in the wrong order, whereas with a custom driver your code is guaranteed to run first. I think that's the right solution, but if after looking into the new documentation and example you think a --plugin flag would be much better I can add that, too.

gares · 2021-08-24T20:21:56Z

Thanks, I'll look at it more carefully, but after a quick look I still don't get how you go back to he host language from, say, CLexer.
For example in

alectryon/recipes/custom_highlighting.v

Line 54 in 95a31c4

Definition c_program := "C:{{

you nest C inside Gallina, but no Gallina inside C.

gares · 2021-08-24T20:23:12Z

As a reference, see the second code block here: https://lpcic.github.io/coq-elpi/tutorial_coq_elpi_HOAS.html#quotations-and-antiquotations

cpitclaudel · 2021-08-24T21:40:37Z

you nest C inside Gallina, but no Gallina inside C.

Would it not work to mark the Gallina-inside-ELPI tokens as "other"?
IOW, you don't get back into Gallina from ELPI; instead, you mark all Gallina beforehand. Would that work?

gares · 2021-08-25T08:39:01Z

Now I recall what my problem was. In order to use using you need to find the end of the embedded language. You do:

            ('(C)(:[{][{])(.+?)([}][}])',
             bygroups(token.Keyword, token.Operator, using(CLexer), token.Operator)),

but this does not work for me since C:{{ bla {{ more }} extra }} would run CLexer on bla and more but not extra.
So, unless python regex can count, I need to use the pygments stack machine, I believe.

cpitclaudel · 2021-08-25T20:42:17Z

Thanks, you're right, and sorry for being a bit slow at this.

I think your approach is likely to be the best there is. What you could do, for sure, is a three-step process: first use a simple lexer that tags each bit of code as Coq or ELPI, and then use two nested DelegatingLexers to highlight both of these separately. But that would be neither much simpler nor cleaner, and worse, it's arguably a bit less correct, because each individual parser sees a truncated version of the source (if you have Definition a := lp:{{ xyz {{ Some n }} abc }}, what the Coq lexer will see is Definition a := Some n ., and what the ELPI lexer will see is xyz abc, and neither of those are "right" (in corner cases that might even confuse the lexers entirely, for example in Definition b := lp:X. the Coq lexer would see an invalid sentence Definition b := .

So your approach is good, I think. I will leave specific comments inline.

alectryon/pygments_lexer.py

alectryon/pygments.py

alectryon/elpi_lexer.py

alectryon/docutils.py

cpitclaudel · 2021-08-25T20:58:50Z

Yup, just went through everything. I think your approach works nicely, and I can't think of something much smoother.

gares · 2021-08-26T07:59:04Z

Thanks for all the help, I could move to a custom driver: LPCIC/coq-elpi#283
Looking forward to a release including --pygments-style, so that I can drop the ugly pinning.

cpitclaudel · 2021-08-26T14:24:47Z

Woohoo! Thanks so much for your patience and for helping me understand the issue(s) :) The new tutorials look great.

cpitclaudel reviewed Aug 11, 2021

View reviewed changes