Syntax highlighting skipped on literal_block with unicode #4225
Comments
Additional info: the html produced from
The bits Similarly I tried with the latex formatter of pygmentize, I picked up its produced code, replaced all So, it does seem that even on the Pygments side the Unicode code points are treated as some sort of error.
Even if the Coq lexer in Pygments 2.2.0 does not yet handle such input, the question remains why Sphinx would abort syntax highlighting altogether rather than keeping the
About the good first issue tag, I don't think we have a definitive policy here, so in the end I removed it as I don't like so much the idea of it actually... ;-)
Now sphinx uses
Oh, I had missed the
I don't know why pygments raises an error because I don't know about the Coq language. Anyway, current Sphinx always uses
The code is valid Coq, as it is accepted by the compiler. Does that mean this should additionally be reported to Pygments?
@abooij it does look like it is primarily a Pygments problem.
@jfbu, why do you say that this is "primarily" a pygments problem? After all, pygments' own output is usable, although indeed not perfect.
@abooij 1) the "raiseonerror" Pygments filter does not seem to have great granularity: it raises an exception as soon as the lexer generates an error token, and 2) if Unicode characters are ok in Coq input (I don't know), then why does the Coq lexer generate an error token? Unless something is wrong in the way Sphinx calls Pygments, this puts most of the problem on the Pygments side.
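To make the behaviour discussed here concrete, the following is a minimal sketch and not Sphinx's actual code. `AsciiOnlyLexer` is a hypothetical toy lexer, written only so that the `¬` character reliably produces an error token regardless of how the real Coq lexer behaves in a given Pygments version. The three helper functions are likewise illustrative names. The sketch contrasts highlighting with the `raiseonerror` filter, highlighting without it, and a fallback to plain text when the filter raises.

```python
from pygments import highlight
from pygments.filters import ErrorToken
from pygments.formatters import HtmlFormatter
from pygments.lexer import RegexLexer
from pygments.lexers import TextLexer
from pygments.token import Name, Punctuation, Whitespace


class AsciiOnlyLexer(RegexLexer):
    """Hypothetical toy lexer: any character not matched below becomes an Error token."""

    name = "AsciiOnly"
    tokens = {
        "root": [
            (r"\s+", Whitespace),
            (r"[A-Za-z_]\w*", Name),
            (r"[-<>(){}:.]", Punctuation),
        ]
    }


def highlight_strict(source):
    # With the filter, the first Error token aborts highlighting with an exception.
    lexer = AsciiOnlyLexer()
    lexer.add_filter("raiseonerror")
    return highlight(source, lexer, HtmlFormatter())


def highlight_lenient(source):
    # Without the filter, Error tokens are simply passed on to the formatter.
    return highlight(source, AsciiOnlyLexer(), HtmlFormatter())


def highlight_with_fallback(source):
    # One possible policy: give up on highlighting and emit plain text instead.
    try:
        return highlight_strict(source)
    except ErrorToken:
        return highlight(source, TextLexer(), HtmlFormatter())


if __name__ == "__main__":
    code = "Definition logeq_both_false {X Y : UU} : ¬X -> ¬Y -> (X <-> Y)."
    print(highlight_lenient(code))
    print(highlight_with_fallback(code))
```

In the lenient variant the rest of the line is still tokenized and only the offending character is marked as an error, which is the "do the best it can" behaviour requested in this thread. The fallback variant loses all highlighting for the block because of a single character.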
@jfbu Well indeed there is a problem on the pygments side. However, I highly doubt pygments will ever be able to get Coq syntax (or any other moderately complicated modern syntax, for that matter) completely right. In those cases, it would be preferable if sphinx generated something workable, rather than give up completely as soon as some subtle aspect of a language's syntax is used. (Unicode is OK in coq and widely used.)
I think @tk0miya concurred that an option to let Sphinx not use Issue #4249 explains that Pygments' LaTeX formatter has a bug in that respect because it uses a
Can this option to disable raiseonerror be added, or alternatively, could Sphinx just not use it at all? I want to syntax highlight some IPython console code and if the output contains a Unicode character (like it does with This is how, for instance, the highlighter on GitHub works, and also virtually every Markdown parser I've ever used. If you put some text in
In [1]: %timeit sum([i for i in range(10000)])
335 µs ± 15.2 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
It seems that in general it should be possible to choose a highlighter and have it just do the best it can with the input. It will at least definitely color the parts of it that definitely are Python (or whatever language), but the way it works now, a single "bad" character means it doesn't color anything.
Fixed by 7d8df06. A
When using unicode in a language that accepts it, on a block of code that is accepted by pygments, the doc build breaks.
Problem
I am trying to get syntax highlighting for the following code:
Definition logeq_both_false {X Y : UU} : ¬X -> ¬Y -> (X <-> Y).
As you can see, this uses the unicode symbol ¬.
Procedure to reproduce the problem
As a test case, I try to compile the following index.rst in an otherwise vanilla sphinx project:
Error logs / results
Expected results
The original code is processed fine by pygments directly, hence sphinx should, too.
And if sphinx insists on breaking, it should tell me why.
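As a rough diagnostic for the "tell me why" part, one can ask a Pygments lexer directly which pieces of the input it could not tokenize. The helper `find_error_tokens` below is a hypothetical name for illustration, not part of Sphinx or Pygments. Whether the Coq lexer reports anything for this input depends on the installed Pygments version, so no expected output is shown.

```python
from pygments.lexers import get_lexer_by_name
from pygments.token import Error


def find_error_tokens(source, lexer_name):
    """Return (offset, text) pairs for every Error token the named lexer emits."""
    lexer = get_lexer_by_name(lexer_name)
    return [
        (offset, value)
        for offset, ttype, value in lexer.get_tokens_unprocessed(source)
        if ttype is Error
    ]


if __name__ == "__main__":
    code = "Definition logeq_both_false {X Y : UU} : ¬X -> ¬Y -> (X <-> Y)."
    for offset, value in find_error_tokens(code, "coq"):
        print(f"offset {offset}: lexer could not tokenize {value!r}")
```

An empty result means the lexer accepted the whole input; a non-empty one pinpoints the characters that would trip a filter such as `raiseonerror`.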
Environment info