-
Notifications
You must be signed in to change notification settings - Fork 187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add syntax highlighting to code tags #2015
Add syntax highlighting to code tags #2015
Conversation
Ah, I see one issue with this approach: Edit: |
And in fact this doesn't fix #2011 as line highlighting has been so far forgotten. |
I'm not too happy about modifying
How people can use light themes without their eyes burning is beyond me, but fixed.
"Fixes" is just a closing keyword used to link issues. I could've also used close(s|d)?|resolve(s|d)?, but that's semantics and irrelevant. |
app/lib/advanced_text_formatter.rb
Outdated
<<~HTML | ||
<pre><code>#{ERB::Util.h(code).gsub("\n", '<br/>')}</code></pre> | ||
<pre><code title="#{language}">#{ERB::Util.h(code).gsub("\n", '<br/>')}</code></pre> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure title
is the best option here, but I don't know what would be better. I thought about lang
but programming languages are explicitly out of scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also considering using a style class attribute instead, but as mentioned, using the title attribute has the advantage of showing which code language a code block was written in when hovering it.
I think title is the best option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmh... Now that highlighting requires the class attribute anyway, I think it's actually better to also change title to class. I would still like the element to have a title attribute though.
Ah, that doesn't work, because vanilla also sanitises class names. So while setting a title in the backend, converting it to class and then removing it again in the frontend isn't ideal, that's the only approach that works and doesn't allow arbitrary classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use a data-codelang
or something? Do you know if other fediverse software have such a feature, and if so, how they communicate the language being used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a data attribute sounds good.
It seems Misskey has syntax highlighting, but looking at their implementation, I can't figure out which attribute they use, if any.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it looks like Misskey just uses a class attribute: class="language-<langcode>"
on both pre and code elements, which gets then interpreted by their highlighter prism.js, and a "lang" prop internally.
I'm not sure that's what you wanted to know though.
This pull request has merge conflicts that must be resolved before it can be merged. |
862b738
to
cde4ad3
Compare
|
||
Outdated base version: https://github.com/primer/github-syntax-dark | ||
Current colors taken from GitHub's CSS | ||
*/.hljs{color:#c9d1d9;background:#0d1117}.hljs-doctag,.hljs-keyword,.hljs-meta .hljs-keyword,.hljs-template-tag,.hljs-template-variable,.hljs-type,.hljs-variable.language_{color:#ff7b72}.hljs-title,.hljs-title.class_,.hljs-title.class_.inherited__,.hljs-title.function_{color:#d2a8ff}.hljs-attr,.hljs-attribute,.hljs-literal,.hljs-meta,.hljs-number,.hljs-operator,.hljs-selector-attr,.hljs-selector-class,.hljs-selector-id,.hljs-variable{color:#79c0ff}.hljs-meta .hljs-string,.hljs-regexp,.hljs-string{color:#a5d6ff}.hljs-built_in,.hljs-symbol{color:#ffa657}.hljs-code,.hljs-comment,.hljs-formula{color:#8b949e}.hljs-name,.hljs-quote,.hljs-selector-pseudo,.hljs-selector-tag{color:#7ee787}.hljs-subst{color:#c9d1d9}.hljs-section{color:#1f6feb;font-weight:700}.hljs-bullet{color:#f2cc60}.hljs-emphasis{color:#c9d1d9;font-style:italic}.hljs-strong{color:#c9d1d9;font-weight:700}.hljs-addition{color:#aff5b4;background-color:#033a16}.hljs-deletion{color:#ffdcd7;background-color:#67060c} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have permission to use this CSS bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's part of highlight.js, which is licensed under BSD 3-Clause, which permits usage.
GitHub's CSS itself is licenced under MIT.
I'm not sure if the license file included in the module is sufficient compliance or if we would need to add an additional copyright notice.
I copied the styles from the module, because I couldn't import them directly.
Actually I tried it again and importing from the module worked this time.
Maybe I had the wrong path or something when I initially tried it.
But this also means including the style files is unnecessary.
<<~HTML | ||
<pre><code>#{ERB::Util.h(code).gsub("\n", '<br/>')}</code></pre> | ||
<pre><code data-codelang="#{language}">#{ERB::Util.h(code).gsub("\n", '<br/>')}</code></pre> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't suppose language
is guaranteed to be escaped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what kind of escaping would be suitable here.
At least in my testing I couldn't cause any unexpected behaviour, nevertheless I agree accepting user input as is is never a good idea.
Ideally, I would check here if language
is a valid code language, but this would create overhead.
What I could do though is checking that language
only contains allowed characters via regex: /^[a-zA-Z0-9.#+-]+$/
lib/sanitize_ext/sanitize_config.rb
Outdated
@@ -81,6 +81,7 @@ module Config | |||
'blockquote' => %w(cite), | |||
'ol' => %w(start reversed), | |||
'li' => %w(value), | |||
'code' => [:data], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to only allow data-codelang
, I can imagine arbitrary data-
attributes causing other issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally yes and I tried to limit it to only data-codelang
, but nothing I tried worked, it just removed all data-
attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm I must have had a typo in last time I tried or there was some kind of other issue which prevented it from working.
This pull request has merge conflicts that must be resolved before it can be merged. |
9e886f8
to
cd173f9
Compare
This pull request has resolved merge conflicts and is ready for review. |
This pull request has merge conflicts that must be resolved before it can be merged. |
55cd774
to
d9ac464
Compare
This pull request has resolved merge conflicts and is ready for review. |
This pull request has merge conflicts that must be resolved before it can be merged. |
4190e7e
to
1775188
Compare
This pull request has resolved merge conflicts and is ready for review. |
Just noticed that highlighting is missing in the reply-indicator, because it doesn't use the status_content component and while I could move the highlightCode function to a util and reuse it in |
This pull request has merge conflicts that must be resolved before it can be merged. |
This pull request has merge conflicts that must be resolved before it can be merged. |
Adds a new node dependency. Also adds required scss. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
This creates a new constant used to sanitize the html tree, which allows title attributes on code tags. Adds title attribute specifying language to code tags. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Adds a new html util which converts html entities to their symbols Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Adds a new function which runs on contentHtml to add styling to code blocks. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Adds new check which makes sure title attribute is set correctly Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Removes modified sanitation and adds it directly to `MASTODON_STRICT` This is needed so that the title attritube is not being stripped for remote toots. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Removes the ".scss" extension from the import of the highlight style Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Removed convertToChar from `utils/html.js` Let highlight.js run directly on DOM instead of using RegEx. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Instead of using RegEx to replace br tags, use DOM Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Use data-codelang instead of title attribute for user set code language. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Little oversight. Changes the wording of the test case to use language instead of title, as title isn't used anymore. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Having user specified input display anywhere without sanitizing and removing it when invalid is a bad idea. This removes the data attribute when it's not a valid or supported code language. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
These are third-party styles, which introduce lots of lint problems as they don't adhere to style rules. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Removes highlight.scss files. Removes entries in ignoreFiles. Stylesheets are imported from highlight.js module directly. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
This style change was previously skipped and I missed it while rebasing. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Instead of passing arbitrary user input to the frontend and checking if it's valid there, use Sanitize to strip invalid code languages. For that a yaml file was added containing all valid language codes, which needs to be kept in sync with highlight.js. This is inconvenient, but shouldn't be a problem. Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
17c35ce
to
9c48ed6
Compare
Messed up while rebasing... Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
Signed-off-by: Plastikmensch <plastikmensch@users.noreply.github.com>
This pull request has resolved merge conflicts and is ready for review. |
This pull request has merge conflicts that must be resolved before it can be merged. |
Fixes #2011
Important changes:
highlight.js
advanced_text_formatter
to allow data attribute on code tagsMASTODON_STRICT
Preview:

Some considerations:
As this adds a title attribute, when hovering it shows the specified language.
I think that's a good idea, as it might not always be obvious which language code is actually written in.
This also means it has to be sanitised to not show anything a user specifies.
Therefore it is checked, whether it's a valid language or not and if it isn't, the title attribute gets removed.
I didn't want to changeMASTODON_OUTGOING
directly, as I don't know where it's all used and I didn't want to accidentally mess with other stuff.I chose the github-dark theme as it's what's most people will be familiar with.
Ideally more styles would be included and users can choose in which style code is shown, but I considered this out of scope for the initial implementation.