-
Notifications
You must be signed in to change notification settings - Fork 282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to specify the HightlightJS language #391
Comments
Perhaps relevant: diff2html/src/ui/js/diff2html-ui-base.ts Line 164 in 9ed73c7
|
@joshgoebel Thanks. I believe html2diff writes that itself from reading the diff and extracting the extension. My current fallback is to get that attribute and manually change it, then reparse highlightjs on the file, but that's not ideal obviously. |
Yeah, but It's not terrible either for small amounts of data. :-)
Did you mean language classes? I think so based on your snaps. I would definitely call that a bug. Of course a REAL diff diff could contain multiple languages (multiple files in one diff) but I'm not sure if that's relevant here. Definitely randomly changing language from line to line isn't desirable. Auto-detect is not intended for use against individual lines and the noise from doing that would be MUCH higher than the signal. |
Yes. Different language classes. The |
Sure, we often do have extensions as aliases. Though only the canonical ones... which might have implications for languages Exercism uses other than Elixir with many different runtimes that use different extensions.
yes, if
Yeah, sounds about right. |
Hi @iHiD, Unfortunately at the moment it is not possible to specify languages. The code is basically passing the extensions to highlight.js and getting the language from it. I would be up to receive a contribution with the second option if we cannot do the first. What do you think? Do you have any other ideas? |
Ok lets circle back.
We don't guess the language based on the file extension, we have a discrete mapping table... file extensions are directly aliased to languages. I suppose you could say then that's it's diff2html "guessing" based on the file extension - but that is what almost all text editors do, and it's most often a great heuristic... are there cases where that alone wouldn't work for Exercism if we actually made sure all the extensions were properly mapped to languages? IE: Are there cases where the extension is going to be vague or misleading or missing? The issue of random highlighting per line is a different issue in my mind. If the language wasn't known it would be preferable to just not highlighting at all, which is what I suggested in the other issue I opened. |
@rtfpessoa Hey 🙂 Thanks for the reply. So my preferred fix would be to add a config option to
@joshgoebel You probably know this better than me, but I imagine we have situations where multiple languages share extensions. I'm pretty sure we ran into that in v2. But equally we can make a really solid guess at which language that extension is for as we have loads of context here. I'm not sure which "we" hat you're wearing here - ie where the mapping would be. In highlightjs - sure that'd be brilliant. In Exercism - fine too as long as we have a mechanism to pass that in. In diff2html, yep - that makes sense too, although it would probably be better from d2h to be getting that upstream from highlightjs. So if you were happy to do it upstream in highlightjs, that would work best. I'd still have some concern about clashes, but it's less of a concern. FYI: This was the list we compiled for v2 (for Prism but 🤷): https://github.com/exercism/v2-website/blob/main/config/initializers/prism.rb#L187-L315 |
That is indeed a problem. I think that's only true with one of our 184 languages now, but perhaps we're missing some extensions as well...
Well, as I think I mentioned elsewhere Highlight.js traditionally includes canonical extensions. For example we should include That said it's trivial to register additional aliases with the Highlight.js API if Exercism had to maintain a small additional list that we wouldn't upstream to core. The larger issue with using purely extension here is the case where extension is ambiguous. Could you come up with an exact example to share? If that happens in the real world I think that would be a reasonable argument that the current behavior of using only file extension isn't enough for all use cases and that the library should consider allowing one to pass the correct language (when file extension isn't sufficient). |
I think that solution is going to be to hackish. Diffs tend go have multiple files that usually spread across multiple languages, if we just accept a fallback language seems like it is going to be misused really easy and not have a nice outcome. That is also the reason why I would prefer to receive an extension mapping, since that way it would support the use case correctly.
That sounds weird conceptually (although I would imagine it to happen in real world 😞), I cannot remember any case that I would know.
I mean we in the sense from the diff2html side (either me or you) could contribute it.
Currently diff2html uses the Taking into consideration the @joshgoebel answer seems like it would have to be handled mostly in diff2html side as configuration. EDIT: Updated after seeing previous message |
@joshgoebel just out of curiosity can you share which one it is? |
I honestly don't recall. You can find the full list at: https://github.com/highlightjs/highlight.js/blob/main/SUPPORTED_LANGUAGES.md That's why I was asking to see if @iHiD has a more concrete real-world example. |
Yeah, really you'd have to supporting passing in a map, not a single language: {
"booger.asm" => "mips_assembly"
} There are also ambiguous extensions:
So even when the file extension gets you close, sometimes it doesn't get you all the way. |
That makes sense, thanks for the examples. Then definitely diff2html would need to receive at least a suffix that could be either whole filenames or just extensions for it to correctly pass languages to highlight.js. |
This would solve my problem. Can you point me to how best to do that please Josh (maybe on Exercism's Slack so as not to take this issue on a tangent too much?). Thanks for your help 🙂 I think I'm done here with that as a solution, so happy to close this. I also still like the idea of specifying a language, and the map approach would work great for that as well, but I'd rather take the additional mappings, and if there's some that "miss", just leave them unhighlighted (as per the other issue), or clash (then so be it - it'll be rare). As this hasn't come up as an issue before, it's probably low value enough that it doesn't huge matter (Exercism is unusual in having a lot of different files/types so we get edge-cases). Thanks for your time @rtfpessoa and building out such a useful library. It'll be featuring in a few places in Exercism v3, including this one that we built out today: |
I would also be interested in the final solution since it might be interesting to use in diff2html. Maybe it would be to invoke
Great that it is being useful for you 🙏 |
It's just registerAliases: https://highlightjs.readthedocs.io/en/latest/api.html#registeraliases For example to "fix" Elixir outside of core: hljs.registerAliases(["exs"], { languageName: "elixir"}); Creates an alias from
I'm not sure if a 3rd party library should register aliases without informing the user, but perhaps if it was documented that you were doing that... at the very least you could mention in your readme how easy it is to do and show an example. The issue is that someone might use the library in other places and this changes the behavior of the library GLOBALLY. There is no way to define scoped aliases (though it also isn't super hard to keep the mapping table outside Highlight.js either). |
Hello. Thanks for this great library.
At the moment, it seems highlightJS is guessing the language based on the the file extension (I think Im reading that correctly here). That often results in incorrect matchings. If we know the language of the file that the diff is for, we'd like to specify it manually. Is there a way to do that? If not, could an option be added please? If you'd like us to add the option, could you point us in the direction you'd like us to do it please? And if its already an option and I've missed it, my apologies :)
The text was updated successfully, but these errors were encountered: