-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discuss: 3rd party language packages spec #2328
Comments
Also an anti-pattern (in my opinion) I'm seeing is:
Resulting in custom registration code that hides the language name/key: var hljsDefineRobotsTxt = require('highlightjs-robots-txt');
hljsDefineRobotsTxt(hljs); I'm not sure what the big advantage here is and I'd prefer to instead see: module.exports = function(hljs) {
return {
case_insensitive: true,
contains: [
... or (same thing really): function tsql(hljs) { ... }
module.exports = tsl And README examples like: var tsqlGrammar = require('highlightjs-robots-tsql');
hljs.registerLanguage('tsql',tsqlGrammar); IE, what is exported is the grammar function itself, nothing else. Prefer to let people do the registration themselves. This will allow a simpler build process and also I like that it keeps the naming explicit rather than implicit. |
From another thread about testing 3rd party languages in isolation.
That might wait until later (or require someone else to step up). My focus right now is to make things "just work" in the context of a working highlight.js checkout. (Since you need that to build the actual library anyways, etc). And of course that's immediately most useful to me, as a maintainer of the library - everything in one place. So checkout highlight.js and then vendored language projects, run full highlight.js test suite. No 3rd step. The other side of things (as you suggest) would be to have your language stand-alone... then you'd pull in highlight.js as a dependency via npm/yarn... and you might have some sort of scaffold to run your custom tests plus "standardized" detect/markup test... of course currently none of the testing stuff ships with our npm library at all since the library itself is just a build product of the actual source. Open to discussion this element of things here, but it might be outside the scope of what I'm currently working on. |
Another question is how to distribute "web"/CDN versions of languages. Currently I was imagining the source files would all be CommonJS ( I think some people may want an easy way to just get the CDN ready version... of course our build process could handle this and then drop CDN ready "binaries" in the Ok, actually Solidity does it with just a custom function name (which of course pollutes the global namespace also), but I think this is a bit messy: // in the external solidity.js file
function hljsDefineSolidity(hljs) {
...
// and then to register after you loaded the JS
hljs.registerLanguage('solidity', window.hljsDefineSolidity); |
Though now that I'm thinking about it all I've seen is that you have two different main modes... inside and outside... can't you just:
But they'd both register normally.
All the custom config logic is hidden inside Am I missing something? If this is such a common need that most users of your library would need it, it seems (given current limitations of HLJS) that the easily way to do it from a user standpoint would be just to register two slightly different languages - rather than have them copy and pasting code from the README that is different than every other language module? |
IE if I'm understanding properly: shexc-texpr.js
Or some such. Now it's just a standard language module. Obviously I wouldn't suggest this if you had 5 or 15 or 50 variant modes, but for just two it doesn't seem a terrible compromise to me. This is also VERY similar to the question we're currently discussing regarding PHP and whether it's a snipper or a full PHP template: Although at this point I'm starting to wonder if perhaps the "child mode" should perhaps be it's own language anyways and use "sublanguage" (perhaps overkill) or else the parent should depend on the child (requiring it) - instead of the either/or configuration way you're doing it. Many different ways to solve this problem. |
I've only tested two entry points but there are ~8 corresponding to the assignments to It seems |
If you have more than 2-3 I'd almost suggest building your own auto-detect on top of HLJS. What you really want is what I've mentioned elsewhere, one language that maps to multiple variants - then auto-detect could figure out which variant makes the most sense:
Long-term this might be internal to a grammar. But for now you'd just build a wrapper very similar to You almost have to deal with it yourself somehow I think. As soon as you get past 2-3 asking user of your grammar to label every snippet with say 8-15 different variants is really a no go. ;-) Or at least not every user friendly. :-) |
Actually this MIGHT be possible with the callback support that there is already a PR for... the missing piece of the puzzle is how to "hide" a bunch of language variants... registering them all isn't great, though I suppose if you namespaced them and turned off auto-detect (at the sub-module level) that wouldn't be so bar: registerLanguage("sparq_context-1", sparq(...))
registerLanguage("sparq_context-2", sparq(...))
registerLanguage("sparq_context-3", sparq(...))
// sparq grammar
{
multiLanguage: ["sparq_context-1", "sparq_context-2", ...],
// or you could even have a determiner callback if the context was "easy" to sniff without parsing a the whole content
// for example for PHP we could just look for "<?php" in the beginning that would be a pretty safe bet...
determinerCallback: function (...) => Then your "main" Does that make any sense? |
And then the markup tests would be under the various
|
That is a very good question. I'd love to find a "better" format for the markup tests in general... they definitely don't scale to multiple variants of a single language every well at all as they currently stand.
That may be where you're at for now. |
It should also be discussed how 3rd party languages are published. I'm not sure how which permissions come with the standard member role of the npm organization. Can any member publish any package that was published under that organization or only those to which he has access on GitHub? Speaking of which, @marcosc and I are currently the only members of the |
That should be a whole other thread - this one is long enough. I see that as a different discussion, but a worthwhile one. If you're like to open one that'd be great. @idleberg I'd take an invite, but don't think I've seen one. |
What's left to implement/prototype before this rolls out? |
It's part of the new build system, so it'll get merged when that gets merged. Only a few more things to polish and then just need to get Egor to sign off on it.
Should be part of v10 which should be the next release I THINK. If it takes a bit longer I might bump out one last 9 series just to get some of the more recent language fixes out sooner, but right now the goal is mid-Jan, and we might hit that.
See the issue topic on v10, but I'm not sure all that will make it.
I wouldn't mind discussing that with you. I could point you to a few utility/workflow based things I think would really help that you could probably bang out if you were interested. What's needed first (IMHO) is a MUCH larger test suite for auto-detection before more nuanced changes back be made - right now we have no real true way to measure improvement vs regressions. |
Of course this is still making the tests easier to run from inside a folder, etc... but I don' think that's a blocker for this functionality, just a future improvement we should find time for. |
I have updated the |
@gusbemacbe Think you're all good now right? I made the logging a bit more verbose today:
|
Closing this thread for now. Improving testing/integration process for 3rd party devs is really a different thing than defining the core format/structure itself. I think we handled that decently here and we're seeing lots of people contributing 3rd party modules, so that's awesome. One unanswered question here is how a multi-language pack might ship a multi-language distribution distributable and my thoughts on that:
So the multi package would just be all the individual files pasted together... each one of them doing auto-registration... so then a user could just include that single file and get all 3 languages... or they could (theoretically) include all 3 files separates and get the same result. I don't know if we should add this last step to the build process and have a standardized name for the bundle? Perhaps just using the repository name?
|
Fix code to correctly register languages to reveal.js, when loading from a browser. Add missing keywords to ATL.
I did not follow this thread. Can you finalize what should I do to the repo to comply? |
If your making a new 3rd party grammar just check out the quick start readme in my branch and look at the other examples repositories. |
so for code that's shared between the different language files (which is probably 80% for the RDF suite I modeled), would you advise the old exports trick? |
Hard to say. I'm not sure I've decided how the main repo should work long-term. Your choices are either:
If the amount of code is small (and not a burden to download repeatedly - OR is likely to be bundled in a monolithic build) I'd actually consider modules. Run-time dependencies get annoying if you don't load things in the right order, or someone forgets to load one, etc... Right now using modules would result in a slightly larger build since the code ends up repeated in multiple places - though in the future I plan to fix that (at least for those building a monolithic library). The benefit is no dependency hell. Related: |
Of note for 3rd party language authors: Should be a quick add. |
This was referenced in highlightjs/highlight.js#2328 (comment)
Starting a discussion for an official 3rd party grammar format/spec (directory structure, how modules are exported, etc). I've been working on adding support to the build process for seamlessly building 3rd party grammars. The idea being that you just check them out:
And then building "just works":
And high-level grammar testing just works (detect tests, markup tests, etc) in context of the whole test suite. Just run tests like normal. The tests for extra languages are bundled with the extra languages, but magically "just work".
In looking at the existing 3rd party stuff it seems we have two major types of layouts. I think perhaps we could support both.
The slightly flatter npm package:
A more "bundled"/traditional layout:
One alternative to supporting both is to always prefer/require the bundled layout. I prefer this layout slightly as it makes it easier to add multiple languages to a single "package/repo". For example you might bundle Python, Python REPL... bundled layout gives you an easy way to do that (keeping the same directly structure we use internally). Also if we ever again decided to pull a 3rd party into the core library it'd be trivial to do with the bundled layout as it'd just essentially be copying files from one repo to another.
As to how NPM would work with bundled layout I'd suggest an
index.js
that simply required the individual modules and exported a hash of them by name:Thoughts?
The text was updated successfully, but these errors were encountered: