Support auto detecting language #1313

Closed
paladox opened this issue Feb 28, 2018 · 15 comments

@paladox
Contributor

paladox commented Feb 28, 2018

Hi, could support for auto detecting the language be added please?

Also, would it be possible to support this kind of call:

var html = Prism.highlight(code, language);

(language as in MIME types, php, etc.)

Like highlight.js usage. As they are now unmaintained, Prism.js is a better alternative, but it needs support for Prism.highlight(code, language); so that it auto-loads the language it needs.

@uhafner

uhafner commented Mar 12, 2018

Is there at least a mapping of file extensions to class names available? I would like to use it in an application where the actual visualized file is not known in advance.

@Golmote
Contributor

Golmote commented Mar 12, 2018

@uhafner Take a look at the File Highlight plugin.
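For readers who want the extension-to-class mapping as plain data rather than through a plugin, here is a minimal sketch. The table contents and the function name `languageClassForFile` are assumptions made for illustration, not the plugin's actual list; extend the table for the file types you care about.

```javascript
// Hypothetical extension table. The values are Prism language ids as they
// appear in `language-xxxx` class names.
const EXTENSION_TO_LANGUAGE = new Map([
  ["js", "javascript"],
  ["ts", "typescript"],
  ["py", "python"],
  ["rb", "ruby"],
  ["sh", "bash"],
  ["md", "markdown"],
  ["html", "markup"],
  ["xml", "markup"],
  ["h", "c"],
]);

// Returns a class name such as "language-python", or "language-none" when the
// extension is missing or not in the table.
function languageClassForFile(filename) {
  // Only look at the last path segment so dots in directory names are ignored.
  const base = filename.split(/[\\/]/).pop();
  const dot = base.lastIndexOf(".");
  // dot <= 0 covers names without a dot ("Makefile") and dotfiles (".gitignore").
  const ext = dot > 0 ? base.slice(dot + 1).toLowerCase() : "";
  const language = EXTENSION_TO_LANGUAGE.get(ext) || "none";
  return "language-" + language;
}
```

The returned string can be set as the class of the `code` element before Prism runs.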

@uhafner

uhafner commented Mar 12, 2018

Thanks, that did the trick!

@HarryCaveMan

HarryCaveMan commented May 12, 2018

Is there somewhere to view the dependency graph of the language packs? A simple way to get auto-detection support for all languages (without importing them all) is to always import the languages that are high in the dependency tree (i.e. clike, c, java, html, javascript), then load the extension components dynamically from the input (i.e. import('prismjs/components/prism-'+inputLang)). However, short of reading or testing them all, or seeing the dependency graph, it is hard to determine which language packs to load initially. This may not be a good solution if there are lots of nested dependencies, because you'd still end up importing a lot of stuff for every instance. I will attempt to fork and do some testing in a few weeks if no one else has time.

Here is an example react-markdown plugin that (kind of) adds support automatically (works for at least 20 or so langs) and works with real-time previews. It defaults to JavaScript if no support is found or if deps are missing:

import React from "react";
import Prism from "prismjs/components/prism-core";
//other languages depend on these
import "prismjs/components/prism-clike";
import "prismjs/components/prism-c";
import "prismjs/components/prism-java";
import "prismjs/components/prism-markup"; //html is an alias registered by the markup component
//include javascript as default fallback
import "prismjs/components/prism-javascript";

let CodeBlock = {
  Block (props){
    let html;
    let cls;
    //console.log(props.value)
    //import() is asynchronous: start loading so a later render can use the
    //grammar; a failed load rejects the promise instead of throwing here
    import("prismjs/components/prism-"+props.language).catch(er => {
      console.log(er.message+": \""+props.language+"\"");
    });
    const grammar = Prism.languages[props.language];
    if (grammar) {
      html = Prism.highlight(props.value || "...", grammar, props.language);
      cls = `language-${props.language}`;
    }
    else {
      //grammar not (yet) registered, fall back to javascript
      html = Prism.highlight(props.value || "...", Prism.languages["js"], "js");
      cls = "language-js";
    }
    return (
      <pre className={cls}>
        <code
          dangerouslySetInnerHTML={{__html: html}}
          className={cls}
        />
      </pre>
    );
  },
  InLine(props) {
      let cls = "language-js";
      //render the raw value as text rather than HTML so any markup in it is escaped
      return (
          <code className={cls}>
            {props.value}
          </code>
      );
  }
};

export default CodeBlock;
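The load-order problem described above can be handled by walking a dependency table depth-first, so that every component is loaded after the ones it requires. The sketch below uses a small hand-written excerpt of such a table; the `REQUIRES` object and the `loadOrder` function are hypothetical names, and the entries should be checked against the `require` fields in Prism's `components.json` before relying on them.

```javascript
// Hand-written excerpt of "language -> languages it requires".
// Treat the entries as an assumption and verify them against components.json.
const REQUIRES = {
  clike: [],
  markup: [],
  c: ["clike"],
  cpp: ["c"],
  java: ["clike"],
  javascript: ["clike"],
  typescript: ["javascript"],
  "markup-templating": ["markup"],
  php: ["markup-templating"],
};

// Depth-first walk: dependencies are pushed before the language itself.
// `seen` prevents loading a component twice and guards against cycles.
function loadOrder(language, requires = REQUIRES, seen = new Set(), order = []) {
  if (seen.has(language)) return order;
  seen.add(language);
  for (const dependency of requires[language] || []) {
    loadOrder(dependency, requires, seen, order);
  }
  order.push(language);
  return order;
}
```

Each id in the result maps to a component path of the form `"prismjs/components/prism-" + id`, so only the components a given input actually needs are imported, in a safe order.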

@Golmote
Contributor

Golmote commented May 13, 2018

I don't get how this would solve the issue of auto-detection. AFAIK highlight.js does it by testing the code (or part of it?) with every language. Each test returns a relevance score, based on specific characteristics of the language (mainly keywords, but also special syntaxes). Sometimes a test can return early with a relevance of 0 if it detects an invalid syntax.

This is a smart approach, but it still requires testing every language, and that's something I'm not happy about. Yet, Prism could probably support this by adding these concepts (relevance and invalid syntaxes).
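To make the relevance idea concrete, here is a toy sketch of the scheme as described in this comment: every language is tested, keyword hits add to a relevance score, and an "illegal" pattern forces the score to 0. The profiles, keyword lists, and function names are invented for illustration; this is not highlight.js's actual implementation, and nothing like it exists in Prism.

```javascript
// Toy language profiles. Keywords and illegal patterns are made up and far
// cruder than a real grammar.
const PROFILES = {
  javascript: {
    keywords: new Set(["function", "const", "let", "var", "typeof", "undefined"]),
    illegal: /^\s*#include\b|^\s*def\s+\w+\s*\(/m,
  },
  python: {
    keywords: new Set(["def", "import", "self", "elif", "None", "lambda"]),
    illegal: /^\s*#include\b|\bfunction\s*\(/m,
  },
  c: {
    keywords: new Set(["#include", "int", "void", "struct", "printf", "sizeof"]),
    illegal: /^\s*def\s+\w+\s*\(/m,
  },
};

// Relevance of one language: 0 if an illegal construct is present,
// otherwise the number of tokens that are keywords of that language.
function relevance(code, profile) {
  if (profile.illegal.test(code)) return 0;
  const tokens = code.match(/#?\w+/g) || [];
  return tokens.filter((token) => profile.keywords.has(token)).length;
}

// Tests the code against every profile and returns the best match.
// `language` is null when no profile scores above 0.
function detectLanguage(code, profiles = PROFILES) {
  let best = { language: null, relevance: 0 };
  for (const [language, profile] of Object.entries(profiles)) {
    const score = relevance(code, profile);
    if (score > best.relevance) best = { language, relevance: score };
  }
  return best;
}
```

The cost concern raised above is visible here: `detectLanguage` runs one pass per profile, so the work grows with the number of supported languages.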

@HarryCaveMan

HarryCaveMan commented May 14, 2018

Trying to train some sort of statistical classifier with a training set could be a viable option. Something like this, which can train a model from samples and then just load the trained model to detect.
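As one possible shape for such a classifier, the sketch below trains a multinomial naive Bayes model on code tokens with add-one smoothing. This is a generic technique chosen for illustration, not the project linked in the comment; the function names and the tokenizer are assumptions, and a useful model would need far more training samples than a handful.

```javascript
// Splits code into identifiers and single punctuation characters.
function tokenize(code) {
  return code.match(/[A-Za-z_]\w*|[^\s\w]/g) || [];
}

// samples: array of { language, code }. Returns per-language token counts.
function train(samples) {
  const model = { languages: new Map(), vocabulary: new Set(), sampleCount: 0 };
  for (const { language, code } of samples) {
    if (!model.languages.has(language)) {
      model.languages.set(language, { samples: 0, tokenTotal: 0, tokenCounts: new Map() });
    }
    const stats = model.languages.get(language);
    stats.samples += 1;
    model.sampleCount += 1;
    for (const token of tokenize(code)) {
      stats.tokenCounts.set(token, (stats.tokenCounts.get(token) || 0) + 1);
      stats.tokenTotal += 1;
      model.vocabulary.add(token);
    }
  }
  return model;
}

// Returns the language with the highest log-probability, or null if the
// model has no languages. Add-one smoothing handles unseen tokens.
function classify(model, code) {
  const tokens = tokenize(code);
  let best = null;
  let bestScore = -Infinity;
  for (const [language, stats] of model.languages) {
    let score = Math.log(stats.samples / model.sampleCount); // prior
    const denominator = stats.tokenTotal + model.vocabulary.size;
    for (const token of tokens) {
      const count = stats.tokenCounts.get(token) || 0;
      score += Math.log((count + 1) / denominator);
    }
    if (score > bestScore) {
      bestScore = score;
      best = language;
    }
  }
  return best;
}
```

The trained `model` holds only counts, so it could be serialized once and shipped instead of the training samples, which matches the "train, then just load the model" idea.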

@izelnakri

Any updates on this? This would really enhance the developer experience if the code block doesn't have a language class.

@mAAdhaTTah
Member

Honestly, I think this is unlikely to be supported / implemented by the core team, but we'd accept a PR for a plugin.

@asbjornu

highlight.js supports language auto-detection. Perhaps some inspiration can be drawn from there?

@RunDevelopment
Member

As @Golmote said before: Highlight.js' approach, while quite inefficient, might be usable.

But to support illegal tokens and relevance, we would probably have to adjust every language definition.

@mAAdhaTTah
Member

Not gonna lie, 4 downvotes from people who want others to do work for things they want is frustrating. As stated, if this is desired, we will happily accept a PR, but we will not be implementing this.

@andrewjmead

andrewjmead commented Jun 9, 2022

If you're interested, you can use the auto detect from highlight.js but then use the syntax highlighting from prism. This loads highlight.js and runs the highlighter just to generate the correct language-xxxx class that prism can then pick up on.

<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/11.5.1/highlight.min.js"></script>
<script>
	document.addEventListener("DOMContentLoaded", function () {
		hljs.highlightAll();
	})
</script>


<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/themes/prism.min.css" integrity="sha512-tN7Ec6zAFaVSG3TpNAKtk4DOHNpSwKHxxrsiw4GHKESGPs5njn/0sMCUMl2svV4wo4BK/rCP7juYz+zx+l6oeQ==" crossorigin="anonymous" referrerpolicy="no-referrer" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/plugins/line-numbers/prism-line-numbers.min.css" integrity="sha512-cbQXwDFK7lj2Fqfkuxbo5iD1dSbLlJGXGpfTDqbggqjHJeyzx88I3rfwjS38WJag/ihH7lzuGlGHpDBymLirZQ==" crossorigin="anonymous" referrerpolicy="no-referrer" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/components/prism-core.min.js" integrity="sha512-9khQRAUBYEJDCDVP2yw3LRUQvjJ0Pjx0EShmaQjcHa6AXiOv6qHQu9lCAIR8O+/D8FtaCoJ2c0Tf9Xo7hYH01Q==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/plugins/autoloader/prism-autoloader.min.js" integrity="sha512-fTl/qcO1VgvKtOMApX2PdZzkziyr2stM65GYPLGuYMnuMm1z2JLJG6XVU7C/mR+E7xBUqCivykuhlzfqxXBXbg==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/plugins/line-numbers/prism-line-numbers.min.js" integrity="sha512-BttltKXFyWnGZQcRWj6osIg7lbizJchuAMotOkdLxHxwt/Hyo+cl47bZU0QADg+Qt5DJwni3SbYGXeGMB5cBcw==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>

Wrote about it here: https://mead.io/2022/06/09/wordpress-post-syntax-highlighting-with-highlightjs-and-prism/
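A more explicit variant of the same idea is sketched below: detect the language per block, set the `language-xxxx` class yourself, then hand the element to the highlighter. The function name `labelAndHighlight` is hypothetical, and the detector and highlighter are passed in as parameters so the sketch does not depend on either library. In a browser one would presumably pass `(code) => hljs.highlightAuto(code).language` and `(el) => Prism.highlightElement(el)`; treat that wiring as an assumption to verify against the versions you load.

```javascript
// codeElements: any iterable of elements with `className` and `textContent`
//               (for example document.querySelectorAll("pre code")).
// detect:       (code) => language id string, or undefined if unsure.
// highlight:    (element) => void, called once per element.
function labelAndHighlight(codeElements, detect, highlight) {
  for (const el of codeElements) {
    // Respect an explicit language class if the author already set one.
    const hasLanguage = /(?:^|\s)lang(?:uage)?-[\w-]+/.test(el.className);
    if (!hasLanguage) {
      // Fall back to "none" so no "language-undefined" class is ever produced.
      const language = detect(el.textContent) || "none";
      el.className = (el.className + " language-" + language).trim();
    }
    highlight(el);
  }
}
```

Compared with running `hljs.highlightAll()` first, this avoids having highlight.js rewrite the DOM only for Prism to highlight the same blocks again, although the detection pass itself still has its cost.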

@abulka

abulka commented Apr 27, 2023

A tip to those using the code above: ensure the two script tags involving the highlight.js library precede the prism link and script tags, just as the example shows, otherwise prism doesn't run at the correct time.

Whilst the technique works, I'm not that impressed with the highlight.js library's detection of languages:

  • for markdown files I'm getting language-csharp
  • for plain text files, e.g. requirements.txt, I'm getting language-undefined, which causes an import error: https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-undefined.min.js net::ERR_ABORTED 404
  • for .gitignore I'm getting language-bash
  • for some bash scripts I'm getting language-css

I'm building a file preview tool, and I wish there was some way of passing the file extension, which would be a massive hint as to which language to switch to.
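One way to use the file extension as a hint is to consult it first and fall back to the detector's guess only when the extension is unknown, with a final fallback that avoids `language-undefined`. The sketch below assumes a hypothetical `chooseLanguage` helper and a small illustrative table; neither is part of Prism or highlight.js.

```javascript
// Illustrative table only. "none" is used here to mean "do not highlight".
const LANGUAGE_BY_EXTENSION = new Map([
  ["md", "markdown"],
  ["sh", "bash"],
  ["css", "css"],
  ["txt", "none"],
]);

// filename: name or path of the previewed file.
// detected: the auto-detector's guess, possibly undefined.
function chooseLanguage(filename, detected) {
  const base = filename.split(/[\\/]/).pop();
  const dot = base.lastIndexOf(".");
  const ext = dot > 0 ? base.slice(dot + 1).toLowerCase() : "";
  // 1. A known extension wins over any guess.
  if (LANGUAGE_BY_EXTENSION.has(ext)) return LANGUAGE_BY_EXTENSION.get(ext);
  // 2. Otherwise accept the detector's guess if it produced one.
  if (typeof detected === "string" && detected !== "") return detected;
  // 3. Never return undefined, so no request for a nonexistent component is made.
  return "none";
}
```

With this ordering, a markdown file misdetected as `csharp` still resolves to `markdown`, and `requirements.txt` resolves to `none` instead of triggering the 404 described above.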

@Nantris

Nantris commented Aug 2, 2024

@andrewjmead that sure seems expensive on multiple levels. I don't understand why Prism can't implement a language guesser like highlight.js does.

@masylum

masylum commented Sep 10, 2024
