wishlist: server-side KaTeX #6651

Closed
zackw opened this issue Aug 30, 2020 · 15 comments

@zackw

zackw commented Aug 30, 2020

KaTeX can be used to render math to HTML in advance. You need node.js and the katex command line tool at render time, but then no JavaScript needs to be executed in the browser (a CSS file is still necessary). I've written a proof-of-concept Lua filter that implements this mode in Pandoc and it works reasonably well:

-- Pandoc filter: if we are generating HTML, replace each Math element
-- with the result of running KaTeX on its contents.  This requires
-- the command-line "katex" program to be installed at rendering time,
-- but does not require any JavaScript to be executed on the reader's
-- browser.  (The built-in --katex mode makes the opposite tradeoff.)
if FORMAT:match 'html' then
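   local have_math = false

   function Math(elem)
      -- Strip leading and trailing whitespace.  The outer parentheses
      -- discard gsub's second return value (the substitution count),
      -- so trim() returns exactly one value.
      local function trim(s)
         return (s:gsub("^%s+", ""):gsub("%s+$", ""))
      end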

      have_math = true

      local katex_args = {'--no-throw-on-error'}
      if elem.mathtype == 'DisplayMath' then
         table.insert(katex_args, '--display-mode')
      end

      return pandoc.RawInline(
         FORMAT, trim(pandoc.pipe("katex", katex_args, trim(elem.text))))
   end

   function Meta(data)
      -- The "has_math" property will be absent when there is no math
      -- and the string "true" when there is math.
      if have_math then
        data.has_math = "true"
      end
      return data
   end
end

Due to the need to start a fairly heavyweight program (the node.js interpreter) for every math element, though, it's quite slow. I'm going to experiment with using a JSON filter written in node.js instead, but I wonder whether native support for this mode in Pandoc might be even faster -- it could fork off the katex utility the first time it ran into a math element, and pipe it the math markup as it arrives. Native support would also allow --standalone to know when it should link to the KaTeX CSS instead of the JS.
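The JSON-filter idea mentioned above can be sketched roughly as follows. This is a minimal sketch, not a finished filter: it assumes pandoc's JSON AST shape, where a math element is `{t: "Math", c: [{t: "InlineMath" | "DisplayMath"}, tex]}` and raw HTML is `{t: "RawInline", c: ["html", html]}`. The names `replaceMath`, `filterDocument`, and `Renderer` are made up for illustration, and the renderer is passed in as a parameter so that the tree walk does not depend on KaTeX itself. Math inside metadata is not touched.

```typescript
// Minimal shape of a node in pandoc's JSON AST: a tag plus optional content.
type AstNode = { t: string; c?: unknown };

// Hypothetical renderer signature: TeX source and a display-mode flag in,
// HTML out. A real filter would presumably wrap katex.renderToString here.
type Renderer = (tex: string, displayMode: boolean) => string;

function isAstNode(x: unknown): x is AstNode {
  return typeof x === "object" && x !== null && !Array.isArray(x) &&
    typeof (x as { t?: unknown }).t === "string";
}

// Recursively copy a JSON value, replacing every Math node with a RawInline
// node that holds pre-rendered HTML. `stats.count` records how many Math
// nodes were replaced.
function replaceMath(value: unknown, render: Renderer, stats: { count: number }): unknown {
  if (Array.isArray(value)) {
    return value.map((v) => replaceMath(v, render, stats));
  }
  if (isAstNode(value) && value.t === "Math") {
    // Math content is a pair: [math type node, TeX source].
    const [mathType, tex] = value.c as [AstNode, string];
    stats.count += 1;
    return { t: "RawInline", c: ["html", render(tex, mathType.t === "DisplayMath")] };
  }
  if (typeof value === "object" && value !== null) {
    const src = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(src)) {
      out[key] = replaceMath(src[key], render, stats);
    }
    return out;
  }
  return value;
}

// Transform the document body and, if any math was found, set a `has_math`
// metadata flag (a MetaBool) that a template could use to link the KaTeX CSS.
function filterDocument(
  doc: { meta: Record<string, unknown>; blocks: unknown[]; [key: string]: unknown },
  render: Renderer,
) {
  const stats = { count: 0 };
  const blocks = replaceMath(doc.blocks, render, stats) as unknown[];
  const meta = stats.count > 0
    ? { ...doc.meta, has_math: { t: "MetaBool", c: true } }
    : doc.meta;
  return { ...doc, meta, blocks };
}
```

In an actual filter, the script would read the whole JSON document from stdin, call `filterDocument` with a renderer built on KaTeX, and write the result to stdout. Because the interpreter starts only once per document, the per-element process cost goes away.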

@jgm
Owner

jgm commented Aug 30, 2020

it could fork off the katex utility the first time it ran into a math element, and pipe it the math markup as it arrives

Isn't this something that could, in principle, be done in a filter too?

@zackw
Author

zackw commented Aug 30, 2020 via email

@jgm
Owner

jgm commented Aug 30, 2020

Doesn't Lua have primitives allowing you to open a subprocess and pipe things to it?
I may be missing something; @tarleb can comment more helpfully.

@tarleb
Collaborator

tarleb commented Aug 31, 2020

There are no built-in primitives, but there are libraries which provide this functionality. E.g., the posix library comes with posix.popen, posix.spawn, posix.sys.wait, etc.

This requires pandoc to be compiled against the system's Lua installation. Distro-packages usually do this, as do the official Docker images. The Lua posix package can conveniently be installed via the distro's package manager.

@mb21
Collaborator

mb21 commented Aug 31, 2020

I'm going to experiment with using a JSON filter written in node.js instead

Yes, sounds like that's the way to go here. You'll need node.js anyway to run KaTeX, so there's no particular advantage in using a Lua filter here.

btw. see also https://pandoc.org/MANUAL.html#math-rendering-in-html for built-in alternatives

@tarleb
Collaborator

tarleb commented Aug 31, 2020

Well, Lua filters should be a bit faster, and more so if there are only a few math elements in a very long document. But I agree that using a node.js filter is probably the better approach here.

@jgm
Owner

jgm commented Sep 1, 2020

Another option: set up a server that does the conversions, and use --webtex (pointing at this server) and --self-contained. Then you don't need a filter at all.

@MyriaCore

Sounds a lot like this filter!

@averms
Contributor

averms commented Sep 22, 2020

There is also https://github.com/lierdakil/mathjax-pandoc-filter. It's written in TypeScript so it doesn't fork for every equation. Not sure how MathJax compares with KaTeX, though.

@mk12

mk12 commented Jan 11, 2021

For those worried about starting a new process for every math element, I came up with a very over-engineered solution for my project. The Lua filter uses luaposix to communicate over a Unix socket with katex.ts, a server running on Deno. With Deno there's no package.json, node_modules, etc., just a single TypeScript file. It works nicely since KaTeX provides an ECMAScript Module.

@mb21
Collaborator

mb21 commented Apr 18, 2021

Writing a node.js-json-filter is probably still the way to go here to transform the HTML to what KaTeX CSS needs (so you don't have to run JS in the browser). But we could of course reimplement this in Haskell in the HTML writer... I'm not saying this is worthwhile, but we can keep the issue open for that feature request.

@mk12

mk12 commented Apr 14, 2022

I came up with another solution: invoke JS from the Lua filter, but batch everything into one invocation. Comparison:

  • zackw's original comment: Use pandoc.pipe from Lua filter
    • pro: Easy to write
    • con: Very slow for lots of math
  • mb21's suggestion: JSON filter in JS
    • pro: Can use KaTeX API directly
    • con: Less powerful than Lua filters: can't access pandoc.read, pandoc.write, etc.
  • My earlier idea: Communicate with JS server using luaposix
    • pro: Avoids per-math process overhead
    • con: Lots of boilerplate to write server correctly
    • con: Hard to access LuaRocks from Pandoc Lua
      • Only works in dynamically linked pandoc
      • In some cases must configure LUA_PATH and LUA_CPATH
  • My new idea: Invoke JS from Lua filter, but only once for all math
    • pro: Avoids per-math process overhead
    • pro: Much simpler JS script than using sockets

math.ts:

import { readLines } from "https://deno.land/std@0.134.0/io/mod.ts";
import katex from "https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.mjs";

for await (const line of readLines(Deno.stdin)) {
  try {
    console.log(katex.renderToString(line, {
      displayMode: false,
      strict: "error",
      throwOnError: true,
    }));
  } catch (error) {
    throw new Error(`Input: ${line}\n\nError: ${error}`);
  }
}

filter.lua:

function Pandoc(doc)
    -- We have to use a temporary file because Lua does not support
    -- bidirectional communication with a subprocess:
    -- http://lua-users.org/lists/lua-l/2007-10/msg00189.html
    local tmp_name = os.tmpname()
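    -- Named katex_proc rather than "math" to avoid shadowing Lua's
    -- standard math library.
    local katex_proc = assert(io.popen("deno run math.ts > " .. tmp_name, "w"))
    -- First pass: send every math element to the subprocess, one per line.
    doc:walk({
        Math = function(el)
            assert(katex_proc:write(el.text:gsub("\n", " ") .. "\n"))
        end
    })
    katex_proc:close()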
    local tmp = assert(io.open(tmp_name, "r"))
    doc = doc:walk({
        Math = function(el)
            return pandoc.RawInline(FORMAT, tmp:read())
        end
    })
    tmp:close()
    os.remove(tmp_name)
    return doc
end
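One limitation of the math.ts script above is that displayMode is fixed to false, so display math is rendered inline. A minimal sketch of one way around this, assuming a hypothetical one-character prefix on each line ("D" for display math, "I" for inline) that the Lua side would have to write based on el.mathtype. The names `renderLine`, `renderBatch`, and `RenderFn` are illustrative only, and the renderer is passed in as a parameter where the real script would presumably call katex.renderToString.

```typescript
// Hypothetical renderer signature: TeX source and a display-mode flag in,
// HTML out.
type RenderFn = (tex: string, displayMode: boolean) => string;

// Render one protocol line. The first character selects the mode and the
// remainder of the line is the TeX source.
function renderLine(line: string, render: RenderFn): string {
  const flag = line.charAt(0);
  if (flag !== "D" && flag !== "I") {
    throw new Error(`Missing mode prefix in line: ${line}`);
  }
  const html = render(line.slice(1), flag === "D");
  // Preserve the one-line-in, one-line-out invariant that the Lua filter
  // relies on when it reads results back with tmp:read().
  return html.replace(/\r?\n/g, " ");
}

// Render a whole batch of newline-separated protocol lines, in order.
// Every real line carries a prefix, so empty lines can only be the
// trailing remainder after the final newline and are skipped.
function renderBatch(input: string, render: RenderFn): string[] {
  return input
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => renderLine(line, render));
}
```

Since the prefix makes every line non-empty, an empty math element still produces exactly one output line, which keeps the two walks in filter.lua aligned.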

@castedo
Contributor

castedo commented Oct 13, 2022

Here's an existing nodejs KaTeX filter that I've started using and it is working well so far:
https://github.com/StevenZoo/pandoc-katex-filter
https://npm.io/package/pandoc-katex-filter
Much faster than the Python version. Thank you @StevenZoo!

slotThe added a commit to slotThe/slotThe.github.io that referenced this issue Jul 17, 2024
+ This is now part of site.hs, with a tiny bit of shelling out to
  a math.ts script.

+ Remove the (deprecated) mathjax-node-page as a dependency.

+ Instead of making the build shell script parallel, remove it
  completely and utilise Hakyll's built-in parallelism—as well as
  KaTeX's general speed advantage over MathJax—to make compilation
  acceptably fast.

+ Move the fonts directory from /fonts to /css/fonts, as katex.css
  expects this, and it seems easier to change fonts.css to accommodate
  this than the other way around—especially with an eye on updating to
  new KaTeX releases.

Related: jgm/pandoc#6651
@sgraf812

There is also this package which is published on nixpkgs: https://github.com/xu-cheng/pandoc-katex

@jgm
Owner

jgm commented Sep 23, 2024

These are all great links! It seems that there are many options. But this issue can be closed, because no changes to pandoc itself are called for.

@jgm jgm closed this as completed Sep 23, 2024

9 participants