A very basic CLI conversion tool (dirty regex stuff) for longer texts, written in Node.js. Intended as the first step of a longer cleanup (e.g. when the editorial process happens in Word, but the text then needs to be converted into plaintext).
- preserves `p`, `h1`, `i`, `b` tags as their markdown equivalents
  - `div` is considered to be the same as `p`
- splits text into chapters (delimited by h1 headers)
- preserves footnotes and fixes their links to Multimarkdown syntax
  - footnote link: `[^num]`
  - footnote text: `[num]: text`
- does not expect tables and lists (not needed yet)
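The tag handling and chapter splitting described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names `htmlToMarkdown` and `splitIntoChapters` are hypothetical, and the regexes assume simple, non-nested tags.

```javascript
// Hypothetical helpers for illustration; not taken from the recast-word-html source.

// Convert a small set of HTML tags to Markdown using plain regexes.
function htmlToMarkdown(html) {
  return html
    // treat <div> exactly like <p>
    .replace(/<(\/?)div\b[^>]*>/gi, '<$1p>')
    // <h1> becomes a "# Title" line
    .replace(/<h1\b[^>]*>([\s\S]*?)<\/h1>/gi, (_, text) => `\n# ${text.trim()}\n\n`)
    // italics and bold
    .replace(/<i\b[^>]*>([\s\S]*?)<\/i>/gi, '*$1*')
    .replace(/<b\b[^>]*>([\s\S]*?)<\/b>/gi, '**$1**')
    // paragraphs become blocks followed by a blank line
    .replace(/<p\b[^>]*>([\s\S]*?)<\/p>/gi, (_, text) => `${text.trim()}\n\n`)
    // drop any remaining tags
    .replace(/<[^>]+>/g, '')
    // collapse runs of three or more newlines into one blank line
    .replace(/\n{3,}/g, '\n\n')
    .trim();
}

// Split Markdown into chapters; a new chapter starts at every "# " heading.
function splitIntoChapters(markdown) {
  return markdown
    .split(/^(?=# )/m)
    .map((chapter) => chapter.trim())
    .filter((chapter) => chapter.length > 0);
}
```

Calling `splitIntoChapters(htmlToMarkdown(html))` returns an array with one Markdown string per chapter, which could then be written to separate files.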
Zero test coverage, hence it's not published to npm.
```shell
git clone https://github.com/next-book/recast-word-html.git recast-word-html
cd recast-word-html
npm install
npm link
recast --src=path/to/file.html
```
Output folder defaults to `./out`, change it via the `--out=path` param.
Convert the MS Word HTML export to UTF-8 before using this tool.
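One possible way to do the re-encoding in Node.js itself is sketched below. The helper name `convertToUtf8` is hypothetical, and the default `windows-1252` source encoding is only an assumption: check the charset declared in the export's `<meta>` tag and pass that label instead.

```javascript
const fs = require('fs');

// Hypothetical helper: re-encode a file to UTF-8.
// `sourceEncoding` defaults to windows-1252 as an assumption about the export.
function convertToUtf8(inputPath, outputPath, sourceEncoding = 'windows-1252') {
  const bytes = fs.readFileSync(inputPath);
  // TextDecoder throws a RangeError for labels this Node.js build does not support.
  const text = new TextDecoder(sourceEncoding).decode(bytes);
  fs.writeFileSync(outputPath, text, 'utf8');
}
```

The sketch only changes the byte encoding; any charset declaration inside the HTML is left untouched.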