xml-zero.js

Friendly and forgiving HTML5/XML5 parser that supports React JSX, and uses zero-copy techniques to allow parsing large files efficiently.

Most markup parsers convert a string of markup into a nested map (hash/dict) of keys and values, each allocated as a separate value in memory. This means that a 10MB XML file may balloon to 100MB of memory.

A different technique is to retain the original string and generate an index of string offsets into it. Because these offsets are just numbers, they can be packed more efficiently (a tutorial on zero-copy approaches).
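As a rough illustration of the offset-index idea, here is a minimal sketch. It is a toy written for this explanation, not the xml-zero-lexer API or its token format: the names `indexMarkup` and `tokenText` and the `[type, start, end]` layout are assumptions. It only splits tags from text and records where each piece sits in the original string.

```javascript
// Toy illustration only: NOT the xml-zero-lexer API or its real token format.
// Each token is three numbers, [type, start, end], stored in one flat array.
const TEXT = 0;
const TAG = 1;

function indexMarkup(source) {
  const tokens = [];
  let pos = 0;
  while (pos < source.length) {
    const open = source.indexOf("<", pos);
    if (open === -1) {
      // No more tags: the rest of the input is text.
      tokens.push(TEXT, pos, source.length);
      break;
    }
    if (open > pos) tokens.push(TEXT, pos, open);
    const close = source.indexOf(">", open);
    if (close === -1) {
      // Forgiving: an unterminated tag runs to the end of the input.
      tokens.push(TAG, open + 1, source.length);
      break;
    }
    tokens.push(TAG, open + 1, close);
    pos = close + 1;
  }
  // A typed array stores the offsets as packed 32-bit integers.
  return Int32Array.from(tokens);
}

// Substrings are only materialised when a caller asks for one.
function tokenText(source, index, i) {
  return source.slice(index[i * 3 + 1], index[i * 3 + 2]);
}

const source = "<p>Hello <b>world</b></p>";
const index = indexMarkup(source);
console.log(index.length / 3);            // 6 tokens
console.log(tokenText(source, index, 3)); // "world"
```

The source string is never copied or split up front; the only extra memory is three integers per token.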

This software is in beta and doesn't work yet.

Features

Fault tolerant like HTML5/XML5.
- Valueless attributes like HTML5 / XML5, e.g. <input multiple type=file>
- Attribute values may be quoted or not (e.g. <tag "some key"=false/>)
- React JSX in attributes and in text (not executed, of course, but parsed as distinct node types).
- Multiple root nodes. Doesn't care about well-formedness. GIGO.
Minimising memory use through Zero-Copy techniques.
Tiny, no dependencies, and can run in Web Workers (e.g. doesn't use DOM APIs).
Safer by removing SGML cruft.
No support for external DTD resolution, or nested entity expansion. Only default entities in XML, NCRs, and HTML5 named entities are supported.
Lots of tests.
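To make the entity rules above concrete, here is a hedged sketch of decoding the five predefined XML entities and numeric character references (NCRs). `decodeEntities` is a hypothetical helper written for this illustration, not part of xml-zero.js, and the HTML5 named-entity table is left out for brevity. It works in a single pass, so decoded output is never expanded again.

```javascript
// Hypothetical helper for illustration; not part of xml-zero.js.
// Only the five predefined XML entities are listed here.
const XML_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"', apos: "'" };

function decodeEntities(text) {
  // One pass over the input: replacement text is never rescanned,
  // so "&amp;lt;" becomes "&lt;" rather than "<".
  const pattern = /&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return text.replace(pattern, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range code points as written instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown names are left untouched (forgiving, no DTD lookup).
    return Object.prototype.hasOwnProperty.call(XML_ENTITIES, body)
      ? XML_ENTITIES[body]
      : match;
  });
}

console.log(decodeEntities("a &lt; b &amp;amp; &#65;&#x42; &nbsp;"));
// "a < b &amp; AB &nbsp;"
```

Because nothing is fetched or recursively expanded, the output can never be much larger than the input.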

Out of scope

Complete W3C DOM (at least for now), although we will follow their API naming conventions where reasonable.
HTML5 implied tags (e.g. won't automatically create tags such as <html>, <head>, <tbody>, etc.).

Install

npm install xml-zero-lexer

npm install xml-zero-beautify

npm install whats-the-damage

(More packages to come, but I'm making it modular.)

Progress

Lexer (2.6KB no dependencies, minified and gzipped)
Beautifier (4KB all dependencies, minified and gzipped)
What's The Damage benchmarker that measures time/memory/CPU of scripts
A W3C DOM-like API
Editable XML (by way of making new strings and leaving the original untouched, so it's still immutable)

References

XML5
MicroXML (JClark, Intro Presentation)
Beautiful Soup
