
javascript processing speed #1375

Closed
eiselekd opened this issue Nov 19, 2016 · 2 comments

@eiselekd

Hi, the JavaScript target in ANTLR4 seems very slow for bigger files. I parse a 130k VHDL file with an ANTLR4 grammar. I did a little test to measure:

cpp-target: 3 sec
javascript-target: 2 minutes
cpp-target converted with emscripten to javascript: 15 sec

The test is here: https://github.com/eiselekd/jshdl. Using emscripten to convert the C++ target parser is usable, however the output size is big...

Greetings, Konrad

@sharwell
Member

A huge thank you for putting this test together. I used the test to run some benchmarks on the TypeScript target which is just showing signs of life.

In a direct conversion of the test, the TypeScript target completed parsing in 20 seconds. Running the parser a second time on the same file was still only a hair shy of 20 seconds, which tells me the parser is spending most of its time in full-context prediction, which by default doesn't use a DFA (so nothing is cached between runs). So I took things a step further...

The primary approach to reducing full-context prediction time is 2-stage parsing: first attempt to parse the file with full-context prediction disabled altogether. For many grammars this succeeds on valid input, and full-context prediction is never needed. For the VHDL grammar it did not work, and based on the number of errors reported in SLL mode, it appears some rules in the grammar might need to be rewritten before 2-stage parsing can provide any benefit. tl;dr: two-stage parsing does not improve performance for the VHDL grammar used in this test.
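For readers unfamiliar with the pattern, here is a minimal sketch of 2-stage parsing. In the real ANTLR runtimes this is done by setting the interpreter to `PredictionMode.SLL` with a bail-out error strategy and retrying in `PredictionMode.LL` when the fast stage throws; `parseSLL` and `parseLL` below are hypothetical stand-ins for those two configurations of a generated parser:

```javascript
// Two-stage parsing: try the fast SLL stage first; if it bails,
// re-parse with full-context LL prediction before reporting errors.
// parseSLL/parseLL are hypothetical stand-ins for running a
// generated ANTLR parser in SLL vs. LL prediction mode.
function parseTwoStage(input, parseSLL, parseLL) {
  try {
    return { tree: parseSLL(input), mode: "SLL" };
  } catch (e) {
    // An SLL failure may be a false positive, so retry with
    // full-context prediction; only its errors are authoritative.
    return { tree: parseLL(input), mode: "LL" };
  }
}
```

The point of the comment above is that for the VHDL grammar in this test, the SLL stage fails on valid input, so the (expensive) LL fallback runs every time and the pattern buys nothing.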

The optimized fork of ANTLR 4, along with the two targets derived from that fork (Tunnel Vision Labs' C# target and now the TypeScript target) support using a DFA for full-context prediction. So the next step in evaluating this was to enable that feature for the test. I ran the sample parse operation 4 times in sequence, and the results were as follows:

| Pass | Description | Time |
| --- | --- | --- |
| 1 | First pass, no warm-up | 7781 ms |
| 2 | Parse input again, DFA is reused | 1736 ms |
| 3 | Clear DFA and parse again | 6739 ms |
| 4 | Parse input again, DFA is reused | 1689 ms |

The time difference between passes 1 and 3 is primarily because the JavaScript files are already loaded (and presumably JIT-compiled) by pass 3; that load/compilation time is included in pass 1's measurement.
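A multi-pass measurement like the one above can be reproduced with a small harness. This is a sketch, not the code used for the numbers in the table; `parse` stands in for the generated parser invocation, and the optional `between` hook is where one would clear the DFA (e.g. before pass 3):

```javascript
// Time several parse passes to separate warm-up cost (module load,
// JIT, DFA construction) from steady-state cost.
function benchmark(passDescriptions, parse, between) {
  const results = [];
  for (let i = 0; i < passDescriptions.length; i++) {
    if (between) between(i); // hypothetical hook, e.g. clear the DFA
    const t0 = Date.now();
    parse();
    results.push({ pass: i + 1, desc: passDescriptions[i], ms: Date.now() - t0 });
  }
  return results;
}
```

Comparing pass 2 against pass 1, and pass 4 against pass 3, isolates the benefit of DFA reuse from one-time warm-up effects.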

@eiselekd
Author

I'm not familiar with the parser internals, but I guess the main problem in the JavaScript target is that the parser allocates plain JavaScript objects at runtime. The emscripten-generated JavaScript (compiled from the C++ target), on the other hand, stores everything in an ArrayBuffer and allocates no objects; I guess that is why it can get close to C++ speed once JIT-compiled.
