Event Parser update: JParse event-driven vs. Noggit

Richard Hightower edited this page Feb 20, 2023 · 1 revision

I wrote a StAX-style parser for JParse to compete against event-driven parsers.

Noggit claims to be the world's fastest streaming JSON parser for Java and is used by Solr. Its streaming API is a StAX-style pull parser; I'd call it an event-driven parser.

Oddly, when testing against Noggit, I noticed that the JParse index overlay parser, which builds a usable DOM, was faster than simply walking the events with Noggit without building a DOM at all.

But what if you need an event parser? If JParse can do a functioning DOM using an index overlay faster than Noggit can walk the events, then if JParse adds an event-based parser, it should scream.
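To make the two styles concrete, here is a minimal sketch of the idea, not JParse's or Noggit's actual implementation. The class `TinyEventScanner`, its `Listener` interface, and the token ids are all hypothetical names invented for this illustration, and the scanner does no validation. It shows an event-style pass that reports only a token kind and an offset into the source, and an "overlay" that is nothing more than those offsets recorded as the events arrive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy illustration only; names and token ids are hypothetical, not JParse's API. */
final class TinyEventScanner {

    static final int OBJECT = 0, ARRAY = 1, STRING = 2, SCALAR = 3;

    /** Receives a token kind and an offset into the source; no substrings are copied. */
    interface Listener {
        void start(int tokenId, int index);
        void end(int tokenId, int index);
    }

    /** Single pass over the text. Assumes well-formed JSON; performs no validation. */
    static void scan(String json, Listener listener) {
        final int n = json.length();
        int i = 0;
        while (i < n) {
            final char c = json.charAt(i);
            switch (c) {
                case '{': listener.start(OBJECT, i); i++; break;
                case '}': i++; listener.end(OBJECT, i); break;   // exclusive end
                case '[': listener.start(ARRAY, i); i++; break;
                case ']': i++; listener.end(ARRAY, i); break;    // exclusive end
                case '"':
                    i++;                                          // skip the opening quote
                    listener.start(STRING, i);
                    while (i < n && json.charAt(i) != '"') {
                        if (json.charAt(i) == '\\') i++;          // skip the escaped character
                        i++;
                    }
                    listener.end(STRING, Math.min(i, n));         // ends at the closing quote
                    i++;
                    break;
                case ' ': case '\t': case '\n': case '\r': case ',': case ':':
                    i++;
                    break;
                default:                                          // number, true, false, or null
                    listener.start(SCALAR, i);
                    while (i < n && ",}] \t\n\r".indexOf(json.charAt(i)) < 0) i++;
                    listener.end(SCALAR, i);
            }
        }
    }

    /** Records one {tokenId, start, end} triple per token, in the order tokens close. */
    static List<int[]> overlay(String json) {
        final List<int[]> tokens = new ArrayList<>();
        final Deque<Integer> starts = new ArrayDeque<>();
        scan(json, new Listener() {
            @Override public void start(int tokenId, int index) { starts.push(index); }
            @Override public void end(int tokenId, int index) {
                tokens.add(new int[] {tokenId, starts.pop(), index});
            }
        });
        return tokens;
    }

    /** Renders each recorded token as "tokenId:text", slicing the source only on demand. */
    static String describe(String json) {
        final StringBuilder sb = new StringBuilder();
        for (int[] t : overlay(json)) {
            sb.append(t[0]).append(':').append(json, t[1], t[2]).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // prints: 2:a 3:1 2:b 1:[1,"b"] 0:{"a":[1,"b"]}
        System.out.println(describe("{\"a\":[1,\"b\"]}"));
    }
}
```

In this sketch the scanner never allocates a substring. Text is sliced out of the source only when `describe` asks for it, which is the general reason an offset-based design can stay cheap.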

Here is the benchmark code:

    //Read the glossary example from JSON dot org with the index overlay using JParse. 
    @Benchmark
    public void readGlossaryJParse(Blackhole bh) {
        bh.consume(new JsonParser().parse(glossaryJsonData));
    }

    //Read the glossary example from JSON dot org while building an index overlay using JParse in event-driven mode. 
    @Benchmark
    public void readGlossaryJParseWithEvents(Blackhole bh) {
        bh.consume(new JsonEventParser().parse(glossaryJsonData));
    }

    //Read the glossary example from JSON dot org with Noggit and just listen to events. 
    @Benchmark
    public void readGlossaryNoggitEvent(Blackhole bh) throws Exception {

        final var jsonParser =  new JSONParser(glossaryJsonData);

        int event = -1;
        while (event!=JSONParser.EOF) {
            event = jsonParser.nextEvent();
        }

        bh.consume(event);
    }


    //Read the glossary example from JSON dot org with JParse and just listen to events. 
    @Benchmark
    public void readGlossaryEventJParse(Blackhole bh) throws Exception {

        final var jsonParser =  new JsonEventParser();
        final int [] token = new int[1];
        final var events = new TokenEventListener() {
            @Override
            public void start(int tokenId, int index, CharSource source) {
                token[0] = tokenId;
            }

            @Override
            public void end(int tokenId, int index, CharSource source) {
                token[0] = tokenId;
            }
        };

        jsonParser.parse(glossaryJsonData, events);

        bh.consume(token);
    }

    //Build a usable DOM with Noggit
    @Benchmark
    public void readWebGlossaryNoggitObjectBuilder(Blackhole bh) throws Exception {

        bh.consume(ObjectBuilder.fromJSON(glossaryJsonData));
    }

Results comparing Noggit event-driven parsing to JParse event-driven parsing and the JParse index overlay:


Benchmark                                      Mode  Cnt        Score   Error  Units
BenchMark.readGlossaryEventJParse             thrpt    2  1510822.955          ops/s
BenchMark.readGlossaryJParse                  thrpt    2   996041.462          ops/s
BenchMark.readGlossaryNoggitEvent             thrpt    2   960327.894          ops/s

As stated, the JParse index overlay parser, which builds a usable DOM, is faster than Noggit just walking the events and building nothing (roughly 996,000 vs. 960,000 ops/s, about 4% faster). The JParse event-driven parser is substantially faster still: roughly 1,511,000 ops/s, about 57% faster than Noggit walking the same events.

Benchmark                                      Mode  Cnt        Score   Error  Units
BenchMark.readGlossaryJParse                  thrpt    2   996041.462          ops/s
BenchMark.readGlossaryJParseWithEvents        thrpt    2   620481.661          ops/s
BenchMark.readWebGlossaryNoggitObjectBuilder  thrpt    2   530800.854          ops/s

The DOM-style approach will be the far more common use case. BenchMark.readGlossaryJParseWithEvents is included just for fun; in practice you would always use the regular JParse index overlay parser to build a DOM. We build a DOM with the event parser to test it thoroughly and to serve as an example of the event-based approach.

Both JParse DOM builders are faster than the equivalent DOM builder from Noggit: the event-based one by about 17%, and the regular index-overlay parser, substantially faster again, by about 88%.
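For readers who want to check the comparisons, the relative throughput can be worked out directly from the scores in the tables above. The snippet below is only arithmetic on those reported numbers; the class name `SpeedupRatios` and its `ratio` helper are made up for this illustration.

```java
import java.util.Locale;

/** Relative throughput computed from the benchmark scores reported above (ops/s). */
final class SpeedupRatios {

    /** How many times faster the candidate is than the baseline. */
    static double ratio(double candidate, double baseline) {
        return candidate / baseline;
    }

    public static void main(String[] args) {
        final double jparseEvents    = 1510822.955; // readGlossaryEventJParse
        final double jparseOverlay   =  996041.462; // readGlossaryJParse
        final double noggitEvents    =  960327.894; // readGlossaryNoggitEvent
        final double jparseEventDom  =  620481.661; // readGlossaryJParseWithEvents
        final double noggitDom       =  530800.854; // readWebGlossaryNoggitObjectBuilder

        // Event walking: JParse events vs. Noggit events -> 1.57x
        System.out.printf(Locale.ROOT, "%.2fx%n", ratio(jparseEvents, noggitEvents));
        // DOM via index overlay vs. Noggit events only -> 1.04x
        System.out.printf(Locale.ROOT, "%.2fx%n", ratio(jparseOverlay, noggitEvents));
        // DOM built from JParse events vs. Noggit ObjectBuilder -> 1.17x
        System.out.printf(Locale.ROOT, "%.2fx%n", ratio(jparseEventDom, noggitDom));
        // DOM via index overlay vs. Noggit ObjectBuilder -> 1.88x
        System.out.printf(Locale.ROOT, "%.2fx%n", ratio(jparseOverlay, noggitDom));
    }
}
```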