Event Parser update, JParse event driven vs. Noggit.
I wrote a StAX-style parser to compete against event-driven parsers.
Noggit claims to be the world's fastest streaming JSON parser for Java and is used by Solr. Noggit has a streaming API, which is a StAX/pull parser. I'd call it an event-driven parser.
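To make the pull versus callback distinction concrete, here is a minimal, self-contained sketch. It is not Noggit's or JParse's implementation: `EventStyles`, `PullScanner`, `Listener`, and the event ids are hypothetical names made up for illustration, and the toy scanner only reports object and array boundaries. The pull half mirrors the `nextEvent()` loop used for Noggit in the benchmarks, and the listener half mirrors the `start`/`end` callbacks used with JParse.

```java
public class EventStyles {
    // Hypothetical event ids for this sketch; not the constants used by Noggit or JParse.
    static final int OBJECT_START = 1, OBJECT_END = 2, ARRAY_START = 3, ARRAY_END = 4, EOF = -1;

    /** Returns the index just past the closing quote of the string that opens at {@code open}. */
    static int skipString(String json, int open) {
        int i = open + 1;
        while (i < json.length() && json.charAt(i) != '"') {
            i += json.charAt(i) == '\\' ? 2 : 1; // step over escaped characters
        }
        return i + 1;
    }

    /** Maps a structural character to an event id, or 0 if it is not structural. */
    static int eventFor(char c) {
        switch (c) {
            case '{': return OBJECT_START;
            case '}': return OBJECT_END;
            case '[': return ARRAY_START;
            case ']': return ARRAY_END;
            default:  return 0;
        }
    }

    /** Pull style: the caller drives the loop and asks for each event. */
    static final class PullScanner {
        private final String json;
        private int pos;

        PullScanner(String json) { this.json = json; }

        int nextEvent() {
            while (pos < json.length()) {
                char c = json.charAt(pos);
                if (c == '"') { pos = skipString(json, pos); continue; }
                pos++;
                int event = eventFor(c);
                if (event != 0) return event;
            }
            return EOF;
        }
    }

    /** Push style: the scanner drives the loop and calls back into a listener. */
    interface Listener {
        void start(int tokenId, int index);
        void end(int tokenId, int index);
    }

    static void scan(String json, Listener listener) {
        int pos = 0;
        while (pos < json.length()) {
            char c = json.charAt(pos);
            if (c == '"') { pos = skipString(json, pos); continue; }
            int event = eventFor(c);
            if (event == OBJECT_START || event == ARRAY_START) listener.start(event, pos);
            else if (event == OBJECT_END || event == ARRAY_END) listener.end(event, pos);
            pos++;
        }
    }

    public static void main(String[] args) {
        String json = "{\"a\": [1, 2, {\"b\": \"}\"}]}";

        // Pull: loop until the scanner reports EOF.
        PullScanner scanner = new PullScanner(json);
        int pulled = 0;
        while (scanner.nextEvent() != EOF) pulled++;

        // Push: hand over a listener and let the scanner call it.
        final int[] counts = new int[2];
        scan(json, new Listener() {
            @Override public void start(int tokenId, int index) { counts[0]++; }
            @Override public void end(int tokenId, int index) { counts[1]++; }
        });

        System.out.println("pull events: " + pulled);
        System.out.println("push starts: " + counts[0] + ", ends: " + counts[1]);
    }
}
```

Both halves walk the same characters. The difference is only who owns the loop: with the pull scanner the caller can stop early, while with the listener the scanner keeps control until the input is exhausted.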
Oddly, when testing against Noggit, I noticed that the JParse index overlay parser was faster than just walking the events with Noggit and not building a DOM.
But what if you need an event parser? If JParse can build a functioning DOM using an index overlay faster than Noggit can walk the events, then an event-based JParse parser should scream.
// Read the glossary example from JSON dot org with the index overlay using JParse.
@Benchmark
public void readGlossaryJParse(Blackhole bh) {
    bh.consume(new JsonParser().parse(glossaryJsonData));
}

// Read the glossary example from JSON dot org while building an index overlay using JParse in event-driven mode.
@Benchmark
public void readGlossaryJParseWithEvents(Blackhole bh) {
    bh.consume(new JsonEventParser().parse(glossaryJsonData));
}

// Read the glossary example from JSON dot org with Noggit and just listen to events.
@Benchmark
public void readGlossaryNoggitEvent(Blackhole bh) throws Exception {
    final var jsonParser = new JSONParser(glossaryJsonData);
    int event = -1;
    // Pull events until the parser reports the end of the input.
    while (event != JSONParser.EOF) {
        event = jsonParser.nextEvent();
    }
    bh.consume(event);
}

// Read the glossary example from JSON dot org with JParse and just listen to events.
@Benchmark
public void readGlossaryEventJParse(Blackhole bh) throws Exception {
    final var jsonParser = new JsonEventParser();
    // Holds the last token id seen so the callbacks do some observable work.
    final int[] token = new int[1];
    final var events = new TokenEventListener() {
        @Override
        public void start(int tokenId, int index, CharSource source) {
            token[0] = tokenId;
        }

        @Override
        public void end(int tokenId, int index, CharSource source) {
            token[0] = tokenId;
        }
    };
    jsonParser.parse(glossaryJsonData, events);
    bh.consume(token);
}

// Build a usable DOM with Noggit.
@Benchmark
public void readWebGlossaryNoggitObjectBuilder(Blackhole bh) throws Exception {
    bh.consume(ObjectBuilder.fromJSON(glossaryJsonData));
}
Benchmark                                      Mode  Cnt        Score   Error  Units
BenchMark.readGlossaryEventJParse             thrpt    2  1510822.955          ops/s
BenchMark.readGlossaryJParse                  thrpt    2   996041.462          ops/s
BenchMark.readGlossaryNoggitEvent             thrpt    2   960327.894          ops/s
As stated, the JParse index overlay parser, which builds a usable DOM, is faster than Noggit just walking the events and building nothing (about 4% higher throughput in this run). The JParse event-driven parser is substantially faster than Noggit doing the same event walk: roughly 1.57 times the throughput.
Benchmark                                      Mode  Cnt        Score   Error  Units
BenchMark.readGlossaryJParse                  thrpt    2   996041.462          ops/s
BenchMark.readGlossaryJParseWithEvents        thrpt    2   620481.661          ops/s
BenchMark.readWebGlossaryNoggitObjectBuilder  thrpt    2   530800.854          ops/s
The far more common use case will be a DOM-style approach.
BenchMark.readGlossaryJParseWithEvents is included just for fun. In practice, you would always use the regular JParse index overlay parser to build a DOM.
We build a DOM with the event parser to test it thoroughly and as an example of using the event-based approach.
Both JParse index overlay builders are faster than the equivalent DOM builder from Noggit: the event-based one by about 17%, and the regular JParse index-overlay parser by about 88%.
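As a rough illustration of how a DOM can be assembled from start and end events, here is a minimal sketch under stated assumptions. It is not how JParse builds its index overlay: `EventDomSketch`, `Node`, `TreeBuilder`, and the simplified `Listener` are hypothetical names, the toy scanner only reacts to braces and brackets, it does not handle those characters inside string literals, and it assumes well-formed input. What it does share with the index overlay idea is that each node records positions in the source instead of copying values.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class EventDomSketch {
    /** Hypothetical, simplified listener; the one in the benchmarks also receives a token id and a CharSource. */
    interface Listener {
        void start(int index);
        void end(int index);
    }

    /** A node records where a container starts and ends in the source, plus its child containers. */
    static final class Node {
        final int startIndex;
        int endIndex = -1;
        final List<Node> children = new ArrayList<>();

        Node(int startIndex) { this.startIndex = startIndex; }
    }

    /** Builds a tree from start/end events by keeping the currently open containers on a stack. */
    static final class TreeBuilder implements Listener {
        private final Deque<Node> open = new ArrayDeque<>();
        Node root;

        @Override
        public void start(int index) {
            Node node = new Node(index);
            if (open.isEmpty()) {
                root = node;                     // first container becomes the root
            } else {
                open.peek().children.add(node);  // attach to the innermost open container
            }
            open.push(node);
        }

        @Override
        public void end(int index) {
            // Assumes well-formed input: every end event has a matching start.
            open.pop().endIndex = index;
        }
    }

    /** Toy scanner for this sketch: emits events for braces and brackets only. */
    static void scan(String text, Listener listener) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '[' || c == '{') listener.start(i);
            else if (c == ']' || c == '}') listener.end(i);
        }
    }

    public static void main(String[] args) {
        String json = "{\"a\": [1, [2]], \"b\": {}}";
        TreeBuilder builder = new TreeBuilder();
        scan(json, builder);

        Node root = builder.root;
        System.out.println("root spans " + root.startIndex + ".." + root.endIndex
                + " with " + root.children.size() + " child containers");
    }
}
```

The stack is the only state the builder needs: a start event pushes a node, an end event pops it, and the node on top of the stack is always the parent of whatever starts next.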