
Benchmark with GB/s columns #33

Merged: 1 commit merged into luizperes:master on Apr 10, 2020
Conversation

@nojvek (Contributor) commented Apr 10, 2020

Fixes #21

@lemire / @luizperes I am not seeing the > 1 GB/s results, though. I'm not sure what the Node.js overhead is, or whether my 2018 MacBook Pro is just too slow.

@lemire (Contributor) commented Apr 10, 2020

I think that @croteaucarine is getting > 1 GB/s speeds, but she is using a different approach.

https://github.com/croteaucarine/simdjson_node_objectwrap

@luizperes (Owner)

Hi @nojvek, it is possible that there is still a lot of overhead. lazyParse and isValid should take around the same time, because they basically do the same work. There is already an issue open for that: #28. I will take a look at it carefully. I am also watching @croteaucarine, as she is working on https://github.com/croteaucarine/simdjson_node_objectwrap for her master's thesis.

@luizperes (Owner)

@lemire I see, I will take a closer look at her repo next week (when I will probably have a little more time).

CC @nojvek

@luizperes (Owner)

Thanks for your work, @nojvek!

@luizperes merged commit c1d3b09 into luizperes:master on Apr 10, 2020
@nojvek (Contributor, Author) commented Apr 10, 2020

It seems std::move has a significant performance cost. I wonder if there is a way to work around it.

```cpp
// Experiment: hand out a pointer to the parser-owned document instead of
// moving it into a heap allocation; the finalizer deliberately frees nothing.
Napi::External<dom::document> buffer = Napi::External<dom::document>::New(
    env, &parser.doc,
    [](Napi::Env /*env*/, dom::document * doc) {
      // delete doc;
    });
```

I tried this and got > 1 GB/s (almost double what I got before). I know the code above is not really valid, as it will give SIGSEGV errors, but I just wanted to try something surgical.

| filename | filesize (MB) | JSON.parse (GB/s) | simdjson.lazyParse (GB/s) | X faster |
| --- | --- | --- | --- | --- |
| apache_builds.json | 0.13 | 0.38 | 1.53 | 4.00 |
| canada.json | 2.25 | 0.16 | 0.57 | 3.44 |
| citm_catalog.json | 1.73 | 0.51 | 0.32 | 0.62 |
| github_events.json | 0.07 | 0.44 | 1.21 | 2.79 |
| gsoc_2018.json | 3.33 | 0.72 | 1.20 | 1.66 |
| instruments.json | 0.22 | 0.48 | 1.40 | 2.95 |
| marine_ik.json | 2.98 | 0.24 | 0.54 | 2.25 |
| mesh_pretty.json | 1.58 | 0.40 | 0.92 | 2.31 |
| mesh.json | 0.72 | 0.25 | 0.75 | 3.01 |
| numbers.json | 0.15 | 0.25 | 0.83 | 3.25 |
| random.json | 0.51 | 0.28 | 0.30 | 1.09 |
| twitter.json | 0.63 | 0.41 | 0.34 | 0.85 |
| twitterescaped.json | 0.56 | 0.31 | 1.06 | 3.40 |
| update_center.json | 0.53 | 0.28 | 0.32 | 1.14 |

@luizperes (Owner)

I see, that makes a lot of sense! When you do it that way, is the performance the same as isValid's?

@nojvek (Contributor, Author) commented Apr 10, 2020

| filename | filesize (MB) | lazyParse (GB/s) | isValid (GB/s) |
| --- | --- | --- | --- |
| apache_builds.json | 0.13 | 0.69 | 1.36 |
| canada.json | 2.25 | 0.39 | 0.39 |
| citm_catalog.json | 1.73 | 0.27 | 0.28 |
| github_events.json | 0.07 | 0.73 | 1.14 |
| gsoc_2018.json | 3.33 | 0.89 | 1.03 |
| instruments.json | 0.22 | 0.66 | 1.27 |
| marine_ik.json | 2.98 | 0.41 | 0.45 |
| mesh_pretty.json | 1.58 | 0.76 | 0.92 |
| mesh.json | 0.72 | 0.51 | 0.66 |
| numbers.json | 0.15 | 0.56 | 0.79 |
| random.json | 0.51 | 0.26 | 0.30 |
| twitter.json | 0.63 | 0.31 | 0.34 |
| twitterescaped.json | 0.56 | 0.73 | 0.83 |
| update_center.json | 0.53 | 0.28 | 0.32 |

It seems the overhead adds up more for small files than for big ones. I assume that's because the benchmark runs more ops for smaller files than for bigger ones.

@luizperes (Owner)

Not only that: it seems to be the cost of doing new dom::document(std::move(parser.doc)) here. Notice that the values you get with isValid are very similar to the values you get with lazyParse without the std::move. Very interesting indeed.
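
For reference, a minimal sketch of that variant (assuming the simdjson 0.3-era DOM API used in this repo's binding):

```cpp
// Move the parser's document onto the heap so the JS wrapper owns it,
// at the cost of one allocation plus the move per parse.
dom::document *heapDoc = new dom::document(std::move(parser.doc));
Napi::External<dom::document> buffer = Napi::External<dom::document>::New(
    env, heapDoc,
    [](Napi::Env /*env*/, dom::document *doc) {
      delete doc;  // the GC finalizer frees the moved-out document
    });
```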

@lemire (Contributor) commented Apr 11, 2020

Ping @jkeiser: you may want to check these numbers and the discussion on the cost of std::move.

@jkeiser (Contributor) commented Apr 11, 2020

I'm wondering if it's the new rather than the move? There should only be 2 pointers to move here; I would have expected it to be literally 1-2 loads/stores.

We could try a few things:

  1. Is there a way to create a 2-pointer-size value? I don't think I saw one.
  2. Can you try to do this without the new? That is, make sure the benchmark doesn't try to keep two documents around, make a global static document doc, assign the result of the parse directly into that, and just take &doc (see the sketch after this list).
  3. Provide an interface that lets the user manage a parser object, and use memory from that; don't make any other JS objects besides the parser.
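
A minimal sketch of option 2, assuming the 0.3 DOM API (parser.doc, move-assignable dom::document); lazyParseSketch is a hypothetical name, and this is only safe if the benchmark never keeps two documents alive at once:

```cpp
#include <napi.h>

#include "simdjson.h"

using namespace simdjson;

static dom::parser parser;  // reused across calls
static dom::document doc;   // one global document, so no new per parse

Napi::Value lazyParseSketch(const Napi::CallbackInfo &info) {
  Napi::Env env = info.Env();
  std::string json = info[0].As<Napi::String>();

  auto error = parser.parse(json).error();  // fills parser.doc
  if (error) {
    Napi::Error::New(env, error_message(error)).ThrowAsJavaScriptException();
    return env.Null();
  }

  doc = std::move(parser.doc);  // a couple of pointer swaps, no allocation
  return Napi::External<dom::document>::New(
      env, &doc,
      [](Napi::Env /*env*/, dom::document * /*doc*/) {
        // nothing to delete: the document is a global static
      });
}
```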

@lemire all of this makes a reasonable case for the document to be rooted in a single pointer. Bindings are far more likely to naively support storing one raw pointer than two.

@luizperes (Owner)

By creating an empty document

```cpp
// No std::move here: heap-allocate a fresh, empty document and let the
// finalizer free it when the JS wrapper is collected.
Napi::External<dom::document> buffer = Napi::External<dom::document>::New(
    env, new dom::document(),
    [](Napi::Env /*env*/, dom::document * doc) {
      delete doc;
    });
```

I get:

| filename | filesize (MB) | JSON.parse (ms) | simdjson.lazyParse (ms) | JSON.parse (GB/s) | simdjson.lazyParse (GB/s) | X faster |
| --- | --- | --- | --- | --- | --- | --- |
| apache_builds.json | 0.13 | 0.698 | 0.116 | 0.18 | 1.09 | 6.00 |
| canada.json | 2.25 | 27.311 | 8.774 | 0.08 | 0.26 | 3.11 |
| citm_catalog.json | 1.73 | 20.584 | 8.044 | 0.08 | 0.21 | 2.56 |
| github_events.json | 0.07 | 0.724 | 0.276 | 0.09 | 0.24 | 2.62 |
| gsoc_2018.json | 3.33 | 18.987 | 3.416 | 0.18 | 0.97 | 5.56 |
| instruments.json | 0.22 | 0.963 | 0.212 | 0.23 | 1.04 | 4.54 |
| marine_ik.json | 2.98 | 24.261 | 7.476 | 0.12 | 0.40 | 3.25 |
| mesh_pretty.json | 1.58 | 7.439 | 2.517 | 0.21 | 0.63 | 2.96 |
| mesh.json | 0.72 | 5.941 | 1.286 | 0.12 | 0.56 | 4.62 |
| numbers.json | 0.15 | 1.124 | 0.257 | 0.13 | 0.58 | 4.37 |
| random.json | 0.51 | 9.979 | 2.308 | 0.05 | 0.22 | 4.32 |
| sf_citylots.json | 189.78 | 2319.485 | 903.317 | 0.08 | 0.21 | 2.57 |
| twitter.json | 0.63 | 9.038 | 2.687 | 0.07 | 0.24 | 3.36 |
| twitterescaped.json | 0.56 | 3.256 | 0.740 | 0.17 | 0.76 | 4.40 |
| update_center.json | 0.53 | 7.899 | 2.254 | 0.07 | 0.24 | 3.50 |

So, unless I got something wrong, it is not the new operator?

@luizperes (Owner)

I took a look at this problem again today, and it really is the std::move (in my case). I also checked @croteaucarine's repo, and she is using it as well. I am not sure how she's achieving such high performance; the only difference between our repos is that she uses a C++ class to deal with the problem, while I don't. She is also using a different model for her benchmark (I will take another look at it later).

@croteaucarine

@luizperes I did a lot of research and testing to find out that the conversion between C++ objects and Napi objects really takes a lot of time and must be avoided as much as possible. Even passing the JSON doc as a string and converting it to a Napi::String before calling the simdjson parser has a considerable impact on performance.
In my repo, I only parse the document with the simdjson parser in the constructor and keep it for later use (no lazyParse). This is probably why my benchmarks perform better. The constructor can also take a path or a document, depending on the situation.
I have also overridden the most-used JS methods (toString, length, keys, iterator, etc.) on the C++ side and try to make as few Napi conversions as possible. Once a method is called, the result is stored for later use, so subsequent calls should be faster. What my benchmarks don't show is that some of these methods are a little slower than the native ones. These results will be presented in my thesis in a couple of weeks.
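
As an illustration of that pattern (a simplified sketch, not the actual simdjson_node_objectwrap code; the JsonDoc class and its caching are hypothetical): a Napi::ObjectWrap that parses once in the constructor and caches the toString result on the C++ side, so the simdjson work happens at most once per document:

```cpp
#include <napi.h>

#include <sstream>
#include <string>

#include "simdjson.h"

class JsonDoc : public Napi::ObjectWrap<JsonDoc> {
 public:
  static Napi::Function Init(Napi::Env env) {
    return DefineClass(env, "JsonDoc",
                       {InstanceMethod("toString", &JsonDoc::ToString)});
  }

  JsonDoc(const Napi::CallbackInfo &info) : Napi::ObjectWrap<JsonDoc>(info) {
    std::string json = info[0].As<Napi::String>();
    try {
      root_ = parser_.parse(json);  // parse once, up front
    } catch (const simdjson::simdjson_error &e) {
      Napi::Error::New(info.Env(), e.what()).ThrowAsJavaScriptException();
    }
  }

 private:
  Napi::Value ToString(const Napi::CallbackInfo &info) {
    if (!cached_) {
      std::ostringstream out;
      out << root_;  // dom::element streams itself back out as JSON
      str_ = out.str();
      cached_ = true;
    }
    // Creating the JS string is still a conversion, but the simdjson
    // serialization above now runs only once per document.
    return Napi::String::New(info.Env(), str_);
  }

  simdjson::dom::parser parser_;  // owns the document's memory
  simdjson::dom::element root_;
  std::string str_;
  bool cached_ = false;
};

static Napi::Object InitModule(Napi::Env env, Napi::Object exports) {
  exports.Set("JsonDoc", JsonDoc::Init(env));
  return exports;
}

NODE_API_MODULE(jsondoc, InitModule)
```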

@jkeiser (Contributor) commented Apr 16, 2020

@luizperes maybe the right thing to do (at least for now) is to return to the old way of using new parser to construct the parser, and then just attach a pointer to the parser instead of the document ... and we can try to make this scenario more performant with changes in 0.4.
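
Roughly, a sketch of that (same 0.3 DOM API assumptions as above; json is the input string, error handling elided):

```cpp
// One heap-allocated parser per parsed document; JS holds the parser
// itself, so the document is never moved out of it.
dom::parser *p = new dom::parser();
auto error = p->parse(json).error();  // the document lives inside *p
Napi::External<dom::parser> external = Napi::External<dom::parser>::New(
    env, p,
    [](Napi::Env /*env*/, dom::parser *parser) { delete parser; });
// Caveat: reusing this parser would overwrite its document, hence the
// per-parse allocation that #35 proposes to amortize.
```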

@jkeiser (Contributor) commented Apr 16, 2020

I made a PR for that: #36. I'm not sure it really helps; some numbers are up and some are down, which makes me suspect we just have a wide measurement error margin.

Also filed #35 for a way to amortize a lot of the allocation cost and use simdjson a little bit more idiomatically.

@luizperes (Owner)

Thanks for your help, @jkeiser! Yeah, as I mentioned on #36, it seems that this approach did the opposite (it seems a little more costly). As for the error margin, I don't think I understand what you mean by "some numbers are up and some are down"; could you explain it a little better?

#35 seems interesting :)

@jkeiser (Contributor) commented Apr 16, 2020

> As for the error margin, I don't think I understand what you mean by "some numbers are up and some are down"; could you explain it a little better?

In my results there, the PR actually makes some files faster and others slower. I was blaming error margin and jitter in the results because I couldn't explain it (everything in the change should be fixed per-document overhead), but after running it a few more times, it might be real.
