
Benchmark with GB/s columns #33

Merged: 1 commit merged into luizperes:master on Apr 10, 2020
Conversation

@nojvek (Contributor) commented Apr 10, 2020

Fixes #21

@lemire / @luizperes I am not seeing the > 1 GB/s results, though. I'm not sure what the Node.js overhead is, or whether my 2018 MacBook Pro is just too slow.

@lemire (Contributor) commented Apr 10, 2020

I think that @croteaucarine is getting > 1 GB/s speeds, but she is using a different approach.

https://github.com/croteaucarine/simdjson_node_objectwrap

@luizperes (Owner)

Hi @nojvek, it is possible that there is still a lot of overhead. lazyParse and isValid should take around the same time, because they basically do the same work. There is already an issue open for that: #28. I will take a look at it carefully. I am also watching @croteaucarine, as she is working on https://github.com/croteaucarine/simdjson_node_objectwrap for her master's thesis.

@luizperes (Owner)

@lemire I see, I will take a closer look at her repo next week (when I will probably have a little more time).

CC @nojvek

@luizperes (Owner)

Thanks for your work, @nojvek!

@luizperes merged commit c1d3b09 into luizperes:master on Apr 10, 2020
@nojvek (Contributor, Author) commented Apr 10, 2020

It seems std::move has a significant performance cost. I wonder if there is a way to work around it.

```cpp
// Experiment: hand out a pointer to the parser-owned document instead of
// moving it into a heap allocation; the finalizer deliberately frees nothing.
Napi::External<dom::document> buffer = Napi::External<dom::document>::New(
    env, &parser.doc,
    [](Napi::Env /*env*/, dom::document * doc) {
      // delete doc;
    });
```

I tried this and got > 1 GB/s (almost double what I got before). I know the code above is not really valid, as it will give SIGSEGV errors, but I just wanted to try something surgical.

| filename | filesize (MB) | JSON.parse (GB/s) | simdjson.lazyParse (GB/s) | X faster |
| --- | --- | --- | --- | --- |
| apache_builds.json | 0.13 | 0.38 | 1.53 | 4.00 |
| canada.json | 2.25 | 0.16 | 0.57 | 3.44 |
| citm_catalog.json | 1.73 | 0.51 | 0.32 | 0.62 |
| github_events.json | 0.07 | 0.44 | 1.21 | 2.79 |
| gsoc_2018.json | 3.33 | 0.72 | 1.20 | 1.66 |
| instruments.json | 0.22 | 0.48 | 1.40 | 2.95 |
| marine_ik.json | 2.98 | 0.24 | 0.54 | 2.25 |
| mesh_pretty.json | 1.58 | 0.40 | 0.92 | 2.31 |
| mesh.json | 0.72 | 0.25 | 0.75 | 3.01 |
| numbers.json | 0.15 | 0.25 | 0.83 | 3.25 |
| random.json | 0.51 | 0.28 | 0.30 | 1.09 |
| twitter.json | 0.63 | 0.41 | 0.34 | 0.85 |
| twitterescaped.json | 0.56 | 0.31 | 1.06 | 3.40 |
| update_center.json | 0.53 | 0.28 | 0.32 | 1.14 |

@luizperes (Owner)

I see, that makes a lot of sense! When you do it that way, is the performance the same as isValid's?

@nojvek (Contributor, Author) commented Apr 10, 2020

| filename | filesize (MB) | lazyParse (GB/s) | isValid (GB/s) |
| --- | --- | --- | --- |
| apache_builds.json | 0.13 | 0.69 | 1.36 |
| canada.json | 2.25 | 0.39 | 0.39 |
| citm_catalog.json | 1.73 | 0.27 | 0.28 |
| github_events.json | 0.07 | 0.73 | 1.14 |
| gsoc_2018.json | 3.33 | 0.89 | 1.03 |
| instruments.json | 0.22 | 0.66 | 1.27 |
| marine_ik.json | 2.98 | 0.41 | 0.45 |
| mesh_pretty.json | 1.58 | 0.76 | 0.92 |
| mesh.json | 0.72 | 0.51 | 0.66 |
| numbers.json | 0.15 | 0.56 | 0.79 |
| random.json | 0.51 | 0.26 | 0.30 |
| twitter.json | 0.63 | 0.31 | 0.34 |
| twitterescaped.json | 0.56 | 0.73 | 0.83 |
| update_center.json | 0.53 | 0.28 | 0.32 |

It seems the overhead adds up more for small files than for big ones. I assume that's because the benchmark runs more ops for smaller files than for bigger ones.

@luizperes (Owner)

Not only that: it seems to be the cost of doing new dom::document(std::move(parser.doc)) here. Notice that the values you get with isValid are very similar to the values you get with lazyParse without the std::move. Very interesting indeed.
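
For reference, a minimal sketch of that variant (assuming the simdjson 0.3-era DOM API used in this repo's binding):

```cpp
// Move the parser's document onto the heap so the JS wrapper owns it,
// at the cost of one allocation plus the move per parse.
dom::document *heapDoc = new dom::document(std::move(parser.doc));
Napi::External<dom::document> buffer = Napi::External<dom::document>::New(
    env, heapDoc,
    [](Napi::Env /*env*/, dom::document *doc) {
      delete doc;  // the GC finalizer frees the moved-out document
    });
```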

@lemire (Contributor) commented Apr 11, 2020

Ping @jkeiser: you may want to check these numbers and the discussion on the cost of std::move.

@jkeiser (Contributor) commented Apr 11, 2020

I'm wondering if it's the new rather than the move? There should only be 2 pointers to move here; I would have expected it to be literally 1-2 loads/stores.

We could try a few things:

  1. Is there a way to create a 2-pointer-size value? I don't think I saw one.
  2. Can you try to do this without the new? That is, make sure the benchmark doesn't try to keep two documents around, make a global static document doc, assign the result of the parse directly into that, and just take &doc (see the sketch after this list).
  3. Provide an interface that lets the user manage a parser object, and use memory from that; don't make any other JS objects besides the parser.
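
A minimal sketch of option 2, assuming the 0.3 DOM API (parser.doc, move-assignable dom::document); lazyParseSketch is a hypothetical name, and this is only safe if the benchmark never keeps two documents alive at once:

```cpp
#include <napi.h>

#include "simdjson.h"

using namespace simdjson;

static dom::parser parser;  // reused across calls
static dom::document doc;   // one global document, so no new per parse

Napi::Value lazyParseSketch(const Napi::CallbackInfo &info) {
  Napi::Env env = info.Env();
  std::string json = info[0].As<Napi::String>();

  auto error = parser.parse(json).error();  // fills parser.doc
  if (error) {
    Napi::Error::New(env, error_message(error)).ThrowAsJavaScriptException();
    return env.Null();
  }

  doc = std::move(parser.doc);  // a couple of pointer swaps, no allocation
  return Napi::External<dom::document>::New(
      env, &doc,
      [](Napi::Env /*env*/, dom::document * /*doc*/) {
        // nothing to delete: the document is a global static
      });
}
```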

@lemire all of this makes a reasonable case for the document to be rooted in a single pointer. Bindings are far more likely to naively support storing one raw pointer than two.

@luizperes (Owner)

By creating an empty document

```cpp
// No std::move here: heap-allocate a fresh, empty document and let the
// finalizer free it when the JS wrapper is collected.
Napi::External<dom::document> buffer = Napi::External<dom::document>::New(
    env, new dom::document(),
    [](Napi::Env /*env*/, dom::document * doc) {
      delete doc;
    });
```

I get:

| filename | filesize (MB) | JSON.parse (ms) | simdjson.lazyParse (ms) | JSON.parse (GB/s) | simdjson.lazyParse (GB/s) | X faster |
| --- | --- | --- | --- | --- | --- | --- |
| apache_builds.json | 0.13 | 0.698 | 0.116 | 0.18 | 1.09 | 6.00 |
| canada.json | 2.25 | 27.311 | 8.774 | 0.08 | 0.26 | 3.11 |
| citm_catalog.json | 1.73 | 20.584 | 8.044 | 0.08 | 0.21 | 2.56 |
| github_events.json | 0.07 | 0.724 | 0.276 | 0.09 | 0.24 | 2.62 |
| gsoc_2018.json | 3.33 | 18.987 | 3.416 | 0.18 | 0.97 | 5.56 |
| instruments.json | 0.22 | 0.963 | 0.212 | 0.23 | 1.04 | 4.54 |
| marine_ik.json | 2.98 | 24.261 | 7.476 | 0.12 | 0.40 | 3.25 |
| mesh_pretty.json | 1.58 | 7.439 | 2.517 | 0.21 | 0.63 | 2.96 |
| mesh.json | 0.72 | 5.941 | 1.286 | 0.12 | 0.56 | 4.62 |
| numbers.json | 0.15 | 1.124 | 0.257 | 0.13 | 0.58 | 4.37 |
| random.json | 0.51 | 9.979 | 2.308 | 0.05 | 0.22 | 4.32 |
| sf_citylots.json | 189.78 | 2319.485 | 903.317 | 0.08 | 0.21 | 2.57 |
| twitter.json | 0.63 | 9.038 | 2.687 | 0.07 | 0.24 | 3.36 |
| twitterescaped.json | 0.56 | 3.256 | 0.740 | 0.17 | 0.76 | 4.40 |
| update_center.json | 0.53 | 7.899 | 2.254 | 0.07 | 0.24 | 3.50 |

So, unless I got something wrong, it is not the new operator?

@luizperes (Owner)

I took a look at this problem again today, and it really is the std::move (in my case). I also checked @croteaucarine's repo, and she is using it as well. I am not sure how she's achieving such high performance; the only difference between our repos is that she uses a C++ class to deal with the problem, while I don't. She is also using a different model for her benchmark (I will take another look at it later).

@croteaucarine

@luizperes I did a lot of research and testing to find out that the conversion between C++ objects and Napi objects really takes a lot of time and must be avoided as much as possible. Even passing the JSON doc as a string and converting it to a Napi::String before calling the simdjson parser has a considerable impact on performance.
In my repo, I only parse the document with the simdjson parser in the constructor and keep it for later use (no lazyParse). This is probably why my benchmarks perform better. The constructor can also take a path or a document, depending on the situation.
I have also overridden the most-used JS methods (toString, length, keys, iterator, etc.) on the C++ side and try to make as few Napi conversions as possible. Once a method is called, the result is stored for later use, so subsequent calls should be faster. What my benchmarks don't show is that some of these methods are a little slower than the native ones. These results will be presented in my thesis in a couple of weeks.
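
As an illustration of that pattern (a simplified sketch, not the actual simdjson_node_objectwrap code; the JsonDoc class and its caching are hypothetical): a Napi::ObjectWrap that parses once in the constructor and caches the toString result on the C++ side, so the simdjson work happens at most once per document:

```cpp
#include <napi.h>

#include <sstream>
#include <string>

#include "simdjson.h"

class JsonDoc : public Napi::ObjectWrap<JsonDoc> {
 public:
  static Napi::Function Init(Napi::Env env) {
    return DefineClass(env, "JsonDoc",
                       {InstanceMethod("toString", &JsonDoc::ToString)});
  }

  JsonDoc(const Napi::CallbackInfo &info) : Napi::ObjectWrap<JsonDoc>(info) {
    std::string json = info[0].As<Napi::String>();
    try {
      root_ = parser_.parse(json);  // parse once, up front
    } catch (const simdjson::simdjson_error &e) {
      Napi::Error::New(info.Env(), e.what()).ThrowAsJavaScriptException();
    }
  }

 private:
  Napi::Value ToString(const Napi::CallbackInfo &info) {
    if (!cached_) {
      std::ostringstream out;
      out << root_;  // dom::element streams itself back out as JSON
      str_ = out.str();
      cached_ = true;
    }
    // Creating the JS string is still a conversion, but the simdjson
    // serialization above now runs only once per document.
    return Napi::String::New(info.Env(), str_);
  }

  simdjson::dom::parser parser_;  // owns the document's memory
  simdjson::dom::element root_;
  std::string str_;
  bool cached_ = false;
};

static Napi::Object InitModule(Napi::Env env, Napi::Object exports) {
  exports.Set("JsonDoc", JsonDoc::Init(env));
  return exports;
}

NODE_API_MODULE(jsondoc, InitModule)
```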

@jkeiser (Contributor) commented Apr 16, 2020

@luizperes maybe the right thing to do (at least for now) is to return to the old way of using new parser to construct the parser, and then just attach a pointer to the parser instead of the document ... and we can try to make this scenario more performant with changes in 0.4.
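
Roughly, a sketch of that (same 0.3 DOM API assumptions as above; json is the input string, error handling elided):

```cpp
// One heap-allocated parser per parsed document; JS holds the parser
// itself, so the document is never moved out of it.
dom::parser *p = new dom::parser();
auto error = p->parse(json).error();  // the document lives inside *p
Napi::External<dom::parser> external = Napi::External<dom::parser>::New(
    env, p,
    [](Napi::Env /*env*/, dom::parser *parser) { delete parser; });
// Caveat: reusing this parser would overwrite its document, hence the
// per-parse allocation that #35 proposes to amortize.
```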

@jkeiser (Contributor) commented Apr 16, 2020

I made a PR for that: #36. I'm not sure it really helps; some numbers are up and some are down, which makes me suspect we just have a wide measurement error margin.

Also filed #35 for a way to amortize a lot of the allocation cost and use simdjson a little bit more idiomatically.

@luizperes (Owner)

Thanks for your help, @jkeiser! Yeah, as I mentioned on #36, it seems that this approach did the opposite (it seems a little more costly). As for the error margin, I don't think I understand what you mean by "some numbers are up and some are down"; could you explain it a little better?

#35 seems interesting :)

@jkeiser (Contributor) commented Apr 16, 2020

> As for the error margin, I don't think I understand what you mean by "some numbers are up and some are down"; could you explain it a little better?

In my results there, the PR actually makes some files faster and others slower. I was blaming error margin and jitter in the results because I couldn't explain it (everything in the change should be fixed per-document overhead), but after running it a few more times, it might be real.
