enable SIMD when comparing to simd-json benchmarks #3

Open
Licenser opened this issue Jun 1, 2022 · 6 comments

Comments

@Licenser

Licenser commented Jun 1, 2022

Hi,

first of all, this is a really impressive creation. I love the trick of delaying number parsing, darn smart :D

That said, I would suggest enabling SIMD when comparing against simd-json; otherwise it's a bit of a pointless comparison. The best way to do that in a not too CPU-dependent way is to use RUSTFLAGS="-C target-feature=+avx,+avx2,+sse4.2"; those features are present on all modern x86 CPUs.
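To double-check that a benchmark binary was really built with those features, something like the following could be used. This is only a sketch: the function names `compiled_simd_features` and `cpu_has_avx2` are made up for illustration, while `cfg!(target_feature = ...)` and `std::arch::is_x86_feature_detected!` are standard Rust.

```rust
/// Lists which of the three suggested features this binary was *compiled* with.
/// `cfg!` is evaluated at compile time, so the result reflects RUSTFLAGS,
/// not the CPU the binary happens to run on.
fn compiled_simd_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    if cfg!(target_feature = "avx") {
        enabled.push("avx");
    }
    if cfg!(target_feature = "avx2") {
        enabled.push("avx2");
    }
    if cfg!(target_feature = "sse4.2") {
        enabled.push("sse4.2");
    }
    enabled
}

/// Runtime check: does the CPU executing this binary support AVX2?
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn cpu_has_avx2() -> bool {
    std::arch::is_x86_feature_detected!("avx2")
}

/// On non-x86 targets AVX2 does not exist, so report `false`.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn cpu_has_avx2() -> bool {
    false
}
```

Printing `compiled_simd_features()` at the start of a bench run makes it obvious whether the flags were picked up; without any RUSTFLAGS the list is typically empty on x86_64, since the baseline target only includes SSE2.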

@jorgecarleitao
Owner

Thanks!

There is a bench using it on the README:

string json_deserializer 2^18 time:   [10.106 ms 10.138 ms 10.173 ms]
string serde_json 2^18        time:   [23.177 ms 23.209 ms 23.243 ms]
string simd_json 2^18         time:   [10.924 ms 10.941 ms 10.959 ms]

# with `RUSTFLAGS='-C target-cpu=native'` (skylake in this case)
string simd_json 2^18         time:   [8.0735 ms 8.0887 ms 8.1046 ms]

Basically, with target-cpu=native, simd_json is ~20% faster for strings (the best case for this crate).

The background of this crate is jorgecarleitao/arrow2#1024 - the delay of number parsing came as a requirement from arrow2 since it allows the user to decide to which integer/float to parse them into.
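For readers unfamiliar with the idea, delayed number parsing can be sketched roughly like this. `LazyNumber` and `parse_as` are hypothetical names for illustration and not this crate's actual API; the sketch also assumes the text was already validated as a JSON number by the parser.

```rust
use std::str::FromStr;

/// Hypothetical: a JSON number kept as the raw text it was written with.
/// No numeric conversion happens while the document is being parsed.
#[derive(Debug, Clone, Copy, PartialEq)]
struct LazyNumber<'a>(&'a str);

impl<'a> LazyNumber<'a> {
    /// The caller picks the target type; `None` means the text does not fit it.
    /// Assumption: the text was already validated as a JSON number, because
    /// Rust's own parsers accept a few forms that JSON does not (e.g. "+1").
    fn parse_as<T: FromStr>(&self) -> Option<T> {
        self.0.parse().ok()
    }
}
```

With this shape, `LazyNumber("300").parse_as::<i64>()` gives `Some(300)`, while `parse_as::<u8>()` on the same value gives `None` because 300 overflows a `u8`. The consumer (arrow2 in this case) decides which of those it wants.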

I like that this crate forbids unsafe code. But even though it brings a 50% improvement over serde_json, 20% is still 20%, so it is a tradeoff that I am thinking about.

@Licenser
Author

Licenser commented Jun 2, 2022

Absolutely, and it works in non-SIMD-enabled environments, which is a big plus. Over in simd-lite/simd-json#218 we've been talking about having a fallback for cases where no SIMD acceleration is available; perhaps json-deserializer would be a better target than serde for that :)

I'm curious how you feel about implementing the value trait for the json-deserializer Value? It has been pretty nice for swapping between borrowed / owned value types.

@jorgecarleitao
Owner

wow, I am humbled by this - that would be awesome!

Note that this still does not support strings with escaped surrogate pairs. I need to spend some time on that (and find a JSON file that contains them).
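For reference, a JSON string such as `"\ud83d\ude00"` contains one surrogate pair, which decodes to a single code point (U+1F600). A minimal sketch of the combining step, assuming the two `\uXXXX` escapes have already been read as `u16` values (`combine_surrogates` is a hypothetical helper, not something in this crate):

```rust
/// Combines a UTF-16 high/low surrogate pair into one `char`.
/// Returns `None` if either half is outside its surrogate range.
fn combine_surrogates(high: u16, low: u16) -> Option<char> {
    let high_ok = (0xD800..=0xDBFF).contains(&high);
    let low_ok = (0xDC00..=0xDFFF).contains(&low);
    if !high_ok || !low_ok {
        return None;
    }
    // Each half carries 10 bits; the result is offset by 0x10000.
    let code = 0x10000 + ((u32::from(high) - 0xD800) << 10) + (u32::from(low) - 0xDC00);
    char::from_u32(code)
}
```

A lone or swapped surrogate returns `None` here; whether the parser should reject such input or substitute a replacement character is a separate design decision.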

Sorry, not very familiar with the ecosystem - could you clarify which "value trait" you mean?

@Licenser
Author

Licenser commented Jun 3, 2022

Oh sorry,

the value-trait is basically this repo. The idea is to allow users to have a trait that can represent different implementations for JSONesque values so libraries can be implemented w/o targeting a specific implementation.

It basically offers all the functions needed to access, manipulate and traverse the data structures.
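To illustrate the general idea (this is not the real value-trait API), here is a sketch with a made-up trait `JsonLike` and two made-up value types. A library function written against the trait works with either the owned or the borrowed representation.

```rust
/// Hypothetical trait covering a tiny part of what a JSON-like value offers.
trait JsonLike {
    fn as_str(&self) -> Option<&str>;
    fn get(&self, key: &str) -> Option<&Self>;
}

/// Hypothetical owned value: strings are copied out of the input.
enum OwnedValue {
    String(String),
    Object(Vec<(String, OwnedValue)>),
}

/// Hypothetical borrowed value: strings point into the input buffer.
enum BorrowedValue<'a> {
    String(&'a str),
    Object(Vec<(&'a str, BorrowedValue<'a>)>),
}

impl JsonLike for OwnedValue {
    fn as_str(&self) -> Option<&str> {
        match self {
            OwnedValue::String(s) => Some(s.as_str()),
            _ => None,
        }
    }
    fn get(&self, key: &str) -> Option<&Self> {
        match self {
            OwnedValue::Object(fields) => {
                fields.iter().find(|(k, _)| k.as_str() == key).map(|(_, v)| v)
            }
            _ => None,
        }
    }
}

impl<'a> JsonLike for BorrowedValue<'a> {
    fn as_str(&self) -> Option<&str> {
        match self {
            BorrowedValue::String(s) => Some(*s),
            _ => None,
        }
    }
    fn get(&self, key: &str) -> Option<&Self> {
        match self {
            BorrowedValue::Object(fields) => {
                fields.iter().find(|(k, _)| *k == key).map(|(_, v)| v)
            }
            _ => None,
        }
    }
}

/// Library code that does not care which implementation it receives.
fn name_of<V: JsonLike>(value: &V) -> Option<&str> {
    value.get("name")?.as_str()
}
```

Swapping between the two value types then needs no change in `name_of`, which is the property described above.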

@jorgecarleitao
Owner

jorgecarleitao commented Jun 4, 2022

Got it. Thanks! How do you recommend doing it? Feature-flag value-trait on this crate and expose the implementation?

@Licenser
Author

Licenser commented Jun 8, 2022

That's the way I'd go, yes 👍
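A rough sketch of what the feature-flagged layout could look like. It assumes the dependency is declared with `optional = true` in Cargo.toml, which by default makes Cargo expose a feature with the same name; the module name `value_trait_impl`, the placeholder `Value` struct and the helper `value_trait_support` are hypothetical.

```rust
/// Placeholder for the crate's core value type; always compiled.
pub struct Value;

/// Only compiled when the crate is built with `--features value-trait`.
/// The trait implementations for `Value` would live in here.
#[cfg(feature = "value-trait")]
mod value_trait_impl {
    // Hypothetical: `impl` blocks for the traits of the optional dependency.
}

/// Reports at compile time whether the optional integration was enabled.
pub fn value_trait_support() -> bool {
    cfg!(feature = "value-trait")
}
```

Users who do not enable the feature pay no extra compile time and get no extra dependency, while the implementation is still exposed for those who opt in.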
