-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Encode with ensure_ascii=True and then fix #9
Conversation
2b9c51d
to
8ed84b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming the handling of the surrogate pairs are correct. It might be nice to have some tests for those cases too, potentially testing _unascii directly.
This turns out to be way quicker. It also allows us to fix #2.
8ed84b6
to
6eb904b
Compare
Well, we do test the surrogate pairs, via the encoding of the smiley poo (https://github.com/matrix-org/python-canonicaljson/pull/9/files#diff-a9559bb86637bd1d66d1a37f0b48a732R35). I feel like this is sufficient? |
Ah, my bad, yup that's fine |
I really wish I'd kept a record of why I thought that this was faster than just using (I also don't remember how I benchmarked it) |
ohhh, I found https://github.com/richvdh/json_benchmark. Benchmarks against simplejson 3.17.0: (seconds per loop: smaller is better)
conclusion: this approach is still substantially faster, although the difference is less marked than it used to be under python 2.7. The main difference (I think) is that simplejson has an optimised implementation of (the simplejson folks have in the past said they'd more than happy to accept new optimisations, so we could certainly send them a PR for an optimised version of |
Having said all that, repeating the py3.7.5 benchmarks with stdlib json instead of simplejson:
so yeah. #23. |
This turns out to be way quicker. It also allows us to fix #2.
(includes #7 and #3)Fixes #2.