-
-
Notifications
You must be signed in to change notification settings - Fork 6.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve dump_integer performance #1411
Conversation
The Appveyor is failing with following message:
Apparently they are experiencing some outage for now. Is there any way to re-run Appveyor CI for this PR after they resolve the issue? Its latest status is being reported here: https://status.appveyor.com/ |
Will do. Thanks for the PR! I am looking forward to having this performance improvement in the library! |
This patch looks pretty good and seems reasonable. My only concern is whether this patch would have any licensing implications. Was this patch developed from scratch or copied from somewhere? If it is copied from the video, was there anything in the video about the license that the code is released under? Just because it's on youtube doesn't necessarily make it public domain. |
The Youtube video was the motivation to incorporate such technique, but I didn't use the implementation in the video. As I mentioned, this patch is adapted from the implementation in fmtlib. It is licensed under BSD and I am not sure the extend of its implication. If there could be any, I can rewrite the implementation from scratch. |
I have rewritten the implementation with some optimizations along the way. The new implementation benchmark as following with
Compared to implementation adapted from I've also tried this implementation with a larger lookup table with 1000 per-calculated numbers (zero to 999) instead of 100 numbers (zero to 99) and it lead to a speedup of ~9% compared to Though I haven't ran benchmarks on other compilers, I hope there will be not a performance drop on them at least. We have three options to proceed:
|
As for the copyright issue, I would like to watch the video myself. I hope to find time for this on the weekend. |
(Any idea why the AppVeyor tests fail?) |
@nlohmann Thanks. I will wait for your response on how to proceed. As for Appveyor, I mentioned earlier that there was downtime in their servers when I submitted the PR, hence the failure. Here is more detail: https://status.appveyor.com/incidents/39tq9r92flzh Apparently there is no facility that automatically re-runs the CI. I wonder if it can be done manually? As workaround, I also can submit "noop" commit to trigger the CI. |
In fact I restarted the jobs and now there are actual failures, see https://ci.appveyor.com/project/nlohmann/json/builds/21351818/job/4y0907w9i0ri4dxv. |
(And BSD and MIT should be no problem IIRC) |
Sorry I missed that. It seems there is an overflow issue with the unit tests I have added. For instance:
causes its corresponding test to fail:
It is failing on MinGW-W64 7.3.0 compiler. It might be due to x86/x64 or invalid use of |
Now I watched the video and I think there is no issue with copyrights. The video presents ideas and exemplifies them with code snippets. However, I did not find a source saying that it is OK to re-license BSD-licensed code under a MIT license. |
I didn't find any sources either. In such cases where two open-source projects have different (and perhaps incompatible) licensing, the safest approach seems to be incorporating the source license into the destination project. However, that's too much burden for us in this case. As mentioned eariler, I've already re-implemented the technique from scratch which actually has a slight performance improvement. Is it fine to add this implementation as a commit on top of previous commits on this pull request? Or it's better to open a new pull request that drops this commit from history which in some parts contains code from fmtlib? |
You can just rebase and fixup your commits, no need to open a new PR |
I've push the new changes which are present in this commit: 7a6102f. The other commits are untouched and same as before. |
I like the algorithmic improvement, which is pretty much in line with Andrei Alexandrescu's work at Facebook. It's always good to squeeze a couple of instructions out of simple operations. My concern, however, is not about the quality of the submission, but the structure of code. Notably, the duplication of code, |
The only reason could be my limited knowledge of the codebase. Can you point out where is the implementation? I couldn't find such method in BTW thank you for the feedback! Please let me know if you have other suggestions. |
@justinasvd Could you please provide a link to the duplicated code? I can't find it myself, but it's been a while that I worked on the serializer. |
I just ran the benchmarks myself and can confirm a significant performance improvement. Well done! For the licensing part: We must not take code with a non-MIT license. This would make this library a time-bomb for future use. As long as the code was only motivated by the YouTube video, I see no problems. But anything from fmtlib should go out. @nickaein Can you realize this? |
You're welcome! Thank you for this great library! I'm aware that the speed is not the top design goal so there could be the ways that improves the performance without affecting public API or causes a major impact on library design. As for licensing, I understand the significance of possible license violation. The initial implementation was partially adapted from fmtlib (e.g. Nevertheless, I have re-written the implementation. The current code isn't derived from fmtlib in any way (neither in its design nor in implementation details). |
…2ascii This commits implements a faster int2ascii inspired by "Fastware" talk given by Andrei Alexandrescu. See: https://www.youtube.com/watch?v=o4-CwDo2zpg
This benchmark is a sample of 1 million "small" integers in range [-1000000 1000000) sampled from uniform distribution.
Add some unit tests for formatting integers to keep code coverage as before.
Looks good to me! |
Thanks a lot! |
This PR improves the performance of
dump_integer
using a tuned int2ascii implementation. The technique is discussed in Fastware talk by Andrei Alexandrescu and has been already incorporated in some of other well-known libraries like fmtlib.In this pull request:
The benchmarks show an considerable performance improvement (~1.5x) on my VM linux. This could be different (and probably higher) in a non-VM environment. Here is the original performance of Dump* operations (Ubuntu 18.04 x64, Clang 6):
And this is the performance with fast int2ascii method (note the last six benchmarks):
More detailed benchmark results (including gcc, msvc) is available here: https://docs.google.com/spreadsheets/d/1sGE3UvLeLT_KrnC_Ujvtl_NFXPIp28xFOe9u9-6GSU0/edit?usp=sharing
This is my first contribution so any feedback is welcome.
Pull request checklist
Read the Contribution Guidelines for detailed information.
include/nlohmann
directory, runmake amalgamate
to create the single-header filesingle_include/nlohmann/json.hpp
. The whole process is described here.Please don't
#ifdef
s or other means.