Use fasthttp instead of net/http #37
base: master
Conversation
kkty
commented
Sep 9, 2019
edited
- Add a program for benchmarking.
- See https://github.com/kazeburo/chocon/blob/ea808ac37468e70acfa272fd616586909eb12eed/benchmark/README.md for details.
- Use fasthttp instead of net/http.
- Add tests.
benchmark: update Dockerfile for chocon
benchmark: better handling of SIGTERM
Performance comparison between the original implementation and the new one

The following graph compares the throughput of the original implementation (with net/http) and the new one (with fasthttp). The CPU usage limit was set to 100%, 125%, 150%, 175%, and 200% for each implementation, so the graph has 2 * 5 = 10 lines. The other parameters passed to the benchmark application can be seen in the legend. The x-axis represents how much load was placed (the number of concurrent requests made during a test) and the y-axis represents the throughput (the number of successful responses during a test). The commit values, which can also be seen in the legend, show the hash of the most recent commit when the benchmark was run (refer to the "master" branch and the "use-fasthttp" branch). In summary, the colors are described by the table below.
Results

From the above results, we can see that
One of the reasons, as I see it, is that fasthttp has put much effort into reducing the number of heap allocations, rather than reducing the number of CPU operations. With a lot of concurrent heap allocations, we can see performance degradation due to the heap's locking mechanisms. Go's memory allocator, which is loosely based on tcmalloc, performs really well in this kind of situation, but we are still seeing some impact. GC pressure may account for the results, too.

Also, it is worth mentioning that fasthttp uses a worker pool model, so the first several requests, until the number of workers has reached the limit, involve worker creation. That might be why the new implementation is not as performant under smaller loads. If we send some requests before the actual benchmarks, the performance improves, which may confirm that worker creation is affecting the performance.

Profile

I also collected profiles of heap memory allocation with pprof and its

The new implementation

The original implementation

With the same number of requests passed through (which can be confirmed by the fact that the number of allocations by

Notes

The spec of the machine I tested with is as follows.
Update tests to see if request paths are passed from client to server