server.go: use worker goroutines for fewer stack allocations #3204

Merged on Apr 23, 2020 (6 commits)

Commits on Dec 18, 2019

  1. server.go: use worker goroutines for fewer stack allocations

    Currently (go1.13.4), the default stack size for newly spawned
    goroutines is 2048 bytes. This is insufficient when processing gRPC
    requests, as we often require more than 4 KiB of stack. This causes the
    Go runtime to call runtime.morestack at least twice per RPC, which
    causes performance to suffer needlessly as stack reallocations require
    all sorts of internal work such as changing pointers to point to new
    addresses.
    
    Since this stack growth is guaranteed to happen at least twice per RPC,
    reusing goroutines gives us two wins:
    
      1. The stack is already grown to 8 KiB after the first RPC, so
         subsequent RPCs do not call runtime.morestack.
      2. We eliminate the need to spawn a new goroutine for each request
         (even though they're relatively inexpensive).
    
    Performance improves across the board. The improvement is especially
    visible in small, unary requests as the overhead of stack reallocation
    is higher, percentage-wise. QPS is up anywhere between 3% and 5%
    depending on the number of concurrent RPC requests in flight. Latency is
    down ~3%. There is even a 1% decrease in memory footprint in some cases,
    though that is an unintended but happy coincidence.
    
    unary-networkMode_none-bufConn_false-keepalive_false-benchTime_1m0s-trace_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_8-reqSize_1B-respSize_1B-compressor_off-channelz_false-preloader_false
                   Title       Before        After Percentage
                TotalOps      2613512      2701705     3.37%
                 SendOps            0            0      NaN%
                 RecvOps            0            0      NaN%
                Bytes/op      8657.00      8654.17    -0.03%
               Allocs/op       173.37       173.28     0.00%
                 ReqT/op    348468.27    360227.33     3.37%
                RespT/op    348468.27    360227.33     3.37%
                50th-Lat    174.601µs    167.378µs    -4.14%
                90th-Lat    233.132µs    229.087µs    -1.74%
                99th-Lat     438.98µs    441.857µs     0.66%
                 Avg-Lat    183.263µs     177.26µs    -3.28%
    Adhityaa Chandrasekar committed Dec 18, 2019
    Commit b7f16d4
  2. V2: server option, stream ID bitshift

    Adhityaa Chandrasekar committed Dec 18, 2019
    Commit 11cd160

Commits on Dec 20, 2019

  1. V3: server workers instead of stream workers

    Adhityaa Chandrasekar committed Dec 20, 2019
    Commit f13952c
  2. V4: go fmt

    Adhityaa Chandrasekar committed Dec 20, 2019
    Commit af27325
  3. V5: use a drifting, random number of iterations

    Adhityaa Chandrasekar committed Dec 20, 2019
    Commit d104fc2
  4. multiple channels

    Adhityaa Chandrasekar committed Dec 20, 2019
    Commit 9939840