Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize mpsc #358

Merged
merged 9 commits into from
Nov 30, 2022
Merged

Optimize mpsc #358

merged 9 commits into from
Nov 30, 2022

Conversation

zenhack
Copy link
Contributor

@zenhack zenhack commented Nov 30, 2022

This wasn't at the very top of the profiler output, but it was also relatively quick compared to other options so I went for it; this patch reworks the implementation of mpsc to allocate far less. It's also now a tiny shim around a new spsc package; we should consider removing mpsc or at least using spsc directly where possible.

Note that this includes commits from #296, to use that as a perf baseline; that should be merged first.

Profiling data below.

Before

goos: linux
goarch: amd64
pkg: capnproto.org/go/capnp/v3/rpc
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkStreaming-8   	       1	1236673765 ns/op
PASS
ok  	capnproto.org/go/capnp/v3/rpc	1.530s

(pprof) [isd@box rpc]$ go tool pprof before.cpu
File: rpc.test
Type: cpu
Time: Nov 30, 2022 at 12:51am (EST)
Duration: 1.50s, Total samples = 2.34s (155.85%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 990ms, 42.31% of 2340ms total
Dropped 65 nodes (cum <= 11.70ms)
Showing top 10 nodes out of 173
      flat  flat%   sum%        cum   cum%
     290ms 12.39% 12.39%      540ms 23.08%  runtime.scanobject
     130ms  5.56% 17.95%      450ms 19.23%  runtime.selectgo
     110ms  4.70% 22.65%      240ms 10.26%  runtime.lock2
     100ms  4.27% 26.92%      100ms  4.27%  runtime.unlock2
      70ms  2.99% 29.91%       70ms  2.99%  runtime.casgstatus
      70ms  2.99% 32.91%       90ms  3.85%  runtime.findObject
      60ms  2.56% 35.47%       60ms  2.56%  runtime.markBits.isMarked (inline)
      60ms  2.56% 38.03%       60ms  2.56%  runtime.pageIndexOf (inline)
      50ms  2.14% 40.17%       50ms  2.14%  runtime.(*waitq).dequeue (inline)
      50ms  2.14% 42.31%      180ms  7.69%  runtime.mallocgc

(pprof) [isd@box rpc]$ go tool pprof before.mem
File: rpc.test
Type: alloc_space
Time: Nov 30, 2022 at 12:51am (EST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 237.04MB, 62.04% of 382.06MB total
Dropped 42 nodes (cum <= 1.91MB)
Showing top 10 nodes out of 66
      flat  flat%   sum%        cum   cum%
   43.01MB 11.26% 11.26%    51.51MB 13.48%  capnproto.org/go/capnp/v3.NewPromise (inline)
   39.50MB 10.34% 21.60%    48.51MB 12.70%  capnproto.org/go/capnp/v3.(*Decoder).Decode
      31MB  8.11% 29.71%       31MB  8.11%  capnproto.org/go/capnp/v3/internal/chanmutex.NewLocked (inline)
   25.50MB  6.68% 36.39%    25.50MB  6.68%  capnproto.org/go/capnp/v3.NewMessage
   19.01MB  4.97% 41.36%    19.01MB  4.97%  capnproto.org/go/capnp/v3.(*Metadata).Put (inline)
      19MB  4.97% 46.34%   104.01MB 27.22%  capnproto.org/go/capnp/v3/rpc.(*Conn).handleCall
   16.50MB  4.32% 50.65%       42MB 10.99%  capnproto.org/go/capnp/v3/rpc/transport.(*transport).NewMessage
   15.50MB  4.06% 54.71%    15.50MB  4.06%  capnproto.org/go/capnp/v3/server.newAnswerQueue (inline)
   14.50MB  3.80% 58.51%    19.50MB  5.10%  capnproto.org/go/capnp/v3.ImmediateAnswer
   13.50MB  3.53% 62.04%    58.51MB 15.31%  capnproto.org/go/capnp/v3/rpc.(*Conn).newQuestion

After

goos: linux
goarch: amd64
pkg: capnproto.org/go/capnp/v3/rpc
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkStreaming-8   	       1	1185679591 ns/op
PASS
ok  	capnproto.org/go/capnp/v3/rpc	1.448s

[isd@box rpc]$ go tool pprof after.cpu
File: rpc.test
Type: cpu
Time: Nov 30, 2022 at 1:42am (EST)
Duration: 1.40s, Total samples = 2.22s (158.06%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 1050ms, 47.30% of 2220ms total
Dropped 81 nodes (cum <= 11.10ms)
Showing top 10 nodes out of 166
      flat  flat%   sum%        cum   cum%
     230ms 10.36% 10.36%      510ms 22.97%  runtime.scanobject
     170ms  7.66% 18.02%      440ms 19.82%  runtime.selectgo
     140ms  6.31% 24.32%      270ms 12.16%  runtime.lock2
     130ms  5.86% 30.18%      150ms  6.76%  runtime.findObject
     100ms  4.50% 34.68%      220ms  9.91%  runtime.mallocgc
      70ms  3.15% 37.84%       70ms  3.15%  runtime.unlock2
      60ms  2.70% 40.54%       60ms  2.70%  runtime.pageIndexOf (inline)
      50ms  2.25% 42.79%       50ms  2.25%  capnproto.org/go/capnp/v3.(*Segment).writePtr
      50ms  2.25% 45.05%       60ms  2.70%  runtime.casgstatus
      50ms  2.25% 47.30%       50ms  2.25%  runtime.futex

[isd@box rpc]$ go tool pprof after.mem
File: rpc.test
Type: alloc_space
Time: Nov 30, 2022 at 1:42am (EST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 250.04MB, 66.66% of 375.12MB total
Dropped 62 nodes (cum <= 1.88MB)
Showing top 10 nodes out of 54
      flat  flat%   sum%        cum   cum%
   55.51MB 14.80% 14.80%    63.51MB 16.93%  capnproto.org/go/capnp/v3.NewPromise (inline)
   48.01MB 12.80% 27.60%    55.51MB 14.80%  capnproto.org/go/capnp/v3.(*Decoder).Decode
      29MB  7.73% 35.33%       29MB  7.73%  capnproto.org/go/capnp/v3.NewMessage
   22.01MB  5.87% 41.19%    22.01MB  5.87%  capnproto.org/go/capnp/v3.(*Metadata).Put
      19MB  5.07% 46.26%       48MB 12.80%  capnproto.org/go/capnp/v3/rpc/transport.(*transport).NewMessage
   17.50MB  4.67% 50.93%   103.51MB 27.60%  capnproto.org/go/capnp/v3/rpc.(*Conn).handleCall
   15.50MB  4.13% 55.06%    21.50MB  5.73%  capnproto.org/go/capnp/v3.ImmediateAnswer (inline)
      15MB  4.00% 59.06%       15MB  4.00%  capnproto.org/go/capnp/v3/server.newAnswerQueue (inline)
   14.50MB  3.87% 62.92%    36.01MB  9.60%  capnproto.org/go/capnp/v3/server.(*answerQueue).fulfill
      14MB  3.73% 66.66%       29MB  7.73%  capnproto.org/go/capnp/v3/server.(*Server).start

A substantial amount of the complexity of the transport package is due
the interface: we pass contexts to functions that do IO, and then jump
through hoops to respect those contexts. Instead, this patch changes
Transport and Codec's Close() to work like net.Close(): it interrupts
any outstanding IO. We demand the same semantics from the
io.ReadWriteCloser that we pass in, which seems like not a tall order.

Still TODO:

- [ ] rework the rpc subsystem's shutdown logic. The basic idea is we
spawn an extra goroutine to wait on context.Done() and then call close,
but the shutdown logic is intricate and I am too tired right now; this
patch introduces test regressions for now.
- [ ] Tweak the docs on how this changes the interface for e.g. NewConn
- Add a timeout during shutdown for aborts
- Make sure we are actually reading from the channel
Presumably this couldn't happen before we added the timeout during
shutdown, but now I am seeing an occasional panic here without this if.
This fixes some occasional test failures where questions were sent even
though the context is canceled before the relevant functions were
called; these are dependent on sendMessage to check the context before
actually doing the send.
...to mostly avoid allocations. There's now also an spsc, which mpsc
just adds a lock to. We should consider removing mpsc; it's behind a
lock in Conn anyway.

Before:

```
goos: linux
goarch: amd64
pkg: capnproto.org/go/capnp/v3/rpc
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkStreaming-8   	       1	1236673765 ns/op
PASS
ok  	capnproto.org/go/capnp/v3/rpc	1.530s

(pprof) [isd@box rpc]$ go tool pprof before.cpu
File: rpc.test
Type: cpu
Time: Nov 30, 2022 at 12:51am (EST)
Duration: 1.50s, Total samples = 2.34s (155.85%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 990ms, 42.31% of 2340ms total
Dropped 65 nodes (cum <= 11.70ms)
Showing top 10 nodes out of 173
      flat  flat%   sum%        cum   cum%
     290ms 12.39% 12.39%      540ms 23.08%  runtime.scanobject
     130ms  5.56% 17.95%      450ms 19.23%  runtime.selectgo
     110ms  4.70% 22.65%      240ms 10.26%  runtime.lock2
     100ms  4.27% 26.92%      100ms  4.27%  runtime.unlock2
      70ms  2.99% 29.91%       70ms  2.99%  runtime.casgstatus
      70ms  2.99% 32.91%       90ms  3.85%  runtime.findObject
      60ms  2.56% 35.47%       60ms  2.56%  runtime.markBits.isMarked (inline)
      60ms  2.56% 38.03%       60ms  2.56%  runtime.pageIndexOf (inline)
      50ms  2.14% 40.17%       50ms  2.14%  runtime.(*waitq).dequeue (inline)
      50ms  2.14% 42.31%      180ms  7.69%  runtime.mallocgc

(pprof) [isd@box rpc]$ go tool pprof before.mem
File: rpc.test
Type: alloc_space
Time: Nov 30, 2022 at 12:51am (EST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 237.04MB, 62.04% of 382.06MB total
Dropped 42 nodes (cum <= 1.91MB)
Showing top 10 nodes out of 66
      flat  flat%   sum%        cum   cum%
   43.01MB 11.26% 11.26%    51.51MB 13.48%  capnproto.org/go/capnp/v3.NewPromise (inline)
   39.50MB 10.34% 21.60%    48.51MB 12.70%  capnproto.org/go/capnp/v3.(*Decoder).Decode
      31MB  8.11% 29.71%       31MB  8.11%  capnproto.org/go/capnp/v3/internal/chanmutex.NewLocked (inline)
   25.50MB  6.68% 36.39%    25.50MB  6.68%  capnproto.org/go/capnp/v3.NewMessage
   19.01MB  4.97% 41.36%    19.01MB  4.97%  capnproto.org/go/capnp/v3.(*Metadata).Put (inline)
      19MB  4.97% 46.34%   104.01MB 27.22%  capnproto.org/go/capnp/v3/rpc.(*Conn).handleCall
   16.50MB  4.32% 50.65%       42MB 10.99%  capnproto.org/go/capnp/v3/rpc/transport.(*transport).NewMessage
   15.50MB  4.06% 54.71%    15.50MB  4.06%  capnproto.org/go/capnp/v3/server.newAnswerQueue (inline)
   14.50MB  3.80% 58.51%    19.50MB  5.10%  capnproto.org/go/capnp/v3.ImmediateAnswer
   13.50MB  3.53% 62.04%    58.51MB 15.31%  capnproto.org/go/capnp/v3/rpc.(*Conn).newQuestion
```

After:

```
goos: linux
goarch: amd64
pkg: capnproto.org/go/capnp/v3/rpc
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkStreaming-8   	       1	1185679591 ns/op
PASS
ok  	capnproto.org/go/capnp/v3/rpc	1.448s

[isd@box rpc]$ go tool pprof after.cpu
File: rpc.test
Type: cpu
Time: Nov 30, 2022 at 1:42am (EST)
Duration: 1.40s, Total samples = 2.22s (158.06%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 1050ms, 47.30% of 2220ms total
Dropped 81 nodes (cum <= 11.10ms)
Showing top 10 nodes out of 166
      flat  flat%   sum%        cum   cum%
     230ms 10.36% 10.36%      510ms 22.97%  runtime.scanobject
     170ms  7.66% 18.02%      440ms 19.82%  runtime.selectgo
     140ms  6.31% 24.32%      270ms 12.16%  runtime.lock2
     130ms  5.86% 30.18%      150ms  6.76%  runtime.findObject
     100ms  4.50% 34.68%      220ms  9.91%  runtime.mallocgc
      70ms  3.15% 37.84%       70ms  3.15%  runtime.unlock2
      60ms  2.70% 40.54%       60ms  2.70%  runtime.pageIndexOf (inline)
      50ms  2.25% 42.79%       50ms  2.25%  capnproto.org/go/capnp/v3.(*Segment).writePtr
      50ms  2.25% 45.05%       60ms  2.70%  runtime.casgstatus
      50ms  2.25% 47.30%       50ms  2.25%  runtime.futex

[isd@box rpc]$ go tool pprof after.mem
File: rpc.test
Type: alloc_space
Time: Nov 30, 2022 at 1:42am (EST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 250.04MB, 66.66% of 375.12MB total
Dropped 62 nodes (cum <= 1.88MB)
Showing top 10 nodes out of 54
      flat  flat%   sum%        cum   cum%
   55.51MB 14.80% 14.80%    63.51MB 16.93%  capnproto.org/go/capnp/v3.NewPromise (inline)
   48.01MB 12.80% 27.60%    55.51MB 14.80%  capnproto.org/go/capnp/v3.(*Decoder).Decode
      29MB  7.73% 35.33%       29MB  7.73%  capnproto.org/go/capnp/v3.NewMessage
   22.01MB  5.87% 41.19%    22.01MB  5.87%  capnproto.org/go/capnp/v3.(*Metadata).Put
      19MB  5.07% 46.26%       48MB 12.80%  capnproto.org/go/capnp/v3/rpc/transport.(*transport).NewMessage
   17.50MB  4.67% 50.93%   103.51MB 27.60%  capnproto.org/go/capnp/v3/rpc.(*Conn).handleCall
   15.50MB  4.13% 55.06%    21.50MB  5.73%  capnproto.org/go/capnp/v3.ImmediateAnswer (inline)
      15MB  4.00% 59.06%       15MB  4.00%  capnproto.org/go/capnp/v3/server.newAnswerQueue (inline)
   14.50MB  3.87% 62.92%    36.01MB  9.60%  capnproto.org/go/capnp/v3/server.(*answerQueue).fulfill
      14MB  3.73% 66.66%       29MB  7.73%  capnproto.org/go/capnp/v3/server.(*Server).start
```
@lthibault lthibault merged commit 4df0ed8 into capnproto:main Nov 30, 2022
@zenhack zenhack deleted the mpsc-perf branch November 30, 2022 19:26
@zenhack zenhack mentioned this pull request Dec 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants