This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Very slow TCP writes can hang a Weave process #445

Closed
bboreham opened this issue Mar 9, 2015 · 2 comments · Fixed by #476

bboreham commented Mar 9, 2015

As seen in the stack trace below, if a TCP write gets stuck for some reason, LocalPeer makes no progress: no connections can be added or removed, the user cannot obtain status, and so on.

I think it makes sense to set a deadline on writes, similar to the one we already have on reads.
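
For illustration, a minimal sketch of such a write deadline, assuming the sender has access to the underlying net.Conn (the writeTimeout constant and writeWithDeadline helper are hypothetical names, not from the Weave codebase):

```go
// Hypothetical sketch only: bound each protocol-message write with a
// deadline, mirroring the read deadline we already have. The names
// writeTimeout and writeWithDeadline are illustrative, not Weave's code.
package connsketch

import (
	"net"
	"time"
)

const writeTimeout = 30 * time.Second // assumed value

func writeWithDeadline(conn net.Conn, buf []byte) error {
	// Arm the deadline before writing; a stalled peer now produces a
	// timeout error instead of blocking this goroutine indefinitely.
	if err := conn.SetWriteDeadline(time.Now().Add(writeTimeout)); err != nil {
		return err
	}
	_, err := conn.Write(buf)
	return err
}
```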

goroutine 6 [IO wait, 5 minutes]:
net.(*pollDesc).Wait(0xc20a990d10, 0x77, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitWrite(0xc20a990d10, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:93 +0x43
net.(*netFD).Write(0xc20a990cb0, 0xc20b2c6006, 0x3de, 0x7ffa, 0x117, 0x7fe1432f9730, 0xc20ba0bf50)
    /usr/local/go/src/net/fd_unix.go:335 +0x5ee
net.(*conn).Write(0xc209f0f238, 0xc20b2c6006, 0x3de, 0x7ffa, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/net.go:129 +0xdc
encoding/gob.(*Encoder).writeMessage(0xc20ad565a0, 0x7fe1418b74d0, 0xc209f0f238, 0xc20ad565d8)
    /usr/local/go/src/encoding/gob/encoder.go:75 +0x238
encoding/gob.(*Encoder).EncodeValue(0xc20ad565a0, 0x862820, 0xc20aba66c0, 0x57, 0x0, 0x0)
    /usr/local/go/src/encoding/gob/encoder.go:243 +0x69b
encoding/gob.(*Encoder).Encode(0xc20ad565a0, 0x862820, 0xc20aba66c0, 0x0, 0x0)
    /usr/local/go/src/encoding/gob/encoder.go:168 +0x7f
github.com/zettio/weave/router.(*SimpleTCPSender).Send(0xc209f0a9f0, 0xc20be0f000, 0x3d6, 0x400, 0x0, 0x0)
    /github.com/zettio/weave/router/crypto.go:470 +0x89
github.com/zettio/weave/router.(*LocalConnection).sendProtocolMsg(0xc2094ec780, 0x5, 0xc20be0c900, 0x3d5, 0x443, 0x0, 0x0)
    /github.com/zettio/weave/router/connection.go:418 +0x172
github.com/zettio/weave/router.(*LocalConnection).SendProtocolMsg(0xc2094ec780, 0x5, 0xc20be0c900, 0x3d5, 0x443)
    /github.com/zettio/weave/router/connection.go:190 +0x3a
github.com/zettio/weave/router.(*GossipChannel).SendGossipMsg(0xc2090db860, 0xc20be0c480, 0x3bb, 0x411)
    /github.com/zettio/weave/router/gossip.go:131 +0x1c7
github.com/zettio/weave/router.(*LocalPeer).broadcastPeerUpdate(0xc2090df4a0, 0x0, 0x0, 0x0)
rade added the chore label Mar 9, 2015

rade commented Mar 9, 2015

What's the easiest way to reproduce this?

bboreham commented

I can reproduce this by starting 25 Weave peers and then letting them all connect at once.

It appears to be another deadlock: the other side of the cycle is:

sync.(*Mutex).Lock()
encoding/gob.(*Encoder).EncodeValue()
encoding/gob.(*Encoder).Encode()
weave/router.(*SimpleTCPSender).Send()
weave/router.(*LocalConnection).sendProtocolMsg()
weave/router.(*LocalConnection).SendProtocolMsg()
weave/router.(*GossipChannel).SendGossipMsg()
weave/router.deliverGossip()
weave/router.(*Router).handleGossip()
weave/router.(*LocalConnection).handleProtocolMsg()
weave/router.(*LocalConnection).receiveTCP()

In other words, when the TCP buffers between peers fill up, the fact that we send new updates out directly from the receiving goroutine means we stop reading further data, so neither side makes progress.

In this context, setting a deadline on writes is insufficient - we also need to decide what to do with further outgoing messages so that we don't block. I can see three alternatives:

  • drop them on the floor - the system will catch up eventually, since we gossip everything periodically
  • buffer them in memory - this just punts the problem a bit further down the road
  • merge new messages into existing queued messages - when the messages are CRDT state updates, the merge operation is simply to replace older messages with newer ones (see the sketch below)
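
To illustrate the third option, a rough sketch, assuming pending updates can be keyed by peer name so that a newer update supersedes an older one (all types and names here are invented for the example, not Weave's actual ones):

```go
// Illustrative only: a pending-gossip buffer where merging a new update
// replaces any older queued update for the same peer. PeerName,
// PeerUpdate and pendingGossip are assumptions for this sketch.
package gossipsketch

import "sync"

type PeerName string

type PeerUpdate struct {
	Version uint64
	Payload []byte
}

type pendingGossip struct {
	sync.Mutex
	updates map[PeerName]PeerUpdate
}

// Merge folds a batch of new updates into the queue; newer versions
// overwrite older ones, so the queue never grows beyond one entry per peer.
func (p *pendingGossip) Merge(batch map[PeerName]PeerUpdate) {
	p.Lock()
	defer p.Unlock()
	if p.updates == nil {
		p.updates = make(map[PeerName]PeerUpdate)
	}
	for name, upd := range batch {
		if old, ok := p.updates[name]; !ok || upd.Version > old.Version {
			p.updates[name] = upd
		}
	}
}

// drain hands the accumulated state to the sender and resets the queue.
func (p *pendingGossip) drain() map[PeerName]PeerUpdate {
	p.Lock()
	defer p.Unlock()
	out := p.updates
	p.updates = nil
	return out
}
```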

rade added a commit to rade/weave that referenced this issue Mar 24, 2015
Introduce an intermediary - GossipSender - between GossipChannel and
Connection. This accumulates gossip data (now represented by the new
GossipData interface) from the former until it can be passed onto the
latter, allowing the former to proceed when the latter may be blocked
on i/o.

This

1) prevents deadlocks that arise from cycles in the communication
topology

2) improves performance by allowing GossipChannel and its calling code
to proceed when connection i/o is blocked

3) improves performance by accumulating GossipData - the accumulated
data is often considerably more compact than the sum of the individual
updates, and it is transmitted in a single communication event.

In order to support accumulation, the GossipData interface has a Merge
method. Furthermore, encoding is deferred until the data can actually
be sent, since accumulation would be hard or impossible on already
encoded data. This required changes to some interfaces, most notably
Gossiper.

The implementation of GossipData for topology gossip is
TopologyGossipData. It carries a reference to Peers and a set of
PeerNames, indexing into Peers and referencing the Peer entries which
have changed. Previously updates were represented as a list of Peer
entries. The indirection via PeerNames allows Encode to omit entries
which have been removed. The change in representation required changes
to the signature of some methods on Peers.

Note that all the above only applies to the "periodic gossip" portion
of the Gossiper API, which topology gossip in fact uses for *all*
gossip. GossipUnicast and GossipBroadcast are unchanged, but could
conceivably receive the same treatment in the future.

Fixes weaveworks#445.
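
As a rough illustration of the design described in the commit message above (not the actual code from the referenced commit), a GossipData interface with a Merge method and a non-blocking intermediary sender could look like this:

```go
// Illustrative sketch of the GossipSender idea: accumulate GossipData via
// Merge and only encode when the connection is ready to send. Signatures
// here are assumptions, not the actual weave/router API.
package gossipsketch

import "sync"

// GossipData is accumulated until it can be sent; Merge combines two
// pending updates, and Encode is deferred until send time.
type GossipData interface {
	Merge(other GossipData) GossipData
	Encode() []byte
}

type gossipSender struct {
	send    func([]byte) error // the (possibly slow) connection write
	mu      sync.Mutex
	pending GossipData
	kick    chan struct{} // capacity 1; wakes the sending goroutine
}

func newGossipSender(send func([]byte) error) *gossipSender {
	s := &gossipSender{send: send, kick: make(chan struct{}, 1)}
	go s.run()
	return s
}

// Send never blocks on connection i/o: new data is merged into whatever
// is already pending, and the sending goroutine is nudged.
func (s *gossipSender) Send(d GossipData) {
	s.mu.Lock()
	if s.pending == nil {
		s.pending = d
	} else {
		s.pending = s.pending.Merge(d)
	}
	s.mu.Unlock()
	select {
	case s.kick <- struct{}{}:
	default: // a wake-up is already queued
	}
}

// run drains the accumulated data and encodes it only at send time; a
// slow write blocks this goroutine alone, not the gossiping callers.
func (s *gossipSender) run() {
	for range s.kick {
		s.mu.Lock()
		d := s.pending
		s.pending = nil
		s.mu.Unlock()
		if d != nil {
			s.send(d.Encode())
		}
	}
}
```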
rade self-assigned this Mar 24, 2015
bboreham added a commit that referenced this issue Mar 24, 2015
rade modified the milestone: 0.10.0 Apr 18, 2015