crypto errors can stall udp receiver loop #564

rade · 2015-04-15T21:01:56Z

Receiving a stray udp packet can cause the decryptor to wait for a new nonce which will never arrive. This stalls the entire udp receiver loop for 30 seconds, until the offending connection gets shut down by heartbeat timeouts. That of course will cause all other connections to miss heartbeats too and shut down too. Not great.

rade · 2015-04-15T21:10:39Z

A quick hack fix would introduce a timeout. 1 second? 3? Timeouts have a habit of, err, timing out when they shouldn't, so this may well replace one problem with another. Though arguably less severe one. If the packet was genuine then we will drop it. and the next packet along will once again cause us to wait for a nonce. No big deal.

However, if we have a whole stream of such stray packets we could still end up slowing the entire UDP receiver loop to a crawl.

An alternative would be to introduce per-connection goroutines for inbound udp packet processing. just like we have on the outbound side. This is costly in performance terms, and also poses some problems for packet injection - iirc the injection handle isn't thread safe. We could create separate handles, I suppose.

rade · 2015-04-16T09:50:40Z

Another option would be not to wait for a nonce at all, i.e. drop the packet if we cannot get hold of a new nonce. I've experimented with that and it actually works very well.

Note that new nonces are sent when the sender is half-way through the offset range, so they should arrive at the remote well in advance of any packets using the new nonces. The main exception is the first packet. In fact odds of the nonce not making it through before that packet arrives are quite high.

rade · 2015-04-16T12:55:35Z

Idea from @bboreham only wait (with a timeout) once, and only for the first nonce. That would cure the common case of the first nonce arriving fractionally later than the first packet.

Another related idea: replace the source peer name in our udp encapsulation with the connection uid. The former is just a lookup away from the latter, and looking up connections in the udp reader by connection uid instead of peer name ensures that packets get processed by the correct decryptor.

replace random nonces with message sequence numbers Fixes #564.

rade added the bug label Apr 15, 2015

This was referenced Apr 15, 2015

All peers become "unestablished" #515

Closed

reduce crypto glitches and ignore crypto errors #565

Merged

rade self-assigned this Apr 16, 2015

This was referenced Apr 26, 2015

use predictable instead of random nonces #608

Closed

replace random nonces with message sequence numbers #617

Merged

dpw closed this as completed in #617 Apr 30, 2015

dpw added a commit that referenced this issue Apr 30, 2015

Merge rade/counter_nonce (PR #617)

17be43f

replace random nonces with message sequence numbers Fixes #564.

rade modified the milestone: 0.11.0 May 12, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

crypto errors can stall udp receiver loop #564

crypto errors can stall udp receiver loop #564

rade commented Apr 15, 2015

rade commented Apr 15, 2015

rade commented Apr 16, 2015

rade commented Apr 16, 2015

crypto errors can stall udp receiver loop #564

crypto errors can stall udp receiver loop #564

Comments

rade commented Apr 15, 2015

rade commented Apr 15, 2015

rade commented Apr 16, 2015

rade commented Apr 16, 2015