repair window responses are not retransmitted #336
Comments
We don't have a retransmit flag. Instead, the leader sets the blob's sender id to itself, and packets from the leader are retransmitted to peers. So we could set the id to self on the first repair packet, or on the second one from a different node.
I think if we included the window bits in the repair messages, we could do something smarter and more proactive. If the leader gets multiple repair requests, it can evaluate all the windows and see which packets should be retransmitted to all the peers. Eventually we would need to weight them by stake size to be spam resistant.
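A minimal Rust sketch of how the leader might evaluate window bits weighted by stake, as suggested in the comment above. Everything here is an assumption for illustration: `RepairRequest`, `slots_to_retransmit`, the `missing` bit vector, and the percentage threshold are hypothetical names and policy, not code from the repository.

```rust
/// Hypothetical repair request that carries the requester's window bits.
pub struct RepairRequest {
    /// Stake of the requesting validator, used as its weight.
    pub stake: u64,
    /// missing[i] is true when the requester lacks window slot i.
    pub missing: Vec<bool>,
}

/// Returns the window slots that should be retransmitted to all peers.
/// A slot qualifies when the stake of the requesters missing it is at least
/// `threshold_pct` percent of the total stake of all requesters (assumed policy).
pub fn slots_to_retransmit(
    requests: &[RepairRequest],
    window_len: usize,
    threshold_pct: u64,
) -> Vec<usize> {
    let total: u128 = requests.iter().map(|r| r.stake as u128).sum();
    if total == 0 {
        return Vec::new();
    }

    // Sum the stake of every requester that reports a given slot as missing.
    let mut weight = vec![0u128; window_len];
    for r in requests {
        for (i, &is_missing) in r.missing.iter().take(window_len).enumerate() {
            if is_missing {
                weight[i] += r.stake as u128;
            }
        }
    }

    // Keep the slots whose missing stake crosses the threshold.
    let mut out = Vec::new();
    for (i, &w) in weight.iter().enumerate() {
        if w * 100 >= total * threshold_pct as u128 {
            out.push(i);
        }
    }
    out
}
```

Weighting by stake rather than by request count means a low-stake node that sends many requests cannot force a network-wide retransmit on its own.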
How about this:
As you said, the leader can maintain a window. Also, does (or can) the leader know which validator got which packet in the window when it was originally transmitted? If retransmission requests are being received for packets sent to a particular validator, that can indicate a problem (network or host) with that validator. Thoughts?
Would you wait before responding, until message 2? Or just keep a counter of how many unique repair requests there are? Right now the validators randomly send each other repair requests, and the leader is part of the random group.
There can be a small (TBD) fixed wait before responding. If one of the validators is down or bottlenecked (or if the leader-to-validator packet was dropped), more than one peer validator will request a retransmission within some time interval. If the unicast packet from one validator to another was dropped, then only one of them will request a retransmit. So, validators don't know who the leader is?
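The fixed-wait idea above could be sketched roughly as follows in Rust. This is only an illustration under stated assumptions: `PendingRepairs`, `Response`, the `u32` requester ids, and the millisecond timestamps passed in by the caller are all hypothetical, and the rule "more than one distinct requester means retransmit" is one reading of the proposal, not existing code.

```rust
use std::collections::{HashMap, HashSet};

/// How the leader answers once the wait for a packet has elapsed.
#[derive(Debug, PartialEq)]
pub enum Response {
    /// Only one validator asked: answer just that validator.
    Unicast,
    /// Several validators asked: mark the packet for retransmission to peers.
    Retransmit,
}

/// Hypothetical leader-side buffer of repair requests awaiting a decision.
pub struct PendingRepairs {
    wait_ms: u64,
    // window index -> (time of first request, distinct requester ids)
    pending: HashMap<u64, (u64, HashSet<u32>)>,
}

impl PendingRepairs {
    pub fn new(wait_ms: u64) -> Self {
        PendingRepairs { wait_ms, pending: HashMap::new() }
    }

    /// Records a repair request for `index` from `requester` at `now_ms`.
    pub fn request(&mut self, index: u64, requester: u32, now_ms: u64) {
        let entry = self
            .pending
            .entry(index)
            .or_insert_with(|| (now_ms, HashSet::new()));
        entry.1.insert(requester);
    }

    /// Removes and returns decisions for packets whose wait has elapsed,
    /// ordered by window index.
    pub fn poll(&mut self, now_ms: u64) -> Vec<(u64, Response)> {
        let mut ready = Vec::new();
        for (index, entry) in &self.pending {
            if now_ms.saturating_sub(entry.0) >= self.wait_ms {
                ready.push(*index);
            }
        }
        ready.sort_unstable();

        let mut out = Vec::new();
        for index in ready {
            if let Some((_, requesters)) = self.pending.remove(&index) {
                let response = if requesters.len() > 1 {
                    Response::Retransmit
                } else {
                    Response::Unicast
                };
                out.push((index, response));
            }
        }
        out
    }
}
```

Passing the clock in as `now_ms` keeps the sketch deterministic; a real implementation would presumably read a monotonic clock instead.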
@pgarg66 validators know who the leader is. Each validator gets a different packet and retransmits it to all the other peers; that's how we are splitting the leader's bandwidth into N downstream nodes. I think something simple we can try is asking to retransmit with exponential backoff, so 2, 4, 8th... repair request.
"I think something simple we can try is asking to retransmit with exponential backoff, so 2, 4, 8th... repair request" Sorry, I am slightly confused by this. Is the requester (validator) exponentially backing off before requesting a retransmission? Is the purpose of the backoff to keep multiple validators from asking for a retransmission of the same packet?
The leader sets the sender id to self (which indicates retransmit) every time the number of requests to repair that specific packet doubles.
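As a rough illustration of the doubling rule described above, here is a minimal Rust sketch of a per-packet request counter. `RepairCounter` and `record` are hypothetical names introduced for this example, and the sketch assumes the retransmit marker fires on the 2nd, 4th, 8th... request as mentioned earlier in the thread.

```rust
use std::collections::HashMap;

/// Hypothetical per-packet repair-request counter kept by the leader.
#[derive(Default)]
pub struct RepairCounter {
    // window index -> number of repair requests seen so far
    requests: HashMap<u64, u64>,
}

impl RepairCounter {
    /// Records one repair request for `index`. Returns true when the response
    /// should carry the leader's own id as sender (i.e. be retransmitted by
    /// the receiver), which happens when the running count reaches 2, 4, 8, ...
    pub fn record(&mut self, index: u64) -> bool {
        let count = self.requests.entry(index).or_insert(0);
        *count += 1;
        *count >= 2 && count.is_power_of_two()
    }
}
```

Because the trigger points grow geometrically, a packet that is requested n times is marked for retransmission only about log2(n) times, which limits how often peers re-broadcast the same packet.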
Isn't this code already retrying to repair the window? streamer.rs, line 203
The problem is that here we set the response to the repair request to never retransmit. So if the packet is dropped in the first hop, all the peers are missing the packet and none will broadcast it to the rest of the network.
I understand it now.
* runtime: do fewer syscalls in remap_append_vec_file
  Use renameat2(src, dest, NOREPLACE) as an atomic version of if statx(dest).is_err() { rename(src, dest) }. We have high inode contention during storage rebuild and this saves 1 fs syscall for each appendvec.
* Address review feedback
If the packet is dropped, we do not know whether it was in step 1 or 2. We basically need some way for the validators to decide whether they should ask the peers or the leader about the packet, and the leader should respond with a packet that the validator will retransmit if it was dropped in step 1.
The hard part here is avoiding having multiple validators retransmit this packet to the peers, because that would flood the network. So the leader needs to do some flow control.