Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(DoS vector?) Peers are removed from the routing table on Disconnect #45

Closed
JustinDrake opened this issue Jan 23, 2017 · 11 comments
Closed
Labels
exp/expert Having worked on the specific codebase is important kind/bug A bug in existing code (including security flaws) status/ready Ready to be worked

Comments

@JustinDrake
Copy link
Contributor

Currently when a peer disconnects, the peer is removed from the routing table. To me this seems needlessly aggressive. Shouldn't the Kademlia least-recently seen eviction policy deal with clearing inactive nodes? Certainly a node shouldn't be evicted from the routing table for a temporary disconnection.

I don't see this remove-on-disconnect policy in the Kademlia whitepaper, and to my eye it is a DoS attack vector. An attacker can flood a victim at the network level to force the temporary closure of all its connections. This would flush the node's routing state, and the attacker could then fill the victim's routing table with bad nodes.

Even in a non-hostile scenario, if a node is temporary shut off from the internet (e.g. for just a few minutes), then it needlessly has to repopulate its routing table from scratch.

Am I missing something obvious?

@cpacia
Copy link

cpacia commented Jan 23, 2017

The Kademlia paper assumes UDP so there isn't any concept of a disconnection. It only removes them when you try to send them a rpc message and it times out.

@JustinDrake
Copy link
Contributor Author

It only removes them when you try to send them a message and it times out.

Yes, I think that's right. To quote the white paper:

If the appropriate k-bucket is full, however, then the recipient pings the k-bucket’s least-recently seen node to decide what to do. If the least-recently seen node fails to respond, it is evicted from the k-bucket and the new sender inserted at the tail

Eviction should happen if the following three conditions are met:

  1. The relevant k-bucket is full
  2. The least-recently seen node is pinged
  3. The least-recently seen node fails to pong

Eviction on disconnect is probably too aggressive.

@whyrusleeping
Copy link
Contributor

@JustinDrake hrm... youre right. We should fix this.

@whyrusleeping
Copy link
Contributor

We should probably set a reasonably short timeout on the eviction ping though, don't want to sit around for ages waiting on that.

@whyrusleeping whyrusleeping added the kind/bug A bug in existing code (including security flaws) label Feb 10, 2017
@JustinDrake
Copy link
Contributor Author

Awesome. I'd love to have this bug fixed for the public release of OpenBazaar 2.0 as it affects Duo. I'd be happy to help, e.g. by doing a code review.

As mentioned in #31, we probably want to use the Kademlia PING message.

We should probably set a reasonably short timeout on the eviction ping

Yes. To speed things up further we can use heuristics such as:

  • If the node was last seen less than (say) 5 minutes ago, assume that the node is online and don't evict.
  • If the node was last seen more than (say) 1 day ago, assume that the node is offline and evict without a PING.

@whyrusleeping
Copy link
Contributor

@JustinDrake Sounds good. I'll try and get this done ASAP for you guys.

@JustinDrake
Copy link
Contributor Author

@whyrusleeping Have you made any progress on this? It would be great to have this for the initial release of OpenBazaar 2.0.

@FrankSzendzielarz
Copy link

Just a quick note on the above thread, as I have recently been studying Kademlia for the Ethereum peer discovery protocol and have it fresh in my mind. The M&M paper contradicts itself somewhat in more than one place. The above observation that the oldest node in the k-bucket should be pinged is correct, but the paper supersedes itself towards the end of the document by advising against doing that, as it could cause a ping storm. They recommend maintaining a cache of possible replacement nodes, but as cpacia correctly points out, to only remove the oldest kbucket node when a meaningful rpc call fails at some later stage, replacing the oldest node with the most recent from the replacement cache. Just an FYI.

@whyrusleeping
Copy link
Contributor

@FrankSzendzielarz interesting observation, thank you!

@anacrolix
Copy link
Contributor

Let's please keep this issue on topic, if there are observations or comments unrelated to the disconnect behaviour please continue those in an appropriate issue. I am currently looking into the disconnect behaviour.

@Stebalien
Copy link
Member

Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exp/expert Having worked on the specific codebase is important kind/bug A bug in existing code (including security flaws) status/ready Ready to be worked
Projects
None yet
Development

No branches or pull requests

7 participants