DHT node spam from exit nodes #3065

Open
Captain-Coder opened this issue Aug 24, 2017 · 7 comments

@Captain-Coder

Captain-Coder commented Aug 24, 2017

During the work on Gumby's DHT-isolation module I noticed the following.

For each circuit length (1 hop, 2 hops, 3 hops, etc.) Tribler sets up an instance of libtorrent. This instance has its DHT enabled. The DHT traffic generated by the libtorrent instance is tunneled to the exit node, from where it is forwarded to the internet. So effectively each exit node appears as a single IP hosting many DHT nodes, each with its own short-lived port number.

However, most of the nodes forming the public DHT employ measures to combat this sort of node spam from single IPs. Many will not even consider multiple IPs from a single /24 subnet in each routing table bucket. We could end up on blacklists, simply get no replies, or get flaky service. On top of that, the high node churn due to circuits closing and reopening on different ip:port combinations makes an exit's IP look very unreliable. As a result, the tunneled libtorrent DHTs will not end up in routing tables and will not perform their function as intended.
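
To illustrate the /24 restriction, here is a minimal sketch of how a routing-table bucket might refuse a second node from the same subnet. The `Bucket` class and its `try_add` method are hypothetical and not taken from any particular DHT implementation; real implementations differ in the details.

```python
import ipaddress


class Bucket:
    """Hypothetical k-bucket that accepts at most one node per IPv4 /24."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.nodes = {}  # maps /24 network -> (ip, port)

    def try_add(self, ip, port):
        # Collapse the address to its /24 network, e.g. 203.0.113.7 -> 203.0.113.0/24.
        subnet = ipaddress.ip_network(f"{ip}/24", strict=False)
        if subnet in self.nodes or len(self.nodes) >= self.capacity:
            return False
        self.nodes[subnet] = (ip, port)
        return True


bucket = Bucket()
# One exit node IP presenting several short-lived DHT ports:
results = [bucket.try_add("203.0.113.7", port) for port in (6881, 40001, 40002)]
print(results)  # only the first node from this /24 is accepted
```

Under this assumed policy, every additional libtorrent DHT node behind the same exit IP is simply ignored by the remote bucket.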

Note that even the default (non-anonymous) libtorrent instance conflicts with the pymdht instance that Tribler runs on the same host to support e2e encryption, though this effect is (probably) smaller and has less impact.

One proposed solution is as follows:

  • Merge pymdht #2 of our pymdht fork.
  • Disable the DHT of all libtorrent instances.
  • Tribler will now have to make DHT lookups itself and inform libtorrent through add_peer().
  • For the non-anonymous libtorrent instance: we can directly use the local pymdht.
  • For the anonymous libtorrent instances: circuits already have tunnel community messages to have the circuit exit perform DHT announces/lookups on their behalf. The exit already executes these using pymdht.
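
The steps above can be sketched roughly as follows. This is only an outline under stated assumptions: `FakeDHT` and `FakeTorrentHandle` are hypothetical stand-ins for a pymdht instance and a libtorrent torrent handle, `lookup_peers` is an invented method name, and `add_peer` is the name used in this issue (the actual call in the libtorrent bindings may be named differently).

```python
class FakeDHT:
    """Stand-in for a pymdht instance (local, or the one at the circuit exit)."""

    def __init__(self, peers_by_infohash):
        self.peers_by_infohash = peers_by_infohash

    def lookup_peers(self, infohash):
        return self.peers_by_infohash.get(infohash, [])


class FakeTorrentHandle:
    """Stand-in for a libtorrent torrent handle whose own DHT is disabled."""

    def __init__(self):
        self.known_peers = []

    def add_peer(self, address):
        if address not in self.known_peers:
            self.known_peers.append(address)


def feed_peers(dht, handle, infohash):
    """Resolve peers outside libtorrent and hand them to the torrent handle."""
    peers = dht.lookup_peers(infohash)
    for address in peers:
        handle.add_peer(address)
    return len(peers)


infohash = bytes(20)  # placeholder 20-byte infohash
dht = FakeDHT({infohash: [("198.51.100.10", 6881), ("198.51.100.11", 51413)]})
handle = FakeTorrentHandle()
count = feed_peers(dht, handle, infohash)
print(count, handle.known_peers)
```

For the anonymous case, `dht` would be replaced by something that sends the lookup through the circuit and waits for the exit's reply; the torrent handle side stays the same.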

The pymdht layer should be checked to verify that it caches correctly and does not spam messages (either announce or lookup). If it does spam, we are likely to trigger flood defences in other DHT implementations, for example when everyone is downloading their favorite new torrent.
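
As a rough idea of what "caching right" could mean, here is a minimal sketch of a time-limited lookup cache. `LookupCache`, its `ttl` of 300 seconds, and the `slow_lookup` helper are all assumptions for illustration; this is not how pymdht is known to behave.

```python
import time


class LookupCache:
    """Hypothetical cache that suppresses repeated DHT lookups for one infohash."""

    def __init__(self, lookup, ttl=300.0, clock=time.monotonic):
        self.lookup = lookup      # function that performs the real DHT lookup
        self.ttl = ttl            # seconds a result stays fresh (assumed value)
        self.clock = clock
        self.entries = {}         # infohash -> (timestamp, peers)

    def get_peers(self, infohash):
        now = self.clock()
        cached = self.entries.get(infohash)
        if cached is not None and now - cached[0] < self.ttl:
            return cached[1]      # fresh: no DHT traffic generated
        peers = self.lookup(infohash)
        self.entries[infohash] = (now, peers)
        return peers


calls = []


def slow_lookup(infohash):
    calls.append(infohash)        # record every lookup that hits the network
    return [("198.51.100.10", 6881)]


fake_time = [0.0]
cache = LookupCache(slow_lookup, ttl=300.0, clock=lambda: fake_time[0])
for _ in range(100):              # 100 circuits asking for the same popular torrent
    cache.get_peers(b"x" * 20)
fake_time[0] = 301.0              # after the TTL expires, one new lookup is sent
cache.get_peers(b"x" * 20)
print(len(calls))
```

With a cache like this at the exit, a hundred requests for the same infohash within the TTL cost one lookup on the public DHT instead of a hundred.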

From what I understand of the tunnel code, the DHT messages are already encrypted and are only seen unencrypted by the circuit origin and the exit, so the privacy of DHT requests should be good. However, the person fixing this issue should probably first investigate whether my intuition and understanding are accurate, and otherwise consider the privacy implications of the proposed fix.

Also, if BEP42 (DHT security extensions) ever gets supported, our problems will multiply. BEP45 has some interesting observations about multi-IP/multi-homed DHTs that also apply to our current libtorrent DHTs, which hop exit addresses every so often. This should be fixed by using the exit's pymdht instance for everything.
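
To see why BEP42 would make things worse: as I read the draft, the first 21 bits of a node ID are derived from the node's external IP and a 3-bit random value, so every DHT node behind one exit IP can only choose from at most 8 ID prefixes. The sketch below follows my reading of the draft for IPv4; the function names are my own, and it should be checked against the BEP before being relied on.

```python
import ipaddress
import os


def crc32c(data):
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def bep42_node_id(ipv4, rand=None, tail=None):
    """Derive a 20-byte node ID from an IPv4 address (sketch of the BEP42 draft)."""
    rand = os.urandom(1)[0] if rand is None else rand
    tail = os.urandom(17) if tail is None else tail
    r = rand & 0x7
    # Only some bits of the IP are used; the 3-bit r goes into the top bits.
    masked = int(ipaddress.IPv4Address(ipv4)) & 0x030F3FFF
    crc = crc32c((masked | (r << 29)).to_bytes(4, "big"))
    head = bytes([(crc >> 24) & 0xFF, (crc >> 16) & 0xFF,
                  ((crc >> 8) & 0xF8) | (tail[0] & 0x7)])
    return head + tail[1:] + bytes([rand])


def prefix21(node_id):
    """Return the first 21 bits of a node ID as an integer."""
    return (node_id[0] << 13) | (node_id[1] << 5) | (node_id[2] >> 3)


# All IDs one exit IP could legitimately use share a handful of prefixes.
prefixes = {prefix21(bep42_node_id("203.0.113.7", rand=r)) for r in range(256)}
print(len(prefixes))
```

So with BEP42 enforced, the many short-lived libtorrent DHT nodes behind one exit would all be crammed into the same few regions of the ID space, on top of sharing one IP.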

Links:
http://www.bittorrent.org/beps/bep_0005.html (dht standard)
http://www.bittorrent.org/beps/bep_0042.html (draft, dht security)
http://www.bittorrent.org/beps/bep_0045.html (multi homed DHT instances)

@devos50
Contributor

devos50 commented Sep 7, 2019

@egbertbouman is this issue addressed with the implementation/deployment of our own DHT overlay?

@egbertbouman
Member

@devos50 Unfortunately, it's not. We're not using our own overlay for bittorrent DHT lookups.

@synctext
Member

synctext commented Sep 1, 2020

This is now a critical issue for #3868. We can't get swarm stats due to DHT security. No idea for a fix yet!

@qstokkink
Contributor

qstokkink commented Sep 1, 2020

One possibility is to pay out for DHT lookups and distribute the load in the network.

@synctext
Member

synctext commented Mar 18, 2021

We conducted a DHT reliability experiment. We have seen the effect of the libtorrent DHT maximum speed limit (5 messages per second) in the wild.
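
A simplified model of how such a limit plays out for one remote node is sketched below. `SourceBlocker`, its one-second window, and the 300-second block timeout are assumptions chosen for illustration; this is not libtorrent's actual implementation, only the general shape of a per-source flood defence with a limit of 5 messages per second.

```python
class SourceBlocker:
    """Simplified, assumed model of a per-source DHT flood defence."""

    def __init__(self, rate_limit=5, window=1.0, block_timeout=300.0):
        self.rate_limit = rate_limit        # messages allowed per window
        self.window = window                # window length in seconds
        self.block_timeout = block_timeout  # seconds a noisy source stays blocked
        self.window_start = 0.0
        self.count = 0
        self.blocked_until = 0.0

    def accept(self, now):
        if now < self.blocked_until:
            return False
        if now - self.window_start >= self.window:
            self.window_start, self.count = now, 0
        self.count += 1
        if self.count > self.rate_limit:
            self.blocked_until = now + self.block_timeout
            return False
        return True


def responses(rate, total=1000):
    """Count accepted requests when `total` requests arrive at `rate` per second."""
    blocker = SourceBlocker()
    return sum(blocker.accept(i / rate) for i in range(total))


for rate in (5, 10, 50):
    print(rate, responses(rate))
```

In this model a sender that stays at the limit gets every request answered, while a sender that exceeds it is cut off almost immediately and stays blocked for the rest of the run.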

@egbertbouman
Member

Just some quick graphs. The first one shows the number of responses received after sending 1000 requests to 100 random DHT nodes at different rates. The second one shows the percentage of peers that blocked us during the experiment.

I'm not sure how trustworthy these figures are considering fewer peers blocked us at higher request rates. That could also be due to an issue on our side.

 
[Figure: dht_responses_100_peers (responses received per request rate)]

[Figure: dht_blocked_100_peers (percentage of peers that blocked us)]

@hbiyik

hbiyik commented Nov 2, 2021

Don't want to sound like a jerk, but don't you think that the lack of exit nodes is basically making the whole network centralized? DHT spam looks like a part of the problem.

@qstokkink qstokkink removed this from the Backlog milestone Aug 23, 2024