Excessive bandwidth use #3429
@wrouesnel Try running your daemon in DHT client mode with `--routing=dhtclient`.
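A minimal sketch of what that looks like from the CLI (assuming a stock go-ipfs install; the same flag appears in later comments in this thread):

```sh
# Run the daemon as a DHT client only: it queries the DHT but does not
# answer other peers' DHT requests, which cuts background traffic.
ipfs daemon --routing=dhtclient
```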
This didn't appreciably help - launching with the option still pegged about 900 kbps of constant upstream usage, which is still way too much to keep a node consistently alive, and which interferes heavily with the ability to decentralize to home users or mobile devices (i.e. use distributed IPFS services on a limited connection).
@wrouesnel That's very odd... Could you check the per-protocol bandwidth stats (`ipfs stats bw --proto <protocol>`)?
It's actually looking a lot better now for some reason (with `--routing=dhtclient`).
So far so good, although I would still argue that downloading 2 GB cumulatively over the course of 6 hours, with no actual upstream usage, isn't great behavior.
@wrouesnel Hrm... yeah. That's definitely not ideal. I'll comment back later with some things to try to help diagnose the slowness.
An update: looking at bandwidth graphs on my router, it's averaging around 500 kbps of traffic, spiking up to 1 Mbit. This almost immediately flatlines after I kill the ipfs daemon. So there's definitely way too much traffic going on, and it doesn't look like it's being accounted for properly - tying up 50% of my DSL upstream for an idle node permanently just isn't practical.
This is after running for about a day or two.
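For anyone trying to reproduce these measurements, the daemon's own counters can be watched like this (standard `ipfs stats bw` flags; the 5-second interval is just an example):

```sh
# Live view of total bandwidth, refreshed every 5 seconds
ipfs stats bw --poll --interval 5s

# Break the totals down by libp2p protocol
ipfs stats bw --proto /ipfs/kad/1.0.0
ipfs stats bw --proto /ipfs/bitswap/1.1.0
```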
I have been struggling with excessive IPFS bandwidth usage for over a year now (#2489). I believe this is due to IPFS having an unrestricted number of peers; often you will see many hundreds of concurrent connections, all consuming 5-10 KB each. I have tried to manually limit the number of peer connections, but this, combined with the fact that peers do not relay blocks, means that any unique content on the restricted peer cannot be accessed. I did have some luck throttling the ipfs daemon (#1482), although it does make accessing the content extremely slow.
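As a rough illustration of what manually inspecting and trimming connections looks like (standard go-ipfs commands; the multiaddr below is a made-up placeholder):

```sh
# Count how many peers the daemon is currently connected to
ipfs swarm peers | wc -l

# Disconnect a specific peer (placeholder multiaddr shown; note the daemon
# may simply reconnect to it later, which is part of the problem)
ipfs swarm disconnect /ip4/203.0.113.7/tcp/4001/ipfs/QmExamplePeerID
```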
@slothbag You're right for the most part: having tons of peer connections is a cause of excessive bandwidth usage. One of the biggest issues right now is that it turns out DHTs don't scale super well; we're looking at and thinking deeply about solutions to this problem: ipfs/notes#162

The next part of the problem, as you suggest, is that we keep too many open connections. In an ideal world that wouldn't necessarily mean that bandwidth usage increases, but since bitswap and the DHT both send lots of chatter to all the peers they are connected to (we can fix this, it just needs to be thought through), it results in a pretty direct correlation between the number of peers and bandwidth usage. We've also been thinking about connection closing; it's a hard problem too (we have to keep track of which peers to keep bitswap sessions open to, and have to manage DHT routing tables and peer-search connections).

Until we get to working out these issues (it's very high on our priority list), updating to 0.4.5-rc1 should yield an improvement (the more people who update to 0.4.5, the more bandwidth savings everyone gets).

@slothbag again, thanks for sticking with us and continuing to pester us about this, it really helps.
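To observe the bitswap chatter described here on a running node, these commands can help (standard go-ipfs CLI; exact output fields vary by version):

```sh
# Show what this node is currently asking its peers for
ipfs bitswap wantlist

# Aggregate bitswap statistics: blocks and data sent/received, plus the
# number of partners (peers we exchange wantlists with)
ipfs bitswap stat
```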
Thanks @whyrusleeping, I wasn't aware of those notes. It's interesting that you mention the DHT as not being scalable, as this affects just about all P2P applications that aspire to "take over the world", so to speak. I'll add some of my thoughts to your notes.
@slothbag Thanks! The more thought we put into solving this the better :) It's a relatively unsolved problem in general. The BitTorrent DHT gets away with it because they're nowhere near the scale that IPFS pushes in terms of DHT records per node (you might have maybe 100 torrents per peer on mainline, whereas in IPFS you'll have tens to hundreds of thousands, up to millions, of blocks you need to announce).
Is this the primary challenge? (Writing from an uninformed perspective and curious about the scalability challenges the system faces.)
@btc Yeah, from the content routing (DHT) perspective, managing providers is the biggest issue in my opinion.
Just ran into that problem myself. Got our Azure budget maxed out because of IPFS sending out around 500 GB per month... The bandwidth usage reported by IPFS is way off: I measure 120 kB/s in and 80 kB/s out via iptraf, while IPFS reports 8 kB/s in and 9 kB/s out via `ipfs stats bw`.
Any update on this? My primary reason for not using IPFS yet is that its bandwidth use is simply unrestricted. I could throttle it from the outside using traffic shaping or the like, but I don't want to; I want to be able to configure a maximum bandwidth in IPFS itself. Last time I tried it, my bandwidth use was extremely high as well.
@voidzero Any idea how to restrict it in an intelligent way from the outside? I do not want to restrict the actual downloading of blocks, only the background information exchange, to a sensible amount.
No, wish I did @ingokeck, sorry.
@voidzero @ingokeck We're working on it; there are some recent things to try out on the latest master.

In general, each update has brought improvements that reduce bandwidth consumption. Coming soon we have a connection-closing module that should help significantly.
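The specific list from this comment isn't preserved in this capture. As a hedged illustration only, the knobs most commonly suggested elsewhere in this thread and in the go-ipfs config docs of that era look like this (config keys assumed; verify against your version):

```sh
# Act as a DHT client only (don't serve DHT requests for other peers)
ipfs daemon --routing=dhtclient

# Only (re)announce pinned content instead of every block in the repo
ipfs config Reprovider.Strategy pinned

# Slow down or disable periodic re-providing entirely ("0" disables it)
ipfs config Reprovider.Interval "0"
```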
Alright, I'm not an IPFS dev, but I think the idea has a lot of potential, and I've done some research on various ways of improving performance (some of these are not my ideas and are just things I've found in various discussions). Here's what I think:

I'm a little confused about how bitswap works. Do you send wantlists to all peers? And why connect to hundreds of nodes in the first place? Why not only connect to nodes that have the files you want, or that want files from you?

Also, what about going one step further than roots pinning, and letting provider records have a flag that says you also have all child blocks? There are many files that are unlikely to be downloaded individually. Allowing people to provide entire folders with only one record would make a big difference in some cases. Imagine a million people suddenly use an IPFS-based file sync solution. Nobody except a few people care about each file, yet there are 20 records for every one scattered all about the swarm; but any node that's interested likely has a complete copy, so we could replace billions of records with possibly 10,000x fewer root-of-folder-only records. It would also incentivise referring to files as hash/path/etc instead of linking directly to the hash. Using full paths preserves historical info about the origin of the file, while still allowing anyone who wants to pin the file individually if they have a reason to. You'd probably need to automatically pin the full chain of blocks when pinning something by its path in this mode for this to be most effective, but that's simple enough. To allow linking to an existing block that everyone customarily references by full path without losing the path, symlinks could be added, so that a block could reference another block by path instead of directly.

Another idea is to stop providing everything you download 20 times. If you find that there are already 10 nodes pinning something, then maybe you only need to add 1 or 2 copies of the provider record.

And most importantly of all, I think transparent caching proxies could solve a lot of these issues. If all functions, including adding provider records, could go through a proxy, most users wouldn't have to worry about it, and most traffic could eventually be within data centers of colocated proxies, with old-style DHT crawling as a fallback. If you tell a massive data center to pin something, and someone else using the same proxy downloads it, that process can be as efficient as centralized services, because proxies can cache everything. The company could also decide not to add provider records for all of the millions of files that clients use, and instead only have records saying "big company X can get this file", basically aggregating thousands of provider records into one, possibly passing a few through directly for reliability. It would also allow a company to run a centralized server to make private data transfers faster, without having to rely on that server. And it would allow the same kind of functionality as the various caching acceleration systems that browsers use, in a standard way. You could define multiple levels of what a proxy would do for whom, all the way up to actually storing pinned files. Then there's a standard protocol for pinning services, and any node can be a pinning service for any other node (handy if IPFS gets built into a router and you're on mobile).

Proxies could cache the actual data, meaning in theory there should be no performance hit versus using centralized services, because it essentially is centralized, right up until the server goes down. Maybe IPNS names could associate a default proxy with a name, so as to say "this entire IPNS site uses this proxy as a tracker; unless it goes down, then use the DHT". The tracker would still get some assistance from the swarm for large files, but you wouldn't need to do DHT lookups at all so long as the tracker was up and not overloaded. Heavy reliance on proxies adds a bit of centralization, but it's seamless. If a proxy goes down it could take some cached content and provider records with it, but they'd be back soon enough as nodes noticed that the proxy was down. And the potential gain in some cases could be hundreds of times fewer provider records (via the aggregation mechanism) and near-centralized latency for some popular content.
Hey @EternityForest, good notes. 'Recursive' or 'aggregate' provider records are something we've been thinking about, as well as some form of delegation in the routing system (proxies, as you call it). Discussion on that is over in this thread: ipfs/notes#162

As for bitswap, the 'dumb' default mode is to broadcast your wantlist to your connected peers optimistically; waiting until you find provider records for an object would add annoying amounts of latency to the requests. We have a newer API for bitswap called 'bitswap sessions' that only does that broadcast for the first node in a request, and then reuses peers it has gotten data from for future requests within the context of that request. You can read more about that here: #3786. Another optimization that will help bitswap significantly is 'IPLD selectors', and we have an issue discussing that here: ipfs/notes#272

As for proxies, that's a very interesting idea with lots of different approaches. For that you really need to consider different trust scenarios, and maybe even have some sort of reputation system, otherwise you might end up getting fed bad data or censored.

My apologies for not responding to every point; I'm trying to take some time away from the computer, but I saw this comment come in and felt compelled to respond :)
Thanks for getting back to me! It's cool to see how active this community is. I'll probably take some time to look at those threads after the holidays. I'm heading out for a bit in a minute or two, but one idea for proxy trust is just to configure it manually and use a "web of trust" type model. Maybe you manually set the trust level of google.com to 1, and they "suggest" two other servers which they claim are as trustworthy as they are. So you trust those half as much, because they're one hop away. Maybe you also trust example.com, and they recommend those same servers, so you trust them a little more now that you have two good recommendations.
More random ideas! What if we had "implicit" provider records created when you make a DHT request, so that crawling the DHT leaves a "trail" of logged requests for others to find you? If someone else wants that file, the node says "this guy has probably found it by now", but you never had to create any explicit records. "Real" provider records could be reserved for pinned data, and instead of optimistically asking every peer to reduce latency, we could simply start crawling the DHT: if the content is popular we'll find it pretty fast, and if the content isn't popular we might not have found it anyway without a DHT crawl.
I don't understand the ins and outs, and while I appreciate the enthusiasm, here's the thing: I just want to set a global maximum amount of bandwidth. This is possible with so many clients, from BitTorrent clients to Tor to browser extensions. Why can't I just ensure that ipfs is allowed to use a maximum of (for example) 150 KB/s of ingress/egress? It doesn't have to be this difficult, does it?
@voidzero There's an open issue for limiting bandwidth here: #3065.
Ah, perfect. Thanks @leerspace; much obliged.
@voidzero You can use trickle to limit bandwidth. I'm using this command in my crontab to limit uploads to 50 KB/s (and downloads to 1000 KB/s):

```sh
@reboot /usr/bin/trickle -s -u 50 -d 1000 /usr/local/bin/ipfs daemon --routing=dhtclient 1> /home/ipfsnode/ipfsnode.log 2> /home/ipfsnode/ipfsnode.err.log
```
I'm not sure if it's the bandwidth or the sheer number of connections to peers, but running IPFS on my home network renders my internet connection unusable. I've not tried trickle yet, but I'd prefer a way to say "please only connect to 50 peers". The watermark settings don't seem to allow this...
The trouble with limiting connections is that AFAIK the DHT doesn't seem to keep track of peers you aren't connected to (am I correct on this?). What about adding a third state where a peer is registered but not connected, where it's only pinged at exponentially increasing intervals, up to a day or so, to ensure that it's still there? Wantlists wouldn't get broadcast to them, so you could have thousands while only actively connecting briefly to nodes in your LAN and nodes that you're exchanging blocks with.

Are people using the roots provider strategy yet? That seems like it could save a lot of storage if people used it.

In the end I still think taking some ideas from centralized systems is probably the best-performing way to go. Manually selected or LAN-discovered trackers would give almost all the performance benefits of the centralized internet while still using the DHT as a backup, and trusted always-on servers would let you store fewer redundant provider records in the DHT.
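Regarding the roots provider strategy mentioned a couple of paragraphs up: go-ipfs exposes it through the Reprovider config. A minimal sketch, assuming the key names from the go-ipfs config reference (check against your version):

```sh
# Announce only the root blocks of pinned DAGs instead of every block,
# which shrinks the number of provider records this node publishes
ipfs config Reprovider.Strategy roots

# On versions that support it, trigger an immediate re-announce;
# otherwise the change applies on the next reprovide cycle / daemon restart
ipfs bitswap reprovide
```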
See the discussion here: #3320. It's the number of connections. Our current solution to this is QUIC (which go-ipfs now has experimental support for). It's a UDP-based protocol so, at the protocol level, it has no connections. The hope is that this will convince routers to STOP TRYING TO DO SMART THINGS AND JUST ROUTE THE PACKETS DAMMIT!
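For reference, enabling the QUIC experiment on go-ipfs releases from around this time looked roughly like the following (flag name and multiaddr format assumed from the experimental-features docs of that era; later releases enable QUIC by default):

```sh
# Turn on the experimental QUIC transport
ipfs config --json Experimental.QUIC true

# Add UDP/QUIC listen addresses alongside the default TCP ones
ipfs config --json Addresses.Swarm '["/ip4/0.0.0.0/tcp/4001", "/ip4/0.0.0.0/udp/4001/quic", "/ip6/::/tcp/4001", "/ip6/::/udp/4001/quic"]'
```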
That's the whole point of the watermark settings. If you want a hard cap at 50, set the HighWater value to 50.
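Concretely, the watermarks live under the `Swarm.ConnMgr` section of the config. A minimal sketch with illustrative values (key names as in recent go-ipfs releases):

```sh
# Keep roughly between 20 and 50 connections; connections younger than
# the grace period are exempt from trimming
ipfs config Swarm.ConnMgr.Type basic
ipfs config --json Swarm.ConnMgr.LowWater 20
ipfs config --json Swarm.ConnMgr.HighWater 50
ipfs config Swarm.ConnMgr.GracePeriod 20s
```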
@whyrusleeping That doesn't quite work, as it doesn't prevent new connections. I think he just wants at most 50 connections open at any given time.
Correct. The watermark doesn't seem to prevent new connections, so you still end up with hundreds of sockets open. I'm still unable to use IPFS on my home network :(
I actually have the same issue. This is my IPFS node's bandwidth over the last 30 days: [graph not reproduced here]. It's quite insane, considering that the nodes are serving just a bunch of static HTML files (in total, the shared data is less than 5 MB) and that there are only 5 people accessing that data, each around once a day, through Cloudflare (which caches the data too).
Update: I ran the same commands as @wrouesnel, and here's the result for me. My nodes are still using 400-500 GB per month, both in ingress and egress (ingress is usually higher).

```sh
/ # for p in /ipfs/bitswap/1.1.0 /ipfs/dht /ipfs/bitswap /ipfs/bitswap/1.0.0 /ipfs/kad/1.0.0 ; do echo ipfs stats bw --proto $p && ipfs stats bw --proto $p && echo "---" ; done
ipfs stats bw --proto /ipfs/bitswap/1.1.0
Bandwidth
TotalIn: 632 MB
TotalOut: 5.6 MB
RateIn: 9.6 kB/s
RateOut: 13 B/s
---
ipfs stats bw --proto /ipfs/dht
Bandwidth
TotalIn: 937 kB
TotalOut: 7.8 MB
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/bitswap
Bandwidth
TotalIn: 97 MB
TotalOut: 2.5 kB
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/bitswap/1.0.0
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/kad/1.0.0
Bandwidth
TotalIn: 1.1 GB
TotalOut: 1.5 GB
RateIn: 12 kB/s
RateOut: 8.3 kB/s
```

Routing is set to "dht" and not "dhtclient", but I am still going to change it and see if it makes any difference. Any idea what might be causing all that traffic? The node isn't hosting a lot of data, and traffic to documents that are pinned by the node should be very low...
@ItalyPaleAle Looks like DHT traffic and bitswap wantlist broadcasts. These are both greatly improved in 0.4.19 (not yet released, but latest master has all the changes); I would recommend updating. The more everyone else upgrades, the better it will get.
@whyrusleeping Glad to hear about 0.4.19. This is a "production" node, so I'd rather not run something from master; I'll wait for the update (I'm using Docker, btw). Just to confirm I understood correctly:
@ItalyPaleAle Yeah, DHT traffic can be reduced by setting your node to be just a DHT client. Much of the traffic I'm seeing in your log is DHT traffic. Bitswap traffic is mostly other peers telling you what they want, as the current mechanism for finding data is a broadcast of what you want to all connected peers. That's greatly improved in 0.4.19.
@whyrusleeping The biggest traffic (1.1 GB in and 1.5 GB out) is actually from Kad. That's over half of the total traffic. I guess those are actual files I'm serving?
@ItalyPaleAle No, actual files are served over bitswap. All Kad traffic is just announcements and searching.
It seems to me that "--routing=dhtclient" should definitely be the default setting. When developers first get started using this technology they're not going to know every 'gotcha', and we don't want them getting slammed by unexpected, massive bandwidth consumption. That will create a very negative (and potentially costly) experience for nearly 100% of early adopters. Those who really do want to participate as a DHT server can be expected to figure out how to turn that setting on.
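For anyone who wants this behaviour without remembering the daemon flag, the routing mode can also be set persistently in the config (key name as in the go-ipfs config reference; confirm against your version):

```sh
# Make the node a DHT client by default, so a plain `ipfs daemon` picks it up
ipfs config Routing.Type dhtclient
```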
Hey guys! I've been playing around with some mesh projects, was reminded of this ongoing project, and had a random question. Currently, as I understand it, IPFS stores DHT records on the N nodes that are closest to the hash, selected from the entire set of all nodes. Does any concept of a "sub-DHT" exist yet? It seems that if there is some shared set of peers that most of your peers have, there's no real need to flood wantlists to everyone; you can just have all nodes store a record on the closest node within that "sub-DHT", because it's only a 1-hop lookup for any node in the group that uses it. You could treat nodes with nearby IDs as one sub-DHT and everyone on your local network as another.

Sending fewer wantlists to other nodes would increase privacy, and using sub-DHTs wouldn't require any extra trusted nodes. You'd have some issues, like needing a way to keep one node with tons of records from majorly flooding its peer group, but eliminating wantlist floods would do a lot. You'd also have issues with different nodes having different-sized sets of peers that don't overlap perfectly, but the overall performance gain might still be better, and you could always have a flood mode for people who really care about latency.

It would also open the possibility for things like geographical node IDs. Convert your node ID to GPS coordinates, then keep generating until you get one that's within a few miles. Since you'd mostly share the same sub-DHT as your true geographical peers, fetching locally generated content might get a bit of a boost. Or use "vanity" addresses, and you might have a decent chance of being in a sub-DHT with someone who wants the blocks you have. Perhaps you could even reduce the replication factor dynamically for things like Linux disk images, where there are already tons of peers everywhere and you really just want the nearby ones if possible.
0.5.0 will have a lan-only DHT but records won't be advertised to it if we're also connected to the WAN DHT. We were concerned that flooding the LAN DHT with records would be a problem for asymmetric networks (e.g., a network with a heavily loaded server storing a lot of content and, say, a laptop, smartphone, etc.). Other than that, you may be interested in some of the discussion around ipfs/notes#291.
What do you mean by "flooding"?

Bitswap: Before trying the DHT, we ask all of our already-connected nodes. Is this what you meant by "flood"? We do this because querying connected peers tends to be faster (they also tend to be the peers that have useful content). However, we could and should get smarter about this; we should ideally have some form of staggered flood where we send messages to the peers most likely to have the content first. Also note that when this issue was created, bitswap would send every wantlist item to every peer, always. This has since changed (#3786): now, once we've started receiving blocks, we only ask peers in the session for more blocks.

DHT: When trying to find content, we traverse the DHT in an ordered search. We don't (intentionally) flood the DHT. (I say "intentionally" because, prior to the release coming out tomorrow (0.5.0), go-ipfs would connect to a large portion of the DHT for every DHT request due to some bugs.)

OT: This discussion is probably best had on the forums (https://discuss.ipfs.io). I'm going to close this issue as it spans a very large period of time, and IPFS has changed quite a bit over that period.
I couldn't find any simple advice to just start
In case this helps any go-ipfs (via Docker) users: my solution was to use the setup in https://github.com/Clay-Ferguson/quantizr/blob/master/docker-compose-dev.yaml. But actually it might be better to simply put the
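As a generic illustration only (not the setup from the linked compose file), running the official go-ipfs image in DHT-client mode might look like this, assuming the image's entrypoint passes extra arguments through to `ipfs`; the paths, ports, and tag below are just conventional defaults:

```sh
# Hypothetical example: persistent repo in ./ipfs-data, DHT client mode
docker run -d --name ipfs-node \
  -v "$PWD/ipfs-data:/data/ipfs" \
  -p 4001:4001 -p 127.0.0.1:8080:8080 -p 127.0.0.1:5001:5001 \
  ipfs/go-ipfs:v0.4.23 daemon --migrate=true --routing=dhtclient
```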
Just had to kill the persistent IPFS node I've been running on my home fileserver for the last 2 weeks due to excessive uploads (without any apparent file-serving activity taking place). The process was sitting at around 1 Mbps of uploads constantly (judging from my router's bandwidth monitor), which on a home DSL connection is a huge chunk of upload capacity to be taking over.
This was running the Docker image with SHA `sha256:f9d41131894a178f2e57ca3db8ea6f098338f636cb0631231ffecf5fecdc2569`.
I do note that my logs have a fair few messages like the following, but they don't seem particularly informative: