Discuss: batch vs. on-demand connection pruning #19
Nodes under heavy load will waste a lot of resources doing what the JavaScript implementation does, especially as it is currently written (building an array of all peers we're connected to and re-sorting it each time we get a new connection). (Another note: the JavaScript impl appears to be broken; it tries to limit peers per protocol, but if that limit is exceeded, it goes and disconnects a random peer.)

The 'hard limit' approach isn't great, as it tends to behave poorly for use cases such as performing a DHT query or fetching some data over bitswap. It leads to us sitting all the time at the high end of how many connections we want to have, meaning every single new connect causes a disconnect (and things like DHT queries tend to cause many connects).

Ideally, we would have some target number, and the connection manager would exert pressure to always move towards that number (this is the direction I was hoping to go with an 'upgrade' to the basic conn manager). Something like: while the number of connections we have is greater than our target, remove the least valued streams periodically, at a rate proportional to how far our current count is from the target. Meaning that if our connection count gets really high, we should be pruning connections really fast, but if it's less than 5% above our target, do things more slowly.

By the way, we should probably add a priority queue for all peer values; it should improve perf in both js and go.
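A minimal sketch of the "pressure proportional to distance from target" idea, in Go (the function name and the 10%-of-overshoot-per-tick rate are illustrative assumptions, not from any existing implementation):

```go
package main

import (
	"fmt"
	"math"
)

// pruneBudget returns how many connections to close on this tick.
// The rate grows with how far the current count is above the target,
// so a node far over target sheds connections quickly, while a node
// only slightly over target trims slowly.
func pruneBudget(current, target int) int {
	if current <= target {
		return 0
	}
	// Close roughly 10% of the overshoot per tick, but always at least one.
	return int(math.Max(1, float64(current-target)*0.1))
}

func main() {
	target := 300
	for _, current := range []int{290, 310, 450, 900} {
		fmt.Printf("current=%d prune=%d\n", current, pruneBudget(current, target))
	}
}
```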
I wouldn't call it wastage; you should set your low water mark to the number of connections you are okay with having persistently open. If you are lamenting the fact that you don't have 300 open connections, set your low water mark to 300.
Yeah, I'd love to have some other metrics, like bandwidth and latency (and maybe even libp2p version :P ) to help weight the decisions, but that is (1) tricky to measure accurately and (2) could lead to weird network partitions where peers with worse connections are segmented off into their own graph.
Also, the JavaScript implementation's lack of a grace period is worrying. If all existing peers are valued, and a new peer joins at the limit, the new peer will not have any value, and if I'm reading this code correctly, it will be dropped immediately.
Inviting @pgte, @jacobheun and @vasco-santos to join this thread.
Killing off connections one-by-one as we add new connections can lead to a bunch of really short-lived connections. By trimming down to a low watermark, we give ourselves some time to establish new connections and figure out which ones will likely be useful before we need to start trimming connections again. Basically, we actually want a zig-zag pattern. As @whyrusleeping noted, this also saves CPU cycles (which is why most GCs operate this way).

Personally, I'd think of the low watermark as the target and the high watermark as the "time to go back to the target" mark. Really, as @whyrusleeping said, we should probably have an explicit "target".
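For illustration, a rough sketch in Go of the batch behaviour described above (the types and names are made up for this example): nothing is trimmed until the count crosses the high watermark, then the lowest-valued connections are closed until the count is back at the low watermark.

```go
package main

import (
	"fmt"
	"sort"
)

type conn struct {
	peer  string
	value int // score assigned by protocols via tagging
}

// trimToLowWater closes the lowest-valued connections once the count
// exceeds highWater, stopping when the count is back at lowWater.
func trimToLowWater(conns []conn, lowWater, highWater int) []conn {
	if len(conns) <= highWater {
		return conns // below the high watermark: do nothing
	}
	// Sort ascending by value so the least valuable connections go first.
	sort.Slice(conns, func(i, j int) bool { return conns[i].value < conns[j].value })
	for _, c := range conns[:len(conns)-lowWater] {
		fmt.Println("closing", c.peer)
	}
	return conns[len(conns)-lowWater:]
}

func main() {
	conns := []conn{{"A", 5}, {"B", 1}, {"C", 3}, {"D", 0}, {"E", 7}}
	remaining := trimToLowWater(conns, 2, 4)
	fmt.Println("remaining:", len(remaining)) // 2
}
```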
A better connection manager would also "politely" trim old connections when above Target.
Thanks for your feedback! I haven't concerned myself with the JS implementation, only with its behaviour. (I'm not proposing to use the same data structures or algorithms.) What I'm trying to postulate is that abrupt actions affecting connectivity can be harmful and can lead to undesirable scenarios and patterns. A steady flux can lead to more stable networks (conjecture).
I regard our problem definition as different from GC. In GC you are cleaning up stuff that's used by nobody, so you let it accumulate until those resources are needed. In our case – IIUC – those connections are always in use. So it becomes a problem of optimising vs. cleaning. I suspect the system should push towards full resource utilisation, yet continuously optimise for high-value connections.

In fact, I'm inclined to think that, in our case, the max. # of connections is only a proxy for available bandwidth and sockets (as well as the memory and CPU that handling those requires), which are the true resources to optimise for. More generally, IMO there's no point in reserving unused capacity in this case, unless you have some protocols that require elasticity/stretch (see below).

On a related note, the conn manager's decisions can affect overall network topology, e.g. terminating a seemingly low-value connection could lead to a network partition at the global level. But this is a different chapter.
+1. A heap-backed priority queue would help in both scenarios (batch or flux).
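As a sketch of that, a min-heap of peer values using Go's `container/heap` (illustrative only; not the data structure used by either implementation) keeps the least-valued peer retrievable in O(log n) without re-sorting the whole peer set:

```go
package main

import (
	"container/heap"
	"fmt"
)

type peerValue struct {
	id    string
	value int
}

// peerHeap is a min-heap ordered by value, so the least valued peer
// is always at the root.
type peerHeap []peerValue

func (h peerHeap) Len() int           { return len(h) }
func (h peerHeap) Less(i, j int) bool { return h[i].value < h[j].value }
func (h peerHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *peerHeap) Push(x any)        { *h = append(*h, x.(peerValue)) }
func (h *peerHeap) Pop() any {
	old := *h
	n := len(old)
	item := old[n-1]
	*h = old[:n-1]
	return item
}

func main() {
	h := &peerHeap{{"A", 5}, {"B", 1}, {"C", 3}}
	heap.Init(h)
	heap.Push(h, peerValue{"D", 0})
	// Pop returns the least valued peer first.
	fmt.Println(heap.Pop(h).(peerValue).id) // D
}
```

Updating a peer's value in place would additionally require tracking each element's index and calling `heap.Fix`.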
Completely agree, and I like where you're heading. It introduces the concept of elasticity. But rather than reserving spare capacity, I'd like to introduce a mechanism whereby protocols can acquire temporary boosted allowances from the conn manager, for a limited time window.

Another property we could think about is plasticity: how the protocols themselves can shape the behaviour of the connection manager. After all, the constraints applicable to beacon nodes, signalling servers or rendezvous servers in a network (requiring lots of short-lived connections) will differ from those that make sense for an IPFS node trying to optimise limited resources for more efficiency. If we're allowed to run 300 connections, we want those to be highly-scored (although we want to leave room for newcomers to the network).
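Purely to illustrate the "temporary boosted allowance" idea, a hypothetical sketch (none of these types or methods exist in libp2p; everything here is made up):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// lease is a hypothetical temporary allowance granted to a protocol,
// raising the connection limit by extra until expiry.
type lease struct {
	extra  int
	expiry time.Time
}

type elasticLimit struct {
	mu     sync.Mutex
	base   int
	leases []lease
}

// Acquire grants a protocol extra headroom for a bounded window,
// e.g. for the duration of a DHT query.
func (e *elasticLimit) Acquire(extra int, d time.Duration) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.leases = append(e.leases, lease{extra: extra, expiry: time.Now().Add(d)})
}

// Limit returns the current effective limit: the base plus all
// still-valid leases; expired leases are dropped.
func (e *elasticLimit) Limit() int {
	e.mu.Lock()
	defer e.mu.Unlock()
	limit, now, live := e.base, time.Now(), e.leases[:0]
	for _, l := range e.leases {
		if now.Before(l.expiry) {
			limit += l.extra
			live = append(live, l)
		}
	}
	e.leases = live
	return limit
}

func main() {
	e := &elasticLimit{base: 300}
	e.Acquire(100, 30*time.Second) // e.g. a DHT query asks for headroom
	fmt.Println(e.Limit())         // 400 while the lease is live
}
```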
On some level, this is what the grace period is there for, correct? Allowing sufficient time for a protocol to calculate the weight/usefulness/score/value of a given connection, after its establishment. Altogether I like the concepts of backpressure, polite trimming / disconnection notices, and the elasticity around a target, vs. hard limits. There's definitely more research and thinking that needs to go into the overall design – lots of good ideas pouring in 💯
For context, there are two reasons we keep so many connections open:
We can improve (1) by:
Unfortunately, improving the situation around (2) is a hard problem. In theory, we should be able to get away with ~160 connections (conservatively), fewer for lighter clients. Really, the only connections we need to keep (other than those actively in use by, e.g., bitswap) are the DHT routing connections.

Note: Much of what I'm saying (except for the fact that keeping tons of connections open kills routers) is untested. It should be correct (if my assumptions hold) but we should also try different approaches to see if they work better.
The idea is that we keep way more connections than we really need and only trim them when we have way too many connections.
In theory, that should be the case here as well. If all goes well, we should only end up trimming connections that we aren't actively using. Unfortunately, the current "weight" system doesn't really capture this. Ideally, we'd have some way to mark connections as "in-use" and then never kill them. Note: We don't just close these connections up-front because establishing a connection can be expensive. We keep them around in case we need them in the future.
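One hypothetical shape for the "in-use" marking (the types and fields here are illustrative, not an existing API): protocols mark a connection while they actively use it, and the trimmer simply skips anything marked.

```go
package main

import "fmt"

// trackedConn is an illustrative connection record: protocols bump
// inUse while they actively use the connection and decrement it when
// done, and the trimmer never closes a connection with inUse > 0.
type trackedConn struct {
	peer  string
	value int
	inUse int
}

// trimCandidates returns the connections the trimmer may close,
// excluding anything currently marked in-use.
func trimCandidates(conns []trackedConn) []trackedConn {
	var out []trackedConn
	for _, c := range conns {
		if c.inUse == 0 {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	conns := []trackedConn{{"A", 5, 1}, {"B", 1, 0}, {"C", 0, 2}}
	fmt.Println(len(trimCandidates(conns))) // 1: only B may be trimmed
}
```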
Unfortunately, routers suck. Take a look at ipfs/kubo#3320 for some context. Basically, keeping a bunch of connections open can make routers crash, ISPs send RSTs, etc.
We definitely want to reserve "unused" bandwidth/memory/cpu capacity in most cases. Otherwise, we can't run on laptops, mobile, etc. As a matter of fact, users tend to complain about bandwidth overhead when running on cloud servers because bandwidth is expensive.
If we use it correctly, the probability of this should be vanishingly small due to the DHT. The DHT forms an overlay network that's designed to avoid network partitions, and we tag all peers in this overlay network with a non-zero weight.
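As an illustration of "tag all peers in this overlay network with a non-zero weight", a sketch against a `TagPeer`-style interface (the interface is written out locally here rather than imported, and the tag name and weight are assumptions):

```go
package main

import "fmt"

// tagger is the small slice of a connection-manager interface assumed
// here: a way to attach a named, weighted tag to a peer.
type tagger interface {
	TagPeer(peer string, tag string, weight int)
}

type logTagger struct{}

func (logTagger) TagPeer(peer, tag string, weight int) {
	fmt.Printf("tag %s: %s=%d\n", peer, tag, weight)
}

// protectRoutingTable gives every peer in the DHT routing table a
// non-zero weight so the trimmer prefers to close other connections,
// keeping the overlay (and its partition-resistance) intact.
func protectRoutingTable(cm tagger, routingTable []string) {
	for _, p := range routingTable {
		cm.TagPeer(p, "kbucket", 5)
	}
}

func main() {
	protectRoutingTable(logTagger{}, []string{"QmPeerA", "QmPeerB"})
}
```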
Yes. Unfortunately, it isn't always sufficient and tends to work best for connections we initiate (easier to determine that they're useful). This is exacerbated by the fact that bitswap doesn't currently give the connection manager enough information about which peers likely need us (need to get better about telling the connection manager about peers with outstanding requests). Beyond this, batch killing should (untested) lead to longer, more useful connection lifetimes. For example, let's say we're serving some really popular content:
Note: Currently, our batch-killing mechanism is actually kind of broken. When batch killing, we should be telling peers to back off. Unfortunately, we just disconnect and then the peers immediately reconnect. Fixing this would probably drastically improve the behavior of the connection manager. This is obviously sub-optimal and we could improve the situation by:
By the way, thanks for discussing this in such detail. Quite a bit of this hasn't really been thought through thoroughly, and almost none of the thinking that has happened has been recorded anywhere.
Relevant: Having a 'disconnect', 'back-off' or 'go-away' type protocol will help out dramatically. Also, a softer 'suspend' protocol, where we keep the connection but neither side uses it, should help out in many low-resource scenarios (I imagine this helping a lot for mobile and laptop use).

A proper disconnect protocol should also help us avoid any sort of partition during a batch disconnect: if a peer receives a disconnect and that disconnection would be detrimental to their topology, they could maybe send back a 'please let me stay connected' type request, which might give them a reset grace period, allowing them time to fill up their connection table.

I do like the idea of 'stretch', giving protocols the ability to temporarily expand the limits for certain operations.

Also worth noting: I initially wrote this code while the network was on fire and I was supposed to be on vacation. The room for improvement here is uncontested :P
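To make the shape of such an exchange concrete, a rough sketch (the message names, fields and semantics are hypothetical; no such protocol is specified anywhere):

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical control messages a connection manager could exchange
// before closing or idling a connection.
type noticeType int

const (
	GoAway  noticeType = iota // "I intend to close this connection; back off"
	Suspend                   // "keep the connection, but neither side uses it"
	Stay                      // "please let me stay connected" (reset my grace period)
)

type notice struct {
	kind    noticeType
	backoff time.Duration // how long the peer should wait before redialing
}

// handleNotice is an illustrative receiver: a peer whose topology would
// suffer from the disconnect can answer a GoAway with a Stay request.
func handleNotice(n notice, disconnectHurtsTopology bool) *notice {
	if n.kind == GoAway && disconnectHurtsTopology {
		return &notice{kind: Stay}
	}
	return nil
}

func main() {
	reply := handleNotice(notice{kind: GoAway, backoff: time.Minute}, true)
	fmt.Println(reply != nil && reply.kind == Stay) // true
}
```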
The `BasicConnMgr` terminates connections every time adding a new connection makes the connection count exceed the high watermark. This is done via the `TrimOpenConns()` method. The trimming heuristics are as follows: connections are closed until `connection count == low watermark` (batch termination).

Possible effects of the above behaviour
For the sake of illustration, let's assume:
Effects:
An observer would see a chainsaw pattern in connection count.
Alternative approaches