This repository has been archived by the owner on Aug 2, 2021. It is now read-only.

network: Suggest peer by address space gap (#2065)
* network/kademlia: proposed solution for peer suggestion in Kademlia by using address space gaps. A thorough description can be found here: ethersphere/SWIPs#32

Co-authored-by: Álvaro <kortatu@gmail.com>
2 people authored and acud committed Jan 13, 2020
1 parent 2c9e315 commit e7e98cf
Showing 6 changed files with 466 additions and 17 deletions.
92 changes: 89 additions & 3 deletions network/README.md
@@ -11,8 +11,8 @@ the latter on the downstream peer.

Subscribe on StreamerPeer launches an incoming streamer that sends
a subscribe msg upstream. The streamer on the upstream peer
handles the subscribe msg by installing the relevant outgoing streamer
. The modules now engage in a process of upstream sending a sequence of hashes of
handles the subscribe msg by installing the relevant outgoing streamer.
The modules now engage in a process of upstream sending a sequence of hashes of
chunks downstream (OfferedHashesMsg). The downstream peer evaluates which hashes are needed
and get it delivered by sending back a msg (WantedHashesMsg).

@@ -121,7 +121,7 @@ the constructor is the Run function itself. which takes a streamerpeer as argume
### provable streams

The swarm hash over the hash stream has many advantages. It implements a provable data transfer
and provide efficient storage for receipts in the form of inclusion proofs useable for finger pointing litigation.
and provide efficient storage for receipts in the form of inclusion proofs usable for finger pointing litigation.
When challenged on a missing chunk, upstream peer will provide an inclusion proof of a chunk hash against the state of the
sync stream. In order to be able to generate such an inclusion proof, upstream peer needs to store the hash index (counting consecutive hash-size segments) alongside the chunk data and preserve it even when the chunk data is deleted until the chunk is no longer insured.
if there is no valid insurance on the files the entry may be deleted.
@@ -150,3 +150,89 @@ and simply iterate on index per bin when syncing with a peer.
priority queues are used for sending chunks so that user triggered requests should be responded to first, session syncing second, and historical with lower priority.
The request on chunks remains implemented as a dataless entry in the memory store.
The lifecycle of this object should be more carefully thought through, ie., when it fails to retrieve it should be removed.

## Address space gaps
In order to optimize Kademlia load balancing, performance and peer suggestion, we define the concept of an `address space gap`,
or simply `gap`.
A `gap` is a portion of the overlay address space in which the current node does not know any peer. It can be represented
as a range of addresses: `0xxx`, meaning `0000-0111`.

The `proximity order of a gap`, or `gap po`, is the proximity order of that address space with respect to the nearest peer(s)
in the kademlia connected table (also taking the current node's address into account). For example, if the node address is `0000`,
the gap of addresses `1xxx` has proximity order 0, whereas the gap `01xx` has proximity order 1.
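
As a rough illustration of proximity order, the sketch below computes it for 8-bit addresses (the width used in the longer examples of this document) as the number of leading bits two addresses share. The `proximity` helper is hypothetical and only for illustration; it is not the `Pof` function used by the swarm code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// proximity returns the number of leading bits shared by two 8-bit
// addresses. Identical addresses share all 8 bits.
func proximity(a, b uint8) int {
	return bits.LeadingZeros8(a ^ b)
}

func main() {
	base := uint8(0b00000000)
	// Any address in 1xxxxxxx differs from the base in the first bit.
	fmt.Println(proximity(base, 0b10000000)) // 0
	// Any address in 01xxxxxx shares exactly one leading bit with the base.
	fmt.Println(proximity(base, 0b01000000)) // 1
}
```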

The `size of a gap` is defined as the number of addresses that could fit in it. If the area of the whole address space is 1,
the `size of a gap` can be derived from the `gap po` as `1 / 2 ^ (po + 1)`. For example, our previous `1xxx` gap (po 0) has a size of
`1 / (2 ^ 1) = 1/2`. The size of `01xx` (po 1) is `1 / (2 ^ 2) = 1/4`.
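
The size formula can be checked with a few lines of Go. This is a minimal sketch; `gapSize` is a hypothetical helper name, not part of the swarm code.

```go
package main

import (
	"fmt"
	"math"
)

// gapSize returns the fraction of the whole address space covered by a gap
// of the given proximity order, that is 1 / 2^(po+1).
func gapSize(po int) float64 {
	return math.Ldexp(1, -(po + 1))
}

func main() {
	fmt.Println(gapSize(0)) // 0.5  (gap 1xxx)
	fmt.Println(gapSize(1)) // 0.25 (gap 01xx)
}
```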

In order to improve the performance of content retrieval and delivery, the node should minimize the size of its gaps, because this
means that it knows peers near almost all addresses. If the lowest `gap po` in the kademlia table is 4, any
lookup or forwarding can be handed to a peer that is at least proximity order 4 from the target. On the other hand, if the node has a po 0 `gap`, then
for half of the addresses the next hop will still be only po 0 from the target.

### Gaps for peer suggestion
The current kademlia bootstrap algorithm tries to fill in the bins (or po spaces) until some level of saturation is reached.
In the process, the `gaps` will diminish, but not in the optimal way.

For example, suppose the node address is `00000000`, it is connected to only one peer in bin 0, `10000000`, and the known
addresses for bin 0 are `10000001` and `11000000`. The current algorithm takes the first `callable` one, so
it may suggest `10000001` as the next peer. This is not optimal, as the biggest `gap` in bin 0 will still be
po 1 => `11xxxxxx`. If, however, the algorithm is improved to search for a peer that covers a bigger `gap`, `11000000` would
be selected and the biggest `gaps` would then be po 2 => `111xxxxx` and `101xxxxx`.

Additionally, even when the node does not know an address inside a particular `gap`, it can still select the address furthest away
from the current peers so that it covers a bigger `gap`. In the previous example with node `00000000` and one peer already connected,
`10000000`, if the known addresses are `10000001` and `10010000`, the best suggestion would be the last one, because it is po 3
from the nearest peer, as opposed to `10000001`, which is po 7 from it. The best candidate will cover a `gap` of po 3 size
(1/16 of the area, or 16 addresses) and the other one just po 7 size (1/256 of the area, or 1 address).
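
The selection rule described above can be sketched as follows. This is a simplified illustration on 8-bit addresses, not the code of `suggestPeerInBinByGap`: the helpers `proximity` and `suggest` are hypothetical, and the sketch ignores whether a candidate is `callable`.

```go
package main

import (
	"fmt"
	"math/bits"
)

// proximity returns the number of leading bits shared by a and b.
func proximity(a, b uint8) int {
	return bits.LeadingZeros8(a ^ b)
}

// suggest returns the candidate that lies furthest (lowest po) from its
// nearest connected peer. ok is false when there are no candidates.
func suggest(connected, candidates []uint8) (best uint8, ok bool) {
	bestPo := 9 // higher than any po between two 8-bit addresses
	for _, c := range candidates {
		// The nearest connected peer is the one with the highest po.
		nearest := 0
		for _, p := range connected {
			if po := proximity(c, p); po > nearest {
				nearest = po
			}
		}
		if nearest < bestPo {
			bestPo, best, ok = nearest, c, true
		}
	}
	return best, ok
}

func main() {
	connected := []uint8{0b10000000}
	best, _ := suggest(connected, []uint8{0b10000001, 0b10010000})
	fmt.Printf("%08b\n", best) // 10010000, po 3 from the connected peer
}
```

With the known addresses of the first example (`10000001` and `11000000`), the same function picks `11000000`, which is po 1 from the connected peer.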

### Gaps and load balancing
One additional benefit of considering `gaps` is load balancing. If the target addresses are distributed randomly
(although address popularity is another problem that can also be studied from the `gap` perspective), requests will
be automatically load balanced if we try to connect to peers covering the bigger `gaps`. Continuing with our example,
if in bin 0 we have peers `10000000` and `10000001` (Fig. 1), almost all addresses in space `1xxxxxxx`, that is, half of all
addresses, will be at the same distance from both peers. If we need to send to any of those addresses, we will need to use
one of those peers. The choice could be made randomly, by always taking the first, or with some load balancing accounting that picks the least
used one.
![Fig. 1](https://raw.githubusercontent.com/kortatu/swarm_doc/master/address_space_gaps-lb-1.png)
Fig. 1 - Peers close to each other need an external load balancing mechanism

This last method will still be useful, but if the `gap` filling strategy is used, both peers will most probably
be separated enough that they never compete for an address, and a natural load balancing will emerge among them (for example,
`10000000` and `11000000` will each be used for half the addresses in bin 0 (Fig. 2)).
![Fig. 2](https://raw.githubusercontent.com/kortatu/swarm_doc/master/address_space_gaps-lb-2.png)
Fig.2 - Peers chosen by space address gap have a natural load balancing
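
The two configurations in Fig. 1 and Fig. 2 can be compared numerically. The sketch below assumes 8-bit addresses and uses proximity order as the only measure of closeness; `proximity` and `share` are hypothetical helpers. It counts, for every address in bin 0 of node `00000000`, which of two peers is closer and how many addresses are tied.

```go
package main

import (
	"fmt"
	"math/bits"
)

// proximity returns the number of leading bits shared by a and b.
func proximity(a, b uint8) int {
	return bits.LeadingZeros8(a ^ b)
}

// share walks all addresses in 1xxxxxxx and counts how many are strictly
// closer to peer a, strictly closer to peer b, or equally close to both.
func share(a, b uint8) (toA, toB, tied int) {
	for addr := 0x80; addr <= 0xff; addr++ {
		pa := proximity(uint8(addr), a)
		pb := proximity(uint8(addr), b)
		switch {
		case pa > pb:
			toA++
		case pb > pa:
			toB++
		default:
			tied++
		}
	}
	return toA, toB, tied
}

func main() {
	fmt.Println(share(0b10000000, 0b10000001)) // 1 1 126
	fmt.Println(share(0b10000000, 0b11000000)) // 64 64 0
}
```

In the first configuration 126 of the 128 addresses are tied, so an external tie-breaking policy is needed; in the second each peer is strictly closer for exactly half of the bin.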

### Implementation
The search for gaps can be done easily using a proximity order tree or `pot`. Traversing the bins of a node, a `gap` is
found if any po is missing, starting from the furthest (left). At each level the starting po to search for is the
parent po (not 0, because at the second level, under a node of po=0, the minimum po that can be found is 1).

The function that looks for the biggest gap in a `pot` is implemented in
`pot.BiggestAddressGap`. It returns the biggest gap in the form of a po and
the node under which the gap can be found.

This function is used in `kademlia.suggestPeerInBinByGap`, which returns a BzzAddress in a particular bin that fills
up the biggest address gap. `suggestPeerInBinByGap` is not yet used in `SuggestPeer`, but enabling it only requires replacing the call to
`suggestPeerInBin` with the new function.
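
To make the per-level scan concrete, here is a minimal sketch of a single level of that traversal. It is deliberately not the `pot` implementation: `firstMissingPo` is a hypothetical helper that only receives the parent's po and the sorted proximity orders of its bins, and reports the first po that has no bin.

```go
package main

import "fmt"

// firstMissingPo scans the bins of one node, given as proximity orders in
// ascending order, and returns the lowest po above parentPo without a bin.
// If no po is skipped, the gap starts right after the last bin.
func firstMissingPo(parentPo int, binPos []int) int {
	expected := parentPo + 1
	for _, po := range binPos {
		if po > expected {
			return expected // a po was skipped: this is the gap
		}
		if po == expected {
			expected = po + 1
		}
	}
	return expected
}

func main() {
	// Bins at po 1, 2 and 4 under a node of po 0: po 3 is missing.
	fmt.Println(firstMissingPo(0, []int{1, 2, 4})) // 3
	// A node without bins has its first gap right below it.
	fmt.Println(firstMissingPo(2, nil)) // 3
}
```

The full search additionally descends into each bin and keeps the lowest gap po found anywhere in the tree.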

### Further improvements
Instead of the size of a gap, it may be more interesting to look at the ratio between the size and the number of current
peers serving that gap. If we have `n` current peers that are equidistant to a particular gap of size `s`,
the load of each of these peers will be on average `s/n`.
We can define a gap's `temperature` as that number `s/n`. When looking for new peers to connect to, instead of looking for
bigger gaps we could look for `hotter` gaps.
For example, if in our first example we can't find a peer in `11xxxxxx` and instead use the best available peer, we could end
up with the configuration in Fig. 3.
![Fig. 3](https://raw.githubusercontent.com/kortatu/swarm_doc/master/address_space_gaps-lb-3.png)
Fig. 3 - Comparing gaps temperature

Here we still have `11xxxxxx` as the biggest gap (po=1, size 1/4), the same size as `01xxxxxx`. But if we consider temperature,
`01xxxxxx` is hotter because it is served only by our node `00000000`, so its temperature is `(1/4) / 1 = 1/4`. However,
`11xxxxxx` is now served by two peers, so its temperature is `(1/4) / 2 = 1/8`, which means that we will select
`01xxxxxx` as the hotter one.
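
The temperature comparison can be written down directly. This is a minimal sketch: `temperature` is a hypothetical helper, and the gap size is taken as `1 / 2 ^ (po + 1)`.

```go
package main

import (
	"fmt"
	"math"
)

// temperature returns s/n for a gap of the given proximity order served by
// n equidistant peers, with s = 1 / 2^(po+1). n must be greater than zero.
func temperature(po, n int) float64 {
	return math.Ldexp(1, -(po+1)) / float64(n)
}

func main() {
	fmt.Println(temperature(1, 1)) // 0.25  (01xxxxxx, served by one node)
	fmt.Println(temperature(1, 2)) // 0.125 (11xxxxxx, served by two peers)
}
```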

There is a way of implementing the temperature calculation so that its cost is the same as looking for the biggest gap: the temperature
can be calculated on the fly as the gap is found using a `pot`.

Other metrics could be considered in the temperature, such as the recent number of requests per address space or the performance of
current peers, among others.
73 changes: 61 additions & 12 deletions network/kademlia.go
@@ -427,28 +427,77 @@ func (k *Kademlia) SuggestPeer() (suggestedPeer *BzzAddr, saturationDepth int, c
return false
}
}
// curPO found
// find a callable peer out of the addresses in the unsaturated bin
// stop if found
bin.ValIterator(func(val pot.Val) bool {
e := val.(*entry)
if k.callable(e) {
suggestedPeer = e.BzzAddr
return false
}

return true
})
suggestedPeer = k.suggestPeerInBin(bin)
return cur < len(bins) && suggestedPeer == nil
}, true)
}

if uint8(saturationDepth) < k.saturationDepth {
k.saturationDepth = uint8(saturationDepth)
return suggestedPeer, saturationDepth, true
}
return suggestedPeer, 0, false
}

func (k *Kademlia) suggestPeerInBin(bin *pot.Bin) *BzzAddr {
	var foundPeer *BzzAddr
	// curPO found
	// find a callable peer out of the addresses in the unsaturated bin
	// stop if found
	bin.ValIterator(func(val pot.Val) bool {
		e := val.(*entry)
		if k.callable(e) {
			foundPeer = e.BzzAddr
			return false
		}
		return true
	})
	return foundPeer
}

// suggestPeerInBinByGap tries to find the best peer to connect to in a particular bin by looking for the biggest
// address gap in the current connections bin of the same proximity order, instead of using the first address that is
// callable. If there is no current bin of po = bin.ProximityOrder, or it is empty, the usual suggestPeerInBin algorithm
// is used.
// The bin parameter is the bin in the addresses from which to select a BzzAddr.
// The return value is the BzzAddr selected.
func (k *Kademlia) suggestPeerInBinByGap(bin *pot.Bin) *BzzAddr {
	connBin := k.defaultIndex.conns.PotWithPo(k.base, bin.ProximityOrder, Pof)
	if connBin == nil {
		return k.suggestPeerInBin(bin)
	}
	gapPo, gapVal := connBin.BiggestAddressGap()
	// We need an address in the missing gapPo space with respect to gapVal.
	// The lower the gapPo, the bigger the address space gap.
	var foundPeer *BzzAddr
	var candidatePeer *BzzAddr
	furthestPo := 256
	// find a callable peer out of the addresses in the unsaturated bin
	// stop if one is found exactly at gapPo
	bin.ValIterator(func(val pot.Val) bool {
		e := val.(*entry)
		addrPo, _ := Pof(gapVal, e.BzzAddr, bin.ProximityOrder)
		if k.callable(e) {
			if addrPo == gapPo {
				foundPeer = e.BzzAddr
				return false
			}
			if addrPo < furthestPo {
				furthestPo = addrPo
				candidatePeer = e.BzzAddr
			}
			return true
		}
		return true
	})
	if foundPeer != nil {
		return foundPeer
	}
	// No peer exactly gapPo away from the gap's reference value was found, so we return the farthest candidate
	return candidatePeer
}

// On inserts the peer as a kademlia peer into the live peers
func (k *Kademlia) On(p *Peer) (uint8, bool) {
k.lock.Lock()
75 changes: 75 additions & 0 deletions network/kademlia_test.go
@@ -1069,3 +1069,78 @@ func TestCapabilityNeighbourhoodDepth(t *testing.T) {
t.Fatalf("cap 'one' expected depth 2, was %d", depth)
}
}

// TestSuggestPeerInBinByGap checks that when several addresses are available to register in the same bin, the
// one suggested is the one that fills the biggest address gap in that bin.
func TestSuggestPeerInBinByGap(t *testing.T) {
	tk := newTestKademlia(t, "11111111")
	tk.Register("00000000", "00000001")
	bin0 := tk.getAddressBin(0)
	if bin0 == nil {
		t.Errorf("Expected bin 0 in addresses to be found but is nil")
	}

	// Adding 00000000, for example; it doesn't really matter which of the first two is added
	tk.On("00000000")
	tk.Register("01000000")
	suggestedByGapPeer := tk.suggestPeerInBinByGap(tk.getAddressBin(0))
	binaryString := bzzAddrToBinary(suggestedByGapPeer)
	// Expected suggestion is 01000000 because it covers a bigger part of the address space in bin 0.
	if binaryString != "01000000" {
		t.Errorf("Expected suggestion by gap to be 01000000 because is in po=1 gap, but got %v", binaryString)
	}
	// Adding 01000000
	tk.On(binaryString)
	// Now we will try to fill in po 1
	tk.Register("10000000", "11110000")
	bin1 := tk.getAddressBin(1)
	// Of the two peers, the first one (10000000) covers a bigger gap than the other one in our kademlia table (it is farther from
	// our base 11111111)
	suggestedByGapPeer = tk.suggestPeerInBinByGap(bin1)
	binaryString = bzzAddrToBinary(suggestedByGapPeer)
	if binaryString != "10000000" {
		t.Errorf("Expected suggestion by gap to be 10000000 because is in po=1 gap, but got %v", binaryString)
	}
}

// TestSuggestPeerInBinByGapCandidate checks that when suggesting addresses, if an address in the desired gap can't be
// found, the furthest away from the reference peer will be chosen (the one with the lower po, so it will fill up a bigger
// part of the gap)
func TestSuggestPeerInBinByGapCandidate(t *testing.T) {
	tk := newTestKademlia(t, "11111111")
	tk.On("00000000", "10000000")
	// Connecting address (10000100) po=5 from 10000000 to leave a big gap [2..4]
	tk.On("10000100")
	// Now we are going to suggest for a biggest gap that doesn't match any of the available addresses. The algorithm
	// should take the furthest from the reference address (parent of the gap, so 10000000).
	// We now have a gap po=2 under 10000000 in bin1. We are not going to register an address po=2 (e.g. 10100000) but
	// two addresses at po=3 and po=4 from it. The algorithm should return the farthest candidate (po=3).
	// 10010000 => po=3 from 10000000
	// 10001000 => po=4 from 10000000
	tk.Register("10010000", "10001000")
	suggestedCandidate := tk.suggestPeerInBinByGap(tk.getAddressBin(1))
	binaryString := bzzAddrToBinary(suggestedCandidate)
	if binaryString != "10010000" {
		t.Errorf("Expected furthest candidate to be 10010000 at po=3, but got %v", binaryString)
	}
}

// getAddressBin is a utility function to obtain a Bin by po
func (tk *testKademlia) getAddressBin(po int) *pot.Bin {
	var theBin *pot.Bin
	tk.defaultIndex.addrs.EachBin(tk.base, Pof, po, func(bin *pot.Bin) bool {
		if bin.ProximityOrder == po {
			theBin = bin
			return false
		} else if bin.ProximityOrder > po {
			return false
		} else {
			return true
		}
	}, true)
	return theBin
}

// bzzAddrToBinary returns the first byte of the overlay address as a bit string
func bzzAddrToBinary(bzzAddress *BzzAddr) string {
	return byteToBitString(bzzAddress.OAddr[0])
}
5 changes: 3 additions & 2 deletions network_test.go
@@ -352,13 +352,14 @@ func testSwarmNetwork(t *testing.T, o *testSwarmNetworkOptions, steps ...testSwa

for syncing := true; syncing; {
syncing = false
time.Sleep(1 * time.Second)

for _, id := range nodeIDs {
if sim.MustNodeItem(id, bucketKeyInspector).(*api.Inspector).IsPullSyncing() {
syncing = true
break
}
}

time.Sleep(1 * time.Second)
}

for {
83 changes: 83 additions & 0 deletions pot/pot.go
@@ -925,3 +925,86 @@ func (t *Pot) sstring(indent string) string {
}
return s
}

// PotWithPo returns a Pot with all elements at proximity order desiredPo w.r.t. pivotVal.
// It is similar to obtaining a bin, but in a tree structure that helps in some calculations.
func (t *Pot) PotWithPo(pivotVal Val, desiredPo int, pof Pof) *Pot {
	if t == nil || t.size == 0 {
		return nil
	}
	pivotProximityOrder, _ := pof(t.pin, pivotVal, 0)
	pivotPot, pivotBinIndex := t.getPos(pivotProximityOrder)
	if pivotProximityOrder < desiredPo {
		if pivotPot != nil && pivotPot.po == pivotProximityOrder {
			return pivotPot.PotWithPo(pivotVal, desiredPo, pof)
		}
		// There is no bin with the desired po
		return nil
	}
	if pivotProximityOrder == desiredPo {
		prunedPot := t.clone()
		prunedPot.po = desiredPo
		actualPivotPlace := pivotBinIndex
		if pivotPot == nil {
			actualPivotPlace--
		}
		var removedBinsSize int
		for i := 0; i < len(prunedPot.bins) && i <= actualPivotPlace; i++ {
			removedBinsSize += prunedPot.bins[i].size
		}
		prunedPot.size = prunedPot.size - removedBinsSize
		if prunedPot.bins != nil {
			prunedPot.bins = prunedPot.bins[actualPivotPlace+1:]
		}
		return prunedPot
	}
	// if pivotProximityOrder > desiredPo
	for i := 0; i < len(t.bins); i++ {
		n := t.bins[i]
		if n.po == desiredPo {
			return n
		}
	}
	return nil
}

// BiggestAddressGap tries to find the biggest address gap not covered by an element in the address space.
// The biggest gaps tend to be at the top left of the tree (if the pot is rendered root top and bins with po = 0 left).
// As the bins progress to the right or down (higher proximity order) the address space gap left is smaller.
// An address gap is defined as a missing proximity order without any value. So for example, a root value with two
// bins, one with po 0 and one with po 2, has a gap in po=1. Of course it also has a gap in po>=3 but that gap is smaller
// in number of addresses contained. If the total space area is 1, the space covered by a bin of proximity order n can
// be defined as 1/2^(n+1). So po=0 will occupy half of the area, po=5 1/64 of the area and so on.
// When a gap is found there is no need to go further on that level because advancing (horizontally or vertically) will
// decrease the maximum gap space by half.
// The function returns the proximity order of the gap and the reference value where the gap has been found (so the
// exact address set can be calculated).
func (t *Pot) BiggestAddressGap() (po int, val Val) {
	if t == nil || t.size == 0 {
		return 0, nil
	}

	if len(t.bins) == 0 {
		return t.po + 1, t.pin
	}

	wrt := t.pin
	biggest := 256
	last := t.po
	for _, subPot := range t.bins {
		if subPot.po > last+1 && last+1 <= biggest {
			wrt = t.pin
			biggest = last + 1
			break
		} else {
			last = subPot.po
			subBiggest, aVal := subPot.BiggestAddressGap()
			if subBiggest < biggest {
				biggest = subBiggest
				wrt = aVal
			}
		}
	}

	return biggest, wrt
}