
Bitswap Sessions Exploration #165

Closed
dirkmc opened this issue Aug 1, 2019 · 4 comments

Comments

@dirkmc (Contributor) commented Aug 1, 2019

Bitswap sessions group together requests for related blocks (eg the blocks that make up a file) to take advantage of the fact that related blocks can usually be retrieved from the same peers.

This Issue describes the current implementation of sessions.

Background

In IPFS a file is broken up into blocks that are organized into a DAG. The file is identified by the CID of the block representing the DAG root. To fetch a file:

  • IPFS asks Bitswap for the block corresponding to the root CID
  • Bitswap retrieves the block with that CID
  • IPFS asks Bitswap for the blocks for the root's children
  • Bitswap retrieves the children
  • IPFS asks Bitswap for the children of those children, and so on

If these requests are all made using a single Bitswap session, the session remembers which peers have responded before and how "far" each peer is (in terms of network latency), and it manages splitting the requests across peers and combining the responses.
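As a concrete illustration of this flow, here is a minimal Go sketch that fetches a root and its children through a single session, assuming go-bitswap's NewSession / GetBlock / GetBlocks API; the childLinks helper is hypothetical and stands in for however the caller extracts child links from a block:

```go
package main // illustrative sketch only

import (
	"context"

	bitswap "github.com/ipfs/go-bitswap"
	blocks "github.com/ipfs/go-block-format"
	cid "github.com/ipfs/go-cid"
)

// fetchDAG fetches a root block and then its children through a single
// Bitswap session, so all the related requests share the session's peer
// and latency state.
func fetchDAG(ctx context.Context, bs *bitswap.Bitswap, root cid.Cid, childLinks func(blocks.Block) []cid.Cid) error {
	session := bs.NewSession(ctx) // fetcher scoped to this DAG

	rootBlk, err := session.GetBlock(ctx, root)
	if err != nil {
		return err
	}

	// Ask for all of the root's children in one batch; the session decides
	// how to split the wantlist across the peers it has learned about.
	ch, err := session.GetBlocks(ctx, childLinks(rootBlk))
	if err != nil {
		return err
	}
	for blk := range ch {
		_ = blk // descend into blk's children with the same session, and so on
	}
	return nil
}
```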

Session Implementation

Timers

The session maintains two timers:

  • idle timer (1 second by default)

    • triggered if no block has been received in the idle interval
    • broadcasts the wantlist to all connected peers
    • searches for more peers that have the first CID in the wantlist (eg by asking the DHT)

    The idle interval starts at 1 second. Once at least one peer has responded to a request, it is set to 500ms + (3 x the average latency), with an increasing back-off each time the idle timer fires without a block having been received in the interim (see the sketch after this list).
  • periodic search timer (1 minute by default)

    • triggered periodically
    • broadcasts a random CID from the wantlist to all connected peers
    • searches for more peers that have the random CID (eg by asking the DHT)
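A rough sketch of the idle interval calculation described above; idleInterval and consecutiveIdleTicks are illustrative names, and the linear back-off multiplier is an assumption rather than the exact go-bitswap code:

```go
package main // illustrative sketch only

import "time"

// idleInterval returns how long the session waits for a block before the
// idle timer fires again.
func idleInterval(avgLatency time.Duration, consecutiveIdleTicks int) time.Duration {
	if avgLatency == 0 {
		return time.Second // no peer has responded yet: use the 1s default
	}
	base := 500*time.Millisecond + 3*avgLatency
	// Back off a little more each time the timer fires without a block
	// having been received in the interim.
	return base * time.Duration(1+consecutiveIdleTicks)
}
```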

Request sharding

Peer list

As peers are discovered, they are added to a peer list. Each request to a peer is timed, and the latency for the peer is adjusted according to the formula:
latency = <current latency> * 0.5 + <new latency> * 0.5
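The weighting above, as a small Go sketch (updateLatency is an illustrative name, not the actual identifier):

```go
package main // illustrative sketch only

import "time"

// updateLatency folds a newly measured request latency into a peer's
// running estimate using the 50/50 weighting above.
func updateLatency(current, measured time.Duration) time.Duration {
	if current == 0 {
		return measured // first sample for this peer
	}
	return current/2 + measured/2
}
```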

Live wants

The session maintains a list of "live" wants, ie requests for a CID that have not yet been fulfilled. There can be up to 32 live wants at a time.

Requests are made when:

  • a new request for block CIDs is made via GetBlock() or GetBlocks()
    • the free space in the live want queue is filled (up to 32 slots)
    • any CIDs that don't fit are put into a secondary queue
  • a response comes in
    • the CIDs from the response are removed from the live want queue
    • the live want queue is filled with CIDs from the secondary queue

In either case, the CIDs that were added to the live want queue are sent out as a want request.
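A sketch of how the live want queue might be topped up from the secondary queue, assuming a map for the live wants and a slice for the overflow; the names and data structures are illustrative, not the go-bitswap internals:

```go
package main // illustrative sketch only

import cid "github.com/ipfs/go-cid"

// liveWantsLimit is the 32-slot cap on in-flight wants per session.
const liveWantsLimit = 32

// fillLiveWants moves CIDs from the secondary (overflow) queue into the
// live want set until the cap is reached, returning the CIDs to send in
// the next want request along with whatever is still queued.
func fillLiveWants(live map[cid.Cid]struct{}, pending []cid.Cid) (toSend, stillPending []cid.Cid) {
	for len(pending) > 0 && len(live) < liveWantsLimit {
		c := pending[0]
		pending = pending[1:]
		if _, ok := live[c]; ok {
			continue // already a live want
		}
		live[c] = struct{}{}
		toSend = append(toSend, c)
	}
	return toSend, pending
}
```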

Request Splitting

The first want request is broadcast to all peers that this node is connected to. As responses come in, the peers are added to the peer list in order of latency.

Once there are peers in the peer list, any subsequent want request:

  • selects the "closest" (by latency) peers from the peer list, up to a maximum of 32
  • splits the peers into groups
  • splits the wanted CIDs into groups
  • sends each CID group to the peers in a peer group

For example if there are 8 peers (A - H) and 10 CIDs (0 - 9), and the split factor is 3:

cid0 cid3 cid6 cid9  ->  A D G
cid1 cid4 cid7       ->  B E H
cid2 cid5 cid8       ->  C F
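The grouping in this example is a round-robin split of both the peer list and the CID list; here is a sketch, with illustrative names rather than the actual splitter code:

```go
package main // illustrative sketch only

import (
	cid "github.com/ipfs/go-cid"
	peer "github.com/libp2p/go-libp2p-core/peer"
)

// splitRequest distributes peers and wanted CIDs round-robin into
// splitFactor groups, reproducing the example above; CID group i is then
// sent to every peer in peer group i.
func splitRequest(peers []peer.ID, cids []cid.Cid, splitFactor int) (peerGroups [][]peer.ID, cidGroups [][]cid.Cid) {
	peerGroups = make([][]peer.ID, splitFactor)
	cidGroups = make([][]cid.Cid, splitFactor)
	for i, p := range peers {
		peerGroups[i%splitFactor] = append(peerGroups[i%splitFactor], p)
	}
	for i, c := range cids {
		cidGroups[i%splitFactor] = append(cidGroups[i%splitFactor], c)
	}
	return peerGroups, cidGroups
}
```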

Often more than one peer will have the same block (eg peers A & D might both have cid0). When the local node receives a block, it broadcasts a cancel message for that block CID to all connected peers. However the cancel may not be processed by the recipient peer before it has sent out the block, meaning the local node will receive a duplicate of the block. The local node keeps track of the ratio of duplicates / received blocks and adjusts the split factor:

  • If the ratio goes above 0.4 (a lot of duplicates) the split factor is increased. This means that the same CID will be sent to fewer peers, eg with a split factor of 5 for the above example:
  cid0 cid5  ->  A F
  cid1 cid6  ->  B G
  cid2 cid7  ->  C H
  cid3 cid8  ->  D
  cid4 cid9  ->  E
  • If the ratio goes below 0.2 (very few duplicates) the split factor is decreased. This means that the same CID will be sent to more peers, eg with a split factor of 2 for the above example:
  cid0 cid2 cid4 cid6 cid8  ->  A C E G
  cid1 cid3 cid5 cid7 cid9  ->  B D F H

The split factor is initially set to 2 and can range from 1 to 16.
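A sketch of that adjustment, using the thresholds and the 1 to 16 bounds described above (adjustSplitFactor is an illustrative name):

```go
package main // illustrative sketch only

// adjustSplitFactor nudges the split factor based on the observed ratio
// of duplicate blocks to received blocks.
func adjustSplitFactor(splitFactor int, dupRatio float64) int {
	switch {
	case dupRatio > 0.4 && splitFactor < 16:
		splitFactor++ // lots of duplicates: send each CID to fewer peers
	case dupRatio < 0.2 && splitFactor > 1:
		splitFactor-- // very few duplicates: send each CID to more peers
	}
	return splitFactor
}
```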

Currently the peer list is ordered linearly by latency; however, there is a WIP PR to adjust the list order probabilistically according to latency.

@dirkmc (Contributor, Author) commented Aug 1, 2019

@Stebalien and @hannahhoward I created this issue to make sure I understand the current state of Bitswap sessions before thinking about how to iterate on improving performance. Please read over it and let me know if there are any inaccuracies or things that can be clarified.

@Stebalien (Member)

To the best of my knowledge, that looks correct.

@dirkmc (Contributor, Author) commented Aug 2, 2019

@daviddias @hannahhoward @whyrusleeping it would be super helpful to have your ideas on the issues linked to this issue: #166 #167 #168

@dirkmc (Contributor, Author) commented Mar 4, 2020

Implemented as part of ipfs/kubo#6782

dirkmc closed this as completed Mar 4, 2020