
Bitswap Rounds #417

Closed · jbenet opened this issue Dec 8, 2014 · 27 comments

@jbenet (Member) commented Dec 8, 2014

This doc describes how the Bitswap Protocol v1 should work.

For now, we assume an understanding of the goals and high level description. In time, this document will hopefully be self-contained.

I describe here the desired state, not necessarily what we'll implement right away.

Elements

Bitswap Concepts

  • Peer - other hosts we exchange data with
  • Ledger - containing peer-to-peer accounting
  • Wantlist - list of desired blocks
  • Strategy - pluggable decision making unit

Implementation Details

  • Client - the client of bitswap (e.g. ipfs node)
  • Message - a bitswap message, outgoing or incoming
  • Blockstore - local block storage
  • Available Bandwidth - defined by Client
  • Round - an iteration of the protocol, within one host
  • Allocation - what to send to whom. decided on by Strategy, per Round.

go-ipfs specifics

  • net.Service - registered service, which sends + receives messages
  • net.Handler - request handler

Message looks like this:

message Message {

  message Wantlist {

    message Entry {
      optional string block = 1; // the block key
      optional int32 priority = 2; // the priority (normalized). defaults to 1
      optional bool cancel = 3;  // whether this revokes an entry
    }

    repeated Entry entries = 1; // a list of wantlist entries
    optional bool full = 2;     // whether this is the full wantlist. default to false
  }

  optional Wantlist wantlist = 1;
  repeated bytes blocks = 2;
}

High Level:

  • the core bitswap service employs two async workers:
    • the RoundWorker, which handles rounds, allocations, and sending
    • the ClientWorker (needs a better name), which handles requests from the Client
  • all incoming requests from Peers and the Client pass information to the workers through channels. Workers modify state, may issue outgoing requests, and send back responses.

The RoundWorker:

  • maintains each peer's Wantlist
  • iteratively performs rounds:
    • call (s *Strategy) GetAllocation(Ledgers, Wantlist, Available Bandwidth, Blockstore) *Allocation
    • send out blocks according to Allocation
  • (select on Wantlist update channel, and on round time interval?)

Normally, we would let the data sending define the rounds (this makes it easier to use available bandwidth correctly), but our net system may buffer the messages and return before actually sending them (we can look into removing this). So we may have to use timed intervals (0.25, 0.5, 1s?) instead.

Example of the RoundWorker
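A minimal sketch of that loop; the names (peerWantlistUpdates, wantlists.apply, sendBlocks, roundPeriod) are illustrative stand-ins, not actual go-ipfs identifiers:

func (bs *bitswap) roundWorker(ctx context.Context) {
  ticker := time.NewTicker(bs.roundPeriod) // e.g. 0.25-1s, per the note above
  defer ticker.Stop()
  for {
    select {
    case update := <-bs.peerWantlistUpdates:
      bs.wantlists.apply(update) // fold the update into that peer's Wantlist
    case <-ticker.C:
      // one Round: ask the Strategy for an Allocation, then send accordingly
      alloc := bs.strategy.GetAllocation(bs.ledgers, bs.wantlists, bs.bandwidth, bs.blockstore)
      bs.sendBlocks(ctx, alloc)
    case <-ctx.Done():
      return
    }
  }
}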

The ClientWorker:

  • maintains local Wantlist
  • periodically rebroadcasts Wantlist (5-15s?)
  • sends out single Wantlist updates to other nodes
  • (select on local request channel, and on periodic rebroadcast)
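A matching sketch for this worker, again with illustrative names (localWantlistUpdates, broadcastUpdate, broadcastWantlist):

func (bs *bitswap) clientWorker(ctx context.Context) {
  rebroadcast := time.NewTicker(10 * time.Second) // somewhere in the 5-15s range
  defer rebroadcast.Stop()
  for {
    select {
    case update := <-bs.localWantlistUpdates:
      bs.wantlist.apply(update)
      bs.broadcastUpdate(ctx, update) // single-entry update to trading peers
    case <-rebroadcast.C:
      bs.broadcastWantlist(ctx) // the full wantlist, sent with full = true
    case <-ctx.Done():
      return
    }
  }
}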

Note that Message.Wantlist.Entry has an int priority. These are normalized, meaning a priority is really p_i / sum(p_j over all entries). We need this because a single Entry sent as an update could not otherwise convey where it falls in the overall ordering, and sort order alone cannot express the magnitude of priority differences. Hence this change.
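A sketch of the normalization (normalize is a hypothetical helper, not a function in the codebase):

func normalize(priorities map[string]int) map[string]float64 {
  out := make(map[string]float64, len(priorities))
  sum := 0
  for _, p := range priorities {
    sum += p
  }
  if sum == 0 {
    return out // empty wantlist; nothing to normalize
  }
  for key, p := range priorities {
    out[key] = float64(p) / float64(sum) // w_i = p_i / sum(p_j)
  }
  return out
}

// e.g. {a:1, b:1, c:1, d:1} -> each 0.25; {c:100, d:1, e:1, f:1} -> c ~ 0.97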

As Peer messages come in:

  • send peer wantlist updates to RoundWorker (over channel)
  • call HasBlock on blocks

As Client requests come in:

  • send local wantlist updates to ClientWorker (over channel)
  • return a blocks promise (subscribe, send out requests, etc.)

(WIP)

@whyrusleeping (Member) commented

Changes I'm going to make for the alpha:

  • Create a roundWorker loop/thread
  • Change ReceiveMessage to no longer process the wantlist, and instead send it to the roundWorker
  • When a new wantlist comes in for a peer, discard the old wantlist
  • Every RoundPeriod, send a subset of N blocks wanted by peers, where N is determined by available bandwidth / blocksize (see the sketch after this list)
  • more todos as I work this out more
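A sketch of how that N might be computed (blocksPerRound and its parameters are hypothetical):

// blocksPerRound: how many blocks fit in one round's bandwidth budget.
// bandwidth is bytes/sec (set by the Client for now); avgBlockSize is in bytes.
func blocksPerRound(bandwidth int, roundPeriod time.Duration, avgBlockSize int) int {
  budget := float64(bandwidth) * roundPeriod.Seconds()
  return int(budget) / avgBlockSize
}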

@jbenet (Member) commented Dec 8, 2014

@whyrusleeping it's probably important to make the Message change above (into entries); that will seriously cut down bitswap bandwidth use. Wantlists are big.

@btc (Contributor) commented Dec 8, 2014

How will available bandwidth be determined?

@whyrusleeping (Member) commented

@maybebtc good question...

@whyrusleeping (Member) commented

@jbenet could you elaborate a little more on the Message change? I'm not sure I completely understand where you're going with that.

@jbenet (Member) commented Dec 8, 2014

How will available bandwidth be determined?

For now it will be set by the Client. (ipfs can measure bandwidth capacity over time; for now we can punt on this (not give it to the strategy), but leave room for it.)

could you elaborate a little more on the Message change?

Sure thing, how about these examples? (JSON is easier):

  1. Full wantlist, no blocks, all of equal priority. Priorities are normalized, so each is 1/sum = 1/4 = 0.25:
# from p1 to p2
{
  "wantlist": {
    "full": true,
    "entries": [
      { "block": "<hash-a>", "priority": 1},
      { "block": "<hash-b>", "priority": 1},
      { "block": "<hash-c>", "priority": 1},
      { "block": "<hash-d>", "priority": 1}
    ]
  }
}
  2. Say p2 sends <hash-a> and <hash-b> to p1 (and its own wantlist):
# from p2 to p1
{
  "blocks": [
    "<data which hashes to <hash-a>>",
    "<data which hashes to <hash-b>>",
  ],
  "wantlist": {
    "full": true,
    "entries": [
      { "block": "<hash-c>", "priority": 100}, # really wants c
      { "block": "<hash-d>", "priority": 1}
      { "block": "<hash-e>", "priority": 1}
      { "block": "<hash-f>", "priority": 1}
    ]
  }
}
  3. On getting the blocks, p1 sends out the cancels to all its peers (this doesn't have to be an independent message; it could ride on another message going to each peer, but it's fine if it is, because we use buffered streams):
# from p1 to p2
{
  "blocks": [
    "<data which hashes to <hash-f>>",
  ],
  "wantlist": {
    "full": false, # note partial. could leave it out (defaults to false).
    "entries": [
      { "block": "<hash-a>", "cancel": true}, # - a
      { "block": "<hash-b>", "cancel": true}, # - b
      { "block": "<hash-g>", "priority": 2}   # + g
    ]
  }
}
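To make the update semantics concrete, here is a sketch of how a receiver might fold such a message into its stored copy of a peer's wantlist (the types and names are illustrative, not the real implementation):

// applyWantlist merges an incoming Wantlist message into the stored
// per-peer wantlist: full replaces everything, cancel revokes an entry.
type wlEntry struct {
  Block    string
  Priority int
  Cancel   bool
}

func applyWantlist(current map[string]int, entries []wlEntry, full bool) map[string]int {
  if full {
    current = make(map[string]int) // a full wantlist replaces the old one
  }
  for _, e := range entries {
    if e.Cancel {
      delete(current, e.Block) // revoke this entry
    } else {
      current[e.Block] = e.Priority
    }
  }
  return current
}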

@whyrusleeping (Member) commented

Awesome, makes more sense now, thanks!

@whyrusleeping (Member) commented

@jbenet with priorities, the list ordering no longer matters, correct?

@jbenet (Member) commented Dec 8, 2014

@whyrusleeping correct

@btc (Contributor) commented Dec 9, 2014

For now will be set by Client. (ipfs can measure bandwidth capacity over time. for now we can punt on this (not give it to the strategy), but leave room for it).

Would we set it to a value that is effectively infinite?

@jbenet (Member) commented Dec 9, 2014

Would we set to a value that is effectively inf?

or just leave it out of the implementation for now, but comment where it would go?

@whyrusleeping (Member) commented

@jbenet how are you envisioning the Cancel being used? For example, if I receive a block I've been wanting, I should want to cancel that item from my wantlist; what should be done there?

@jbenet (Member) commented Dec 9, 2014

@whyrusleeping basically, on receiving the block (or the client somehow canceling things; I don't think we allow that presently, and cancelling the context doesn't remove the block from the wantlist, I don't think?)... anyway, on these events we should send out updates to all our peers, "canceling" that block from our wantlist. It's effectively an Ack to the sender, but more importantly, it's a very cheap wantlist update to everyone else.
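A sketch of that flow (tradingPeers, send, and the Go message types are illustrative stand-ins):

// onBlockReceived: when a wanted block arrives, drop it from our wantlist
// and broadcast a one-entry cancel update to everyone we're trading with.
func (bs *bitswap) onBlockReceived(ctx context.Context, key string) {
  bs.wantlist.remove(key)
  update := Message{
    Wantlist: Wantlist{
      Entries: []Entry{{Block: key, Cancel: true}}, // full defaults to false
    },
  }
  for _, p := range bs.tradingPeers() { // swarm-connected peers, for now
    bs.send(ctx, p, update)
  }
}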

@whyrusleeping (Member) commented

Okay, so how do we decide on the set of peers to send that update to? We don't really keep track of who has our wantlist.

@jbenet (Member) commented Dec 9, 2014

@whyrusleeping? bitswap should be keeping a set of peers we're currently "trading with"; they are the people we rebroadcast wantlists to. (This set can be the swarm-connected peers for now, but doesn't need to be later on.)

@whyrusleeping (Member) commented

Ah, the strategy does have a list of peers... you're right. How do we want to go about 'ending' a trading session?

@jbenet (Member) commented Dec 9, 2014

This is really tricky.

A session should be open because either

  1. we want something from them, and they may give it to us.
  2. they want something from us, and we may give it to them.
    (note the complications :) )

these have different end conditions:

  1. they want something from us
    • they get it (from us or someone else) and close the connection (we don't do anything)
    • we won't give it to them (hard to ascertain, because they may be sending us things)
  2. we want something from them
    • they gave it to us, we're done, we close (easy)
    • it seems like they won't be giving it to us... (hard). This one basically means we should have a notion of likelihood of success (based on maybe going to fetch blocks they want and sending those blocks to get our debt ratio up), but this gets tricky.

Coming up with a correct "we're done here, close" is non-trivial.

Perhaps a muuuuuuch simpler thing to do that may work well in practice is just this:

type session struct {
  peer     peer.Peer
  deadline time.Time
}

func (s *session) shouldRemainOpen() bool {
  return time.Now().Before(s.deadline)
}

func (s *session) bumpDeadline(t time.Duration) {
  d := time.Now().Add(t)
  if d.After(s.deadline) { // only ever extend the deadline, never shorten it
    s.deadline = d
  }
}

// when (a) we open connection, or (b) get or (c) send a block, bump deadline.
// (the first one (a), could be longer)
s.bumpDeadline(timeout)

In any case, I think isolating all this and wrapping it with a func (s *session) shouldRemainOpen() makes it easier to alter later.

@whyrusleeping (Member) commented

I think I have a slightly better idea: we should extract the routing table from the DHT and let bitswap have more direct access to it. When a peer is bumped from the routing table, close our session with them.

@jbenet (Member) commented Dec 9, 2014

@whyrusleeping that won't work; not every peer is in the routing table.

@whyrusleeping (Member) commented

But they would be if bitswap had the routing table as well, and we called table.Update(...) for every interaction within bitswap.

@btc (Contributor) commented Dec 9, 2014

We probably want a more generic mechanism for getting this information across the boundary.

The routing table isn't an actual source of truth. The RT gets its information from a different layer. #418

@jbenet (Member) commented Dec 9, 2014

But they would be if bitswap had the routing table as well, and we call table.Update(...) for every interaction within bitswap

Incorrect. The routing table has a limited set of slots, corresponding to roughly k·log2(n) where n is the size of the DHT, and it prefers long-running nodes useful in DHT queries (once the buckets are full, new nodes are simply dropped). Bitswap should be able to connect to people no matter what the routing table thinks of them. This is a separate system. Assumptions like these drive over schoolkids with their cars.

call table.Update(...)

We probably want a more generic mechanism for getting this information across the boundary.

Yes. Routing information is its own thing, and should make its own conclusions separately. This implies a tighter bond between Routing and Net, not between bitswap and routing. (bitswap should have no clue at all about table.Update.)

@btc (Contributor) commented Dec 13, 2014

Ran some benchmarks. The system's throughput is hard-limited by the choices of roundPeriod and bandwidthPerRound, and for every fixed choice of these two variables there exists a scenario that doesn't quite fit. Allowing these values to vary/adapt at runtime buys optimality only at a high cost in complexity.

There seems to be an impedance mismatch between the streaming nature of the bitswap system and the batch models we've been attempting to reify.

In machine learning, there's the notion of online learning where, rather than performing large batch updates, the system can incrementally update as new data comes in. And in data processing, there are systems like Apache Storm that are designed to process unbounded streams of data. For many real-time scenarios, Storm is a more appropriate fit than Hadoop/MapReduce.

I suspect in the case of bitswap, a streaming model may be a better match for the referent system.

for { // worker sends messages as long as there exist messages in Outbox
  select {
  case msg := <-bs.decider.NextMessage: // decider has a PQ/heap of messages to go out
    sender.SendMessage(msg) // sender/network rate limits this operation
  case <-ctx.Done():
    return
  }
}
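For what it's worth, a sketch of where the rate limiting and backpressure could come from (rateLimitedSender is a made-up name, not the actual net package): SendMessage blocks on a token bucket, which in turn blocks the worker loop above.

// rateLimitedSender: a token bucket refilled at a fixed rate. SendMessage
// blocks until a token is available, which backpressures the caller.
type rateLimitedSender struct {
  tokens chan struct{}
}

func newRateLimitedSender(msgsPerSec int) *rateLimitedSender {
  s := &rateLimitedSender{tokens: make(chan struct{}, msgsPerSec)}
  go func() { // refill goroutine; a real version would take a ctx to stop it
    ticker := time.NewTicker(time.Second / time.Duration(msgsPerSec))
    defer ticker.Stop()
    for range ticker.C {
      select {
      case s.tokens <- struct{}{}:
      default: // bucket is full; drop the token
      }
    }
  }()
  return s
}

func (s *rateLimitedSender) SendMessage(msg Message) {
  <-s.tokens // blocks here; this is the backpressure
  // ... actually write msg to the wire ...
}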

@whyrusleeping (Member) commented

Yeah, I think a system like this would work nicely. Do we have the required backpressure to prevent this from clogging the system?

@whyrusleeping (Member) commented

Taking @maybebtc's suggestions into account, I have started brainstorming a better way to decide what bitswap should do at any given moment. I'm planning on building a priority queue of sorts, where higher-priority blocks (based on some heuristic) are at the front of the list. This queue will only contain blocks we have available locally, and will be added to whenever we receive a wantlist or new blocks. I haven't figured out the sorting heuristic yet: basing it on a peer's debtRatio would require constant resorting, and basing it on the wanted blocks' priorities would encourage people to game the system (unless priorities were normalized). But my idea (once a heuristic is determined) is that, to select the next item, a random number is chosen from an exponentially decaying distribution, such that blocks near the front of the list are more likely to be selected.
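For example, one way to realize that selection over a priority-sorted queue (pickIndex is a hypothetical helper; lambda tunes how strongly the front is favored):

// pickIndex draws an index into a priority-sorted slice of length n (n > 0)
// with exponentially decaying probability, so the front of the queue is
// favored but the tail still gets served occasionally.
func pickIndex(n int, lambda float64) int {
  i := int(rand.ExpFloat64() / lambda) // exponential draw with mean 1/lambda
  if i >= n {
    i = n - 1 // clamp to the back of the queue
  }
  return i
}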

@jbenet, thoughts?

@btc (Contributor) commented Dec 15, 2014

At this stage in the release cycle, I'd pursue the absolute simplest thing that could possibly work.

I wouldn't worry about the debt ratio at this stage. Assume a nice strategy.

From where I sit, it would appear that bitswap MVP priorities are simplicity and throughput.

@whyrusleeping (Member) commented

Okay, so the absolute simplest would just be a heuristic of H(x) = 1. That makes things pretty easy.

@btc btc assigned btc and unassigned whyrusleeping Dec 17, 2014
@btc btc assigned whyrusleeping and unassigned btc Dec 18, 2014
@jbenet jbenet closed this as completed Jan 5, 2015
@jbenet jbenet removed the status/in-progress In progress label Jan 5, 2015