[ipfs/go-bitswap] Proposal: Streaming GetBlocks #121

Open
Stebalien opened this issue Dec 9, 2019 · 1 comment

@Stebalien
Member

At the moment, it's a bit tricky to continuously prefetch blocks as one needs to launch a goroutine per batch.

Proposal: Change the signature of GetBlocks to take an input channel and return an output channel:

// Fetcher is an object that can be used to retrieve blocks (defined in go-ipfs-exchange-interface)
type Fetcher interface {
	// GetBlock returns the block associated with a given key.
	GetBlock(context.Context, cid.Cid) (blocks.Block, error)

	// GetBlocks returns a stream of blocks, given a stream of CIDs. It will
	// return blocks in any order.
	//
	// To wait for all remaining blocks, close the CID channel and wait for
	// the blocks channel to be closed. A closed channel does not mean that
	// _all_ blocks were retrieved, it just means that the fetcher is done
	// retrieving blocks.
	GetBlocks(context.Context, <-chan cid.Cid) (<-chan blocks.Block, error)
}

Additional elements:

  • The go-blockservice.BlockGetter interface should be replaced with the Fetcher interface.
  • The output channel should have the same buffer as the input channel (I think?).
  • The error return type is for indicating that the request couldn't even be started (e.g., closed service). This interface doesn't really have a way to report runtime errors (and there's almost always nothing we can do about them anyway).

This will require changes to go-ipfs-exchange-interface, go-blockservice, and go-bitswap.
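
To make the proposed shape concrete, here is a minimal sketch of a streaming GetBlocks and one way a caller might drive it. It is not the go-bitswap implementation: Cid and Block are simplified stand-ins for cid.Cid and blocks.Block, and memFetcher and fetchAll are hypothetical names. The fetcher reads from an in-memory map so the example runs on its own.

```go
package main

import (
	"context"
	"fmt"
	"sort"
)

// Cid and Block are simplified stand-ins for cid.Cid and blocks.Block.
type Cid string

type Block struct {
	Key  Cid
	Data []byte
}

// memFetcher is a hypothetical fetcher backed by an in-memory map.
// A nil store stands in for a "closed service".
type memFetcher struct {
	store map[Cid][]byte
}

// GetBlocks mirrors the proposed signature: a stream of CIDs in, a stream of
// blocks out. The error only reports that the request could not be started.
func (f *memFetcher) GetBlocks(ctx context.Context, keys <-chan Cid) (<-chan Block, error) {
	if f.store == nil {
		return nil, fmt.Errorf("fetcher is closed")
	}
	out := make(chan Block, cap(keys)) // same buffer size as the input channel
	go func() {
		defer close(out) // closed means "done", not "everything was found"
		for {
			select {
			case <-ctx.Done():
				return
			case k, ok := <-keys:
				if !ok {
					return // caller closed the CID channel
				}
				data, found := f.store[k]
				if !found {
					continue // no way to report this with the plain signature
				}
				select {
				case out <- Block{Key: k, Data: data}:
				case <-ctx.Done():
					return
				}
			}
		}
	}()
	return out, nil
}

// fetchAll feeds CIDs from a single goroutine, closes the input channel, and
// drains the output until the fetcher closes it.
func fetchAll(f *memFetcher, want []Cid) ([]Cid, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	keys := make(chan Cid, 4)
	out, err := f.GetBlocks(ctx, keys)
	if err != nil {
		return nil, err
	}
	go func() {
		defer close(keys)
		for _, k := range want {
			select {
			case keys <- k:
			case <-ctx.Done():
				return
			}
		}
	}()
	var got []Cid
	for b := range out {
		got = append(got, b.Key)
	}
	sort.Slice(got, func(i, j int) bool { return got[i] < got[j] })
	return got, nil
}

func main() {
	f := &memFetcher{store: map[Cid][]byte{"a": []byte("1"), "b": []byte("2")}}
	got, err := fetchAll(f, []Cid{"a", "missing", "b"})
	fmt.Println(got, err) // prints "[a b] <nil>"
}
```

Note that the caller cannot tell why "missing" never arrived. That is the limitation the third bullet describes.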

@Jorropo Jorropo changed the title Proposal: Streaming GetBlocks [ipfs/go-bitswap] Proposal: Streaming GetBlocks Jan 27, 2023
@aschmahmann aschmahmann reopened this Mar 27, 2023
@aschmahmann
Contributor

Two other variations on the streaming proposal above:

Per-block errors

Return <-chan BlockOption instead of <-chan blocks.Block so that we can handle per-block errors.
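
As a rough illustration of the per-block error idea, the sketch below assumes a BlockOption that carries either a block or an error for one requested CID. The issue does not define BlockOption, so the field names here are assumptions, as are the stand-in Cid and Block types and the helpers lookupAll, partition, and demo.

```go
package main

import (
	"errors"
	"fmt"
)

// Cid and Block are simplified stand-ins for cid.Cid and blocks.Block.
type Cid string

type Block struct {
	Key  Cid
	Data []byte
}

// BlockOption is a hypothetical result type: exactly one of Block or Err is
// set, and Key records which request the result belongs to.
type BlockOption struct {
	Key   Cid
	Block *Block
	Err   error
}

var errNotFound = errors.New("block not found")

// lookupAll answers every requested CID with either a block or an error.
func lookupAll(store map[Cid][]byte, keys <-chan Cid) <-chan BlockOption {
	out := make(chan BlockOption, cap(keys))
	go func() {
		defer close(out)
		for k := range keys {
			if data, ok := store[k]; ok {
				out <- BlockOption{Key: k, Block: &Block{Key: k, Data: data}}
			} else {
				out <- BlockOption{Key: k, Err: errNotFound}
			}
		}
	}()
	return out
}

// partition drains the result stream and separates successes from failures.
func partition(results <-chan BlockOption) (found, failed []Cid) {
	for r := range results {
		if r.Err != nil {
			failed = append(failed, r.Key)
			continue
		}
		found = append(found, r.Key)
	}
	return found, failed
}

// demo requests three CIDs, one of which is not in the store.
func demo() (found, failed []Cid) {
	store := map[Cid][]byte{"a": []byte("1"), "b": []byte("2")}
	keys := make(chan Cid, 3)
	for _, k := range []Cid{"a", "missing", "b"} {
		keys <- k
	}
	close(keys)
	return partition(lookupAll(store, keys))
}

func main() {
	found, failed := demo()
	fmt.Println(found, failed) // prints "[a b] [missing]"
}
```

With this shape the caller can distinguish a block that failed from one that simply has not arrived yet, at the cost of one extra struct per result.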

Streaming sessions-like requests

A more complex, but more featureful, version would let us add and remove requests on the channel, so that we don't have to spin up one goroutine per block-get (or batch of gets). It could be something like:

type AddRemoveCid interface {
	IsAdd() bool
	Key() cid.Cid
} // or a similar struct

GetBlocksCh(ctx context.Context, keys <-chan AddRemoveCid) (<-chan blocks.Block, error) // Or BlockOption if that's better

I wrote some code for this in this experiment:
https://github.com/ipfs/go-bitswap/pull/593/files#diff-b751089c21e2437423d20fbee6c9a01a61c2422d5f7630ab8272d7fdeb63b654R17
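
Independently of the linked experiment, here is a small sketch of how add/remove requests could be handled by a single goroutine. It uses the "similar struct" form rather than the interface. WantUpdate, runSession, and the arrivals channel (which stands in for blocks coming off the network) are all hypothetical names, and Cid and Block are simplified stand-ins for cid.Cid and blocks.Block.

```go
package main

import "fmt"

// Cid and Block are simplified stand-ins for cid.Cid and blocks.Block.
type Cid string

type Block struct {
	Key  Cid
	Data []byte
}

// WantUpdate is a struct version of the AddRemoveCid idea.
type WantUpdate struct {
	Add bool
	Key Cid
}

// runSession keeps a want set in one goroutine. Updates add or remove CIDs;
// arriving blocks are forwarded only if they are still wanted.
func runSession(updates <-chan WantUpdate, arrivals <-chan Block) <-chan Block {
	out := make(chan Block, 16) // buffer size chosen arbitrarily for the sketch
	go func() {
		defer close(out)
		wants := make(map[Cid]struct{})
		for {
			select {
			case u, ok := <-updates:
				if !ok {
					updates = nil // a nil channel is never selected again
					continue
				}
				if u.Add {
					wants[u.Key] = struct{}{}
				} else {
					delete(wants, u.Key)
				}
			case b, ok := <-arrivals:
				if !ok {
					return
				}
				if _, wanted := wants[b.Key]; wanted {
					delete(wants, b.Key)
					out <- b
				}
			}
		}
	}()
	return out
}

// demo adds "a" and "b", removes "b" again, then simulates three arrivals.
func demo() []Cid {
	updates := make(chan WantUpdate) // unbuffered: each send is a handoff
	arrivals := make(chan Block)
	out := runSession(updates, arrivals)

	updates <- WantUpdate{Add: true, Key: "a"}
	updates <- WantUpdate{Add: true, Key: "b"}
	updates <- WantUpdate{Add: false, Key: "b"}
	close(updates)

	for _, k := range []Cid{"a", "b", "c"} {
		arrivals <- Block{Key: k}
	}
	close(arrivals)

	var got []Cid
	for b := range out {
		got = append(got, b.Key)
	}
	return got
}

func main() {
	fmt.Println(demo()) // prints "[a]"
}
```

The goroutine count stays at one no matter how many CIDs are added or removed, which is the property described above.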

It was useful for limiting the number of goroutines in use while still letting me control what to fetch as new data came back. The use case was incrementally verifiable downloading of large SHA2-256 blocks of data via Bitswap + manifests. There the graph is effectively linear (i.e. miserably slow to download with Bitswap), but the manifest, despite being untrusted, could allow for much faster downloads: trust in it grows as more data is downloaded and validated and the manifest is proven correct.

While I didn't use it in the associated demo, removal would have been convenient for dropping CIDs that turned out not to be what we needed (e.g. because of problems with the manifest). It also seems convenient for composing multiple data retrieval systems (e.g. Bitswap + HTTP Trustless Gateways): if a block is fetched by one system, the request can be cancelled in the other, while still limiting the number of goroutines in use.

This issue is being transferred. Timeline may not be complete until it finishes.