Event bus all-the-things #79

Closed

pipermerriam opened this issue Nov 20, 2018 · 7 comments

@pipermerriam
Member

What is wrong?

Our use of the proxy object pattern has had continued fallout and unintended consequences. Specific cases I'm thinking of are:

  • #1299 (Quickly discover sparse peers with particular subprotocol) has been difficult due to issues with pickle-ability.
  • Weird APIs where we are kind-of interacting with a chain object, but not really since it's actually over a proxy object.
  • Weird APIs where we wrap the non-async Chain methods to make them async.
  • Issues with the JSON-RPC server and it needing direct access to the Chain object which produces awkward coupling/dependencies

How can it be fixed?

I think we need to go ahead and get everything over onto an explicit event bus pattern. Please discuss.

@pipermerriam
Member Author

Here's my top-of-the-head plan.

  1. Database process
  • restrict to just the base db
  • server component using get/set/exists/delete events.
  • client abstraction hides the events internally, re-exposing our standard base db API (see the sketch after this list).
  2. Chain process
  • exposes Chain and ChainDB APIs
  • server component using events for the Chain APIs that we use.
  • client component can be high level, exposing these APIs using the same Chain interface but using the event system under the hood.
  3. Networking process
  • exposes PeerPool and Peer APIs over the event bus
  • probably want to do the decoupling of Protocol and Peer as a prerequisite to this.
  • experiment with a high-level API for interacting with peers from a client perspective that wraps the event system.
  4. Ethereum Client Process
  • all of the synchronization code.
  • should largely just be a client to the other processes
  • needs to expose a few server-type APIs, primarily for consumption by the JSON-RPC server.
  5. JSON-RPC server process
  • should largely just be a client to the other processes
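Here's a minimal sketch of what the database piece (step 1) could look like. The event classes, the DBServer / EventBusDB names, and the subscribe / request endpoint methods are illustrative assumptions, not the actual lahja or trinity APIs:

```python
from dataclasses import dataclass


# Hypothetical request/response events for the database process.
@dataclass
class GetRequest:
    key: bytes


@dataclass
class GetResponse:
    value: bytes


@dataclass
class SetRequest:
    key: bytes
    value: bytes


class DBServer:
    """Runs inside the database process and answers db events against the
    one real base db instance."""

    def __init__(self, endpoint, db):
        self.db = db
        # `subscribe` stands in for whatever handler registration the event
        # bus ends up offering; a handler's return value is assumed to be
        # routed back to the requesting endpoint.
        endpoint.subscribe(GetRequest, self.handle_get)
        endpoint.subscribe(SetRequest, self.handle_set)

    async def handle_get(self, event: GetRequest) -> GetResponse:
        return GetResponse(self.db[event.key])

    async def handle_set(self, event: SetRequest) -> None:
        self.db[event.key] = event.value


class EventBusDB:
    """Client-side abstraction that hides the events and re-exposes the
    standard base db API."""

    def __init__(self, endpoint):
        self.endpoint = endpoint

    async def get(self, key: bytes) -> bytes:
        # `request` is assumed to do a round trip and return the response event.
        response = await self.endpoint.request(GetRequest(key))
        return response.value

    async def set(self, key: bytes, value: bytes) -> None:
        await self.endpoint.request(SetRequest(key, value))
```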

This pattern gives us things like:

  • stronger guarantees about not blocking the networking process (and thus not timing out our peers).
  • a sustainable architecture for future plans to run any/all of beacon chain, mainnet, shards, swarm, and pss concurrently.
  • explicitness about what crosses process boundaries.

@cburgdorf
Contributor

Database process
server component using get/set/exists/delete events.

I read that as you want one db process and route every db access through the event bus. If that is what you mean then I'm currently 👎 on that because I believe this will have a negative impact on performance. I believe db access needs to be as fast as it can be and funneling that through the event bus will work against that goal.

I think the other route, using an alternative db (e.g. RocksDB) that allows multiple processes to read simultaneously is more promising.
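For reference, the multi-reader idea boils down to something like the following, assuming the python-rocksdb bindings and their read_only open flag (treat the exact flag, paths, and keys as assumptions to verify against whichever bindings we actually use):

```python
import rocksdb  # assumes the python-rocksdb bindings are installed

# Writer process: the single process that owns write access.
writer_db = rocksdb.DB("chain.db", rocksdb.Options(create_if_missing=True))
writer_db.put(b"canonical_head", b"\x00" * 32)

# Reader process (e.g. the JSON-RPC server): opens the same database
# read-only and can serve queries without crossing the event bus at all.
# Note: a plain read-only instance sees the data as of open time; picking up
# new writes requires a reopen (or RocksDB's secondary-instance mode, which
# these bindings may not expose).
reader_db = rocksdb.DB("chain.db", rocksdb.Options(), read_only=True)
assert reader_db.get(b"canonical_head") is not None
```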

Ethereum Client Process

I see this process mostly as the main process that would host the main EventBus. I believe that most if not all of the remaining logic can be implemented as plugins.

all of the synchronization code.

I think synchronization should become a plugin that runs in an isolated process and exposes several sync-related events on the event bus (e.g. NewChainTipEvent etc.). In fact, there should probably be different plugins for light sync, full sync, etc. I think we should have building blocks in the main code base, but then plugins that run in their own isolated processes can be built out of them. This would also allow better experimentation with different syncing strategies (e.g. if people want to plug in a warp sync, they can) (related: https://github.com/ethereum/py-evm/issues/1020#issuecomment-405853665).
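For illustration, such a sync event and plugin skeleton might look roughly like this; the NewChainTipEvent fields, the FullSyncPlugin class, and the broadcast coroutine are assumptions rather than trinity's or lahja's actual APIs:

```python
class NewChainTipEvent:
    """Broadcast by a sync plugin whenever it imports a new canonical head."""

    def __init__(self, block_hash: bytes, block_number: int) -> None:
        self.block_hash = block_hash
        self.block_number = block_number


class FullSyncPlugin:
    """Illustrative plugin skeleton running in its own isolated process."""

    def __init__(self, endpoint):
        self.endpoint = endpoint

    async def on_new_canonical_head(self, block) -> None:
        # Other processes (JSON-RPC, monitoring, an alternative syncer, ...)
        # can subscribe to this event without knowing anything about the sync
        # implementation that produced it.
        await self.endpoint.broadcast(
            NewChainTipEvent(block.hash, block.number)
        )
```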

JSON-RPC server process

That's already running in its own process as an isolated plugin. Did you just mean to list it as a process?


Also, I want to raise awareness that we do not have performance metrics for the EventBus under high load yet. I believe that the current implementation does not handle high-frequency traffic well enough just yet. And for some things, like routing all db access through it, I'm highly skeptical that it could ever be a suitable option no matter how much we optimize it.
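To make "high load" measurable, a micro-benchmark along these lines gives a baseline for raw cross-process event throughput; the multiprocessing queue is just a stand-in for the event bus transport, not lahja itself:

```python
import multiprocessing
import time


def consumer(queue, total):
    # Drain `total` events and report throughput.
    start = time.perf_counter()
    for _ in range(total):
        queue.get()
    elapsed = time.perf_counter() - start
    print(f"{total} events in {elapsed:.2f}s ({total / elapsed:.0f} events/s)")


if __name__ == "__main__":
    total = 100_000
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=consumer, args=(queue, total))
    proc.start()
    # Roughly db-sized payloads: a 32-byte key and a small value.
    payload = (b"\x00" * 32, b"\x00" * 128)
    for _ in range(total):
        queue.put(payload)
    proc.join()
```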

@pipermerriam
Member Author

Regarding the database issue,

Agreed that we can and should make whatever compromises are necessary to keep performance numbers high, but it's worth pointing out that we already route all database access over a multiprocessing boundary. Whether the existing mechanism and the event bus have measurably different performance footprints is something we would want to check. My understanding is that in the current lahja model everything flows through a central bus, meaning two hops. Maybe we need to establish an abstraction for direct connections, so that database interactions can go over a direct connection with only a single hop.

@cburgdorf
Contributor

Maybe we need to establish an abstraction for direct connections to allow the database interactions to be direct connections with only a single hop.

In fact, having the event bus work without that central coordination layer was on my radar as well:
ethereum/lahja#1

That said, if there are multiple consumers for an event (which the process producing the event, in general, should not even have knowledge about), then from the perspective of the broadcasting process it is more efficient to make one hop to the central coordinator and let the coordinator fan the event out to the various consuming endpoints.

But without getting lost in the specific details, there's a lot we can optimize for efficiency. I just wanted to raise awareness that it may not be that efficient yet, or at least that we don't know.

I honestly think we should iterate on the RocksDB work you've been doing. Did that already make use of the fact that multiple processes could read simultaneously? Or did that just change the db without exploiting that feature yet?

@pipermerriam
Member Author

Just had a thought.

If the interactions between synchronization and peers are changed to be event-bus based, then it gets a lot easier to mock out the peer side of things. This extends to a bunch of other things as well. Maybe the higher-level point here is that the event bus can act as a way to decouple our various sections of functionality from their implementations.

I think this has some positive and negative implications but I wanted to drop this here to make note of it.
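As a concrete example of that upside, a test could stand in for the entire networking process by answering peer events directly on the bus; the event names, the FakePeerPool test double, and the subscribe interface here are hypothetical:

```python
# Hypothetical events the syncer would normally send to the networking process.
class GetConnectedPeersRequest:
    pass


class GetConnectedPeersResponse:
    def __init__(self, peers):
        self.peers = peers


class FakePeerPool:
    """Test double that replaces the networking process on the event bus."""

    def __init__(self, endpoint, peers):
        self.peers = peers
        endpoint.subscribe(GetConnectedPeersRequest, self.handle_request)

    async def handle_request(self, event):
        return GetConnectedPeersResponse(self.peers)


# In a test, the syncer under test talks to the same endpoint and never needs
# to know that responses come from FakePeerPool rather than a real PeerPool.
```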

@cburgdorf
Contributor

@pipermerriam yes, that is absolutely true and one of the main upsides of the event bus pattern in general. You get very loose coupling: events can become your main interface, and it becomes easy to swap out functionality for different implementations.

The downside to this is complexity and "fragility" (not sure if that's the best word to describe it), which can be mitigated with more tests.

I'm a fan of the pattern and have used it in many code bases. Coincidentally, I learned today that Embark (a Truffle competitor by Status) is using the same pattern for the same reasons (flexibility, plugin all the things).

@cburgdorf
Contributor

Just cross-linking that I've started to put a benchmark together, and so far it has revealed interesting things such as dropped events.
