Today, I did a simulation to understand the impact of the proportion of "outbound only" nodes in a network.
This is critical to Waku, since it will be a network where the majority of nodes run "in the wild" (as opposed to in a data center), with a good chunk of them relying on hole punching, which can sometimes fail.
Here are some results:
I'm a bit surprised to already see an effect above 70%; something to investigate, to figure out whether it's a bug or an emergent behavior.
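As a rough sanity check on the "bug or emergent behavior" question, here is a back-of-the-envelope sketch (an illustrative assumption on my part, not a measurement from the run): if a fraction p of nodes is outbound-only, every one of their mesh links has to terminate on a publicly reachable node, so each reachable node ends up carrying roughly D + D·p/(1−p) mesh links. With gossipsub's default D = 8, that number climbs steeply once p passes ~70%, which would make the observed knee look like an emergent effect rather than a bug:

```go
// Back-of-the-envelope estimate (an assumption for illustration, not data from
// the run): with a fraction p of outbound-only nodes and mesh degree D, all of
// their mesh links must land on reachable nodes, so each reachable node carries
// on average D + D*p/(1-p) mesh links. This grows quickly once p passes ~70%.
package main

import "fmt"

func main() {
	const d = 8.0 // gossipsub default mesh degree D

	for _, p := range []float64{0.5, 0.6, 0.7, 0.8, 0.9} {
		extra := d * p / (1 - p) // inbound links pushed onto reachable nodes
		total := d + extra       // plus their own mesh links to other reachable nodes
		fmt.Printf("outbound-only fraction %.0f%%: ~%.1f mesh links per reachable node\n", p*100, total)
	}
}
```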
Use testground to:
I'll use this issue to track my progress on the matter.
Here is my working branch: vacp2p/libp2p-test-plans@46e2269...status-im:libp2p-test-plans:pubsub2
To be able to use Wireshark, I've started to implement #821.
I quickly noticed that Nagle's algorithm was causing issues, and opened #822, which is waiting on the corresponding chronos change to be merged (a quick illustration of the Nagle issue follows below).
I also found a bug in yamux: #823.
And, obviously, a few bugs in gossipsub: #827.
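For context on the Nagle issue (this is not the #822 change itself, which targets chronos): Nagle's algorithm delays small writes so they can be batched, which adds latency to the many small frames gossipsub exchanges. The generic fix is to set TCP_NODELAY on the socket; here is a minimal Go illustration (Go's net package actually enables it by default, it's only spelled out here to show the knob):

```go
// Minimal illustration of disabling Nagle's algorithm (TCP_NODELAY) on a TCP
// connection. This is not the #822 patch (that one targets chronos), and Go
// already sets NoDelay by default; it is shown explicitly for clarity.
package main

import (
	"log"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "127.0.0.1:4001") // address is just an example
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	if tcpConn, ok := conn.(*net.TCPConn); ok {
		// Flush small writes immediately instead of waiting to coalesce them.
		if err := tcpConn.SetNoDelay(true); err != nil {
			log.Fatal(err)
		}
	}
}
```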
I can simulate ~200 nodes on my laptop, or rent beefy AWS servers for more when required.
This is an example of a run with 100 nodes, 30 publishers, 22 connections per node, and a fixed 100 ms of RTT.
After 20 seconds, a massive blackhole attack starts, in which the 70 remaining nodes silently swallow every message.
With a pretty simple scoring setup (relying only on firstMessageDeliveries & opportunisticGraft), the effect on latency is quite minimal, and we don't get any message losses, which is very good. I'm hoping to generalize this scoring setup and test it in a wide variety of scenarios to make sure it holds.
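For readers unfamiliar with those two parameters, here is a simplified model of what they do, following the gossipsub v1.1 spec rather than any particular libp2p implementation (all names and values below are illustrative placeholders, not the ones used in the run): firstMessageDeliveries rewards peers that are the first to deliver a valid message to us, and opportunistic grafting swaps better-scoring peers into the mesh when the current mesh scores poorly, which is exactly what happens once mesh peers start blackholing.

```go
// Simplified model (not nim-libp2p's API) of the two scoring components above,
// following the gossipsub v1.1 spec: the P2 "first message deliveries" counter
// and opportunistic grafting. All values are illustrative placeholders.
package main

import (
	"fmt"
	"sort"
)

const (
	firstMessageDeliveriesWeight = 1.0
	firstMessageDeliveriesCap    = 50.0
	firstMessageDeliveriesDecay  = 0.9
	opportunisticGraftThreshold  = 1.0
)

type peerScore struct {
	firstMessageDeliveries float64 // P2 counter, capped and decayed over time
}

// onFirstDelivery is called when this peer is the first to deliver a valid message.
func (s *peerScore) onFirstDelivery() {
	s.firstMessageDeliveries++
	if s.firstMessageDeliveries > firstMessageDeliveriesCap {
		s.firstMessageDeliveries = firstMessageDeliveriesCap
	}
}

// decay is applied every decay interval (e.g. each heartbeat) so old behavior fades.
func (s *peerScore) decay() {
	s.firstMessageDeliveries *= firstMessageDeliveriesDecay
}

func (s *peerScore) score() float64 {
	return firstMessageDeliveriesWeight * s.firstMessageDeliveries
}

// shouldOpportunisticallyGraft reports whether the mesh has degraded enough
// (median score below the threshold) that better-scoring peers from outside
// the mesh should be grafted in.
func shouldOpportunisticallyGraft(meshScores []float64) bool {
	sort.Float64s(meshScores)
	return meshScores[len(meshScores)/2] < opportunisticGraftThreshold
}

func main() {
	honest, blackhole := &peerScore{}, &peerScore{}
	for i := 0; i < 10; i++ {
		honest.onFirstDelivery() // the honest peer keeps delivering messages first
		honest.decay()
		blackhole.decay() // the blackholing peer never delivers, so it never scores
	}
	fmt.Printf("honest score: %.2f, blackhole score: %.2f\n", honest.score(), blackhole.score())
	fmt.Println("graft around the blackholed mesh?",
		shouldOpportunisticallyGraft([]float64{blackhole.score(), blackhole.score(), honest.score()}))
}
```

In this toy run the blackholing peer's score stays at zero while the honest peer's keeps growing, so the mesh median drops below the graft threshold and scoring routes around the attack.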