Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: autosharding core algorithm #1854

Merged
merged 14 commits into from
Aug 1, 2023
Merged

feat: autosharding core algorithm #1854

merged 14 commits into from
Aug 1, 2023

Conversation

SionoiS
Copy link
Contributor

@SionoiS SionoiS commented Jul 17, 2023

Description

Impl. rendezvous hashing with parameters for bias and weights.

Rendezvous Hashing

We concat. the content topic, application, version, the bytes of the cluster index and the bytes of the shard index
The first 64 bits of the hash is used as the priority value for the shard.
Shards are ordered by this value.
We can apply a weighting function to change the priorities (off by default).
Last step is the choose a shard to use. We always pick the highest priority one by convention.

Content Topic Format

So that app can manipulate the shard choice, 2 prefixes have been added to content topics. Generation & bias. These are implicit when omitted.

The new format now reads. /gen/bias/app/version/subject/codec
But /app/version/subject/codec is still valid and uses the default values for gen and bias

Generation

gen is for determining the total number of shards in TWN. Each generation would have a specific # of shards.

Default: 0

Bias

bias is used to skew the priority values via the weighting function in a specific way.

One of the requirement was to allow app the affect the sharding, to this end a simple 20/80 split between higher shards and lower shards has been added.The lower 20% of shards will be picked more often when adding the "lower20" bias in the topic and "higher80" would favor the other shards.

Default: unbiased

Changes

  • rendezvous hashing algorithm.
  • weighting function
  • content topic parsing
  • updated tests
  • new tests

Tracking #1846

Also, the algo. MUST BE 100% SOLID as changing it will be breaking change!

There's also various optimizations possible. egg. pre-hash the shards then merge the hashes with SplitMix64 & pre-compute some part of the weighting function.
I did not impl. those to keep the it simple but I feel like it would future proof the whole thing. Maybe it's unnecessary?

@SionoiS
Copy link
Contributor Author

SionoiS commented Jul 18, 2023

It breaks the bridge! No...

edit: easy fix thank god!

@SionoiS SionoiS marked this pull request as ready for review July 19, 2023 13:37
@SionoiS
Copy link
Contributor Author

SionoiS commented Jul 19, 2023

I'm adding other tests for reproducibility and a simulation for stats analysis

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making great progress! Would it make sense to formulate this as a short specification even just as a comment/in the description of this issue, stating the main features of the solution:

  • the proposed content topic format
  • requirements re backwards compatibility of the content topic
  • bias mechanism and assumption, etc.

The one thing I'm uncertain about is whether we should consider the k-anonymity vs throughput bias in shards at all when launching. Perhaps we can include this capability, but support/dogfood the network first with assumption of no bias (i.e. "none" by default). The main feature we want out of autosharding is proper traffic distribution for predictable scaling and the simplest way to do that is to consider the gen 0 shards to be exactly equal. This way we can even launch with only two shards, simplifying things even more, but laying the groundwork for a multi-shard solution in next generations. Wdyt?

./content_topic,
./pubsub_topic

const ClusterIndex* = 49152
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to the core logic, so can be addressed later, but what do people think of us simplifying both terminology and ranges here by renaming the cluster index here to "NetworkId" and using 1 from the reserved range? This could be our "mainnet" with 2 (and perhaps some others?) reserved for a "testnet" later on. We can leave 0 for a future mainnet that may be even more uncompromising than what we see as feasible now. All this would require an RFC update, but may be a good way to indicate that autosharding really should be the default way of running the Waku Network, with specific applications, such as Status, simply "forking" by running on their own network IDs with static sharding.

Copy link
Contributor Author

@SionoiS SionoiS Jul 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I'm in favor of simplifying/reordering the namespace. I just don't like the blockchain terms. TWN is not really a network it's more of a meta network (Waku MetaNet tm) or a network of network. I don't want to confuse ppl, TWN is not logically centralized like a blockchain.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm...my understanding is that TWN actually is a network - it is one instantiation of an opinionated relay network. You could then establish a multi-level forking terminilogy as in "soft fork" where you use the same discovery as TWN, but different shards (i.e. nodes participating in TWN may not relay your messages), or a "hard fork" where you also split the discovery layer (i.e. create a completely separate network somehow). My point of view could be over-simplified though:)

Anyway, +1 for simplifying the cluster/namespace numbering

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that set on specific name here, but to me "network" does not necessarily have to imply logical centralization - it's just a collective noun for the "suggested" cluster and index range for default WN routing. That said, I'm happy if we continue calling it "cluster", though we may want to think about whether we have a "main" cluster and a "test" cluster of shards and to rather opt for simpler cluster indexes (e.g. 1 vs 49152).

Copy link
Contributor Author

@SionoiS SionoiS Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our space, I think when ppl hear "mainnet" they would go 💡 blockchain but maybe that's just me.

Your right that it IS a network but because of the space we live in words have cultural weight to them.
We may want to find different words to differentiate ourselves.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coming late to the party but agree with @jm-clius. network id or something similar is more descriptive than cluster. we had some discussions around this during the offsite: https://notes.status.im/waku-offsite-23-day-2#the-idea

@SionoiS

In our space, I think when ppl hear "mainnet" they would go 💡 blockchain but maybe that's just me.

if we move on with RLN, having this mainnet/testnet (networkId) will make more sense since Waku testnet will be linked to goerli (or ony other testnet) and waku mainnet will be linked to ethereum mainnet.

but i totally acknoledge that waku is a bit different than a blockchain.

waku/v2/waku_core/topics/sharding.nim Outdated Show resolved Hide resolved
waku/v2/waku_core/topics/content_topic.nim Show resolved Hide resolved
waku/v2/waku_core/topics/sharding.nim Outdated Show resolved Hide resolved
@jm-clius
Copy link
Contributor

One more thing, should it be possible for applications to manually select their underlying shards? I.e. some mechanism on top of the bias that allows an application to choose their specific shard and bypass the hashing? This will allow some "shard marketplace" where applications could select the shard they best believe works for them and provide the anonymity vs throughput for their use case.

Note: even if we want such a mechanism, I think this can be introduced in future and our initial solution should be as simple as possible (since sophisticated algorithms break more easily and in this instance we never want to break backwards compatibility).

@SionoiS
Copy link
Contributor Author

SionoiS commented Jul 26, 2023

One more thing, should it be possible for applications to manually select their underlying shards? I.e. some mechanism on top of the bias that allows an application to choose their specific shard and bypass the hashing? This will allow some "shard marketplace" where applications could select the shard they best believe works for them and provide the anonymity vs throughput for their use case.

Note: even if we want such a mechanism, I think this can be introduced in future and our initial solution should be as simple as possible (since sophisticated algorithms break more easily and in this instance we never want to break backwards compatibility).

Is that not what static sharding is for?

We can't prevent usage of shards reserved for autosharding from being manually selected anyway.

@SionoiS
Copy link
Contributor Author

SionoiS commented Jul 26, 2023

Thanks for making great progress! Would it make sense to formulate this as a short specification even just as a comment/in the description of this issue, stating the main features of the solution:

  • the proposed content topic format
  • requirements re backwards compatibility of the content topic
  • bias mechanism and assumption, etc.

A mistake on my part I should have wrote a small spec. somewhere. It will make it easier for you guys to review. I will rectify.

The one thing I'm uncertain about is whether we should consider the k-anonymity vs throughput bias in shards at all when launching. Perhaps we can include this capability, but support/dogfood the network first with assumption of no bias (i.e. "none" by default).

I agree it's unnecessary to include a way to bias the algo for now.

The main feature we want out of autosharding is proper traffic distribution for predictable scaling and the simplest way to do that is to consider the gen 0 shards to be exactly equal. This way we can even launch with only two shards, simplifying things even more, but laying the groundwork for a multi-shard solution in next generations. Wdyt?

I agree that's what we want but autosharding v1 cannot directly give us traffic distribution or predictable scaling only indirectly.

I talked with @kaiserd for 90m about this and there's no clear solution right now. I see this as a temporary fix to satisfy the roadmap.

Copy link
Collaborator

@Ivansete-status Ivansete-status left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR SP! I thought I already reviewed it ;P
Anyway, I added a few comments that I hope you find useful!

waku/v2/waku_core/topics/sharding.nim Outdated Show resolved Hide resolved
tests/v2/waku_core/test_sharding.nim Outdated Show resolved Hide resolved
tests/v2/waku_core/test_sharding.nim Outdated Show resolved Hide resolved
tests/v2/waku_core/test_sharding.nim Outdated Show resolved Hide resolved
tests/v2/waku_core/test_sharding.nim Outdated Show resolved Hide resolved
waku/v2/waku_core/topics/sharding.nim Show resolved Hide resolved
waku/v2/waku_core/topics/sharding.nim Outdated Show resolved Hide resolved
waku/v2/waku_core/topics/sharding.nim Show resolved Hide resolved
waku/v2/waku_core/topics/sharding.nim Show resolved Hide resolved
waku/v2/waku_core/topics/sharding.nim Show resolved Hide resolved
@jm-clius
Copy link
Contributor

We can't prevent usage of shards reserved for autosharding from being manually selected anyway.

True. This would be a question of whether there should be a way built into the protocol to do this if we see enough of a useful application of this feature, but I agree that it's probably just a speculative idea for now with easy workaround of manually publishing to a shard of choice.

I agree it's unnecessary to include a way to bias the algo for now.

We could still have this as a hidden feature which we can test ourselves or document as a beta feature. The idea makes sense to me, but I think by default we just want naive hashing with equal-ish treatment of shards as a start.

I agree that's what we want but autosharding v1 cannot directly give us traffic distribution or predictable scaling only indirectly.

True. But I think even a naive mechanism to distribute traffic on multiple shards is going to provide value.

@SionoiS
Copy link
Contributor Author

SionoiS commented Jul 27, 2023

We could still have this as a hidden feature which we can test ourselves or document as a beta feature. The idea makes sense to me, but I think by default we just want naive hashing with equal-ish treatment of shards as a start.

That's one of the reason I picked rendezvous hashing it's very flexible. I'll leave the code there but default will be as you said.

I agree that's what we want but autosharding v1 cannot directly give us traffic distribution or predictable scaling only indirectly.

True. But I think even a naive mechanism to distribute traffic on multiple shards is going to provide value.

That's where I disagree, I think it will cause problems in the future.

@SionoiS
Copy link
Contributor Author

SionoiS commented Jul 27, 2023

I made generation 0 and unbiased implicit and the default for content topics.

@SionoiS
Copy link
Contributor Author

SionoiS commented Jul 27, 2023

I added a way for apps to name the shard used. I also changed the way the hashing is done to be more solid.

@SionoiS
Copy link
Contributor Author

SionoiS commented Jul 28, 2023

I renamed the biases to what they actually do and removed the third prefix as it's unnecessary.

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as an increment and a good first start to get the hashing logic ready.

@SionoiS SionoiS merged commit bbff1ac into master Aug 1, 2023
@SionoiS SionoiS deleted the hashing branch August 1, 2023 13:05
@SionoiS SionoiS self-assigned this Aug 2, 2023
@SionoiS SionoiS mentioned this pull request Aug 4, 2023
3 tasks
@fryorcraken fryorcraken added E:2023-10k-users E:1.2: Autosharding for autoscaling See https://github.com/waku-org/pm/issues/65 for details and removed E:2023-10k-users labels Sep 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E:1.2: Autosharding for autoscaling See https://github.com/waku-org/pm/issues/65 for details
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants