-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Question: why so large counter? (42bits) #13
Comments
Hi! That's a very good question. This question relates to unguessability and uniqueness properties. As for unguessability, the 42-bit counter obviously sacrifices this property, because 96 bits of a UUID can be derived from the immediately preceding value. I didn't put priority on unguessability because it is hard by design of UUIDv7 to ensure unguessability given the limited bit space available. As for uniqueness, let's compare a 42-bit randomly seeded counter and a 42-bit random number field. In summary, when you generate N IDs for each scenario:
With the simple random number field, the probability of having at least one duplicate is known as the birthday probability and can be calculated using the formula in the article. With the randomly seeded counter, you will pick an initial counter value randomly for each batch of consecutive IDs, and as long as the initial values are sufficiently distant from each other, you will never get a duplicate. Such a probability can be calculated using the near-match variant also in the article. Plug parameters in the formulae and you will find the conclusion in the above summary. You can read a similar discussion in this blog post, too. For the general UUID use cases, I believe the probability of having at least one duplicate is very important property. That's the reason why I selected the 42-bit counter. |
Thanks. I see that a rust implementation also use the same/similar format as you, possibly based on your design: https://kodraus.github.io/rust/2024/06/24/uuid-v7-counters.html. How do you think this design compares to eg. Ulid's that uses all 80bits as counter? Better, worse, equal? |
I believe I improved ULID's 80-bit counter by ensuring a certain level of unguessability with the trailing 32-bit entropy. 32 bits won't be strong enough against offline brute-force attacks but might be practically sufficient against attacks over the network. The counter length is smaller than ULID's, but 42 bits are already overkill for realistic use cases. Plus, the smart rollover handling using the timestamp field as a spare counter has made counter overflow a negligible event. From the implementation standpoint, a 42-bit counter is much easier to implement because it fits in the 64-bit int/float range, than a 80-bit counter which needs BigInt arithmetics.. |
I agree. |
Hi. I was making my own and was thinking about a randomly seeded counter of 12, 20 or 26 bits, with +1 or a random increment. Even 12 bits with +1 would rarely rollover for me, only when the random seed is close to the max (a problem with any size random seeded counter).
Then I saw this using surprisingly large 42bits counter with +1. My thinking is, that a large counter will have many bits that is unchanged when doing +1, and this will decrease the randomness /entropy?
So my thinking is, have a counter that is fully "utilized" (changes as many bits as possible) compared to how many Uuids can be made in the same ms.
But probably there is a reason here that I am not seeing, I am guessing it can have to do with supporting time going back?
Thanks.
The text was updated successfully, but these errors were encountered: