how to generate comb uuid? #53

Closed
jaimesangcap opened this issue Mar 25, 2015 · 16 comments

Comments

@jaimesangcap

I've tried generating thousands of UUIDs with Uuid::uuid4(), but they are not in sequence.

@fabre-thibaud

The current release does not provide COMB support. You'll have to wait for the 3.0 release to generate COMBs.

@jaimesangcap
Author

I see. I thought I had read before that it was already supported. Thank you for the information.

@hi2u

hi2u commented Nov 4, 2016

Now that it's in the library, it would be good if it was documented in the README.md

I couldn't figure it out until I came across https://benramsey.com/blog/2016/04/ramsey-uuid/

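// Reuse the default factory's random generator and number converter.
$factory = new \Ramsey\Uuid\UuidFactory();
// CombGenerator mixes a timestamp into the random bytes.
$generator = new \Ramsey\Uuid\Generator\CombGenerator($factory->getRandomGenerator(), $factory->getNumberConverter());
// TimestampFirstCombCodec puts that timestamp at the start of the encoded UUID.
$codec = new \Ramsey\Uuid\Codec\TimestampFirstCombCodec($factory->getUuidBuilder());

// Swap both into the factory.
$factory->setRandomGenerator($generator);
$factory->setCodec($codec);

// Make the static Uuid methods use the customized factory.
\Ramsey\Uuid\Uuid::setFactory($factory);

// uuid4() now returns timestamp-first COMB UUIDs.
$uuid = \Ramsey\Uuid\Uuid::uuid4();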

Is this the easiest way to do it?

Would it also be worth adding a simple method to the main class such as Uuid::uuid4comb() that does it for us without pasting all the above into our own code?

@ramsey
Owner

ramsey commented Nov 4, 2016

Thanks for the recommendation. I'm currently working on documentation for this and hope to have it posted soon.

@1ma
Contributor

1ma commented Aug 7, 2017

How feasible would it be to add a new static method to Uuid to generate this kind of UUID?

Context: I ran a benchmark on PostgreSQL 9.6: inserting 100 million indexed uuid4s is about 30 times slower than inserting the comb variant. I believe this type of UUID should be considered a "first class citizen" of the library.

@ramsey
Owner

ramsey commented Aug 9, 2017

@1ma That sounds like it could be reasonable to me. Something like Uuid::comb()?

@1ma
Contributor

1ma commented Aug 9, 2017

Yes. I'll study the code and see if I can draft a PR.

@ramsey
Owner

ramsey commented Aug 9, 2017

Sounds great. Thanks!

@kael-shipman

Just a reminder: Looks like this never made it into the docs. I'm using @hi2u's implementation for now but would love to know if there's a "standard" way to do it.

@broofa

broofa commented Jan 31, 2019

FWIW, I decided to pass on supporting "comb" uuids in node-uuid. See uuidjs/uuid#303 (comment)

tl;dr: RFC 4122 Version 1 UUIDs serve basically the same purpose.

@hi2u

hi2u commented Feb 1, 2019

@broofa The difference is that UUIDv1 isn't actually in timestamp order.

Generate a few and you'll see the characters at the start moving faster than the ones in the middle. i.e. It's kind of like ss:mm:hh dd-mm-yyyy rather than yyyy-mm-dd hh:mm:ss. That's why comb UUIDs came about.

@kael-shipman

Hm. I wonder if that's an implementation detail. I just generated 40 on this page and they did appear to be in ascending order.

For me, the problem with UUIDv1 is their guessability. They're little more than a large sequential ID with some extra spatial data that guarantees they won't collide when mixed with data from other computers.

Comb v4 UUIDs are a hybrid of these two models, with a large ascending first segment and a randomized second half, giving you index efficiency, obscurity (I know, not to be confused with security!), and relative uniqueness. Win-win-win in my book.
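
To make the "ascending first segment, randomized second half" layout concrete, here is a minimal sketch in Python using only the standard library. It is not ramsey/uuid's implementation: the function name `comb_uuid4`, the 48-bit width of the timestamp prefix, and the millisecond precision are all assumptions chosen for illustration.

```python
import os
import time
import uuid


def comb_uuid4(unix_ms=None):
    """Return a v4-shaped UUID whose first 48 bits are a timestamp (hypothetical helper)."""
    if unix_ms is None:
        # Assumption: millisecond precision; a real library may pick a finer clock.
        unix_ms = time.time_ns() // 1_000_000
    # Six big-endian bytes, so a later time compares greater byte by byte.
    prefix = (unix_ms & 0xFFFFFFFFFFFF).to_bytes(6, "big")
    # Ten random bytes fill the rest of the 16-byte value.
    raw = prefix + os.urandom(10)
    # version=4 overwrites only the version and variant bits (bytes 6 and 8),
    # leaving the timestamp prefix intact.
    return uuid.UUID(bytes=raw, version=4)


first = comb_uuid4(1_000)
second = comb_uuid4(2_000)
print(first)
print(second)
```

Because the timestamp occupies the leading bytes, values generated later sort after earlier ones as strings and as raw bytes, while the random tail keeps them hard to guess. Two values created within the same millisecond share a prefix and are ordered only by their random part.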

@ramsey
Owner

ramsey commented Feb 1, 2019

V1 UUIDs should not appear to be ascending. They would get scattered in your DB index because of the way the time fields are reordered. I’m not sure why that benchmark shows them to be almost identical to COMB sequential v1 UUIDs in performance. Perhaps it’s because they do sort of appear sequential when generated quickly over a short period of time.

@broofa

broofa commented Feb 1, 2019

Hmm... 'Not sure I follow. UUID fields start with time_lo, time_mid, then time_high... i.e low-order time fields correspond to low-order bytes, no?
All other fields being equal (Namely node and clock_seq), and barring the endian-ness of all this, doesn't that suggest that v1 ids (from the same host) are expected to increase monotonically?

Edit: Missed the bit in the RFC that says, "The fields are presented with the most significant one first." so... Octet #0 is the most significant? (Sorry, it's been a while since I had to actually care about this stuff. 'Not actually sure I want to care about it now, either. ;-) )

@ramsey
Owner

ramsey commented Feb 1, 2019

When you generate the time for the UUID, it's in this format:

time_hi + time_mid + time_lo

Then, according to RFC 4122, we rearrange the bytes in the time integer to:

time_lo + time_mid + time_hi

So, the fields aren't in an order that can be guaranteed to increase monotonically.
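
The reordering can be seen with a small Python sketch built on the standard `uuid` module. The helper name `v1_from_timestamp`, the fixed `clock_seq`, and the made-up `node` value are assumptions for illustration only. It splits a 60-bit timestamp into the three time fields, lays them out low-first as described above, and compares two timestamps that differ by a single tick.

```python
import uuid


def v1_from_timestamp(ts, clock_seq=0, node=0x010203040506):
    """Build a version 1 UUID from a 60-bit timestamp (hypothetical helper)."""
    time_low = ts & 0xFFFFFFFF                          # lowest 32 bits, placed first
    time_mid = (ts >> 32) & 0xFFFF                      # middle 16 bits
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000    # top 12 bits plus version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


# Two timestamps one tick (100 ns) apart, straddling a rollover of time_low.
earlier_ts = (0x1EA << 48) | 0xFFFFFFFF
earlier = v1_from_timestamp(earlier_ts)
later = v1_from_timestamp(earlier_ts + 1)

print(earlier)  # ffffffff-0000-11ea-8000-010203040506
print(later)    # 00000000-0001-11ea-8000-010203040506
print(later.time > earlier.time)  # True: the embedded timestamp increased
print(str(later) < str(earlier))  # True: but the later UUID sorts first
```

The leading field only counts up until its 32 bits roll over, which at 100 ns per tick happens roughly every 7 minutes (2^32 ticks is about 429 seconds). That is consistent with a short batch looking ascending while values spread over a longer period scatter across an index.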

> Not actually sure I want to care about it now, either.

I know how you feel. 😆

@ramsey
Owner

ramsey commented Jan 16, 2020

Another discussion pointed me back to this, and I wanted to clarify something...

> Octet #0 is the most significant?

Yes. This is because bytes in UUIDs are arranged in "network byte order" (a.k.a. big endian).
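
A quick way to check the byte order is Python's standard `uuid` module, used here only as an illustration; the sample UUID is arbitrary. The first byte of the 16-byte form is the most significant octet of the 128-bit integer.

```python
import uuid

u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
raw = u.bytes  # 16 bytes in network (big-endian) order

# Octet 0 is the first pair of hex digits in the string form.
print(hex(raw[0]), hex(raw[15]))

# Reading the bytes as one big-endian integer gives the UUID's 128-bit value.
print(int.from_bytes(raw, "big") == u.int)

# Each multi-byte field is big-endian too: the first four bytes are time_low.
print(raw[:4] == u.time_low.to_bytes(4, "big"))
```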
