Add UUID v7 support #15

khasinski · 2022-11-02T20:44:27Z

UUIDv7 (currently an IETF draft) is a new UUID version whose Unix timestamp component makes values time-ordered. This helps when iterating over a large data set (for example, backfilling migrations with in_batches) while still keeping some of the randomness of UUIDv4.

see ~~https://datatracker.ietf.org/doc/draft-peabody-dispatch-new-uuid-format/04/~~
There is an updated version of this document: https://datatracker.ietf.org/doc/draft-ietf-uuidrev-rfc4122bis/

UUIDv7, currently a draft RFC, is a new version that allows for time ordering thanks to a Unix timestamp component. @see https://datatracker.ietf.org/doc/draft-peabody-dispatch-new-uuid-format/04/
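To make the layout concrete, here is a minimal sketch of how a UUIDv7 value can be assembled, assuming the field layout from the draft (48-bit Unix millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits). The names `sketch_uuid_v7` and `uuid_v7_timestamp_ms` are hypothetical helpers for illustration, not the code in this PR.

```ruby
require "securerandom"

# Hypothetical helper, not the PR's implementation.
# Layout assumed from the draft: unix_ts_ms (48) | ver (4) | rand_a (12) | var (2) | rand_b (62)
def sketch_uuid_v7(now_ms = Process.clock_gettime(Process::CLOCK_REALTIME, :millisecond))
  ts     = now_ms & 0xffff_ffff_ffff            # keep the low 48 bits
  rand_a = SecureRandom.random_number(1 << 12)  # 12 random bits
  rand_b = SecureRandom.random_number(1 << 62)  # 62 random bits
  format("%08x-%04x-%04x-%04x-%012x",
         ts >> 16,                              # high 32 bits of the timestamp
         ts & 0xffff,                           # low 16 bits of the timestamp
         0x7000 | rand_a,                       # version nibble 7 + rand_a
         0x8000 | (rand_b >> 48),               # variant bits 10 + top 14 bits of rand_b
         rand_b & 0xffff_ffff_ffff)             # low 48 bits of rand_b
end

# Hypothetical helper: reads the millisecond timestamp back out of the first 48 bits.
def uuid_v7_timestamp_ms(uuid)
  uuid.delete("-")[0, 12].to_i(16)
end
```

Because the timestamp occupies the most significant bits, comparing two such strings compares their millisecond timestamps first; values generated within the same millisecond fall back to random order.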

skull-squadron · 2023-01-30T07:08:08Z

lib/random/formatter.rb

+  # See RFC 4122 for details of UUID.
+  #
+  def uuid_v7
+    ts = [Process.clock_gettime(Process::CLOCK_REALTIME, :millisecond)].pack('Q>').unpack('nNn').drop(1)


The RFC is fundamentally flawed >> and will not work at scale if monotonic total ordering is required <<. CLOCK_REALTIME skips forwards and backwards on many events; to name a few: hibernation, NTP adjustments, daylight saving time, and leap seconds. And CLOCK_MONOTONIC_RAW is not suitable for use between systems.

If there will only ever be a single system generating UUIDs, then CLOCK_MONOTONIC_RAW with a fallback to CLOCK_MONOTONIC is appropriate. If multiple systems expect canonical total monotonic ordering, then deploy PTP and use TAI (CLOCK_TAI on Linux). CLOCK_REALTIME with a timezone of UTC can never be monotonic due to leap seconds: UTC(t) = TAI(t) - leap_seconds_for_year_and_month(t(m, y)) (data here). TAI is the primary reliable, global monotonic time standard and is essential to providing lock-free, unique, total ordering across multiple systems. The fallback method for global ordering is to have a single UUID master issuer (a possible SPoF risk).

TL;DR: In any case, this type of UUID won't be useful for anything important.
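For the single-process case only, one common mitigation for a wall clock that steps backwards is to remember the last timestamp handed out and never return a smaller one. The sketch below illustrates that idea; `NonDecreasingClock` is a hypothetical class, not part of this PR, and it does nothing for ordering across multiple systems.

```ruby
# Hypothetical helper: a per-process millisecond clock that never goes backwards.
# The clock source is injectable so the behaviour can be exercised with fake readings.
class NonDecreasingClock
  def initialize(source: -> { Process.clock_gettime(Process::CLOCK_REALTIME, :millisecond) })
    @source = source
    @last   = 0
    @mutex  = Mutex.new
  end

  # Returns the current reading, or the previous result if the source stepped backwards.
  def now_ms
    @mutex.synchronize do
      @last = [@source.call, @last].max
    end
  end
end

# Usage with a fake source that jumps back from 100 to 90:
readings = [100, 90, 110]
clock = NonDecreasingClock.new(source: -> { readings.shift })
results = Array.new(3) { clock.now_ms }
# results is [100, 100, 110]
```

The trade-off is that timestamps stall while the wall clock catches up, so IDs generated in that window are ordered only by their random bits.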

khasinski · 2023-01-30

Thank you for your comment! Is monotonic total ordering required, though? From my perspective, there are many use cases where some instability is acceptable and an approximately monotonic ordering still helps.

Consider for example a batching mechanism for backfills in a typical RoR application:

    # Loads records in batches of 1000, keeping track of the latest id
    Model.in_batches do |batch|
      # A batch operation that would normally lock the table,
      # but now locks only the selected rows
      batch.update_all(something: :something)
    end

In the example above, having a UUIDv4 as the primary key means that the records don't have a stable order. The occasional inconsistency of UUIDv7 is usually covered by the batch size.
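The effect on batching can be sketched without Rails. The plain-Ruby simulation below assumes batches are fetched keyset-style (roughly `WHERE id > cursor ORDER BY id LIMIT n`); `each_batch` and the sample ids are made up for illustration. A row inserted mid-run is only picked up if its id sorts after the current cursor, which is what time-ordered ids provide and random ids do not.

```ruby
# Hypothetical stand-in for keyset batching over an in-memory "table" of ids.
def each_batch(table, batch_size)
  cursor = nil
  loop do
    # Emulates: WHERE id > cursor ORDER BY id LIMIT batch_size
    batch = table.select { |id| cursor.nil? || id > cursor }.sort.first(batch_size)
    break if batch.empty?

    yield batch
    cursor = batch.last
  end
end

table = %w[a1 a2 a3 a4]
seen = []
each_batch(table, 2) do |batch|
  seen.concat(batch)
  next unless seen.size == 2

  # Simulate two rows inserted while the backfill is running:
  table << "a9" # sorts after the cursor, like a newer time-ordered id
  table << "a0" # sorts before the cursor, like an unlucky random id
end
# seen is ["a1", "a2", "a3", "a4", "a9"]; "a0" was never visited
```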

However, I'd be open to rewriting this to use TAI (perhaps as an option) if necessary.

shreyasbharath · 2023-04-25T07:56:25Z

Can we merge this?

pupeno · 2023-04-28T07:45:30Z

I'm a fan of using UUIDs as identifiers, but yeah, sometimes losing monotonically increasing ids is a pain. UUIDv7 would be helpful in many cases. I know you can technically have clock issues, but those tend to cause problems in the millisecond range, while most user-generated data in the apps that I build is created seconds or minutes apart, so it's not a problem. Knowing which record was created first when they were created 2 days apart, just from the id, can be useful.

I think this can be a middle ground before moving to centrally generated, monotonically increasing ids, à la Twitter Snowflake.

nevans · 2023-06-23T20:26:05Z

Sorry, I didn't realize that lib/random/formatter.rb belonged to this repository, and I made a very similar PR here: ruby/ruby#7953.

My implementation was originally almost identical to this. But after someone made a comment about monotonicity and I thought about it a little bit, I added an optional part of the draft RFC: a kwarg for 0..12 extra timestamp bits. This changes the timestamp precision from 1ms to up to ~250ns, at the loss of up to 12 bits of randomness, and slightly more complex code.
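The precision figure follows from dividing one millisecond into 2^12 steps: 1,000,000 ns / 4096 ≈ 244 ns. A small sketch of that arithmetic, assuming the sub-millisecond remainder is scaled into the extra bits; the helper names are hypothetical and this is not the code from either PR.

```ruby
NS_PER_MS = 1_000_000

# Timestamp resolution in nanoseconds when `extra_bits` sub-millisecond bits are used.
def precision_ns(extra_bits)
  NS_PER_MS.fdiv(2**extra_bits)
end

# Scales the nanoseconds elapsed within the current millisecond into `extra_bits` bits.
def sub_ms_fraction(sub_ms_ns, extra_bits)
  (sub_ms_ns * (2**extra_bits)) / NS_PER_MS
end

# Splits a nanosecond timestamp into [milliseconds, extra timestamp bits].
def split_timestamp(now_ns, extra_bits)
  ms, sub_ms_ns = now_ns.divmod(NS_PER_MS)
  [ms, sub_ms_fraction(sub_ms_ns, extra_bits)]
end

precision_ns(0)  # 1000000.0 (plain 1 ms resolution)
precision_ns(12) # 244.140625
split_timestamp(Process.clock_gettime(Process::CLOCK_REALTIME, :nanosecond), 12)
```

Each extra timestamp bit halves the step size and takes one bit away from the random portion, which is the trade-off described above.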

I agree with @khasinski and @pupeno that perfect monotonicity isn't necessary for most use-cases, and in the places where it is necessary, you probably need to handle it in a centralized DB server anyway (and probably a special purpose database). Considering that my current DBs use v4 UUIDs with zero monotonicity, 1ms precision is certainly good enough for nearly anything I'd use it in. And for simple single-node monotonicity, you can always simply sort, like so: Array.new(1000) { SecureRandom.uuid_v7 }.sort.

IMO, the other techniques provided by the RFC for improving monotonicity are all far too complicated and come with far too many trade-offs. If a Ruby application truly needs monotonicity guarantees better than 240ns of precision (on a single node, and subject to whatever clock skew your NTP-managed servers might have), then that application knows which trade-offs make the most sense and can implement whatever global system state (counters, etc.) it needs.

nevans · 2023-06-29T22:50:01Z

FWIW, I added my slightly different PR here: #19.

khasinski force-pushed the uuid-v7 branch 3 times, most recently from 8e36c6a to 72bc8e6 Compare November 2, 2022 20:48

khasinski marked this pull request as ready for review November 2, 2022 21:04

khasinski force-pushed the uuid-v7 branch 5 times, most recently from 21ffc3b to def6ad2 Compare November 3, 2022 07:48

Add uuid_v7 support

84fa9eb

UUIDv7, currently a draft RFC, is a new version that allows for time ordering thanks to a Unix timestamp component. @see https://datatracker.ietf.org/doc/draft-peabody-dispatch-new-uuid-format/04/

khasinski force-pushed the uuid-v7 branch from def6ad2 to 84fa9eb Compare November 3, 2022 07:48

skull-squadron reviewed Jan 30, 2023

kachick mentioned this pull request Mar 7, 2023

UUIDv6, UUIDv7, UUIDv8 kachick/ruby-ulid#37

Closed

khasinski requested a review from skull-squadron April 23, 2023 19:01

This was referenced Jun 29, 2023

Add support for UUID version 7 #19

Merged

Add support for UUID version 7 ruby/ruby#7953

Closed

khasinski closed this Sep 26, 2023

khasinski deleted the uuid-v7 branch September 26, 2023 15:12
