
gh-89083: add support for UUID version 7 (RFC 9562) #121119


Merged: 70 commits, Mar 4, 2025
Commits
42d55b4
add UUIDv7 implementation
picnixz Jun 28, 2024
6826fa1
add tests
picnixz Jun 28, 2024
edc2cab
blurb
picnixz Jun 28, 2024
c6d26b6
update CHANGELOG
picnixz Jun 28, 2024
2ddb4b8
update RFC number
picnixz Jun 28, 2024
bcd1417
add TODO in the docs
picnixz Jun 28, 2024
4630c8f
Merge branch 'main' into uuid-v7-method-1
picnixz Jul 22, 2024
cd80afb
Merge branch 'main' into uuid-v7-89083
picnixz Aug 21, 2024
c3d4745
add UUIDv8 implementation
picnixz Aug 22, 2024
392d289
add tests
picnixz Aug 22, 2024
26889ea
blurb
picnixz Aug 22, 2024
44b66e6
add What's New entry
picnixz Aug 22, 2024
7be6dc4
add docs
picnixz Aug 22, 2024
8ba3d8b
Improve hexadecimal masks reading
picnixz Sep 25, 2024
a14ae9b
add uniqueness test
picnixz Sep 25, 2024
7a169c9
Update mentions to RFC 4122 to RFC 4122/9562 when possible.
picnixz Sep 25, 2024
b082c90
Update docs
picnixz Sep 25, 2024
94c70e9
Merge branch 'main' into uuid-v8-89083
picnixz Sep 25, 2024
05b7a2b
Merge branch 'main' into uuid-v7-method-1
hugovk Nov 2, 2024
275deb7
Merge branch 'main' into uuid-v8-89083
hugovk Nov 2, 2024
5e97cc3
Apply suggestions from code review
picnixz Nov 11, 2024
051f34e
Update Lib/test/test_uuid.py
picnixz Nov 11, 2024
bdf9a77
Apply suggestions from code review
picnixz Nov 11, 2024
00661fc
Merge remote-tracking branch 'origin/uuid-v8-89083'
picnixz Nov 13, 2024
0474de4
Merge remote-tracking branch 'origin/uuid-v8-89083' into uuid-v7-89083
picnixz Nov 14, 2024
a446d53
Merge remote-tracking branch 'upstream/main' into uuid-v7-89083
picnixz Nov 14, 2024
2e39072
update CLI
picnixz Nov 14, 2024
ebc1a07
Merge branch 'main' into uuid-v7-89083
picnixz Nov 14, 2024
694e07f
post-merge
picnixz Nov 14, 2024
965dbc8
Merge remote-tracking branch 'origin/uuid-v7-method-1' into uuid-v7-8…
picnixz Nov 14, 2024
7ff4368
improve readability
picnixz Nov 14, 2024
7c3cab6
post-merge
picnixz Nov 14, 2024
e758741
uniqueness test
picnixz Nov 14, 2024
c18d0c4
improve test comments
picnixz Nov 14, 2024
2df6f41
Merge remote-tracking branch 'upstream/main'
picnixz Nov 15, 2024
6fcb6a1
fix lint
picnixz Nov 15, 2024
f6048c9
Merge branch 'main' into uuid-v7-89083
picnixz Nov 15, 2024
be3f024
post-merge
picnixz Nov 15, 2024
99c6761
Merge branch 'main' into uuid-v7-89083
picnixz Nov 15, 2024
06befca
use versionchanged instead of versionadded
picnixz Nov 15, 2024
2aacadf
Merge branch 'main' into uuid-v7-method-1
picnixz Nov 16, 2024
f7f536e
Merge branch 'main' into uuid-v7-method-1
picnixz Dec 5, 2024
aee2898
improve UUIDv7 tests readability
picnixz Dec 19, 2024
1a5ac19
improve UUIDv7 uniqueness tests
picnixz Dec 19, 2024
8764b28
Merge branch 'main' into uuid-v7-method-1
picnixz Dec 21, 2024
af0baef
Merge branch 'main' into uuid-v7-method-1
picnixz Jan 11, 2025
939b5a8
Merge branch 'main' into feat/uuid/v7-89083
picnixz Jan 20, 2025
ef85b20
use `UUID._from_int` for UUIDv7 and remove `divmod` usage
picnixz Jan 20, 2025
2d08821
Merge branch 'main' into uuid-v7-method-1
picnixz Jan 20, 2025
eaa9ad4
Merge branch 'main' into uuid-v7-method-1
picnixz Feb 17, 2025
571d2fe
backport Victor's review on UUIDv6
picnixz Feb 23, 2025
f9ac658
address Victor's review
picnixz Feb 25, 2025
a756b9d
remove mention of UNIX_EPOCH + 10k years as the proof is long
picnixz Feb 25, 2025
4406796
import `time` globally as UUIDv7 is likely to be used now
picnixz Feb 25, 2025
d4eeded
run half-black
picnixz Feb 25, 2025
0e54a72
update docs
picnixz Feb 25, 2025
40ab2fa
Revert "run half-black"
picnixz Feb 25, 2025
5ee85ad
Merge branch 'main' into uuid-v7-method-1
picnixz Feb 25, 2025
3ce8943
add blank line for readability
picnixz Feb 25, 2025
59e6d7e
update comment
picnixz Feb 25, 2025
437d8cf
Update Lib/uuid.py
picnixz Feb 25, 2025
2d917b0
Merge remote-tracking branch 'upstream/main' into feat/uuid/v7-89083
picnixz Mar 2, 2025
73ab656
improve online docs
picnixz Mar 3, 2025
54d07ae
`constructor` -> `factory` in labels
picnixz Mar 3, 2025
6d76389
reword prolog
picnixz Mar 3, 2025
bd4ab55
'is outside the scope' -> 'exceeds the scope'
picnixz Mar 3, 2025
e9ddb74
Apply suggestions from code review
picnixz Mar 3, 2025
8755de0
apply PEP-8 only for UUID6, UUID7 and UUID8
picnixz Mar 3, 2025
12d7ad4
small fix
merwok Mar 3, 2025
560d87c
avoid complex language :)
picnixz Mar 3, 2025
27 changes: 21 additions & 6 deletions Doc/library/uuid.rst
@@ -11,9 +11,10 @@
--------------

This module provides immutable :class:`UUID` objects (the :class:`UUID` class)
and the functions :func:`uuid1`, :func:`uuid3`, :func:`uuid4`, :func:`uuid5`,
:func:`uuid6`, and :func:`uuid8` for generating version 1, 3, 4, 5, 6,
and 8 UUIDs as specified in :rfc:`9562` (which supersedes :rfc:`4122`).
and :ref:`functions <uuid-factory-functions>` for generating UUIDs corresponding
to a specific UUID version as specified in :rfc:`9562` (which supersedes :rfc:`4122`),
for example, :func:`uuid1` for UUID version 1, :func:`uuid3` for UUID version 3, and so on.
Note that UUID version 2 is deliberately omitted as it is outside the scope of the RFC.

If all you want is a unique ID, you should probably call :func:`uuid1` or
:func:`uuid4`. Note that :func:`uuid1` may compromise privacy since it creates
@@ -154,7 +155,7 @@ which relays any information about the UUID's safety, using this enumeration:
:const:`RFC_4122`).

.. versionchanged:: next
Added UUID versions 6 and 8.
Added UUID versions 6, 7 and 8.


.. attribute:: UUID.is_safe
@@ -185,6 +186,8 @@ The :mod:`uuid` module defines the following functions:
globally unique, while the latter are not.


.. _uuid-factory-functions:

.. function:: uuid1(node=None, clock_seq=None)

Generate a UUID from a host ID, sequence number, and the current time. If *node*
@@ -228,6 +231,18 @@ The :mod:`uuid` module defines the following functions:
.. versionadded:: next


.. function:: uuid7()

   Generate a time-based UUID according to
   :rfc:`RFC 9562, §5.7 <9562#section-5.7>`.

   For portability across platforms lacking sub-millisecond precision, UUIDs
   produced by this function embed a 48-bit timestamp and use a 42-bit counter
   to guarantee monotonicity within a millisecond.

   .. versionadded:: next

.. function:: uuid8(a=None, b=None, c=None)

Generate a pseudo-random UUID according to
@@ -330,7 +345,7 @@ The :mod:`uuid` module can be executed as a script from the command line.

.. code-block:: sh

python -m uuid [-h] [-u {uuid1,uuid3,uuid4,uuid5,uuid6,uuid8}] [-n NAMESPACE] [-N NAME]
python -m uuid [-h] [-u {uuid1,uuid3,uuid4,uuid5,uuid6,uuid7,uuid8}] [-n NAMESPACE] [-N NAME]

The following options are accepted:

@@ -347,7 +362,7 @@ The following options are accepted:
is used.

.. versionchanged:: next
Allow generating UUID versions 6 and 8.
Allow generating UUID versions 6, 7 and 8.

.. option:: -n <namespace>
--namespace <namespace>
5 changes: 3 additions & 2 deletions Doc/whatsnew/3.14.rst
@@ -919,8 +919,9 @@ urllib
uuid
----

* Add support for UUID versions 6 and 8 via :func:`uuid.uuid6` and
:func:`uuid.uuid8` respectively, as specified in :rfc:`9562`.
* Add support for UUID versions 6, 7, and 8 via :func:`uuid.uuid6`,
Review comment (Member):

Not to be pedantic, but is via correct? It seems to suggest that the functions are the only thing that support these versions, but the support is added in the UUID class and the functions are there too as a convenience.

Reply (Member Author):

Yes, but in general, we don't really want people to directly use the UUID class. Strictly speaking, we're only adding the support for the version value but we don't check how it's been generated.

I prefer users to actually use the factories. Otherwise, I can say that the UUID class now accepts version to be 6, 7, or 8.

Reply (Member):

I slightly disagree but won’t argue 🙂

:func:`uuid.uuid7`, and :func:`uuid.uuid8` respectively, as specified
in :rfc:`9562`.
(Contributed by Bénédikt Tran in :gh:`89083`.)

* :const:`uuid.NIL` and :const:`uuid.MAX` are now available to represent the
200 changes: 200 additions & 0 deletions Lib/test/test_uuid.py
@@ -871,6 +871,206 @@ def test_uuid6_test_vectors(self):
        equal((u.int >> 80) & 0xffff, 0x232a)
        equal((u.int >> 96) & 0xffff_ffff, 0x1ec9_414c)

    def test_uuid7(self):
        equal = self.assertEqual
        u = self.uuid.uuid7()
        equal(u.variant, self.uuid.RFC_4122)
        equal(u.version, 7)

        # 1 Jan 2023 00:34:56.123_456_789 UTC
        timestamp_ns = 1672533296_123_456_789  # ns precision
        timestamp_ms, _ = divmod(timestamp_ns, 1_000_000)

        for _ in range(100):
            counter_hi = random.getrandbits(11)
            counter_lo = random.getrandbits(30)
            counter = (counter_hi << 30) | counter_lo

            tail = random.getrandbits(32)
            # effective number of bits is 32 + 30 + 11 = 73
            random_bits = counter << 32 | tail

            # set the 7 remaining MSBs of the fake 80 random bits to 1 to
            # ensure that the implementation correctly removes them
            random_bits = (((1 << 7) - 1) << 73) | random_bits
            random_data = random_bits.to_bytes(10)

            with (
                mock.patch.multiple(
                    self.uuid,
                    _last_timestamp_v7=None,
                    _last_counter_v7=0,
                ),
                mock.patch('time.time_ns', return_value=timestamp_ns),
                mock.patch('os.urandom', return_value=random_data) as urand
            ):
                u = self.uuid.uuid7()
                urand.assert_called_once_with(10)
                equal(u.variant, self.uuid.RFC_4122)
                equal(u.version, 7)

                equal(self.uuid._last_timestamp_v7, timestamp_ms)
                equal(self.uuid._last_counter_v7, counter)

                unix_ts_ms = timestamp_ms & 0xffff_ffff_ffff
                equal((u.int >> 80) & 0xffff_ffff_ffff, unix_ts_ms)

                equal((u.int >> 75) & 1, 0)  # the counter's MSB must be 0
                equal((u.int >> 64) & 0xfff, counter_hi)
                equal((u.int >> 32) & 0x3fff_ffff, counter_lo)
                equal(u.int & 0xffff_ffff, tail)
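
The test above feeds 10 mocked bytes through `os.urandom` and expects the 7 most significant bits to be dropped, leaving a counter whose top bit is 0. A sketch of that split, assuming the generator masks the 80 random bits exactly as the assertions imply; `split_random` is a hypothetical name, and the helper in `Lib/uuid.py` (not shown in this excerpt) may differ.

```python
def split_random(random_data: bytes) -> tuple[int, int]:
    """Split 10 random bytes into a 42-bit counter (top bit 0) and a 32-bit tail."""
    rand = int.from_bytes(random_data, "big")
    # Keep only 41 random bits so that bit 41 of the 42-bit counter starts at 0.
    counter = (rand >> 32) & 0x1ff_ffff_ffff
    tail = rand & 0xffff_ffff
    return counter, tail


# Worst case for the mask: every one of the 80 input bits is set.
all_ones = ((1 << 80) - 1).to_bytes(10, "big")
counter, tail = split_random(all_ones)
print(hex(counter), hex(tail))
```

Seeding the counter with its top bit cleared leaves at least 2**41 increments available before the 42-bit counter can overflow within one millisecond.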

    def test_uuid7_uniqueness(self):
        # Test that UUIDv7-generated values are unique.
        #
        # While UUIDv8 has an entropy of 122 bits, those 122 bits may not
        # necessarily be sampled from a PRNG. On the other hand, UUIDv7
        # draws its random bits from os.urandom(), which is suitable for
        # cryptographic use.
        N = 1000
        uuids = {self.uuid.uuid7() for _ in range(N)}
        self.assertEqual(len(uuids), N)

        versions = {u.version for u in uuids}
        self.assertSetEqual(versions, {7})

    def test_uuid7_monotonicity(self):
        equal = self.assertEqual

        us = [self.uuid.uuid7() for _ in range(10_000)]
        equal(us, sorted(us))

        with mock.patch.multiple(
            self.uuid,
            _last_timestamp_v7=0,
            _last_counter_v7=0,
        ):
            # 1 Jan 2023 00:34:56.123_456_789 UTC
            timestamp_ns = 1672533296_123_456_789  # ns precision
            timestamp_ms, _ = divmod(timestamp_ns, 1_000_000)

            # counter_{hi,lo} are chosen so that "counter + 1" does not overflow
            counter_hi = random.getrandbits(11)
            counter_lo = random.getrandbits(29)
            counter = (counter_hi << 30) | counter_lo
            self.assertLess(counter + 1, 0x3ff_ffff_ffff)

            tail = random.getrandbits(32)
            random_bits = counter << 32 | tail
            random_data = random_bits.to_bytes(10)

            with (
                mock.patch('time.time_ns', return_value=timestamp_ns),
                mock.patch('os.urandom', return_value=random_data) as urand
            ):
                u1 = self.uuid.uuid7()
                urand.assert_called_once_with(10)
                equal(self.uuid._last_timestamp_v7, timestamp_ms)
                equal(self.uuid._last_counter_v7, counter)
                equal((u1.int >> 64) & 0xfff, counter_hi)
                equal((u1.int >> 32) & 0x3fff_ffff, counter_lo)
                equal(u1.int & 0xffff_ffff, tail)

            # 1 Jan 2023 00:34:56.123_457_032 UTC (same millisecond, different ns)
            next_timestamp_ns = 1672533296_123_457_032
            next_timestamp_ms, _ = divmod(next_timestamp_ns, 1_000_000)
            equal(timestamp_ms, next_timestamp_ms)

            next_tail_bytes = os.urandom(4)
            next_tail = int.from_bytes(next_tail_bytes)

            with (
                mock.patch('time.time_ns', return_value=next_timestamp_ns),
                mock.patch('os.urandom', return_value=next_tail_bytes) as urand
            ):
                u2 = self.uuid.uuid7()
                urand.assert_called_once_with(4)
                # same millisecond
                equal(self.uuid._last_timestamp_v7, timestamp_ms)
                # 42-bit counter advanced by 1
                equal(self.uuid._last_counter_v7, counter + 1)
                equal((u2.int >> 64) & 0xfff, counter_hi)
                equal((u2.int >> 32) & 0x3fff_ffff, counter_lo + 1)
                equal(u2.int & 0xffff_ffff, next_tail)

            self.assertLess(u1, u2)
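
The final `assertLess(u1, u2)` holds whatever the random tails are, because the counter sits in more significant bits than the tail. A small sketch of that ordering argument, assuming the field positions asserted in the test; `uuid7_int` is a hypothetical helper written for this illustration only.

```python
import uuid


def uuid7_int(timestamp_ms: int, counter: int, tail: int) -> int:
    """Build the 128-bit integer of a version 7 UUID (hypothetical helper)."""
    value = (timestamp_ms & 0xffff_ffff_ffff) << 80
    value |= 0x7 << 76                        # version 7
    value |= ((counter >> 30) & 0xfff) << 64  # 12 high counter bits
    value |= 0x2 << 62                        # RFC 4122/9562 variant
    value |= (counter & 0x3fff_ffff) << 32    # 30 low counter bits
    value |= tail & 0xffff_ffff               # random tail
    return value


ts = 1_672_533_296_123
counter = 0x155_5555_5555  # arbitrary 41-bit starting value

# Same millisecond: the earlier UUID gets the largest possible tail and the
# later one the smallest, yet the counter still decides the order.
first = uuid.UUID(int=uuid7_int(ts, counter, 0xffff_ffff))
second = uuid.UUID(int=uuid7_int(ts, counter + 1, 0))
print(first < second)
```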

    def test_uuid7_timestamp_backwards(self):
        equal = self.assertEqual
        # 1 Jan 2023 00:34:56.123_456_789 UTC
        timestamp_ns = 1672533296_123_456_789  # ns precision
        timestamp_ms, _ = divmod(timestamp_ns, 1_000_000)
        fake_last_timestamp_v7 = timestamp_ms + 1

        # counter_{hi,lo} are chosen so that "counter + 1" does not overflow
        counter_hi = random.getrandbits(11)
        counter_lo = random.getrandbits(29)
        counter = (counter_hi << 30) | counter_lo
        self.assertLess(counter + 1, 0x3ff_ffff_ffff)

        tail_bytes = os.urandom(4)
        tail = int.from_bytes(tail_bytes)

        with (
            mock.patch.multiple(
                self.uuid,
                _last_timestamp_v7=fake_last_timestamp_v7,
                _last_counter_v7=counter,
            ),
            mock.patch('time.time_ns', return_value=timestamp_ns),
            mock.patch('os.urandom', return_value=tail_bytes) as urand
        ):
            u = self.uuid.uuid7()
            urand.assert_called_once_with(4)
            equal(u.variant, self.uuid.RFC_4122)
            equal(u.version, 7)
            # the clock went backwards, so the last timestamp advanced by 1 ms
            equal(self.uuid._last_timestamp_v7, fake_last_timestamp_v7 + 1)
            unix_ts_ms = (fake_last_timestamp_v7 + 1) & 0xffff_ffff_ffff
            equal((u.int >> 80) & 0xffff_ffff_ffff, unix_ts_ms)
            # 42-bit counter advanced by 1
            equal(self.uuid._last_counter_v7, counter + 1)
            equal((u.int >> 64) & 0xfff, counter_hi)
            # 42-bit counter advanced by 1 (counter_hi is untouched)
            equal((u.int >> 32) & 0x3fff_ffff, counter_lo + 1)
            equal(u.int & 0xffff_ffff, tail)

    def test_uuid7_overflow_counter(self):
        equal = self.assertEqual
        # 1 Jan 2023 00:34:56.123_456_789 UTC
        timestamp_ns = 1672533296_123_456_789  # ns precision
        timestamp_ms, _ = divmod(timestamp_ns, 1_000_000)

        new_counter_hi = random.getrandbits(11)
        new_counter_lo = random.getrandbits(30)
        new_counter = (new_counter_hi << 30) | new_counter_lo

        tail = random.getrandbits(32)
        random_bits = (new_counter << 32) | tail
        random_data = random_bits.to_bytes(10)

        with (
            mock.patch.multiple(
                self.uuid,
                _last_timestamp_v7=timestamp_ms,
                # same timestamp, but force an overflow on the counter
                _last_counter_v7=0x3ff_ffff_ffff,
            ),
            mock.patch('time.time_ns', return_value=timestamp_ns),
            mock.patch('os.urandom', return_value=random_data) as urand
        ):
            u = self.uuid.uuid7()
            urand.assert_called_with(10)
            equal(u.variant, self.uuid.RFC_4122)
            equal(u.version, 7)
            # timestamp advanced due to overflow
            equal(self.uuid._last_timestamp_v7, timestamp_ms + 1)
            unix_ts_ms = (timestamp_ms + 1) & 0xffff_ffff_ffff
            equal((u.int >> 80) & 0xffff_ffff_ffff, unix_ts_ms)
            # counter overflowed, so we picked a new one
            equal(self.uuid._last_counter_v7, new_counter)
            equal((u.int >> 64) & 0xfff, new_counter_hi)
            equal((u.int >> 32) & 0x3fff_ffff, new_counter_lo)
            equal(u.int & 0xffff_ffff, tail)
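
Taken together, the monotonicity, backwards-clock and overflow tests describe a small state machine over the last timestamp and the last counter. The sketch below reproduces only the transitions those tests assert; `advance`, `COUNTER_MAX` and the `fresh_counter` callback are hypothetical names, and the actual code in `Lib/uuid.py` is not shown in this excerpt and may be structured differently.

```python
COUNTER_MAX = 0x3ff_ffff_ffff  # largest value of the 42-bit counter


def advance(last_ts, last_counter, now_ms, fresh_counter):
    """Return the (timestamp_ms, counter) pair to use for the next UUID.

    fresh_counter is a zero-argument callable returning a newly seeded counter.
    """
    if last_ts is None or now_ms > last_ts:
        # The clock moved forward: keep the new timestamp and reseed the counter.
        return now_ms, fresh_counter()

    timestamp_ms = now_ms
    if now_ms < last_ts:
        # The clock went backwards: step past the last timestamp used.
        timestamp_ms = last_ts + 1

    counter = last_counter + 1
    if counter > COUNTER_MAX:
        # The counter overflowed: borrow the next millisecond and reseed.
        return timestamp_ms + 1, fresh_counter()
    return timestamp_ms, counter


seed = lambda: 7  # fixed seed so that the outcomes are predictable
print(advance(100, 7, 100, seed))            # same millisecond
print(advance(101, 7, 100, seed))            # clock went backwards
print(advance(100, COUNTER_MAX, 100, seed))  # counter overflow
```

Keeping the transition a pure function of its inputs makes each branch checkable without patching `time.time_ns` or `os.urandom`.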

    def test_uuid8(self):
        equal = self.assertEqual
        u = self.uuid.uuid8()