gh-89083: add support for UUID version 7 (RFC 9562) #121119

picnixz · 2024-06-28T09:46:15Z

Based on the discussion in #89083 and https://discuss.python.org/t/rfc-4122-9562-uuid-version-7-and-8-implementation/56725/2, this is the implementation that I suggest for the standard library.

The documentation is still missing because I don't have a good formulation for now.

In this PR, I did not include the following:

mutex guards
timestamp offsets

The reason is that I want to keep the first implementation simple for the sake of review. In addition, we did not add a mutex for UUIDv1, so I don't want to add one only for v7.

@sergeyprokhorenko I don't know if you have the answer, but are there any safeguards if the timestamp actually overflows? Or do we just not care at all for now (i.e., leave the problem for future generations)?

Issue: Support UUIDv6, UUIDv7, and UUIDv8 from RFC 9562 #89083

📚 Documentation preview 📚: https://cpython-previews--121119.org.readthedocs.build/

sergeyprokhorenko · 2024-06-28T11:12:53Z

@sergeyprokhorenko I don't know if you have the answer, but are there any safeguards if the timestamp actually overflows? Or do we just not care at all for now (i.e., leave the problem for future generations)?

You already have three counter overflow protections:

Very long counter (42 bits)
Counter segment (MSB) initialized to 0
Incremented timestamp on overflow

The timestamp will not be full for about 6900 years. If the system clock stops and the timestamp is used as a counter, it will also last a long time.

You can be absolutely calm
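The three protections listed above can be sketched as follows. This is a minimal illustration and not the code merged in this PR: the class name `SketchUUID7Generator`, the `clock_ms` hook, and the choice to reuse the last timestamp when the clock stalls are all assumptions made for the example. The bit layout assumed here is a 48-bit millisecond timestamp, 4 version bits, 12 counter bits, 2 variant bits, 30 counter bits, and 32 random bits.

```python
import os
import time
import uuid

COUNTER_MAX = (1 << 42) - 1  # largest value of the 42-bit counter


class SketchUUID7Generator:
    """Illustrative only; not the code merged in this PR."""

    def __init__(self, clock_ms=None):
        # clock_ms is a hypothetical hook so the clock can be frozen in tests.
        self._clock_ms = clock_ms or (lambda: time.time_ns() // 1_000_000)
        self._last_ts = None
        self._last_counter = 0

    @staticmethod
    def _fresh_counter_and_tail():
        rand = int.from_bytes(os.urandom(10), "big")
        # Keep 41 random bits so the MSB of the 42-bit counter starts at 0.
        counter = (rand >> 32) & ((1 << 41) - 1)
        tail = rand & 0xFFFF_FFFF
        return counter, tail

    def next(self):
        ts = self._clock_ms()
        if self._last_ts is None or ts > self._last_ts:
            # New millisecond: reseed the counter.
            counter, tail = self._fresh_counter_and_tail()
        else:
            # Clock stalled or moved backwards: reuse the last timestamp
            # and let the counter provide the ordering.
            ts = self._last_ts
            counter = self._last_counter + 1
            if counter > COUNTER_MAX:
                # Counter overflow: advance the timestamp and reseed.
                ts += 1
                counter, tail = self._fresh_counter_and_tail()
            else:
                tail = int.from_bytes(os.urandom(4), "big")

        value = (ts & 0xFFFF_FFFF_FFFF) << 80   # 48-bit Unix timestamp (ms)
        value |= 0x7 << 76                      # version 7
        value |= (counter >> 30) << 64          # 12 most significant counter bits
        value |= 0b10 << 62                     # RFC variant
        value |= (counter & 0x3FFF_FFFF) << 32  # 30 least significant counter bits
        value |= tail                           # 32 random bits

        self._last_ts, self._last_counter = ts, counter
        return uuid.UUID(int=value)
```

With a frozen clock, successive calls to `next()` still compare in increasing order, because the counter sits in more significant bits than the random tail.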

picnixz · 2024-06-28T11:30:48Z

Yes, but I wanted to know whether the RFC actually considered the case where you use your own offset. Let's say we want to generate a future UUID for some obscure reason; I was wondering, "is there anything on that topic?" But I think I'll just leave it to future generations.

What I meant is "what do you do if the operation of incrementing the timestamp itself overflows"?

sergeyprokhorenko · 2024-06-28T11:35:21Z

Yes, but I wanted to know whether the RFC actually considered the case where you use your own offset. Let's say we want to generate a future UUID for some obscure reason; I was wondering, "is there anything on that topic?" But I think I'll just leave it to future generations.

What I meant is "what do you do if the operation of incrementing the timestamp itself overflows"?

Don't set offsets to 6900 years or minus 2k years, and everything will be OK. Foolproofing is an implementation detail.

sergeyprokhorenko · 2024-06-28T13:09:52Z

When the timestamp goes beyond the upper or lower limit of the acceptable range, a zero offset can be applied. This is how I would do it. The RFC does not cover this issue.

I think timestamp offset could be a competitive advantage of this implementation without significant cost.
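The zero-offset fallback suggested above could look like the sketch below. Timestamp offsets are not part of this PR, so the helper name `apply_offset_ms` and its signature are hypothetical. The sketch assumes the acceptable range is the 48-bit unsigned millisecond field used by UUIDv7 and that the unshifted timestamp is itself in range.

```python
MAX_TIMESTAMP_MS = (1 << 48) - 1  # largest value of the 48-bit timestamp field


def apply_offset_ms(now_ms, offset_ms=0):
    """Hypothetical helper: shift a timestamp, falling back to a zero offset.

    If the shifted timestamp would leave the 48-bit range, the offset is
    ignored and the original timestamp is returned unchanged.
    """
    shifted = now_ms + offset_ms
    if 0 <= shifted <= MAX_TIMESTAMP_MS:
        return shifted
    return now_ms  # out of range: behave as if offset_ms were 0
```

For instance, `apply_offset_ms(1000, 500)` returns the shifted value, while `apply_offset_ms(1000, -2000)` would go below zero and therefore returns `1000` unchanged.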

UUIDv1 can be forgotten and no longer upgraded. This is an outdated version

Lib/uuid.py

pretoriusdre · 2024-07-22T12:57:23Z

Great job on this PR. One thing...

In the get_counter_and_tail method:
rand = int.from_bytes(os.urandom(10))

Might I suggest explicitly specifying the required byte order using the byteorder argument?

Running this code in an older python env gives an error:
TypeError: from_bytes() missing required argument 'byteorder' (pos 2)

It seems like some default is now provided, but in my opinion, this could lead to some ambiguity. See below:
https://discuss.python.org/t/what-should-be-the-default-value-for-int-to-bytes-byteorder/10616

There is another usage of int.from_bytes in the same uuid module; if the above is being addressed, it could be handled in the same scope.

picnixz · 2024-07-22T13:28:39Z

Running this code in an older python env gives an error:

This feature would only be put in 3.14 or later, so we can ignore this.

but in my opinion, this could lead to some ambiguity

It doesn't matter whether it's little or big endian here since we are only interested in randomness and not actual data. In addition, not specifying it might be a bit faster since the C implementation currently does:

    if (byteorder == NULL)
        little_endian = 0;
    else if (_PyUnicode_Equal(byteorder, &_Py_ID(little)))
        little_endian = 1;
    else if (_PyUnicode_Equal(byteorder, &_Py_ID(big)))
        little_endian = 0;

So not specifying byteorder is equivalent to byteorder being NULL there, which saves a string comparison.
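The point about byte order not mattering for random data can be checked with a short sketch. The helper name `split_counter_and_tail` is an assumption for this example; it passes `byteorder` explicitly so that it also runs on Python versions older than 3.11, where the argument has no default. Reading the bytes in little-endian order is the same as reading the reversed bytes in big-endian order, and reversing uniformly random bytes leaves them uniformly random.

```python
import os


def split_counter_and_tail(raw, byteorder="big"):
    """Hypothetical helper: split 10 random bytes into a counter and a tail."""
    rand = int.from_bytes(raw, byteorder=byteorder)
    counter = (rand >> 32) & 0x1FF_FFFF_FFFF  # 41 bits, so the 42-bit MSB is 0
    tail = rand & 0xFFFF_FFFF                 # 32 bits of random data
    return counter, tail


raw = os.urandom(10)
big = split_counter_and_tail(raw, "big")
little = split_counter_and_tail(raw, "little")

# Little-endian decoding equals big-endian decoding of the reversed bytes.
assert little == split_counter_and_tail(raw[::-1], "big")
```

Either byte order yields a counter below `2**41` and a tail below `2**32`, so the choice only changes which random bits land where.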

jnoring-pw

Just a few minor comments as I looked through this code (I'm interested in uuid7 support in our project). Thanks for this! It's looking good.

Lib/uuid.py

picnixz · 2025-03-02T11:52:32Z

@merwok Do you find the formulation still clear in uuid.py? I made it explicit in the online docs but for the Python file, I think we can be a bit less formal.

Doc/library/uuid.rst

Lib/uuid.py

picnixz · 2025-03-03T16:52:50Z

@merwok @hugovk I've applied some suggestions. WDYT?

Doc/library/uuid.rst

merwok · 2025-03-03T17:16:00Z

Doc/whatsnew/3.14.rst

@@ -919,8 +919,9 @@ urllib
 uuid
 ----

-* Add support for UUID versions 6 and 8 via :func:`uuid.uuid6` and
-  :func:`uuid.uuid8` respectively, as specified in :rfc:`9562`.
+* Add support for UUID versions 6, 7, and 8 via :func:`uuid.uuid6`,


Not to be pedantic, but is via correct? It seems to suggest that the functions are the only thing that support these versions, but the support is added in the UUID class and the functions are there too as a convenience.

Yes, but in general, we don't really want people to directly use the UUID class. Strictly speaking, we're only adding the support for the version value but we don't check how it's been generated.

I prefer users to actually use the factories. Otherwise, I can say that the UUID class now accepts version to be 6, 7, or 8.

I slightly disagree but won’t argue 🙂

merwok · 2025-03-03T17:17:12Z

Lib/uuid.py

@@ -808,6 +809,80 @@ def uuid6(node=None, clock_seq=None):
    int_uuid_6 |= _RFC_4122_VERSION_6_FLAGS
    return UUID._from_int(int_uuid_6)

+_last_timestamp_v7 = None


Suggested change

_last_timestamp_v7 = None

_last_timestamp_v7 = None

I wanted to apply a PEP-8 change in a separate PR because the module has inconsistencies. It seems a bit weird to only PEP-8ify this part of the code while the rest is not really PEP-8ified. See #121119 (comment).

python-dev doesn’t have a practice of doing reformatting-only PRs.
Remember that consistency for its own sake is not a goal (see PEP 20)

Instead, follow good conventions in code that is added or already changed.

Well... if a core dev endorses the change, I think it's fine. I don't mind endorsing it. I didn't do it for uuid6() or uuid8() when I wrote those functions because the module used single blank-line separations more often than double ones. But if you insist on adding 2 blank lines, I'll also add them around the other functions because I prefer being consistent in this case (honestly, having 2 blank lines around only UUIDv7 makes it harder to read IMO).

I would say PEP-8 tells me that we can also ignore the PEP if the surrounding code already breaks it. But I will make a commit to just add blank lines around the functions I've added (uuid6 to uuid8).

I don't think that it's worth it to reformat the whole uuid.py file to PEP 8, but respecting PEP 8 for new code (or code near changed code) is a good practice.

Also, adding a few blank lines is innocuous (it does not change git blame, or risk changing the meaning of code), so it’s fine to do in existing code in this PR.

Generally, people saying they want to «apply PEP 8» are thinking of much bigger changes.

[note: marking this convo as unresolved just to help Victor or Hugo see it, not because there’s something left to do for the PR author]

I would say PEP-8 tells me that we can also ignore the PEP if the surrounding code already breaks it.

This is about for example methods using camelCase in unittest or logging, not spaces!

Lib/uuid.py

Doc/library/uuid.rst

Lib/uuid.py

Opportunity to make new code more PEP-8.

picnixz · 2025-03-03T18:29:09Z

@hugovk I eventually settled on doing PEP-8 for the new UUID functions and left the rest of the module untouched. Ideally, I really wanted a cosmetic-only PR because the module is slow to evolve, but I don't want to spend too much time trying to convince other core devs. So I'll just update new code.

Doc/library/uuid.rst

minimize git diff + remove now unwanted article

Doc/library/uuid.rst

Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>

picnixz · 2025-03-04T09:47:34Z

We finally landed that PR! Thank you everyone for the feedback and the patience :)

Add support for generating UUIDv7 objects according to RFC 9562, §5.7 [1].

The functionality is provided by the `uuid.uuid7()` function. The implementation is based on a 42-bit counter as described by Method 1, §6.2 [2] and guarantees monotonicity within the same millisecond.

[1]: https://www.rfc-editor.org/rfc/rfc9562.html#section-5.7
[2]: https://www.rfc-editor.org/rfc/rfc9562.html#section-6.2

---------

Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Éric <merwok@netwok.org>
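To make the layout described in the commit message concrete, the sketch below decodes the fields of a UUIDv7 value. The helper name `describe_uuid7` is hypothetical, and the sample value is taken to be the UUIDv7 example published among the RFC 9562 test vectors. Reading the middle bits as a 42-bit counter only makes sense for values produced by a Method 1 generator; for other UUIDv7 values those bits are simply random.

```python
import uuid
from datetime import datetime, timezone


def describe_uuid7(u):
    """Hypothetical helper: split a UUIDv7 into its Method 1 fields."""
    value = u.int
    counter_hi = (value >> 64) & 0xFFF        # 12 bits after the version
    counter_lo = (value >> 32) & 0x3FFF_FFFF  # 30 bits after the variant
    return {
        "unix_ts_ms": value >> 80,            # 48-bit Unix timestamp (ms)
        "version": (value >> 76) & 0xF,
        "counter": (counter_hi << 30) | counter_lo,
        "tail": value & 0xFFFF_FFFF,          # 32 random bits
    }


example = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
fields = describe_uuid7(example)
created = datetime.fromtimestamp(fields["unix_ts_ms"] / 1000, tz=timezone.utc)
print(fields["version"], created.isoformat())
```

Because the timestamp occupies the most significant bits, sorting UUIDv7 values by their integer form also sorts them by creation time at millisecond granularity.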

picnixz added 6 commits June 28, 2024 11:40

add UUIDv7 implementation

42d55b4

add tests

6826fa1

blurb

edc2cab

update CHANGELOG

c6d26b6

update RFC number

2ddb4b8

add TODO in the docs

bcd1417

bedevere-app bot mentioned this pull request Jun 28, 2024

Support UUIDv6, UUIDv7, and UUIDv8 from RFC 9562 #89083

Closed

bedevere-app bot added the awaiting review label Jun 28, 2024

This was referenced Jun 28, 2024

gh-89083: support UUID version 7 (monotonous version) (RFC 9562) [abandoned proposal] #120830

Closed

gh-89083: add support for UUID version 6 (RFC 9562) #120650

Merged

picnixz changed the title ~~gh-89083: add ref. impl. for UUID version 7 (RFC 9562)~~ gh-89083: add support for UUID version 7 (RFC 9562) Jun 28, 2024

picnixz mentioned this pull request Jun 30, 2024

[RFE] fields and time_* properties must not be used on UUIDs that are time-agnostic. #120878

Open

mastizada reviewed Jul 7, 2024

Lib/uuid.py (outdated, resolved)

sixcare mentioned this pull request Jul 8, 2024

Switch out UUIDv4 with UUIDv7 Turplanlegger/turplanlegger-fastapi#89

Open

Merge branch 'main' into uuid-v7-method-1

4630c8f

jnoring-pw reviewed Aug 20, 2024

Lib/uuid.py (outdated, resolved)

Lib/uuid.py (outdated, resolved)

picnixz added 6 commits August 21, 2024 13:32

Merge branch 'main' into uuid-v7-89083

cd80afb

add UUIDv8 implementation

c3d4745

add tests

392d289

blurb

26889ea

add What's New entry

44b66e6

add docs

7be6dc4

picnixz changed the title ~~gh-89083: add support for UUID version 7 (RFC 9562)~~ gh-89083: add support for UUID version 7 (RFC 9562) Aug 22, 2024

Improve hexadecimal masks reading

8ba3d8b

picnixz requested review from hugovk and merwok March 2, 2025 11:51

hugovk reviewed Mar 2, 2025

Doc/library/uuid.rst (outdated, resolved)

Lib/uuid.py (outdated, resolved)

improve online docs

73ab656

picnixz commented Mar 3, 2025

Lib/uuid.py (outdated, resolved)

merwok reviewed Mar 3, 2025

Doc/library/uuid.rst (outdated, resolved)

merwok reviewed Mar 3, 2025

Doc/library/uuid.rst (outdated, resolved)

merwok reviewed Mar 3, 2025

Lib/uuid.py (resolved)

picnixz added 2 commits March 3, 2025 18:24

constructor -> factory in labels

54d07ae

reword prolog

6d76389

picnixz commented Mar 3, 2025

Doc/library/uuid.rst (resolved)

'is outside the scope' -> 'exceeds the scope'

bd4ab55

picnixz commented Mar 3, 2025

Doc/library/uuid.rst (outdated, resolved)

Lib/uuid.py (outdated, resolved)

picnixz added 2 commits March 3, 2025 19:02

Apply suggestions from code review

e9ddb74

apply PEP-8 only for UUID6, UUID7 and UUID8

8755de0

Opportunity to make new code more PEP-8.

merwok reviewed Mar 3, 2025

Doc/library/uuid.rst (outdated, resolved)

small fix

12d7ad4

minimize git diff + remove now unwanted article

merwok approved these changes Mar 3, 2025

hugovk approved these changes Mar 3, 2025

Doc/library/uuid.rst (outdated, resolved)

avoid complex language :)

560d87c

Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>

picnixz merged commit 3929af5 into python:main Mar 4, 2025
39 checks passed

bedevere-app bot removed the awaiting merge label Mar 4, 2025

picnixz deleted the uuid-v7-method-1 branch March 4, 2025 09:47
