-
Notifications
You must be signed in to change notification settings - Fork 9
Approximate UUID timestamp calculations #127
Comments
Let me elaborate a bit. |
What was the shortcoming? I don't understand how the new verbiage improves upon what's in the spec already.
The cost of a divide operation is several orders of magnitude below what we should be concerned with. |
DIV r32 can cost you some tens of nanoseconds [0]. On some platforms RDTSC is cheaper than DIV[1] and total cost of gettimeofday\clock_gettime will be comparable to this avoidable math manipulation. At which point are we concerned with performance? [0] https://www.agner.org/optimize/instruction_tables.pdf |
When the rate at which generating UUIDs becomes an issue for people using them. While I'll grant this is anecdotal, the NPM So in absolute terms I think it's safe to say that generation times on the order of 1000 nanoseconds are at the extreme limit of what's expected. In real world use cases UUIDs are generated in contexts where other more performance-intensive operations are the gating factor. Hence, why a few 10's of nanoseconds aren't a concern. For what it's worth, the performance work we've done on |
Here is a quote from the section 2. Motivation of the renewed standard:
This means a generation time of 100 nanoseconds per UUID. Therefore, tens of nanoseconds of the division operation is too much |
'Taking a bit more time to figure out what it is that's being asked for here, I think I understand the issue. To answer my own question ("what was the shortcoming?") the concern seems to be that the spec doesn't explicitly allow for deliberately sloppy timestamps computations, such as dividing by 1024 instead of 1000 to convert from microseconds to milliseconds. Assuming that's a fair synopsis, I would ask that we not take this new verbiage. The existing verbiage acknowledges that system clocks are known to be imperfect, while conveying the desire for accuracy. It's a good balance of "timestamps need to be accurate" and "timestamps are known to be inaccurate". The proposed verbiage carries a different tone. It tacitly endorses implementations that are deliberately inaccurate, which I think is a mistake. And it does so for performance reasons that are (imho) well beyond the margins of actual user concern. On the topic of performance ...
2. Motivation is just the original verbiage from RFC4122. It's included for (I assume) historical reference... and as it turns out, it is incorrect(!) Version 1 does not allow for "10 million per second ... or more". It allows for at most 10M/second. [@kyzer-davis: Can you confirm the original RFC motivation is misleading in this regard? I don't see any provisions in 4122 that allow for generators to support rates greater than 1 uuid/timestamp interval. Would this be worth calling out or revising in the new spec?] Thus, "10 million per second" isn't a requirement. It's an upper limit. And it's a theoretical limit based on the nature of the v1 (and v6 by extension) algorithms. None of the other versions have such a limit. v4, v5, v7? Those algorithms will let you crank out a trillion uuids/second if you can figure out how to do that. So should the spec concern itself with what might be needed to achieve picosecond-level optimizations? |
This claim is based on nothing and is misleading. This is not in the standard. This is an irresponsible speculation. |
The sole purpose of the timestamp is to order UUIDs by generation time. It does not matter how much the timestamp differs from real time. Moreover, the standard calls for distorting the timestamp when it is necessary for information security. For example, you can add or subtract several years to hide the true moment of generation. And the standard does not limit the motives for which the timestamp can be deliberately distorted. |
This statement grossly contradicts the text of the standard. |
Inflammatory, a little offensive, and nothing of substance to respond to.
Than we definitely shouldn't allow non-standard time intervals. Doing so causes ordering to become a hot mess as soon as uuids from different generators start to commingle. For example, Take a uuid generated now ( That's going to wreak havoc with the stated goal of better DB locality.
Prove it. Here's the's v1 implementation for Distinctly not holding my breath on this, however. |
Folks, just my 2 cents. |
Also, from a DB standpoint, mixing series of UUIDs with seriously different clocks wont make any impact on performance. Crucial part is that working set of new data is small. When the data is generated randomly across keyspace of whole dataset - that's a problem. If we have 13 data sources, one is dividing by 1000, another by 1002, one more by 1004 ... 1024 - this won't affect cache locality much. |
I think the benefit of optimizing the generation of timestamps by avoiding division is of little relevance. The test below shows that the difference between division and shift (division by a power of 2) can be negligible compared to the duration to get the current time. Measure the duration to get 1 billion milliseconds using div and shift separately Test result:
Also, in the long run, the difference between the timestamps produced with division and shift will get bigger and bigger. Currently, the difference is more than 400 days, considering the test below. Measure the distance between milliseconds using div and milliseconds using shift Test result:
Assembly difference between division and shift in the tests: user@computer:~$ diff get_milliseconds_with_div.s get_milliseconds_with_shift.s
23,33c23,27
< imulq $1000, %rax, %rsi
< movq -24(%rbp), %rcx
< movabsq $4835703278458516699, %rdx
< movq %rcx, %rax
< imulq %rdx
< movq %rdx, %rax
< sarq $18, %rax
< sarq $63, %rcx
< movq %rcx, %rdx
< subq %rdx, %rax
< addq %rsi, %rax
---
> salq $10, %rax
> movq %rax, %rdx
> movq -24(%rbp), %rax
> sarq $20, %rax
> orq %rdx, %rax |
I have not been checking this repo so I apologize for missing this thread. @broofa, the old doc said:
The latest new doc says:
The key being that it is now uncapped and 10m is just an example. I don't remember exactly when this was changed but I believe the idea is that one could create way more than this with modern computers. After-all that metric may just be something from early 2000s era computing vs some hard limit. As for the dividing a microsecond by 1024 vs 1000 for performance reasons. I see no issue adding something to "Altering, Fuzzing, or Smearing:" sub-section which gives an example of another reason one might modify the timestamp. Some proposed text:
Just to note, we just concluded IETF last call so I am not sure what changes I can realistically add that won't reset the AD reviews. Let me run this by the WG Chairs and get their opinion. |
My take on this is that dividing by 1024 for performance reasons should be allowed by the spec, but not encouraged. From an interoperability perspective if something is actually using the timestamp or needs the ordering provided by whatever the best clock source is, then this is problematic. But, that said, one of the goals of this new spec is to allow implementations to make these kinds of tradeoffs - if the interoperability in terms of timestamp accuracy just isn't a concern for your application, then sure, use 1024ths of a second instead of milliseconds. But I don't think we should promote this idea in the spec because it's a slippery slope - why not change the timestamp entirely to some other numbering system, and if you did, is that still UUIDv7? Just from a quick playground example, if you look at the difference for a specific timestamp, you can see here that https://go.dev/play/p/3_dA3xapsaR the date So again, if someone wants to do this for their system and the interoperability concerns are acceptable to them because they are sure nobody cares about how the timestamp reads, or they control all of the parsing code, great. But I don't think we should be telling people to do this as advice. How about this text instead:
|
I would argue that you should use a UUIDv8 in this case. |
I prefer the wording of @kyzer-davis as more specific, or the wording of @bradleypeabody needs to be clarified. The words "timestamp value accuracy" can cause controversy, as some may think that it is about minutes or hours, but not years. By the way, for security reasons, a deviation of several years is also a good opportunity. I want to remind that the only purpose of the timestamp is to ensure the generation of monotonically increasing keys to quickly search for records. But from a DB standpoint, mixing series of UUIDs with seriously different clocks wont make any impact on performance. Therefore, the interoperability will not be affected. See also: 6.11 Opacity I agree with the suggestion to document intentional timestamp deviations. |
Could be somewhat off topic: Plus, from my benchmarks, under extremely high workload, the syscall overhead becomes prominent, so caching the |
Btw, consider this approach as well: 1073 comes from: Beware of tv_nsec overflows. |
After some research, I learned that (uint64_t)tp.tv_sec * 1000 + ((uint64_t)tp.tv_nsec * UINT64_C(18014398510) >> 54) is 100% compatible with (uint64_t)tp.tv_sec * 1000 + (uint64_t)tp.tv_nsec / 1000000 Two expressions result in exactly the same value for each Plus, compilers conduct this sort of optimization automatically (see this paper for details), so when I look at a quick benchmark result, the former code is much faster than the latter with the unoptimized executable, while such a performance difference is not observable with the optimized executable. I'd say we don't need to be concerned about divisions and can just stick to: (uint64_t)tp.tv_sec * 1000 + (uint64_t)tp.tv_nsec / 1000000 |
Yup, I worried a bit about -O0 too, but thought that it makes sense. Relying on compiler optimizations did not strike me as a good idea. However, this optimization seems straightforward enough, when I saw the code in C :) |
I have committed the text change under ietf-wg-uuidrev/rfc4122bis@26f38c9 for draft-10 |
Merged, and closing. Thanks group. |
@kyzer-davis, |
As a result of the attempt to implement the standard, one unfortunate shortcoming was revealed. I propose to add the following text to the section "6.1. Timestamp Considerations" in the next version of the standard:
I believe this is now in line with the standard, as the standard says:
However, developers would like reasonable computational errors to be allowed explicitly.
The text was updated successfully, but these errors were encountered: