Investigate non-v4 UUID usage #4
Comments
It's worth noting that the README for

Question: How much of the v1 usage would have to be a mistake for it to have an impact on our decision making? (E.g. even if half of it was erroneous and we adjusted for that with a 90% v4 to 10% v1 distribution... does that change anything? My initial thought is, "Nope. We still need to design an API that allows for other versions, 'cause we're going to have to support them eventually.")
Yeah, this is where the idea comes from: Developer gets told "use UUIDs", finds uuid module, stops reading at v1, uses it.
I do hope so, but JavaScript is now widely adopted in many business areas where there may be less quality control than we are used to from our own work environments, so I don't exclude the possibility that a significant number of projects actually chose the wrong UUID version.
Should we identify such occurrences, I'd like to get in touch with the maintainers of these modules to hear their arguments.
I had the chance to meet some TC39 folks in Berlin last week, and my major lesson learned there was that we'll have to underpin every single design decision with rock-solid arguments in order to eventually reach consensus on our proposal. So first of all, no matter the outcome of my research, I still believe that our API should be open to allow for

In #3 the very first assumption I suggested was that the API should be symmetric in the different versions, but given the thoughts from this thread I think it might actually be beneficial to nudge users towards

I think thoughts like these ("what did we learn from the userland module and what can we do better in a standard library") might ultimately be of great importance for the success of this proposal.
The method @ctavan describes above sounds amazingly good. I'm really excited to hear what you find!
Here are the top 100 GitHub repos (by watch count) that make use of
And these are the ones that make use of
BTW there may still be quite a few false positives in these results. I'll pick some that sound interesting to me and check the source manually. I just did a quick check on gatsby and ghost, and they do indeed use v1 UUIDs; however, at first glance it looked like they could use v4 UUIDs equally well… I'll dig deeper over the next few days and also make sure to provide all my queries for review in #7.
@ctavan thank you for doing this. As per @codehag's out-of-band feedback, perhaps a good next step, before we dig too deep into the data you've collected, is coming up with a set of hypotheses that we're trying to prove or disprove. If we take the qualitative approach of reaching out to a few folks using UUID v1, etc., we also need to be careful not to come in with a biased point of view.
@bcoe agreed. So let's first try to characterize the 3 classes of UUIDs that are described in the RFC:
Hypothesis: Following the principle of least surprise, you should always use the simplest UUID version that fulfills your use case, because this reduces the risk of unexpected problems. So if all you need is a unique identifier, you should always use v4 UUIDs. Only if you need time-ordering should you use v1, and only if you need namespacing should you use v3/5. In particular, accidentally using v1 instead of v4 UUIDs, in cases where the developer simply expects a random value and is not aware that the generated IDs are time-ordered, can have very negative consequences:
Given these assumptions we want to understand whether:
Do you agree with these ideas?
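To make the time-ordering concern in the hypothesis concrete, here is a minimal sketch of how the version and the embedded creation time can be read back out of a UUID string. The helper names `uuidVersion` and `v1Timestamp` are hypothetical and not part of any proposal or of the uuid module; the field layout and the 1582-10-15 epoch are taken from RFC 4122.

```javascript
// Hypothetical helpers for illustration only; not part of the uuid module or the proposal.

// Return the version (1-5) encoded in an RFC 4122 UUID string, or null if it does not parse.
function uuidVersion(uuid) {
  const match = /^[0-9a-f]{8}-[0-9a-f]{4}-([1-5])[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i.exec(uuid);
  return match ? Number(match[1]) : null;
}

// Recover the creation time that a v1 UUID carries in its first three fields.
function v1Timestamp(uuid) {
  if (uuidVersion(uuid) !== 1) {
    throw new TypeError('not a v1 UUID');
  }
  const [timeLow, timeMid, timeHiAndVersion] = uuid.split('-');
  // 60-bit count of 100 ns intervals since 1582-10-15 (drop the leading version nibble).
  const ticks = BigInt('0x' + timeHiAndVersion.slice(1) + timeMid + timeLow);
  // Milliseconds between 1582-10-15 and the Unix epoch.
  const GREGORIAN_TO_UNIX_MS = 12219292800000n;
  return new Date(Number(ticks / 10000n - GREGORIAN_TO_UNIX_MS));
}

// A hand-built v1 UUID whose timestamp field equals the Unix epoch.
const sampleV1 = '13814000-1dd2-11b2-8000-000000000000';
console.log(uuidVersion(sampleV1), v1Timestamp(sampleV1).toISOString());
```

Anyone holding a v1 UUID can run the same extraction, which is the kind of surprise a developer expecting a purely random value would not anticipate. A v4 UUID has no such field, so `v1Timestamp` rejects it.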
Non-v4 UUID usage has been analyzed in detail in https://github.com/bcoe/proposal-standard-library-uuid/blob/master/analysis/README.md

Let's close this one and wait for explicit feedback on the analysis to figure out if additional research is necessary.
The analysis looks great. I'm very happy to see the responsible approach taken here, including following up with the inappropriate users of UUID v1. Seems like we can conclude that the UUID standard library should only support v4.
The initial discussion in #3 brought up the question of whether the UUID standard module should support UUID versions other than v4 from the beginning.
A rough analysis of the BigQuery GitHub dataset revealed the following usage:
@littledan made the hypothesis that some part of v1 UUID usage could actually be caused by developers having accidentally chosen the wrong UUID version, since v1 sounds much more like the "default" UUID type than v4.

To verify this hypothesis I suggest looking a bit deeper into the BigQuery GitHub dataset and:
We could do the same for v3/5.
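As a rough illustration of what such a per-file classification could look like, here is a sketch that tallies which UUID versions a piece of source text appears to use. It is not the query used for the actual analysis: the function name `countUuidVersions` and the two patterns (deep requires such as `require('uuid/v1')` and calls such as `uuid.v4()`) are assumptions chosen for the example, and simple pattern matching like this will produce false positives and misses.

```javascript
// Assumed patterns for illustration; real code imports and calls the module in many other ways.
const VERSION_PATTERNS = [
  /require\(\s*['"]uuid\/v([1345])['"]\s*\)/g, // e.g. require('uuid/v1')
  /\buuid\.v([1345])\s*\(/g,                   // e.g. uuid.v4()
];

// Count apparent uses of each UUID version in one file's source text.
function countUuidVersions(source) {
  const counts = { v1: 0, v3: 0, v4: 0, v5: 0 };
  for (const pattern of VERSION_PATTERNS) {
    for (const match of source.matchAll(pattern)) {
      counts['v' + match[1]] += 1;
    }
  }
  return counts;
}

const example = [
  "const uuidv1 = require('uuid/v1');",
  'const a = uuid.v4();',
  'const b = uuid.v4();',
].join('\n');
console.log(countUuidVersions(example));
```

Counts like these only say which version a file references, not whether that choice was deliberate, so manual review of the matched code would still be needed.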
Does this make sense to you? Anything you would add or do differently? Would be great to get some feedback before I actually start working on this 🙂.