-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add UUID
.
#2716
Add UUID
.
#2716
Conversation
Hi @mirek tests are passing for std and compiler, but failed on formatting, you can run |
|
||
require "secure_random" | ||
|
||
private def assert_hex_pair_at!(value : String, i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it outside struct UUID
?
Should we perhaps namespace this into |
|
||
def initialize(bytes : Slice(UInt8)) | ||
raise ArgumentError.new "Invalid bytes length #{bytes.size}, expected 16." if bytes.size != 16 | ||
@data.to_unsafe.copy_from bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this perhaps validate the format?
@jhass UUID happens to have a random version (v4) but all other versions are neither random or secure. |
There are still constant fields, the highest two bits of @ysbaddaden This class only implements version 4 though |
What about avoiding to copy data around in struct UUID
def initialize(@bytes : Slice(UInt8))
unless @bytes.size == 16
raise ArgumentError.new("Invalid UUID: expected 16 bytes but got #{@bytes.size}")
end
end
def to_slice
@bytes
end
def to_unsafe
@bytes.to_unsafe
end
end What about having a single struct UUID
def self.new(uuid)
new parse(uuid)
end
private def self.parse(uuid : Slice(UInt8))
uuid
end
private def self.parse(uuid : UUID)
uuid.to_slice
end
private def self.parse(uuid : String)
# parse hexdump string and return Slice(UInt8)
end
def ==(other)
self == UUID.parse(other)
end
def ==(other : UUID)
@bytes == other.to_slice
end
end Last, but not least: constructors with explicit names would be nicer than version numbers. For example following the struct UUID
def self.timestamp_create(timestamp = nil, mac = nil)
# ...
end
def self.dec_create(timestamp = nil, mac = nil)
# ...
end
def self.md5_create(namespace, object)
bytes = digest("md5", namespace, object)
new_rfc4122(bytes, 3)
end
def self.random_create
bytes = SecureRandom.random_bytes(16)
new_rfc4122(bytes, 4)
end
def self.sha1_create(namespace, object)
bytes = digest("sha1", namespace, object)
new_rfc4122(bytes, 5)
end
def self.new_rfc4122(uuid : Slice(UInt8), version : Int32)
bytes[6] = (bytes[6] & 0x0f) | (version << 4)
bytes[8] = (bytes[8] & 0x3f) | 0x80
new bytes
end
private def self.digest(type, namespace, object)
hash = OpenSSL::Digest.new(type)
hash << parse(namespace)
hash << parse(object)
hash
end
end @jhass shall we enforce RFC4122, or shall we allow it to be loose and accept whatever 16 bytes (allowing any variant and version) and provide means to verify? For example: struct UUID
def rfc4122!
raise ArgumentError.new("Invalid UUID variant, expected RFC4122") unless rfc4122?
raise ArgumentError.new("Invalid UUID version: #{version}") unless 1 <= version <= 5
self
end
def rfc4122?
@bytes[8] & 0x80 == 0x80
end
def version
@bytes[6] >> 4
end
end EDIT: I use UUID v4 and v5 a lot, I guess it shows :) |
I'd say enforce it, it's a |
Well, technically a UUID is merely 128bits, and RFC4122 is a UUID variant among others:
The special NIL UUID That being said, I agree that except for the NIL UUID —often used for UUID v3 and v5— the variants are merely for obscure backward compatibility, and the reserved variant or new versions will come any time soon. Even so, one could just override |
I'm not sure about enforcing, it would (unnecessarily?) restrict the usage. Currently, for example, I'm working on a system with custom made variants of UUIDs stored in postgres (which doesn't reject them) where couple of bytes are managed manually. I think better way of dealing with it is to accept any UUID on constructors and add methods like Application logic can restrict to specific, known formats if needed. We can also decorate it with shorter helpers like |
@ysbaddaden it has been struct, not class from the beginning. |
@ysbaddaden regarding copying data, let's not forget that the data is just 2x i64 - it's tiny. I think it's better to keep it close as static bytes in the struct and not having slice pointer which itself will take this space. It also means we have correct value semantics, just as if UUID was an Int. If we want to support memory critical scenarios I think we could expose some static (mostly unsafe) functions like "Mostly unsafe" static functions because when you try to avoid this tiny overhead, then you also want to avoid creating slices. For example if you have millions of records in raw binary with uuid bytes somewhere and you want to filter records that are not in valid v4 for example. Ok, so consolidating comments I'm going to:
|
UUIDs are very simple, it would be great to have a simple and straightforward implementation and API. A Since most use cases conform to RFC4122, there really is no need to extract related methods. We may choose to enforce the variant/version or not,, but this is unrelated. If we do enforce it by default, we may provide a mean to change this behavior, so one can experiment with non conforming variants and versions. If someday one releases a new variant that becomes popular we'll reconsider, but it's unlikely to happen soon. One thing I'd really like is to be able to transparently pass a UUID, String or Slice anywhere I can pass a UUID (in the struct, that is) so it's simple to use, using whatever source of data (slice from PG, string from JSON, UUID from Crystal, ...) Maybe I don't see a need for the version enum, especially with distinct constructors. Yes, version 1 and version 2 UUIDs will be complicated because of the MAC. We may skip them or require to pass a value. Yes, let's keep a StaticArray contained in the struct, instead of a pointer involving the GC. |
@ysbaddaden yes, agree regarding
...but that's just cosmetics. Regarding transparent usage agree as well. Is there some idiomatic way of implementing struct/class as "transparent" with other types? |
I didn't mean transparent with other types, it would be great, but I think it's up to usages to deal with that, otherwise we'd inject methods like |
Adding things to prelude includes them at start time correct? |
@Javajax Good catch, |
I was hoping to bump this PR so I added the commit for removing prelude here #2861 Is there anything else that needs to be updated? |
@mirek are you still working on this UUID branch? |
@wontruefree no, please feel free to pick it up if you want. |
No description provided.