Skip to content

Add basic UUID support to libextra #8053

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

gavinb
Copy link
Contributor

@gavinb gavinb commented Jul 26, 2013

Addresses part of #7104

This module adds the ability to generate UUIDs (on all Rust-supported platforms).

I reviewed the existing UUID support in libraries for a range of languages; Go, D, C#, Java and Boost++. The features were all very similar, and this patch essentially covers the union. The implmentation is quite straightforward, and uses the underlying rng support which is assumed to be sufficiently strong for this purpose.

This patch is not complete, however I have put this up for review to gather feedback before finalising. It has tests for most features and documentation for most functions.

Outstanding issues:

  • Only generates V4 (Random) UUIDs. Do we want to support the SHA-1 hash based flavour as well?
  • Is it worth having the field-based struct public as well as the byte array?
  • Formatting the string with '-' between groups not done yet.
  • Parsing full string not done as there appears to be no regexp support yet. I can write a simple manual parser for now?
  • D has a generator as well. This would be easy to add. However, given the simple interface for creating a new one, and the presence of the macro, is this useful?
  • Is it worth having a separate UUID trait and specific implementation? Or should it just have a struct+impl with the same name? Currently it feels weird to have the trait (which can't be named UUID so as to conflict) a separate thing.
  • Should the macro be visible at the top level scope?

As this is a first attempt, some code may not be idiomatic. Please comment below...

Thanks for all feedback!

@alexcrichton
Copy link
Member

This is awesome! I won't comment too closely on the code yet because you say it's still a work in progress, but here's some high-level comments:

  • First off, great work!
  • I personally think that this would belong better in libextra rather than libstd. Regardless, one of the core devs should probably weigh in on which one.
  • libstd/extra have #[deny(missing_doc)] by default because we want the standard of code to be pretty high. Would you mind adding documentation for everything?
  • This would be an excellent candidate for an implementation of the Rand trait
  • You made a trait IUUID, but you didn't implement it anywhere? If you're not implementing the trait on multiple objects, you can probably just remove it altogether
  • Right now macro import/export don't work in general, and I think the idea was for uuid!() to create a compile-time macro anyway. This can probably be removed for inclusion at a later date (if it's still necessary)
  • In the methods about the version/variant info, could you link directly to the wikipedia or equivalent docs? I had no idea that was even a thing!
  • The constructor methods should be on UUID itself, those are the conventions nowadays. Normally there's a Type::new() -> Type method and then you can add relevant others as Type::new_with_fancy_options(...) -> ...

Overall, nice work!

Edit: also if you could rebase everything into one patch, that's probably the best route to go (as opposed to having logs of smaller commits, perhaps not all of which build)

/// Creates a new random UUID
pub fn new_v4() -> UUID {
let mut rng = rand::rng();
let ub = rng.gen_bytes(16);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be rand::task_rng().gen_bytes(16), which saves creating a whole new RNG & reading from /dev/urandom etc.

@brson
Copy link
Contributor

brson commented Jul 26, 2013

Does this need to be in std? If not, I prefer it in extra.

@bstrie
Copy link
Contributor

bstrie commented Jul 26, 2013

Seems like @graydon might be interested in this, as the filer of #7104.

@omasanori
Copy link
Contributor

I'm not a core developer, though:

  • test_from_fields (and probably current implementation) lacks consideration about endianness.
  • Adding ISO/IEC 9834-8 (freely available at the ISO website) to the References section would be helpful.

@omasanori
Copy link
Contributor

well, ISO/IEC 9834-8 (= ITU-T X.667) and RFC 4122 are technically equivalent, so it's good but not necessary to have a reference to ISO/IEC 9834-8.

@gavinb
Copy link
Contributor Author

gavinb commented Jul 28, 2013

Thanks, all. I will update as per the comments and make a fresh PR with all suggested changes and squashed into a single commit.

@huonw Thanks, have changed it to use task_rng().

@brson Have moved it into extra.

@alexcrichton Thanks for the feedback. See below...

I personally think that this would belong better in libextra rather than libstd.

Yes, moved as above.

libstd/extra have #[deny(missing_doc)] by default because we want the standard of code to be pretty high. Would you mind adding documentation for everything?

Yes, that was just temporary. :) All documentation added and missing_doc attr removed.

This would be an excellent candidate for an implementation of the Rand trait

Added to return a v4 UUID.

You made a trait IUUID, but you didn't implement it anywhere? If you're not implementing the trait on multiple objects, you can probably just remove it altogether

Right, makes sense to remove.

Right now macro import/export don't work in general, and I think the idea was for uuid!() to create a compile-time macro anyway. This can probably be removed for inclusion at a later date (if it's still necessary)

Ok, removed macro.

In the methods about the version/variant info, could you link directly to the wikipedia or equivalent docs? I had no idea that was even a thing!

Ok, added links.

The constructor methods should be on UUID itself, those are the conventions nowadays. Normally there's a Type::new() -> Type method and then you can add relevant others as Type::new_with_fancy_options(...) -> ...

Moved ctors into type.

also if you could rebase everything into one patch, that's probably the best route to go (as opposed to having logs of smaller commits, perhaps not all of which build)

Certainly; I will open a fresh PR with the new squashed and update changes.

@mitsuhiko
Copy link
Contributor

All the UUID implementations I have seen in the past are entirely immutable and modification returns a new UUID (eg: setting a version). That has the nice advantage that putting those things into a hashmap is not confusing.

What's the policy in Rust on these kinds of things?

@gavinb
Copy link
Contributor Author

gavinb commented Jul 28, 2013

@mitsuhiko This implementation is also immutable.

But it's not really anything specific to Rust. The version and variant determine the meaning of the bytes as well as how they are generated. So semantically it doesn't make sense to change either, as that would invalidate the contents.

@brson
Copy link
Contributor

brson commented Jul 31, 2013

It sounds like most of the comments here have been addressed. I'll be sure to give this a close look tomorrow. Sorry for the delay.

@brson
Copy link
Contributor

brson commented Jul 31, 2013

OK, this looks great, but does need some updates before merging. I agree with all of @alexcrichton's comments, so please make sure when you update the pull to make sure they are addressed (I don't see any of those changes reflected here yet).

Some requests:

  • Regarding hash-based ID's I suggest not adding them until there's an obvious demand.
  • Since there's no way to get a UUIDFields, let's not make it public for now.
  • There's only one way to convert to a string to_simple_string. Do you have firm plans to add other string conversions? If not, let's remove that method in favor of ToStr.
  • Similarly to the constructor issue, the conversions from_bytes and from_fields should be methods called from_etc.
  • The UUID acronym in all cases should be typed Uuid. This is our current, semi-official style for acronyms.

Also please do squash these commits into a single one so there aren't intermediate commits that don't build.

Thanks!

@gavinb
Copy link
Contributor Author

gavinb commented Aug 7, 2013

Thanks, for the feedback @brson. I have addressed all the above suggestions, just not pushed my changes yet.

  • Hash-based IDs: agreed, easy to add later
  • Made UuidFields non-public
  • Made a single to_str impl
  • Constructors are from_bytes, from_fields and from_str as discussed
  • Changed UUID to Uuid, and also made the enums use CamelCase

I will squash all these together. I assume that since I have already pushed to this branch and this PR, I need to open a new one?

@emberian
Copy link
Member

emberian commented Aug 7, 2013

@gavinb nope, just force push to the branch

@gavinb
Copy link
Contributor Author

gavinb commented Aug 9, 2013

I've updated the branch having incorporated all the feedback above. I also added some more unit tests, so coverage should be pretty good. Docs are complete, and the endian issue has been solved. Please let me know if there is anything missing, or if there is something not idiomatic to Rust that could be improved. Thanks!

@gavinb
Copy link
Contributor Author

gavinb commented Aug 15, 2013

So for some reason, the commit 53f5348 is showing up as having been added 22 days ago, but I just pushed an update that addresses all the review feedback and this is the right hash. Just FYI...

To create a new random (V4) UUID and print it out in hexadecimal form:

~~~ {.rust}
use std::uuid;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To work, I think this needs to be use std::uuid::Uuid;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably extra::... even.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Actually, it's worse than that - it should be extra, not std! Fixed, and tested outside tree.

@alexcrichton
Copy link
Member

@gavinb, nice job! This code looks well written and should certainly serve as a good example for what rust code can do. Apart from the few minor nits, I think this is ready to go!

}

// Make sure all chars are either hex digits or hyphen
let valid_digit_count = us.iter().take_while(|&c|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, and the following if, could probably written as a single loop:

for (i, c) in us.iter().enumerate() {
    match c {
        '0'..'9' | 'A' .. 'F' | 'a' .. 'f' | '-' => {}
        _ => return Err(fmt!(...))
    }
}

Also, the as char cast on the byte below is wrong. It is converting the byte value to the character with that unicode codepoint, not that exact byte pattern, and so if this cuts a utf8 character, then it will print the wrong thing. It should probably print the unicode character (either using .char_at, or printing c directly in the loop above).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated as per your suggestion.

@gavinb
Copy link
Contributor Author

gavinb commented Aug 15, 2013

@alexcrichton Thanks for the feedback. Can you just cast an eye over the error reporting stuff in the parser? A slip of the fingers and it didn't get committed last time. There's tests too.

println(fmt!("test error format: %s", ErrorInvalidLength(31)));
println(fmt!("test error format: %s", ErrorInvalidCharacter('Q', 16)));
println(fmt!("test error format: %s", ErrorInvalidGroups(9)));
println(fmt!("test error format: %s", ErrorInvalidGroupLength(4, 5, 6)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now tests printing to stdout actually can cause problems with what the test runner prints, so it's best to just do some sort of string matching here or not at all (as in just test for a subset of the message).

I'm not sure that testing this is 100% necessary, up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I'll just leave them out for now. I have verified that they are printed correctly.

- generate random UUIDs
- convert to and from strings and bytes
- parse common string formats
- implements Zero, Clone, FromStr, ToStr, Eq, TotalEq and Rand
- unit tests and documentation
- parsing error codes and strings
- incorporate feedback from PR review
@gavinb
Copy link
Contributor Author

gavinb commented Aug 16, 2013

Oh, snap! There were a few lines >100 cols which borked the build. Sorry, fixed up now in latest update.

@alexcrichton Not at all; it's been great, getting feedback and learning. Thanks for your patience. :)

bors added a commit that referenced this pull request Aug 17, 2013
Addresses part of #7104

This module adds the ability to generate UUIDs (on all Rust-supported platforms).

I reviewed the existing UUID support in libraries for a range of languages; Go, D, C#, Java and Boost++. The features were all very similar, and this patch essentially covers the union.  The implmentation is quite straightforward, and uses the underlying rng support which is assumed to be sufficiently strong for this purpose.

This patch is not complete, however I have put this up for review to gather feedback before finalising. It has tests for most features and documentation for most functions.

Outstanding issues:

* Only generates V4 (Random) UUIDs. Do we want to support the SHA-1 hash based flavour as well?
* Is it worth having the field-based struct public as well as the byte array?
* Formatting the string with '-' between groups not done yet.
* Parsing full string not done as there appears to be no regexp support yet. I can write a simple manual parser for now?
* D has a generator as well. This would be easy to add. However, given the simple interface for creating a new one, and the presence of the macro, is this useful?
* Is it worth having a separate UUID trait and specific implementation? Or should it just have a struct+impl with the same name? Currently it feels weird to have the trait (which can't be named UUID so as to conflict) a separate thing.
* Should the macro be visible at the top level scope?

As this is a first attempt, some code may not be idiomatic. Please comment below...

Thanks for all feedback!
@bors bors closed this Aug 17, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants