-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove hashes from library file names? #10188
Comments
A lot of pain? :) How would these get used? My understanding was that they would disambiguate multiple crates so that two versions of the same library could be loaded at once (if I depended on X version 1 and some dependency of mine depended on X version 2, they could both have what they wanted, instead of beign out of luck unless all dependency requirements matched). I'm not sure how one would use this though, or what this buys over a version number. |
I'm willing to jump in and come up with a patch that does this (hopefully without removing too much code other than whatever includes the hash in the name, leaving the code for generating said metadata in place - still could be useful in the future). Anyone else want this? If not, I'll take a look into it, see if it's something I can do. As simple as it sounds right now, I haven't read enough of the compiler code to be sure... Of course, there's not even any assurance this is wanted, but I feel that just removing it from the library file name could be a good start point. I've definitely had a couple times I've tried to use libraries and had conflicts, and if none of it is used, removal of that metadata sounds good, even if it does end up temporary, |
I was wondering where this came from.. Why does it use a hash? Why can't it just use the actual text? That would allow the v1 + v2 scenario and at least make sense. i.e. |
@mletterle Graydon had a sophisticated plan for how Rust should handle versioning, and the hashes are part of that. They're meant to encode dependencies and not just versions. However, now that Graydon has stepped down from the team, that plan lacks a champion, and we may have to punt on it. |
Did that plan get documented? It's hard to evaluate without knowing what the plan was and how the hash relates. |
IMHO, versioning is important, but not so important that it should cause the difficulties that it currently does, with no appreciable benefit. Ending up with multiple files containing essentially the same library just from a small change is often a bit of a pain, and I'd imagine hashes in names could cause difficulties in linking from C code. (Though I haven't tried that, so I'm not sure) This all lacks perspective on the versioning plan graydon had. Filenames including info about dependencies, versions, etc. could offer interesting benefits. Until that plan is implemented or at least confirmed, however, I feel working with libraries could be simplified somewhat. |
The only description I'm aware of for how our version hashing operates is here: https://github.com/mozilla/rust/blob/master/src/librustc/back/link.rs#L462. It may not perfectly reflect reality. |
There is this too: https://github.com/mozilla/rust/wiki/Doc-crate-hashes (which is based on an IRC conversation with graydon according to the history). |
From my understanding of the wiki @huonw pointed out and the comment in the code, the only real reason for a hash in the crate name is for two same-named crates to actually be different libraries. I could create my own This seems a little nice, but it may not be worth having to explain to everyone why we have this weird hash in the crates (and this makes writing scripts working on crates incredibly difficult). It's worth considering this use case though. |
That use case actually makes a lot of sense, and solves a real problem. In that case, the hash can be absolutely predictable, since you can just use the linkid UUID for the hash and require those be unique. Unfortunately using a UUID here would greatly expand the size of the hash value. If the hash is predictable, all the problems of using it pretty much disappear. The problem right now is that the output of compilation is effectively a randomly named file with a known prefix. It's easy enough to grep the link args out of a file to determine the filename needed. |
I can see the benefits of having such a hash, I just don't understand the need for it to be in the filename. Can't it be placed in the binary itself? |
@mletterle it does need to go in the library symbols, but it's in the filename so they can coexist next to each other in the same directory on the filesystem. |
I had a hunch it might be something like that, but I don't think it's worth the pain personally. If a developer actually needs two copies of the same library hanging around, they can go through the trouble to uniquely rename them, no? Or maybe I'm missing something fundamental here... |
Reading the wiki, it looks like the only purpose of the hash is to resolve soname clashes. This makes sense, also taking into account the fact that for (far future) system-wide libraries, the soname space is shared also with non-rust libraries. As the hash is expected to be static (like the library name and in opposition to the library version/interface signature), I'd like to adopt @metajack proposal: re-use the first group of the crate UUID as the hash. This gives us 8 random alphadigits with a low probability of collision, which are statically defined and known to rustpkg, as well as short and easily greppable. Eg, this will result in a |
Yes, with using the UUID as the hash, you can just pick a new uuid if you find a collision that is particularly bad. Probably needs a bit more thinking but it seems like this will salvage the main benefits while also fixing all the tooling issues. |
If we simply stopped hashing the upstream dependencies into the hash, then they would be stable - it would just be the link metadata, including any potential uuid. |
Need to decide something for 1.0. P-backcompat-lang. |
The link metadata is supposed to be disappearing (see #8523), so using a UUID from that is probably not the best. Here's what I propose:
This means:
What do you think? |
This replaces the link meta attributes with a pkgid attribute and uses a hash of this as the crate hash. This makes the crate hash computable by things other than the Rust compiler. It also switches the hash function ot SHA1 since that is much more likely to be available in shell, Python, etc than SipHash. Fixes rust-lang#10188, rust-lang#8523.
This replaces the link meta attributes with a pkgid attribute and uses a hash of this as the crate hash. This makes the crate hash computable by things other than the Rust compiler. It also switches the hash function ot SHA1 since that is much more likely to be available in shell, Python, etc than SipHash. Fixes #10188, #8523.
…fate expl_impl_clone_on_copy: ignore packed structs with type/const params changelog: [`expl_impl_clone_on_copy`]: Ignore `#[repr(packed)]` structs with type or const paramaters Fixes rust-lang#10188 A more involved solution that checks if any bound on the trait impl aren't present on the struct definition would be ideal, but I couldn't see a nice way to go about that
These hashes are the version number plus some additional metadata, mostly the #[link] metadata. Right now we don't have any actual use cases where distinguishing libraries by this metadata matters, nor do we have any mechanism for changing the link metadata for different compilation configurations. I don't know what this naming convention is buying us.
Nominating.
The text was updated successfully, but these errors were encountered: