-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure downloads are correct via checksums #6773
Conversation
Needs updating for just-bumped Arpack version |
@tkelman thanks! |
Windows is using a different version of OpenBlas, needs another checksum entry?
|
I really like how you store the hashes in a simple and flexible format. It is an advantage that one can add hashes for new releases, before deciding to actually bump the version for everybody. I still wonder if keeping everything in We should probably have some documentation somewhere. This is likely to cause failures, so it would be nice if people unfamiliar with |
Alright. I think I've managed to make this a little more human-friendly, added in OpenBLAS 0.2.9-rc1, and addressed all of the comments so far. Note that we only interrupt the build if we have a ground truth hash stored, and it doesn't match the downloaded file. If we don't have a hash stored, then the compilation will just continue with a little warning. |
@@ -0,0 +1,98 @@ | |||
# This is the checksums file, lines starting with "#" are ignored as comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file stores checksums for the automatically downloaded dependencies, in an attempt to ensure that the correct file has been downloaded and that it has not been tampered with. deps/jlchecksum
contains the code for the actual verification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
I would propose for discussion another format: using a directory structure. Something like this:
This way there's nothing to parse – just read and chomp the appropriate file. If no hash file exists for a given file/hash pair, we could automatically generate it when building on a system that has the desired hashing |
Ah, that does seem like a better way to do things. I'll do this later, when I have some spare time again. :) |
... program and has downloaded the file, checked the other hashes, and built it's dependent targets successfully (this would prevent accidentally saving hashes for corrupted downloads). That way new files get hashes generated automatically without additional steps. |
I don't think automatically generating hashes is a good idea. I think it'd be a good idea to have a script that can spit out the |
Preferably the hashes should come from the project that released the dependency. Automatically generating the files would be pretty convenient though, and not all dependencies publish hashes. |
Maybe. I figure that the first time a dependency gets downloaded and built, we should checksum it. If that download was spoofed, they get a wrong checksum and others who get the real version won't be able to build it and this will quickly be discovered. Realistically, if you download a new version of some dependency and build it and it seems to work, are you going to do anything further to verify it? The only thing that's actually safer would be to require starting with some manually provided checksum from the download source or something like that. But is that really more likely to be secure? If someone is intercepting your download request, aren't they likely to also make the checksum match? Automatically hashing the first version means that only the first download is vulnerable to attack, which is a very small surface as compared to every single download ever being vulnerable to attack. It seems to me that this protection isn't really about preventing someone from spoofing all downloads of a dependency but about preventing them from spoofing some of the downloads without being detected. Automatically hashing the first version still accomplishes that. |
@StefanKarpinski alright, you convinced me. ;) |
Have considered making the error messages more user-friendly? For now it's OK, but in the long-term they will most commonly be printed to users with network issues (proxy...), and they won't understand what the checksum problem means. This was asked on the R mailing list recently. Something like: "Downloaded file checksum is incorrect. Check your network connection, especially network proxies." would be nice (realistically, while I agree an improved security is very important, in practice it won't be that common -- because of the presence of the check). |
In case anyone was curious, this branch works fine for from-scratch builds on Windows, both MSYS2 and Cygwin-to-MinGW cross. Also works fine in RHEL5 which is old enough that it doesn't have (P.S anyone want to return the testing favor and let us know if JuliaMath/openlibm#56 works on Mac and 32 bit Linux? 😉 would be nice to get the Windows build passing tests again, it's been nearly a week ) |
Alright. Anything else to improve on this, or does this satisfy everything? |
How does it work with the |
It should "just work", since it does all the checking in the |
Nice! |
Hah, |
Dependencies will be checked at extraction-time, and have their checksums stored in deps/checksums/. The deps/jlchecksum command tests a given filename against the checksums stored in that file, supporting MD5 and SHA512 checksums for all recorded files. SHA512 is preferred, with MD5 available as a fallback in the case that `shasum` is not available. If the hashes do not match, an error is thrown and compilation is halted. If a file is asked for and no hash has been given for that file, the hash is generated for both SHA512 and MD5, and stored
Fine with me with the improved error message! |
@StefanKarpinski are there any other comments on this, or am I free to merge? |
No, this looks awesome. |
Ensure downloads are correct via checksums
Bam. Another bug bites the dust. |
This commit adds in SHA512 and MD5 checksums for all dependencies I could find. I'd appreciate some testing, (easiest way is to mess up the checksums put in
deps/deps_checksums.{sha512,md5}
and then rebuild the dependency whose checksum you altered) to ensure both that normal compilation continues to work, as well as altered files don't allow compilation to continue.I tested on OSX and Linux, but didn't test on Windows or Windows cross-compile.
This PR is in reference to #6558