Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Print a deterministic length of commit hash in --version #6258

Merged
merged 1 commit into from
Aug 4, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 7 additions & 5 deletions build.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ fn main() {
// (git not installed or if this is not a git repository) just return an empty string.
fn commit_info() -> String {
match (channel(), commit_hash(), commit_date()) {
(channel, Some(hash), Some(date)) => format!("{} ({} {})", channel, hash.trim_end(), date),
(channel, Some(hash), Some(date)) => format!("{} ({} {})", channel, hash, date),
_ => String::new(),
}
}
Expand All @@ -39,11 +39,13 @@ fn channel() -> String {
}

fn commit_hash() -> Option<String> {
Command::new("git")
.args(["rev-parse", "--short", "HEAD"])
let output = Command::new("git")
.args(["rev-parse", "HEAD"])
.output()
.ok()
.and_then(|r| String::from_utf8(r.stdout).ok())
.ok()?;
let mut stdout = String::from_utf8(output.stdout).ok()?;
stdout.truncate(10);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we're truncating to 10?

I checked the hash length for some other tools and they're either 8-9 characters long:

rustc 1.81.0-nightly (6b0f4b5ec 2024-06-24)
cargo 1.81.0-nightly (bc89bffa5 2024-06-22)
clippy 0.1.81 (6b0f4b5e 2024-06-24)
rustup 1.27.1 (54dd3d00f 2024-04-24)
rust-analyzer 1.81.0-nightly (6b0f4b5e 2024-06-24)

Also, I think some of the non-determinism came from using the --short option with git's rev-parse subcommand. Here are the docs:

       --short[=length]
           Same as --verify but shortens the object name to a unique prefix with at least length
           characters. The minimum length is 4, the default is the effective value of the
           core.abbrev configuration variable (see git-config(1))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rust-lang/rust repo (which rustfmt releases are published from, and therefore the one whose commit hashes matter) currently has about 262k commits on master and 398k commits reachable from PR branches. Using 800k in the estimations to account for some growth, the formula for probability of any colliding H-digit hash over N years is:

$$ 1 - \left( 1 - \frac{800{\small,}000}{16^{H}} \right)^{\Huge \raise -1ex {429 N}} $$

where 429 is the estimated number of releases per year, including 365 nightlies, 8 stables, and 7 betas per stable release. For 8-digit hash, there is 33% probability of a collision in 5 years. For 9-digit, 2.5%, and 10-digit, 0.16%.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the detailed and thoughtful explanation!

Some(stdout)
}

fn commit_date() -> Option<String> {
Expand Down
Loading