Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improve --version for non-releases #495

Merged
merged 3 commits into from
Sep 19, 2024
Merged

Conversation

Mirko-von-Leipzig
Copy link
Contributor

@Mirko-von-Leipzig Mirko-von-Leipzig commented Sep 12, 2024

This PR improves --version outputs for the node and faucet by including extra information for non-release builds.

--version is now set to git describe --tags --dirty which is no change for formal releases (since tag == package version), but for non-releases indicates the previous tag, commit count since that tag, commit SHA and appends -dirty if it was built with uncommitted changes.

Motivation is given in #493.

Some points of contention:

  1. I added this to the miden-node-utils crate. However, since this is a build.rs that depends on git data, this means the utils crate will be rebuilt after every git change i.e. this will cause a rebuild of utils and all of its dependents. It is probably better if this became its own top level crate - I decided not to do this because I didn't want to publish such a minor crate. But it might make sense to just do it. Alternatively, we can add this build.rs to every binary we create instead.
  2. The tag found is sensitive. As an example, git describe --tags on the next branch currently returns v0.4.0 because the v0.5.0 tag on main has not been merged back yet. Just something to consider for our release "process" :). This differs significantly to the current output which is v0.6.0 (the in-progress package version).

The --dirty flag currently doesn't work unfortunately. I've opened an issue for this in vergen.

We can also consider additional build descriptions. vergen has an illustrative example for verbose version info, which provides information that could be useful for debugging/reproduction purposes:

~/p/r/app λ app -vv
app 0.1.0

Build Timestamp:     2021-02-23T20:14:46.558472672+00:00
rustc Version:       1.52.0-nightly
rustc Channel:       nightly
rustc Host Triple:   x86_64-unknown-linux-gnu
rustc Commit SHA:    3f5aee2d5241139d808f4fdece0026603489afd1
cargo Target Triple: x86_64-unknown-linux-musl
cargo Profile:       release

Closes #493

@Mirko-von-Leipzig Mirko-von-Leipzig changed the title Mirko versionify feat: improve version information for non-releases Sep 12, 2024
@Mirko-von-Leipzig Mirko-von-Leipzig changed the title feat: improve version information for non-releases feat: improve --version for non-releases Sep 12, 2024
@Mirko-von-Leipzig
Copy link
Contributor Author

Can also consider creating a status page for the node. Something to monitor the status of the node and the metadata like block height, node version etc.

Right now we can only "monitor" via the client or submitting a note via the faucet webpage.

@bobbinth
Copy link
Contributor

I'm not fully understanding this yet - but a couple of questions/comments:

The effect of this is that we add more info to the --version command, right? And this info now going be based on Git tags rather than Cargo.toml? Or is it going to combine the two somehow?

Generally, I think it is good to add more details to the version like you have with the vargen example, but I think what gets printed out should be consistent with crates.io version. That is, if Cargo.toml says 0.6.1 - and then we can add more info to it.

I added this to the miden-node-utils crate. However, since this is a build.rs that depends on git data, this means the utils crate will be rebuilt after every git change i.e. this will cause a rebuild of utils and all of its dependents. It is probably better if this became its own top level crate - I decided not to do this because I didn't want to publish such a minor crate. But it might make sense to just do it. Alternatively, we can add this build.rs to every binary we create instead.

I also probably won't publish a separate crate just for this. Given that we have only 2 binaries, I'm wondering if putting this just in build.rs of the binaries is a better option?

Can also consider creating a status page for the node. Something to monitor the status of the node and the metadata like block height, node version etc.

Right now we can only "monitor" via the client or submitting a note via the faucet webpage.

Yes, that would be great. This is probably somewhat related to #145 and maybe #83 and #95. But maybe it is worth creating a separate issue (or maybe something to superseded some of these mentioned issues).

@Mirko-von-Leipzig
Copy link
Contributor Author

I'm not fully understanding this yet - but a couple of questions/comments:

The effect of this is that we add more info to the --version command, right?

Correct, more specifically it uses the output of git describe --tags --dirty which is injected at compile time via the build.rs. The output format looks like this:

# If at a tagged commit, only the tag is output
version=TAG             # e.g. v0.5.0

# At commit with <SHA> hash, N commits after the last tag
version=TAG-N-SHA.      # e.g. v0.5.0-12-ABCDEFG

# In addition if the repo contains uncommitted changes then a `-dirty` suffix is added.
version=TAG-dirty.      # e.g. v0.5.0-dirty if directly on a tagged commit, or 
version=TAG-N-SHA-dirty # e.g. v0.5.0-12-ABCDEFG-dirty.

And this info now going be based on Git tags rather than Cargo.toml? Or is it going to combine the two somehow?

This is based entirely on git. If we publish v0.7.0 but never tag it then the --version output will be based on whatever the last available tag is plus the deltas. Tagging the release is considered best practice by both crates.io and github so a gap here would really indicate an issue on our end. However this can be rectified post-publish/release by just tagging it.

And for the binaries in which this was wrong, we would still have access to the correct commit SHA.

Generally, I think it is good to add more details to the version like you have with the vargen example, but I think what gets printed out should be consistent with crates.io version. That is, if Cargo.toml says 0.6.1 - and then we can add more info to it.

This should always be the case unless we screw up the tags. Basing it on the manifest won't work for non-releases as the manifest is always bumped immediately after the previous release, meaning this would be "incorrect" for the duration of next.

If the potential error between manifest and git tag is an issue, then we can leave things as is and instead add this information as extended metadata for the long version output. Non-releases will have the top level version output, but the correct data will still be present for long version. Though I would argue that it's confusing :)

I also probably won't publish a separate crate just for this. Given that we have only 2 binaries, I'm wondering if putting this just in build.rs of the binaries is a better option?

I think that's probably the correct call. Will update first thing next week.

@Mirko-von-Leipzig
Copy link
Contributor Author

I should also add a tag filter so that only version related tags are considered. e.g. if we ever use tags for other things then those will be ignored.

@bobbinth
Copy link
Contributor

This is based entirely on git. If we publish v0.7.0 but never tag it then the --version output will be based on whatever the last available tag is plus the deltas. Tagging the release is considered best practice by both crates.io and github so a gap here would really indicate an issue on our end. However this can be rectified post-publish/release by just tagging it.

Currently, my process is a bit different - though, it can of course be changes. Specifically, I usually first publish crates to crates.io and then tag the release.

Also, sometimes I publish patches of individual crates without creating tags. For example, if there was a bug fixed in say the block producer crate, I would publish it under v0.6.1 but keep the rest of the crates at v0.6.0 and the tag would remain at v0.0.6.0. Maybe this is not a good practice though.

If the potential error between manifest and git tag is an issue, then we can leave things as is and instead add this information as extended metadata for the long version output. Non-releases will have the top level version output, but the correct data will still be present for long version. Though I would argue that it's confusing :)

To clarify, would the short version show something like:

miden-node 0.6.1

But the long version would show something like:

miden-node 0.6.1

Build Timestamp:     2021-02-23T20:14:46.558472672+00:00
rustc Version:       1.52.0-nightly
rustc Channel:       nightly
rustc Host Triple:   x86_64-unknown-linux-gnu
rustc Commit SHA:    3f5aee2d5241139d808f4fdece0026603489afd1
cargo Target Triple: x86_64-unknown-linux-musl
cargo Profile:       release

If so, I don't think this is too confusing. But would it satisfy the original goal of this PR?

@Mirko-von-Leipzig
Copy link
Contributor Author

Mirko-von-Leipzig commented Sep 16, 2024

Currently, my process is a bit different - though, it can of course be changes. Specifically, I usually first publish crates to crates.io and then tag the release.

Tagging afterwards is fine - crates.io doesn't actually store any binaries; so each subsequent install will checkout the repo and acquire the tag. There would be a small window prior to tagging where users might get this without the tag.

Also, sometimes I publish patches of individual crates without creating tags. For example, if there was a bug fixed in say the block producer crate, I would publish it under v0.6.1 but keep the rest of the crates at v0.6.0 and the tag would remain at v0.0.6.0. Maybe this is not a good practice though.

This version output only applies to the executables. So long as the tags match the intended versions of the node/faucet we would be okay. There are other options - one can also override these env variables during the build process if required.

I've reverted this change for now; I've only added additional metadata to the long version. The actual version used is now the crate/manifest version. The metadata will give us the additional data we need to accurately debug. We just need to be aware that a user claiming x.y.z version might be referring either to the release version, or the in-progress version.

To clarify, would the short version show something like:

miden-node 0.6.1

But the long version would show something like:

miden-node 0.6.1

Build Timestamp:     2021-02-23T20:14:46.558472672+00:00
rustc Version:       1.52.0-nightly
rustc Channel:       nightly
rustc Host Triple:   x86_64-unknown-linux-gnu
rustc Commit SHA:    3f5aee2d5241139d808f4fdece0026603489afd1
cargo Target Triple: x86_64-unknown-linux-musl
cargo Profile:       release

If so, I don't think this is too confusing. But would it satisfy the original goal of this PR?

I've updated the PR so that the short version output is:

$ miden-node -V

miden-node 0.6.0

and the long

$ miden-node --version

miden-node 0.6.0

SHA:          8d9602c-dirty
branch:       mirko-versionify
features:     testing
rust version: 1.80.1
target arch:  aarch64-apple-darwin
host arch:    aarch64-apple-darwin
opt-level:    3
debug:        false

Feel free to suggest a better formatting; I tried with prefixes like git.XXX and build.YYY or a more toml-like format but in the end I don't care that much :).

I've only implemented this for the miden-node binary for now; I'll copy it to the faucet once we have consensus.

Something else I was hoping to determine that it was built with --locked but unfortunately this isn't possible.

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The format looks great! Thank you! I left one small comment inline, and I think once we propagate this to the faucet as well, we are good to merge.

bin/node/src/main.rs Outdated Show resolved Hide resolved
@bobbinth
Copy link
Contributor

I guess one thing to confirm: how would it work when the crate is pulled from crates.io? For example, how would SHA and branch fields be set in this case?

@Mirko-von-Leipzig
Copy link
Contributor Author

I guess one thing to confirm: how would it work when the crate is pulled from crates.io? For example, how would SHA and branch fields be set in this case?

I checked using cargo publish --dry-run ... and the git data is indeed just gone. For the current implementation it prints:

/miden-node --version`
miden-node 0.6.0

SHA:          VERGEN_IDEMPOTENT_OUTPUT
branch:       VERGEN_IDEMPOTENT_OUTPUT
features:     None
rust version: 1.81.0
target arch:  aarch64-apple-darwin
host arch:    aarch64-apple-darwin
opt-level:    0
debug:        true

which isn't ideal but I believe this should let us determine whether its a formal release/install at least? There are of course hacks around this if people want to abuse it but that is what it is.

@bobbinth
Copy link
Contributor

which isn't ideal but I believe this should let us determine whether its a formal release/install at least? There are of course hacks around this if people want to abuse it but that is what it is.

Should we detect this and not include the missing fields in the output? Would still make it clear if this is a release/install, but also would avoid exposing implementation details (i.e., that we use vergen).

@Mirko-von-Leipzig
Copy link
Contributor Author

Mirko-von-Leipzig commented Sep 17, 2024

which isn't ideal but I believe this should let us determine whether its a formal release/install at least? There are of course hacks around this if people want to abuse it but that is what it is.

Should we detect this and not include the missing fields in the output? Would still make it clear if this is a release/install, but also would avoid exposing implementation details (i.e., that we use vergen).

We can get at least the SHA from a published crate. cargo publish adds a .cargo_vcs_info.json file (except when using --allow-dirty), which we can detect and use instead:

{
  "git": {
    "sha1": "9d48046e9654d93a86212e77d6c92f14c95de44b"
  },
  "path_in_vcs": "bin/node"
}

I'm also taking a quick look into alternatives like built but maybe we don't care about this so much so just replacing the invalid string is enough.

@Mirko-von-Leipzig
Copy link
Contributor Author

I've added a check for the published state (so we still get sha data) and also override the default vergen output in case it somehow slips through.

I've copy pasted this to the faucet instead of the utils crate for the reason given in this comment.

I've tested that it works for cargo publish locally, though I only tested miden-node and not the faucet. Testing is a bit of a pita because publish/package requires all dependencies to have a version specified which of course we don't (all point to next for miden-base):

error: all dependencies must have a version specified when packaging.

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you! I left a few more comments inline.

crates/block-producer/src/skele/mod.rs Outdated Show resolved Hide resolved
bin/node/Cargo.toml Show resolved Hide resolved
bin/node/build.rs Outdated Show resolved Hide resolved
bin/node/src/main.rs Outdated Show resolved Hide resolved
bin/faucet/build.rs Outdated Show resolved Hide resolved
@Mirko-von-Leipzig
Copy link
Contributor Author

Mirko-von-Leipzig commented Sep 18, 2024

I've moved everything into the utils crate by:

  • Feature gating the build script stuff behind vergen. We can remove the feature if we feel okay with leaking the build-script deps into normal builds 🤷. With dependency resolution this doesn't matter much in the end.
  • Using option_env! which then defaults to an empty string if the environment variables aren't present. This means this function compiles even if the build script hasn't been run.

-- edit --

Okay this doesn't work because utils is compiled first - which means the environment variables aren't present yet, so they're always empty. Back to the drawing board.

-- 2nd edit --

Using your suggestion of a struct + Display implementation so the end caller can inject the variables.

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you! I left just one optional nit inline.

Comment on lines 31 to 33
[features]
# Enables depedencies intended for build script generation of version metadata.
vergen = ["dep:vergen", "dep:vergen-gitcl"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we usually put features section above the dependencies section.

bin/node/Cargo.toml Show resolved Hide resolved
@Mirko-von-Leipzig Mirko-von-Leipzig merged commit 790751c into next Sep 19, 2024
8 checks passed
@Mirko-von-Leipzig Mirko-von-Leipzig deleted the mirko-versionify branch September 19, 2024 07:47
@bobbinth bobbinth mentioned this pull request Sep 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants