Automate the release guide #2156
Conversation
Force-pushed 98e5288 to 67f41c7
```shell
Print usage information.

Examples:
  $(basename "$0") --version 1.0.0-incubating-rc1
```
I'd prefer to split the version and RC numbers and make those separate parameters.
`-incubating` must be present, so it can be a constant.
I considered pulling that part from your PR, indeed, but decided not to reuse it so that the code is overall simpler. The current approach is less smart, as it pushes the responsibility of adding `-incubating` onto the release manager and enforces it via regex validation.
The alternative would be:
- Have a `--version x.y.z` parameter that accepts `x.y.z` as well as `x.y.z-incubating`
- Have a `--rc N` parameter
- Have logic that detects whether `-incubating` is already present and adds it, if missing

Overall, I would rather have a single `--version x.y.z-incubating-rcN` parameter validated by a regex. Simpler code.
The only downside, really, is that when we graduate, we will have to update multiple shell scripts to remove the -incubating part. I feel that it is acceptable.
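For illustration, the single-parameter approach described above could look roughly like this; the function and variable names are illustrative, not the PR's actual code:

```shell
# Sketch: one regex both validates the x.y.z-incubating-rcN format and
# splits its components, so no separate --rc parameter is needed.
validate_version() {
  local v="$1"
  if [[ "$v" =~ ^([0-9]+)\.([0-9]+)\.([0-9]+)-incubating-rc([0-9]+)$ ]]; then
    major="${BASH_REMATCH[1]}"
    minor="${BASH_REMATCH[2]}"
    patch="${BASH_REMATCH[3]}"
    rc_number="${BASH_REMATCH[4]}"
    return 0
  fi
  echo "ERROR: expected x.y.z-incubating-rcN, got: $v" >&2
  return 1
}
```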
Force-pushed c634041 to f6d67e3
Thanks @pingtimeout, will take a look today!
flyrain left a comment:
Thanks @pingtimeout for doing this. LGTM overall with a few comments.
```shell
print_info "Updating CHANGELOG.md"
exec_process cd "${releasey_dir}/.."
exec_process ./gradlew patchChangelog
```
This script exits at line 139 in case of RC2+. Should we re-run the changelog for any RC2+?
That's a good question. I am not 100% sure about the answer, to be honest. The release guide says that the changelog should be manually updated:
- Manually add an `-rcN` suffix to the previously generated versioned CHANGELOG section.
- Rerun the `patchChangelog` command.
- Manually remove RC sections from the CHANGELOG.
- Submit a PR to propagate CHANGELOG updates from the release branch to main.
And IIRC there are still some discussions on changelog updates, whether those updates should be merged to main, etc...
This is the reason why the script exits for RC>1.
My goal with this PR is to have a 1:1 automation of what the release guide says, so that we can start having some semi-automated (albeit imperfect) release process for the August release. But I suspect we will iterate on this as we learn more about our practices.
TL;DR CHANGELOG.md has to be (re)generated for every version/RC/branch independently.
The content of CHANGELOG.md is maintained manually with the PRs being merged; it's the contributors/reviewers who update CHANGELOG.md. So changes to that file already exist and nothing has to be "merged back" — and in fact cannot be, as the contents on "all the branches" can be different.
Just curious: if nothing has to be "merged back", why do we need this command?

```shell
exec_process ./gradlew patchChangelog
```
If there are CHANGELOG.md changes on main while the release is in progress, then automatically handling the release sub-sections in that file will likely lead to merge conflicts.
If we want to completely automate this process, I'd suggest to identify CHANGELOG entries with PR numbers and make custom tooling for generating release sub-sections. The tool can use the PR number IDs for reconciling main and release branches... not sure if there's appetite for this 😅 My original idea was to leverage the existing gradle plugin for that, but it apparently does not fit the multi-branch use case.
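To make the reconciliation idea concrete, a hypothetical helper could index changelog entries by PR number; the entry format (each entry mentions its PR as `#NNNN`) is an assumption, not the project's actual convention:

```shell
# Hypothetical sketch: list the unique PR numbers referenced in a changelog,
# read from stdin, so entries on main and a release branch could be
# reconciled by ID rather than by line position.
extract_pr_ids() {
  grep -oE '#[0-9]+' | sort -u
}
```

Usage: `extract_pr_ids < CHANGELOG.md`.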
```shell
# If we're creating RC2 or later, verify previous RC tag exists
if [[ ${rc_number} -gt 1 ]]; then
  previous_rc=$((rc_number - 1))
  previous_tag="apache-polaris-${version%-rc*}-rc${previous_rc}"

  print_info "Checking for previous RC tag: ${previous_tag}"
  if ! git tag -l "${previous_tag}" | grep -q "^${previous_tag}$"; then
    print_error "Previous RC tag ${previous_tag} does not exist. Cannot create RC${rc_number} without RC${previous_rc}."
    exit 1
  fi
  print_info "Previous RC tag ${previous_tag} found"
fi
```
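For clarity, the `${version%-rc*}` expansion in that snippet strips the shortest trailing `-rc*` match, so the previous tag is derived like this (values illustrative):

```shell
# Illustration of the tag derivation: ${version%-rc*} removes the shortest
# trailing "-rc*" suffix, leaving the bare release version string.
version="1.0.0-incubating-rc3"
rc_number=3
previous_rc=$((rc_number - 1))
previous_tag="apache-polaris-${version%-rc*}-rc${previous_rc}"
echo "${previous_tag}"   # apache-polaris-1.0.0-incubating-rc2
```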
minor: Do we have to care whether the previous tag exists or not? What if the previous RC tag has already been deleted manually?
I don't think this should be a concern. My rationale is that no RC tag should ever be deleted. I don't think it is allowed per ASF principles, as it means a release candidate, for which a vote was called, could disappear.
Wdyt?
I didn’t see anything in the ASF principles that explicitly calls that out—let me know if you have a link. If it’s not covered in ASF policy, the main reason to keep RC tags would just be for historical reference. That said, folks might still delete them to keep the tag list clean.
In any case, this isn’t a blocker. My thinking is that if the behavior isn’t defined, we don’t need previous tag checks.
Deleting tags is not good practice and should really never happen; it breaks things.
Tags are referenced from released artifacts/POMs; that alone actually forbids deleting them.
Ack that this is not a blocker 👍
Let's rediscuss this when JB is back from holidays. I think that's a fair question.
```shell
print_info "Updating Helm index..."
exec_process cd "${dist_dev_dir}/helm-chart"
exec_process helm repo index .
exec_process cd "${releasey_dir}/.."
```
We will need to commit the change after Helm indexing; otherwise, the index file is not updated.
Good point, let me try a few things locally and update the script.
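One possible shape of the fix, assuming `${dist_dev_dir}` is an SVN checkout of dist.apache.org and that `exec_process` echoes commands in dry-run mode as in the PR's scripts — both are assumptions, so this is a sketch, not the final change:

```shell
# Stand-in for the PR's exec_process helper (dry-run: print, don't run).
exec_process() { echo "DRY-RUN: $*"; }

dist_dev_dir="/tmp/dist-dev"      # illustrative paths
releasey_dir="/tmp/releasey"

# Regenerate the Helm index, then commit it so the change actually lands.
exec_process cd "${dist_dev_dir}/helm-chart"
exec_process helm repo index .
exec_process svn add --force index.yaml
exec_process svn commit -m "Update Polaris Helm chart index"
exec_process cd "${releasey_dir}/.."
```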
```shell
print_info "- Maven artifacts: Published to Apache staging repository"
echo
print_info "Next steps:"
print_info "1. Close the staging repository in Nexus"
```
Is there a way to automate this part? Otherwise, we should call it out here that the release manager has to sign in to the Nexus website with their ASF credentials to close it.
There should be a way to automate that, indeed. The grand vision is to have a fully automated release process that runs in GitHub Workflows. However, there are some things that need to be discussed before this can happen, like:
- Should we trigger when a Git tag is created, or when an empty GitHub Release with a specific tag pattern is created, ...
- As you pointed out in another comment, should we (can we?) automatically update the changelog?
- Should we implement fixes on branches of the supported release(s) first and then merge those branches up to main (in which case the changelog updates are easy)? Or do we want to cherry-pick commits to add certain fixes to release branches (in which case manual updates are required)?
- Does the ASF support automatically sending announcement/vote/cancel e-mails? In that case, this could be embedded as part of the release automation.
- Etc.
As you can see, we are not quite there yet. I suggest we start with the current scripts, which provide a semi-automated model. And in parallel, we can start discussion to achieve the big goal.
Agreed to automate it later. Can we provide a bit more detailed message how to close it? Just in case people get lost.
```diff
- print_info "1. Close the staging repository in Nexus"
+ print_info "1. close the staging repository in Nexus manually per instructions in https://polaris.apache.org/community/release-guide/ "
```
I propose to already start with semi-automatic releases. The required GH workflows require the scripts to be different, leading to two different implementations that have to be both verified and kept in sync, which is quite a big effort.
snazy left a comment:
Sorry for the amount of comments.
The TL;DR is that it's likely much easier overall to skip the "manual locally run scripts" and go straight to implement semi-automatic releases leveraging GitHub workflows.
There's a lot of stuff in these scripts that deals with verification, validation, and local-environment-specific concerns. None of that is needed for GH workflows.
The differences between Mac and Linux, the different tool versions, plus everybody's opinionated local setup add up to quite a lot. Any change to these scripts has to be verified and validated in all these environments — not even thinking of a minor issue in the scripts that breaks a release.
Doing the "local scripts" first, then implementing workflows, testing both appropriately, and keeping both in sync all the time only to eventually nuke the local scripts sounds like a huge amount of extra work, including the nitpickiness.
```shell
  return 1
fi

major="${BASH_REMATCH[1]}"
```
Wonder whether the global variables should be explicitly declared in a "lib", commenting where/how those are populated. WDYT?
```shell
fi

# Validate version format: x.y.z-incubating-rcN
# TODO: Remove incubating when we are a TLP
```
Guess the TODO can go away, or we'd have it everywhere (which isn't great either).
Maybe have a constant `VERSION_INCUBATING="-incubating"`, so there's only one place to change in that case.
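A minimal sketch of that suggestion — the constant and helper names are illustrative:

```shell
# One place to change at graduation: set VERSION_INCUBATING to "" once
# Polaris is a TLP, and the regex below adjusts automatically.
readonly VERSION_INCUBATING="-incubating"
version_regex="^[0-9]+\.[0-9]+\.[0-9]+${VERSION_INCUBATING}-rc[0-9]+$"

is_valid_version() {
  [[ "$1" =~ $version_regex ]]
}
```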
```shell
function check_docker_hub_login_status() {
  # Check if user is logged in to Docker Hub (optional check)
  print_info "Checking Docker Hub login status..."
  if docker info 2>/dev/null | grep -q "Username:"; then
```
Yea - let's drop it.
How a login happens, and particularly whether and how it appears in `docker info`, is not standardized at all. It can actually be a very different login (a different registry), so the check isn't that helpful.
```shell
print_info "Polaris version: ${polaris_version}"
print_info "From commit: ${commit}"
echo
```
I guess a `git fetch [remote]` doesn't hurt here, just to be safe.
```shell
  fi
}

function add_apache_remote() {
```
Not a fan of this.
Having multiple remotes, especially ones that point to the exact same repo, breaks the developer experience in IDEs. Things appear twice, for example.
You can figure out the "right" remote name via `git remote -v`, looking for `https://github.com/apache/polaris.git`.
I agree there shouldn't be an "add remote" function. But I'd also note that the `git remote -v` output can be a little tricky, since if you don't use HTTPS it would look like:

```
apache	git@github.com:apache/polaris.git (fetch)
apache	git@github.com:apache/polaris.git (push)
```

I think you are safe if you just look for `apache/polaris.git`.
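A sketch of that lookup, matching both HTTPS and SSH clone URLs by only checking the `apache/polaris.git` suffix (the function name is illustrative):

```shell
# Resolve the remote name from `git remote -v` style output on stdin:
# print the first remote whose fetch URL ends in apache/polaris.git.
find_apache_remote() {
  awk '$2 ~ /apache\/polaris\.git$/ && $3 == "(fetch)" { print $1; exit }'
}

# Usage: remote_name=$(git remote -v | find_apache_remote)
```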
```shell
  return 0
else
  print_error "GITHUB_TOKEN environment variable is not set"
  print_error "A GitHub token is required to check CI status for releases"
```
It's technically not required. The "anonymous" rate limit should be sufficient for a few runs of the script (commit status checks). Requiring `GITHUB_TOKEN` may lead to unnecessarily creating PATs or the like.
I suggest changing this to a warning.
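One way the warning-instead-of-error could look; the helper name is illustrative and the real script's structure may differ:

```shell
# Build curl auth arguments only when GITHUB_TOKEN is present; otherwise
# warn (to stderr) and fall back to anonymous, rate-limited API access.
github_auth_args() {
  if [[ -z "${GITHUB_TOKEN:-}" ]]; then
    echo "WARNING: GITHUB_TOKEN not set; using anonymous (rate-limited) GitHub API access" >&2
    return 0
  fi
  printf -- '-H\nAuthorization: Bearer %s\n' "${GITHUB_TOKEN}"
}

# Usage: mapfile -t auth < <(github_auth_args); curl -s "${auth[@]}" ...
```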
```shell
function get_gradle_signing_key_id() {
  # Get the signing key ID from gradle.properties
  grep '^signing.gnupg.keyName' "${GRADLE_PROPERTIES_FILE}" 2>/dev/null | cut -d'=' -f2
}
```
Having this setting isn't technically required. It's rather the release manager's responsibility to ensure that the key is properly set up. Nothing prevents release managers from configuring only one GPG key at all and/or setting `default-key` in `~/.gnupg/gpg.conf`.
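For reference, the `default-key` alternative mentioned above is a one-line GnuPG configuration; the key ID below is a placeholder, not a real key:

```
# ~/.gnupg/gpg.conf — make GnuPG use this key by default for signing
default-key 0xABCDEF0123456789
```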
It also forces a global setting, which can easily interfere with other Gradle-signing setups people have. So this actually adds a burden for people who sign artifacts with Gradle for multiple projects.
For (locally executed) scripts, you can list the available secrets and explicitly ask the release manager to confirm that their settings are correct.
This won't be an issue in GH workflows, where the workflows do the right thing anyways.
@flyrain @snazy I think there is some misunderstanding. I believe that only PMC members have write access to the SVN distribution directory, as well as permissions to create Nexus repositories. This makes testing and implementation tricky for me for steps ≥05. Typically, I can see this in the email template that is generated. I can only rely on the information that is in the release guide, or on the hints I can find in the build file (like assuming that the …).
The message likely refers to the Nexus repo, not SVN?
Force-pushed 0e29193 to c2ac1eb
@snazy This message says that I cannot test as many things as PMC members can, and that I can only rely on the information in the current release guide. So if a PMC member requests changes that deviate from the release guide, I have to take the request at face value.
Just a heads-up: I am giving a GitHub-workflow-based release automation a try. Hopefully this should reduce the amount of code, given that it will only be runnable by GitHub. If it fails, we can always come back to these scripts.
```shell
source "$LIBS_DIR/_log.sh"

function exec_process {
  # See print_command for information on why fd 3 is used.
```
I'd probably just copy the note from `print_command` to here, since it's not exactly easy to jump between the two files.
```shell
# Git Remote Setup Script
#
# This script ensures that the Git repository has the proper remote configuration for Apache Polaris releases:
# 1. Checks whether there is a Git remote named 'apache' pointing to https://github.com/apache/polaris.git
```
I think it's probably better to just work backwards from the assumption "we need the correct target" and not require the other conditions:
- Check the remotes
- If none exists with the correct URI for apache/polaris, error out
- If one does exist, return the name of that remote, which will then be used
```shell
# Git/SVN repository constants
APACHE_REMOTE_NAME=${APACHE_REMOTE_NAME:-"apache"}
APACHE_REMOTE_URL=${APACHE_REMOTE_URL:-"https://github.com/apache/polaris.git"}
```
This is only valid for HTTPS clones; an SSH clone would look like:

```
git@github.com:apache/polaris.git
```

So it may be good to reduce this to just "apache/polaris.git", since I believe we are only using it for validation?
```shell
#

#
# Git Remote Setup Script
```
Quick note on this doc: this script also has a bunch of other utility methods for checking other project-specific statuses. I have no problem with the organization, but we should probably change the doc here to note that it also contains other utility functions for working with Git. I also wouldn't be opposed if you wanted to separate the self-executing bits from the utility bits, but honestly I have no strong feelings on that.
```shell
curl -s -H \"Authorization: Bearer dummy_token\" -H \"Accept: application/vnd.github+json\" -H \"X-GitHub-Api-Version: 2022-11-28\" https://api.github.com/repos/apache/polaris/commits/$(git rev-parse HEAD)/check-runs | jq -r '[.check_runs[].conclusion | select(. != \"success\" and . != \"skipped\")] | length'"

# Compare content
if [[ "$actual_content" == "$expected_content" ]]; then
```
You could probably extract this into a function, since it's used in all the regression tests, if you like.
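The extraction could look like this; the helper name and output format are illustrative, not the test suite's existing conventions:

```shell
# Shared assertion helper for the regression tests: compare actual vs.
# expected content, printing a diff to stderr on mismatch.
assert_content_equals() {
  local actual="$1" expected="$2" label="${3:-content}"
  if [[ "$actual" == "$expected" ]]; then
    echo "PASS: ${label}"
  else
    echo "FAIL: ${label}" >&2
    diff <(printf '%s\n' "$expected") <(printf '%s\n' "$actual") >&2 || true
    return 1
  fi
}
```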
That's pretty awesome too, although I like the idea of having a set of scripts that could potentially work locally as well. But I'd hate to put more work on you, so if it's simpler to just do the GitHub Workflow approach, that seems very reasonable.
@RussellSpitzer thanks for the review!
My initial goal was that these scripts would be short-lived and replaced as soon as we had a robust release guide that paved the way for a GitHub workflow automation. As Robert hinted, it is already quite complex to deal with Linux/macOS differences (hello GNU …). I am not saying that nobody should ever be able to perform a release locally. What I am saying is that it will be rare enough that we should not be optimizing for it.
Committers should usually have permissions for both SVN and Nexus. Could you check with ASF infra on this?
This PR is stale because it has been open 30 days with no activity. Remove the stale label or comment, or this will be closed in 5 days.
Closing this PR as it is replaced by #2383 |
Description
This PR automates most of the commands listed in the release guide. It fixes #2017. See the README.md files for information on how the code is structured and how it can be used. The code includes:
Parts not automated:
This is a first implementation that aims at being a 1:1 mapping with the current release guide.
Note that it includes the instructions from #2039, which had not been merged yet at the time of opening this PR, given the high likelihood of that PR being merged before this one.
Hints for code review
Recommendation for reviewers: start with the `libs/` folder to get an idea of all utility functions, then look at the `test/` folder to see what each script is expected to do, then review the scripts themselves.

How to test
By default, the scripts run in dry-run mode. This is intended in order to avoid accidentally performing releases during testing.
All the other parameters used by the scripts can be overridden using environment variables. For example:
- To use a remote named `pingtimeout`, run `export APACHE_REMOTE_NAME=pingtimeout APACHE_REMOTE_URL=git@github.com:pingtimeout/polaris.git`
- To avoid fetching `https://downloads.apache.org/incubator/polaris/KEYS`, create a KEYS file locally with your key, run a small HTTP server (e.g. `python -m http.server`) from your laptop to serve it, and set `export KEYS_URL=http://localhost:8000/KEYS`

Then, simply run each script in order. The output will contain messages that describe the commands that would be executed if dry-run mode was disabled. See example below:
How to view all commands that would be executed
Each script in the `test/` directory verifies the commands that the associated release script would execute if it were not in dry-run mode. See for example test-02-create-release-branch.sh#L87-L103 for the part that creates the release branch and updates the version files.
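The dry-run behaviour described in this PR can be pictured with a minimal stand-in for `exec_process`; the real helper lives in `libs/` and this shape is an assumption, not its actual implementation:

```shell
# Minimal sketch of the dry-run pattern: when DRY_RUN is enabled (the
# assumed default), commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

exec_process() {
  if [[ "${DRY_RUN}" == "1" ]]; then
    echo "DRY-RUN: $*"
  else
    "$@"
  fi
}
```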