KAFKA-17540: Create script for updating a reference of latest cached trunk commit #17204

Merged

fonsdant
Contributor

No description provided.

@fonsdant fonsdant marked this pull request as draft September 16, 2024 01:25
@fonsdant fonsdant changed the title KAFKA-17540: Create floating tag on trunk for CI cache [WIP] KAFKA-17540: Create floating tag on trunk for CI cache Sep 16, 2024
.github/workflows/build.yml Outdated Show resolved Hide resolved
@mumrah mumrah added the build Gradle build or GitHub Actions label Sep 16, 2024
@fonsdant fonsdant changed the title [WIP] KAFKA-17540: Create floating tag on trunk for CI cache KAFKA-17540: Create floating tag on trunk for CI cache Sep 17, 2024
@fonsdant
Contributor Author

fonsdant commented Sep 17, 2024

Hi, @mumrah! I ran a test to check that it works as expected. You can see it here: https://github.com/fonsdant/github-actions.

For the test, I created two jobs: one for a successful build (followed by a tag update) and another for a failed build (no tag update). I had to grant the GitHub Actions workflow read and write permissions so that it could create the tag.

@fonsdant fonsdant marked this pull request as ready for review September 17, 2024 12:11
Contributor

@mumrah mumrah left a comment

Thanks @fonsdant! Left a few comments.

I also wonder if we should put this in the CI Complete workflow. Though unlikely, it is possible that the build succeeds, but writing to the cache fails in the Post Gradle Setup step. If this happened, we would have updated the tag prematurely.

If we add an "update-cached-tag" job in CI Complete, we can trigger it to run only when a CI workflow on trunk is successful (which would mean the cache was updated).

@fonsdant
Contributor Author

> Thanks @fonsdant! Left a few comments.
>
> I also wonder if we should put this in the CI Complete workflow. Though unlikely, it is possible that the build succeeds, but writing to the cache fails in the Post Gradle Setup step. If this happened, we would have updated the tag prematurely.
>
> If we add an "update-cached-tag" job in CI Complete, we can trigger it to run only when a CI workflow on trunk is successful (which would mean the cache was updated).

Moved!

Contributor

@mumrah mumrah left a comment

One small thing remaining.

Also, can you see if signed tags are possible here? That might be a nice thing to include.


update-cached-tag:
  # Skip this workflow if the CI run was skipped or cancelled
  if: (github.event.workflow_run.conclusion == 'success' || github.event.workflow_run.conclusion == 'failure')
Contributor

In this case, we only want to run if the workflow was successful.

Contributor Author

Oh, thanks! I will fix it!

@fonsdant
Contributor Author

> One small thing remaining.
>
> Also, can you see if signed tags are possible here? That might be a nice thing to include.

Yes, they are possible, but they require a GPG key. Is the purpose of the signature the same as for the commits? If so, maybe we could add a Signed-off-by trailer at the end of the description. I think that would require some automation, though; it is not available out of the box.

@mumrah
Contributor

mumrah commented Sep 18, 2024

> Is the purpose of the signature the same as for the commits?

No, the purpose is just to have a signature on the tag itself. When GitHub merges PRs, it signs the resulting commit with its own GPG key. I was hoping that running git tag -s in the action would do the same for tags. It should be possible to test this on your fork.

@mumrah
Contributor

mumrah commented Sep 19, 2024

I think tag signing with git may not work as expected here. It may be possible to use the API to create a signed tag. https://docs.github.com/en/rest/git/tags?apiVersion=2022-11-28

@fonsdant
Contributor Author

fonsdant commented Sep 19, 2024

> I think tag signing with git may not work as expected here.

I ran a test. Just to confirm: does it fail because of this? Git could not determine the GitHub bot's email.

Maybe we could use the gh CLI to create the tag. GitHub CLI is preinstalled on all GitHub-hosted runners.

@mumrah
Contributor

mumrah commented Sep 19, 2024

Yes, using gh should work. The docs link I posted has an example of creating a tag with gh api:

gh api \
  --method POST \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/OWNER/REPO/git/tags \
   -f "tag=v0.0.1" -f "message=initial version" -f "object=c3d0be41ecbe669545ee3e94d31ed9a4bc91ee3c" -f "type=commit" -f "tagger[name]=Monalisa Octocat" -f "tagger[email]=octocat@github.com" -f "tagger[date]=2011-06-17T14:53:35-07:00"

We're using this tool to create custom statuses on PRs here https://github.com/apache/kafka/blob/trunk/.github/actions/gh-api-update-status/action.yml#L55-L59

@fonsdant
Contributor Author

Oh! I noticed that it fails when the tag ref already exists. I am working on a fix!
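One possible way to cope with a tag ref that already exists is to try the REST "update a reference" endpoint first and fall back to "create a reference". The sketch below is only an illustration: the helper name move_tag_ref is hypothetical, it relies on gh api filling in the {owner} and {repo} placeholders from the current repository, and it is not necessarily what this PR did (the tag approach was dropped later in the thread).

```shell
#!/usr/bin/env bash
# Hypothetical helper: point refs/tags/<tag> at <sha>, creating the ref if needed.
move_tag_ref() {
  local tag="$1" sha="$2"
  # PATCH moves an existing ref; force=true allows a non-fast-forward move.
  if gh api --method PATCH "/repos/{owner}/{repo}/git/refs/tags/${tag}" \
       -f "sha=${sha}" -F "force=true" --silent 2>/dev/null; then
    echo "updated"
  else
    # The ref does not exist yet (or the update failed), so try to create it.
    gh api --method POST "/repos/{owner}/{repo}/git/refs" \
       -f "ref=refs/tags/${tag}" -f "sha=${sha}" --silent && echo "created"
  fi
}

# Usage (needs an authenticated gh inside a repository checkout):
#   move_tag_ref trunk-cached "$GITHUB_SHA"
```

For an annotated or signed tag, the sha passed here would be the SHA of the tag object returned by the tag-creation call rather than the commit SHA.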

@mumrah
Contributor

mumrah commented Sep 20, 2024

@fonsdant first off, thanks for all the work on this. I was pondering this yesterday and had an idea. Instead of managing a tag, I think we can use a separate branch, something like trunk-cached. This branch just contains the commits from trunk that are included in the latest GitHub cache.

In this case, the CI step becomes simply:

git checkout trunk-cached
git merge SHA
git push origin trunk-cached

This will improve the developer experience somewhat since it doesn't require explicitly fetching tags. Just a simple git fetch origin will pull it down.

Can you give this a try?

@fonsdant
Contributor Author

I am happy to help! :)

Sure! I will give it a try. It seems simpler for me too!

@fonsdant
Contributor Author

Hi, @mumrah! I have made some modifications. I used switch because it seems more semantic. Instead of merge, I chose reset --hard to avoid any conflicts that a merge could raise. What do you think?

steps:
  - name: Update trunk-cached branch with trunk
    run: |
      git switch trunk-cached || git switch -c trunk-cached
Contributor

Do we need to fetch trunk-cached first? Or does the checkout step fetch everything?

Contributor Author

@fonsdant fonsdant Sep 21, 2024

Since we reset here, a fetch is not needed. Even if we did fetch, the fetched trunk-cached branch would be reset to github.sha just like an unfetched one, so fetching does not change the result.

In a way, github.sha is already our "fetched branch" or "reference": the commit that the trunk-cached branch should point to. And push -f guarantees that the update goes through even when it is not a fast-forward.

I have done two tests for this. You can check them here and here.

This is the git log result:

$ git log --oneline 
e8da285 (HEAD -> trunk, origin/trunk-cached, origin/trunk) update ble.txt
d9c105e minimal impl
425d23e replace sha with origin trunk branch
...
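
The reset-and-force-push flow described in this comment can be tried against a throwaway local "origin". This is a minimal sketch under stated assumptions, not the workflow from the PR: the helper name update_cached_branch and the temporary paths are made up, and git switch -C is used here as a shorthand for "create the branch or reset it to the given commit".

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: make trunk-cached point at <sha> and publish it.
update_cached_branch() {
  local repo="$1" sha="$2" remote="${3:-origin}"
  # -C creates the branch, or resets it if it already exists, so nothing is merged.
  git -C "$repo" switch --quiet -C trunk-cached "$sha"
  # A forced push is needed because the new tip may not descend from the old one.
  git -C "$repo" push --quiet --force "$remote" trunk-cached
}

# Demo: a bare repository stands in for the remote.
work="$(mktemp -d)"
git init --quiet --bare "$work/origin.git"
git init --quiet "$work/clone"
git -C "$work/clone" remote add origin "$work/origin.git"
git -C "$work/clone" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "first"
sha="$(git -C "$work/clone" rev-parse HEAD)"

update_cached_branch "$work/clone" "$sha"
remote_sha="$(git -C "$work/origin.git" rev-parse refs/heads/trunk-cached)"
echo "trunk-cached on origin: $remote_sha"
```

Because the branch is always overwritten, the previous state of trunk-cached on either side does not matter, which matches the reasoning given for skipping the fetch.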

Contributor

Thanks for the explanation. It looks like we're always doing git switch -c to create a local branch, so maybe we don't need the first git switch?

Contributor Author

You are right! I will remove the git switch. Thanks!!

@mumrah
Contributor

mumrah commented Sep 26, 2024

The new cut options are working 👍

$ git update-cache                 
264131cdaaef3f4696942f26534b3f61f3a2a162
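
For readers unfamiliar with the cut invocation mentioned here (it appears later in this review as cut -d '-' -f 5), the sketch below shows the idea on a made-up cache key. The key layout is an assumption for illustration only; the real key format is not shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical cache key: four dash-separated parts followed by the commit SHA.
key="gradle-home-v1-linux-264131cdaaef3f4696942f26534b3f61f3a2a162"

# Split on '-' and keep the fifth field, which is the SHA in this made-up layout.
sha="$(printf '%s\n' "$key" | cut -d '-' -f 5)"
echo "$sha"
```

Feeding the key through printf is equivalent to the here-string form used in the script and also works in shells without here-strings.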

@mumrah
Contributor

mumrah commented Sep 26, 2024

Ok, now we need to take the SHA captured by cut and use update-ref to create a local branch.

I'd like to see this output:

$ git update-cache
Local branch 'trunk-cached' updated to 264131cdaaef3f4696942f26534b3f61f3a2a162

BTW, does the gh cache command work for you as a non-committer?

@fonsdant
Contributor Author

> Also, I wonder if we should do a git fetch origin as part of the command. Otherwise, we might pull in a SHA that doesn't exist locally

I think this will not work well in all cases. In my local repo, origin points to fonsdant/kafka (my fork) and upstream points to apache/kafka. Maybe we could keep it agnostic and print a message guiding the user to fetch. WDYT?

> BTW, does the gh cache command work for you as a non-committer?

Yep! :)

$ git update-cache
264131cdaaef3f4696942f26534b3f61f3a2a162

@mumrah
Contributor

mumrah commented Sep 26, 2024

> Maybe we could keep it agnostic and print a message guiding the user to fetch. WDYT?

Good point on different remote names. Let's still attempt to update the ref in this script. If it can't be done, print a warning and recommend that the user update their remote.

> git update-cache
Cannot update 'trunk-cached' because SHA 264131cdaaef3f4696942f26534b3f61f3a2a162 does not exist locally. Please update your remote and try again.
> echo $? 
1
> git fetch origin-or-whatever
> git update-cache
Local branch 'trunk-cached' updated to 264131cdaaef3f4696942f26534b3f61f3a2a162
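
The behavior requested in this comment could be sketched roughly as follows. This is an illustration, not the merged script: the function name update_trunk_cached is hypothetical, the existence check uses git cat-file -e, and the messages are copied from the example output above.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sketch: move the local trunk-cached branch to <sha> if the commit exists.
update_trunk_cached() {
  local repo="$1" sha="$2"
  # cat-file -e exits non-zero when the object is missing from the local repository.
  if ! git -C "$repo" cat-file -e "${sha}^{commit}" 2>/dev/null; then
    echo "Cannot update 'trunk-cached' because SHA $sha does not exist locally. Please update your remote and try again." >&2
    return 1
  fi
  git -C "$repo" branch --force trunk-cached "$sha" >/dev/null
  echo "Local branch 'trunk-cached' updated to $sha"
}

# Demo in a throwaway repository.
demo="$(mktemp -d)"
git init --quiet "$demo"
git -C "$demo" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "first"
head_sha="$(git -C "$demo" rev-parse HEAD)"

ok_msg="$(update_trunk_cached "$demo" "$head_sha")"
missing_status=0
update_trunk_cached "$demo" "0123456789abcdef0123456789abcdef01234567" 2>/dev/null || missing_status=$?
echo "$ok_msg (exit code for a missing SHA: $missing_status)"
```

Note that git branch --force refuses to move a branch that is currently checked out, so a sketch like this assumes trunk-cached is never the active branch.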

committer-tools/update-cache.sh Outdated Show resolved Hide resolved
Comment on lines 43 to 44
git switch trunk-cached &> /dev/null || git switch -c trunk-cached &> /dev/null
if git merge "$sha" &> /dev/null; then
Contributor

Since this script is managing the trunk-cached ref exclusively, let's just use git update-ref.

git update-ref -m 'some message' trunk-cached $sha

If the SHA doesn't exist, it will fail with a non-zero exit code.

Contributor Author

It seems it is not working as expected... I am trying to update to 235cafa, but my trunk-cached still points to 1f04436de1. See:

$ git update-ref -m 'test' trunk-cached 235cafa805
$ echo $?
0
$ git log
...
1f04436de1 (trunk-cached) Merge commit '1854d4b8a11461b53b59fa109b95f2a4f5003997' into trunk-cached
...

Contributor Author

But using git branch -f has worked well:

$ git branch -f trunk-cached 235cafa805
$ git log
...
235cafa805 (trunk-cached) KAFKA-6197: Update Streams API and Javadoc references in documentation (#17215)
...
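
A likely explanation for the difference seen in this thread, assuming the default files ref backend: git update-ref takes the ref name literally and does not expand a short name such as trunk-cached to refs/heads/trunk-cached, whereas git branch -f always operates on branches. The sketch below reproduces the situation in a throwaway repository; all names and paths in it are made up for the demo.

```shell
#!/usr/bin/env bash
set -eu

repo="$(mktemp -d)"
git init --quiet "$repo"
# Small wrapper so every command runs in the demo repo with a fixed identity.
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com "$@"; }

g commit --quiet --allow-empty -m "old"
old="$(g rev-parse HEAD)"
g commit --quiet --allow-empty -m "new"
new="$(g rev-parse HEAD)"
g branch trunk-cached "$old"

# Short name: the branch ref is not the target, so refs/heads/trunk-cached stays put.
g update-ref -m 'demo' trunk-cached "$new" || true
after_short="$(g rev-parse refs/heads/trunk-cached)"

# Fully qualified name: this moves the branch itself.
g update-ref -m 'demo' refs/heads/trunk-cached "$new"
after_full="$(g rev-parse refs/heads/trunk-cached)"

echo "after short name: $after_short"
echo "after full name:  $after_full"
```

If this explanation holds, either git branch -f or the fully qualified ref name would give the result the script needs.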

committer-tools/update-cache.sh Outdated Show resolved Hide resolved
Signed-off-by: Joao Pedro Fonseca Dantas <fonsdant@gmail.com>
@fonsdant
Contributor Author

The script failed with rev-parse (it could not find the commit locally), so I replaced it with show.

Contributor

@mumrah mumrah left a comment

Thanks for the updates @fonsdant! I tried it out and it worked as expected. Nice!

Just two small things then I'm happy to merge this.

@chia7712 did you have any additional comments?

docs/ops.html Outdated
@@ -1249,7 +1249,7 @@ <h4 class="anchor-heading"><a id="prodconfig" class="anchor-link"></a><a href="#

<h3 class="anchor-heading"><a id="java" class="anchor-link"></a><a href="#java">6.6 Java Version</a></h3>

-Java 8, Java 11, Java 17, and Java 21 are supported.
+Java 8, Java 11, Java 17, Java 21, and Java 23 are supported.
Contributor

Was this change intentional?

Contributor Author

Oh! It should not have been sent actually. Thanks! I will remove it.


sha="$(cut -d '-' -f 5 <<< "$key")"

git fetch &> /dev/null
Contributor

The behavior of git fetch with no args depends on the local config, so it might not be the same for everyone.

Since we are expecting developers to have done the appropriate git fetch or git pull prior to running this command, I think we can omit the git fetch completely from this script.

Contributor Author

Makes sense, I will proceed this way!

Contributor

@mumrah mumrah left a comment

LGTM!

I'm going to go ahead and merge this after incorporating the two minor nitpicks from the last review. Thanks for sticking with us through this PR @fonsdant 😄. I think we ended up trying three totally different approaches before settling on the local script.

@mumrah mumrah changed the title KAFKA-17540: Create floating tag on trunk for CI cache KAFKA-17540: Create script for updating a reference of latest cached trunk commit Oct 1, 2024
@mumrah mumrah merged commit 84bcdc9 into apache:trunk Oct 1, 2024
7 checks passed
@chia7712
Contributor

chia7712 commented Oct 1, 2024

@fonsdant thanks for this great contribution!

@fonsdant
Contributor Author

fonsdant commented Oct 1, 2024

Thank you very much, @mumrah, @chia7712! I learned a lot from this PR! 😁

Labels: build (Gradle build or GitHub Actions), tools