Use -DdeployAtEnd to avoid partial deployment for multi-module builds, which fails subsequent retries #28

Open
olamy wants to merge 2 commits into master

Conversation


@olamy olamy commented Feb 21, 2025

@olamy olamy changed the title Use -DdeployAtEnd to avoid partial deployment for multi module which … Use -DdeployAtEnd to avoid partial deployment for multi-module builds, which fails subsequent retries Feb 21, 2025
@@ -5,7 +5,7 @@ then
gh api /repos/$GITHUB_REPOSITORY/releases | jq -e -r '[ .[] | select(.draft == true and .name == "next")] | max_by(.id).body' | egrep "$INTERESTING_CATEGORIES"
fi
export MAVEN_OPTS=-Djansi.force=true
-mvn -B -V -e -s $GITHUB_ACTION_PATH/settings.xml -ntp -Dstyle.color=always -Dset.changelist -DaltDeploymentRepository=maven.jenkins-ci.org::default::https://repo.jenkins-ci.org/releases/ -Pquick-build -P\!consume-incrementals clean deploy
+mvn -B -V -e -s $GITHUB_ACTION_PATH/settings.xml -ntp -Dstyle.color=always -Dset.changelist -DaltDeploymentRepository=maven.jenkins-ci.org::default::https://repo.jenkins-ci.org/releases/ -Pquick-build -P\!consume-incrementals clean deploy -DdeployAtEnd=true -DretryFailedDeploymentCount=4
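For context on the two flags (illustrative, not part of the diff): -DdeployAtEnd=true tells maven-deploy-plugin to defer every module's upload until the whole multi-module reactor has finished building, so a failed build leaves either all artifacts or none in the remote repository, and -DretryFailedDeploymentCount retries a failed upload a fixed number of times before giving up (the plugin documents clamping the value to the 1-10 range). Below is a minimal local sketch, assuming a throwaway file:// staging path and mirroring the altDeploymentRepository syntax used above, to observe the all-or-nothing behaviour:

# Sketch only: deploy a multi-module build into a disposable file:// repository
# and confirm that no module's artifacts appear until the last module finishes.
# The /tmp path and the repository id "local-staging" are placeholders.
mvn -B -ntp clean deploy \
  -DaltDeploymentRepository=local-staging::default::file:///tmp/staging-repo \
  -DdeployAtEnd=true \
  -DretryFailedDeploymentCount=2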
Contributor

-DretryFailedDeploymentCount=4 => is there a reason for choosing this particular value of 4?

At first sight I would not retry: yes, it is annoying to retrigger a release, but I don't see what problem the retry would solve.

  • If it is a rate limit of DockerHub/GitHub API/any other, retrying won't solve it
  • If it is an authentication problem in Artifactory, most cases are either an error requiring a new release or an out-of-date token that takes 20 min to 3 h to fix, so a retry would not help
  • I'm not aware of any transient network issue in GHA runners that would be solved by a retry

=> I guess the retry should be at the runner allocation level?

Author

-DretryFailedDeploymentCount=4 => is there a reason for choosing this particular value of 4?

Nothing scientific here, just a value.

At first sight I would not retry: yes, it is annoying to retrigger a release, but I don't see what problem the retry would solve.

  • If it is a rate limit of DockerHub/GitHub API/any other, retrying won't solve it
  • If it is an authentication problem in Artifactory, most cases are either an error requiring a new release or an out-of-date token that takes 20 min to 3 h to fix, so a retry would not help
  • I'm not aware of any transient network issue in GHA runners that would be solved by a retry

Well, like any network error, this can eventually happen for whatever reason.
Going from one infrastructure to another through different networks, anything can happen.
Are you telling me you never had an unexpected network error, an Artifactory restart/temporary glitch, or an issue with the authz system behind Artifactory? :)
I can't see why it would be a problem, but if you think it really is, I don't mind removing it.

=> I guess the retry should be at the runner allocation level?

There is no real rollback of the previous steps (tags, partial deployment of a multi-module build), so retrying everything might not work (tag already there, part of the deployment already there...).
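To make the rollback point concrete, here is a hypothetical pre-flight check (not part of this PR; the tag name and artifact path are placeholders) that could detect leftovers from a failed attempt before re-triggering:

# Sketch only: before re-running a failed release, check whether the release tag
# was already pushed and whether an artifact already reached the remote
# repository, since neither is rolled back automatically.
TAG="example-plugin-1.2.3"
ARTIFACT_URL="https://repo.jenkins-ci.org/releases/org/example/example-plugin/1.2.3/example-plugin-1.2.3.pom"
if git ls-remote --exit-code --tags origin "refs/tags/$TAG" >/dev/null 2>&1; then
  echo "tag $TAG already exists; a plain re-run would fail at the tagging step"
fi
if curl -fsI "$ARTIFACT_URL" >/dev/null 2>&1; then
  echo "artifact already deployed; re-deploying the same release version may be rejected"
fi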

Contributor

Well, like any network error, this can eventually happen for whatever reason. Going from one infrastructure to another through different networks, anything can happen. Are you telling me you never had an unexpected network error, an Artifactory restart/temporary glitch, or an issue with the authz system behind Artifactory? :) I can't see why it would be a problem, but if you think it really is, I don't mind removing it.

Yes, we have had glitches, but as I said, the timespan of these glitches means the probability that they are still present during the retries is high. My concern is not about the retry (it makes sense), it is more about:

  • The value: too many retries could hammer the poor glitched system trying to recover. We don't want to create a hysteresis cycle; that's why we have circuit breakers (in the form of a build failure here :p ).
    As such I suggest a value of 2, which is enough (as we don't have a real problem to fix with this, so no data on which to rely).
Suggested change
-mvn -B -V -e -s $GITHUB_ACTION_PATH/settings.xml -ntp -Dstyle.color=always -Dset.changelist -DaltDeploymentRepository=maven.jenkins-ci.org::default::https://repo.jenkins-ci.org/releases/ -Pquick-build -P\!consume-incrementals clean deploy -DdeployAtEnd=true -DretryFailedDeploymentCount=4
+mvn -B -V -e -s $GITHUB_ACTION_PATH/settings.xml -ntp -Dstyle.color=always -Dset.changelist -DaltDeploymentRepository=maven.jenkins-ci.org::default::https://repo.jenkins-ci.org/releases/ -Pquick-build -P\!consume-incrementals clean deploy -DdeployAtEnd=true -DretryFailedDeploymentCount=2
  • Asking the question was part of the code review: the flag is not described in the PR body, so it is better to ask, to ensure it is not a mistake and to understand why (since this library is used by a lot of plugins).
    => You have answered that it is a deliberate change. May I suggest adding a quick sentence in the PR body explaining this?
