Allow restore process to finish before starting Jenkins #843

Closed
bentlema opened this issue Jun 2, 2023 · 5 comments · Fixed by #844
Labels
bug Something isn't working
Comments

@bentlema

bentlema commented Jun 2, 2023

Describe the bug
When using multibranch pipelines (GitHub Branch Source plugin), all branches, PRs, and tags are rebuilt after a Jenkins restart. This appears to happen because the restore process has not finished restoring the build history by the time Jenkins starts. The result is a race condition: build history gets overwritten, and resource usage spikes massively as (in our case) hundreds of build jobs are kicked off simultaneously.

One possible solution would be to move the restore process to an initContainer to guarantee it finishes before the jenkins-master container starts.
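
A minimal sketch of what the initContainer idea could look like, assuming a restore script baked into a backup image. This is not the operator's actual manifest: the `Container` and `PodSpec` types are local stand-ins for the real Kubernetes types, and the image names and script path are hypothetical. It relies only on the Kubernetes guarantee that init containers run to completion, in order, before any regular container starts.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Container and PodSpec mirror only the few Pod fields needed here.
// They are local stand-ins, not the real k8s.io/api types.
type Container struct {
	Name    string   `json:"name"`
	Image   string   `json:"image"`
	Command []string `json:"command,omitempty"`
}

type PodSpec struct {
	InitContainers []Container `json:"initContainers,omitempty"`
	Containers     []Container `json:"containers"`
}

// buildPodSpec places the restore step in initContainers, so it must exit
// successfully before the jenkins-master container is started.
func buildPodSpec() PodSpec {
	return PodSpec{
		InitContainers: []Container{{
			Name:    "restore",                       // hypothetical name
			Image:   "example.invalid/backup:latest", // hypothetical image
			Command: []string{"/restore.sh"},         // hypothetical script path
		}},
		Containers: []Container{{
			Name:  "jenkins-master",
			Image: "example.invalid/jenkins:lts", // hypothetical image
		}},
	}
}

func main() {
	out, err := json.MarshalIndent(buildPodSpec(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```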

To Reproduce
With the GitHub Branch Source plugin installed and a multibranch pipeline configured to build on branches, PRs, and/or tags, execute several builds and note the state of the build history. Restart Jenkins (kill the pod). When Jenkins comes back up, it rebuilds every branch, PR, and tag while old build history is still "flowing in" via the restore process. (This may be hard to reproduce with a small test case. We have dozens of multibranch pipelines and hundreds of branches/PRs/tags.)

Additional information

Kubernetes version: 1.23 (AWS EKS)

Jenkins Operator version: v0.7.1 and v0.8.0-beta

This same issue was reported in #679, but that issue was closed as stale.

@bentlema added the bug label Jun 2, 2023
@brokenpip3
Collaborator

Yes, that issue should not have been closed as stale.
To solve this, we need either the initContainer, as I already commented in the old issue, or to move the restore before the creation of the seed-job-init in the reconciliation loop.
I will try the second option, but at the moment I still need to fix a couple of things for 0.8 and finish migrating golang and operator-sdk to the newest versions before I can start making large code changes (more info here). Unfortunately, this project was abandoned for a while, so we may need time to recover and get back on track.
In the meantime, you can try this ugly yet working workaround: #679 (comment)

@brokenpip3
Collaborator

brokenpip3 commented Jun 4, 2023

@bentlema can you try this operator version:

quay.io/jenkins-kubernetes-operator/operator:d9ea2ee

and let me know? thanks!

I tried the quick fix I suggested before: moving the restore before the seed job creation in the user reconcile loop.
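
A rough sketch of the ordering idea, not the operator's real code: the `step` type, the `reconcile` function, and the step names are hypothetical. It only illustrates that when restore is placed ahead of seed job creation and an unfinished restore returns an error, the seed jobs cannot run until a later pass finds the restore complete.

```go
package main

import (
	"errors"
	"fmt"
)

// step is one named phase of a hypothetical reconcile pass.
type step struct {
	name string
	run  func() error
}

var errRestoreInProgress = errors.New("restore still in progress")

// reconcile runs the steps in order and stops at the first error, so a later
// step (seed jobs) never starts before an earlier one (restore) has succeeded.
// It returns the names of the steps that completed.
func reconcile(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

func main() {
	restoreFinished := false
	steps := []step{
		{"restore-backup", func() error {
			if !restoreFinished {
				return errRestoreInProgress
			}
			return nil
		}},
		{"create-seed-jobs", func() error { return nil }},
	}

	// First pass: restore is not finished, so seed jobs are not reached.
	done, err := reconcile(steps)
	fmt.Println(done, err)

	// Later pass: restore has finished, so both steps complete.
	restoreFinished = true
	done, err = reconcile(steps)
	fmt.Println(done, err)
}
```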

@bentlema
Author

bentlema commented Jun 6, 2023

> @bentlema can you try this operator version:
>
> quay.io/jenkins-kubernetes-operator/operator:d9ea2ee
>
> and let me know? thanks!
>
> I tried the quick fix I suggested before: moving the restore before the seed job creation in the user reconcile loop.

@brokenpip3, yes, this does appear to fix it! After upgrading to this image and restarting a couple of times, the restore process appears to finish before the seed jobs are executed. The logs seem to confirm this as well:

jenkins-jenkins-operator-7547c95d55-9p8nt jenkins-operator 2023-06-06T06:22:22.942Z INFO  controller-jenkins  Restoring backup '368'  {"cr": "jenkins"}
jenkins-jenkins-operator-7547c95d55-9p8nt jenkins-operator 2023-06-06T06:22:44.197Z INFO  controller-jenkins  Restoring backup '368'  {"cr": "jenkins"}
.
.
.
jenkins-jenkins-operator-7547c95d55-9p8nt jenkins-operator 2023-06-06T06:23:21.924Z INFO  controller-jenkins  Waiting for Seed Job Agent `seed-job-agent`...  {"cr": "jenkins"}
jenkins-jenkins-operator-7547c95d55-9p8nt jenkins-operator 2023-06-06T06:23:26.450Z INFO  controller-jenkins  Waiting for Seed Job Agent `seed-job-agent`...  {"cr": "jenkins"}
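
The ordering in the excerpt above could also be checked mechanically. The following is a sketch under assumptions: the function name `restoreBeforeSeed` is made up, and it assumes the third whitespace-separated field of each line is an RFC 3339 timestamp, as in these logs. The sample lines in `main` are shortened copies of two lines from the excerpt, with the trailing JSON field dropped.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
	"time"
)

// restoreBeforeSeed reports whether the latest "Restoring backup" line is
// timestamped before the earliest "Waiting for Seed Job Agent" line.
// Lines without a parseable timestamp in the third field are skipped.
func restoreBeforeSeed(logs string) (bool, error) {
	var lastRestore, firstSeed time.Time
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		line := sc.Text()
		fields := strings.Fields(line)
		if len(fields) < 3 {
			continue // e.g. the "." elision lines
		}
		ts, err := time.Parse(time.RFC3339Nano, fields[2])
		if err != nil {
			continue
		}
		switch {
		case strings.Contains(line, "Restoring backup"):
			if ts.After(lastRestore) {
				lastRestore = ts
			}
		case strings.Contains(line, "Waiting for Seed Job Agent"):
			if firstSeed.IsZero() || ts.Before(firstSeed) {
				firstSeed = ts
			}
		}
	}
	if err := sc.Err(); err != nil {
		return false, err
	}
	if lastRestore.IsZero() || firstSeed.IsZero() {
		return false, fmt.Errorf("need at least one restore line and one seed job line")
	}
	return lastRestore.Before(firstSeed), nil
}

func main() {
	logs := strings.Join([]string{
		"jenkins-jenkins-operator-7547c95d55-9p8nt jenkins-operator 2023-06-06T06:22:44.197Z INFO  controller-jenkins  Restoring backup '368'",
		"jenkins-jenkins-operator-7547c95d55-9p8nt jenkins-operator 2023-06-06T06:23:21.924Z INFO  controller-jenkins  Waiting for Seed Job Agent `seed-job-agent`...",
	}, "\n")
	ok, err := restoreBeforeSeed(logs)
	fmt.Println(ok, err)
}
```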

@brokenpip3
Collaborator

Cool! I'm glad that worked :)

I will keep the issue open until the new version containing the fix is released.
