[JENKINS-40825] move transients into DecoratedLauncher #180
as `decorate` will be called multiple times, and we don't want to share those among calls. Makes JENKINS-40825 much less severe, as "Pipe not connected" now most of the time seems to come from [DurableTaskStep][1], which will just retry instead of failing the job. Note: this is just a first stab and needs polishing.
[1]: https://github.com/jenkinsci/workflow-durable-task-step-plugin/blob/ae18393/src/main/java/org/jenkinsci/plugins/workflow/steps/durable_task/DurableTaskStep.java#L332
Whoops, forgot the tests. Will fix them right now.
Tests pass on my system, but I needed a small "hack" so that the slaves (running on VirtualBox) could connect to my Jenkins, see #181.
Looks like this is not enough. Please don't merge yet.
For starters, kudos for this. I think that it will definitely help.
Having code and formatting changes in the same commit makes it a bit harder to review.
I don't have a change to request, but I'd welcome limiting the scope of the changes (e.g. dropping the formatting changes and maybe removing the extra class ClosableLauncher, as I think that we could have the same functionality without it).
"[" + containerName + "] of pod [" + podName + "]." +
" Timed out waiting for container to become ready!");
}
private class ClosableLauncher extends Launcher.DecoratedLauncher implements Closeable {
Do we really need to create a new class? Can't we just have the existing class add things to a list of closeables?
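For illustration only, the "list of closeables" idea could look roughly like the sketch below. `CloseableRegistry` is a hypothetical name, not a class from this plugin or from Jenkins; the assumption is that whatever object owns the watch and the process registers them as they are created and releases them in reverse order on close.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper (not part of the plugin): collects resources
// and closes them in reverse order of registration.
final class CloseableRegistry implements Closeable {
    private final Deque<Closeable> resources = new ArrayDeque<>();

    // Remember a resource so that close() releases it later.
    void register(Closeable resource) {
        resources.push(resource);
    }

    @Override
    public void close() throws IOException {
        IOException first = null;
        // Keep closing even if one resource fails, so nothing leaks.
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

Whether this lives in the existing class or in a separate one is exactly the design question raised here; the sketch only shows that the bookkeeping itself is small.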
Right, see #180 (comment)
First of all, thanks for the review. The new class was for the implementation of the Anyway, I realized that this approach is not enough anyway - Even with this pull request, we are overwriting Closing this pull request now, expect a new one tomorrow. The formatting was just different because I extracted the class, but you are right about not putting formatting changes into the same commit as functionality changes. I will take care not to do that.
@marvinthepa: If by "this is not enough" you mean that there are still stuck threads, I noticed in the client that the InputStreamPumper.close() method didn't properly break the pumping loop (in some cases), causing the thread to remain stuck. It seems to work somewhat better with this: fabric8io/kubernetes-client@61ae6ed which is now available in: https://github.com/fabric8io/kubernetes-client/releases/tag/v2.5.8 (which is in central). So you might want to try this version out.
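As a rough illustration of the kind of problem described here (this is not the kubernetes-client code, and `StoppablePumper` is a made-up name), a pump loop only terminates reliably if `close()` both sets a stop flag and wakes up a thread that is blocked in `read()`. The sketch assumes an interruptible stream such as `PipedInputStream`; for streams whose reads ignore interrupts, closing the underlying stream would be needed instead.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical pumper: copies bytes from in to out until closed or end of stream.
final class StoppablePumper implements Runnable, Closeable {
    private final InputStream in;
    private final OutputStream out;
    private volatile boolean stopped;
    private volatile Thread worker;

    StoppablePumper(InputStream in, OutputStream out) {
        this.in = in;
        this.out = out;
    }

    @Override
    public void run() {
        worker = Thread.currentThread();
        byte[] buffer = new byte[1024];
        try {
            int n;
            // The flag alone is not enough: read() may block indefinitely.
            while (!stopped && (n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            // Reached when close() interrupts a blocked read, or when the stream fails.
        }
    }

    @Override
    public void close() {
        stopped = true;
        Thread t = worker;
        if (t != null) {
            // Wakes up a read that is blocked on an interruptible stream.
            t.interrupt();
        }
    }
}
```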
I actually meant that we might still end up not even calling the See this branch for what I think works better. I will try to verify a few of my assumptions (especially about
@marvinthepa: If we know for sure that proc is being killed by the shell step, then things get simplified a lot, as it frees us completely from having to babysit and hold refs of the watch and proc.
Proc is not what is being killed by the shell step, but actually
Am I understanding this correctly - that with this pull request, the "Pipe not connected" issue will be seen (logged) but will no longer "freeze" the build for 5 mins and then fail it?
@michaelajr: If you have a test system, feel free to build from that branch and give it a spin, I would be interested to get feedback if it works for you. You still might have issues when it takes you more than 10 seconds to start processes in kubernetes (i.e. likely when your kubernetes is overwhelmed), but the general situation should improve. I managed to get rid of the pipe not connected errors, but I still need to get the
@marvinthepa OK. We'll take a look. Thanks!
Please see #182 instead.
as `decorate` will be called multiple times, and we don't want to share these transients among calls - especially as they will not be properly cleaned up.
Makes JENKINS-40825 much less severe, as "Pipe not connected" now most of the time seems to come from DurableTaskStep, which will just retry instead of failing the job.
More importantly, it fixes the resource leak mentioned in this comment and the following comments.
Note: this is just a first stab and probably needs polishing, especially the new `ContainerExecWatcher.close` implementation, which might not even be necessary.
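To make the motivation concrete, here is a minimal, self-contained sketch of the general pattern, not the plugin's actual code. `SimpleLauncher`, `PerCallLauncher` and `Decorator` are hypothetical stand-ins for the Jenkins types; the point is only that per-call state lives in the object returned by `decorate`, so repeated calls share nothing and each result can be cleaned up on its own.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a launcher; not the Jenkins API.
interface SimpleLauncher {
    String launch(String command);
}

// Holds the per-call ("transient") state that would otherwise sit on the decorator.
final class PerCallLauncher implements SimpleLauncher, Closeable {
    private final SimpleLauncher inner;
    private final List<String> launched = new ArrayList<>();
    private boolean closed;

    PerCallLauncher(SimpleLauncher inner) {
        this.inner = inner;
    }

    @Override
    public String launch(String command) {
        if (closed) {
            throw new IllegalStateException("launcher already closed");
        }
        launched.add(command);
        return inner.launch(command);
    }

    List<String> launched() {
        return launched;
    }

    @Override
    public void close() {
        // Releases only this call's state; other decorated launchers are unaffected.
        closed = true;
        launched.clear();
    }
}

final class Decorator {
    // No mutable fields here, so repeated decorate() calls cannot overwrite each other.
    PerCallLauncher decorate(SimpleLauncher base) {
        return new PerCallLauncher(base);
    }
}
```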