QUESTION: What is terminating the DinD container? #616
Comments
@jwalters-gpsw Hey. This shouldn't happen as long as you use it normally. Are you bind mounting …?
Not doing anything special with the controller Helm install or the RunnerDeployment. No changes to the controller install. It happens a small percentage of the time. I have also tried moving the runner to our private Docker repo (to avoid the rate-limit issue) and it still occasionally happens. Here is the RunnerDeployment:

[RunnerDeployment manifest not captured in this excerpt]
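As an illustration only, not the reporter's actual manifest, a minimal RunnerDeployment that relies on the default DinD sidecar looks roughly like this; the name, repository, and labels below are placeholders:

```yaml
# Illustrative placeholder, not the manifest from this issue.
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy
spec:
  replicas: 2
  template:
    spec:
      # Registers the runners against this repository (placeholder value).
      repository: example-org/example-repo
      labels:
        - self-hosted
```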
For more context, the failing job pulls down a tarfile of the workspace produced by a previous job and untars it. The untar fails because there are files already in the workspace. There is no checkout in the job.
Have you got an example you can see? Can you confirm that in this example the runner that gets assigned and fails is unique? (see the …)
Yeah, this might be related to #466. A swarm of Actions jobs can result in occasional failures because ephemeral GitHub Actions runners sometimes mistakenly dequeue jobs while shutting down and are then unable to complete them. @jwalters-gpsw Could you try upgrading to the latest controller, 0.19.0, and setting …?
So, back to your original question: it is Kubernetes that stops the DinD container on pod deletion. By default a runner is ephemeral, which means it tries its best to shut down after a single job run.
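As a sketch of that knob, assuming the `ephemeral` field documented for actions-runner-controller's Runner spec: setting it to false keeps the runner pod, and therefore its DinD sidecar, running across jobs instead of recreating it after every run.

```yaml
# Sketch only: disable the default ephemeral behaviour described above.
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy
spec:
  template:
    spec:
      repository: example-org/example-repo   # placeholder
      ephemeral: false                        # defaults to true
```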
I've hooked the system up to Datadog and am now capturing all the logs (which were disappearing after the job run). You can close this issue until I have something more specific and concrete. On the shutdown of the DinD container: I want to know what the actual mechanism shutting it down is. Is it a SIGTERM, and if so, where is it sent from? Etc.
@jwalters-gpsw It follows Kubernetes' standard pod termination process, as documented in https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination. Does that answer your question?
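To spell out the mechanism from that doc as it applies here: on pod deletion the kubelet runs any preStop hook, sends SIGTERM to PID 1 of every container in the pod (the DinD sidecar included), and sends SIGKILL once terminationGracePeriodSeconds has elapsed. A generic illustration of those knobs follows; this is not the pod spec the controller actually generates:

```yaml
# Generic pod-termination demo, unrelated to actions-runner-controller's
# generated pod spec.
apiVersion: v1
kind: Pod
metadata:
  name: termination-demo
spec:
  # After deletion, each container's PID 1 gets SIGTERM; if the process has
  # not exited within this window, the kubelet sends SIGKILL.
  terminationGracePeriodSeconds: 30
  containers:
    - name: app
      image: busybox
      # Trap SIGTERM so the shutdown is visible in the container log.
      command: ["sh", "-c", "trap 'echo got SIGTERM; exit 0' TERM; sleep 3600 & wait"]
      lifecycle:
        preStop:
          # Runs before SIGTERM is delivered to this container.
          exec:
            command: ["sh", "-c", "echo preStop hook ran"]
```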
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
About 5% of the time my workflows are failing because the workspace is "dirty" (it contains the contents of a prior checkout, which causes an `untar` to fail). Because it's inconsistent when it happens (sometimes the job succeeds, sometimes it doesn't), I'm guessing it must be a timing issue with the restarts of the pod's containers. I thought I would modify the runner container to wait on the DinD container restart, but I couldn't figure out what was triggering the DinD container to restart. Something in runsvc.sh?
Log of the DinD container shutdown below. [log not captured in this excerpt]