build post-commit container names are not unique #8266

Closed
bparees opened this issue Mar 28, 2016 · 32 comments · Fixed by #8316

Comments

@bparees
Contributor

bparees commented Mar 28, 2016

I ran a build, stopped my system, started it again, ran the build again, so now i get this error:

F0328 18:28:49.580909       1 builder.go:204] Error: build error: create container "openshift_s2i-build_ruby-sample-build-1_p1_post-commit": container already exists

users shouldn't have to clean up all their dead containers to be able to use this function. I think we need to append a unique suffix to the container name, if we're going to explicitly name them.

@mfojtik @rhcarvalho
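
For illustration, a minimal Go sketch of the kind of unique suffix being proposed (hypothetical helper, not the builder's actual code): append a short random string, similar to the suffixes Kubernetes generates for pod names, so a rerun build never collides with a leftover container of the same base name.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

const suffixAlphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

// randSuffix returns n random characters from a container-name-safe alphabet.
func randSuffix(n int) string {
	b := make([]byte, n)
	for i := range b {
		idx, err := rand.Int(rand.Reader, big.NewInt(int64(len(suffixAlphabet))))
		if err != nil {
			panic(err) // crypto/rand should not fail on a healthy system
		}
		b[i] = suffixAlphabet[idx.Int64()]
	}
	return string(b)
}

func main() {
	base := "openshift_s2i-build_ruby-sample-build-1_p1_post-commit"
	// e.g. openshift_s2i-build_ruby-sample-build-1_p1_post-commit-x7k2q
	fmt.Println(base + "-" + randSuffix(5))
}
```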

@bparees
Contributor Author

bparees commented Mar 28, 2016

(by stopped my system, i mean cleaned out etcd/etc. hence the reason my build sequence id went back to "1" and caused the conflict)

@bparees
Contributor Author

bparees commented Mar 28, 2016

reducing to p3 because i think i hit this via a timing window (I killed my setup while the container was running, whereas normally the code would have removed the container after it finished). so it's less likely someone hits this under normal circumstances.

@rhcarvalho
Contributor

I suspect the "clean up" left some dirty state? It is possible that if you interrupt the server while a build hook is running, the hook container will not be cleaned up, leading to the situation described in the issue.

Perhaps now our clean up scripts should remove not only k8s_* prefixed containers, but also openshift_*...

And we should also fix openshift/source-to-image#433 to "namespace" containers created by S2I, and remove them in the clean up script.

Here's the original discussion where it was decided that we don't want the random suffix: #6715 (comment)

@bparees
Contributor Author

bparees commented Mar 29, 2016

I suspect the "clean up" left some dirty state? It is possible that if you interrupt the server while a build hook is running, the hook container will not be cleaned up, leading to the situation described in the issue.

right, that's what i was implying in my follow up comment.

Perhaps now our clean up scripts should remove not only k8s_* prefixed containers, but also openshift_*...

perhaps, but i still think we should be using unique names. k8s/openshift do not have this problem for other containers they launch (you don't have to clean up dead containers to rerun something).

And we should also fix openshift/source-to-image#433 to "namespace" containers created by S2I, and remove them in the clean up script.

+1

Here's the original discussion where it was decided that we don't want the random suffix: #6715 (comment)

thanks, it's a fair point i suppose, but if you've got two concurrent builds running w/ the same name, running two instances of the post-commit image seems like the least of your worries (they are unlikely to interfere with each other anyway). Not to mention it could happen if the two builds were running on distinct nodes anyway, so using the same name doesn't actually protect you from all that much. @smarterclayton ?

@smarterclayton
Contributor

I would expect the post-commit container names to be correlated to the build and build namespace effectively so that they are unique across all builds, for sure. I don't think we have to protect against two builds running with the same name (they'd have to have the same pod, which means something in Kube is horribly broken that we would fix first).
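
For illustration, a hypothetical helper showing a deterministic hook-container name derived from the build name and namespace; the field order is inferred from the error message at the top of this issue, not taken from the builder's code.

```go
package main

import "fmt"

// postCommitContainerName builds a name that is unique across builds because
// the build name and namespace are themselves unique within the cluster.
func postCommitContainerName(buildName, namespace string) string {
	return fmt.Sprintf("openshift_s2i-build_%s_%s_post-commit", buildName, namespace)
}

func main() {
	// Reproduces the name from the original error report.
	fmt.Println(postCommitContainerName("ruby-sample-build-1", "p1"))
}
```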

@rhcarvalho
Contributor

Looks like we have at least two ways out here:

  1. Add a random suffix, just like Kubernetes does, as per the original design.
  2. Before starting the hook, force terminate any existing containers with a conflicting name, as per this comment from @smarterclayton.

@bparees
Contributor Author

bparees commented Mar 30, 2016

personally i like the random suffix

  1. aligns w/ k8s
  2. avoids inadvertently terminating something important. (well, much less likely anyway)

@bparees
Contributor Author

bparees commented Mar 30, 2016

(and we already have pruning to take care of old containers in case we accidentally leave one around...though i'm not sure if pruning applies to these containers?)

@rhcarvalho
Contributor

i'm not sure if pruning applies to these containers

Most likely it does not apply, since nothing outside of the builder knows about it...

@bparees
Contributor Author

bparees commented Mar 30, 2016

Most likely it does not apply, since nothing outside of the builder knows about it...

right, i'm not sure how containers are identified for pruning. I actually think it may prune any stopped containers not associated w/ a pod, in which case these would qualify. If not, our assemble containers would also not be subject to pruning, which is a bigger problem.

@ncdc can you comment on how containers are identified for pruning?

@ncdc
Contributor

ncdc commented Mar 30, 2016

Containers are garbage collected periodically. The default settings are to keep at most 100 dead containers across all pods, and at most 2 dead containers per pod. It won't start pruning them until you've reached these thresholds.
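
For illustration, a rough sketch of the two thresholds described above (not the actual kubelet code): dead containers are kept up to a per-pod limit and a node-wide limit, and only the overflow, oldest first, becomes eligible for removal.

```go
package main

import (
	"fmt"
	"sort"
)

type deadContainer struct {
	ID      string
	Pod     string
	Created int64 // unix seconds; older containers are evicted first
}

// containersToEvict keeps at most maxPerPod dead containers per pod and
// maxTotal dead containers overall, returning the IDs of the overflow.
func containersToEvict(dead []deadContainer, maxPerPod, maxTotal int) []string {
	sort.Slice(dead, func(i, j int) bool { return dead[i].Created < dead[j].Created })

	var evict []string
	keptPerPod := map[string]int{}
	keptTotal := 0

	// Walk newest-to-oldest, keeping containers until a threshold is hit.
	for i := len(dead) - 1; i >= 0; i-- {
		c := dead[i]
		if keptPerPod[c.Pod] < maxPerPod && keptTotal < maxTotal {
			keptPerPod[c.Pod]++
			keptTotal++
			continue
		}
		evict = append(evict, c.ID)
	}
	return evict
}

func main() {
	dead := []deadContainer{
		{"a", "pod1", 1}, {"b", "pod1", 2}, {"c", "pod1", 3}, {"d", "pod2", 4},
	}
	// "a" is evicted: pod1 already has two newer dead containers.
	fmt.Println(containersToEvict(dead, 2, 100))
}
```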

@rhcarvalho
Contributor

@ncdc but how are the containers identified? By the prefix "k8s_" in the name?
Or is the pruning working on all containers in the host?

@ncdc
Contributor

ncdc commented Mar 30, 2016

A little bit of both - see https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockertools/container_gc.go#L131-L231

@smarterclayton
Contributor

Please don't add random suffixes. If two builds are running at the same time on the same node, one should fail.

@rhcarvalho
Contributor

Looks like the first listing comes from GetKubeletDockerContainers, which only includes containers with certain prefixes https://github.com/kubernetes/kubernetes/blob/be8ce6c38578c244ac6ebdb1c402f74e5a282140/pkg/kubelet/dockertools/docker.go#L344-L369

So, @bparees, all the containers we're starting with "fluffy_panda" 🐼 names or other prefixes won't be taken into account by the pruning process 😢
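
For illustration, a rough paraphrase of the filter in question (not the actual upstream code): only names carrying the kubelet's prefix are listed, so containers the builder names itself are invisible to that listing.

```go
package main

import (
	"fmt"
	"strings"
)

// kubeletPrefix is the prefix the kubelet applies to containers it manages.
const kubeletPrefix = "k8s"

// isKubeletManaged mimics the prefix check: Docker reports names with a
// leading slash, and only "k8s_..." names are considered kubelet-managed.
func isKubeletManaged(dockerName string) bool {
	name := strings.TrimPrefix(dockerName, "/")
	return strings.HasPrefix(name, kubeletPrefix+"_")
}

func main() {
	fmt.Println(isKubeletManaged("/k8s_POD.a1b2_mypod_default_1234_0"))        // true
	fmt.Println(isKubeletManaged("/openshift_s2i-build_foo-1_p1_post-commit")) // false
	fmt.Println(isKubeletManaged("/fluffy_panda"))                             // false
}
```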

@ncdc
Contributor

ncdc commented Mar 30, 2016

@rhcarvalho it does look at other containers - those are the unidentifiedContainers.

@bparees
Contributor Author

bparees commented Mar 30, 2016

Please don't add random suffixes. If two builds are running at the same time on the same node, one should fail.

that means if i delete my buildconfig and recreate it, i'll never be able to run the build on that node again because the name will conflict, unless we explicitly remove containers with conflicting names before we start a new container.

Also, relying on the build

  1. having a post-commit hook
  2. running on the same node

seems like a strange way to catch duplicated builds. if duplicated builds are happening, we need a better way to prevent that.... and we do, it's the name of the build object itself which has to be unique. (I don't know how our assemble container is named today, i think it's always unique, it should be updated to match the naming convention used by the post-commit container, whatever we decide)

@smarterclayton
Contributor

If you give the post commit container a deterministic name, and you come in and the name already exists and is not running, you should delete the container and then run. If you can't delete, or it's running, you should fail.

ALL of our container names should be deterministic, period, full stop. Anything you guys create on the node in a build you have to take responsibility to clean up, or clean up on reentrance safely. Unique names is pretending that leaving containers around that aren't cleaned up isn't a huge problem.
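
For illustration, a minimal sketch of that reentrance policy using the fsouza/go-dockerclient API (hypothetical helper, not the builder's actual code): if the deterministically named container already exists and is stopped, remove it; if it is still running, fail rather than start a duplicate.

```go
package main

import (
	"fmt"

	docker "github.com/fsouza/go-dockerclient"
)

// ensureNameFree deletes a stopped leftover container with the given name, or
// fails if a container with that name is still running.
func ensureNameFree(client *docker.Client, name string) error {
	existing, err := client.InspectContainer(name)
	if err != nil {
		// For this sketch, treat any inspect error as "no such container";
		// real code would distinguish *docker.NoSuchContainer from other errors.
		return nil
	}
	if existing.State.Running {
		return fmt.Errorf("container %q is still running; refusing to start a duplicate", name)
	}
	// Stopped leftover from an interrupted build: remove it and proceed.
	return client.RemoveContainer(docker.RemoveContainerOptions{ID: existing.ID, RemoveVolumes: true})
}

func main() {
	client, err := docker.NewClientFromEnv()
	if err != nil {
		panic(err)
	}
	name := "openshift_s2i-build_ruby-sample-build-1_p1_post-commit"
	if err := ensureNameFree(client, name); err != nil {
		panic(err)
	}
	fmt.Println("name is free; safe to create", name)
}
```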

@bparees
Contributor Author

bparees commented Mar 30, 2016

ALL of our container names should be deterministic, period, full stop. Anything you guys create on the node in a build you have to take responsibility to clean up, or clean up on reentrance safely. Unique names is pretending that leaving containers around that aren't cleaned up isn't a huge problem.

these are orthogonal statements. If things are working correctly, then the reentrant "clean up when we detect a collision" logic would never get triggered anyway, so that's not a useful way to deal with the dead container cleanup problem.

Dead containers should be handled via pruning mechanisms, just like the rest of k8s does today. (whether that is actually true or not is something we're currently trying to determine...) Which means the names can be as unique as we want.

@rhcarvalho
Contributor

(I don't know how our assemble container is named today, i think it's always unique, it should be updated to match the naming convention used by the post-commit container, whatever we decide)

It is always unique, but completely arbitrary, generated by Docker, see openshift/source-to-image#395

@bparees
Contributor Author

bparees commented Mar 30, 2016

So i feel like the summary here is:

  1. we should use a common convention for naming the assemble and post-commit containers (and any other containers we create).
  2. that convention should, imo, include a unique suffix (subject to on-going debate).
  3. we need to get those containers included in the system container pruning logic, which I do not believe they currently are.

Note that none of this is a huge problem today because we rm the container when we're done with it, so it only gets left around if the system gets killed in the middle of a build/post-commit operation.

@smarterclayton
Contributor

Which can happen today pretty easily just by a) deleting a build or b) deleting a pod.

@smarterclayton
Contributor

Basically, if you're going to use a random suffix, you still need to define a prefix and own the prefix going in. You probably also have to be responsible for other things under that prefix and cleaning them up. In fact, every build start on a node should probably be cleaning up other leftovers based on the deterministic prefix.

@ncdc
Contributor

ncdc commented Mar 30, 2016

we need to get those containers included in the system container pruning logic, which I do not believe they currently are

They are included

@ncdc
Contributor

ncdc commented Mar 30, 2016

Oops, they aren't included. Misread the code.

@bparees
Contributor Author

bparees commented Mar 30, 2016

Basically, if you're going to use a random suffix, you still need to define a prefix and own the prefix going in. You probably also have to be responsible for other things under that prefix and cleaning them up. In fact, every build start on a node should probably be cleaning up other leftovers based on the deterministic prefix.

still an orthogonal issue. not using a random suffix solves very little of the cleanup problem. almost none of it. cleanup needs to be handled explicitly (as you say, either by us using a common prefix and adding logic, or by updating the k8s logic to include our containers).

@bparees
Contributor Author

bparees commented Mar 30, 2016

(or by rearchitecting the entire build system to use managed pods instead of manual container launches)

@smarterclayton
Contributor

Sure - the issues then are that a) we don't guarantee that we clean up containers, and b) we don't prevent simultaneous execution. Can you make a case for simultaneous execution?

@bparees
Contributor Author

bparees commented Mar 30, 2016

Sure - the issues then are that a) we don't guarantee that we clean up containers, and b) we don't prevent simultaneous execution. Can make a case for simultaneous execution?

we do prevent simultaneous execution. you cannot create two build objects w/ the same name.

@bparees
Contributor Author

bparees commented Mar 30, 2016

@smarterclayton and I chatted offline, path forward is:

  1. use unique suffixes, this protects end users from spurious failures if we do a bad job cleaning up containers. (this issue, but should fix the assemble container names too, though they are currently unique).
  2. open an issue upstream to allow arbitrary pruning prefixes: making container pruning prefix more flexible kubernetes/kubernetes#23640
  3. update the build container removal logic to handle sigterm to ensure the container always gets removed (separate issue/PR from this one).
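
For illustration, a minimal, self-contained sketch of item 3 (hypothetical structure, not the builder's actual code): trap SIGTERM/SIGINT in the builder process and remove the hook container before exiting, so an interrupted build does not leave a named container behind.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

func main() {
	containerName := "openshift_s2i-build_ruby-sample-build-1_p1_post-commit"

	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)

	done := make(chan struct{})
	go func() {
		// ... create the hook container and wait for it to finish here ...
		close(done)
	}()

	select {
	case sig := <-sigs:
		fmt.Printf("received %v, removing container %s before exit\n", sig, containerName)
		// removeContainer(containerName) // hypothetical cleanup call
		os.Exit(1)
	case <-done:
		// Normal path: the container was already removed after the hook finished.
	}
}
```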

@rhcarvalho
Contributor

  1. use unique suffixes

+1, #8316 and openshift/source-to-image#395 (working on a fix)

  2. open an issue upstream to allow arbitrary pruning prefixes

+1, kubernetes/kubernetes#23640

  3. update the build container removal logic to handle sigterm to ensure the container always gets removed (separate issue/PR from this one).

#8306 and openshift/source-to-image#452

This makes code more complicated and doesn't always guarantee the container gets removed.
Under which circumstances does the build container receive SIGTERM (/ SIGQUIT / SIGINT)?

We really need a way to hook into the platform pruning mechanisms (2), or the long term goal of having better platform support so that builds don't start their own unmanaged containers.

@smarterclayton
Contributor

Every container on the system will ALWAYS receive SIGTERM, get at least 2 seconds to shut down, and then get SIGKILL.
