Add option to apply caps only on alive pods #252
Conversation
Force-pushed 5ee08a9 to dcb8b00
The problem I see is that the errored pods fail for a reason (i.e. wrong image), and this would just enter an infinite loop of spawning new pods.
Well, the option is an opt-in. Perhaps the plugin should optionally also provide a strategy (like exponential back-off) to delay the launch of new pods?
@carlossg Should I add a warning in the description of the checkbox that this can lead to an infinite loop of spawning new pods?
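The exponential back-off idea floated above could look roughly like the sketch below. This is purely illustrative: nothing like `Backoff` or `delayMillis` exists in the plugin, and the base/maximum delays are made-up values.

```java
// Hypothetical sketch of an exponential back-off for re-spawning pods.
// Not part of the kubernetes-plugin; names and values are invented.
public class Backoff {
    /** Delay doubles with each failed attempt, capped at maxMillis. */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long d = baseMillis << Math.min(attempt, 20); // bound the shift
        return Math.min(d, maxMillis);
    }

    public static void main(String[] args) {
        // First five retry delays with a 1s base and a 60s cap.
        for (int i = 0; i < 5; i++) {
            System.out.println(delayMillis(i, 1000, 60000));
        }
    }
}
```

With a cap like this, a pod that keeps failing (e.g. a wrong image) would still be retried, but progressively less often, instead of looping as fast as the provisioner runs.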
I think it looks mostly OK, see my comment, but capOnlyOnUnDeadPods sounds really weird to me; wouldn't capOnlyOnAlivePods be more intuitive?
@@ -424,6 +436,22 @@ private boolean addProvisionedSlave(@Nonnull PodTemplate template, @CheckForNull
        PodList namedList = client.pods().inNamespace(templateNamespace).withLabels(labelsMap).list();
        List<Pod> namedListItems = namedList.getItems();

        if (this.isCapOnlyOnUnDeadPods()) {
Instead of getting the full list and then filtering here, can't it be done using something like withFields("status.phase", "running|pending")? Not sure what the right syntax would be, but it should be possible based on kubernetes/kubernetes#49387.
I can do
./kubectl get pods --field-selector=status.phase='Succeeded'
I can also do:
./kubectl get pods --field-selector=status.phase='Failed'
But I can't do:
./kubectl get pods --field-selector=status.phase='Succeeded|Failed'
Neither can I do:
./kubectl get pods --field-selector='status.phase in (Succeeded, Failed)'
I couldn't find any official documentation on field selectors 😢 https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#
I also looked at the implementation of withFields and getFieldQueryParam, and it doesn't seem like a general selector has been implemented yet. 😭
However, I can do:
./kubectl get pods --field-selector='status.phase!=Failed,status.phase!=Succeeded,status.phase!=Unknown'
But again, we really need a function withFieldSelector()
at https://github.com/fabric8io/kubernetes-client/blob/v3.1.0/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/Filterable.java
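Since the fabric8 client offered no general withFieldSelector() at the time, the practical fallback discussed in this thread is to filter the listed pods client-side by status.phase. A minimal sketch of that idea follows; the nested `Pod` class is a stand-in for io.fabric8.kubernetes.api.model.Pod (carrying only the one field used here), and `filterAlive` is an illustrative name, not the plugin's actual code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class AlivePodFilter {
    // Minimal stand-in for io.fabric8.kubernetes.api.model.Pod,
    // keeping only the status.phase value this sketch needs.
    static class Pod {
        final String phase;
        Pod(String phase) { this.phase = phase; }
        String getPhase() { return phase; }
    }

    // The two phases in which a pod is "alive" (consuming resources).
    private static final Set<String> ALIVE_PHASES =
            new HashSet<>(Arrays.asList("Running", "Pending"));

    /** Keep only pods whose status.phase is Running or Pending. */
    static List<Pod> filterAlive(List<Pod> pods) {
        return pods.stream()
                .filter(p -> ALIVE_PHASES.contains(p.getPhase()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Pod> pods = Arrays.asList(
                new Pod("Running"), new Pod("Failed"),
                new Pod("Pending"), new Pod("Succeeded"));
        // Only the Running and Pending pods survive the filter.
        System.out.println(filterAlive(pods).size());
    }
}
```

Client-side filtering trades an extra pass over the listed pods for not depending on server-side field-selector support, which (as noted above) could not express the needed OR / set-based match anyway.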
Force-pushed dcb8b00 to 020eaf6
A pod can be in one of five phases: running, pending, succeeded, failed or unknown. Pods that are not in the running/pending phase do not consume any CPU/memory resources; they just need to be GC'ed. If a cap has been specified on a k8s cloud (single namespace) and, say, 40% of the pods are in the 'failed' phase while 60% are 'running/pending', newer pods will not be spawned in that namespace: the plugin counts all of those pods against the instance/pod cap, so even though it could have spawned 40% more pods, it won't, and jobs will starve. This patch adds an option to calculate the cap only over pods in the 'running' or 'pending' phase, at the cloud level and at the pod template level, so that the cap applies only to those pods which are alive or about to be. Change-Id: Id77e837aa9a42742618cd3c543e7b99f9d38a10a
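The cap arithmetic described in the commit message can be illustrated with a tiny sketch. The numbers and the `slotsAvailable` helper are hypothetical, chosen to mirror the 40%/60% scenario above.

```java
public class CapExample {
    /** Slots left under the cap, given how many pods are counted against it. */
    static int slotsAvailable(int cap, int countedPods) {
        return Math.max(0, cap - countedPods);
    }

    public static void main(String[] args) {
        int cap = 10;      // instance/pod cap on the cloud
        int failed = 4;    // pods in the Failed phase (40%)
        int alive = 6;     // pods Running or Pending (60%)

        // Counting every pod: the cap is exhausted, jobs starve.
        System.out.println(slotsAvailable(cap, failed + alive));
        // Counting only alive pods (this PR's option): 4 more can spawn.
        System.out.println(slotsAvailable(cap, alive));
    }
}
```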
Force-pushed 020eaf6 to a615156
I've updated the PR with a less weird name 😉
Finally, after 5 months! 🎉
sorry, thanks!