
Add option to apply caps only on alive pods #252

Merged · 1 commit, merged on May 29, 2018

Conversation

@nehaljwani (Contributor) commented Nov 17, 2017

A pod can be in one of five phases: running, pending, succeeded, failed or unknown. Pods which are not in the running/pending state do not consume any CPU/memory resources; they just need to be GC'ed. If a cap has been specified on a k8s cloud (single namespace) and, say, 40% of the counted pods are in the 'failed' phase while 60% are 'running/pending', newer pods will not be spawned in that namespace, because the plugin counts all of those pods against the instance/pod cap. Even though it could have spawned 40% more pods, it won't, and jobs will starve. This patch adds an option to calculate the cap only for pods in the 'running' or 'pending' phase, at the cloud level and at the pod template level.
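
For illustration, a minimal sketch of the idea (not the actual patch; the class, method, and parameter names are hypothetical), assuming the fabric8 KubernetesClient API the plugin already uses:

```java
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.client.KubernetesClient;

import java.util.List;
import java.util.Map;

// Hypothetical helper: count only pods in the Running or Pending phase against
// the cap, so pods that merely await garbage collection (Succeeded, Failed,
// Unknown) no longer block provisioning of new agents.
class AliveCapSketch {
    static boolean belowCap(KubernetesClient client, String namespace,
                            Map<String, String> labelsMap, int podCap) {
        List<Pod> pods = client.pods()
                .inNamespace(namespace)
                .withLabels(labelsMap)
                .list()
                .getItems();

        long alive = pods.stream()
                .filter(p -> {
                    String phase = p.getStatus().getPhase();
                    return "Running".equals(phase) || "Pending".equals(phase);
                })
                .count();

        return alive < podCap;
    }
}
```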

@nehaljwani nehaljwani force-pushed the cap-not-undead-pods branch 3 times, most recently from 5ee08a9 to dcb8b00 on November 19, 2017 19:33
@carlossg (Contributor)

The problem I see is that the errored pods fail for a reason (i.e. a wrong image), and this would just enter an infinite loop of spawning new pods.

@nehaljwani (Contributor, Author)

Well, the option is opt-in.

Perhaps the plugin should optionally also provide a strategy (like, exponential back-off) to delay the launch of new pods?
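
(Purely to illustrate the suggestion, a minimal back-off sketch with entirely hypothetical names; nothing of the sort exists in the plugin today:)

```java
// Hypothetical: delay the next provisioning attempt exponentially with the
// number of consecutive pod failures, capped at five minutes.
class ProvisioningBackoffSketch {
    static long delayMillis(int consecutiveFailures) {
        long baseMillis = 1_000L;                      // 1 second initial delay
        long maxMillis  = 5L * 60L * 1_000L;           // cap at 5 minutes
        int shift = Math.min(consecutiveFailures, 20); // keep the shift in range
        return Math.min(baseMillis << shift, maxMillis);
    }
}
```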

@nehaljwani (Contributor, Author) commented Nov 28, 2017

@carlossg Should I add a warning in the description of the checkbox that this can lead to an infinite loop of spawning new pods?

@carlossg carlossg (Contributor) left a comment

I think it looks mostly OK, see my comment, but capOnlyOnUnDeadPods sounds really weird to me; wouldn't capOnlyOnAlivePods be more intuitive?

@@ -424,6 +436,22 @@ private boolean addProvisionedSlave(@Nonnull PodTemplate template, @CheckForNull
PodList namedList = client.pods().inNamespace(templateNamespace).withLabels(labelsMap).list();
List<Pod> namedListItems = namedList.getItems();

if (this.isCapOnlyOnUnDeadPods()) {
@carlossg (Contributor)
Instead of getting the full list and then filtering here, can't it be done with something like withFields("status.phase", "running|pending")? Not sure what the right syntax would be, but it should be possible based on kubernetes/kubernetes#49387.

@nehaljwani (Contributor, Author)

I can do:

./kubectl get pods --field-selector=status.phase='Succeeded'

I can also do:

./kubectl get pods --field-selector=status.phase='Failed'

But I can't do:

./kubectl get pods --field-selector=status.phase='Succeeded|Failed'

Neither can I do:

./kubectl get pods --field-selector='status.phase in (Succeeded, Failed)'

I couldn't find any official documentation on field selectors 😢 https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#

I also looked at the implementation of withFields and getFieldQueryParam, and it doesn't seem like such a selector has been implemented yet. 😭
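
(For context, a sketch of what does seem to work today with the fabric8 client: a single exact-match field selector per query, which can't express "Running OR Pending" in one call. The namespace used here is only an example.)

```java
import io.fabric8.kubernetes.api.model.PodList;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class FieldSelectorSketch {
    public static void main(String[] args) {
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // Equality-only field selector: one phase per query, so matching
            // both Running and Pending needs two queries or client-side filtering.
            PodList running = client.pods()
                    .inNamespace("jenkins")
                    .withField("status.phase", "Running")
                    .list();
            System.out.println("Running pods: " + running.getItems().size());
        }
    }
}
```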

@nehaljwani (Contributor, Author)
However, I can do:

./kubectl get pods --field-selector='status.phase!=Failed,status.phase!=Succeeded,status.phase!=Unknown'

But again, we really need a withFieldSelector() function at https://github.com/fabric8io/kubernetes-client/blob/v3.1.0/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/Filterable.java

A pod can be in one of five phases: running, pending, succeeded, failed or unknown. Pods which are not in the running/pending phase do not consume any CPU/memory resources; they just need to be GC'ed. If a cap has been specified on a k8s cloud (single namespace) and, say, 40% of the counted pods are in the 'failed' phase while 60% are 'running/pending', newer pods will not be spawned in that namespace, because the plugin counts all of those pods against the instance/pod cap. Even though it could have spawned 40% more pods, it won't, and jobs will starve. This patch adds an option to calculate the cap only for pods in the 'running' or 'pending' phase, at the cloud level and at the pod template level, so that the cap applies only to those pods which are alive or about to be.

Change-Id: Id77e837aa9a42742618cd3c543e7b99f9d38a10a
@nehaljwani (Contributor, Author)

I've updated the PR with a less weird name 😉

@nehaljwani nehaljwani changed the title from "Add option to apply caps only on undead pods" to "Add option to apply caps only on alive pods" on Dec 3, 2017
@nehaljwani (Contributor, Author)

Finally, after 5 months! 🎉

@carlossg (Contributor)

sorry, thanks!
