Fails to prioritize some nodepools using preferredDuringSchedulingIgnoredDuringExecution #13924
Comments
Could you check whether the pod has the nodeAffinity written in the workflow?
@shuangkun when I check in the Argo Workflows GUI, the preferredDuringSchedulingIgnoredDuringExecution entries are set correctly (at least I think so).
@Joibel we tried both approaches: first in the workflowSpec, as mentioned in the documentation, and then in the pod's template itself, as shown in the previous message.
Please check the actual pod.
@Joibel if you refer to the workflow I provided in the first message, there are only two pod templates: they are called several times, which makes the autoscaler deploy several nodes. My answer to shuangkun shows the view of the pod template directly in the workflow view of the Argo Workflows GUI. I also updated my first message to provide the results of the two kubectl commands. In the log of the second one, we can see in the "Executor initialized" message that the affinity part appears. Do you want me to provide other information for a pod?
Argo Workflows doesn't schedule pods. Argo Workflows creates Kubernetes pods, and then it is up to Kubernetes to perform the scheduling. I'm asking for the actual pod YAML to ensure that Argo Workflows is creating the pods with the correct contents; this is why I'm asking again specifically for that. If it is, then the fault is not with Workflows, but with your cluster. If you create the same pod manually, you should be able to recreate the problem and diagnose it.
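For example, something along these lines (pod name and namespace are placeholders) would show whether the affinity made it onto the pod the controller created:

```shell
# Full pod manifest as created by the workflow controller
kubectl -n <namespace> get pod <pod-name> -o yaml

# Or just the affinity stanza
kubectl -n <namespace> get pod <pod-name> -o jsonpath='{.spec.affinity}'
```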
Pre-requisites
I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
What happened? What did you expect to happen?
Hello,
What I want to do with Argo Workflows:
I want to launch my pods on nodes from specific nodepools, in order of priority. To do that, I'm trying to use affinity > nodeAffinity > preferredDuringSchedulingIgnoredDuringExecution.
Below is an extract of my workflow. The goal is to have the autoscaler deploy nodes in my preferred order of nodepools: k8s-asp-dev-pool-var-b2-15 > k8s-asp-dev-pool-var-b2-30 > k8s-asp-dev-pool-fix-r2-120 > k8s-asp-dev-pool-var-r2-120 > ... > k8s-asp-dev-pool-var-b2-120 > k8s-asp-dev-pool-var-c2-120
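(The extract itself is not reproduced here. Purely for illustration, a minimal sketch of the kind of preference block described above could look like the following, placed either under the Workflow spec or under an individual template; the node label key `nodepool` and the weights are assumptions, not the exact values from my workflow.)

```yaml
# Illustrative sketch only: the label key "nodepool" and the weights are assumptions.
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100                    # most preferred pool
        preference:
          matchExpressions:
            - key: nodepool
              operator: In
              values: ["k8s-asp-dev-pool-var-b2-15"]
      - weight: 90
        preference:
          matchExpressions:
            - key: nodepool
              operator: In
              values: ["k8s-asp-dev-pool-var-b2-30"]
      - weight: 1                      # least preferred pool
        preference:
          matchExpressions:
            - key: nodepool
              operator: In
              values: ["k8s-asp-dev-pool-var-c2-120"]
```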
What I got:
When I launch my workflow, the nodes deployed were not from the expected nodepools.
As you can see in the attached picture, the nodes were deployed from the nodepools k8s-asp-dev-pool-var and k8s-asp-dev-pool-var-c2-60 (which is not even in the list of desired nodepools).
The expected nodepool was k8s-asp-dev-pool-var-r2-120, which is the one with the highest weight, or at least k8s-asp-dev-pool-var-c2-120, which is the one with the lowest weight.
Can you see what I did wrong?
I did some tests with requiredDuringSchedulingIgnoredDuringExecution instead of preferredDuringSchedulingIgnoredDuringExecution and it worked without any problem, but that is not what I want to implement.
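(For reference, the hard-requirement variant mentioned above looks roughly like this, again with the hypothetical `nodepool` label key; it restricts scheduling to the listed pool instead of ranking pools by preference.)

```yaml
# Illustrative sketch only, not the exact spec used in the test above.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: nodepool
              operator: In
              values: ["k8s-asp-dev-pool-var-r2-120"]
```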
About my configuration:
About kubectl:
kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.16", GitCommit:"c5f43560a4f98f2af3743a59299fb79f07924373", GitTreeState:"clean", BuildDate:"2023-11-15T22:28:05Z", GoVersion:"go1.20.10", Compiler:"gc", Platform:"linux/amd64"}
About Argo Workflows: we are on v3.4.4.
Thanks for your help
Version(s)
v3.4.4
Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
Logs from the workflow controller
Logs from your workflow's wait container
The result of the command is in the file attached
kubectl_log_argo_wait.txt