-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v3.3.1: only the leader workflow-controller pod can expose metrics. #8283
Comments
@wshi5985: There are no area labels on this issue. Adding a label will help expedite your issue. If you are unsure what to do, make your best guess. We can change it later
DetailsI am a bot created to help the argoproj developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the DeFiCh/oss-governance-bot repository. |
/area metrics |
@whynowy we only have one pod serving metrics in v3.3. The service should always route to that pod. Is that something we can do, do you know? |
Signed-off-by: Alex Collins <alex_collins@intuit.com>
@wshi5985 could you please try adding the readiness check in the attached PR? |
Why adding a readiness check helps? @alexec |
You can put a service in front of the pods for the metrics (we should vend that). It should work. |
Non-leader pods should not expose metrics. They don't have anything interesting. |
This means the standby pod will be periodically restarted. |
I think that is a liveness probe. If the readiness probe fails, then the pod is just not ready. Correct me if I'm wrong? |
as expected, one pod not ready.
This means there will always be one pod in notReady mode ..... This will trigger our monitor alert i think. |
Removing the metrics endpoint was not intentional, side-effect of other changes. Do you think it provides useful data? |
plus it is tricky for update/deploy, since new pod could not come up while the old pod is leader. |
@alexec we are not using those metrics right now. we noticed this when prometheus server shows errors. |
I'm unclear. Is there another problem here? You should not have to delete any pods for the stand-by leader to come up. |
i don't see other problems. there is a rolling update strategy,
when apply the new deploy yaml, one new pod tried to come up, the old 2 pods were holding there waiting for the new one up. but the new one kept in notready mode. the 3 pods hanging there. deploy stuck. until i deleted the old replicaset which deleted the 2 old pods. the first new pod became leader and ready, and the second pod came up in notready mode too. |
Are you saying we can't use readiness because it prevents a rolling update? |
It looks like a pod could not be picked as leader if it is not ready just did another test, set rolling update strategy to 50%, with readinessProbe configured if remove readinessProbe, both pod ready, and one picked as leader. no need to delete old replicaset. |
btw, if we remove the workflow-controller-metrics service, what will be the impact (beside no metric exposed)? anything depends on the exposed metrics ? |
Leader election is unreleated to readiness. I don't believe readiness (or liveness) will affect it. |
Why don't abandon using |
during deployment, when readinessprobe failed, pod "ready" status showed "0/1"(instead of "1/1") and deployment stuck. i am not sure why. |
I don't think you can use rolling update at all. Consider this. You have 3 replicas. If you want to use service, you must have readiness check, otherwise you get errors. Only one is leader, so only one is ready. When rolling update occurs, the leader will not be deleted until another replica becomes ready. This will never happen, because replicas only become ready when they become leader. Catch 22! Instead, you could use: strategy:
type: Recreate Could you please try that? |
…rgoproj#8285)" This reverts commit 283f6b5. Signed-off-by: Alex Collins <alex_collins@intuit.com>
@whynowy Can you plz elaborate a bit more on
|
FWIW adding the readinessCheck (& switching to strategy Recreate) didn't actually fix the kube-prometheus-stack ... but even if that worked it doesn't seem a good solution, as then we'd have a pod continually sitting "unready" which would trigger another alert looking for containers staying "unready" for too long. Could we not have the metrics endpoint back, but serving up no metrics values? This idea just as an alternative to @MatthewHou 's suggestion above if you don't want to edit your own metadata on the fly (not sure which is easiest/cleanest). |
@Tryken Totally Agree! Deliberately making replicas unhealthy is a hacky workaround (And I think it won't work in Prometheus' issue). An "Unhealthy" Pod will trigger other alerts for Pods staying unhealthy for a long time because you shouldn't have unhealthy pods in the cluster for too long. Also, the pod is not really unhealthy... it's just idle 🤷🏻 Other solutions like adding Per-Pod Monitor in Prometheus to let it ignore non-leader pods are also hacky, as you'll need to re-adjust Prometheus whenever there is a new leader. I believe having |
@alexec sorry to bother, but is there any chance this will be reopened? I can see it is still an very annoying issue. |
This issue shouldn't have been closed :/ |
#11295 should fix this. Could you try |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
Still seeing the issue. Using the version 3.4.11. I have replicaset as 3 and when trying to access metrics through the service, i see the response is empty otherwise works if i explicitly portforward from the leader pod. |
@sakai-ast Would you like to help take a look to see what's still missing? |
@srinivin @terrytangyuan Before #11295, you get an error like "Connection refused" when your service accesses the non-leader pods. I also tested the behavior of v3.4.11 and v3.4.9(before the fix version) with curl to the service locally. v3.4.11when accessing non-leader pod
when accessing leader pod
v3.4.9when accessing non-leader pod
when accessing leader pod
If your service always gets an empty response, that is an another problem. |
Thanks for the update. I assume in these cases, shouldnt it behave same as a like sql write and read replicas? I mean though we direct write to only primary relicas, however read can happen from any replicas. same ways, though the leader pod is responsible for triggering workflows and other capablities, shouldnt the non-leader pods provide the metrics metadata. I assume the state of the metrics can be in a sharable persistence layer b/w leader and non-leader pods? |
Checklist
Summary
We upgraded argo to 3.3.1 from 3.1.3, There are 2 workflow-controller replica pods
somehow after upgrade, only the leader workflow-controller pod can expose metrics.
What version are you running? v3.3.1
Diagnostics
From our prometheus server UI, it shows one of the workflow-controller pod's metrics target DOWN with error: Get "http://10.127.217.135:9090/metrics": dial tcp 10.127.217.135:9090: connect: connection refused
While the other pod(the leader pod) shows UP, status is ok.
We tried run "wget" to service endpoint workflow-controller-metrics port 9090 (from a testing pod), it could intermittently get metrics result back, and half the time it returns error "wget: can't connect to remote host (172.20.116.209): Connection refused"
no related errors found in pod log.
Argo runs in our multiple eks clusters, we upgraded argo to v3.3.1 in two clusters, both have the same issue. the rest clusters with argo v3.1.3 does not have this problem.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.
The text was updated successfully, but these errors were encountered: