This repository has been archived by the owner on Sep 2, 2024. It is now read-only.

[scheduler,mtsource] Pod not found when using kubeclient to get Pod #596

Closed
aavarghese opened this issue May 4, 2021 · 2 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@aavarghese
Contributor

aavarghese commented May 4, 2021

Describe the bug
s.podClient.Get(context.Background(), podName, metav1.GetOptions{}) occasionally returns an error:
"error":"pods \"kafkasource-mt-adapter-4\" not found" (in the controller log). In most cases the same call succeeds during the next reconcile.

When the pod lookup fails, we can't get the node info, and therefore can't get the zone info for that node.

Today, when we get this error, we silently continue to the next iteration of the loop.
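
The skip-and-continue behavior described above can be sketched roughly as follows. This is a standard-library-only illustration, not the scheduler's actual code: errNotFound, podGetter, and zonesByPod are hypothetical names, and the pod and node data in main are made up. Real code would call the Kubernetes client and could distinguish this case with IsNotFound from k8s.io/apimachinery/pkg/api/errors.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for a Kubernetes NotFound API error.
var errNotFound = errors.New("not found")

// podGetter is a hypothetical lookup from pod name to the node it runs on.
type podGetter func(podName string) (nodeName string, err error)

// zonesByPod maps every pod it can find to the zone of its node.
// A pod whose lookup fails is skipped and recorded in skipped, so the
// loop moves on to the next pod instead of aborting the whole pass.
func zonesByPod(get podGetter, nodeZones map[string]string, podNames []string) (zones map[string]string, skipped []string) {
	zones = make(map[string]string)
	for _, name := range podNames {
		node, err := get(name)
		if err != nil {
			skipped = append(skipped, name)
			continue
		}
		zones[name] = nodeZones[node]
	}
	return zones, skipped
}

func main() {
	// Made-up cluster state: kafkasource-mt-adapter-4 is not visible yet.
	pods := map[string]string{
		"kafkasource-mt-adapter-0": "node-a",
		"kafkasource-mt-adapter-1": "node-b",
	}
	get := func(name string) (string, error) {
		node, ok := pods[name]
		if !ok {
			return "", fmt.Errorf("pods %q: %w", name, errNotFound)
		}
		return node, nil
	}
	nodeZones := map[string]string{"node-a": "zone-1", "node-b": "zone-2"}

	zones, skipped := zonesByPod(get, nodeZones, []string{
		"kafkasource-mt-adapter-0",
		"kafkasource-mt-adapter-1",
		"kafkasource-mt-adapter-4",
	})
	fmt.Println("zones:", zones)
	fmt.Println("skipped:", skipped)
}
```

Returning the skipped names, rather than only logging them, lets a caller decide whether the missing zone info matters for the current placement.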

More info in #587, #463

Expected behavior
Seeing this error while a new pod is coming up is understandable. But if the pod is running and ready, kubeclient should be able to find it.

To Reproduce
Create a KafkaSource with a large number of consumers and let the autoscaler scale it up. Open the controller log and you'll find the error message above.

Knative release version


@aavarghese aavarghese added the kind/bug Categorizes issue or PR as related to a bug. label May 4, 2021
@aavarghese
Contributor Author

Update: this is the error you see in the logs with the latest HA code changes, which use the pod lister:

{"level":"error","ts":"2021-06-10T18:57:10.545Z","logger":"kafka-controller.add replicas","caller":"statefulset/scheduler.go:390","msg":"Error getting zone info from pod","error":"pod \"kafkasource-mt-adapter-2\" not found","stacktrace":"knative.dev/eventing-kafka/pkg/common/scheduler/statefulset.(*StatefulSetScheduler).addReplicasEvenSpread\n\tknative.dev/eventing-kafka/pkg/common/scheduler/statefulset/scheduler.go:390\nknative.dev/eventing-kafka/pkg/common/scheduler/statefulset.(*StatefulSetScheduler).scheduleVPod\n\tknative.dev/eventing-kafka/pkg/common/scheduler/statefulset/scheduler.go:195\nknative.dev/eventing-kafka/pkg/common/scheduler/statefulset.(*StatefulSetScheduler).Schedule\n\tknative.dev/eventing-kafka/pkg/common/scheduler/statefulset/scheduler.go:125\nknative.dev/eventing-kafka/pkg/source/reconciler/mtsource.(*Reconciler).reconcileMTReceiveAdapter\n\tknative.dev/eventing-kafka/pkg/source/reconciler/mtsource/kafkasource.go:110\nknative.dev/eventing-kafka/pkg/source/reconciler/mtsource.(*Reconciler).ReconcileKind\n\tknative.dev/eventing-kafka/pkg/source/reconciler/mtsource/kafkasource.go:100\nknative.dev/eventing-kafka/pkg/client/injection/reconciler/sources/v1beta1/kafkasource.(*reconcilerImpl).Reconcile\n\tknative.dev/eventing-kafka/pkg/client/injection/reconciler/sources/v1beta1/kafkasource/reconciler.go:250\nknative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tknative.dev/pkg@v0.0.0-20210609135543-c1db741846b8/controller/controller.go:531\nknative.dev/pkg/controller.(*Impl).RunContext.func3\n\tknative.dev/pkg@v0.0.0-20210609135543-c1db741846b8/controller/controller.go:468"}
{"level":"error","ts":"2021-06-10T18:57:10.545Z","logger":"kafka-controller.add replicas","caller":"statefulset/scheduler.go:390","msg":"Error getting zone info from pod","error":"pod \"kafkasource-mt-adapter-3\" not found","stacktrace":"knative.dev/eventing-kafka/pkg/common/scheduler/statefulset.(*StatefulSetScheduler).addReplicasEvenSpread\n\tknative.dev/eventing-kafka/pkg/common/scheduler/statefulset/scheduler.go:390\nknative.dev/eventing-kafka/pkg/common/scheduler/statefulset.(*StatefulSetScheduler).scheduleVPod\n\tknative.dev/eventing-kafka/pkg/common/scheduler/statefulset/scheduler.go:195\nknative.dev/eventing-kafka/pkg/common/scheduler/statefulset.(*StatefulSetScheduler).Schedule\n\tknative.dev/eventing-kafka/pkg/common/scheduler/statefulset/scheduler.go:125\nknative.dev/eventing-kafka/pkg/source/reconciler/mtsource.(*Reconciler).reconcileMTReceiveAdapter\n\tknative.dev/eventing-kafka/pkg/source/reconciler/mtsource/kafkasource.go:110\nknative.dev/eventing-kafka/pkg/source/reconciler/mtsource.(*Reconciler).ReconcileKind\n\tknative.dev/eventing-kafka/pkg/source/reconciler/mtsource/kafkasource.go:100\nknative.dev/eventing-kafka/pkg/client/injection/reconciler/sources/v1beta1/kafkasource.(*reconcilerImpl).Reconcile\n\tknative.dev/eventing-kafka/pkg/client/injection/reconciler/sources/v1beta1/kafkasource/reconciler.go:250\nknative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tknative.dev/pkg@v0.0.0-20210609135543-c1db741846b8/controller/controller.go:531\nknative.dev/pkg/controller.(*Impl).RunContext.func3\n\tknative.dev/pkg@v0.0.0-20210609135543-c1db741846b8/controller/controller.go:468"}
{"level":"error","ts":"2021-06-10T18:57:10.545Z","logger":"kafka-controller.add replicas","caller":"statefulset/scheduler.go:390","msg":"Error getting zone info from pod","error":"pod \"kafkasource-mt-adapter-4\" not found","stacktrace":"knative.dev/eventing-kafka/pkg/common/scheduler/statefulset.(*StatefulSetScheduler).addReplicasEvenSpread\n\tknative.dev/eventing-kafka/pkg/common/scheduler/statefulset/scheduler.go:390\nknative.dev/eventing-kafka/pkg/common/scheduler/statefulset.(*StatefulSetScheduler).scheduleVPod\n\tknative.dev/eventing-kafka/pkg/common/scheduler/statefulset/scheduler.go:195\nknative.dev/eventing-kafka/pkg/common/scheduler/statefulset.(*StatefulSetScheduler).Schedule\n\tknative.dev/eventing-kafka/pkg/common/scheduler/statefulset/scheduler.go:125\nknative.dev/eventing-kafka/pkg/source/reconciler/mtsource.(*Reconciler).reconcileMTReceiveAdapter\n\tknative.dev/eventing-kafka/pkg/source/reconciler/mtsource/kafkasource.go:110\nknative.dev/eventing-kafka/pkg/source/reconciler/mtsource.(*Reconciler).ReconcileKind\n\tknative.dev/eventing-kafka/pkg/source/reconciler/mtsource/kafkasource.go:100\nknative.dev/eventing-kafka/pkg/client/injection/reconciler/sources/v1beta1/kafkasource.(*reconcilerImpl).Reconcile\n\tknative.dev/eventing-kafka/pkg/client/injection/reconciler/sources/v1beta1/kafkasource/reconciler.go:250\nknative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tknative.dev/pkg@v0.0.0-20210609135543-c1db741846b8/controller/controller.go:531\nknative.dev/pkg/controller.(*Impl).RunContext.func3\n\tknative.dev/pkg@v0.0.0-20210609135543-c1db741846b8/controller/controller.go:468"}

@aavarghese
Contributor Author

After further analysis, it looks like this error is expected and harmless. When the StatefulSet scales up and a new pod comes up, the pod is not yet available to (not found by) the pod lister. After some time, this resolves and the pod is used for placements.

This delay in listing pods isn't affecting placements from multiple sources, so I'm closing this issue. Nothing to be done here.
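
For readers wondering why a later reconcile succeeds, here is a small sketch of the effect, assuming the pod lister serves reads from a local cache that is filled asynchronously. It uses only the standard library; podCache, observe, and reconcile are hypothetical names for illustration, not the scheduler's or client-go's API.

```go
package main

import "fmt"

// podCache is a hypothetical stand-in for a cache-backed pod lister:
// reads are served from a local map that is updated asynchronously.
type podCache struct {
	known map[string]bool
}

// observe simulates the cache receiving an "added" event for a pod.
func (c *podCache) observe(podName string) {
	c.known[podName] = true
}

// reconcile splits the wanted pods into those usable for placement now
// and those that must wait for a later pass because the cache has not
// seen them yet.
func reconcile(c *podCache, wanted []string) (ready, pending []string) {
	for _, name := range wanted {
		if c.known[name] {
			ready = append(ready, name)
		} else {
			pending = append(pending, name)
		}
	}
	return ready, pending
}

func main() {
	cache := &podCache{known: map[string]bool{"kafkasource-mt-adapter-0": true}}
	wanted := []string{"kafkasource-mt-adapter-0", "kafkasource-mt-adapter-1"}

	// First pass: the StatefulSet has scaled up, but the cache lags behind.
	ready, pending := reconcile(cache, wanted)
	fmt.Println("ready:", ready, "pending:", pending)

	// The cache catches up, so the next pass finds every pod.
	cache.observe("kafkasource-mt-adapter-1")
	ready, pending = reconcile(cache, wanted)
	fmt.Println("ready:", ready, "pending:", pending)
}
```

In this sketch a pending pod is simply left out of the current pass and picked up on the next one, which matches the observation that placements are not affected.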
