-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
flux logs chooses wrong pod with aks flux-extension #3944
Comments
Both pods mentioned are not part of the official Flux distribution, but rather part of the custom AKS distribution maintained by Microsoft. They should either ensure that they are not labeled as "part of Flux", which is what the |
In addition to this, we could make the command more graceful when it encounters Deployments and/or Pods that are apparently doing weird things. Instead of failing with a hard stop. |
In certain situations there might be 3rd-party pods running in the Flux namespace that cause the command to fail streaming logs, e.g. when they have multiple containers but none of them is called `manager` (which all Flux-maintained pods do). An example of such a situation is when Flux is installed with the 3rd-party Flux extension on AKS. The `logs` command is now more forgiving and merely logs an error in these situations instead of completely bailing out. For the parallel log streaming with `-f` the code is now a little more complex so that errors are now written to stderr in parallel with all other logs written to stdout. That's what `asyncCopy` is for. refs #3944 Signed-off-by: Max Jonas Werner <mail@makk.es>
In certain situations there might be 3rd-party pods running in the Flux namespace that cause the command to fail streaming logs, e.g. when they have multiple containers but none of them is called `manager` (which all Flux-maintained pods do). An example of such a situation is when Flux is installed with the 3rd-party Flux extension on AKS. The `logs` command is now more forgiving and merely logs an error in these situations instead of completely bailing out. For the parallel log streaming with `-f` the code is now a little more complex so that errors are now written to stderr in parallel with all other logs written to stdout. That's what `asyncCopy` is for. A behavioral difference now is that the command will not exit with an error, anymore, except for the case where it can't write to stdout/stderr in the parallel case. refs #3944 Signed-off-by: Max Jonas Werner <mail@makk.es>
In certain situations there might be 3rd-party pods running in the Flux namespace that cause the command to fail streaming logs, e.g. when they have multiple containers but none of them is called `manager` (which all Flux-maintained pods do). An example of such a situation is when Flux is installed with the 3rd-party Flux extension on AKS. The `logs` command is now more forgiving and merely logs an error in these situations instead of completely bailing out. For the parallel log streaming with `-f` the code is now a little more complex so that errors are now written to stderr in parallel with all other logs written to stdout. That's what `asyncCopy` is for. refs #3944 Signed-off-by: Max Jonas Werner <mail@makk.es>
In certain situations there might be 3rd-party pods running in the Flux namespace that cause the command to fail streaming logs, e.g. when they have multiple containers but none of them is called `manager` (which all Flux-maintained pods do). An example of such a situation is when Flux is installed with the 3rd-party Flux extension on AKS. The `logs` command is now more forgiving and merely logs an error in these situations instead of completely bailing out. For the parallel log streaming with `-f` the code is now a little more complex so that errors are now written to stderr in parallel with all other logs written to stdout. That's what `asyncCopy` is for. refs #3944 Signed-off-by: Max Jonas Werner <mail@makk.es>
In certain situations there might be 3rd-party pods running in the Flux namespace that cause the command to fail streaming logs, e.g. when they have multiple containers but none of them is called `manager` (which all Flux-maintained pods do). An example of such a situation is when Flux is installed with the 3rd-party Flux extension on AKS. The `logs` command is now more forgiving and merely logs an error in these situations instead of completely bailing out. It still returns a non-zero exit code. For the parallel log streaming with `-f` the code is now a little more complex so that errors are now written to stderr in parallel with all other logs written to stdout. That's what `asyncCopy` is for. refs #3944 Signed-off-by: Max Jonas Werner <mail@makk.es>
UX changes: - Only print an error when a pod doesn't have a matching container instead of exiting early. - Return a non-zero status code when no pod is found at all. Details: In certain situations there might be 3rd-party pods running in the Flux namespace that cause the command to fail streaming logs, e.g. when they have multiple containers but none of them is called `manager` (which all Flux-maintained pods do). An example of such a situation is when Flux is installed with the 3rd-party Flux extension on AKS. The `logs` command is now more forgiving and merely logs an error in these situations instead of completely bailing out. It still returns a non-zero exit code. For the parallel log streaming with `-f` the code is now a little more complex so that errors are now written to stderr in parallel with all other logs written to stdout. That's what `asyncCopy` is for. refs #3944 Signed-off-by: Max Jonas Werner <mail@makk.es>
UX changes: - Only print an error when a pod doesn't have a matching container instead of exiting early. - Return a non-zero status code when no pod is found at all. Details: In certain situations there might be 3rd-party pods running in the Flux namespace that cause the command to fail streaming logs, e.g. when they have multiple containers but none of them is called `manager` (which all Flux-maintained pods do). An example of such a situation is when Flux is installed with the 3rd-party Flux extension on AKS. The `logs` command is now more forgiving and merely logs an error in these situations instead of completely bailing out. It still returns a non-zero exit code. For the parallel log streaming with `-f` the code is now a little more complex so that errors are now written to stderr in parallel with all other logs written to stdout. That's what `asyncCopy` is for. refs #3944 Signed-off-by: Max Jonas Werner <mail@makk.es>
@eff917 a PR addressing your issue has been merged and the change will be released with the next Flux version that I'd love you to try out if you can and provide your feedback on it and if it solves your issue. |
Thank you, it seems to be working, with all the errors visible at the start and at the end:
|
Great. Just to clarify: Don't expect the |
Describe the bug
flux logs
command tries to get the manager container from the fluxconfig-agent pod, however when installing the flux extension in AKS, the manager container is in the fluxconfig-controller pod.Versions:
azure AKS cluster 1.24.10
microsoft.flux extension: 1.7.3 (https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/azure-arc/kubernetes/extensions-release.md#173-april-2023)
flux client: flux version 2.0.0-rc.5
running flux logs returns the following result:
Steps to reproduce
flux logs
Expected behavior
The logs should be streamed from the fluxconfig-controller pod instead of the fluxconfig-agent pod
Screenshots and recordings
No response
OS / Distro
arch linux
Flux version
2.0.0-rc.5
Flux check
► checking prerequisites
✔ Kubernetes 1.24.10 >=1.20.6-0
► checking controllers
✔ fluxconfig-agent: deployment ready
► mcr.microsoft.com/azurek8sflux/fluxconfig-agent:1.7.3
► mcr.microsoft.com/azurek8sflux/fluent-bit:1.7.3
✔ fluxconfig-controller: deployment ready
► mcr.microsoft.com/azurek8sflux/fluxconfig-controller:1.7.3
► mcr.microsoft.com/azurek8sflux/fluent-bit:1.7.3
✔ helm-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/helm-controller:v0.31.2
✔ image-automation-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/image-automation-controller:v0.31.0
✔ image-reflector-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/image-reflector-controller:v0.26.1
✔ kustomize-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/kustomize-controller:v0.35.1
✔ notification-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/notification-controller:v0.33.0
✔ source-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/source-controller:v0.36.1
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta2
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ fluxconfigs.clusterconfig.azure.com/v1alpha1
✔ gitrepositories.source.toolkit.fluxcd.io/v1beta2
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta2
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta2
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta1
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1beta2
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta2
✔ receivers.notification.toolkit.fluxcd.io/v1beta2
✔ all checks passed
Git provider
No response
Container Registry provider
No response
Additional context
No response
Code of Conduct
The text was updated successfully, but these errors were encountered: