Fix kpo log_events_on_failure logs warnings at warning level #54967
romsharon98
left a comment
According to #37944, the PR that introduced this change, and the Kubernetes docs, there are only Normal and Warning event types. Warning events are therefore logged as errors on purpose.
@romsharon98 I don't think it was an intentional choice. It was just a transition from the status quo before #37944, when every event was logged as an error. Fundamentally, I don't think any Pod Event logs should be logged as errors. I'm not sure how to say it more simply than "Warning type events should be logged as warning". Maybe the PR author, @sudiptob2, can chime in. Or perhaps the original issue reporter, though #36077 and #54964 were both filed by me.
Let's put it this way: what is the real error message in this KubernetesPodOperator task output? I have many devs who get confused and report to DevOps that the K8s cluster is broken because they see pod event errors like FailedMount, FailedScheduling, etc. However, if their KPO timeout had been longer, the cluster would have resolved the problem on its own.
providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py
ae08100 to 1f9e19a
1f9e19a to fc35cbc
@romsharon98 Done and rebased.
Resolves #54964
K8s pod events of type=Warning were reported to Airflow logs at the error level. This is too high a level for the intent of that event type. I chose to report all Warning events at the warning level and everything else at info, which I think follows the documented purpose of K8s events. Most if not all pod events are retryable within K8s' scheduler. I have seen these events, when logged as errors, trick pipeline users/operators into thinking the K8s scheduler is at fault when the cause is more likely pod TTL issues.
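The mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `PodEvent`, `level_for_event`, and `log_pod_events` are hypothetical names, and the sketch assumes an event only needs a type, a reason, and a message.

```python
import logging
from dataclasses import dataclass


@dataclass
class PodEvent:
    """Hypothetical stand-in for a Kubernetes pod event (not the real client model)."""

    type: str  # Kubernetes event types are "Normal" and "Warning"
    reason: str  # e.g. "FailedMount", "FailedScheduling"
    message: str


def level_for_event(event_type: str) -> int:
    """Map Warning events to WARNING and every other event type to INFO."""
    return logging.WARNING if event_type == "Warning" else logging.INFO


def log_pod_events(events, logger: logging.Logger) -> None:
    """Log each event at the level matching its type, never at ERROR."""
    for event in events:
        logger.log(
            level_for_event(event.type),
            "Pod Event: %s - %s",
            event.reason,
            event.message,
        )
```

With this shape, a retryable event such as FailedMount appears as a warning in the task log, so the actual task failure remains the only error-level entry.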