[KubernetesPodOperator] Reads Kubernetes events and writes them into log #50192
```diff
@@ -18,6 +18,7 @@
 from __future__ import annotations

+import asyncio
 import datetime
 import json
 import logging
```
```diff
@@ -602,12 +603,20 @@ def get_or_create_pod(self, pod_request_obj: k8s.V1Pod, context: Context) -> k8s
     def await_pod_start(self, pod: k8s.V1Pod) -> None:
         try:
-            self.pod_manager.await_pod_start(
-                pod=pod,
-                schedule_timeout=self.schedule_timeout_seconds,
-                startup_timeout=self.startup_timeout_seconds,
-                check_interval=self.startup_check_interval_seconds,
-            )
+            loop = asyncio.get_event_loop()
```
Member: Specifically this: this is running in a normal sync worker, there is no running event loop, and this raises an exception.

Member: Hmmm, per the docs:

Member: Oh, I wonder if it's related to this, and some user code turning that into an Error. Trying to confirm.

Contributor: I am not sure what the right way to do parallel execution in a sync task is, then. For us this has been working in production for a year. But maybe our environment is not representative? The challenge is to parse and follow logs and events in parallel. There is no K8s API that delivers both concurrently, and flipping back and forth is very inefficient if you want to listen to the log stream. Therefore we took the async approach. Do you have more details on where and how it is "breaking"? Which environment? Note that we initially attempted to run another thread instead of using asyncio, but this was blocked by Celery and is probably also not advised.

Member: I don't have many details -- it's second-hand through one of the Astronomer customers. I think this is something turning the deprecation warning into an error -- I think the fix/workaround is to swap the manual loop control to `asyncio.run()`.

Contributor: Can we have a full report, including a way to reproduce, filed as a GH issue on this? It would also be great to include a test to prevent a regression.

Member: Trying to see what I can get. All I have right now is the stacktrace I put in a comment above.
```diff
+            events_task = asyncio.ensure_future(
+                self.pod_manager.watch_pod_events(pod, self.startup_check_interval_seconds)
+            )
+            loop.run_until_complete(
+                self.pod_manager.await_pod_start(
+                    pod=pod,
+                    schedule_timeout=self.schedule_timeout_seconds,
+                    startup_timeout=self.startup_timeout_seconds,
+                    check_interval=self.startup_check_interval_seconds,
+                )
+            )
+            loop.run_until_complete(events_task)
+            loop.close()
         except PodLaunchFailedException:
             if self.log_events_on_failure:
                 self._read_pod_events(pod, reraise=False)
```

(jason810496 marked this conversation as resolved.)
Member: I don't think you should need this -- it's likely needed to make `asyncio.get_event_loop()` pass, but a worker doesn't have it, and if we swap it to `asyncio.run()` it won't need this marker. I.e., adding this marker was working around the test failing in a way that is not representative of how the KPO actually runs?
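The reviewer's suggested direction can be sketched as follows. This is a hypothetical rewrite, not the PR's code: `watch_pod_events` and `await_pod_start` below are stand-in coroutines for the `pod_manager` methods, and the point is only that `asyncio.run()` owns the loop lifecycle (create, run, close), so the operator needs no `get_event_loop()` / `run_until_complete()` / `close()` bookkeeping and works from a thread with no pre-existing loop.

```python
import asyncio

async def watch_pod_events() -> str:
    # Stand-in for pod_manager.watch_pod_events: pretend to stream events.
    await asyncio.sleep(0)
    return "events watched"

async def await_pod_start() -> str:
    # Stand-in for pod_manager.await_pod_start: pretend to poll pod phase.
    await asyncio.sleep(0)
    return "pod started"

async def _start_pod_and_watch_events() -> tuple[str, str]:
    # Run the event watcher concurrently with the pod-start wait,
    # replacing ensure_future() + two run_until_complete() calls.
    events_task = asyncio.create_task(watch_pod_events())
    started = await await_pod_start()
    events = await events_task
    return started, events

def await_pod_start_sync() -> tuple[str, str]:
    # asyncio.run() creates a fresh event loop, runs the coroutine to
    # completion, and closes the loop -- safe in a plain sync worker.
    return asyncio.run(_start_pod_and_watch_events())

print(await_pod_start_sync())  # ('pod started', 'events watched')
```

Error handling (the `PodLaunchFailedException` branch) would wrap the `asyncio.run()` call exactly as the current `try`/`except` does.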