Improve logging behavior of DockerOperator #40489

Merged

jscheffl
Contributor

We are using DockerOperator for executions outside the cloud... and compared to KubernetesPodOperator I see a lot of problems when logs are emitted. This PR improves the situation:

  • When an image is pulled, the status of every layer is posted to the logs. This is mostly not of interest, so the pull output is now wrapped in a log group.
  • As logs from the container are captured, log chunks are written to stdout for Airflow. But chunks mostly contain multiple lines, and if a lot of logs are emitted then... at the point of retrieval the task_log_handler tries to sort messages by timestamp, fails if numeric content is in the Docker logs, and totally scrambles the result. With this PR, logs are split by line and the Python logger prefix is added consistently to each line, as in KubernetesPodOperator.
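
To illustrate the second point, here is a minimal sketch of splitting captured chunks into lines before logging them. It is not the code from this PR: the names `iter_log_lines` and `log_container_output` and the logger name are hypothetical, and the chunk format (bytes or str, with arbitrary chunk boundaries) is an assumption.

```python
import codecs
import logging
from typing import Iterable, Iterator, Union

# Hypothetical logger name, for illustration only.
log = logging.getLogger("docker_log_sketch")


def iter_log_lines(chunks: Iterable[Union[bytes, str]]) -> Iterator[str]:
    """Yield complete lines from a stream of chunks.

    A chunk may hold several lines, or end in the middle of a line,
    so the unfinished tail is buffered until the next chunk arrives.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    buffer = ""
    for chunk in chunks:
        if isinstance(chunk, bytes):
            # The incremental decoder copes with multi-byte characters
            # that are split across two chunks.
            chunk = decoder.decode(chunk)
        buffer += chunk
        # Everything before the last newline is complete; the rest waits.
        *complete, buffer = buffer.split("\n")
        for line in complete:
            yield line.rstrip("\r")
    buffer += decoder.decode(b"", final=True)
    if buffer:
        yield buffer.rstrip("\r")


def log_container_output(chunks: Iterable[Union[bytes, str]]) -> None:
    """Emit one log record per line so every line gets the logger prefix."""
    for line in iter_log_lines(chunks):
        log.info("%s", line)
```

Because each line becomes its own log record, a line that happens to start with numeric content still carries the logger's own timestamp prefix rather than being read as one.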

@potiuk potiuk merged commit c8b7dc5 into apache:main Jun 29, 2024
51 checks passed
@potiuk
Member

potiuk commented Jun 29, 2024

Nice one :)
