Batch Operator: Realtime container execution logs in Airflow task execution log #31675
Closed
1 of 2 tasks
Labels
area:providers
good first issue
kind:feature
Feature Requests
provider:amazon
AWS/Amazon - related issues
Description
Add functionality to the Airflow BatchOperator to stream container logs from AWS CloudWatch into the BatchOperator's task execution log as they are produced.
Use case/motivation
When Batch jobs are run via BatchOperator, the container logs are not available in the Airflow task log during execution; only a link to the logs in CloudWatch is provided. As a result, one has to open the Airflow task log to find the container's log path in AWS CloudWatch, and then follow the log in CloudWatch in real time.
The current behavior introduces the following challenge:
The ongoing execution logs are not captured in the Airflow task (BatchOperator) log. Viewing them typically involves a couple of additional steps, which can be avoided.
Related issues
This is almost identical to the ECSOperator realtime log streaming request raised in #22512. In fact, most of the description, use case, and motivation are lifted from that issue.
An EcsTaskLogFetcher class already exists in airflow.providers.amazon.aws.hooks.ecs which AWS Batch can use as a backend. It just needs to be initialized, started, and stopped at the correct time. In the event of an array job, the Batch Operator already defaults to the first task and the same could be done when choosing which log stream to fetch.
Are you willing to submit a PR?
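To make the "initialized, started, and stopped at the correct time" idea under Related issues concrete, here is a minimal sketch of that lifecycle. It does not use the real EcsTaskLogFetcher or any AWS call: `SketchLogFetcher`, `run_job_with_logs`, `fetch_events`, `emit`, and `wait_for_job` are all hypothetical names introduced for illustration, and the CloudWatch polling is replaced by an injected callable.

```python
import threading
from typing import Callable, Iterable


class SketchLogFetcher(threading.Thread):
    """Hypothetical stand-in for a log fetcher thread (not the Airflow class).

    `fetch_events` is assumed to return the log messages that became
    available since the previous call; `emit` writes one message to the
    task log.
    """

    def __init__(
        self,
        fetch_events: Callable[[], Iterable[str]],
        emit: Callable[[str], None],
        interval: float = 0.01,
    ) -> None:
        super().__init__(daemon=True)
        self._fetch_events = fetch_events
        self._emit = emit
        self._interval = interval
        self._stop_event = threading.Event()

    def _drain(self) -> None:
        # Forward every message that is currently available.
        for message in self._fetch_events():
            self._emit(message)

    def run(self) -> None:
        # Poll until asked to stop, sleeping between polls.
        while not self._stop_event.is_set():
            self._drain()
            self._stop_event.wait(self._interval)
        # One final poll so lines written just before stop() are not lost.
        self._drain()

    def stop(self) -> None:
        self._stop_event.set()


def run_job_with_logs(
    wait_for_job: Callable[[], None],
    fetch_events: Callable[[], Iterable[str]],
    emit: Callable[[str], None],
) -> None:
    """Start the fetcher before waiting on the job and stop it afterwards."""
    fetcher = SketchLogFetcher(fetch_events, emit)
    fetcher.start()
    try:
        wait_for_job()  # assumed to block until the job finishes
    finally:
        # Stop and join even if waiting raised, so the thread never leaks.
        fetcher.stop()
        fetcher.join()
```

In an operator, `wait_for_job` would correspond to whatever call blocks until the Batch job completes, and `emit` to the task logger. For an array job, `fetch_events` would read from the log stream of the first child job, mirroring the default mentioned above.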
Code of Conduct