Conversation

@gopidesupavan (Member)
closes: #50802

Currently, logging to CloudWatch only occurs at startup, i.e. when the executor first launches and triggers a DAG. Logging works correctly for this initial run but fails on subsequent runs. The issue arises because the handler is implemented as a cached_property and we call self.handler.close() during the final log upload. In close(), the shutdown parameter is set to True (see https://github.com/kislyuk/watchtower/blob/main/watchtower/__init__.py#L463), which blocks further logging and prevents logs from being uploaded to CloudWatch.

Additionally, we should ensure that the stream name is explicitly assigned if it is available in the logger. Otherwise, we risk reusing a previously configured stream name, which can result in logs being written to the wrong stream.
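The failure mode can be reproduced without Watchtower. The toy handler below (all names hypothetical, not watchtower's actual classes) mirrors the relevant behavior: emit() queues records, flush() "uploads" them, and close() sets a shutdown flag that silently drops everything logged afterwards, which is exactly what happens to the second task run.

```python
import logging

class ToyQueueHandler(logging.Handler):
    """Stand-in for watchtower's CloudWatchLogHandler (hypothetical names).

    emit() queues records, flush() "uploads" them, and close() sets a
    shutdown flag that silently drops anything logged afterwards.
    """
    def __init__(self):
        super().__init__()
        self.queue = []
        self.uploaded = []
        self.shutting_down = False

    def emit(self, record):
        if self.shutting_down:
            # watchtower warns here: "Received message after logging system shutdown"
            return
        self.queue.append(record.getMessage())

    def flush(self):
        # Upload queued records without disabling the handler.
        self.uploaded.extend(self.queue)
        self.queue.clear()

    def close(self):
        # Like watchtower's close(), this performs a final upload and then
        # refuses all further records.
        self.flush()
        self.shutting_down = True
        super().close()

handler = ToyQueueHandler()
log = logging.getLogger("cloudwatch-demo")
log.propagate = False
log.addHandler(handler)

log.warning("first task")
handler.close()              # final upload after the first run
log.warning("second task")   # dropped: the handler has shut down
print(handler.uploaded)      # ['first task']
```

The second run's record never reaches the queue, matching the WatchtowerWarning shown in the log excerpt below.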

2025-05-20 11:07:58.746 | 2025-05-20 15:07:58.746784 [warning  ] /home/airflow/.local/lib/python3.12/site-packages/watchtower/__init__.py:464: WatchtowerWarning: Received message after logging system shutdown
2025-05-20 11:07:58.746 |   warnings.warn("Received message after logging system shutdown", WatchtowerWarning)
2025-05-20 11:07:58.746 |  [py.warnings]
2025-05-20 11:07:58.747 | 2025-05-20 15:07:58.747339 [debug    ] send_request_headers.started request=<Request [b'PUT']> [httpcore.http11]
2025-05-20 11:07:58.747 | 2025-05-20 15:07:58.747506 [debug    ] send_request_headers.complete  [httpcore.http11]

Before:

Screenshot 2025-05-23 at 23 10 47

After:
Screenshot 2025-05-23 at 23 07 12



@eladkal eladkal requested a review from vincbeck May 24, 2025 04:04
@gopidesupavan (Member Author)

Added back handler caching: initializing the handler is a heavy operation, since it sets up all of Watchtower's queues and establishes sessions, so it is inefficient to perform on every access, especially via a plain property.

Instead of closing the handler, we can call flush(), which sends all queued logs without shutting the handler down.
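A minimal sketch of the fixed pattern (class and method names here are illustrative, not Airflow's actual ones): the expensive handler is built once via functools.cached_property, and the per-run upload calls flush() rather than close(), so later runs still reach CloudWatch.

```python
import logging
from functools import cached_property

class ToyCloudWatchHandler(logging.Handler):
    """Hypothetical stand-in for watchtower's handler: emit() queues,
    flush() uploads, close() shuts the handler down for good."""
    def __init__(self):
        super().__init__()
        self.queue, self.uploaded, self.shut_down = [], [], False

    def emit(self, record):
        if not self.shut_down:
            self.queue.append(record.getMessage())

    def flush(self):
        self.uploaded.extend(self.queue)
        self.queue.clear()

    def close(self):
        self.flush()
        self.shut_down = True
        super().close()

class ToyTaskHandler:
    """Sketch of the fixed pattern: build the heavy handler once and keep
    it cached; each upload flushes instead of closing."""
    @cached_property
    def handler(self):
        # Expensive setup (in watchtower: queues + sessions) happens once.
        return ToyCloudWatchHandler()

    def upload(self):
        self.handler.flush()  # send queued logs, keep the handler alive

task_handler = ToyTaskHandler()
log = logging.getLogger("fixed-demo")
log.propagate = False
log.addHandler(task_handler.handler)

log.warning("run 1")
task_handler.upload()
log.warning("run 2")   # still accepted: no shutdown occurred
task_handler.upload()
print(task_handler.handler.uploaded)  # ['run 1', 'run 2']
```

Because cached_property memoizes on first access, the queues and sessions are set up exactly once per handler instance, while flush() keeps the instance usable across runs.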

@gopidesupavan (Member Author)

Post change:

image

@gopidesupavan (Member Author)

By the way, delete_local_copy is currently not supported for CloudWatch logs. Implementing it is a bit tricky, because Watchtower uploads logs from internal queues: we would need to ensure that a specific stream has been fully flushed before it is safe to delete the local copy.

Will check this in a separate PR.
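One possible shape for that follow-up (purely a sketch; the handler API and per-stream queue layout here are invented for illustration, not watchtower's real interface) is to flush and then verify the stream's queue has drained before removing the local file:

```python
import os
import tempfile

class ToyStreamHandler:
    """Toy handler with one queue per stream (invented for illustration)."""
    def __init__(self):
        self.queues = {"stream-a": ["queued message"]}

    def flush(self):
        # Pretend to upload: drain every per-stream queue.
        for queue in self.queues.values():
            queue.clear()

    def queue_empty(self, stream):
        return not self.queues.get(stream)

def safe_delete_local_copy(handler, stream, path):
    """Delete the local log file only after the stream's queue has drained."""
    handler.flush()
    if handler.queue_empty(stream):
        os.remove(path)
        return True
    return False

# Demo: create a throwaway "local log file", then delete it safely.
fd, path = tempfile.mkstemp()
os.close(fd)
handler = ToyStreamHandler()
deleted = safe_delete_local_copy(handler, "stream-a", path)
print(deleted, os.path.exists(path))  # True False
```

The key property is that deletion is conditional on the drain check, so a stream with records still queued keeps its local copy as a fallback.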

@gopidesupavan gopidesupavan merged commit 8f7c9c8 into apache:main May 26, 2025
69 checks passed
@gopidesupavan gopidesupavan deleted the fix-remote-aws-cloudwatch-logging branch May 26, 2025 17:05
sanederchik pushed a commit to sanederchik/airflow that referenced this pull request Jun 7, 2025
…assignment (apache#51022)

* Fix remote logging CloudWatch handler initialization and stream name assignment

* Flush the logs when upload function calls
jose-lehmkuhl pushed a commit to jose-lehmkuhl/airflow that referenced this pull request Jul 11, 2025
…assignment (apache#51022)

* Fix remote logging CloudWatch handler initialization and stream name assignment

* Flush the logs when upload function calls
Successfully merging this pull request may close these issues.

Remote logging to Cloudwatch fails after the first task is logged.

4 participants