fix triggerer logger's file descriptor closed when it removed #62103

Open
wjddn279 wants to merge 1 commit into apache:main from wjddn279:fix-triggerer-logging-file-not-closed

wjddn279 (Contributor) commented Feb 18, 2026

Closes #61916

An issue was reported in which the triggerer did not properly close a logger's file descriptor when the logger was destroyed.

As shown below, the number of open file descriptors was continuously increasing before this fix. After applying the logic in this PR, file descriptors are now properly closed and the count decreases as expected.

Additionally, we confirmed through added logging that the logger is properly garbage collected after upload_to_remote is called, triggering the destructor as expected.

image

AS-IS (3.1.7)

image image

TO-BE (patched)

image
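The before/after measurement described above can be reproduced in miniature. The following is only a sketch, not the method used to produce the screenshots: `count_open_fds` is a hypothetical helper that probes descriptor numbers with `os.fstat`, and the temporary files stand in for per-trigger log files.

```python
import os
import tempfile


def count_open_fds(limit: int = 1024) -> int:
    """Count descriptors below `limit` that are currently open in this process."""
    open_count = 0
    for fd in range(limit):
        try:
            os.fstat(fd)
        except OSError:
            continue  # this number is not an open descriptor
        open_count += 1
    return open_count


baseline = count_open_fds()

# Stand-ins for log files that are opened but never closed.
leaked = [tempfile.TemporaryFile() for _ in range(5)]
after_open = count_open_fds()

# Closing each handle releases its descriptor again.
for handle in leaked:
    handle.close()
after_close = count_open_fds()

print(baseline, after_open, after_close)
```

The count rises by one per open handle and returns to the baseline only once every handle has been closed, which is the pattern the AS-IS and TO-BE measurements are meant to show.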
Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

eladkal (Contributor) commented Feb 18, 2026

Can you add a test to avoid regression?

eladkal added this to the Airflow 3.1.8 milestone Feb 18, 2026
eladkal added the type:bug-fix Changelog: Bug Fixes label Feb 18, 2026

wjddn279 (Contributor Author) commented

Sure, I'll add it.

wjddn279 (Contributor Author) commented

@eladkal Done!

amoghrajesh (Contributor) left a comment

Thanks for taking this on. A similar problem for the DAG processor was solved in #47574.

        self.bound_logger = logger
        return logger

    def __del__(self):

My main concern with using __del__ is that if an exception occurs during cleanup, it will exit quietly instead of surfacing the error.
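To illustrate this concern, here is a minimal sketch (the `Noisy` class is hypothetical, not code from this PR). In CPython, an exception raised inside `__del__` does not propagate to the caller; it is passed to `sys.unraisablehook`, which by default only prints a warning to stderr.

```python
import sys


class Noisy:
    """Hypothetical object whose cleanup always fails."""

    def __del__(self):
        raise RuntimeError("cleanup failed")


captured = []


def record_unraisable(unraisable):
    # Keep only the exception type so we do not hold on to the dying object.
    captured.append(unraisable.exc_type)


previous_hook = sys.unraisablehook
sys.unraisablehook = record_unraisable
try:
    obj = Noisy()
    del obj  # CPython runs __del__ here because the last reference is gone
    reached = True  # execution continues; the RuntimeError was not raised here
finally:
    sys.unraisablehook = previous_hook

print(reached, captured)
```

The calling code keeps running as if nothing happened, so a failed close inside `__del__` can go unnoticed unless the hook output is monitored.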


        if file_handle and not file_handle.closed:
            file_handle.flush()
            file_handle.close()

Can we handle it similarly to how it is done for the DAG processor in #47574?

In short, something like this, where we store the handler and clear it, to avoid diverging from that approach?

diff --git a/airflow-core/src/airflow/jobs/triggerer_job_runner.py b/airflow-core/src/airflow/jobs/triggerer_job_runner.py
--- a/airflow-core/src/airflow/jobs/triggerer_job_runner.py	(revision 10cd08dff8916b93f8c3f94bc34265bb7544fde4)
+++ b/airflow-core/src/airflow/jobs/triggerer_job_runner.py	(date 1771510381992)
@@ -30,7 +30,7 @@
 from datetime import datetime
 from socket import socket
 from traceback import format_exception
-from typing import TYPE_CHECKING, Annotated, Any, ClassVar, Literal, TypedDict
+from typing import IO, TYPE_CHECKING, Annotated, Any, ClassVar, Literal, TypedDict
 
 import anyio
 import attrs
@@ -302,6 +302,8 @@
 
     bound_logger: WrappedLogger = attrs.field(init=False, repr=False)
 
+    _filehandle: IO[Any] = attrs.field(init=False, repr=False)
+
     def __call__(self, processors: Iterable[structlog.typing.Processor]) -> WrappedLogger:
         if hasattr(self, "bound_logger"):
             return self.bound_logger
@@ -312,13 +314,20 @@
 
         pretty_logs = False
         if pretty_logs:
-            underlying_logger: WrappedLogger = structlog.WriteLogger(log_file.open("w", buffering=1))
+            self._filehandle = log_file.open("w", buffering=1)
+            underlying_logger: WrappedLogger = structlog.WriteLogger(self._filehandle)
         else:
-            underlying_logger = structlog.BytesLogger(log_file.open("wb"))
+            self._filehandle = log_file.open("wb")
+            underlying_logger = structlog.BytesLogger(self._filehandle)
         logger = structlog.wrap_logger(underlying_logger, processors=processors).bind()
         self.bound_logger = logger
         return logger
 
+    def close(self):
+        """Explicitly close the underlying log file handle."""
+        if hasattr(self, "_filehandle"):
+            self._filehandle.close()
+
     def upload_to_remote(self):
         from airflow.sdk.log import upload_to_remote
 
@@ -421,10 +430,9 @@
             for id in msg.finished or ():
                 self.running_triggers.discard(id)
                 self.cancelling_triggers.discard(id)
                 # Remove logger from the cache, and since structlog doesn't have an explicit close method, we
                 # only need to remove the last reference to it to close the open FH
                 if factory := self.logger_cache.pop(id, None):
                     factory.upload_to_remote()
+                    factory.close()
 
             response = messages.TriggerStateSync(
                 to_create=[],
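The shape of the suggestion above can be sketched outside Airflow as follows. This is an illustration under assumptions, not the triggerer's code: `FileLoggerFactory`, `retire`, and the `upload` callable are hypothetical names. The `try`/`finally` is an addition that is not part of the suggested diff; it shows one way to make sure the handle is closed even when the upload step raises.

```python
from __future__ import annotations

import tempfile
from pathlib import Path
from typing import IO, Any, Callable, Optional


class FileLoggerFactory:
    """Hypothetical stand-in for a factory that owns one log file handle."""

    def __init__(self, log_file: Path) -> None:
        self.log_file = log_file
        self._filehandle: Optional[IO[Any]] = None

    def open(self) -> IO[Any]:
        # Open lazily and keep the handle so it can be closed explicitly later.
        if self._filehandle is None:
            self._filehandle = self.log_file.open("wb")
        return self._filehandle

    def close(self) -> None:
        """Close the handle if it was opened; safe to call more than once."""
        if self._filehandle is not None and not self._filehandle.closed:
            self._filehandle.close()


def retire(
    cache: dict[int, FileLoggerFactory],
    trigger_id: int,
    upload: Callable[[FileLoggerFactory], None],
) -> None:
    """Remove a factory from the cache, upload its log, and always close it."""
    factory = cache.pop(trigger_id, None)
    if factory is None:
        return
    try:
        upload(factory)
    finally:
        factory.close()


def failing_upload(factory: FileLoggerFactory) -> None:
    raise RuntimeError("remote unavailable")


with tempfile.TemporaryDirectory() as tmp:
    factory = FileLoggerFactory(Path(tmp) / "trigger-1.log")
    handle = factory.open()
    handle.write(b"event\n")
    cache = {1: factory}

    try:
        retire(cache, 1, failing_upload)
    except RuntimeError:
        pass  # the upload error still propagates to the caller

    closed_after_failure = handle.closed
```

Keeping the pop, upload, and close in one helper also means callers cannot forget the close step.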

wjddn279 (Contributor Author) commented Feb 19, 2026

In short, something like this, where we store the handler and clear it, to avoid diverging from that approach?

I agree with this, and it looks better. Thanks!

But if we split this out into an explicit close method, that method must always be called after upload_to_remote is invoked, which could lead to mistakes in future development.

What do you think about using a try/except in __del__? I actually considered that beforehand too, but I was worried it could cause a silent fd leak if it fails.
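One way to combine the two ideas discussed here is an idempotent `close()` for the normal path plus a guarded `__del__` as a fallback. The sketch below is an assumption about how that could look, not the code in this PR; `OwnsFile` and the logger name are hypothetical. Note that the immediate finalization on `del` relies on CPython reference counting.

```python
import logging
import tempfile
from pathlib import Path

log = logging.getLogger("fd_safety_net")  # hypothetical logger name


class OwnsFile:
    """Hypothetical owner of a log file handle."""

    def __init__(self, path: Path) -> None:
        self._filehandle = path.open("wb")

    def close(self) -> None:
        """Flush and close the handle; calling this twice is harmless."""
        handle = getattr(self, "_filehandle", None)
        if handle is not None and not handle.closed:
            handle.flush()
            handle.close()

    def __del__(self) -> None:
        # Fallback only: report cleanup failures instead of letting them
        # disappear into the interpreter's unraisable-exception handling.
        try:
            self.close()
        except Exception:
            log.warning("failed to close log file handle", exc_info=True)


with tempfile.TemporaryDirectory() as tmp:
    owner = OwnsFile(Path(tmp) / "trigger.log")
    handle = owner._filehandle
    del owner  # the caller forgot close(); the finalizer closes the handle
    closed_by_finalizer = handle.closed

    explicit = OwnsFile(Path(tmp) / "other.log")
    explicit.close()
    explicit.close()  # second call is a no-op
    closed_explicitly = explicit._filehandle.closed
```

With this layout, a missed `close()` call degrades to finalizer-based cleanup, and a failure during that cleanup is at least logged.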

wjddn279 force-pushed the fix-triggerer-logging-file-not-closed branch from 06c01e3 to ec1c806 on February 19, 2026 15:25
wjddn279 force-pushed the fix-triggerer-logging-file-not-closed branch from ec1c806 to 505d5eb on February 20, 2026 02:39

Labels

area:Triggerer type:bug-fix Changelog: Bug Fixes

Development

Successfully merging this pull request may close these issues.

Triggerer: too many open files

3 participants