Description
Apache Airflow version
3.1.3
If "Other Airflow 2/3 version" selected, which one?
No response
What happened?
We encountered a breaking change when upgrading from Airflow 2.11.0 to 3.1.3 related to remote_task_handler_kwargs in airflow_local_settings.py.
Issue:
In Airflow 2.x, remote_task_handler_kwargs configured the FileTaskHandler (or its subclasses like WasbTaskHandler). We successfully used this to set max_bytes for log file size limiting:
remote_task_handler_kwargs = {"max_bytes": 5000000, "backup_count": 5}
After upgrading to Airflow 3.x, this configuration raises:
TypeError: WasbRemoteLogIO.__init__() got an unexpected keyword argument 'max_bytes'
What you think should happen instead?
Root cause:
PR #48491 introduced separate RemoteLogIO classes for remote storage operations. The current implementation in airflow_local_settings.py now passes remote_task_handler_kwargs to WasbRemoteLogIO.__init__(), which only accepts base_log_folder, remote_base, delete_local_copy, and wasb_container.
Handler-level parameters like max_bytes and backup_count should go to FileTaskHandler, but are instead being consumed by the I/O layer, which doesn't support them.
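To illustrate the failure mode without an Airflow installation, here is a minimal sketch. StandInRemoteLogIO is a hypothetical stand-in, not the real provider class; it only mirrors the four parameters listed above, and the argument values are made up for the example.

```python
# Hypothetical stand-in for the provider's remote I/O class. It is NOT the
# real WasbRemoteLogIO; it only mirrors the four parameters named in this issue.
class StandInRemoteLogIO:
    def __init__(self, base_log_folder, remote_base, delete_local_copy, wasb_container):
        self.base_log_folder = base_log_folder
        self.remote_base = remote_base
        self.delete_local_copy = delete_local_copy
        self.wasb_container = wasb_container


# Same value as in the report: handler-level options only.
remote_task_handler_kwargs = {"max_bytes": 5000000, "backup_count": 5}

error_message = None
try:
    # Forwarding every user-supplied kwarg to the I/O class, which is the
    # behaviour this issue describes (illustrative values for the rest).
    StandInRemoteLogIO(
        base_log_folder="/tmp/logs",
        remote_base="remote/logs",
        delete_local_copy=False,
        wasb_container="logs",
        **remote_task_handler_kwargs,
    )
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Any key that the I/O class does not declare fails at construction time, so the error surfaces as soon as the logging configuration is loaded.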
Misleading naming:
The variable name remote_task_handler_kwargs suggests these parameters configure the handler, but the current code path applies them to the remote I/O abstraction.
Question:
How should max_bytes and similar FileTaskHandler parameters be configured in Airflow 3.x with the new RemoteLogIO architecture? Should there be separate config keys for I/O-layer vs handler-layer parameters, or should the current implementation be fixed to properly separate them?
Expected behavior:
Parameters like max_bytes, backup_count and delay should configure the FileTaskHandler instance, while WasbRemoteLogIO receives only its own relevant parameters.
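One possible way to get this separation, sketched under assumptions: inspect the I/O class signature and forward only the keys it declares, leaving the rest for the handler. split_kwargs and StandInRemoteLogIO are hypothetical names for illustration, not existing Airflow APIs, and this is not necessarily how a fix should be implemented.

```python
import inspect


# Hypothetical stand-in mirroring the parameters named in this issue;
# it is NOT the real WasbRemoteLogIO.
class StandInRemoteLogIO:
    def __init__(self, base_log_folder, remote_base, delete_local_copy=False, wasb_container="logs"):
        self.delete_local_copy = delete_local_copy


def split_kwargs(kwargs, io_cls):
    """Split user kwargs into (io_kwargs, handler_kwargs) by io_cls's signature."""
    params = inspect.signature(io_cls.__init__).parameters
    accepted = {name for name in params if name != "self"}
    io_kwargs = {k: v for k, v in kwargs.items() if k in accepted}
    handler_kwargs = {k: v for k, v in kwargs.items() if k not in accepted}
    return io_kwargs, handler_kwargs


user_kwargs = {"max_bytes": 5000000, "backup_count": 5, "delete_local_copy": True}
io_kwargs, handler_kwargs = split_kwargs(user_kwargs, StandInRemoteLogIO)
print(io_kwargs)       # keys the I/O class declares
print(handler_kwargs)  # everything else, intended for the handler
```

A signature-based split keeps a single config key, at the cost of silently routing misspelled keys to the handler; separate config keys for the I/O layer and the handler layer would be more explicit.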
How to reproduce
Configure remote logging, e.g. using WasbTaskHandler to upload logs to Azure Blob Storage, and add:
[logging]
remote_logging = True
remote_base_log_folder = wasb-://.blob.core.windows.net/
remote_task_handler_kwargs = {"max_bytes": 5000000, "backup_count": 5}
Then start the application and check for a TypeError during initialization of the logging system:
TypeError: WasbRemoteLogIO.__init__() got an unexpected keyword argument 'max_bytes'
Operating System
Debian GNU/Linux 12
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct