@amoghrajesh
Contributor

@amoghrajesh amoghrajesh commented Oct 23, 2025

closes: #57167

PR #52562 changed `_handle_heartbeat_failures()` to accept an exception parameter and log it as a structured field. As a result, on the first failed heartbeat attempt, `_handle_heartbeat_failures()` calls `log.warning()` with an exception parameter that is expected to be a string. However, the source code [passes an exception object](https://github.com/apache/airflow/blob/54bd5d8cd9f6f477cc83445737614dec81c4323c/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L1126) instead. This raises a TypeError (shown below), which causes task supervision to fail.

The error looked like this:

2025-10-23T17:58:22.900129Z [error    ] Task execute_workload[aac34f36-54e1-46e4-ba47-15dba8ba7149] raised unexpected: TypeError('can only concatenate str (not "ConnectError") to str') [celery.app.trace] loc=trace.py:267
Traceback (most recent call last):
  File "/usr/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
    yield
  File "/usr/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 250, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
    raise exc from None
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
    response = connection.handle_request(
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 101, in handle_request
    raise exc
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 78, in handle_request
    stream = self._connect(request)
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 124, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
  File "/usr/python/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 207, in connect_tcp
    with map_exceptions(exc_map):
  File "/usr/python/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/python/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno 111] Connection refused
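
For illustration, this class of failure can be reproduced without Airflow. The sketch below assumes a log renderer that appends the `exception` field to the message by string concatenation. The `render` function and the local `ConnectError` class are hypothetical stand-ins, not the actual Airflow or logging-library code.

```python
class ConnectError(Exception):
    """Stand-in for httpx.ConnectError so the sketch has no dependencies."""


def render(event: str, **fields) -> str:
    """Hypothetical renderer that expects `exception` to be a pre-formatted string."""
    line = event
    exc = fields.pop("exception", None)
    if exc is not None:
        # String concatenation only works when `exc` is already a str.
        line = line + "\n" + exc
    return line


err = ConnectError("[Errno 111] Connection refused")

# Passing the exception object itself fails inside the renderer.
try:
    render("Failed to send heartbeat. Will be retried", exception=err)
except TypeError as type_error:
    failure = str(type_error)
    print(failure)

# Passing a string works, but loses the traceback.
ok = render("Failed to send heartbeat. Will be retried", exception=str(err))
print(ok)
```

The point of the sketch is that the error is raised by the logging call itself, not by the heartbeat, which is why it escapes the retry handling and surfaces as an unexpected task failure.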

The change in #52562 was made mainly to accommodate a ruff upgrade, so I am going back to the standard Python logging pattern: passing the exception via the `exc_info` parameter.
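
A minimal sketch of the `exc_info` pattern, using only the standard library `logging` module. The logger name, the `send_heartbeat` function, and the local `ConnectError` class are assumptions made for the example; the supervisor's real logger and call site differ.

```python
import io
import logging


class ConnectError(Exception):
    """Stand-in for httpx.ConnectError so the sketch has no dependencies."""


def send_heartbeat() -> None:
    """Hypothetical heartbeat call that always fails."""
    raise ConnectError("[Errno 111] Connection refused")


# Capture log output in memory so it can be inspected.
stream = io.StringIO()
log = logging.getLogger("supervisor_sketch")
log.setLevel(logging.WARNING)
log.propagate = False
log.addHandler(logging.StreamHandler(stream))

try:
    send_heartbeat()
except ConnectError as exc:
    # exc_info accepts an exception instance; the logging machinery
    # formats the traceback itself, so no string conversion is needed.
    log.warning("Failed to send heartbeat. Will be retried", exc_info=exc)

output = stream.getvalue()
print(output)
```

Because the exception is handed to the logging machinery rather than passed as an ordinary field, the message and the full traceback are both emitted and the logging call does not raise.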

After the changes, the failure is logged like this:

2025-10-23T18:24:39.303939Z [warning  ] Starting call to 'airflow.sdk.api.client.Client.request', this is the 1st time calling it. [airflow.sdk.api.client] loc=before.py:42
2025-10-23T18:24:40.307404Z [warning  ] Starting call to 'airflow.sdk.api.client.Client.request', this is the 2nd time calling it. [airflow.sdk.api.client] loc=before.py:42
2025-10-23T18:24:41.417523Z [warning  ] Starting call to 'airflow.sdk.api.client.Client.request', this is the 3rd time calling it. [airflow.sdk.api.client] loc=before.py:42
2025-10-23T18:24:43.801145Z [warning  ] Starting call to 'airflow.sdk.api.client.Client.request', this is the 4th time calling it. [airflow.sdk.api.client] loc=before.py:42
2025-10-23T18:24:46.690339Z [warning  ] Failed to send heartbeat. Will be retried [supervisor] failed_heartbeats=3 loc=supervisor.py:1135 max_retries=3 ti_id=UUID('019a1250-48a0-756c-ab0b-9687289ef580')
Traceback (most recent call last):
  File "/usr/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
    yield
  File "/usr/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 250, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
    raise exc from None
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
    response = connection.handle_request(
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 101, in handle_request
    raise exc
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 78, in handle_request
    stream = self._connect(request)
  File "/usr/python/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 124, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
  File "/usr/python/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 207, in connect_tcp
    with map_exceptions(exc_map):
  File "/usr/python/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/python/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/airflow/task-sdk/src/airflow/sdk/execution_time/supervisor.py", line 1105, in _send_heartbeat_if_needed
    self.client.task_instances.heartbeat(self.id, pid=self._process.pid)
  File "/opt/airflow/task-sdk/src/airflow/sdk/api/client.py", line 259, in heartbeat
    self.client.put(f"task-instances/{id}/heartbeat", content=body.model_dump_json())
  File "/usr/python/lib/python3.10/site-packages/httpx/_client.py", line 1181, in put
    return self.request(
  File "/usr/python/lib/python3.10/site-packages/tenacity/__init__.py", line 338, in wrapped_f
    return copy(f, *args, **kw)
  File "/usr/python/lib/python3.10/site-packages/tenacity/__init__.py", line 477, in __call__
    do = self.iter(retry_state=retry_state)
  File "/usr/python/lib/python3.10/site-packages/tenacity/__init__.py", line 378, in iter
    result = action(retry_state)
  File "/usr/python/lib/python3.10/site-packages/tenacity/__init__.py", line 420, in exc_check
    raise retry_exc.reraise()
  File "/usr/python/lib/python3.10/site-packages/tenacity/__init__.py", line 187, in reraise
    raise self.last_attempt.result()
  File "/usr/python/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/python/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/python/lib/python3.10/site-packages/tenacity/__init__.py", line 480, in __call__
    result = fn(*args, **kwargs)
  File "/opt/airflow/task-sdk/src/airflow/sdk/api/client.py", line 894, in request
    return super().request(*args, **kwargs)
  File "/usr/python/lib/python3.10/site-packages/httpx/_client.py", line 825, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/python/lib/python3.10/site-packages/httpx/_client.py", line 914, in send
    response = self._send_handling_auth(
  File "/usr/python/lib/python3.10/site-packages/httpx/_client.py", line 942, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/usr/python/lib/python3.10/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/usr/python/lib/python3.10/site-packages/httpx/_client.py", line 1014, in _send_single_request
    response = transport.handle_request(request)
  File "/usr/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 249, in handle_request
    with map_httpcore_exceptions():
  File "/usr/python/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 111] Connection refused


@amoghrajesh amoghrajesh added this to the Airflow 3.1.2 milestone Oct 23, 2025
@amoghrajesh amoghrajesh self-assigned this Oct 23, 2025
@ashb ashb added the backport-to-v3-1-test label Oct 23, 2025
@kaxil kaxil merged commit 970d7da into apache:main Oct 23, 2025
80 checks passed
@kaxil kaxil deleted the exception-log-task-heartbeat branch October 23, 2025 21:30
github-actions bot pushed a commit that referenced this pull request Oct 23, 2025
…ion logging (#57172)

(cherry picked from commit 970d7da)

Co-authored-by: Amogh Desai <amoghrajesh1999@gmail.com>
@github-actions

Backport successfully created: v3-1-test

Status Branch Result
v3-1-test PR Link

github-actions bot pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Oct 23, 2025
…ion logging (apache#57172)

(cherry picked from commit 970d7da)

Co-authored-by: Amogh Desai <amoghrajesh1999@gmail.com>
amoghrajesh added a commit that referenced this pull request Oct 24, 2025
…ion logging (#57172) (#57179)

(cherry picked from commit 970d7da)

Co-authored-by: Amogh Desai <amoghrajesh1999@gmail.com>
Labels

area:task-sdk, backport-to-v3-1-test

Development

Successfully merging this pull request may close these issues.

TypeError in _handle_heartbeat_failures logging

4 participants