Description
Apache Airflow version
main (development)
If "Other Airflow 2 version" selected, which one?
No response
What happened?
While looking further into #53401, I was trying to better understand the spans and traces configuration. I noticed that span-related queries still run even though tracing is disabled.
The Airflow scheduler continues to:
- update span_status on dag_run and task_instance, and
- execute, on every scheduling loop, a query that scans those tables for WHERE span_status = 'should_end'
This adds an unnecessary SELECT … on every loop. On large task_instance tables, it showed up in pg_stat_activity as a noticeably slow query until we added an index on span_status.
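To illustrate why the index made a difference, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for PostgreSQL. The two-column table layout, the index name idx_ti_span_status, and the query_plan helper are assumptions made for the example; they are not Airflow's real schema or migration.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection) -> str:
    """Return SQLite's plan for the span_status lookup as one string."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id FROM task_instance WHERE span_status = 'should_end'"
    ).fetchall()
    # The human-readable plan text is the last column of each row.
    return " | ".join(row[-1] for row in rows)


# Simplified, hypothetical table: only the columns needed for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance (id INTEGER PRIMARY KEY, span_status TEXT)"
)
conn.executemany(
    "INSERT INTO task_instance (span_status) VALUES (?)",
    [("ended",)] * 1000 + [("should_end",)] * 5,
)

# Without an index, the lookup has to scan the whole table.
plan_without_index = query_plan(conn)

# Hypothetical index name; with it, the lookup becomes an index search.
conn.execute("CREATE INDEX idx_ti_span_status ON task_instance (span_status)")
plan_with_index = query_plan(conn)

print(plan_without_index)
print(plan_with_index)
```

The exact plan wording differs between SQLite versions and will look different again in PostgreSQL's EXPLAIN output, but the shift from a full scan to an index lookup is the same idea. The index only hides the cost, though; the query still runs on every loop.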
Impact:
- Unnecessary DB queries during Scheduler operations
- Performance degradation in environments where tracing is not used
- Missed heartbeats due to blocking/slow queries
What you think should happen instead?
I was checking the latest main branch and noticed, for example, that the call to the _end_spans_of_externally_ended_ops() method doesn’t appear to check whether tracing is enabled in the config. From what I can see, the Scheduler calls this method unconditionally.
There may be other similar cases, but in general, it would be good to ensure that no span-related logic or queries are executed when tracing is disabled (which is the default setting).
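A rough sketch of the kind of guard I have in mind is below. SchedulerSketch, tracing_enabled, span_queries_run, and run_loop_once are hypothetical names used only for illustration; they are not the actual scheduler code, and the real fix would read the tracing setting from the Airflow config instead.

```python
from dataclasses import dataclass


@dataclass
class SchedulerSketch:
    """Hypothetical stand-in for the scheduler loop; not Airflow's real class."""

    # Assumption: mirrors the tracing on/off setting, which is off by default.
    tracing_enabled: bool = False
    # Counts how often the span query would have been issued.
    span_queries_run: int = 0

    def _end_spans_of_externally_ended_ops(self) -> None:
        # Stand-in for the query filtering on span_status = 'should_end'.
        self.span_queries_run += 1

    def run_loop_once(self) -> None:
        # Proposed guard: skip all span bookkeeping when tracing is off.
        if self.tracing_enabled:
            self._end_spans_of_externally_ended_ops()


if __name__ == "__main__":
    scheduler = SchedulerSketch()  # default: tracing disabled
    for _ in range(3):
        scheduler.run_loop_once()
    print(scheduler.span_queries_run)
```

With the guard in place, a deployment that leaves tracing disabled would never issue the span query, so the size of task_instance would no longer matter for this code path.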
How to reproduce
- Start up an Airflow deployment with the default configuration (i.e. tracing disabled).
- Observe the database activity: span-related queries are executed during Scheduler operation, even though tracing is disabled.
- If the task_instance table contains a large number of records, these span-related queries can become slow and degrade performance (until this issue is addressed).
- These queries can be easily identified in the query logs or by profiling the database.
- Adding info logs to the _end_spans_of_externally_ended_ops() method would further confirm that it’s being invoked unconditionally (I haven’t tried this yet, but it’s straightforward to trace through the code and verify the call path).
Operating System
Linux
Versions of Apache Airflow Providers
No response
Deployment
Astronomer
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct