@kaxil kaxil commented Oct 26, 2025

The /public/dags and /ui/dags endpoints were triggering N+1 queries when loading DAG tags: one query to fetch the DAGs, then one additional query per DAG to fetch its tags. For deployments with many DAGs, this could cause significant performance degradation.

Added selectinload(DagModel.tags) to generate_dag_with_latest_run_query() to eagerly load all tags in a single additional query instead of N separate queries. This reduces the total number of queries from O(N) to O(1) with respect to the number of DAGs.

Example impact:

  • Before: 1 query for DAGs + 100 queries for tags (100 DAGs) = 101 queries
  • After: 1 query for DAGs + 1 query for all tags = 2 queries
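The difference between the two access patterns can be sketched outside Airflow. The following is a minimal illustration using only the standard-library sqlite3 module, not the actual Airflow code: the `dag` and `dag_tag` tables and all function names are assumptions made for the example, and the batched loader only imitates the kind of `IN (...)` query that select-in eager loading issues.

```python
import sqlite3


def make_db(num_dags):
    """Build an in-memory database with `num_dags` DAG rows and two tags each.

    The schema is hypothetical and only loosely mirrors a DAG/tag relationship.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE dag_tag (dag_id TEXT, name TEXT)")
    for i in range(num_dags):
        dag_id = f"dag_{i}"
        conn.execute("INSERT INTO dag VALUES (?)", (dag_id,))
        conn.executemany(
            "INSERT INTO dag_tag VALUES (?, ?)",
            [(dag_id, "etl"), (dag_id, f"team_{i % 3}")],
        )
    conn.commit()
    return conn


def load_tags_per_dag(conn):
    """N+1 pattern: one query for the DAGs, then one tag query per DAG."""
    dag_ids = [row[0] for row in conn.execute("SELECT dag_id FROM dag")]
    return {
        dag_id: sorted(
            row[0]
            for row in conn.execute(
                "SELECT name FROM dag_tag WHERE dag_id = ?", (dag_id,)
            )
        )
        for dag_id in dag_ids
    }


def load_tags_batched(conn):
    """Batched pattern: one query for the DAGs, one IN (...) query for all tags."""
    dag_ids = [row[0] for row in conn.execute("SELECT dag_id FROM dag")]
    tags = {dag_id: [] for dag_id in dag_ids}
    if dag_ids:
        placeholders = ", ".join("?" * len(dag_ids))
        rows = conn.execute(
            f"SELECT dag_id, name FROM dag_tag WHERE dag_id IN ({placeholders})",
            dag_ids,
        )
        for dag_id, name in rows:
            tags[dag_id].append(name)
    return {dag_id: sorted(names) for dag_id, names in tags.items()}


def count_selects(conn, loader):
    """Run `loader` and return its result plus the number of SELECTs it issued."""
    statements = []
    conn.set_trace_callback(statements.append)
    try:
        result = loader(conn)
    finally:
        conn.set_trace_callback(None)
    selects = [s for s in statements if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)
```

Calling `count_selects(make_db(100), load_tags_per_dag)` and `count_selects(make_db(100), load_tags_batched)` should return the same tag mapping while issuing 101 and 2 SELECT statements respectively, matching the counts in the example above.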

Added regression tests that verify the query count doesn't scale linearly with the number of DAGs by comparing counts before and after adding more DAGs.
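The shape of such a regression check might look like the sketch below. It is not the test added in this PR: it uses the standard-library sqlite3 module with a hypothetical schema, and `list_dags_with_tags` is an assumed stand-in for the endpoint's query logic. The idea it illustrates is the same, though: record the query count, add more DAGs, and assert that the count did not grow.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def capture_selects(conn):
    """Collect every SELECT statement executed on `conn` inside the block."""
    selects = []

    def record(sql):
        if sql.lstrip().upper().startswith("SELECT"):
            selects.append(sql)

    conn.set_trace_callback(record)
    try:
        yield selects
    finally:
        conn.set_trace_callback(None)


def add_dags(conn, start, stop):
    """Insert DAGs dag_<start> .. dag_<stop - 1>, each with a single tag."""
    for i in range(start, stop):
        conn.execute("INSERT INTO dag VALUES (?)", (f"dag_{i}",))
        conn.execute("INSERT INTO dag_tag VALUES (?, ?)", (f"dag_{i}", "example"))
    conn.commit()


def list_dags_with_tags(conn):
    """Hypothetical stand-in for the endpoint: one DAG query, one tag query."""
    dag_ids = [row[0] for row in conn.execute("SELECT dag_id FROM dag")]
    tags = {dag_id: [] for dag_id in dag_ids}
    if dag_ids:
        placeholders = ", ".join("?" * len(dag_ids))
        rows = conn.execute(
            f"SELECT dag_id, name FROM dag_tag WHERE dag_id IN ({placeholders})",
            dag_ids,
        )
        for dag_id, name in rows:
            tags[dag_id].append(name)
    return tags


def check_query_count_is_constant():
    """Compare SELECT counts before and after adding more DAGs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE dag_tag (dag_id TEXT, name TEXT)")

    add_dags(conn, 0, 5)
    with capture_selects(conn) as before:
        list_dags_with_tags(conn)

    add_dags(conn, 5, 25)
    with capture_selects(conn) as after:
        result = list_dags_with_tags(conn)

    assert len(result) == 25
    # An N+1 loader would fail here: 6 queries before, 26 after.
    assert len(after) == len(before)
    return len(before), len(after)
```

Comparing the two counts, rather than asserting a fixed number, keeps a check like this robust if unrelated queries are later added to the code path under test.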

Fixes #57241



@kaxil kaxil added this to the Airflow 3.1.2 milestone Oct 26, 2025
@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Oct 26, 2025
@kaxil kaxil requested a review from tirkarthi October 26, 2025 00:32

@jason810496 jason810496 left a comment

Thanks for the improvement!

@kaxil kaxil merged commit 5a28c44 into apache:main Oct 26, 2025
111 of 112 checks passed
@kaxil kaxil deleted the dag_n_plus_one branch October 26, 2025 03:19

@pierrejeambrun pierrejeambrun left a comment

Super cool, thanks.

That could help solve #56635, where loading DAGs was reported to be slow.

Or we might have another, similar issue specific to ui/dags when fetching 'recent dag runs'.

pierrejeambrun pushed a commit to astronomer/airflow that referenced this pull request Oct 30, 2025
@pierrejeambrun

Manual backport #57570, cc: @kaxil

kaxil added a commit that referenced this pull request Oct 30, 2025
(cherry picked from commit 5a28c44)

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>

Labels

area:API Airflow's REST/HTTP API area:performance

Development

Successfully merging this pull request may close these issues.

n+1 queries to fetch tags for dags in dags list page

3 participants