Skip to content

n+1 queries to fetch tags for dags in dags list page #57241

@tirkarthi

Description

@tirkarthi

Apache Airflow version

main (development)

If "Other Airflow 2/3 version" selected, which one?

No response

What happened?

On loading the dags list page it seems the tags are not queried using joins and a query is made during serialization of each dag to fetch the tags for a dag. Though dag_id column itself is indexed in dag_tag table the tags could be fetched as part of joins.

CREATE TABLE `dag_tag` (
  `name` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
  `dag_id` varchar(250) CHARACTER SET utf8mb3 COLLATE utf8mb3_bin NOT NULL,
  PRIMARY KEY (`name`,`dag_id`),
  KEY `idx_dag_tag_dag_id` (`dag_id`),
  CONSTRAINT `dag_tag_dag_id_fkey` FOREIGN KEY (`dag_id`) REFERENCES `dag` (`dag_id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci 
[2025-10-25T12:00:41.297540Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.297834Z] {base.py:1577} INFO - [cached since 0.005695s ago] ('dag_1',)
[2025-10-25T12:00:41.299771Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.299950Z] {base.py:1577} INFO - [cached since 0.007819s ago] ('dag_10',)
[2025-10-25T12:00:41.301762Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.301936Z] {base.py:1577} INFO - [cached since 0.009806s ago] ('dag_100',)
[2025-10-25T12:00:41.303630Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.303808Z] {base.py:1577} INFO - [cached since 0.01168s ago] ('dag_101',)
[2025-10-25T12:00:41.305534Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.305713Z] {base.py:1577} INFO - [cached since 0.01358s ago] ('dag_102',)
[2025-10-25T12:00:41.307352Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.307524Z] {base.py:1577} INFO - [cached since 0.0154s ago] ('dag_103',)
[2025-10-25T12:00:41.309495Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.309698Z] {base.py:1577} INFO - [cached since 0.01756s ago] ('dag_104',)
[2025-10-25T12:00:41.312336Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.312713Z] {base.py:1577} INFO - [cached since 0.02056s ago] ('dag_105',)
[2025-10-25T12:00:41.315968Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.316178Z] {base.py:1577} INFO - [cached since 0.02404s ago] ('dag_106',)
[2025-10-25T12:00:41.318186Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.318400Z] {base.py:1577} INFO - [cached since 0.02626s ago] ('dag_107',)
[2025-10-25T12:00:41.320469Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.320677Z] {base.py:1577} INFO - [cached since 0.02854s ago] ('dag_108',)
[2025-10-25T12:00:41.322693Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.322895Z] {base.py:1577} INFO - [cached since 0.03075s ago] ('dag_109',)
[2025-10-25T12:00:41.324665Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id
[2025-10-25T12:00:41.324910Z] {base.py:1577} INFO - [cached since 0.03276s ago] ('dag_11',)
[2025-10-25T12:00:41.326889Z] {base.py:1577} INFO - SELECT dag_tag.name AS dag_tag_name, dag_tag.dag_id AS dag_tag_dag_id 
FROM dag_tag 
WHERE %s = dag_tag.dag_id

What you think should happen instead?

No response

How to reproduce

  1. Create 500 dags with each dag corresponding to a tag.
  2. Load the dagslist page with echo as True to sqlalchemy engine args to view the tags query being made per dag in api-server logs.
[sqlalchemy]
sql_alchemy_engine_args = {"echo": true}
for index in range(500):
    code = f"""
from datetime import datetime
from airflow.sdk import DAG
from airflow.providers.standard.operators.bash import BashOperator


with DAG(
    dag_id="dag_{index}",
    schedule="@continuous",
    max_active_runs=1,
    catchup=False,
    tags=["tag_{index}"]
):
    task1 = BashOperator(task_id="task1", bash_command="echo dag_{index}")
"""

    with open(f"dag_{index}.py", "w") as f:
        f.write(code)

Operating System

Ubuntu 20.04

Versions of Apache Airflow Providers

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

Labels

AIP-84Modern Rest APIaffected_version:3.0Issues Reported for 3.0area:UIRelated to UI/UX. For Frontend Developers.area:corekind:bugThis is a clearly a bugpriority:mediumBug that should be fixed before next release but would not block a release

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions