Optimize DAG list query for users with limited access #57460
When users have limited DAG access, the DAG list query was inefficiently
grouping all DagRuns in the database before filtering. This caused severe
performance degradation in large deployments where a user might access
only a few DAGs out of hundreds or thousands.
The fix filters both the main DAG query and the DagRun subquery by
accessible dag_ids before performing the expensive GROUP BY operation.
Before (queries all dagruns):
```sql
SELECT ... FROM dag
LEFT OUTER JOIN (
    SELECT dag_run.dag_id, max(dag_run.id) AS max_dag_run_id
    FROM dag_run
    GROUP BY dag_run.dag_id
) AS mrq ON dag.dag_id = mrq.dag_id
```
After (filters to accessible dags):
```sql
SELECT ... FROM dag
LEFT OUTER JOIN (
    SELECT dag_run.dag_id, max(dag_run.id) AS max_dag_run_id
    FROM dag_run
    WHERE dag_run.dag_id IN ('accessible_dag_1', 'accessible_dag_2')
    GROUP BY dag_run.dag_id
) AS mrq ON dag.dag_id = mrq.dag_id
WHERE dag.dag_id IN ('accessible_dag_1', 'accessible_dag_2')
```
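To make the shape of the filtered query concrete, here is a minimal sketch that runs it against an in-memory SQLite database using Python's standard `sqlite3` module. It is not the code from this PR: the two-column schema, the helper names `build_demo_db` and `latest_runs_filtered`, and the `dag_N` identifiers are all assumptions made for illustration, and a real deployment would build the query through its ORM rather than with raw SQL strings.

```python
import sqlite3


def build_demo_db(num_dags=5, runs_per_dag=3):
    """Create an in-memory DB with simplified, hypothetical dag / dag_run tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE dag_run ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, dag_id TEXT NOT NULL)"
    )
    for d in range(num_dags):
        conn.execute("INSERT INTO dag (dag_id) VALUES (?)", (f"dag_{d}",))
    # Interleave the runs so each DAG's run ids are spread across the table.
    for _ in range(runs_per_dag):
        for d in range(num_dags):
            conn.execute("INSERT INTO dag_run (dag_id) VALUES (?)", (f"dag_{d}",))
    conn.commit()
    return conn


def latest_runs_filtered(conn, accessible_dag_ids):
    """Return (dag_id, max_dag_run_id) rows for the accessible DAGs only.

    The same IN filter is applied inside the subquery (before GROUP BY)
    and on the outer dag query, mirroring the "After" SQL above.
    """
    ids = list(accessible_dag_ids)
    if not ids:
        # An empty IN () list is not valid SQL; nothing is accessible anyway.
        return []
    # Only "?" markers are interpolated; the ids themselves are bound parameters.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT dag.dag_id, mrq.max_dag_run_id
        FROM dag
        LEFT OUTER JOIN (
            SELECT dag_run.dag_id, max(dag_run.id) AS max_dag_run_id
            FROM dag_run
            WHERE dag_run.dag_id IN ({placeholders})
            GROUP BY dag_run.dag_id
        ) AS mrq ON dag.dag_id = mrq.dag_id
        WHERE dag.dag_id IN ({placeholders})
        ORDER BY dag.dag_id
    """
    # The id list is bound twice: once for the subquery, once for the outer query.
    return conn.execute(sql, ids + ids).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(latest_runs_filtered(conn, ["dag_1", "dag_3"]))
```

Because the join is a `LEFT OUTER JOIN`, an accessible DAG with no runs still appears in the result, with `max_dag_run_id` set to `None`.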
Performance impact: In a deployment with 100 DAGs (100 runs each) where
a user has access to only 2 DAGs, this reduces the subquery from grouping
10,000 rows to 200 rows (a 50x reduction in rows grouped) and avoids
fetching 98 unnecessary DAG models.
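The figures in this example follow from simple arithmetic, sketched below. The function name `grouping_row_counts` is hypothetical, and the calculation assumes every DAG has the same number of runs, as in the scenario described. It counts rows fed into the GROUP BY; it is not a measured latency figure.

```python
def grouping_row_counts(total_dags, runs_per_dag, accessible_dags):
    """Estimate rows grouped before and after filtering by accessible dag_ids.

    Assumes every DAG has exactly runs_per_dag rows in dag_run.
    """
    rows_before = total_dags * runs_per_dag       # subquery groups every run
    rows_after = accessible_dags * runs_per_dag   # subquery groups accessible runs only
    # Avoid dividing by zero when the user can access no DAGs.
    factor = rows_before / rows_after if rows_after else None
    return {
        "rows_before": rows_before,
        "rows_after": rows_after,
        "reduction_factor": factor,
        "dag_models_skipped": total_dags - accessible_dags,
    }


if __name__ == "__main__":
    print(grouping_row_counts(total_dags=100, runs_per_dag=100, accessible_dags=2))
```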
Fixes apache#57427
tirkarthi
left a comment
Thanks @kaxil, we tested a very similar patch on a shared cluster used by multiple teams with varying numbers of accessible DAGs. After the fix, the GROUP BY queries that examined all rows stopped appearing in the MySQL slow query logs.
Backport failed to create: v3-1-test.
You can attempt to backport this manually by running: cherry_picker f271f2b v3-1-test
This should apply the commit to the v3-1-test branch and leave the commit in a conflict state. After you have resolved the conflicts, you can continue the backport process by running: cherry_picker --continue
(cherry picked from commit f271f2b)