Optimize calendar view for cron scheduled DAGs #24262
Makes sense. In the long run we should probably make this a method in the timetable class that any subclasses can override to provide an optimisation.
LGTM, hope others review and approve
The planned counts look correct to me. I didn't see any off-by-one issues on my test DAGs.
I had a DAG scheduled for every 15 minutes whose calendar view wouldn't even load; with this change it loads in under 10s.
Later on I think it would be great to interpret the cron string when possible and only iterate through croniter for schedules that we can't interpret simply.
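One way the "interpret the cron string when possible" idea could look, as a rough sketch only: the helper names `_field_count` and `runs_per_day` are hypothetical and not part of Airflow or croniter. It assumes a standard five-field expression, handles only `*`, a plain number, or `*/n` in the minute and hour fields, and returns `None` for anything else so the caller can fall back to iterating with croniter.

```python
from typing import Optional


def _field_count(field: str, size: int) -> Optional[int]:
    """Return how many values a simple cron field matches in range(size).

    Hypothetical helper: returns None for anything it cannot interpret.
    """
    if field == "*":
        return size
    if field.isdigit():
        return 1 if int(field) < size else None
    if field.startswith("*/") and field[2:].isdigit() and int(field[2:]) > 0:
        # "*/n" matches 0, n, 2n, ... below `size`.
        return len(range(0, size, int(field[2:])))
    return None


def runs_per_day(expr: str) -> Optional[int]:
    """Return the number of runs in a full day, or None if not simple.

    None signals that the caller should fall back to iterating the schedule.
    """
    parts = expr.split()
    if len(parts) != 5:
        return None
    minute, hour, day_of_month, month, day_of_week = parts
    # Only schedules that fire every day are handled arithmetically.
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        return None
    minutes = _field_count(minute, 60)
    hours = _field_count(hour, 24)
    if minutes is None or hours is None:
        return None
    return minutes * hours
```

This only covers whole days; a partially elapsed current day would still need separate handling, for example by iterating just that one day.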
The PR is likely OK to be merged with just a subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
(cherry picked from commit 23fb663)
The calendar view was slow for frequently scheduled DAGs. Iterating the timetable with next_dagrun_info added a lot of overhead to the view.
By instead iterating on croniter, we can render the page significantly faster. With my test DAG, running every hour for the rest of the year, this approach took the page from 12s to under a second.
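A minimal sketch of the counting loop, not the code in this PR: `count_planned_runs` and `hourly_stepper` are hypothetical names. To stay runnable without extra dependencies, `hourly_stepper` stands in for the croniter iterator; in the real view the callable would be something like `croniter("0 * * * *", last_run).get_next` called with `datetime` as the return type.

```python
from collections import Counter
from datetime import date, datetime, timedelta
from typing import Callable, Dict


def count_planned_runs(
    get_next: Callable[[], datetime], end: datetime
) -> Dict[date, int]:
    """Count planned runs per calendar day, up to and including `end`."""
    counts: Counter = Counter()
    while True:
        run_time = get_next()
        if run_time > end:
            break
        counts[run_time.date()] += 1
    return dict(counts)


def hourly_stepper(start: datetime) -> Callable[[], datetime]:
    """Stand-in for a croniter iterator on the schedule "0 * * * *".

    Each call returns the next top-of-hour strictly after the previous value.
    """
    current = start.replace(minute=0, second=0, microsecond=0)

    def get_next() -> datetime:
        nonlocal current
        current += timedelta(hours=1)
        return current

    return get_next


# Example: last run at 22:00 on Dec 30, plan through the end of the year.
last_run = datetime(2022, 12, 30, 22, 0)
year_end = datetime(2022, 12, 31, 23, 59, 59)
planned = count_planned_runs(hourly_stepper(last_run), year_end)
```

The loop only advances a cheap iterator and increments a counter, which is where the saving over building a full run-info object for every planned run would come from.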
And bonus! It fixes the planned run count shown in the UI as well (this used to show 1 for my test DAG).
(In draft as I may have an off-by-1 error, and this isn't tested outside my single test DAG)
Related: #23602
(Or maybe it even closes it)