Optimize calendar view for cron scheduled DAGs #24262

Merged
merged 1 commit into from
Jun 9, 2022

Conversation

jedcunningham
Member

@jedcunningham jedcunningham commented Jun 6, 2022

The calendar view was slow for frequently scheduled DAGs. Iterating the timetable with next_dagrun_info added a lot of overhead to the view.

By instead iterating on croniter, we can render the page significantly faster. With my test DAG, running every hour for the rest of the year, this approach took the page from 12s to under a second.
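
A rough sketch of the idea, not the code from this PR: walk a lightweight iterator of fire times once and tally the planned runs per day, instead of asking the timetable for each next run. `hourly_fire_times` and `planned_runs_per_day` are hypothetical names, and the hourly generator is only a stand-in for the croniter iterator so that the snippet runs with the standard library alone.

```python
from collections import Counter
from datetime import date, datetime, timedelta
from typing import Iterator


def hourly_fire_times(start: datetime) -> Iterator[datetime]:
    """Hypothetical stand-in for a cron iterator.

    Yields every top of the hour strictly after ``start``.
    """
    current = start.replace(minute=0, second=0, microsecond=0)
    while True:
        current += timedelta(hours=1)
        yield current


def planned_runs_per_day(fire_times: Iterator[datetime], end: datetime) -> Counter:
    """Tally fire times per calendar day, stopping once ``end`` is passed."""
    counts: Counter = Counter()
    for fire_time in fire_times:
        if fire_time > end:
            break
        counts[fire_time.date()] += 1
    return counts


# Assumed inputs: the last existing run and the end of the calendar year.
last_run = datetime(2022, 6, 6, 15, 0)
year_end = datetime(2022, 12, 31, 23, 59, 59)
counts = planned_runs_per_day(hourly_fire_times(last_run), year_end)
```

Each step here is a plain datetime addition, which is why a loop of this shape stays cheap even when it covers several thousand planned runs.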

And bonus! It fixes the planned run count shown in the UI as well (this used to show 1 for my test DAG):

[Screenshot: Screen Shot 2022-06-06 at 3 55 25 PM]

(In draft as I may have an off-by-1 error, and this isn't tested outside my single test DAG)

Related: #23602
(Or maybe it even closes it)

@boring-cyborg boring-cyborg bot added the area:webserver Webserver related Issues label Jun 6, 2022
@uranusjr
Member

uranusjr commented Jun 7, 2022

Makes sense. In the long run we should probably make this a method in the timetable class that any subclasses can override to provide an optimisation.
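
One way this suggestion could look, as a minimal sketch under stated assumptions: the classes and the `iter_planned_runs` method below are hypothetical and are not Airflow's actual timetable API. The base class offers a generic default that asks for one run at a time, and a subclass overrides it with a cheaper loop.

```python
from datetime import datetime, timedelta
from typing import Iterator, Optional


class BaseTimetable:
    """Hypothetical base class, for illustration only."""

    def next_run_after(self, moment: datetime) -> Optional[datetime]:
        raise NotImplementedError

    def iter_planned_runs(self, start: datetime, end: datetime) -> Iterator[datetime]:
        """Generic default: repeatedly ask for the next run after the last one."""
        current = self.next_run_after(start)
        while current is not None and current <= end:
            yield current
            current = self.next_run_after(current)


class FixedIntervalTimetable(BaseTimetable):
    """Hypothetical subclass that fires every ``interval`` from ``anchor``."""

    def __init__(self, anchor: datetime, interval: timedelta) -> None:
        self.anchor = anchor
        self.interval = interval

    def next_run_after(self, moment: datetime) -> Optional[datetime]:
        # Number of whole intervals elapsed, plus one, gives the first
        # fire time strictly after ``moment``.
        steps = (moment - self.anchor) // self.interval + 1
        return self.anchor + steps * self.interval

    def iter_planned_runs(self, start: datetime, end: datetime) -> Iterator[datetime]:
        """Optimised override: locate the first run once, then just add the interval."""
        current = self.next_run_after(start)
        while current <= end:
            yield current
            current += self.interval


timetable = FixedIntervalTimetable(datetime(2022, 6, 6), timedelta(minutes=15))
window_start = datetime(2022, 6, 6, 0, 0)
window_end = datetime(2022, 6, 6, 1, 0)
fast = list(timetable.iter_planned_runs(window_start, window_end))
# Calling the base implementation on the same object gives the generic path.
slow = list(BaseTimetable.iter_planned_runs(timetable, window_start, window_end))
```

A view written against `iter_planned_runs` would not need to know which kind of timetable it is dealing with; timetables without an override would simply fall back to the generic loop.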

Contributor

@vincentkoc vincentkoc left a comment

LGTM, hope others review and approve

@bbovenzi bbovenzi marked this pull request as ready for review June 9, 2022 17:46
Contributor

@bbovenzi bbovenzi left a comment

The planned counts look correct to me. I didn't see any off-by-one issues on my test DAGs.

I had a DAG scheduled for every 15 minutes whose calendar view went from not loading at all to loading in under 10s.

Later on I think it would be great to interpret the cron string when possible and only iterate through croniter for schedules that we can't interpret simply.
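
A minimal sketch of what "interpreting the cron string" might mean, assuming only the simplest five-field expressions are handled: a plain number, `*`, or `*/N` in the minute and hour fields, with `*` everywhere else. `runs_per_full_day` and `_count_field` are hypothetical helpers; anything they cannot interpret returns `None`, which would be the signal to fall back to iterating.

```python
import re
from typing import Optional


def _count_field(field: str, size: int) -> Optional[int]:
    """Count how many values in range(size) a simple cron field matches."""
    if field == "*":
        return size
    if field.isdigit():
        return 1 if int(field) < size else None
    step_match = re.fullmatch(r"\*/(\d+)", field)
    if step_match:
        step = int(step_match.group(1))
        if step == 0:
            return None
        # "*/N" matches 0, N, 2N, ... below the field size.
        return len(range(0, size, step))
    return None  # lists, ranges, names: not interpreted here


def runs_per_full_day(cron: str) -> Optional[int]:
    """Return the number of runs in one full day, or None if not interpretable.

    Only covers full days; a partial first or last day still needs its own handling.
    """
    fields = cron.split()
    if len(fields) != 5:
        return None
    minute, hour, day_of_month, month, day_of_week = fields
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        return None
    minute_count = _count_field(minute, 60)
    hour_count = _count_field(hour, 24)
    if minute_count is None or hour_count is None:
        return None
    return minute_count * hour_count


every_15_minutes = runs_per_full_day("*/15 * * * *")
weekdays_at_nine = runs_per_full_day("0 9 * * 1-5")
```

With a helper like this, most days in the calendar could be filled in with a single multiplication, and only the uninterpretable schedules would pay for iteration.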

@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Jun 9, 2022
@github-actions

github-actions bot commented Jun 9, 2022

The PR is likely OK to be merged with just a subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

@bbovenzi bbovenzi merged commit 23fb663 into apache:main Jun 9, 2022
@bbovenzi bbovenzi deleted the calendar_perf branch June 9, 2022 17:58
@eladkal eladkal added this to the Airflow 2.3.3 milestone Jun 10, 2022
ephraimbuddy pushed a commit that referenced this pull request Jun 29, 2022
@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Jun 30, 2022
Labels
area:webserver Webserver related Issues okay to merge It's ok to merge this PR as it does not require more tests type:bug-fix Changelog: Bug Fixes