Add task duration plot across dagruns #40755

tirkarthi · 2024-07-12T17:03:30Z

closes: #40337
related: #40337

This implements a multi-line chart that plots task duration across all dagruns in the grid. The duration includes only the run duration and is stored in seconds. During rendering, the y-axis value and the tooltip value are converted to hh:mm:ss format. In the duration chart for a single task we can calculate the max duration and pick a display unit from it. Calculating it across all task instances is a problem: if one outlier task runs for hours while the others take only seconds, deselecting the outlier from the legend would still display all the others in hours. Clicking on a data point takes the user to the detail tab of that specific task instance.
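
As a rough, language-agnostic illustration of the seconds-to-label conversion described above, here is a minimal Python sketch. The PR itself does this in TypeScript with moment; the function name `format_duration` is hypothetical and not part of the PR. One deliberate difference is that this sketch does not wrap hours at 24, whereas a time-of-day style format would, as far as I understand, start again from 00h after a full day.

```python
def format_duration(seconds: float) -> str:
    """Render a duration given in seconds as e.g. '01h:02m:03s'."""
    # Clamp negative input to zero and round to whole seconds.
    total = max(0, int(round(seconds)))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    # Hours are not wrapped at 24, so 90000 seconds renders as 25h, not 01h.
    return f"{hours:02d}h:{minutes:02d}m:{secs:02d}s"


print(format_duration(65))     # 00h:01m:05s
print(format_duration(3661))   # 01h:01m:01s
print(format_duration(90000))  # 25h:00m:00s
```

Keeping the stored value in seconds and formatting only at render time avoids the unit-selection problem with outliers, since every series shares the same label format.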

I am facing an issue with the moment import that I couldn't figure out. Any help is appreciated.

static/js/dag/details/task/AllTaskDuration.tsx:67:27 - error TS2686: 'moment' refers to a UMD global, but the current file is a module. Consider adding an import instead.

67       const runDuration = moment
                             ~~~~~~

static/js/dag/details/task/AllTaskDuration.tsx:83:12 - error TS2686: 'moment' refers to a UMD global, but the current file is a module. Consider adding an import instead.

83     return moment.utc(value * 1000).format("HH[h]:mm[m]:ss[s]");
              ~~~~~~

static/js/dag/details/task/AllTaskDuration.tsx:127:28 - error TS2686: 'moment' refers to a UMD global, but the current file is a module. Consider adding an import instead.

127           const duration = moment.utc(value * 1000);

Screenshot

echarts example : https://echarts.apache.org/examples/en/editor.html?c=line-stack&lang=js

tirkarthi · 2024-07-12T17:16:26Z

Another option that I thought would be cool is a stacked bar chart that shows the percentage of time each task took and how it varies across runs. It is like a Gantt view of all dagruns, but normalized to 1, so that each task's time is shown as a percentage of the dagrun duration. Demo link and screenshot below.

Demo Link : stacked bar chart
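
To make the normalization concrete, here is a minimal Python sketch of how per-task shares of a dagrun could be computed. The function name `task_shares` and the sample numbers are made up for illustration; the demo linked above may compute this differently.

```python
def task_shares(task_durations: dict, run_duration: float) -> dict:
    """Return each task's duration as a fraction of the dagrun duration."""
    if run_duration <= 0:
        # Avoid dividing by zero for runs that have not really started.
        return {task_id: 0.0 for task_id in task_durations}
    return {
        task_id: duration / run_duration
        for task_id, duration in task_durations.items()
    }


# Hypothetical dagrun that took 50 seconds in total.
run = {"extract": 15.0, "load": 25.0, "transform": 5.0}
shares = task_shares(run, run_duration=50.0)
for task_id, share in shares.items():
    print(f"{task_id}: {share:.0%}")

# Whatever is left over is time not spent inside any task.
unaccounted = 1.0 - sum(shares.values())
```

Note that the shares need not add up to 1: time spent queued or waiting between tasks is part of the dagrun duration but belongs to no task.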

bbovenzi · 2024-07-15T14:36:14Z

Love the idea of showing task duration even when nothing is selected.

I'm still partial to a bar chart over a line graph. The discrete task instance or dag run is more important than the trendline. So the stacked bar chart is cool. Of course over a certain number of tasks, it isn't worth displaying. But maybe, we can actually merge the dag run and task graphs into a single view and provide various options to users to adjust the chart to best fit how each DAG works.

ketozhang · 2024-07-15T19:31:01Z

Among these suggestions, I just want to remind everyone of the use case in #40337:

> Currently, we can only view the duration of tasks individually, which makes it difficult to identify spikes in specific tasks, especially if the total DAG run duration appears normal

A proportional stacked bar chart would make this use case difficult. A classical stacked bar chart would suit it better, as would a line chart, but I agree with @bbovenzi that a bar chart is better (lines are for trends).

ketozhang · 2024-07-15T19:34:41Z

> Clicking on the data point takes the user to specific task instance detail tab.

❤️ this. Do you predict the user clicking this would be more interested in the details tab or the durations tab (task durations bar chart)?

tirkarthi · 2024-07-18T13:33:54Z

> Clicking on the data point takes the user to specific task instance detail tab.
>
> ❤️ this. Do you predict the user clicking this would be more interested in the details tab or the durations tab (task durations bar chart)?

The details tab was my first guess, but it could even be the logs tab, since the user might want to know why the task took so long. It can be changed as the PR is reviewed; I don't have strong opinions here.

tirkarthi · 2024-07-18T13:49:33Z

Switching between a line chart and a bar chart looks easy to implement. In the latest commit I have added a checkbox, similar to "show landing time" in run duration but without persistence, to make it easier to play around locally if someone is interested. I also wanted to see how the chart looks for dags with a lot of tasks, since many of our dags have at least 10 tasks. Below is a sample dag that generates tasks with random sleep times.

To me the line chart looks unusable when there are a lot of data points that finish around the same time, but it is useful for seeing trends, as noted, when there are fewer tasks with varying durations. The bar chart also looks useful: the user can click on a bar to get to the task detail, as with the line chart, and it looks better in some cases. With a lot of tasks the bars/lines can get crammed, and it might be hard to select the exact one without deselecting a few tasks in the legend.

A few thoughts and questions:

A lot of tasks means the legend can grow larger and possibly collide with the y-axis. I tried various echarts options such as padding and nameGap but couldn't get it working.
Task groups are not expanded into their individual tasks as separate bars.
I have seen the chart go negative while a run is in progress; there might be an issue in how the duration is calculated for running tasks.
Maybe keep the checkbox so that the user can select the best representation for the situation.
The tooltip could probably display only the relevant data point instead of the time for every task. I am not sure which echarts option to use for that.
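
On the negative values for running tasks: the thread does not establish the cause, so the following is only a defensive sketch in Python, assuming the problem comes from a missing end date or from clock skew between the start time and "now". The function name `elapsed_seconds` is hypothetical, and all datetimes are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def elapsed_seconds(
    start: Optional[datetime],
    end: Optional[datetime],
    now: Optional[datetime] = None,
) -> float:
    """Duration in seconds for a finished or still-running task instance."""
    if start is None:
        # The task has not started yet, so there is nothing to plot.
        return 0.0
    if end is None:
        # Still running: measure up to the current time instead.
        end = now or datetime.now(timezone.utc)
    # Never report a negative duration, even if the clocks disagree.
    return max(0.0, (end - start).total_seconds())


start = datetime(2024, 7, 18, 12, 0, tzinfo=timezone.utc)
print(elapsed_seconds(start, start + timedelta(seconds=90)))       # 90.0
print(elapsed_seconds(start, None, now=start - timedelta(seconds=3)))  # 0.0
```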

from datetime import datetime

from airflow import DAG
from airflow.decorators import task, task_group

with DAG(
    dag_id="all_task_duration",
    start_date=datetime(2024, 7, 1),
    catchup=True,
    schedule_interval="@daily",
) as dag:

    def base(i, j):
        import random, time

        duration = random.randrange(i, j)

        for index in range(duration):
            time.sleep(1)
            print(index)

    @task
    def extract():
        base(10, 20)

    @task
    def load():
        base(20, 30)

    @task
    def transform():
        base(2, 5)

    @task
    def transform_1():
        base(2, 5)

    @task
    def transform_2():
        base(2, 5)

    @task
    def transform_3():
        base(2, 5)

    @task
    def transform_4():
        base(2, 5)

    @task
    def transform_5():
        base(2, 5)

    @task
    def transform_6():
        base(2, 5)

    @task
    def transform_7():
        base(2, 5)

    @task
    def transform_8():
        base(2, 5)

    @task
    def transform_9():
        base(2, 5)

    @task
    def transform_10():
        base(2, 5)

    @task
    def transform_11():
        base(2, 5)

    @task_group(group_id="my_task_group")
    def tg1():

        @task
        def tg11():
            base(11, 20)

        @task
        def tg12():
            base(20, 30)

        tg11() >> tg12()

    (
        extract()
        >> load()
        >> tg1()
        >> transform()
        >> transform_1()
        >> transform_2()
        >> transform_3()
        >> transform_4()
        >> transform_5()
        >> transform_6()
        >> transform_7()
        >> transform_8()
        >> transform_9()
        >> transform_10()
        >> transform_11()
    )

Screenshots :

A lot of tasks with the same duration, where the bar chart looks better than the line chart

tirkarthi · 2024-07-18T13:57:35Z

> Love the idea of showing task duration even when nothing is selected.
> I'm still partial to a bar chart over a line graph. The discrete task instance or dag run is more important than the trendline. So the stacked bar chart is cool. Of course over a certain number of tasks, it isn't worth displaying. But maybe, we can actually merge the dag run and task graphs into a single view and provide various options to users to adjust the chart to best fit how each DAG works

Just a thought on merging the dagrun and task duration charts: sometimes the individual task durations don't add up to the dagrun time, since there can be cases where a pool slot is not available, Airflow is down, maintenance is in progress, etc. The task executes fine once the resources/dependencies are met, but its duration might not reflect the overall delay. This would introduce a discrepancy between the bars on the left of the grid view and the chart on the right.

tirkarthi · 2024-07-18T13:59:25Z

@bbovenzi @ketozhang If someone could also help fix the TypeScript issue with the moment import mentioned in the PR description, that would help make the PR green for further review. I am not sure what I am doing wrong here. The class name AllTaskDuration could also be renamed later to something more meaningful.

Thanks

bbovenzi · 2024-07-18T16:00:46Z

Consolidating the run and task charts. I think adding an additional sub-bar which is all the non-task run time would be really helpful for users!
I love selecting and deselecting various tasks. We might want to choose a maximum number of tasks to show for very large DAGs, maybe 20? Later, we can add some additional logic to consolidate all short duration tasks into an "Other" category to keep things readable.
I'm still not a fan of the line chart but toggling between is fine.
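
One possible shape for the task cap and the "Other" bucket, as a minimal Python sketch. Ranking tasks by total duration and the name `cap_tasks` are assumptions made for illustration, not something decided in this thread.

```python
def cap_tasks(totals: dict, max_tasks: int = 20, other_label: str = "Other") -> dict:
    """Keep the longest-running tasks and fold the rest into one bucket."""
    # Longest tasks first, so the short ones are the ones that get folded.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    kept = dict(ranked[:max_tasks])
    if len(ranked) > max_tasks:
        kept[other_label] = sum(duration for _, duration in ranked[max_tasks:])
    return kept


# Hypothetical total durations in seconds, capped at 2 tasks for brevity.
totals = {"extract": 10, "load": 5, "transform": 1, "transform_1": 2}
print(cap_tasks(totals, max_tasks=2))
# {'extract': 10, 'load': 5, 'Other': 3}
```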

I'll pull down your branch later today and help with the ts work

bbovenzi

Since 2.10 is due soon, let's get this merged and we can make enhancements in smaller PRs before release.

tirkarthi requested review from ryanahamilton, ashb, bbovenzi and pierrejeambrun as code owners July 12, 2024 17:03

boring-cyborg bot added area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues labels Jul 12, 2024

bbovenzi mentioned this pull request Jul 15, 2024

Add task status filters to Task Duration Bar chart view #40445

Open

2 tasks

tirkarthi force-pushed the task-duration-all branch from 5f10a1d to 592e9d2 Compare July 18, 2024 13:50

bbovenzi reviewed Jul 19, 2024

tirkarthi added 5 commits July 24, 2024 23:01

Initial commit to display task duration across all task instances.

55d8a0f

Switch to datasets with custom labeling of hh:mm:ss format for time. …

37fe915

…Add support to click on a data point and get to details page.

Format value in tooltip.

52739a7

Add show bar chart option.

8ac76c7

Ignore ts check for moment.

d267aef

tirkarthi force-pushed the task-duration-all branch from 592e9d2 to d267aef Compare July 24, 2024 17:31

eladkal added this to the Airflow 2.10.0 milestone Jul 25, 2024

eladkal added the type:improvement Changelog: Improvements label Jul 25, 2024

bbovenzi approved these changes Jul 30, 2024

bbovenzi merged commit f10fbe6 into apache:main Jul 30, 2024
48 checks passed
