Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/add log grouping to UI #38021

Merged
merged 8 commits into from
Mar 14, 2024

Conversation

jscheffl
Copy link
Contributor

@jscheffl jscheffl commented Mar 10, 2024

This PR is another "eye candy" for upcoming 2.9.0 release - and I hope I pass the review and merge if before release cut-off.

Logging is usually a mass of text. Tasks/Operators generate a lot of text and it is hard to read and understand what is going on. Expertise need to be built-up. In our departments users are mostly over-whelmed with the logs in Airflow.

A bit of filtering had been introduced but what I personally always was missing was a kind of log grouping which is existing in Github or also Azure DevOps Pipelines.

This PR adds Log Grouping also to Airflow with the following effects:

  • Log grouping can be made with the same keywords like in Github or AzureDevOps - means that if users have scripts from these systems, they could be used in Airflow as well. Start a group with ::group::Some Headline and end it with ::endgroup::
  • Task instance runs automatically "fold" log details of task execution prior and post execute() method, making logs clean per default (showing only real business logic)
  • If a user un-folds a section, un-folding state is remembered in Grid view
  • Grouping is also working in old/legacy log view as well as Grid if the log is updated (auto-tail when task is running)

Example, how do logs look like for the example_params_trigger_ui DAG:
image

If you click on the blue colored log group header, the section un-folds:
image

One example added to example_python_operator DAG, prints are grouped giving the following results in the task logs:
image

FYI @clellmann @wolfdn @AutomationDev85

@boring-cyborg boring-cyborg bot added area:logging area:Scheduler including HA (high availability) scheduler area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues kind:documentation labels Mar 10, 2024
@potiuk
Copy link
Member

potiuk commented Mar 10, 2024

I ❤️ it. Been using it in GitHub for ages.

Folding before/after execute automatically is cool.

@potiuk
Copy link
Member

potiuk commented Mar 10, 2024

Just a warning - there might be some non-trivial performance issue created by it. But I think it's worth risking it.

@jscheffl
Copy link
Contributor Author

Just a warning - there might be some non-trivial performance issue created by it. But I think it's worth risking it.

Yeah, mainly depending on your browser/CPU speed and log volume. Anyway will only hit of the regex can make a match in log lines, but will need to crawl over N log lines. Will do a short test. Today log is stopping to be processed at 1M lines, asking for download.

@jscheffl
Copy link
Contributor Author

jscheffl commented Mar 10, 2024

Just a warning - there might be some non-trivial performance issue created by it. But I think it's worth risking it.

Okay, did a test with a DAG and just created ~40.000 and ~400.000 log lines - in 400 log groups of 100/1000 lines content:

  • 40k lines: loading time 1-2s in grid view
  • 400k lines: loading time of 12-15s, then scrolling and viewing/folding/unfolding was 100% smooth
    • View of same logs in "Legacy" non Grid foll-screen log view took only 6s!

Test on Core i7-1280P / HP Laptop / Firefox (other browsers might be even faster)
Test on Airflow running via breeze, NOT dev mode (optimized JS compiled!)
Note that production of 400k print lines of LOG in Python 3.11 took 55 seconds to execute - viewing logs in browser after 15s was quite fast for this :-D - Log file was in total 32MB for 400.000 lines

Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one, it looks good!

I agree with Jarek on the performance issue, but IMHO we can merge it into beta and then decide if we need to add configuration to enable the feature and disable it by default.

@bbovenzi bbovenzi added this to the Airflow 2.9.0 milestone Mar 11, 2024
@jscheffl jscheffl force-pushed the feature/add-log-grouping-to-ui branch from 3ced77d to 8b695c1 Compare March 11, 2024 20:10
@jscheffl
Copy link
Contributor Author

Did some further performance measurement on my Laptop. Interesting, if logs are folded actually rendering is even faster than if no grouping... maybe because most content is hidden and saves rendering time:

main before log coloring PR #37985 main today this PR
Log creation with DAG - - 43s
Firefox - Logs in Grid View 20s 18.2s 15.9s
Firefox - Logs in Legacy View 36.3 37.1 14.8
Chromium - Logs in Grid View 21.3s 20.5s 14.8s
Chromium - Logs in Legacy View 24.3s 25.3s 15.2s

All measured with my HP Laptop, Intel Core i7-1280P, Ubuntu 20.04. Measurement w/o proper tooling, just with a stop-watch :-D

FYI the 400k logs were generated with the following code in a PythonOperator:

        print("Now some log-mess...")
        for i in range(400):
            print(f"::group::Mess #{i}")
            for j in range(1000):
                print(f"messi line {i} - {j}")
            print("::endgroup::")
        print("DONE with log-mess!")

@potiuk
Copy link
Member

potiuk commented Mar 11, 2024

Did some further performance measurement on my Laptop. Interesting, if logs are folded actually rendering is even faster than if no grouping... maybe because most content is hidden and saves rendering time:

😱 -> yeah. that's likely not a very "realistic" example, but good enough. From my experience in GitHub Actions, the biggest issue is scrolling through HUGHE folded group - but I think a lot of it is becuase of the relatively slow backend handling a LOT LOT LOT of clients vs relatively fast front-end. Airlfow has far less of that problem - the webserver will be handling few (maybe few tens) of people maximum.

@jscheffl
Copy link
Contributor Author

Did some further performance measurement on my Laptop. Interesting, if logs are folded actually rendering is even faster than if no grouping... maybe because most content is hidden and saves rendering time:

😱 -> yeah. that's likely not a very "realistic" example, but good enough. From my experience in GitHub Actions, the biggest issue is scrolling through HUGHE folded group - but I think a lot of it is becuase of the relatively slow backend handling a LOT LOT LOT of clients vs relatively fast front-end. Airlfow has far less of that problem - the webserver will be handling few (maybe few tens) of people maximum.

Agree, not realistic but wanted to see the drop - and was surprised it is even faster in the case. I was fearing it is a lot slower but I have no fear anymore. 😱

@jscheffl jscheffl force-pushed the feature/add-log-grouping-to-ui branch 2 times, most recently from b52b1fa to 7795efa Compare March 12, 2024 20:06
@jscheffl jscheffl force-pushed the feature/add-log-grouping-to-ui branch from 7795efa to 6e75f6b Compare March 13, 2024 22:56
Copy link
Contributor

@bbovenzi bbovenzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

@bbovenzi bbovenzi merged commit 54d0180 into apache:main Mar 14, 2024
56 checks passed
drajguru pushed a commit to drajguru/airflow that referenced this pull request Mar 14, 2024
* Add folding feature for log group to web UI

* Add folding feature for log groups, documentation

* Group logs from pre-task execution and post-execution logs

* Add state to the log block in grid view when task is running and logs are updated

* Spelling

* Review feedback

* Review feedback v2

* Enhance unicode arrow symbol compatability
@ephraimbuddy ephraimbuddy added the type:new-feature Changelog: New Features label Mar 14, 2024
@jscheffl jscheffl deleted the feature/add-log-grouping-to-ui branch March 14, 2024 19:21
@dstandish
Copy link
Contributor

One issue with this is (IIUC) that it assumes that groups won't overlap. But logs for a TI can come from multiple sources and this might not always be true. First it was only worker. Then we added triggerer. Then we added scheduler (in exceptional circumstances). Now there is interest in logging to TI logs any action specific to a TI. (e.g. scheduler state changes). With AIP-44, there could be logs from webserver / internal API server. To the extent that these groups can overlap in time, the folding will be misleading.

@jscheffl
Copy link
Contributor Author

One issue with this is (IIUC) that it assumes that groups won't overlap. But logs for a TI can come from multiple sources and this might not always be true. First it was only worker. Then we added triggerer. Then we added scheduler (in exceptional circumstances). Now there is interest in logging to TI logs any action specific to a TI. (e.g. scheduler state changes). With AIP-44, there could be logs from webserver / internal API server. To the extent that these groups can overlap in time, the folding will be misleading.

Yes, agree - partly. Groups can overlap. And if you have a large codebase you need to be careful. But you are not focred to use grouping in your custom code of tasks. I hope I took care in the case in pre/post execution that I catched all cases. If you thing explicitly I missed any let me know. Happy to add a patch/bugfix.

As of your concerns I just added an explicit check which I did not do before... what happens if you nest groups...?!?
Experiment with group nesting:

        print("before")
        print("::group::Outer Group")
        print("within 1")
        print("::group::Inner Group 1")
        print("within 2a - inside a second group")
        print("::endgroup::")
        print("within 2b")
        print("::group::Inner Group 2")
        print("within 2c - inside a second group")
        print("::endgroup::")
        print("within 3")
        print("::endgroup::")
        print("after")

Result: Seems to work. Nested groups allow nested folding:
image

If the numbers of group starts and ends do not match logs might look "weired". I'd assume if more groups are opened than ended still you can un-fold. Too many group ends will not be a problem. If group starts and ends are completely mixed the context might look strange but in general I'd assume all logs should be at least un-foldable by the user.

If you for example take this bad usage:

        print("before")
        print("::group::Group 1 forgot to end")
        print("within 1")
        print("::group::Group 2 forgot to end")
        print("within 2")
        print("::group::Group 3 forgot to end")
        print("within 3")
        print("after")

Result will render a folding group to the end of logs "eating up" other groups:
image

...but still it is possible to un-fold all:
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:logging area:Scheduler including HA (high availability) scheduler area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues kind:documentation type:new-feature Changelog: New Features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants