Fetch served logs when no remote/executor logs available for non-running task try #39177

kahlstrm · 2024-04-22T19:05:59Z

This PR implements #32561 in a different way. This caused a regression for our use case, where non-running task try logs weren't shown in UI for running tasks. This is due to us storing the logs on the worker with a Persistent Volume.

Instead of never fetching the logs from the server for non-running task tries, try to fetch them if and only if there are no remote or executor logs available already.
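
As a minimal sketch of that rule (the function and argument names are hypothetical and are not the identifiers used in `file_task_handler.py`), the difference for a non-running task try could be written as:

```python
def fetch_served_for_past_try_old(remote_logs, executor_logs):
    """Rule being replaced: never ask the worker's log server for a non-running try."""
    return False


def fetch_served_for_past_try_new(remote_logs, executor_logs):
    """Rule described above: ask the log server only when nothing else was found."""
    # Empty lists are falsy, so this is True only when both sources came back empty.
    return not remote_logs and not executor_logs
```

With a worker-local volume and no remote logging, both lists are empty for an earlier try, so the new rule falls back to the served logs.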

airflow/utils/log/file_task_handler.py

RNHTTR · 2024-04-24T19:29:00Z

I can't resolve my conversations for some reason... maybe because of some git stuff, as it's only showing one commit? Either way, my comments were addressed.

potiuk

@dstandish ? Can you also take a look maybe?

dstandish · 2024-08-05T22:19:54Z

@kahlstrm can you clarify what you mean here?

> This PR implements #32561 in a different way. This caused a regression for our use case, where non-running task try logs weren't shown in UI for running tasks. This is due to us storing the logs on the worker with a Persistent Volume.

Specifically this part:

> This is due to us storing the logs on the worker with a Persistent Volume

What does storing logs on the worker with a PV have to do with anything? If you're storing logs on a PV, shouldn't the webserver have access to it, so it can read the logs directly from the PV?

This PR definitely has introduced a bug, because now users cannot see served logs from triggerer while deferred. But I'm just not sure exactly what functionality here we need to preserve and implement in a different way.

kahlstrm · 2024-08-05T23:02:07Z

> @kahlstrm can you clarify what you mean here?
>
>> This PR implements #32561 in a different way. This caused a regression for our use case, where non-running task try logs weren't shown in UI for running tasks. This is due to us storing the logs on the worker with a Persistent Volume.
>
> Specifically this part:
>
>> This is due to us storing the logs on the worker with a Persistent Volume
>
> What does storing logs on the worker with a PV have to do with anything? If you're storing logs on a PV, shouldn't the webserver have access to it, so it can read the logs directly from the PV?
>
> This PR definitely has introduced a bug, because now users cannot see served logs from triggerer while deferred. But I'm just not sure exactly what functionality here we need to preserve and implement in a different way.

I'm no longer working with this particular project, but the setup was a PV on the worker that was not mounted on the webserver. When it comes to the bug, I would guess this line change is the culprit for the behavior. The reasoning for that line was to enable fetching previous task instance attempt served logs when there are no remote logs available, but this then introduced the incorrect behavior for the deferred case.

Is TaskInstanceState.DEFERRED always the latest task instance attempt? If yes, then changing the aforementioned line to the following would perhaps fix this:

if is_in_running_or_deferred and not executor_messages and (not remote_logs or ti.try_number == try_number):

This would effectively make it act the same as prior to this commit but retain the logic of fetching served logs for previous attempts when no remote logs are available.

This increases the cognitive complexity and hurts readability a bit, so refactoring the boolean logic as well would be ideal.
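
One possible shape for such a refactor, as a sketch only: the first function restates the suggested condition with plain arguments standing in for `ti.try_number` and the surrounding variables, and the second (a hypothetical helper, not part of Airflow) splits it into named steps. The loop checks that both agree on every combination of inputs.

```python
from itertools import product


def suggested_condition(is_in_running_or_deferred, executor_messages, remote_logs,
                        ti_try_number, try_number):
    # The condition as suggested above, with ti.try_number passed in as a plain int.
    return bool(
        is_in_running_or_deferred
        and not executor_messages
        and (not remote_logs or ti_try_number == try_number)
    )


def should_read_served_logs(is_in_running_or_deferred, executor_messages, remote_logs,
                            ti_try_number, try_number):
    """Hypothetical refactor: same logic, one question per line."""
    if not is_in_running_or_deferred or executor_messages:
        return False
    is_latest_try = ti_try_number == try_number
    # Latest try: always check served logs. Earlier try: only if no remote logs exist.
    return is_latest_try or not remote_logs


# Exhaustively compare both versions over all input combinations.
for state, execs, remotes, tries in product(
    (True, False), ([], ["msg"]), ([], ["log"]), ((2, 2), (2, 1))
):
    assert suggested_condition(state, execs, remotes, *tries) == \
        should_read_served_logs(state, execs, remotes, *tries)
```

Under this condition a deferred latest try with remote logs present still checks the served logs, while an earlier try with remote logs does not.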

dstandish · 2024-08-05T23:53:46Z

Thanks @kahlstrm, kind of you to follow up. What I'm going with now in #41272 is basically reverting everything -- the fix before yours (that introduced the bug you found), and your two fixes for the bugs introduced by that fix. I am just not sure it's worth the complexity just to avoid an edge case 403 error message that isn't of much consequence.
If someone has time to reintroduce a better approach to suppressing the 403 in that case (e.g. perhaps just suppress the 403) then they can. But for now, I just want to fix the inability to access logs while in deferred state.
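
For illustration only, the "just suppress the 403" idea could look roughly like the sketch below. Everything in it is an assumption: `read_served_log`, the `fetch` callable (which returns a status code and a body), and the message strings are hypothetical and do not reflect the actual Airflow implementation.

```python
from typing import Callable, List, Tuple


def read_served_log(
    fetch: Callable[[str], Tuple[int, str]], url: str
) -> Tuple[List[str], List[str]]:
    """Hypothetical reader: returns (messages, logs) for one served-log URL."""
    messages: List[str] = []
    logs: List[str] = []
    status, body = fetch(url)
    if status == 403:
        # Assumption: treat a 403 as "nothing to show" rather than surfacing an error.
        return messages, logs
    if status == 200:
        messages.append(f"Found logs served from {url}")
        logs.append(body)
    else:
        messages.append(f"Could not read served logs: HTTP {status}")
    return messages, logs
```

The point of the sketch is that the fetch is always attempted, so the decision about when to check served logs stays simple and only the reporting of the 403 changes.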

kahlstrm · 2024-08-06T00:03:29Z

> Thanks @kahlstrm, kind of you to follow up. What I'm going with now in #41272 is basically reverting everything -- the fix before yours (that introduced the bug you found), and your two fixes for the bugs introduced by that fix. I am just not sure it's worth the complexity just to avoid an edge case 403 error message that isn't of much consequence. If someone has time to reintroduce a better approach to suppressing the 403 in that case (e.g. perhaps just suppress the 403) then they can. But for now, I just want to fix the inability to access logs while in deferred state.

Sounds good to me 👍 I agree with you that adding this amount of logical complexity just to avoid a single request error is not worth it, but I didn't want to revert the desired behavior of #32561 immediately myself. As it has now turned out, having this many bugs and this much unwanted behavior come out of such a change is not worth it IMO.

boring-cyborg bot added the area:logging label Apr 22, 2024

RNHTTR reviewed Apr 22, 2024

airflow/utils/log/file_task_handler.py (outdated, resolved)

kahlstrm requested a review from RNHTTR April 24, 2024 12:04

RNHTTR reviewed Apr 24, 2024

airflow/utils/log/file_task_handler.py (outdated, resolved)

kahlstrm force-pushed the main branch 3 times, most recently from 368bb21 to 3c32218 Compare April 24, 2024 17:04

Get served logs when remote or executor logs not available for non-running task try

ed0bf55

kahlstrm force-pushed the main branch from 3c32218 to ed0bf55 Compare April 24, 2024 17:06

kahlstrm requested a review from RNHTTR April 24, 2024 17:07

eladkal added this to the Airflow 2.9.1 milestone Apr 24, 2024

RNHTTR approved these changes Apr 24, 2024

potiuk approved these changes Apr 24, 2024

dirrao reviewed Apr 25, 2024

airflow/utils/log/file_task_handler.py (resolved)

dstandish approved these changes Apr 25, 2024

potiuk merged commit eca077b into apache:main Apr 25, 2024
42 checks passed

eladkal added the type:bug-fix Changelog: Bug Fixes label Apr 25, 2024

jedcunningham pushed a commit that referenced this pull request Apr 26, 2024

Get served logs when remote or executor logs not available for non-running task try (#39177) (cherry picked from commit eca077b)

505ecd9

ephraimbuddy mentioned this pull request Apr 30, 2024

Status of testing of Apache Airflow 2.9.1rc2 #39326

Closed

56 tasks

kahlstrm mentioned this pull request May 8, 2024

Fetch served logs also when task attempt is up for retry and no remote logs available #39496

Merged

wolfier mentioned this pull request Jul 31, 2024

Webserver does not fetch server logs when there are remote logs #41164

Closed

2 tasks

dstandish added a commit to astronomer/airflow that referenced this pull request Aug 5, 2024

Check served task logs when task running / deferred OR when no remote logs

f07728b

apache#39177 introduced a bug where, if the task was in deferred state, served logs would not be checked.

dstandish mentioned this pull request Aug 5, 2024

Fix check served logs logic #41272

Merged
