Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add task context logging feature to allow forwarding messages to task logs #32646

Merged
merged 44 commits into from
Nov 17, 2023

Conversation

pankajkoti
Copy link
Member

@pankajkoti pankajkoti commented Jul 17, 2023

The PR adds a feature by adding the TaskContextLogger class that
can forward messages from Airflow components like Scheduler,
Executor, etc to the task logs. This is helpful when in exception
scenarios the task is marked as failed from these components
e.g. when task times out because it remains stuck in the queue or
when a zombie task is killed. In such scenarios, currently, no task
logs are available in the UI and the user does not have a clue why
the task failed. Forwarding such event messages to the task logs
will give a clear idea to the user.

The PR also adds a config param to disable this feature in case
something is observed to be not working well in the Airflow
components due to the addition of this feature.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:logging area:Scheduler including HA (high availability) scheduler labels Jul 17, 2023
@pankajkoti pankajkoti changed the title Add arbitrary log shipper Add task log shipper Jul 18, 2023
@uranusjr
Copy link
Member

What does ship means here? I don’t see it explained anywhere and it isn’t a commonly used word in this context, as far as I know.

@dstandish
Copy link
Contributor

What does ship means here? I don’t see it explained anywhere and it isn’t a commonly used word in this context, as far as I know.

Totally open to better names here. By ship i just mean send. We're sending from executor etc to task instance logs, which is something up to now we can't do.

@dstandish
Copy link
Contributor

Also, if you have any other design suggestions, that is welcome

@pankajkoti pankajkoti changed the title Add task log shipper Add task log shipper feature to allow forwarding messages to task logs Jul 19, 2023
@pankajkoti
Copy link
Member Author

The failing test is unrelated and seen in other PRs too. It was reported here getmoto/moto#7031 (comment) and likely supposed to get fixed

@dstandish
Copy link
Contributor

The failing test is unrelated and seen in other PRs too. It was reported here getmoto/moto#7031 (comment) and likely supposed to get fixed

i created #35687 which seems to fix it on my machine

@pankajkoti pankajkoti marked this pull request as ready for review November 16, 2023 16:55
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks really cool (providing the TODO that was supposed to be removed, will be removed that is).

@pankajkoti pankajkoti merged commit 2f69b5f into apache:main Nov 17, 2023
46 checks passed
@pankajkoti pankajkoti deleted the arbitrary-log-shipper branch November 17, 2023 07:19
pankajkoti added a commit to astronomer/airflow that referenced this pull request Nov 17, 2023
With the addition of taxt context logging feature in PR apache#32646,
this PR extends the feature to AWS S3 when is it set as remote
logging store. Here, backward compatibility is ensured for older
versions of Airflow that do not have the feature included in
Airflow Core.
pankajkoti added a commit to astronomer/airflow that referenced this pull request Nov 17, 2023
With the addition of taxt context logging feature in PR apache#32646,
this PR extends the feature to GCP Cloud storage when is it set as
remote logging store. Here, backward compatibility is ensured for
older versions of Airflow that do not have the feature included in
Airflow Core.
pankajkoti added a commit that referenced this pull request Nov 17, 2023
…32950)

With the addition of taxt context logging feature in PR #32646,
this PR extends the feature to AWS S3 when is it set as remote
logging store. Here, backward compatibility is ensured for older
versions of Airflow that do not have the feature included in
Airflow Core.
pankajkoti added a commit that referenced this pull request Nov 17, 2023
…32970)

With the addition of taxt context logging feature in PR #32646,
this PR extends the feature to GCP Cloud storage when is it set as
remote logging store. Here, backward compatibility is ensured for
older versions of Airflow that do not have the feature included in
Airflow Core.
pankajkoti added a commit that referenced this pull request Nov 17, 2023
…ure Blob Storage) (#32972)

With the addition of task context logging feature in PR #32646,
this PR extends the feature to Azure Blob Storage (WASB)  when is
it set as remote logging store. Here, backward compatibility is
ensured for older versions of Airflow that do not have the feature
included in Airflow Core.
dstandish pushed a commit to astronomer/airflow that referenced this pull request Nov 17, 2023
…earch

With the addition of task context logging feature in PR apache#32646,
this PR extends the feature to Elasticsearch when is it set as
remote logging store. Here, backward compatibility is ensured for
older versions of Airflow that do not have the feature included
in Airflow Core.
@ephraimbuddy ephraimbuddy added the type:new-feature Changelog: New Features label Nov 20, 2023
pankajkoti added a commit that referenced this pull request Nov 21, 2023
…earch (#32977)

* Extend task context logging support for remote logging using Elasticsearch

With the addition of task context logging feature in PR #32646,
this PR extends the feature to Elasticsearch when is it set as
remote logging store. Here, backward compatibility is ensured for
older versions of Airflow that do not have the feature included
in Airflow Core.

* update ensure_ti

---------

Co-authored-by: Daniel Standish <15932138+dstandish@users.noreply.github.com>
ephraimbuddy pushed a commit that referenced this pull request Nov 23, 2023
…earch (#32977)

* Extend task context logging support for remote logging using Elasticsearch

With the addition of task context logging feature in PR #32646,
this PR extends the feature to Elasticsearch when is it set as
remote logging store. Here, backward compatibility is ensured for
older versions of Airflow that do not have the feature included
in Airflow Core.

* update ensure_ti

---------

Co-authored-by: Daniel Standish <15932138+dstandish@users.noreply.github.com>
pankajkoti added a commit to astronomer/airflow that referenced this pull request Nov 26, 2023
Using the feature built in apache#32646, when the scheduler marks tasks
stuck in queued as failed, send such an explicit log indicating
the action to the task logs so that it helps users identify why
exactly the task was marked failed in such a case.
ephraimbuddy pushed a commit that referenced this pull request Nov 26, 2023
…earch (#32977)

* Extend task context logging support for remote logging using Elasticsearch

With the addition of task context logging feature in PR #32646,
this PR extends the feature to Elasticsearch when is it set as
remote logging store. Here, backward compatibility is ensured for
older versions of Airflow that do not have the feature included
in Airflow Core.

* update ensure_ti

---------

Co-authored-by: Daniel Standish <15932138+dstandish@users.noreply.github.com>
pankajkoti added a commit that referenced this pull request Nov 26, 2023
…35857)

Using the feature built in #32646, when the scheduler marks tasks
stuck in queued as failed, send such an explicit log indicating
the action to the task logs so that it helps users identify why
exactly the task was marked failed in such a case.

---------

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:logging area:Scheduler including HA (high availability) scheduler type:new-feature Changelog: New Features use public runners Makes sure that Public runners are used even if commiters creates the PR (useful for testing)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants