Skip to content

Conversation

@gengliangwang
Copy link
Member

What changes were proposed in this pull request?

Correct misleading usage of log key TASK_ID from #45834 and #46739

Why are the changes needed?

Provide more accurate log keys in the structure logging

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing GA tests

Was this patch authored or co-authored using generative AI tooling?

No

// re-execute it.
logError(log"Task ${MDC(TASK_ID, info.id)} in stage ${MDC(STAGE_ID, taskSet.id)} " +
log"(TID ${MDC(TID, tid)}) can not write to output file: " +
logError(log"Task ${MDC(TASK_INFO_ID, info.id)} in stage ${MDC(STAGE_ID, taskSet.id)} " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the taskSet.id is s"$stageId.$stageAttemptId"

Copy link
Member

@pan3793 pan3793 Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for cases like TASK_INFO_ID and TASK_SET_ID, I am wondering if we can just expose

  • TASK_INDEX
  • TASK_ATTEMP_NUM
  • STAGE_ID
  • STAGE_ATTEMP_ID

to the MDC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, updated.

isExecutorDecommissioningOrDecommissioned(taskScheduler, bmAddress)
if (ignoreStageFailure) {
logInfo(log"Ignoring fetch failure from ${MDC(TASK_ID, task)} of " +
logInfo(log"Ignoring fetch failure from ${MDC(TASK_NAME, task)} of " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want to call this TASK_NAME?
From the code below, it seems inconsistent:

log"${MDC(TASK_NAME, taskName(attemptInfo.taskId))} on " +

def taskName(tid: Long): String = {
val info = taskInfos.get(tid)
assert(info.isDefined, s"Can not find TaskInfo for task (TID $tid)")
s"task ${info.get.id} in stage ${taskSet.id} (TID $tid)"
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, I can't find a better name here..

log"${MDC(ERROR, ef.description)}; not retrying")
logError(
log"Task ${MDC(TASK_INDEX, info.index)}.${MDC(TASK_ATTEMPT_ID, info.attemptNumber)} " +
log"in stage ${MDC(STAGE_ID, taskSet.stageId)}." +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we unify the writing style of taskSet.logid?
image
the current writing style more consistent with traditional styles.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one only shows once...I was thinking about not having a new method

override def toString: String = "TaskSet " + id

// Identifier used in the structured logging framework.
lazy val logId: MessageWithContext = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we intentionally not treat it as a log parameter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean by "log parameter"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm, it's a MessageWithContext

@gengliangwang
Copy link
Member Author

@pan3793 @panbingkun @cloud-fan Thanks for the review.
Merging to master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants