-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-48134][CORE] Spark core (java side): Migrate error/warn/info with variables to structured logging framework
#46390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… with variables to structured logging framework
| this.slf4jLogger = slf4jLogger; | ||
| } | ||
|
|
||
| public boolean isErrorEnabled() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is found that some logic uses this method during migration.
| slf4jLogger.error(msg, arg1, arg2); | ||
| } | ||
|
|
||
| public void error(String msg, String... args) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the method signature not public void error(String msg, Object... args)
Because it conflicts with public void error(String msg, MDC... mdcs)
When we want to use error(String msg, MDC... mdcs), it will enter error(String msg, Object... args)
| */ | ||
| package org.apache.spark.io; | ||
|
|
||
| import com.google.common.base.Preconditions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking this chance, let's adjust the import order of java code
| public class ReadAheadInputStream extends InputStream { | ||
|
|
||
| private static final Logger logger = LoggerFactory.getLogger(ReadAheadInputStream.class); | ||
| private static final Logger LOGGER = LoggerFactory.getLogger(ReadAheadInputStream.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's capitalize variables of type static final.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest avoiding such unnecessary change.
In https://www.slf4j.org/api/org/slf4j/Logger.html, the example code is using logger as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay
| requestingConsumer); | ||
| if (LOGGER.isDebugEnabled()) { | ||
| LOGGER.debug("Task {} acquired {} for {}", String.valueOf(taskAttemptId), | ||
| Utils.bytesToString(got), requestingConsumer.toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| underlyingInputStream.close(); | ||
| } catch (IOException e) { | ||
| logger.warn(e.getMessage(), e); | ||
| LOGGER.warn("{}", e, MDC.of(LogKeys.REASON$.MODULE$, e.getMessage())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
REASON => ERROR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay
| MDC.of(LogKeys.THREAD_ID$.MODULE$, Thread.currentThread().getId()), | ||
| MDC.of(LogKeys.MEMORY_SIZE$.MODULE$, Utils.bytesToString(getMemoryUsage())), | ||
| MDC.of(LogKeys.NUM_SPILL_WRITERS$.MODULE$, spillWriters.size()), | ||
| MDC.of(LogKeys.TIME_UNIT$.MODULE$, spillWriters.size() > 1 ? " times" : " time")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SPILL_TIMES?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
…uctured logging framework ### What changes were proposed in this pull request? Since we are targeting on migration INFO/WARN/ERROR level logs to structure logging, this PR removes the logDebug and logTrace methods from the JAVA structured logging framework. ### Why are the changes needed? In the log migration PR #46390, there are unnecessary changes such as updating ``` logger.debug("Task {} need to spill {} for {}", taskAttemptId, Utils.bytesToString(required - got), requestingConsumer); ``` to ``` LOGGER.debug("Task {} need to spill {} for {}", String.valueOf(taskAttemptId), Utils.bytesToString(required - got), requestingConsumer.toString()); ``` With this PR, we can avoid such changes during log migrations. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing UT. ### Was this patch authored or co-authored using generative AI tooling? No Closes #46405 from gengliangwang/updateJavaLog. Authored-by: Gengliang Wang <gengliang@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
|
@gengliangwang |
|
|
||
| package org.apache.spark.util.collection.unsafe.sort; | ||
|
|
||
| import java.io.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's revert the unnecessary changes in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it the import order of the file header, or from org.slf4j.Logger & org.slf4j.LoggerFactory to org.apache.spark.internal.Logger &org.apache.spark.internal.LoggerFactory
The later change is to prohibit the import of org.slf4j.Logger & org.slf4j.LoggerFactory in java code later, so as to prevent subsequent developers from introducingorg.slf4j.Logger & org.slf4j.LoggerFactory again, thus writing inconsistent logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there is no log migration in this file, why do we need to change the import order here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import order has been restored.
I actually have an idea to align the import order of java with the import order of scala.
It's OK, I restored it, 😄
| logger.info("Spilling data because number of spilledRecords crossed the threshold " + | ||
| numElementsForSpillThreshold); | ||
| logger.info("Spilling data because number of spilledRecords crossed the threshold {}", | ||
| MDC.of(LogKeys.SHUFFLE_SPILL_NUM_ELEMENTS_FORCE_SPILL_THRESHOLD$.MODULE$, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this key is sort of long. How about NUM_ELEMENTS_SPILL_THRESHOLD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
gengliangwang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except two comments
|
Thanks, merging to master |
| if (inMemSorter.numRecords() >= numElementsForSpillThreshold) { | ||
| logger.info("Spilling data because number of spilledRecords crossed the threshold " + | ||
| numElementsForSpillThreshold); | ||
| logger.info("Spilling data because number of spilledRecords crossed the threshold {}" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When re-implementing SPARK-27734, I found that there was something wrong with the logs here.
{}" + MDC.of -> {}", MDC.of

What changes were proposed in this pull request?
The pr aims to
1.migrate
error/warn/infoin modulecorewith variables tostructured logging frameworkfor java side.2.convert all dependencies on
org.slf4j.Logger & org.slf4j.LoggerFactorytoorg.apache.spark.internal.Logger & org.apache.spark.internal.LoggerFactory, in order to completelyprohibitimportingorg.slf4j.Logger & org.slf4j.LoggerFactoryin java code later.Why are the changes needed?
To enhance Apache Spark's logging system by implementing structured logging.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No.