Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-34232][CORE] Redact SparkListenerEnvironmentUpdate event in log #31335

Closed
wants to merge 1 commit into from

Conversation

warrenzhu25
Copy link
Contributor

What changes were proposed in this pull request?

Redact event SparkListenerEnvironmentUpdate in log when its processing time exceeded logSlowEventThreshold

Why are the changes needed?

Credentials could be exposed when its processing time exceeded logSlowEventThreshold

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually tested

@github-actions github-actions bot added the CORE label Jan 26, 2021
private[spark] def redactEvent(e: E): E = {
e match {
case event: SparkListenerEnvironmentUpdate =>
event
Copy link
Contributor

@HeartSaVioR HeartSaVioR Jan 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's reuse EventLoggingListener.redactEvent. Probably change the code in EventLoggingListener like below, and call here?

private[spark] class EventLoggingListener(

...

  private[spark] def redactEvent(
      event: SparkListenerEnvironmentUpdate): SparkListenerEnvironmentUpdate = {
    EventLoggingListener.redactEvent(sparkConf, event)
  }
}

private[spark] object EventLoggingListener extends Logging {

...

  def redactEvent(
      sparkConf: SparkConf,
      event: SparkListenerEnvironmentUpdate): SparkListenerEnvironmentUpdate = {
    // environmentDetails maps a string descriptor to a set of properties
    // Similar to:
    // "JVM Information" -> jvmInformation,
    // "Spark Properties" -> sparkProperties,
    // ...
    // where jvmInformation, sparkProperties, etc. are sequence of tuples.
    // We go through the various  of properties and redact sensitive information from them.
    val noRedactProps = Seq("Classpath Entries")
    val redactedProps = event.environmentDetails.map {
      case (name, props) if noRedactProps.contains(name) => name -> props
      case (name, props) => name -> Utils.redact(sparkConf, props)
    }
    SparkListenerEnvironmentUpdate(redactedProps)
  }
}

@warrenzhu25 warrenzhu25 force-pushed the 34232 branch 2 times, most recently from 3374183 to ab796cf Compare January 26, 2021 04:40
@HeartSaVioR
Copy link
Contributor

ok to test

@HeartSaVioR
Copy link
Contributor

Looks OK, but the code change is proposed by me so I'd feel comfort to wait for another review.

@HyukjinKwon
Copy link
Member

ok to test

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM (only one minor comment about function visibility).

@SparkQA
Copy link

SparkQA commented Jan 26, 2021

Test build #134482 has started for PR 31335 at commit 618ddc4.

@SparkQA
Copy link

SparkQA commented Jan 26, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39066/

@SparkQA
Copy link

SparkQA commented Jan 26, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39068/

@SparkQA
Copy link

SparkQA commented Jan 26, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39068/

@SparkQA
Copy link

SparkQA commented Jan 26, 2021

Test build #134481 has finished for PR 31335 at commit ab796cf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

HyukjinKwon pushed a commit that referenced this pull request Jan 26, 2021
### What changes were proposed in this pull request?
Redact event SparkListenerEnvironmentUpdate in log when its processing time exceeded logSlowEventThreshold

### Why are the changes needed?
Credentials could be exposed when its processing time exceeded logSlowEventThreshold

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually tested

Closes #31335 from warrenzhu25/34232.

Authored-by: Warren Zhu <warren.zhu25@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit 68b765e)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@HyukjinKwon
Copy link
Member

Merged to master and branch-3.1.

skestle pushed a commit to skestle/spark that referenced this pull request Feb 3, 2021
### What changes were proposed in this pull request?
Redact event SparkListenerEnvironmentUpdate in log when its processing time exceeded logSlowEventThreshold

### Why are the changes needed?
Credentials could be exposed when its processing time exceeded logSlowEventThreshold

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually tested

Closes apache#31335 from warrenzhu25/34232.

Authored-by: Warren Zhu <warren.zhu25@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
HyukjinKwon pushed a commit that referenced this pull request Feb 24, 2021
…in log

### What changes were proposed in this pull request?

Redact event SparkListenerEnvironmentUpdate in log when its processing time exceeded logSlowEventThreshold.

This is the backport of #31335 to branch-3.0.

### Why are the changes needed?

Credentials could be exposed when its processing time exceeded logSlowEventThreshold.

As this is related to security issue, it is better to backport it.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Manually tested in original PR.

Closes #31634 from viirya/SPARK-34232-3.0.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
rshkv pushed a commit to palantir/spark that referenced this pull request Mar 9, 2021
…in log

### What changes were proposed in this pull request?

Redact event SparkListenerEnvironmentUpdate in log when its processing time exceeded logSlowEventThreshold.

This is the backport of apache#31335 to branch-3.0.

### Why are the changes needed?

Credentials could be exposed when its processing time exceeded logSlowEventThreshold.

As this is related to security issue, it is better to backport it.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Manually tested in original PR.

Closes apache#31634 from viirya/SPARK-34232-3.0.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@warrenzhu25 warrenzhu25 deleted the 34232 branch June 11, 2022 17:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants