[SPARK-49918][CORE] Use read-only access to conf in SparkContext where appropriate #48402

Closed

@pmenon pmenon commented Oct 9, 2024

What changes were proposed in this pull request?

This PR switches every read-only call to SparkContext.getConf over to SparkContext.conf. getConf clones the conf on each call, which is unnecessary when the caller only reads from it. SparkContext.conf provides read-only access to the conf without the copy.
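To make the difference between the two accessors concrete, here is a minimal sketch in Java. The `Conf` and `Context` classes are hypothetical stand-ins written for illustration only; they are not Spark's real `SparkConf` or `SparkContext`, and they only mirror the behavior described above (one accessor copies, the other returns the shared instance).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration; NOT Spark's real classes.
final class ConfAccessSketch {

    /** Simplified key/value configuration object (assumed shape). */
    static final class Conf {
        private final Map<String, String> settings = new ConcurrentHashMap<>();

        Conf set(String key, String value) {
            settings.put(key, value);
            return this;
        }

        String get(String key, String defaultValue) {
            return settings.getOrDefault(key, defaultValue);
        }

        /** Copies every entry: O(n) work plus a short-lived object for the GC. */
        Conf copy() {
            Conf c = new Conf();
            c.settings.putAll(settings);
            return c;
        }
    }

    /** Simplified context exposing both access styles described in the PR. */
    static final class Context {
        private final Conf conf;

        Context(Conf conf) {
            this.conf = conf;
        }

        /** Returns the shared instance without copying; callers should only read. */
        Conf conf() {
            return conf;
        }

        /** Returns a defensive copy; safe to mutate, but pays for the clone. */
        Conf getConf() {
            return conf.copy();
        }
    }

    public static void main(String[] args) {
        Context sc = new Context(new Conf().set("spark.app.name", "demo"));

        // Both accessors see the same value for a read-only lookup.
        String viaCopy = sc.getConf().get("spark.app.name", "");
        String viaShared = sc.conf().get("spark.app.name", "");
        System.out.println(viaCopy.equals(viaShared));

        // conf() hands back the same object every time; getConf() allocates a new one.
        System.out.println(sc.conf() == sc.conf());
        System.out.println(sc.getConf() == sc.getConf());
    }
}
```

In this sketch a read through `conf()` allocates nothing, while each `getConf()` call copies the whole map. The trade-off is that `conf()` relies on callers not mutating the shared object, which is why the PR limits the switch to call sites that only read.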

Why are the changes needed?

Cloning the entire conf adds unnecessary CPU overhead due to copying, and GC overhead due to cleanup, and both affect tail latencies on certain workloads.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests.

Was this patch authored or co-authored using generative AI tooling?

No

@cloud-fan cloud-fan changed the title from "[CORE] Use read-only access to conf in SparkContext where appropriate" to "[SPARK-49918][CORE] Use read-only access to conf in SparkContext where appropriate" on Oct 10, 2024
@cloud-fan

The GA jobs actually all passed. Thanks, merging to master!

@cloud-fan cloud-fan closed this in ea60e93 Oct 10, 2024
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
…ere appropriate

Closes apache#48402 from pmenon/getconf-optimizations.

Authored-by: Prashanth Menon <prashanth.menon@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>