
Conversation

@Ngone51
Member

@Ngone51 Ngone51 commented Jan 12, 2021

What changes were proposed in this pull request?

This PR proposes to rename all decommission configurations to use the same namespace, "spark.decommission.*".

It also refines the config "spark.decommission.storage.fallbackStorage.path" to "spark.decommission.storage.fallbackStoragePath".

Why are the changes needed?

Currently, decommission configurations use different namespaces, e.g.,

spark.decommission.*
spark.storage.decommission.*
spark.executor.decommission.*

which may introduce unnecessary overhead for end-users. It's better to keep them under the same namespace.
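For illustration, the proposed consolidation can be sketched as a plain key mapping. Only the fallbackStoragePath rename appears in the diff in this PR; the other old/new pairs below are assumptions about how existing decommission configs would map into the unified namespace, not the PR's actual rename list.

```scala
// Hypothetical sketch of the proposed rename: old mixed-namespace keys
// mapped to the unified "spark.decommission.*" namespace.
object DecommissionConfigRename {
  // Assumed mapping for illustration; only the last entry is shown in the PR diff.
  val renames: Map[String, String] = Map(
    "spark.storage.decommission.enabled" ->
      "spark.decommission.storage.enabled",
    "spark.executor.decommission.killInterval" ->
      "spark.decommission.executor.killInterval",
    "spark.storage.decommission.fallbackStorage.path" ->
      "spark.decommission.storage.fallbackStoragePath"
  )

  // Translate a config key, leaving keys outside the mapping untouched.
  def translate(key: String): String = renames.getOrElse(key, key)
}
```

A migration shim like this is one way such a rename could stay backward compatible, by translating old keys before lookup.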

Does this PR introduce any user-facing change?

No, since Spark 3.1 has not been officially released yet.

How was this patch tested?

Pass existing tests.

@github-actions github-actions bot added the CORE label Jan 12, 2021
@Ngone51
Member Author

Ngone51 commented Jan 12, 2021

cc @holdenk @cloud-fan Please take a look, thanks!


private[spark] val STORAGE_DECOMMISSION_FALLBACK_STORAGE_PATH =
-  ConfigBuilder("spark.storage.decommission.fallbackStorage.path")
+  ConfigBuilder("spark.decommission.storage.fallbackStoragePath")

@SparkQA

SparkQA commented Jan 12, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38556/

@SparkQA

SparkQA commented Jan 12, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38556/

@SparkQA

SparkQA commented Jan 12, 2021

Test build #133969 has finished for PR 31151 at commit 6087e59.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

cloud-fan commented Jan 12, 2021

I'm fine with it, as it's easier to track all the configs of the decommission feature. But we may need to make the rule clear: how to name feature configs if the feature crosses many components like worker, master, executor, etc.?

@Ngone51
Member Author

Ngone51 commented Jan 12, 2021

I actually once had a short discussion (https://github.com/apache/spark/pull/28053/files#r400238749) with @tgravescs about config naming on the CORE side while working on the "resource" feature. cc @tgravescs for more input.

@tgravescs
Contributor

Right, so this goes against the current naming standards, as was brought up in that discussion. The current format of these configs matches the convention where spark.executor, spark.worker, etc. are the prefixes, and the documentation in many cases categorizes configs into sections by these prefixes. I get that naming them by feature may make things easier in some cases, but I also see where naming them as-is makes things easier in others. When I set up worker configuration, I look for all spark.worker configs, and those might go into a separate file just for workers, whereas I put spark.executor configs in a file to be loaded by all applications; when I am looking at things that might optimize shuffle, I look at all spark.shuffle configs. My point is that one person may think the new way is easier, while others may think it is easier as-is. We have been using spark.executor, spark.worker, etc. for a long time, so changing that will introduce some confusion, especially if it is only done for a single feature.

So I'm against making this change unless we decide in general to change our naming rules for configs, and I think that should be decided on by the devs and documented appropriately. These configs were put in with the feature and reviewed by committers; we should not be changing them at the last minute without good cause.

@cloud-fan
Contributor

I agree with @tgravescs. That said, I believe "spark.storage.decommission.fallbackStorage.path" is still worth fixing. @Ngone51 can you revert the other changes and only fix that config?

@holdenk
Contributor

holdenk commented Jan 12, 2021

-1 on putting this into 3.1, it's not a bug fix and we've started the RC process already.
I'm open to considering a larger refactor in future versions.

@cloud-fan
Contributor

cloud-fan commented Jan 12, 2021

@dongjoon-hyun do you think we will have more sub-configs under the spark.storage.decommission.fallbackStorage namespace? If not, I think it violates the config naming policy, and we should use spark.storage.decommission.fallbackStoragePath instead of spark.storage.decommission.fallbackStorage.path.
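The naming policy being applied here can be sketched as a small check: a dotted sub-namespace such as "...fallbackStorage.path" is only justified when more sub-configs live under it; a single leaf should instead be a camelCase name like "...fallbackStoragePath". The helper below is purely illustrative and not part of Spark.

```scala
// Hypothetical check for the leaf-naming rule: a key violates the rule
// if it is the only config under its immediate dotted namespace.
def violatesLeafRule(allKeys: Set[String], key: String): Boolean = {
  val sep = key.lastIndexOf('.')
  if (sep < 0) false // no namespace component at all
  else {
    val parent = key.substring(0, sep)
    // Count every config under this key's immediate namespace.
    allKeys.count(_.startsWith(parent + ".")) == 1
  }
}
```

For example, with only fallbackStorage.path defined, the rule flags it; adding a second sub-config under fallbackStorage (such as a hypothetical fallbackStorage.cleanUp) would justify keeping the dotted namespace.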

@dongjoon-hyun
Member

@cloud-fan Yes. Actually, I have more configurations internally and have a plan to deliver them to the community.
In the first PR, I intentionally dropped .enabled to simplify the PR.


@dongjoon-hyun
Member

To say explicitly, I also want to give -1 for this.

@HyukjinKwon
Member

HyukjinKwon commented Jan 13, 2021

"I'm against making this change unless we decide in general to change our naming rules for configs, and I think that should be decided on by the devs and documented appropriately."

I agree with this. PySpark has the same issue, and I have been tracking it:

SQL:

  • spark.sql.pyspark.jvmStacktrace.enabled
  • spark.sql.execution.arrow.pyspark.*
  • spark.sql.execution.pyspark.udf.simplifiedTraceback.enabled

CORE

  • spark.executor.pyspark.memory
  • spark.pyspark.driver.python
  • spark.pyspark.python
  • spark.kubernetes.pyspark.pythonVersion
  • spark.python.daemon.module
  • spark.python.worker.*
  • spark.python.use.daemon
  • spark.python.task.killTimeout
  • spark.python.authenticate.socketTimeout
  • spark.python.profile.*

We should also merge the two configurations spark.executor.pyspark.memory and spark.python.worker.memory (SPARK-26679).
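Merging two overlapping configs like these usually means picking a precedence order and falling back from one to the other. The sketch below is an assumption about how such a merge could resolve, with the newer key winning; it is not Spark's actual behavior, and SPARK-26679 may choose a different rule.

```scala
// Hypothetical resolution for the two overlapping PySpark memory settings:
// prefer the newer spark.executor.pyspark.memory, fall back to the legacy
// spark.python.worker.memory. Precedence order is assumed for illustration.
def pysparkMemory(conf: Map[String, String]): Option[String] =
  conf.get("spark.executor.pyspark.memory")
    .orElse(conf.get("spark.python.worker.memory"))
```

Under this rule a job that only sets the legacy key keeps working, while setting both keys makes the newer one win.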

It would be great if we had a general naming convention to keep to. Then we could deprecate and rename the configs. The current naming is kind of confusing.

@Ngone51
Member Author

Ngone51 commented Jan 13, 2021

Thanks for everyone's feedback. I'll close this PR for now, although any ideas about naming rules are still welcome :)

@Ngone51 Ngone51 closed this Jan 13, 2021
@dongjoon-hyun
Member

Thank you for your decision, @Ngone51 .

