[SPARK-49033][CORE] Support server-side environmentVariables replacement in REST Submission API
#47509
Could you review this when you have some time, @viirya ?
Could you review this PR when you have some time, @yaooqinn ?
    conf: SparkConf)
  extends SubmitRequestServlet {

  val envVariablePattern = "\\{\\{[A-Z_]+\\}\\}".r
Hmm, where is this envVariablePattern used?
Oh, right. Now it's a no-op. Let me clean it up.
I used it for the Scala 2.12 patch.
Thank you, @viirya . It's removed.
    val updatedMasters = masters.map(
      _.replace(s":$masterRestPort", s":$masterPort")).getOrElse(masterUrl)
    val appArgs = request.appArgs
    // Filter SPARK_LOCAL_(IP|HOSTNAME) environment variables from being set on the remote system.
Let's also update this comment?
Sure.
Thank you. Let me merge this because all CIs passed and the last two commits are only about comment and removal of unused lines. I checked the unit test result too manually.
Late LGTM, do we need a doc for this feature?
Thank you, @yaooqinn . Yes, I'll update it together. I have another PR at this area.
@dongjoon-hyun, Thank you!
### What changes were proposed in this pull request?
This PR aims to support server-side environment variable replacement in the REST Submission API.
- For example, ephemeral Spark clusters with server-side environment variables can provide backend resources and information without touching client-side applications and configurations.
- The placeholder pattern follows the `{{SERVER_ENVIRONMENT_VARIABLE_NAME}}` style, like the following.
https://github.com/apache/spark/blob/163e512c53208301a8511310023d930d8b77db96/docs/configuration.md?plain=1#L694
https://github.com/apache/spark/blob/163e512c53208301a8511310023d930d8b77db96/core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala#L233-L234
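A minimal sketch of how such a placeholder could be resolved on the server side is shown below. It is an illustration only, not the code merged in this PR: the object `PlaceholderSketch`, its methods, and the injected `serverEnv` lookup are hypothetical names. The `[A-Z_]+` pattern is borrowed from the `envVariablePattern` line quoted in the review comments, and leaving an unresolved placeholder untouched is an assumption of the sketch.

```scala
// Illustrative sketch only; names and behavior here are assumptions,
// not the implementation in StandaloneRestServer.
object PlaceholderSketch {
  // Matches a value that consists entirely of {{NAME}} and captures NAME.
  // Regex extractors in Scala match the whole string, so "x{{A}}" does not match.
  private val Placeholder = """\{\{([A-Z_]+)\}\}""".r

  // Replace a single value if it is a placeholder known to the server.
  // serverEnv is injected so the lookup can be tested without real
  // environment variables; a server could pass `name => sys.env.get(name)`.
  def replacePlaceholder(value: String, serverEnv: String => Option[String]): String =
    value match {
      case Placeholder(name) => serverEnv(name).getOrElse(value) // unknown name: keep as-is
      case _ => value // ordinary values pass through unchanged
    }

  // Apply the replacement to every value of a submitted environmentVariables map.
  def resolveAll(
      requested: Map[String, String],
      serverEnv: String => Option[String]): Map[String, String] =
    requested.map { case (key, value) => key -> replacePlaceholder(value, serverEnv) }
}
```

Under these assumptions, calling `PlaceholderSketch.resolveAll(Map("AWS_ENDPOINT_URL" -> "{{AWS_ENDPOINT_URL}}"), name => sys.env.get(name))` on a server started with `AWS_ENDPOINT_URL` set would hand the driver the server's value, while literal values such as `"A"` are passed through as submitted.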
### Why are the changes needed?
A user can submit an environment variable placeholder like `{{AWS_CA_BUNDLE}}` or `{{AWS_ENDPOINT_URL}}` in order to use server-wide environment variables of the Spark Master.
```
$ SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true" \
AWS_ENDPOINT_URL=ENDPOINT_FOR_THIS_CLUSTER \
sbin/start-master.sh
$ sbin/start-worker.sh spark://$(hostname):7077
```
```
curl -s -k -XPOST http://localhost:6066/v1/submissions/create \
  --header "Content-Type:application/json;charset=UTF-8" \
  --data '{
  "appResource": "",
  "sparkProperties": {
    "spark.master": "spark://localhost:7077",
    "spark.app.name": "",
    "spark.submit.deployMode": "cluster",
    "spark.jars": "/Users/dongjoon/APACHE/spark-merge/examples/target/scala-2.13/jars/spark-examples_2.13-4.0.0-SNAPSHOT.jar"
  },
  "clientSparkVersion": "",
  "mainClass": "org.apache.spark.examples.SparkPi",
  "environmentVariables": {
    "AWS_ACCESS_KEY_ID": "A",
    "AWS_SECRET_ACCESS_KEY": "B",
    "AWS_ENDPOINT_URL": "{{AWS_ENDPOINT_URL}}"
  },
  "action": "CreateSubmissionRequest",
  "appArgs": [ "10000" ]
}'
```
- http://localhost:4040/environment/

### Does this PR introduce _any_ user-facing change?
No. This is a new feature and is disabled by default via `spark.master.rest.enabled` (default: `false`).
### How was this patch tested?
Pass the CIs with the newly added test case.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#47509 from dongjoon-hyun/SPARK-49033.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
…I server-side env variable replacements
### What changes were proposed in this pull request?
This PR aims to document the following three recent improvements.
- #47491
- #47509
- #47511
### Why are the changes needed?
To provide an updated documentation.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs and check the HTML manually.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #47523 from dongjoon-hyun/SPARK-49049.
Lead-authored-by: Dongjoon Hyun <dhyun@apple.com>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>