[SPARK-3584] sbin/slaves doesn't work when we use password authentication for SSH #2444
Changes from all commits: 297e75d, 88e2f17, 1bba8a9, 7120a0c, e570431, 53d7121, 7858225, eff7394
```diff
@@ -23,6 +23,7 @@ conf/*.cmd
 conf/*.properties
 conf/*.conf
 conf/*.xml
+conf/slaves
 docs/_site
 docs/api
 target/
```
```diff
@@ -62,7 +62,12 @@ Finally, the following configuration options can be passed to the master and wor
 # Cluster Launch Scripts
 
-To launch a Spark standalone cluster with the launch scripts, you need to create a file called `conf/slaves` in your Spark directory, which should contain the hostnames of all the machines where you would like to start Spark workers, one per line. The master machine must be able to access each of the slave machines via password-less `ssh` (using a private key). For testing, you can just put `localhost` in this file.
+To launch a Spark standalone cluster with the launch scripts, you should create a file called conf/slaves in your Spark directory,
+which must contain the hostnames of all the machines where you intend to start Spark workers, one per line.
+If conf/slaves does not exist, the launch scripts defaults to a single machine (localhost), which is useful for testing.
+Note, the master machine accesses each of the worker machines via ssh. By default, ssh is run in parallel and requires password-less (using a private key) access to be setup.
+If you do not have a password-less setup, you can set the environment variable SPARK_SSH_FOREGROUND and serially provide a password for each worker.
```
> **mattf:** what about - To launch a Spark standalone cluster with the launch scripts, you should create a file called

> **Author (Member):** @mattf Thank you for reviewing. It makes sense.
```diff
 Once you've set up this file, you can launch or stop your cluster with the following shell scripts, based on Hadoop's deploy scripts, and available in `SPARK_HOME/bin`:
```
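The serial-versus-parallel behavior described in the docs above can be sketched as follows. This is a minimal illustration, not the actual slaves.sh code: `run_on` and the host names are made-up stand-ins for the real `ssh $SPARK_SSH_OPTS "$host" ...` invocation.

```shell
#!/bin/sh
# Stand-in for the real ssh command run against each worker host.
run_on() { echo "cmd on $1"; }

# With SPARK_SSH_FOREGROUND set, each connection runs in the foreground,
# so the user can type a password per host; otherwise all connections
# are backgrounded and run in parallel.
SPARK_SSH_FOREGROUND=yes    # unset this to exercise the parallel path
for host in alpha beta; do
  if [ -n "$SPARK_SSH_FOREGROUND" ]; then
    run_on "$host"          # serial: blocks until this host is done
  else
    run_on "$host" &        # parallel: background each connection
  fi
done
wait                        # reap any backgrounded jobs
```

With `SPARK_SSH_FOREGROUND` set, output order is deterministic (`alpha` then `beta`); in the backgrounded case the per-host output can interleave.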
```diff
@@ -44,7 +44,9 @@ sbin="`cd "$sbin"; pwd`"
 # If the slaves file is specified in the command line,
 # then it takes precedence over the definition in
 # spark-env.sh. Save it here.
-HOSTLIST="$SPARK_SLAVES"
+if [ -f "$SPARK_SLAVES" ]; then
+  HOSTLIST=`cat "$SPARK_SLAVES"`
+fi
 
 # Check if --config is passed as an argument. It is an optional parameter.
 # Exit if the argument is not a directory.
@@ -67,23 +69,34 @@ fi
 
 if [ "$HOSTLIST" = "" ]; then
   if [ "$SPARK_SLAVES" = "" ]; then
-    export HOSTLIST="${SPARK_CONF_DIR}/slaves"
+    if [ -f "${SPARK_CONF_DIR}/slaves" ]; then
+      HOSTLIST=`cat "${SPARK_CONF_DIR}/slaves"`
+    else
+      HOSTLIST=localhost
```
> **Contributor:** We should change the docs in

> yes, i was moving too quickly this morning. definitely need something to allow for background ssh.

> **Author (Member):** O.K., I'll add an SSH_FOREGROUND variable and add a description.
```diff
+    fi
   else
-    export HOSTLIST="${SPARK_SLAVES}"
+    HOSTLIST=`cat "${SPARK_SLAVES}"`
```
> why cat here and echo later?

> **Author (Member):** This is to use HOSTLIST as a list of hosts, not a file.

> thanks for pointing that out. i didn't read closely enough.
```diff
   fi
 fi
 
 # By default disable strict host key checking
 if [ "$SPARK_SSH_OPTS" = "" ]; then
   SPARK_SSH_OPTS="-o StrictHostKeyChecking=no"
 fi
 
-for slave in `cat "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
-  ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
-    2>&1 | sed "s/^/$slave: /" &
-  if [ "$SPARK_SLAVE_SLEEP" != "" ]; then
-    sleep $SPARK_SLAVE_SLEEP
-  fi
+for slave in `echo "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
+  if [ -n "${SPARK_SSH_FOREGROUND}" ]; then
+    ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
+      2>&1 | sed "s/^/$slave: /"
+  else
+    ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
+      2>&1 | sed "s/^/$slave: /" &
+  fi
+  if [ "$SPARK_SLAVE_SLEEP" != "" ]; then
+    sleep $SPARK_SLAVE_SLEEP
+  fi
 done
 
 wait
```
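Before the loop, the host list is filtered with `sed "s/#.*$//;/^$/d"`: strip everything from a `#` to end of line, then drop any lines left blank. A quick check of that expression (the hostnames are made up):

```shell
#!/bin/sh
# HOSTLIST holds the contents of a slaves file: one hostname per line,
# possibly with comment lines and blank lines mixed in.
HOSTLIST='host1
# a comment line

host2'

# s/#.*$//  removes comments; /^$/d then deletes the resulting empty lines.
echo "$HOSTLIST" | sed "s/#.*$//;/^$/d"
# prints:
# host1
# host2
```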
> For this one I think we intentionally have a simple slaves file in the repo so that people can start a local cluster by default.

> **Author:** So, this file is not to be edited? The user should use another slave-list file via the SPARK_SLAVES variable, right?

> Okay, this is fine actually, given that we preserve the "default" behavior due to your edits below (of starting at localhost).
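The "default behavior" the reviewer refers to is the fallback introduced in the slaves.sh hunk above: when no `conf/slaves` file exists, the script falls back to `localhost` so a local cluster still starts. A self-contained sketch of that logic (using a fresh temporary directory as a stand-in for a `SPARK_CONF_DIR` with no slaves file):

```shell
#!/bin/sh
# Point SPARK_CONF_DIR at an empty temp directory: no slaves file exists there.
SPARK_CONF_DIR=$(mktemp -d)

# Same fallback as the patched slaves.sh: read the file if present,
# otherwise default the host list to localhost.
if [ -f "${SPARK_CONF_DIR}/slaves" ]; then
  HOSTLIST=`cat "${SPARK_CONF_DIR}/slaves"`
else
  HOSTLIST=localhost
fi

echo "$HOSTLIST"    # prints: localhost
```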