This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Secure HDFS Support #514

Open — wants to merge 10 commits into base: branch-2.2-kubernetes

Conversation

@ifilonenko (Member) commented Sep 28, 2017

RUNNING DIFF FOR Secure HDFS Support

What changes were proposed in this pull request?

This is the ongoing work of setting up Secure HDFS interaction with Spark-on-K8S.
The architecture is discussed in this community-wide Google doc.
This initiative can be broken down into three stages.

STAGE 1

  • Detecting the HADOOP_CONF_DIR environment variable and using ConfigMaps to store all Hadoop config files, while also setting HADOOP_CONF_DIR in the driver / executors (see the sketch below)
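
To make Stage 1 concrete, here is a minimal sketch using the fabric8 Kubernetes client (not the PR's actual code; the helper name and ConfigMap layout are illustrative):

import java.io.File
import java.nio.file.Files

import scala.collection.JavaConverters._

import io.fabric8.kubernetes.api.model.{ConfigMap, ConfigMapBuilder}

// Illustrative sketch: read every file under HADOOP_CONF_DIR and pack the
// contents into one ConfigMap that can be mounted into the driver/executor pods.
def buildHadoopConfigMap(name: String, namespace: String): ConfigMap = {
  val confDir = new File(sys.env("HADOOP_CONF_DIR"))
  val confFiles = confDir.listFiles().filter(_.isFile).toSeq
  val data = confFiles.map { f =>
    f.getName -> new String(Files.readAllBytes(f.toPath))
  }.toMap
  new ConfigMapBuilder()
    .withNewMetadata()
      .withName(name)
      .withNamespace(namespace)
      .endMetadata()
    .addToData(data.asJava)
    .build()
}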

STAGE 2

  • Grabbing the TGT from the LTC (local ticket cache) or using a keytab + principal, and creating a DT (delegation token) that will be mounted as a secret (see the sketch below)
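
A minimal sketch of the Stage 2 token fetch, assuming Hadoop's UserGroupInformation API (the helper name and the "yarn" renewer string are placeholders, not the PR's identifiers):

import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Illustrative sketch: log in from keytab + principal, or fall back to the
// TGT already in the local ticket cache, then fetch HDFS delegation tokens.
def fetchDelegationTokens(
    maybePrincipal: Option[String],
    maybeKeytab: Option[String]): Credentials = {
  val hadoopConf = new Configuration()
  UserGroupInformation.setConfiguration(hadoopConf)
  val ugi = (maybePrincipal, maybeKeytab) match {
    case (Some(principal), Some(keytab)) =>
      UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
    case _ =>
      UserGroupInformation.getCurrentUser  // TGT from the local ticket cache
  }
  val creds = new Credentials()
  ugi.doAs(new PrivilegedExceptionAction[Unit] {
    override def run(): Unit = {
      // "yarn" is a placeholder renewer principal
      FileSystem.get(hadoopConf).addDelegationTokens("yarn", creds)
    }
  })
  creds
}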

STAGE 3

  • Driver + executor logic for consuming the mounted delegation token (see the sketch below)
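
A sketch of what the Stage 3 pod wiring amounts to, assuming the token secret is mounted as a volume and exposed through Hadoop's HADOOP_TOKEN_FILE_LOCATION environment variable (the secret, volume, and path names are placeholders):

import io.fabric8.kubernetes.api.model.{Container, ContainerBuilder, Pod, PodBuilder}

// Illustrative sketch: mount the delegation-token secret into a pod and point
// HADOOP_TOKEN_FILE_LOCATION at it so the Hadoop client picks the token up.
def bootstrapKerberosPod(pod: Pod, container: Container): (Pod, Container) = {
  val kerberizedPod = new PodBuilder(pod)
    .editOrNewSpec()
      .addNewVolume()
        .withName("hadoop-token")
        .withNewSecret()
          .withSecretName("hadoop-token-secret")
          .endSecret()
        .endVolume()
      .endSpec()
    .build()
  val kerberizedContainer = new ContainerBuilder(container)
    .addNewVolumeMount()
      .withName("hadoop-token")
      .withMountPath("/mnt/secrets/hadoop-token")
      .endVolumeMount()
    .addNewEnv()
      .withName("HADOOP_TOKEN_FILE_LOCATION")
      .withValue("/mnt/secrets/hadoop-token/hadoop.token")
      .endEnv()
    .build()
  (kerberizedPod, kerberizedContainer)
}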

How was this patch tested?

  • E2E Integration tests
    • Stage 1
    • Stage 2
    • Stage 3
  • Unit tests
    • Stage 1
    • Stage 2
    • Stage 3

Docs and Error Handling?

  • Docs
  • Error Handling

kimoonkim and others added 8 commits August 1, 2017 15:34
[WIP] Use HDFS Delegation Token in driver/executor pods as part of Secure HDFS Support
* Initial architecture design for HDFS support

* Minor styling

* Added proper logic for mounting ConfigMaps

* styling

* modified otherKubernetesResource logic

* fixed Integration tests and modified HADOOP_CONF_DIR variable to be FILE_DIR for Volume mount

* setting HADOOP_CONF_DIR env variables

* Included integration tests for Stage 1

* Initial Kerberos support

* initial Stage 2 architecture using deprecated 2.1 methods

* Added current, BROKEN, integration test environment for review

* working hadoop cluster

* Using locks and monitors to ensure proper configs for setting up kerberized cluster in integration tests

* working Stage 2

* documentation

* Integration Stages 1,2 and 3

* further testing work

* fixing imports

* Stage 3 Integration tests pass

* uncommented SparkDockerBuilder

* testing fix

* handled comments and increased test hardening

* Solve failing integration test problem and lower TIMEOUT time

* modify security.authorization

* Modifying HADOOP_CONF flags

* Refactored tests and included modifications to pass all tests regardless of environment

* Adding unit test and one more integration test

* completed unit tests w/o UGI mocking

* cleanup and various small fixes

* added back sparkdockerbuilder images

* address initial comments and scalastyle issues

* addresses comments from PR

* mocking hadoopUGI

* Fix executor env to include simple authn

* Fix a bug in executor env handling

* Fix a bug in how the driver sets simple authn

* handling PR comments
@ifilonenko (Member, Author)

PR is now up to date with branch-2.2-kubernetes and is awaiting review

@erikerlandson (Member)

LGTM, pending successful CI

@erikerlandson (Member)

rerun integration tests please

@mccheah left a comment:

Some things stand out before we should merge here.

new KeyToPathBuilder()
  .withKey(file.toPath.getFileName.toString)
  .withPath(file.toPath.getFileName.toString)
  .build()).toList

Any reason for the .toList here?
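
For reference, the surrounding map already yields a Seq[KeyToPath], so the trailing .toList only matters if a concrete List is required downstream — a self-contained sketch (confFiles and the helper name are assumed):

import java.io.File

import io.fabric8.kubernetes.api.model.{KeyToPath, KeyToPathBuilder}

// Sketch: mapping over a Seq already yields Seq[KeyToPath]; no .toList needed
// unless a concrete immutable List type is required by the caller.
def toKeyPaths(confFiles: Seq[File]): Seq[KeyToPath] =
  confFiles.map { file =>
    new KeyToPathBuilder()
      .withKey(file.toPath.getFileName.toString)
      .withPath(file.toPath.getFileName.toString)
      .build()
  }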

def getCurrentTime: Long = System.currentTimeMillis()

// Functions that should be in Core with Rebase to 2.3
@deprecated("Moved to core in 2.2", "2.2")

Think we mean 2.3 in these comments.

val interval = newExpiration - identifier.getIssueDate
interval
}.toOption}
if (renewIntervals.isEmpty) None else Some(renewIntervals.min)

renewIntervals.map instead of checking on isEmpty and returning None.


Actually since renewIntervals is a Seq, use reduceLeftOption with math.min.
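
The suggestion, as a sketch:

// reduceLeftOption returns None for an empty collection, so the explicit
// isEmpty check goes away.
def shortestRenewInterval(renewIntervals: Seq[Long]): Option[Long] =
  renewIntervals.reduceLeftOption(math.min)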

PodWithMainContainer(
hadoopConfigSpec.driverPod,
hadoopConfigSpec.driverContainer
))

Nit: Braces to the previous line

maybeKeytab: Option[File],
maybeRenewerPrincipal: Option[String],
hadoopUGI: HadoopUGIUtil) extends HadoopConfigurationStep with Logging{
private var originalCredentials: Credentials = _

Indentation is off here - these should all be 2 space indented. The arguments should be 4 space indented (but we have not been consistent on that through the whole project).
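
How that guideline would look on the quoted constructor (a formatting sketch only; the class name is assumed from the surrounding diff):

private[spark] class HadoopKerberosKeytabResolverStep(
    maybeKeytab: Option[File],
    maybeRenewerPrincipal: Option[String],
    hadoopUGI: HadoopUGIUtil) extends HadoopConfigurationStep with Logging {

  // class body indented by 2 spaces, constructor arguments by 4
  private var originalCredentials: Credentials = _
}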

@@ -29,7 +29,8 @@ import org.apache.spark.SparkFunSuite
import org.apache.spark.deploy.k8s.{PodWithDetachedInitContainer, SparkPodInitContainerBootstrap}
import org.apache.spark.deploy.k8s.config._

class BaseInitContainerConfigurationStepSuite extends SparkFunSuite with BeforeAndAfter{
private[spark] class BaseInitContainerConfigurationStepSuite
extends SparkFunSuite with BeforeAndAfter{

Again, if it all fit on one line before, keep it that way

import io.fabric8.kubernetes.client.Watcher.Action
import io.fabric8.kubernetes.client.dsl.{FilterWatchListDeletable, MixedOperation, NonNamespaceOperation, PodResource}
import org.mockito.{AdditionalAnswers, ArgumentCaptor, Mock, MockitoAnnotations}

Import ordering is broken and this will fail Scalastyle.
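
For reference, Spark's Scalastyle expects import groups in this order — java/javax, scala, third-party, org.apache.spark — separated by blank lines and alphabetized within each group:

import java.io.File

import scala.collection.JavaConverters._

import io.fabric8.kubernetes.client.Watcher.Action
import io.fabric8.kubernetes.client.dsl.{FilterWatchListDeletable, MixedOperation, NonNamespaceOperation, PodResource}
import org.mockito.{AdditionalAnswers, ArgumentCaptor, Mock, MockitoAnnotations}

import org.apache.spark.SparkFunSuite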

*/
private[spark] class KerberizedHadoopClusterLauncher(
kubernetesClient: KubernetesClient,
namespace: String) extends Logging {

Indentation is a little off

.get().get(0) match {
case deployment: Deployment =>
val deploymentWithEnv: Deployment = new DeploymentBuilder(deployment)
.editSpec()

Indent all lines from this one down to the end of the chain in by one level.

scala.collection.mutable.Map[String, String]()
private var lock: Lock = new ReentrantLock()
private var nnBounded: Condition = lock.newCondition()
private var ktBounded: Condition = lock.newCondition()

I would be more comfortable with using Futures instead of Condition locks, but perhaps that's a matter of preference - I'm not sure if there are any precedents in the codebase we can follow.
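
A sketch of the Future-based alternative (class and method names are hypothetical): each readiness event completes a Promise, and dependents sequence the Futures instead of waiting on Conditions.

import scala.concurrent.{ExecutionContext, Future, Promise}

// Hypothetical sketch: namenode readiness and keytab binding each complete a
// Promise; whenReady completes once both have fired.
class ClusterReadiness(implicit ec: ExecutionContext) {
  private val nnBounded = Promise[Unit]()
  private val ktBounded = Promise[Unit]()

  def namenodeUp(): Unit = nnBounded.trySuccess(())
  def keytabBound(): Unit = ktBounded.trySuccess(())

  def whenReady: Future[Unit] =
    for {
      _ <- nnBounded.future
      _ <- ktBounded.future
    } yield ()
}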

renewedTokens: Iterable[Token[_ <: TokenIdentifier]],
hadoopConf: Configuration): Option[Long] = {
val renewIntervals = renewedTokens.filter {
_.decodeIdentifier().isInstanceOf[AbstractDelegationTokenIdentifier]}

Indent as follows:

val renewIntervals = renewedTokens.filter {
    _.decodeIdentifier()....
  }.flatMap { token =>
    Try {
      // logic
    }.toOption
  }

PodWithMainContainer(executorHadoopConfPod, executorHadoopConfContainer))
(podWithMainContainer.pod, podWithMainContainer.mainContainer)
}.getOrElse((executorHadoopConfPod, executorHadoopConfContainer))
val resolvedExecutorPod = new PodBuilder(executorKerberosPod)
@ifilonenko (Member, Author):

this line is breaking integration tests... fixed in the other PR

@mccheah commented Dec 4, 2017

Reviewing this change is still quite difficult given that it's coming close to 3800 lines. Is there any way we can break this down further?

I think we want to split this up into changes on the submission client and changes on the scheduler backend, like we are doing for shipping the core project upstream.

We can also afford to write some code that technically doesn't actually change the end behavior, but lays the groundwork and architecture for the rest of the code.

@ifilonenko (Member, Author)

I am assuming you are commenting on the other PR. There are a lot of moving parts, but the brunt of it is in one of the hadoopSteps and the integration tests. Everything else is pretty cookie-cutter, no?

liyinan926 pushed a commit that referenced this pull request Dec 12, 2017
* first stage of PR #514 of just logic

* fixing build and unit test issues

* fixed integration tests

* fixed issue with executorPodFactory unit tests

* first series of PR comments

* handle most PR comments

* third round of PR comments

* initial round of comments and initial unit tests for deploy

* handled most of the comments and added test cases for pods

* resolve conflicts

* merge conflicts

* adding thread sleeping for RSS issues as a test

* resolving comments and unit testing

* regarding comments on PR