This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Secure HDFS Support #514

Open — wants to merge 10 commits into base: branch-2.2-kubernetes

Conversation

@ifilonenko (Member) commented Sep 28, 2017

RUNNING DIFF FOR Secure HDFS Support

What changes were proposed in this pull request?

This is the ongoing work of setting up Secure HDFS interaction with Spark-on-K8S.
The architecture is discussed in this community-wide Google doc.
This initiative can be broken down into three stages.

STAGE 1

  • Detecting the HADOOP_CONF_DIR environment variable and using ConfigMaps to store all Hadoop config files, while also setting HADOOP_CONF_DIR in the driver / executors (see the sketch below)
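
To make Stage 1 concrete, here is a minimal sketch using the fabric8 Kubernetes client (not the PR's actual code; the helper name and ConfigMap layout are illustrative):

import java.io.File
import java.nio.file.Files

import scala.collection.JavaConverters._

import io.fabric8.kubernetes.api.model.{ConfigMap, ConfigMapBuilder}

// Illustrative sketch: read every file under HADOOP_CONF_DIR and pack the
// contents into one ConfigMap that can be mounted into the driver/executor pods.
def buildHadoopConfigMap(name: String, namespace: String): ConfigMap = {
  val confDir = new File(sys.env("HADOOP_CONF_DIR"))
  val confFiles = confDir.listFiles().filter(_.isFile).toSeq
  val data = confFiles.map { f =>
    f.getName -> new String(Files.readAllBytes(f.toPath))
  }.toMap
  new ConfigMapBuilder()
    .withNewMetadata()
      .withName(name)
      .withNamespace(namespace)
      .endMetadata()
    .addToData(data.asJava)
    .build()
}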

STAGE 2

  • Grabbing the TGT from the LTC (local ticket cache) or using a keytab + principal, and creating a DT (delegation token) that will be mounted as a secret (see the sketch below)
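
A minimal sketch of the Stage 2 token fetch, assuming Hadoop's UserGroupInformation API (the helper name and the "yarn" renewer string are placeholders, not the PR's identifiers):

import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Illustrative sketch: log in from keytab + principal, or fall back to the
// TGT already in the local ticket cache, then fetch HDFS delegation tokens.
def fetchDelegationTokens(
    maybePrincipal: Option[String],
    maybeKeytab: Option[String]): Credentials = {
  val hadoopConf = new Configuration()
  UserGroupInformation.setConfiguration(hadoopConf)
  val ugi = (maybePrincipal, maybeKeytab) match {
    case (Some(principal), Some(keytab)) =>
      UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
    case _ =>
      UserGroupInformation.getCurrentUser  // TGT from the local ticket cache
  }
  val creds = new Credentials()
  ugi.doAs(new PrivilegedExceptionAction[Unit] {
    override def run(): Unit = {
      // "yarn" is a placeholder renewer principal
      FileSystem.get(hadoopConf).addDelegationTokens("yarn", creds)
    }
  })
  creds
}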

STAGE 3

  • Driver + executor logic for consuming the mounted delegation token (see the sketch below)
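
A sketch of what the Stage 3 pod wiring amounts to, assuming the token secret is mounted as a volume and exposed through Hadoop's HADOOP_TOKEN_FILE_LOCATION environment variable (the secret, volume, and path names are placeholders):

import io.fabric8.kubernetes.api.model.{Container, ContainerBuilder, Pod, PodBuilder}

// Illustrative sketch: mount the delegation-token secret into a pod and point
// HADOOP_TOKEN_FILE_LOCATION at it so the Hadoop client picks the token up.
def bootstrapKerberosPod(pod: Pod, container: Container): (Pod, Container) = {
  val kerberizedPod = new PodBuilder(pod)
    .editOrNewSpec()
      .addNewVolume()
        .withName("hadoop-token")
        .withNewSecret()
          .withSecretName("hadoop-token-secret")
          .endSecret()
        .endVolume()
      .endSpec()
    .build()
  val kerberizedContainer = new ContainerBuilder(container)
    .addNewVolumeMount()
      .withName("hadoop-token")
      .withMountPath("/mnt/secrets/hadoop-token")
      .endVolumeMount()
    .addNewEnv()
      .withName("HADOOP_TOKEN_FILE_LOCATION")
      .withValue("/mnt/secrets/hadoop-token/hadoop.token")
      .endEnv()
    .build()
  (kerberizedPod, kerberizedContainer)
}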

How was this patch tested?

  • E2E Integration tests
    • Stage 1
    • Stage 2
    • Stage 3
  • Unit tests
    • Stage 1
    • Stage 2
    • Stage 3

Docs and Error Handling?

  • Docs
  • Error Handling

kimoonkim and others added 8 commits August 1, 2017 15:34
[WIP] Use HDFS Delegation Token in driver/executor pods as part of Secure HDFS Support
* Initial architecture design for HDFS support

* Minor styling

* Added proper logic for mounting ConfigMaps

* styling

* modified otherKubernetesResource logic

* fixed Integration tests and modified HADOOP_CONF_DIR variable to be FILE_DIR for Volume mount

* setting HADOOP_CONF_DIR env variables

* Included integration tests for Stage 1

* Initial Kerberos support

* initial Stage 2 architecture using deprecated 2.1 methods

* Added current, BROKEN, integration test environment for review

* working hadoop cluster

* Using locks and monitors to ensure proper configs for setting up kerberized cluster in integration tests

* working Stage 2

* documentation

* Integration Stages 1,2 and 3

* further testing work

* fixing imports

* Stage 3 Integration tests pass

* uncommented SparkDockerBuilder

* testing fix

* handled comments and increased test hardening

* Solve failing integration test problem and lower TIMEOUT time

* modify security.authorization

* Modifying HADOOP_CONF flags

* Refactored tests and included modifications to pass all tests regardless of environment

* Adding unit test and one more integration test

* completed unit tests w/o UGI mocking

* cleanup and various small fixes

* added back sparkdockerbuilder images

* address initial comments and scalastyle issues

* addresses comments from PR

* mocking hadoopUGI

* Fix executor env to include simple authn

* Fix a bug in executor env handling

* Fix a bug in how the driver sets simple authn

* handling PR comments
@ifilonenko (Member, Author)

PR is now up to date with branch-2.2-kubernetes and is awaiting review

@erikerlandson (Member)

LGTM, pending successful CI

@erikerlandson (Member)

rerun integration tests please

@mccheah left a comment:

Some things stand out before we should merge here.

new KeyToPathBuilder()
  .withKey(file.toPath.getFileName.toString)
  .withPath(file.toPath.getFileName.toString)
  .build()).toList

Any reason for the .toList here?
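
For reference, the surrounding map already yields a Seq[KeyToPath], so the trailing .toList only matters if a concrete List is required downstream — a self-contained sketch (confFiles and the helper name are assumed):

import java.io.File

import io.fabric8.kubernetes.api.model.{KeyToPath, KeyToPathBuilder}

// Sketch: mapping over a Seq already yields Seq[KeyToPath]; no .toList needed
// unless a concrete immutable List type is required by the caller.
def toKeyPaths(confFiles: Seq[File]): Seq[KeyToPath] =
  confFiles.map { file =>
    new KeyToPathBuilder()
      .withKey(file.toPath.getFileName.toString)
      .withPath(file.toPath.getFileName.toString)
      .build()
  }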

def getCurrentTime: Long = System.currentTimeMillis()

// Functions that should be in Core with Rebase to 2.3
@deprecated("Moved to core in 2.2", "2.2")

Think we mean 2.3 in these comments.

val interval = newExpiration - identifier.getIssueDate
interval
}.toOption}
if (renewIntervals.isEmpty) None else Some(renewIntervals.min)

renewIntervals.map instead of checking on isEmpty and returning None.


Actually since renewIntervals is a Seq, use reduceLeftOption with math.min.
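
The suggestion, as a sketch:

// reduceLeftOption returns None for an empty collection, so the explicit
// isEmpty check goes away.
def shortestRenewInterval(renewIntervals: Seq[Long]): Option[Long] =
  renewIntervals.reduceLeftOption(math.min)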

PodWithMainContainer(
hadoopConfigSpec.driverPod,
hadoopConfigSpec.driverContainer
))

Nit: Braces to the previous line

maybeKeytab: Option[File],
maybeRenewerPrincipal: Option[String],
hadoopUGI: HadoopUGIUtil) extends HadoopConfigurationStep with Logging{
private var originalCredentials: Credentials = _

Indentation is off here - these should all be 2 space indented. The arguments should be 4 space indented (but we have not been consistent on that through the whole project).
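
How that guideline would look on the quoted constructor (a formatting sketch only; the class name is assumed from the surrounding diff):

private[spark] class HadoopKerberosKeytabResolverStep(
    maybeKeytab: Option[File],
    maybeRenewerPrincipal: Option[String],
    hadoopUGI: HadoopUGIUtil) extends HadoopConfigurationStep with Logging {

  // class body indented by 2 spaces, constructor arguments by 4
  private var originalCredentials: Credentials = _
}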

@@ -29,7 +29,8 @@ import org.apache.spark.SparkFunSuite
import org.apache.spark.deploy.k8s.{PodWithDetachedInitContainer, SparkPodInitContainerBootstrap}
import org.apache.spark.deploy.k8s.config._

class BaseInitContainerConfigurationStepSuite extends SparkFunSuite with BeforeAndAfter{
private[spark] class BaseInitContainerConfigurationStepSuite
extends SparkFunSuite with BeforeAndAfter{

Again, if it all fit on one line before, keep it that way

import io.fabric8.kubernetes.client.Watcher.Action
import io.fabric8.kubernetes.client.dsl.{FilterWatchListDeletable, MixedOperation, NonNamespaceOperation, PodResource}
import org.mockito.{AdditionalAnswers, ArgumentCaptor, Mock, MockitoAnnotations}

Import ordering is broken and this will fail Scalastyle.
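
For reference, Spark's Scalastyle expects import groups in this order — java/javax, scala, third-party, org.apache.spark — separated by blank lines and alphabetized within each group:

import java.io.File

import scala.collection.JavaConverters._

import io.fabric8.kubernetes.client.Watcher.Action
import io.fabric8.kubernetes.client.dsl.{FilterWatchListDeletable, MixedOperation, NonNamespaceOperation, PodResource}
import org.mockito.{AdditionalAnswers, ArgumentCaptor, Mock, MockitoAnnotations}

import org.apache.spark.SparkFunSuite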

*/
private[spark] class KerberizedHadoopClusterLauncher(
kubernetesClient: KubernetesClient,
namespace: String) extends Logging {

Indentation is a little off

.get().get(0) match {
case deployment: Deployment =>
val deploymentWithEnv: Deployment = new DeploymentBuilder(deployment)
.editSpec()

Indent all lines from this one down to the end of the chain in by one level.

scala.collection.mutable.Map[String, String]()
private var lock: Lock = new ReentrantLock()
private var nnBounded: Condition = lock.newCondition()
private var ktBounded: Condition = lock.newCondition()

I would be more comfortable with using Futures instead of Condition locks, but perhaps that's a matter of preference - I'm not sure if there are any precedents in the codebase we can follow.
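
A sketch of the Future-based alternative (class and method names are hypothetical): each readiness event completes a Promise, and dependents sequence the Futures instead of waiting on Conditions.

import scala.concurrent.{ExecutionContext, Future, Promise}

// Hypothetical sketch: namenode readiness and keytab binding each complete a
// Promise; whenReady completes once both have fired.
class ClusterReadiness(implicit ec: ExecutionContext) {
  private val nnBounded = Promise[Unit]()
  private val ktBounded = Promise[Unit]()

  def namenodeUp(): Unit = nnBounded.trySuccess(())
  def keytabBound(): Unit = ktBounded.trySuccess(())

  def whenReady: Future[Unit] =
    for {
      _ <- nnBounded.future
      _ <- ktBounded.future
    } yield ()
}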

renewedTokens: Iterable[Token[_ <: TokenIdentifier]],
hadoopConf: Configuration): Option[Long] = {
val renewIntervals = renewedTokens.filter {
_.decodeIdentifier().isInstanceOf[AbstractDelegationTokenIdentifier]}

Indent as follows:

val renewIntervals = renewedTokens.filter {
    _.decodeIdentifier()....
  }.flatMap { token =>
    Try {
      // logic
    }.toOption
  }

PodWithMainContainer(executorHadoopConfPod, executorHadoopConfContainer))
(podWithMainContainer.pod, podWithMainContainer.mainContainer)
}.getOrElse((executorHadoopConfPod, executorHadoopConfContainer))
val resolvedExecutorPod = new PodBuilder(executorKerberosPod)
@ifilonenko (Member, Author):

this line is breaking integration tests... fixed in the other PR

@mccheah commented Dec 4, 2017

Reviewing this change is still quite difficult given that it's coming close to 3800 lines. Is there any way we can break this down further?

I think we want to split this up into changes on the submission client and changes on the scheduler backend, like we are doing for shipping the core project upstream.

We can also afford to write some code that technically doesn't actually change the end behavior, but lays the groundwork and architecture for the rest of the code.

@ifilonenko (Member, Author)

I am assuming you are commenting on the other PR. There are a lot of moving parts, but the brunt of it is in one of the hadoopSteps and the integration tests. Everything else is pretty cookie-cutter, no?

liyinan926 pushed a commit that referenced this pull request Dec 12, 2017
* first stage of PR #514 of just logic

* fixing build and unit test issues

* fixed integration tests

* fixed issue with executorPodFactory unit tests

* first series of PR comments

* handle most PR comments

* third round of PR comments

* initial round of comments and initial unit tests for deploy

* handled most of the comments and added test cases for pods

* resolve conflicts

* merge conflicts

* adding thread sleeping for RSS issues as a test

* resolving comments and unit testing

* regarding comments on PR