
Discuss how drivers and executors will pick up new tokens from the token refresh server #534

kimoonkim commented Oct 25, 2017

#453 is implementing the HDFS token refresh server, which will obtain brand-new tokens when prior tokens fully expire after seven days. For each supported job, the refresh server will write the new token back to the associated K8s secret as an additional data item. The job's driver and executors should detect the new token and load it into their JVMs so that they can continue to access secure HDFS.

We should discuss how exactly this can be done. I can imagine two approaches:

  1. If K8s secret mounting supports this (does it?), the new token will appear as a new file in the mount point directory of the secret volume. The driver and executors would then periodically scan the directory for a new file and load it into memory.
  2. The driver and executors use a K8s watcher on the secret to catch the update event, then use the K8s API to read the new data item containing the new token. This requires executors, too, to use a K8s API client and service account, which is a new behavior.

I personally prefer (1), if it is possible.
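
To make (1) concrete, here is a minimal sketch of the directory scan. The mount path and the key naming scheme are assumptions, not anything the refresh server does today; I am assuming it appends a zero-padded counter to each new token key so that lexicographic order matches creation order:

  import java.io.File

  // Hypothetical mount point of the Hadoop token secret volume.
  val secretDir = new File("/mnt/secrets/hadoop-token")

  // Return the latest projected token file, if any. Assumes key names like
  // "hadoop-token-001", "hadoop-token-002", ... so the lexicographically
  // largest name is the newest token written by the refresh server.
  def latestTokenFile(): Option[File] = {
    val files = Option(secretDir.listFiles()).getOrElse(Array.empty[File])
    files.filter(_.getName.startsWith("hadoop-token"))
      .sortBy(_.getName)
      .lastOption
  }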

One related note is that there is an existing hook in the base class SparkHadoopUtil, for both the driver and the executor, that supports this. We just need to subclass the base class and implement (1) or (2) in the subclass:

  /**
   * Start a thread to periodically update the current user's credentials with new credentials so
   * that access to secured service does not fail.
   */
  private[spark] def startCredentialUpdater(conf: SparkConf) {}
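
For instance, a K8s-specific subclass could override this hook to schedule a periodic scan of the secret mount. This is only a rough sketch, assuming approach (1); the class name and config key are made up:

  // The subclass must live under the org.apache.spark package tree to be
  // able to override the private[spark] hook.
  package org.apache.spark.deploy.k8s

  import java.util.concurrent.{Executors, TimeUnit}

  import org.apache.spark.SparkConf
  import org.apache.spark.deploy.SparkHadoopUtil

  // Hypothetical subclass; not part of the current code base.
  class KubernetesHadoopUtil extends SparkHadoopUtil {
    private val scheduler = Executors.newSingleThreadScheduledExecutor()

    override def startCredentialUpdater(conf: SparkConf): Unit = {
      // Made-up config key for how often to rescan the secret mount.
      val intervalSecs = conf.getLong("spark.kubernetes.tokenScanIntervalSecs", 60L)
      scheduler.scheduleWithFixedDelay(new Runnable {
        override def run(): Unit = checkForNewToken()
      }, intervalSecs, intervalSecs, TimeUnit.SECONDS)
    }

    private def checkForNewToken(): Unit = {
      // Placeholder: find a newly projected token file (as in the
      // latestTokenFile sketch above) and load it into the current UGI
      // via Credentials.readTokenStorageStream.
    }
  }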

Thoughts? Concerns?

@ifilonenko @liyinan926

@kimoonkim

One question we had was whether a mounted secret dir will automatically see a new token after the token refresh server adds it to the secret. https://kubernetes.io/docs/concepts/configuration/secret/ has this section:

  Mounted Secrets are updated automatically

  When a secret being already consumed in a volume is updated, projected keys are eventually updated as well. Kubelet is checking whether the mounted secret is fresh on every periodic sync. However, it is using its local ttl-based cache for getting the current value of the secret. As a result, the total delay from the moment when the secret is updated to the moment when new keys are projected to the pod can be as long as kubelet sync period + ttl of secrets cache in kubelet.

So the answer seems to be yes.


kimoonkim commented Oct 31, 2017

And I just did a little experiment and confirmed the secret update behavior:

  1. Create a secret with one data item: kubectl create secret generic mysecret --from-file=./username.txt
  2. Mount the secret in a test pod. (I used the pod yaml from the doc linked above.) The mount point /etc/foo has only one file.
  3. Then add a second data item, say password.txt, to the secret. (I used the k8s dashboard UI.)
  4. Check the mount point in the pod. Now it has two files:

     $ kubectl exec -it mypod /bin/ls /etc/foo
     password.txt  username.txt

@kimoonkim

So the next step is to imagine how we can write the startCredentialUpdater method in a subclass of SparkHadoopUtil, as mentioned in the issue description.

I think the key lines are the following, copied from the YARN CredentialUpdater:

  val newCredentials = new Credentials()
  newCredentials.readTokenStorageStream(stream)
  UserGroupInformation.getCurrentUser.addCredentials(newCredentials)

So as long as we can point the stream to the new token's file path, we should be fine.
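
For a token file projected into the secret mount, that could look like the sketch below; everything around the three key lines (the file handling and the method itself) is illustrative, not existing code:

  import java.io.{BufferedInputStream, DataInputStream, File, FileInputStream}

  import org.apache.hadoop.security.{Credentials, UserGroupInformation}

  // Read a token file written in the Hadoop token storage format and merge
  // its tokens into the current user's credentials.
  def loadTokensFrom(tokenFile: File): Unit = {
    val stream = new DataInputStream(
      new BufferedInputStream(new FileInputStream(tokenFile)))
    try {
      val newCredentials = new Credentials()
      newCredentials.readTokenStorageStream(stream)
      UserGroupInformation.getCurrentUser.addCredentials(newCredentials)
    } finally {
      stream.close()
    }
  }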
