[SPARK-6918][YARN] Secure HBase support. #5586
|
Can one of the admins verify this patch? |
|
@XuTingjun noticed you were also interested in this feature on #5031 |
|
Yeah, LGTM, I need this feature. We can put HBase's config into hbase-site.xml, right? |
|
The HBaseConfiguration object will read from hbase-default.xml or hbase-site.xml in the classpath. Do you have HBase config in another file? The ZooKeeper configs are what is needed to obtain the security token and should always be in hbase-site.xml, so just copying that into the Spark config dir should do the trick. |
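As a rough illustration of the point above, the following Python sketch parses the Hadoop-style XML used by hbase-site.xml and reports whether the ZooKeeper settings are present. It is not part of this patch. The helper names (`read_hadoop_xml`, `zk_settings`, `missing_zk_keys`), the sample host names, and the choice of which key to treat as required are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# ZooKeeper-related keys that commonly appear in hbase-site.xml.
# This is an illustrative subset, not an exhaustive list.
ZK_KEYS = (
    "hbase.zookeeper.quorum",
    "hbase.zookeeper.property.clientPort",
    "zookeeper.znode.parent",
)


def read_hadoop_xml(xml_text):
    """Parse Hadoop-style <configuration> XML into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def zk_settings(props):
    """Return only the ZooKeeper-related entries that are actually set."""
    return {key: props[key] for key in ZK_KEYS if props.get(key)}


def missing_zk_keys(props, required=("hbase.zookeeper.quorum",)):
    """List required keys that are absent or empty (hypothetical policy)."""
    return [key for key in required if not props.get(key)]


# Hypothetical file contents; the host names are placeholders.
SAMPLE = """
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    settings = read_hadoop_xml(SAMPLE)
    print(zk_settings(settings))
    print("missing:", missing_zk_keys(settings))
```

Running a check like this against the copy of hbase-site.xml in the Spark config dir is one way to confirm that the file on the classpath is the one carrying the ZooKeeper quorum.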
|
Jenkins, test this please |
|
Test build #30596 has started for PR 5586 at commit |
|
Test build #30596 has finished for PR 5586 at commit
|
|
Test PASSed. |
|
So can you detail how one actually uses this? The Hive stuff can be compiled into Spark, but HBase cannot be. So I assume for this to work you have to include the HBase jars. Does just specifying driver-class-path work for both yarn client and cluster modes? Did you test this on both secure and non-secure clusters? |
a4256cd to f48927b
|
Can one of the admins verify this patch? |
|
ok to test |
|
Yes, including the HBase jars on the driver and/or executor classpath (e.g. /usr/lib/hbase/lib/hbase-client.jar:/usr/lib/hbase/lib/hbase-common.jar:/usr/lib/hbase/lib/hbase-hadoop2-compat.jar:/usr/lib/hbase/lib/hbase-protocol.jar:/usr/lib/hbase/lib/htrace-core-2.04.jar) will allow the driver and executor to reference the HBase configuration and create a new connection. The assumption is that the HBase jars are in the same directories on the executors. hbase-site.xml will need to be moved into /conf or into the Spark conf path, since that is where the ZooKeeper config for HBase is contained. I've tested this in yarn-client and yarn-cluster modes on our secure production cluster with HBase 0.98, with and without the HBase jars included, and also in the HDP sandbox with HBase 0.98 and an unsecured HBase connection (all running locally). I updated the pull request to remove the throw new RuntimeException on line 1117 and log an error instead, since users may be running a secure YARN cluster without security on HBase. |
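To make the classpath setup described above concrete, here is a minimal Python sketch that assembles the jar list from that comment into a classpath string and a `spark-submit` argument list. The jar paths come from the comment. The function names, the application jar, and the main class are hypothetical. It also assumes the executor classpath is passed through the `spark.executor.extraClassPath` setting, which the comment does not state, and it does not handle placing hbase-site.xml in the conf dir.

```python
# Jar names taken from the comment above; the directory is the one quoted there.
HBASE_LIB = "/usr/lib/hbase/lib"
HBASE_JARS = [
    "hbase-client.jar",
    "hbase-common.jar",
    "hbase-hadoop2-compat.jar",
    "hbase-protocol.jar",
    "htrace-core-2.04.jar",
]


def hbase_classpath(lib_dir=HBASE_LIB, jars=HBASE_JARS):
    """Join the HBase jars into a ':'-separated classpath string."""
    return ":".join(f"{lib_dir}/{jar}" for jar in jars)


def spark_submit_args(app_jar, main_class, master="yarn-cluster"):
    """Build a spark-submit argument list (app_jar and main_class are
    hypothetical placeholders supplied by the caller)."""
    cp = hbase_classpath()
    return [
        "spark-submit",
        "--master", master,
        "--class", main_class,
        # Driver side: make the HBase classes visible to the driver JVM.
        "--driver-class-path", cp,
        # Executor side: assumes the same jar paths exist on every node.
        "--conf", f"spark.executor.extraClassPath={cp}",
        app_jar,
    ]


if __name__ == "__main__":
    print(" ".join(spark_submit_args("my-app.jar", "com.example.MyApp")))
```

The argument list can be handed to `subprocess.run` as-is. Keeping it as a list rather than a single string avoids shell quoting problems with the long classpath.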
|
jenkins, test this please |
|
Test build #31140 has started for PR 5586 at commit |
|
Test build #31140 has finished for PR 5586 at commit
|
|
I think this looks good. It would be nice to have an example of accessing HBase for other users to reference, but that is out of scope for this. |
|
My cluster information is: /opt/jdk1.8.0_40, hadoop26.0, hbase1.0.0, zookeeper 3.5.0. These days I run a select command to read data in HBase from the beeline shell. It always throws the exception:
|
|
@XuTingjun This looks like a generic Spark driver error when an executor crashes. Can you please dig up the executor stack trace containing the root cause? |
|
@deanchen, I used this patch and HBase throws the exception below. Can you help me?
|
|
@XuTingjun Have you tried authenticating to your hbase server without Spark? Looks like a failure caused by a misconfiguration. |
|
@deanchen Can you list the HBase configs needed on the client? |
Obtain HBase security token with Kerberos credentials locally to be sent to executors. Tested on eBay's secure HBase cluster. Similar to obtainTokenForNamenodes and fails gracefully if HBase classes are not included in the classpath. Requires hbase-site.xml to be in the classpath (typically via the conf dir) for the ZooKeeper configuration. Should that go in the docs somewhere? Did not see an HBase section. Author: Dean Chen <deanchen5@gmail.com> Closes apache#5586 from deanchen/master and squashes the following commits: 0c190ef [Dean Chen] [SPARK-6918][YARN] Secure HBase support.
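Since the token step is skipped when the HBase classes are not on the classpath, it can be useful to check beforehand which jars actually provide a given class. The Python sketch below is a diagnostic under stated assumptions, not the logic of the patch: it scans a ':'-separated classpath for jars that contain a class file. The function names are hypothetical, and the class name in the usage example (`org.apache.hadoop.hbase.security.token.TokenUtil`) is an assumption about which class matters here; substitute whichever class you need to locate.

```python
import zipfile


def class_entry(class_name):
    """Map a fully qualified class name to its path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jars_providing(class_name, classpath):
    """Return the jars on a ':'-separated classpath that contain class_name.

    Entries that are not jars, do not exist, or are not valid zip files
    are skipped rather than treated as errors.
    """
    entry = class_entry(class_name)
    found = []
    for path in classpath.split(":"):
        if not path.endswith(".jar"):
            continue
        try:
            with zipfile.ZipFile(path) as jar:
                if entry in jar.namelist():
                    found.append(path)
        except (OSError, zipfile.BadZipFile):
            continue
    return found


if __name__ == "__main__":
    # Hypothetical classpath; the class name is an assumption, see lead-in.
    cp = "/usr/lib/hbase/lib/hbase-client.jar:/usr/lib/hbase/lib/hbase-common.jar"
    hits = jars_providing("org.apache.hadoop.hbase.security.token.TokenUtil", cp)
    print(hits or "class not found on the given classpath")
```

An empty result on the driver's classpath suggests the job would proceed without an HBase token, which is acceptable for an unsecured HBase but would explain authentication failures on a secure one.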