Allow presto to connect to kerberized Hive clusters (v3) #4867
Changes from all commits
f17e72b
c7fdd6b
e039914
be6808f
6bb741b
f6c7481
f6d832a
ce90e47
41df9be
329cf92
bc42b8f
aadff5a
5e5c1fc
e291ee8
da10430
4ab6ea8
be80b48
6b320ad
36133a4
cacccb8
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -67,6 +67,24 @@ The configuration files must exist on all Presto nodes. If you are | |
referencing existing Hadoop config files, make sure to copy them to | ||
any Presto nodes that are not running Hadoop. | ||
|
||
Accessing Hadoop clusters protected with Kerberos authentication | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
Kerberos authentication is currently supported for both HDFS and Hive metastore. | ||
|
||
However, there are still a few limitations: | ||
|
||
* Kerberos authentication is supported only for the ``hive-hadoop2`` and ``hive-cdh5`` connectors. | ||
* Kerberos authentication by ticket cache is not yet supported. | ||
|
||
Please refer to the `Configuration Properties`_ section for configuration details. | ||
Review comment: Is the underscore supposed to be here?
Reply: Yes, it is. I tested that. |
||
|
||
.. note:: | ||
|
||
If your ``krb5.conf`` location is different from ``/etc/krb5.conf``, you must set it | ||
Review comment: This is not ideal because we lose the ability to validate options that are set in this manner. I guess it's fine for now, but we should figure out a way to make the option required by the hive connector and the one required by Presto coexist. |
||
explicitly using the ``java.security.krb5.conf`` JVM property in the ``jvm.config`` file. | ||
Example: ``-Djava.security.krb5.conf=/example/path/krb5.conf``. | ||
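The note above can be illustrated with a minimal sketch. It assumes a simplified lookup: use the value of ``java.security.krb5.conf`` when it is set, otherwise fall back to ``/etc/krb5.conf``. The class ``Krb5ConfLocator`` and its method are hypothetical names invented for this illustration; they are not part of Presto or the JDK, and the JDK's own lookup may consider additional locations.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of Presto or the JDK.
final class Krb5ConfLocator
{
    // Default location named in the documentation note.
    static final String DEFAULT_PATH = "/etc/krb5.conf";

    private Krb5ConfLocator() {}

    // Returns the path given by the JVM property value, or the default
    // location when the property is unset or blank.
    static Path effectivePath(String propertyValue)
    {
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            return Paths.get(DEFAULT_PATH);
        }
        return Paths.get(propertyValue.trim());
    }

    public static void main(String[] args)
    {
        // Reads the same property that -Djava.security.krb5.conf=... sets.
        Path path = effectivePath(System.getProperty("java.security.krb5.conf"));
        String status = Files.isReadable(path) ? "readable" : "missing or unreadable";
        System.out.println(path + " (" + status + ")");
    }
}
```

Running this with ``-Djava.security.krb5.conf=/example/path/krb5.conf`` reports on that path instead of the default, which is a quick way to confirm that the entry in ``jvm.config`` is being picked up.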
|
||
Configuration Properties | ||
------------------------ | ||
|
||
|
@@ -107,6 +125,76 @@ Property Name Description | |
``hive.max-partitions-per-writers`` Maximum number of partitions per writer. 100 | ||
|
||
``hive.s3.sse.enabled`` Enable S3 server-side encryption. ``false`` | ||
|
||
``hive.metastore.authentication.type`` Hive metastore authentication type. ``NONE`` | ||
Possible values are ``NONE`` or ``KERBEROS``. | ||
|
||
``hive.metastore.service.principal`` Hive metastore service principal. | ||
The ``_HOST`` placeholder is allowed here and it is | ||
substituted with the actual metastore host. Use ``_HOST`` | ||
placeholder for configurations with more than | ||
one Hive metastore server. | ||
Example: ``hive/hive-server-host@EXAMPLE.COM`` or | ||
``hive/_HOST@EXAMPLE.COM``. | ||
|
||
``hive.metastore.client.principal`` Hive metastore client principal. | ||
The ``_HOST`` placeholder is allowed here and it is | ||
substituted with the actual Presto server host. Use the | ||
``_HOST`` placeholder for per-server principal | ||
configurations. | ||
Example: ``presto/presto-server-node@EXAMPLE.COM`` or | ||
``presto/_HOST@EXAMPLE.COM``. | ||
|
||
.. warning:: | ||
|
||
The principal specified by | ||
``hive.metastore.client.principal`` | ||
must have sufficient privileges to remove files | ||
and directories within the ``hive/warehouse`` | ||
directory. If the principal does not, only the | ||
metadata will be removed, and the data will | ||
continue to consume disk space. | ||
|
||
This occurs because the Hive metastore is | ||
responsible for deleting the internal table data. | ||
When the metastore is configured to use Kerberos | ||
authentication, all of the HDFS operations performed | ||
by the metastore are impersonated. Errors | ||
deleting data are silently ignored. | ||
|
||
``hive.metastore.client.keytab`` Hive metastore client keytab location. Must be accessible | ||
to the user running Presto and must contain the | ||
credentials for the ``hive.metastore.client.principal``. | ||
|
||
``hive.hdfs.authentication.type`` HDFS authentication type. ``NONE`` | ||
Possible values are ``NONE`` or ``KERBEROS``. | ||
|
||
``hive.hdfs.impersonation.enabled`` Enable impersonation for HDFS calls. ``false`` | ||
|
||
When set to the default of ``false``, Presto accesses | ||
HDFS as the Unix user the Presto process is running as, | ||
or as the Kerberos principal specified in | ||
``hive.hdfs.presto.principal``. | ||
|
||
When set to ``true``, Presto accesses HDFS as the Presto | ||
user or Kerberos principal specified by ``--user`` or | ||
``--krb5-principal`` passed to the CLI, or as the user | ||
in the JDBC credentials. | ||
|
||
Review comment: Not your fault, but the whole situation here is super confusing because there are 4 different kinds of "user-like things":
SIMPLE, false: HDFS access as the OS user of the presto process or the value of -DHADOOP_USER_NAME if specified. We should try to get something simple and reasonably accurate in here, and I'll add a link to the security/hive.rst when I put up the PR for that. @BrendaNoonan, feel free to weigh in here too. |
||
``hive.hdfs.presto.principal`` HDFS client principal. The ``_HOST`` placeholder | ||
is allowed here and it is substituted with the actual | ||
Presto server host. Use the ``_HOST`` placeholder for | ||
per-server principal configurations. | ||
When impersonation is enabled, make sure that the provided | ||
user is configured as a superuser and is allowed to | ||
impersonate other users. | ||
Example: | ||
``presto-hdfs-superuser/presto-server-node@EXAMPLE.COM`` or | ||
``presto-hdfs-superuser/_HOST@EXAMPLE.COM``. | ||
|
||
``hive.hdfs.presto.keytab`` HDFS client keytab location. Must be accessible | ||
to the user running Presto and must contain the | ||
credentials for the ``hive.hdfs.presto.principal``. | ||
================================================== ============================================================ ========== | ||
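The ``_HOST`` placeholder described for the principal properties can be sketched as follows. This is an illustration only, not the code in this PR: ``HostPlaceholder`` and ``substitute`` are hypothetical names, the principal is assumed to have the form ``primary/instance@REALM``, and lower-casing the host name is an assumption of this sketch.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not the implementation in this PR.
final class HostPlaceholder
{
    private static final String PLACEHOLDER = "_HOST";

    private HostPlaceholder() {}

    // Replaces the instance part of "primary/instance@REALM" with the given
    // host name when the instance part is exactly "_HOST". Principals in any
    // other form are returned unchanged.
    static String substitute(String principal, String hostname)
    {
        int slash = principal.indexOf('/');
        int at = principal.indexOf('@');
        if (slash < 0 || at < slash) {
            return principal;
        }
        String instance = principal.substring(slash + 1, at);
        if (!instance.equals(PLACEHOLDER)) {
            return principal;
        }
        // Assumption: the host name is lower-cased before substitution.
        return principal.substring(0, slash + 1)
                + hostname.toLowerCase(Locale.ROOT)
                + principal.substring(at);
    }

    public static void main(String[] args)
    {
        System.out.println(substitute("presto/_HOST@EXAMPLE.COM", "presto-server-node"));
        System.out.println(substitute("hive/hive-server-host@EXAMPLE.COM", "ignored-host"));
    }
}
```

With one placeholder principal in the shared configuration, each node resolves its own principal, which is why the placeholder suits setups with several metastore servers or one principal per Presto server.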
|
||
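The description of ``hive.hdfs.impersonation.enabled`` boils down to a small decision rule, sketched below under the assumption that only the cases stated in the table matter. ``HdfsIdentity``, ``AuthType`` and ``resolve`` are hypothetical names for this illustration and do not reflect how Presto implements the behavior; other inputs mentioned in review, such as ``-DHADOOP_USER_NAME``, are deliberately left out.

```java
// Hypothetical illustration of the documented behavior; not Presto code.
final class HdfsIdentity
{
    // Mirrors the documented values of hive.hdfs.authentication.type.
    enum AuthType { NONE, KERBEROS }

    private HdfsIdentity() {}

    // Returns the identity under which HDFS is accessed, per the table:
    // - impersonation enabled: the Presto session user (from the CLI or JDBC)
    // - impersonation disabled, KERBEROS: the hive.hdfs.presto.principal value
    // - impersonation disabled, NONE: the Unix user running the Presto process
    static String resolve(
            AuthType authType,
            boolean impersonationEnabled,
            String osUser,
            String prestoPrincipal,
            String sessionUser)
    {
        if (impersonationEnabled) {
            return sessionUser;
        }
        return authType == AuthType.KERBEROS ? prestoPrincipal : osUser;
    }

    public static void main(String[] args)
    {
        String principal = "presto-hdfs-superuser/presto-server-node@EXAMPLE.COM";
        System.out.println(resolve(AuthType.NONE, false, "presto", principal, "alice"));
        System.out.println(resolve(AuthType.KERBEROS, false, "presto", principal, "alice"));
        System.out.println(resolve(AuthType.KERBEROS, true, "presto", principal, "alice"));
    }
}
```

In the impersonating Kerberos case the configured principal still authenticates to HDFS on behalf of the session user, which is why the table requires that principal to be a superuser with impersonation allowed.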
Querying Hive Tables | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
/* | ||
* Copyright 2016, Teradata Corp. All rights reserved. | ||
*/ | ||
|
||
/* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package com.facebook.presto.hive; | ||
|
||
import javax.inject.Qualifier; | ||
|
||
import java.lang.annotation.Retention; | ||
import java.lang.annotation.Target; | ||
|
||
import static java.lang.annotation.ElementType.FIELD; | ||
import static java.lang.annotation.ElementType.METHOD; | ||
import static java.lang.annotation.ElementType.PARAMETER; | ||
import static java.lang.annotation.RetentionPolicy.RUNTIME; | ||
|
||
@Retention(RUNTIME) | ||
@Target({FIELD, PARAMETER, METHOD}) | ||
@Qualifier | ||
public @interface ForHdfs | ||
{ | ||
} |
@electrum can you review this?