kerberos authentification #3380
Comments
The kerberos code in Presto currently only does authentication of users when the request is over HTTPS, and if authentication fails, the user gets an error. The server currently does not perform any authorization checks. The next step of our security work is to enable authorization checks for the tables, databases, views, etc. accessed in queries. This will likely be performed using the Hive metastore and/or Knox/Ranger. We currently don't have plans to implement "per-user" authentication with HDFS. Instead we are planning on relying on security for SQL resources (tables/views) and using a single (superuser) credential for the Presto workers to authenticate with HDFS. |
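To make the behavior described above concrete, here is a minimal sketch of the decision: requests over HTTPS are authenticated and rejected on failure, other requests are not checked, and no authorization step exists. The class, enum, and method names are hypothetical and are not Presto's actual code; the `ticketValid` flag stands in for the real Kerberos ticket validation.

```java
import java.net.URI;

// Hypothetical illustration only; these names do not exist in Presto.
final class HttpsOnlyAuthSketch {
    enum Outcome { NOT_CHECKED, AUTHENTICATED, REJECTED }

    private HttpsOnlyAuthSketch() {}

    // ticketValid is an assumed stand-in for the result of validating the
    // client's Kerberos ticket.
    static Outcome check(URI requestUri, boolean ticketValid) {
        // Authentication is only attempted when the request arrives over HTTPS.
        if (!"https".equalsIgnoreCase(requestUri.getScheme())) {
            return Outcome.NOT_CHECKED;
        }
        // Over HTTPS, a failed authentication surfaces as an error to the user.
        // Note there is no authorization step after a successful authentication.
        return ticketValid ? Outcome.AUTHENTICATED : Outcome.REJECTED;
    }
}
```

Under these assumptions, a client connecting over plain HTTP would never trigger a Kerberos exchange.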
Thank you for your answer! Yes, as you said, we got the error from Kerberos when we accessed the server via HTTP. By the way, when do you think the next step of your security work will be pushed to git? Thanks in advance. |
Thank you for all your work on Presto; initial tests on our small cluster show it to be not too painful to deploy, and very, very fast. :-) Our production Hadoop clusters have mandatory Kerberos turned on, though, so I'm glad to see this issue logged and with nearly 500 other people watching it. Even though full Kerberos support isn't quite ready, can we (today, in version 0.114) do a tech preview / testing version of it, by configuring the last part of your email without the first? I.e., can we configure a static "superuser" credential, in the form of a principal name and keytab, that Presto Server can use to read the table files from HDFS, while ignoring authentication from the Presto client to the server? Obviously that is a gaping security weakness, but if we're willing to work in that state for a while, will it work now? If so, very briefly, how do I configure it? Thanks. |
Thank you for your work on Presto! I set up Hadoop HTTPS as shown in the following URL: and here is the current status: |
Using the authentication information stored in the Hive metastore would be good. |
Hi! I'm politely asking again about this issue, since it's been a few months. Presto clearly has a lot of excellent programmers working on it, but perhaps the issue of Kerberos ticket carry-through to Hive is not important to most users. I really would like to deploy Presto or at least figure out how much of an improvement it would be for certain queries here. At least for a proof-of-concept, I am willing to live without actual security, i.e., I don't need Presto to actually accept a Kerberos ticket, confirm that it's valid, pull out the username, and authorize it by name against any particular list of users or tables. But if my whole Hadoop cluster is Kerberized, then the Presto server is immediately not permitted to even talk to the Thrift Metastore interface or the HDFS files without initiating those connections with Kerberos credentials. (It gets a java.net.SocketTimeoutException in org.apache.thrift.transport.TTransportException, because Thrift is expecting a Kerberos ticket in SASL and Presto is never supplying one.) So is it possible to just give the Presto server nodes (in their config properties somewhere) a hard-coded Kerberos identity which they would then use to access the Hive metastore and files? I realize the poor security implications of this, i.e., the Presto server would be impersonating someone who was not necessarily the person running the queries from the Presto CLI. Long-term, everyone would want "real" handling of Kerberos, but this would be a quick-and-dirty experiment -- we would have to firewall-restrict the Presto server port. If no one else is working on this, I could poke at it in my spare time, but I might appreciate direction from someone familiar with the code as to which modules of source code I should start wading through. Thanks! |
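As an illustration of the "hard-coded Kerberos identity" idea in the comment above, here is a minimal sketch that uses only the JDK's JAAS classes to describe a keytab-based login for a fixed principal. It is not how Presto implements this; the class and method names are hypothetical, and the principal and keytab path in the usage note are placeholders. The option keys are those of the JDK's `Krb5LoginModule`.

```java
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Presto.
final class StaticKeytabLogin {
    private StaticKeytabLogin() {}

    // Options understood by the JDK's Krb5LoginModule for a non-interactive,
    // keytab-based login as a single fixed principal.
    static Map<String, String> keytabOptions(String principal, String keytabPath) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");     // read the key from a keytab file
        options.put("keyTab", keytabPath);    // location of that keytab
        options.put("principal", principal);  // the fixed service identity
        options.put("storeKey", "true");      // keep the key in the Subject
        options.put("doNotPrompt", "true");   // never ask for a password
        return options;
    }

    // Wraps the options in an in-memory JAAS Configuration, so no external
    // JAAS configuration file is needed.
    static Configuration keytabConfiguration(String principal, String keytabPath) {
        AppConfigurationEntry entry = new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                keytabOptions(principal, keytabPath));
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] {entry};
            }
        };
    }
}
```

A caller could pass the result to `new LoginContext("presto-service", null, null, config)` and call `login()`, which needs a reachable KDC and a valid krb5.conf; the name `presto-service` is arbitrary here. Hadoop client code would more commonly reach the same goal through `UserGroupInformation.loginUserFromKeytab(principal, keytabPath)`.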
@JeffSaxe I'm not at all familiar with how Kerberos works for the Hive metastore. We connect to the Hive metastore in |
@JeffSaxe |
Teradata is working on this and has a branch that's under development and testing. |
Hi, my Hadoop cluster is Kerberized. When I use Presto to query data, the Presto server is immediately not permitted to even talk to the Thrift Metastore interface or the HDFS files. Will it work now? If so, how do I configure it? Thanks. |
This work is complete now. If you have questions about using Presto with Kerberos authentication you can look at the presto user group: presto-users@googlegroups.com |
We are trying to use Presto with Hadoop and Hive, using Kerberos authentication.
There are several options, but it is unclear to us what each of them currently controls
[for example]
--krb5-config-path, --krb5-keytab-path, --krb5-principal
-Dhttp.authentication.krb5.config=/${krb5.conf.dir}
-Dhttp.authentication.krb5.credential-cache=/tmp
-Dhttp.authentication.krb5.keytab=${keytabfiledir}
When we look at the logs from the Kerberos server, we do not see any requests from Presto. (Currently we are using a single Presto node.)
Does Presto currently support Kerberos authentication when used with Hadoop?
If it does, is there an example I can refer to?
If it doesn't, are there any plans to implement it in the near future?