Add flag to execute Hive queries as the user submitting the query #4382

Closed
wants to merge 1 commit

Conversation

billonahill

Adding the hive.read-as-query-user flag (default=false). When set to true, Presto will read from HDFS as the user submitting the query. When set to false, the query will read from HDFS as the Presto daemon user.
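For example, enabling it for a Hive catalog would look something like this (the file path below is just the conventional catalog location, not something this patch adds):

```properties
# etc/catalog/hive.properties (conventional location)
hive.read-as-query-user=true
```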

if (HiveSessionProperties.getReadAsQueryUser(session)) {
    UserGroupInformation ugi = UgiUtils.getUgi(session.getUser());
    try {
        return ugi.doAs((PrivilegedExceptionAction<ConnectorPageSource>) () ->
Contributor

Is it sufficient to only doAs for the initial creation of the page source? Historically, we have had problems with Hadoop APIs and the multithreaded nature of Presto. In a standard query this page source will be accessed from many threads, so if in the middle of the query the thread performs a new "read", the UGI will not be set on the thread.

Author

@dain if a single query accesses this method repeatedly from multiple threads we should be OK, since the query will always be executed by the same user and the method will always doAs that user. We'd have problems if the ConnectorPageSource returned by this method were then shared across queries/users. Would that ever be the case?

In an initial version of this patch we ran into user/threading issues in the BackgroundHiveSplitLoader because we were doing doAs at too high a level before kicking off tasks in the threadpool. This had the effect of binding the subject to the thread across queries. The fix was to do doAs within the thread, not outside of it.
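Roughly, the distinction looks like this (class and method names here are made up for the sketch; they are not lifted from the patch):

```java
import java.security.PrivilegedAction;
import java.util.concurrent.ExecutorService;

import org.apache.hadoop.security.UserGroupInformation;

public class SplitLoaderUgiSketch
{
    // The fix: doAs happens inside the submitted task, so the query user's UGI is
    // established on the pool thread only for the duration of that task and cannot
    // bleed into work done for other queries on the same thread.
    public static void loadSplitsAsUser(ExecutorService executor, UserGroupInformation ugi, Runnable loadSplits)
    {
        executor.submit(() -> ugi.doAs((PrivilegedAction<Void>) () -> {
            loadSplits.run(); // all HDFS calls made here see the query user's UGI
            return null;
        }));

        // Anti-pattern (roughly what bit us initially): wrapping the submission itself,
        //   ugi.doAs(() -> { executor.submit(loadSplits); ... });
        // only the submitting thread carries the subject, and pooled threads end up
        // with whatever subject was last bound to them.
    }
}
```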

We've been successfully running this patch in production for a week or so without issue, but that doesn't mean there couldn't be a corner case that we've missed.

Contributor

I was thinking about when we make further calls to the Hive record reader (e.g., recordReader.next). I would assume most readers cache the open file system input stream, but I would bet there are record readers that open new files during execution (maybe reading a sidecar file), and these would fail if the UGI is not on the thread. Maybe we just ignore that until the problem actually happens.
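If it does come up, one option would be to wrap the returned page source so every read re-enters doAs. A rough sketch, using a stand-in interface rather than the real SPI type:

```java
import java.security.PrivilegedAction;

import org.apache.hadoop.security.UserGroupInformation;

// "PageReader" is a stand-in for the real page source interface; the point is the
// pattern of re-entering doAs on every read, not the exact Presto SPI surface.
interface PageReader
{
    Object readNextPage();
}

class DoAsPageReader
        implements PageReader
{
    private final PageReader delegate;
    private final UserGroupInformation ugi;

    DoAsPageReader(PageReader delegate, UserGroupInformation ugi)
    {
        this.delegate = delegate;
        this.ugi = ugi;
    }

    @Override
    public Object readNextPage()
    {
        // Each read runs as the query user, so a record reader that lazily opens
        // another file (e.g. a sidecar) mid-query still sees the expected subject.
        return ugi.doAs((PrivilegedAction<Object>) delegate::readNextPage);
    }
}
```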

Author

Right, I see your point. An impl that opens new files after getting the page source would run into issues. We could wait and see if that's really an issue.

@dain
Contributor

dain commented Jan 20, 2016


// Every instance of a UserGroupInformation object for a given user has a unique hashcode, due
// to the hashCode() impl. If we don't cache the UGI per-user here, there will be a memory leak
// in the PrestoFileSystemCache.
Contributor

What is the leak? Specifically, if I have a lot of users over the uptime of my system, do I have a "large" permanent allocation for every user seen, or can we "expire" them over time?

Author

Without this cache the specific leak would occur in PrestoFileSystemCache.map (https://github.com/prestodb/presto-hadoop-apache2/blob/master/src/main/java/org/apache/hadoop/fs/PrestoFileSystemCache.java#L62), since the Key would always be unique, even when the same user accesses the same filesystem in multiple queries.

Yes, the caching bounds the map size to O(number of users) by hashing UGIs consistently, but as implemented PrestoFileSystemCache.map has a "leak" in that it will cache all users' keys for the duration of the JVM. Running our coordinator with a 64G heap, we ran into issues once we got to O(100k) map entries, which gives a sense of the volume of users required to cause memory problems.
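For reference, the general shape of the per-user cache (a sketch rather than the exact code in the patch; UserGroupInformation.createRemoteUser is one way to build a UGI for a plain user name, and the expiry/size values are placeholders):

```java
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import org.apache.hadoop.security.UserGroupInformation;

public final class UgiCacheSketch
{
    // One UGI per user name: repeat queries by the same user reuse the same UGI
    // instance, so the FileSystem cache key derived from it stays stable instead
    // of adding a new entry per query.
    private final LoadingCache<String, UserGroupInformation> ugiCache = CacheBuilder.newBuilder()
            .expireAfterAccess(1, TimeUnit.HOURS)   // placeholder: lets idle users age out
            .maximumSize(10_000)                    // placeholder bound on distinct users
            .build(new CacheLoader<String, UserGroupInformation>()
            {
                @Override
                public UserGroupInformation load(String user)
                {
                    return UserGroupInformation.createRemoteUser(user);
                }
            });

    public UserGroupInformation getUgi(String user)
    {
        // Note: expiring a UGI here does not by itself evict the matching
        // PrestoFileSystemCache entry; it only bounds how many UGI instances
        // we hand out per user.
        return ugiCache.getUnchecked(user);
    }
}
```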

@billonahill
Author

> Do we need similar code in https://github.com/facebook/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/HivePageSinkProvider.java

We would need to do that to support hive.write-as-query-user, which we don't yet support [1] but plan to. We need to submit some more patches for CTAS that will require this, so my plan was to follow up with that in a separate patch.

1 - Per user temp directory: https://github.com/twitter-forks/presto/blob/twitter-master/presto-hive/src/main/java/com/facebook/presto/hive/HiveWriteUtils.java#L378

@ebd2
Contributor

ebd2 commented Jan 21, 2016

@billonahill: Teradata is actively working on supporting Kerberos in the Hive connector. As part of doing this, we get non-Kerberos impersonation basically for free. We have a branch in our fork that we'd love to have somebody else who has been working on this stuff take a look at.

I'll open an issue and add a link to our branch there, but in the meantime, you can check it out here:
https://github.com/Teradata/presto/tree/kerberos_hive_poc

Edited to add: It looks like there's an issue open already: #3380

@billonahill
Author

Thanks for sharing that, Eric, that's great. I do have a few comments, but AFAIK I can't comment on a diff, only a PR. Could you make a PR just for commenting?


@petroav
Contributor

petroav commented Jan 22, 2016

@billonahill Teradata#105

@billonahill
Author

Thanks @petroav and @ebd2, Teradata#105 looks good. We should focus on that PR instead of what I propose in this one.

@cberner
Contributor

cberner commented May 10, 2016

#4867 has been merged, which sounds like it addresses this. If I misunderstood, please re-open!

@cberner cberner closed this May 10, 2016