Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set effective user of (web)hdfs hooks using connection config, optionally overridden by constructor #942

Merged
merged 1 commit into from
Feb 8, 2016

Conversation

xadhoom
Copy link
Contributor

@xadhoom xadhoom commented Feb 1, 2016

Sometimes is needed to do hdfs operations on behalf of other users (like creating dirs or setting permissions), so this PR adds a simple method to do that (optionally, without breaking compatibility).

@mistercrunch
Copy link
Member

I like the idea, but I what about using the username from the Connection object instead if/when it is defined? You could just set the proxy user in the HDFSHook constructor.

@xadhoom
Copy link
Contributor Author

xadhoom commented Feb 1, 2016

well, that makes sense. I missed that.
I'll update the PR asap with the modification.

@xadhoom
Copy link
Contributor Author

xadhoom commented Feb 2, 2016

PR updated.

Now if login is set into the connection, it is used as proxy user.

I preferred to have also a way to override it in the constructor, since it can be useful in certain situations, where you have to deal with different proxy users and defining an hdfs connection for each one can be "too static".

@xadhoom xadhoom force-pushed the hdfs_proxy_user branch 5 times, most recently from bd4b251 to c132866 Compare February 2, 2016 10:25
… WebHDFSHook, optionally allow to override it in the constructor.

Add simple init test for WebHDFS.
@xadhoom xadhoom changed the title add method to set effective user of hdfs operations Set effective user of (web)hdfs hooks using connection config, optionally overridden by constructor Feb 3, 2016
@xadhoom
Copy link
Contributor Author

xadhoom commented Feb 3, 2016

PR updated to include webhdfs proxy user support.

client.content('/')
proxy_user = self.proxy_user or nn.login
client = InsecureClient(connection_str, user=proxy_user)
client.status('/')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the lib, what is this change about?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand, the content method is called to check if the namenode is valid, but it fails also on current airflow, if the task is run as any user != hdfs, while the status method will not, so I think that calling status is better for checking the namenode validity.
The other change is for adding the "doAs" also to the hdfs client.

@mistercrunch
Copy link
Member

Awesome. Thanks for this contribution.

mistercrunch added a commit that referenced this pull request Feb 8, 2016
Set effective user of (web)hdfs hooks using connection config, optionally overridden by constructor
@mistercrunch mistercrunch merged commit c479fb4 into apache:master Feb 8, 2016
@artwr
Copy link
Contributor

artwr commented Feb 8, 2016

Can confirm. Content was a way to check the validity of namenode. Thanks for this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants