Use "aws_access_key_id" and "aws_secret_access_key" for reads, not just unload #32
Another test coverage issue :(
Yeah, there's a separate earlier PR just for that one line... got pulled into this branch by accident as I'm trying to get things working
/cc @kai-zeng who has a WIP branch with a similar fix.
Hey @koeninger, do you have a regression test for this? Now that we have the infra from #41 in place, I think it would be a good idea to test this.
Hmm, I'm wondering why the fact that I'm missing this fix in my end-to-end tests has not led to a test failure yet.
@JoshRosen I came across this issue as well. It worked fine when running the job on an EC2 instance that was allowed to read/write from S3. However, when running the job locally, I needed to set the Hadoop configuration with my keys.
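For readers unfamiliar with the workaround described above, here is a minimal sketch of which Hadoop properties carry static AWS keys for each S3 URI scheme. The helper name `hadoop_credential_properties` and the placeholder key values are hypothetical and are not taken from this PR; the property names are the standard Hadoop ones.

```python
def hadoop_credential_properties(scheme, access_key, secret_key):
    """Map an S3 URI scheme to the Hadoop properties that hold static AWS keys.

    Hypothetical helper for illustration only; it is not part of this project.
    """
    if scheme in ("s3", "s3n"):
        # The older s3:// and s3n:// filesystems share one naming pattern.
        return {
            f"fs.{scheme}.awsAccessKeyId": access_key,
            f"fs.{scheme}.awsSecretAccessKey": secret_key,
        }
    if scheme == "s3a":
        # The s3a:// filesystem uses different property names.
        return {
            "fs.s3a.access.key": access_key,
            "fs.s3a.secret.key": secret_key,
        }
    raise ValueError(f"unsupported scheme: {scheme!r}")


# Placeholder values; never hard-code real credentials.
props = hadoop_credential_properties("s3n", "EXAMPLE_KEY_ID", "EXAMPLE_SECRET")
```

In a PySpark session, entries like these are commonly applied with `sc._jsc.hadoopConfiguration().set(key, value)`, although that goes through a private attribute and changes the configuration shared by the whole context.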
@eduardoramirez @koeninger, given that both of you have reported this issue and fix I'm inclined to merge this now and figure out the regression tests later. I'm still puzzled over why I haven't been able to get this to fail in Travis, especially given that I purposely picked environment variables that don't match the ones that AWS expects. I'll take a closer look at our S3 bucket permissions tomorrow to see if that could explain the problem.
I think I've spotted the problem: during test setup, we end up setting some Hadoop Configuration access key properties on the global |
I've opened #55 to bring this up to date and to fix the tests so that they're capable of detecting the bug being fixed here.
…st unload

This is an updated version of #32 with two additional changes:

- Do not mutate the SparkContext's `hadoopConfiguration`.
- Refactor tests so that they're capable of catching this bug.

Author: cody koeninger <cody@koeninger.org>
Author: Josh Rosen <joshrosen@databricks.com>

Closes #55 from JoshRosen/set-credentials.
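The "do not mutate" change can be illustrated with a small model. This is a plain-Python sketch that uses dictionaries as stand-ins for Hadoop configuration objects, not the Scala code merged in #55; `GLOBAL_CONF` and `conf_with_credentials` are hypothetical names. The idea is that credentials go into a per-operation copy, so a shared configuration never acquires keys that could mask a missing-credentials bug elsewhere.

```python
import copy

# Stand-in for a process-wide configuration shared by every job and test.
GLOBAL_CONF = {"fs.defaultFS": "file:///"}


def conf_with_credentials(base_conf, credentials):
    """Return a new mapping with the credentials added; base_conf is untouched."""
    conf = copy.copy(base_conf)  # a shallow copy suffices for flat string settings
    conf.update(credentials)
    return conf


# Placeholder values; never hard-code real credentials.
creds = {
    "fs.s3n.awsAccessKeyId": "EXAMPLE_KEY_ID",
    "fs.s3n.awsSecretAccessKey": "EXAMPLE_SECRET",
}
read_conf = conf_with_credentials(GLOBAL_CONF, creds)
```

Under this model, a test that checks the shared configuration after a read still finds no keys there, so code paths that forget to pass credentials fail visibly instead of silently reusing leftover ones.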