Add AWS Credentials parsing from file #2117
Comments
As far as I know, deltalake expects all the AWS parameters to be defined in the environment, exactly as you noted @Shershebnev.
To summarize, I don't see this as something that should be prioritized, but if there is strong support for implementing this I can have a stab.
@r3stl355 Polars, for example, parses these files to grab the credentials; we could likely take inspiration from that implementation.
Yes @ion-elgreco, looks like Polars is using the second approach I mentioned: looking into specific config and credentials files it "knows" may exist. However, it uses hard-coded paths like "~/.aws/credentials" which, I believe, will break on Windows, hence there will be a need to handle OS-specific file system paths as I mentioned.
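For illustration, here is a minimal Python sketch of that kind of file parsing, using configparser and pathlib so the default paths resolve on Windows too. The profile name, the key names, and the fallback to AWS_SHARED_CREDENTIALS_FILE/AWS_CONFIG_FILE are assumptions about what a real implementation would honour, not anything delta-rs does today:

```python
# Sketch only (assumptions: "default" profile, standard shared-file locations).
# The AWS shared credentials/config files are INI-formatted, so configparser works.
import os
import configparser
from pathlib import Path

def load_aws_credentials(profile: str = "default") -> dict:
    # Respect the standard override env vars if set, otherwise fall back to the
    # platform-independent defaults under the user's home directory.
    creds_path = Path(os.environ.get("AWS_SHARED_CREDENTIALS_FILE",
                                     Path.home() / ".aws" / "credentials"))
    config_path = Path(os.environ.get("AWS_CONFIG_FILE",
                                      Path.home() / ".aws" / "config"))

    options = {}
    creds = configparser.ConfigParser()
    if creds_path.exists():
        creds.read(creds_path)
        if creds.has_section(profile):
            section = creds[profile]
            if "aws_access_key_id" in section:
                options["AWS_ACCESS_KEY_ID"] = section["aws_access_key_id"]
            if "aws_secret_access_key" in section:
                options["AWS_SECRET_ACCESS_KEY"] = section["aws_secret_access_key"]
            if "aws_session_token" in section:
                options["AWS_SESSION_TOKEN"] = section["aws_session_token"]

    # The config file stores non-default profiles under "profile <name>".
    cfg = configparser.ConfigParser()
    if config_path.exists():
        cfg.read(config_path)
        section_name = profile if profile == "default" else f"profile {profile}"
        if cfg.has_section(section_name) and "region" in cfg[section_name]:
            options["AWS_REGION"] = cfg[section_name]["region"]
    return options
```

The returned keys are meant to line up with the option names deltalake already accepts in storage_options.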
Some of this will go away with #1601 FWIW; right now there's kind of a hodge-podge of configuration possibilities between …
Speaking from a Dask perspective I'd certainly like to throw weight behind this. We certainly find that people commonly use these credential files. In the meantime, can I ask what mechanisms are available to specify AWS credentials? Is it just environment variables? Is there something people can do to specify these programmatically in the meantime?
Environment variables, or passing storage_options to DeltaTable:

```python
>>> storage_options = {"AWS_ACCESS_KEY_ID": "THE_AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY": "THE_AWS_SECRET_ACCESS_KEY"}
>>> dt = DeltaTable("../rust/tests/data/delta-0.2.0", storage_options=storage_options)
```
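As a programmatic stop-gap (a sketch, not an official deltalake API), one option is to let boto3's default provider chain resolve credentials from the shared files, SSO, or an attached IAM role, and then forward them as storage_options. The table URI below is hypothetical:

```python
# Sketch: resolve credentials with boto3's provider chain (shared files,
# environment, instance/ECS metadata, SSO, ...) and hand them to deltalake.
import boto3
from deltalake import DeltaTable

session = boto3.Session()  # or boto3.Session(profile_name="my-profile")
creds = session.get_credentials()  # may be None if nothing in the chain resolves
frozen = creds.get_frozen_credentials()

storage_options = {
    "AWS_ACCESS_KEY_ID": frozen.access_key,
    "AWS_SECRET_ACCESS_KEY": frozen.secret_key,
    "AWS_REGION": session.region_name or "us-east-1",  # assumed fallback region
}
if frozen.token:  # present for temporary credentials (IAM role, SSO, STS)
    storage_options["AWS_SESSION_TOKEN"] = frozen.token

dt = DeltaTable("s3://my-bucket/my-table", storage_options=storage_options)  # hypothetical URI
```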
Thanks for the example showing storage_options. The system I'm running on uses IAM roles rather than access keys.
FWIW I think that the system referred to above is used on AWS machines that have an IAM role attached, which is used instead of AWS secret keys in environment variables. Systems know to go and read a locally available endpoint to get access tokens.
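For context on that mechanism, the sketch below shows roughly how the EC2 instance metadata service (IMDSv2) is queried. It uses the requests library and only works on an instance with an IAM role attached; the URLs are the standard IMDS endpoints rather than anything specific to delta-rs:

```python
# Sketch: fetch temporary credentials from the EC2 instance metadata service
# (IMDSv2). Only works on an EC2 instance with an IAM role attached.
import requests

IMDS = "http://169.254.169.254"

# Step 1: get a session token for IMDSv2.
token = requests.put(
    f"{IMDS}/latest/api/token",
    headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    timeout=2,
).text
headers = {"X-aws-ec2-metadata-token": token}

# Step 2: discover the attached role name.
role = requests.get(
    f"{IMDS}/latest/meta-data/iam/security-credentials/", headers=headers, timeout=2
).text.strip()

# Step 3: fetch temporary credentials (AccessKeyId, SecretAccessKey, Token, Expiration).
creds = requests.get(
    f"{IMDS}/latest/meta-data/iam/security-credentials/{role}", headers=headers, timeout=2
).json()
```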
Great to see this is a recent thread. I've gone down a rabbit hole determining whether IAM roles can be used in delta-rs (see the arrow-rs issues apache/arrow-rs#4556 and apache/arrow-rs#4238). I'm trying to use delta-rs with an IAM role attached to an ECS task and finding it very hard to believe you can't (and that you have to use AWS keys). Can you confirm that you cannot use IAM roles to write Delta Lake tables to S3?

+1 to the points above
I'm reviving this old thread to clean it up! I believe we have corrected this behavior, since the deltalake-aws crate will use the AWS SDK itself for all AWS-related credential resolution; this includes passing the access key/secret key through to the object_store crate. 🤞
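If that holds, reads driven purely by the shared AWS files should work without explicit storage_options. A minimal sketch, assuming a hypothetical table URI and a profile named "default":

```python
# Sketch: with newer deltalake versions that defer to the AWS SDK for
# credential resolution, only the shared config/credentials files and a
# profile selection should be needed.
import os
from deltalake import DeltaTable

os.environ["AWS_PROFILE"] = "default"  # assumed profile name

dt = DeltaTable("s3://my-bucket/my-table")  # hypothetical table URI
print(dt.version())
```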
Environment

Delta-rs version:

Binding: Python

Environment:

Bug

What happened:

It seems that credentials are not correctly obtained from the ~/.aws/credentials and ~/.aws/config files. Just like in #1416, I'm getting OSError: Generic S3 error: Missing region when trying to read from S3.

On MacOS locally: setting AWS_DEFAULT_REGION fixes this, but then it tries to retrieve instance metadata using http://169.254.169.254/latest/api/token, which obviously fails when not running on an AWS instance: OSError: Generic S3 error: Error after 10 retries in 6.409805791s, max_retries:10, retry_timeout:180s, source:error sending request for url (http://169.254.169.254/latest/api/token): error trying to connect: tcp connect error: Host is down (os error 64)

On an AWS instance: setting only AWS_DEFAULT_REGION results in OSError: Generic S3 error: Client error with status 403 Forbidden: <?xml version="1.0" encoding="UTF-8"?>

In both cases, setting everything through environment variables fixes the problem, e.g. AWS_DEFAULT_REGION=... AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=... python. Other tools like boto3 don't have problems using credentials stored in the default location.

What you expected to happen:

Credentials are properly read from the default locations ~/.aws/credentials and ~/.aws/config.

How to reproduce it:

Install deltalake and try to read from S3 while having credentials set in the default files. See the polars vs. deltalake comparison sketched below.
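A rough sketch of that comparison, assuming a table at a hypothetical s3://my-bucket/my-table and relying on the earlier observation in this thread that polars resolves credentials from the shared AWS files:

```python
# Rough sketch: credentials are present only in ~/.aws/credentials and
# ~/.aws/config; no AWS_* environment variables are exported.
import polars as pl
from deltalake import DeltaTable

uri = "s3://my-bucket/my-table"  # hypothetical table URI

# polars (which, per the discussion above, parses the shared AWS files)
df = pl.read_delta(uri)
print(df.shape)

# deltalake with no storage_options fails with
# OSError: Generic S3 error: Missing region
dt = DeltaTable(uri)
print(dt.to_pandas().shape)
```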