Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AWS profile option to profiles.yml #2437

Closed
ghost opened this issue May 12, 2020 · 10 comments · Fixed by #2581
Closed

Add AWS profile option to profiles.yml #2437

ghost opened this issue May 12, 2020 · 10 comments · Fixed by #2581
Labels
enhancement New feature or request good_first_issue Straightforward + self-contained changes, good for new contributors! redshift

Comments

@ghost
Copy link

ghost commented May 12, 2020

Describe the feature

New option to be allowed on Redshift targets in profiles.yml, specifying the AWS profile to use when getting credentials via IAM.

Describe alternatives you've considered

Allow specification of full awscli command used to fetch credentials? Not sure

Additional context

Maybe relevant only for Redshift, but I think other cloud providers would have similar features.

Who will this benefit?

Increasing security regarding credentials when using DBT.

@ghost ghost added enhancement New feature or request triage labels May 12, 2020
@Limess
Copy link

Limess commented May 12, 2020

To allow the complete flexibility for IAM auth it may also be worth also adding the equivalent of:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_SESSION_TOKEN (only valid if the two prior variables are specified)
  • AWS_DEFAULT_REGION

EC2/ECS credentials will automatically be picked up by the provider chain so don't need any extra setup.

The alternative is to ignore this entirely and point to the boto3 documentation, as all of these can be used by setting the above environment variables before running DBT.

Reference https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html

@jtcohen6
Copy link
Contributor

The alternative is to ignore this entirely and point to the boto3 documentation, as all of these can be used by setting the above environment variables before running DBT.

I think this is how we've gone about this in the past, when it's been necessary.

  • Could you say a bit more about which specific values you'd want to add as options for method: iam in the Redshift profile?
  • How are you deploying dbt?

@jtcohen6 jtcohen6 removed the triage label May 12, 2020
@ghost
Copy link
Author

ghost commented May 13, 2020

We have several AWS accounts, however the access key/credentials at the user level are only defined in one account, whereas redshift sits in a different account. Thus I always need to use credentials from account X to access account Y. The way I do that in the awscli is just specifying a profile with --profile.

We deploy DBT as an ECS task that runs on a schedule, and that ECS task is invoked using an exclusive IAM role that exists on the same environment as Redshift, and I haven't tried, but I think DBT would work fine with IAM validation when deploying since the credentials and redshift are on the same account.

@ghost
Copy link
Author

ghost commented Jun 22, 2020

I want to give it a shot at implementing this!

My idea is to add a new config option 'aws_profile' to profiles.yml, when the method is IAM, and pass that config to the boto3 call that gets the authentication.

Does that sound reasonable?

@jtcohen6 jtcohen6 added good_first_issue Straightforward + self-contained changes, good for new contributors! redshift labels Jun 22, 2020
@jtcohen6
Copy link
Contributor

To be clear, your proposal is that we shouldn't add additional parameters for all of:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_SESSION_TOKEN
  • AWS_DEFAULT_REGION

Instead, we should expect users to configure those via boto3, and to identify a set of credentials by its profile name. Does that sound right?

If so, that sounds like a quick addition to me. It's all yours @landbay-brunomurino. Let me know if you want help getting started.

You should also look into adding to/updating the Redshift unit test test_iam_conn_optionals.

@ghost
Copy link
Author

ghost commented Jun 22, 2020

Think so, yeah. I think (and correct me if I'm wrong) most users of AWS would have the .aws folder with their credentials and profiles, so simply pointing to the correct profile would enable all sorts of connections.

Thanks! Actually I'm struggling with something @jtcohen6 I've set up the environment but am struggling to run the unit tests specific to redshift, it doesn't connect successfully and I'm not sure I should play too much with how things run.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Jun 22, 2020

@landbay-brunomurino You shouldn't need any database connection to run unit tests. Are you able to run make test-unit?

If you're trying to run integration tests, you will need to define Redshift credentials as environment variables in a local file called test.env. You can copy test.env.sample as a starting point.

@ghost
Copy link
Author

ghost commented Jun 22, 2020

Yea, sorry, I didn't mean unit tests, I meant the integration tests. I was able to run make test-unit successfully.

I did define the credentials in the test.env but that's not working.

May not be relevant, but I connect to redshift via SSH tunnel, and when running DBT via docker I need to specify an internal docker host ip output of dig +short host.docker.internal, instead of localhost when not using docker. Since the tests run via docker, I tried putting that IP in the test.env file and now I'm getting a different error (I think the connection worked?):

No module named 'dbt.deps.local'

@jtcohen6
Copy link
Contributor

Got it, that is tricky. I haven't ever tried connecting to Redshift via Docker + SSH; I agree your error message is promising as far as the connection, but unexpected otherwise.

For what it's worth, when you open a PR, CI tests against a sandboxed Redshift cluster will run automatically. As long as you can do some end-to-end testing of your changes locally, I think it's fine to open a fork PR and see.

@ghost
Copy link
Author

ghost commented Jun 22, 2020

@jtcohen6 I got it working locally now, but not sure about the test you mentioned before. If the sandboxed redshift/aws is configured with the default profile, then the current test should pass fine as the new option (profile) is optional.

I created the PR for this , but the first of the CI tests failed with

ERROR:   flake8: commands failed

I'm not sure what's the issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good_first_issue Straightforward + self-contained changes, good for new contributors! redshift
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants