Add Athena List permissions to use AWS SDK for Pandas in SageMaker #1155

Merged
merged 1 commit into from
Apr 10, 2024

Conversation

dlpzx
Contributor

@dlpzx dlpzx commented Apr 9, 2024

Feature or Bugfix

  • Bugfix

Detail

Grant athena:List:* permissions to environment-team-roles. We could have added the following permissions one by one, but since we are dangerously close to the IAM service quota on managed policies per IAM role, this PR grants athena:List:* instead.

{
    "Action": ["athena:ListDataCatalogs"],
    "Resource": ["*"],
    "Effect": "Allow"
},
{
    "Action": ["athena:ListDatabases", "athena:ListTableMetadata"],
    "Resource": ["arn:aws:athena:eu-central-1:<account>:datacatalog/*"],
    "Effect": "Allow"
},
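
As an illustration only, the two statements above can be built programmatically and checked against a wildcard. This is a minimal sketch: the helper names `athena_list_statements` and `action_matches` are hypothetical, and the matcher only approximates IAM's case-insensitive `*` / `?` wildcard rules using the standard library. Under that approximation the pattern `athena:List*` covers all three actions, while the spelling `athena:List:*` used in the description does not match them, since the action names have no colon after `List`. What the merged commit actually grants is not shown on this page.

```python
import json
from fnmatch import fnmatchcase

# Placeholders: the region comes from the description, the account id is elided there.
REGION = "eu-central-1"
ACCOUNT = "<account>"


def athena_list_statements(region, account):
    """Return the two fine-grained statements quoted in the PR description."""
    return [
        {
            "Effect": "Allow",
            "Action": ["athena:ListDataCatalogs"],
            "Resource": ["*"],
        },
        {
            "Effect": "Allow",
            "Action": ["athena:ListDatabases", "athena:ListTableMetadata"],
            "Resource": [f"arn:aws:athena:{region}:{account}:datacatalog/*"],
        },
    ]


def action_matches(pattern, action):
    """Approximate IAM action matching: case-insensitive, with * and ? wildcards."""
    return fnmatchcase(action.lower(), pattern.lower())


statements = athena_list_statements(REGION, ACCOUNT)
needed = sorted(action for stmt in statements for action in stmt["Action"])

print(json.dumps(statements, indent=2))
print(all(action_matches("athena:List*", action) for action in needed))
```

The trade-off is the one the description names: a single wildcard statement is shorter than enumerating actions, at the cost of also allowing any other Athena `List` action.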

The issue also reports missing S3 permissions when using the SDK for pandas. These errors cannot be reproduced when ctas_approach is set to False in the read_sql call (link to docs), so no additional S3 permissions have been granted.

import awswrangler as wr

# Regular (non-CTAS) read of the shared table through the team workgroup
df = wr.athena.read_sql_table(
    table="table",
    database="dataall_DATABASE_shared",
    workgroup="dataall-GROUP",
    ctas_approach=False,
)
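
A minimal sketch that wraps the read call above so the non-CTAS choice lives in one place. The helper names `regular_read_kwargs` and `read_shared_table` are hypothetical; only `wr.athena.read_sql_table` and its `table`, `database`, `workgroup` and `ctas_approach` parameters come from the snippet above. As the SDK documentation describes it, `ctas_approach=False` runs a regular query and parses the result file from the query output location instead of materializing a temporary table first, which is presumably why the extra S3 permissions were not needed here.

```python
def regular_read_kwargs(table, database, workgroup):
    """Build keyword arguments for a regular (non-CTAS) Athena read."""
    return {
        "table": table,
        "database": database,
        "workgroup": workgroup,
        # Skip the CTAS temporary table; the SDK default is ctas_approach=True.
        "ctas_approach": False,
    }


def read_shared_table(table, database, workgroup):
    """Read an Athena table into a DataFrame without using CTAS."""
    # Imported lazily so the kwargs helper can be used without the SDK installed.
    import awswrangler as wr

    return wr.athena.read_sql_table(**regular_read_kwargs(table, database, workgroup))


# Usage (requires AWS credentials and the AWS SDK for pandas):
# df = read_shared_table("table", "dataall_DATABASE_shared", "dataall-GROUP")
```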

Relates

Security

Please answer the questions below briefly where applicable, or write N/A. Based on
OWASP 10.

  • Does this PR introduce or modify any input fields or queries - this includes
    fetching data from storage outside the application (e.g. a database, an S3 bucket)? NO
    • Is the input sanitized?
    • What precautions are you taking before deserializing the data you consume?
    • Is injection prevented by parametrizing queries?
    • Have you ensured no eval or similar functions are used?
  • Does this PR introduce any functionality or component that requires authorization? NO
    • How have you ensured it respects the existing AuthN/AuthZ mechanisms?
    • Are you logging failed auth attempts?
  • Are you using or adding any cryptographic features? NO
    • Do you use standard, proven implementations?
    • Are the used keys controlled by the customer? Where are they stored?
  • Are you introducing any new policies/roles/users? Yes
    • Have you used the least-privilege principle? How? Yes, the user only gets List permissions to Athena resources

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

…Catalogs", "athena:ListDatabases", "athena:ListTableMetadata"
Contributor

@petrkalos petrkalos left a comment


  • The documentation states that nested types are not handled when ctas_approach=False. Do we know if there is a need for that?
  • Do we have a long term plan to fix the IAM Role policy limits?

@dlpzx
Contributor Author

dlpzx commented Apr 10, 2024

  • The documentation states that nested types are not handled when ctas_approach=False. Do we know if there is a need for that?

If customers set unload_approach=True and ctas_approach=False, they can still query nested data; the limitation in the documentation applies when both are False.
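
The flag combinations discussed here can be summarised in a small sketch. `ReadMode` and `resolve_read_mode` are hypothetical names, not part of the SDK; the defaults mirror the SDK's (`ctas_approach=True`, `unload_approach=False`), and the nested-type column follows the documentation statement quoted in this thread. Rejecting both flags set to True is an assumption about how the library treats that combination.

```python
from typing import NamedTuple


class ReadMode(NamedTuple):
    name: str
    handles_nested_types: bool


def resolve_read_mode(ctas_approach: bool = True, unload_approach: bool = False) -> ReadMode:
    """Map the two flags to the read strategy they select."""
    if ctas_approach and unload_approach:
        # Assumption: the two strategies are mutually exclusive.
        raise ValueError("set only one of ctas_approach / unload_approach to True")
    if unload_approach:
        return ReadMode("unload", handles_nested_types=True)
    if ctas_approach:
        return ReadMode("ctas", handles_nested_types=True)
    # Both False: regular query, result parsed from the query output file.
    return ReadMode("regular", handles_nested_types=False)


print(resolve_read_mode(ctas_approach=False, unload_approach=True))
print(resolve_read_mode(ctas_approach=False, unload_approach=False))
```

Note that the unload strategy writes its results to S3, so whether it works with the permissions granted in this PR is not established by this thread.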

  • Do we have a long term plan to fix the IAM Role policy limits?

We can discuss offline. When it comes to IAM roles, we have already optimized the size of the IAM policies through the way they are split. For the long run, we are assessing IAM Identity Center.
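
For readers unfamiliar with the constraint, here is a hypothetical sketch of one way to split statements across managed policies. It is not taken from this project's code: `policy_size` and `split_statements` are illustrative names, and the greedy packing is just one possible strategy. The 6,144-character figure is the IAM quota for a managed policy document (IAM does not count whitespace, which the compact JSON encoding below approximates).

```python
import json

MAX_POLICY_CHARS = 6144  # IAM managed policy size quota


def policy_size(statements):
    """Approximate the policy document size using compact JSON."""
    doc = {"Version": "2012-10-17", "Statement": statements}
    return len(json.dumps(doc, separators=(",", ":")))


def split_statements(statements, limit=MAX_POLICY_CHARS):
    """Greedily pack statements into as few policy documents as possible."""
    chunks, current = [], []
    for stmt in statements:
        # Start a new policy when adding this statement would exceed the limit.
        if current and policy_size(current + [stmt]) > limit:
            chunks.append(current)
            current = []
        # A single oversized statement still gets its own chunk.
        current.append(stmt)
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk would become one managed policy, so the number of chunks counts against the managed-policies-per-role quota mentioned in the PR description.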

@dlpzx dlpzx merged commit 84ffcd4 into main Apr 10, 2024
8 checks passed
@dlpzx dlpzx deleted the fix/athena-permissions-sagemaker-studio branch April 11, 2024 09:30
Development

Successfully merging this pull request may close these issues.

ML Studio - In order to use for example athena in datawrangler several permissions are missing
2 participants