Limit Dataset Role permissions #461

dlpzx · 2023-05-17T11:18:34Z

Is your idea related to a problem? Please describe.
After analyzing the IAM roles in data.all, we noticed that the IAM role created as part of the dataset stack has some permissions that are too open (e.g. lakeformation* on *).

Describe the solution you'd like
Review and harden the permissions of the Dataset role.

P.S. Don't attach files. Please add code snippets directly in the message body.

### Feature or Bugfix
- Refactoring

### Detail
The resulting IAM policy can:
- list all buckets
- read and write objects in the dataset Bucket, which is encrypted
- read S3 access points in the dataset Bucket
- putLogs in the dataset Glue crawler log group
- read the dataset Glue database, and read and write tables in the dataset Glue database. This is not strictly necessary, as in data.all permission to data is handled using Lake Formation. But by restricting the IAM-based data permissions we ensure that any Glue resource that is not protected using Lake Formation is not accessible by this role
- WIP - read objects from the `/profiling/code` prefix in the environment bucket
- WIP - read and write objects in the `/profiling/code/results/datasetUri/` prefix in the environment bucket

IMPORTANT: I found a bug related to profiling jobs that prevented me from testing the profiling jobs. A separate [issue](#506) has been opened for it. For this reason the profiling permissions are a work in progress and might require changes, e.g. additional KMS permissions.

It cannot:
- read or write to any other S3 Bucket
- use any KMS key different from the dataset KMS key
- read or write any other Glue database/tables

In addition, the Glue crawler and the profiling Job of the dataset have been modified to always use the dataset role and not the PivotRole, to break down the "super permissions" of the pivot role and distribute responsibilities.

As a result, the dataset role can be assumed:
- by the pivotRole -> used whenever users are assuming the role from the data.all UI
- by Glue -> to run Glue profiling jobs and the Glue crawler

### Relates
- #461

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
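
For readers who want to picture the scope described above, here is a minimal sketch of such a policy and trust relationship as plain Python dictionaries. It is not the code merged in #497. The function names, parameter names, `Sid` values, the exact action lists (for example the KMS and access point actions) and the ARN layouts are assumptions chosen for illustration, and the profiling statements mirror the WIP items, so they may need changes.

```python
import json


def build_dataset_role_policy(region, account, bucket, kms_key_id, glue_db,
                              crawler_log_group, env_bucket, dataset_uri):
    """Return an IAM policy document (dict) scoped to a single dataset.

    Hypothetical helper: all names and action lists are illustrative.
    """
    glue = f"arn:aws:glue:{region}:{account}"
    statements = [
        # Listing buckets is the only statement left on "*".
        {"Sid": "ListAllBuckets", "Effect": "Allow",
         "Action": ["s3:ListAllMyBuckets"], "Resource": "*"},
        # Read and write objects in the dataset bucket only.
        {"Sid": "DatasetBucket", "Effect": "Allow",
         "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
         "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]},
        # Read access points (assumed to live in the dataset account/region).
        {"Sid": "DatasetAccessPoints", "Effect": "Allow",
         "Action": ["s3:GetAccessPoint", "s3:GetAccessPointPolicy"],
         "Resource": [f"arn:aws:s3:{region}:{account}:accesspoint/*"]},
        # Only the dataset KMS key can be used.
        {"Sid": "DatasetKmsKey", "Effect": "Allow",
         "Action": ["kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey*",
                    "kms:DescribeKey"],
         "Resource": [f"arn:aws:kms:{region}:{account}:key/{kms_key_id}"]},
        # Put logs in the Glue crawler log group.
        {"Sid": "CrawlerLogs", "Effect": "Allow",
         "Action": ["logs:PutLogEvents"],
         "Resource": [f"arn:aws:logs:{region}:{account}:log-group:"
                      f"{crawler_log_group}:*"]},
        # Read the dataset database, read and write its tables.
        {"Sid": "DatasetGlue", "Effect": "Allow",
         "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetTables",
                    "glue:CreateTable", "glue:UpdateTable",
                    "glue:DeleteTable"],
         "Resource": [f"{glue}:catalog", f"{glue}:database/{glue_db}",
                      f"{glue}:table/{glue_db}/*"]},
        # WIP in the PR: read the profiling code from the environment bucket.
        {"Sid": "ProfilingCodeRead", "Effect": "Allow",
         "Action": ["s3:GetObject"],
         "Resource": [f"arn:aws:s3:::{env_bucket}/profiling/code/*"]},
        # WIP in the PR: read and write this dataset's profiling results.
        {"Sid": "ProfilingResults", "Effect": "Allow",
         "Action": ["s3:GetObject", "s3:PutObject"],
         "Resource": [f"arn:aws:s3:::{env_bucket}/profiling/code/results/"
                      f"{dataset_uri}/*"]},
    ]
    return {"Version": "2012-10-17", "Statement": statements}


def build_dataset_role_trust_policy(pivot_role_arn):
    """Trust policy: the role is assumable by the pivot role and by Glue."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": pivot_role_arn,
                          "Service": "glue.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


if __name__ == "__main__":
    # Placeholder values, not taken from any real deployment.
    policy = build_dataset_role_policy(
        "eu-west-1", "111122223333", "example-dataset-bucket", "example-key-id",
        "example_db", "/aws-glue/crawlers", "example-env-bucket", "datasetUri")
    print(json.dumps(policy, indent=2))
```

Building the document as data keeps the "can / cannot" split easy to check: anything not listed as a resource is denied implicitly by IAM, so no explicit `Deny` statements are needed in this sketch.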

noah-paige · 2023-07-19T16:00:25Z

Closing issue - implemented in v1.6

dlpzx added type: enhancement Feature enhancement status: in-progress This issue has been picked and is being implemented priority: high labels May 17, 2023

dlpzx self-assigned this May 17, 2023

This was referenced Jun 2, 2023

Minimize the permissions of Dataset Role #493

Closed

feat: limit dataset role permissions #497

Merged

dlpzx mentioned this issue Jun 9, 2023

Review permissions of data.all IAM roles #336

Closed

noah-paige added status: in-review This issue has been implemented and is currently in review and waiting for next release and removed status: in-progress This issue has been picked and is being implemented labels Jul 7, 2023

dlpzx mentioned this issue Jul 18, 2023

Limit Pivot Role S3 permissions #580

Closed

7 tasks

noah-paige closed this as completed Jul 19, 2023

noah-paige removed the status: in-review This issue has been implemented and is currently in review and waiting for next release label Jul 19, 2023
