Limit Dataset Role permissions #461

Closed
dlpzx opened this issue May 17, 2023 · 1 comment

dlpzx commented May 17, 2023

Is your idea related to a problem? Please describe.
After analyzing the IAM roles in data.all, we noticed that the IAM role created as part of the dataset stack has some permissions that are too broad (e.g. `lakeformation:*` on `*`).

Describe the solution you'd like
Review and harden permissions for Dataset role
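
As a rough aid for such a review, the sketch below flags `Allow` statements that combine a service-wide action (such as `lakeformation:*`) with `Resource: "*"`. The helper name `find_broad_statements` and the example policy are hypothetical and are not taken from the data.all code base.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts either a single string or a list for Action and Resource."""
    return value if isinstance(value, list) else [value]


def find_broad_statements(policy: dict) -> list[dict]:
    """Return Allow statements that pair a wildcard action with Resource "*"."""
    flagged = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        # "*" or "service:*" grants every action of the service.
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        if wildcard_action and "*" in resources:
            flagged.append(statement)
    return flagged


# Hypothetical policy for illustration only; not the actual dataset role policy.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "lakeformation:*", "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::example-dataset-bucket/*"],
        },
    ],
}

flagged = find_broad_statements(example_policy)
for statement in flagged:
    print("Too broad:", statement["Action"], "on", statement["Resource"])
```

This only catches the most obvious pattern; partial wildcards such as `lakeformation:Get*` and `NotAction`/`NotResource` statements would need extra handling.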

P.S. Don't attach files. Please add code snippets directly in the message body.

dlpzx added the type: enhancement, status: in-progress, and priority: high labels May 17, 2023
dlpzx self-assigned this May 17, 2023
dlpzx added a commit that referenced this issue Jun 9, 2023
### Feature or Bugfix
- Refactoring

### Detail
The resulting IAM policy can:
- list all buckets
- read and write objects in the dataset bucket, which is encrypted
- read S3 access points in the dataset bucket
- put logs in the dataset Glue crawler log group
- read the dataset Glue database, and read and write tables in the
dataset Glue database. This is not strictly necessary, as in data.all
permission to data is handled using Lake Formation. But by restricting
the IAM-based data permissions we ensure that any Glue resource that is
not protected using Lake Formation is not accessible by this role
- WIP - read objects from the `/profiling/code` prefix in the environment
bucket
- WIP - read and write objects in the
`/profiling/code/results/datasetUri/` prefix in the environment bucket
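
A minimal sketch of what a policy document with the permissions listed above could look like, built as a plain Python dict. This is not the actual data.all implementation: the function name, its parameters, the exact action lists, the access point naming prefix, and all example ARNs are assumptions for illustration. The WIP profiling statements are left out because the source marks them as subject to change.

```python
import json


def build_dataset_role_policy(
    bucket_name: str,
    kms_key_arn: str,
    glue_database: str,
    crawler_log_group_arn: str,
    access_point_prefix: str,  # hypothetical naming convention for access points
    region: str,
    account_id: str,
) -> dict:
    """Build a scoped-down policy document for a dataset role (illustrative)."""
    glue_arn = f"arn:aws:glue:{region}:{account_id}"
    statements = [
        # Listing bucket names is the only statement left on Resource "*".
        {"Sid": "ListAllBuckets", "Effect": "Allow",
         "Action": ["s3:ListAllMyBuckets"], "Resource": ["*"]},
        # Read and write objects, restricted to the dataset bucket.
        {"Sid": "DatasetBucketObjects", "Effect": "Allow",
         "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject",
                    "s3:DeleteObject"],
         "Resource": [f"arn:aws:s3:::{bucket_name}",
                      f"arn:aws:s3:::{bucket_name}/*"]},
        # Only the dataset KMS key may be used.
        {"Sid": "DatasetKmsKey", "Effect": "Allow",
         "Action": ["kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey",
                    "kms:DescribeKey"],
         "Resource": [kms_key_arn]},
        # Read access points whose names start with the assumed prefix.
        {"Sid": "DatasetAccessPoints", "Effect": "Allow",
         "Action": ["s3:GetAccessPoint", "s3:GetAccessPointPolicy"],
         "Resource": [f"arn:aws:s3:{region}:{account_id}:accesspoint/"
                      f"{access_point_prefix}*"]},
        # Write logs to the crawler log group only.
        {"Sid": "CrawlerLogs", "Effect": "Allow",
         "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
         "Resource": [crawler_log_group_arn, f"{crawler_log_group_arn}:*"]},
        # Read the dataset database; read and write its tables.
        {"Sid": "DatasetGlueCatalog", "Effect": "Allow",
         "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetTables",
                    "glue:GetPartitions", "glue:CreateTable",
                    "glue:UpdateTable", "glue:DeleteTable"],
         "Resource": [f"{glue_arn}:catalog",
                      f"{glue_arn}:database/{glue_database}",
                      f"{glue_arn}:table/{glue_database}/*"]},
    ]
    return {"Version": "2012-10-17", "Statement": statements}


# Example values are placeholders, not real resources.
policy = build_dataset_role_policy(
    bucket_name="example-dataset-bucket",
    kms_key_arn="arn:aws:kms:eu-west-1:111122223333:key/example-key-id",
    glue_database="example_dataset_db",
    crawler_log_group_arn=(
        "arn:aws:logs:eu-west-1:111122223333:log-group:/aws-glue/crawlers"
    ),
    access_point_prefix="example-dataset",
    region="eu-west-1",
    account_id="111122223333",
)
print(json.dumps(policy, indent=2))
```

Every statement except the bucket listing names concrete resource ARNs, which is what keeps the role away from other buckets, keys, and Glue databases.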

IMPORTANT: I found a bug related to profiling jobs that prevented me
from testing them. A separate
[issue](#506) has been
opened for it. For this reason the profiling permissions are a work in
progress and might require changes, e.g. additional KMS permissions.

It cannot:
- read or write to any other S3 Bucket
- use any KMS key different from the dataset KMS key
- read or write any other Glue database/tables
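
The restrictions listed above follow from scoping each statement to the dataset's own resources. The sketch below illustrates this with a deliberately simplified matcher that only handles `Allow` statements with `*`/`?` wildcards; real IAM evaluation also considers explicit denies, conditions, resource policies, and permission boundaries. The statements and ARNs are hypothetical.

```python
from fnmatch import fnmatchcase


def is_allowed(statements: list[dict], action: str, resource: str) -> bool:
    """Simplified check: True if any Allow statement matches action and resource."""
    for statement in statements:
        if statement["Effect"] != "Allow":
            continue
        # Action names are matched case-insensitively, resource ARNs are not.
        action_ok = any(
            fnmatchcase(action.lower(), pattern.lower())
            for pattern in statement["Action"]
        )
        resource_ok = any(
            fnmatchcase(resource, pattern) for pattern in statement["Resource"]
        )
        if action_ok and resource_ok:
            return True
    return False


# Hypothetical scoped statements for illustration only.
dataset_statements = [
    {"Effect": "Allow",
     "Action": ["s3:GetObject", "s3:PutObject"],
     "Resource": ["arn:aws:s3:::example-dataset-bucket/*"]},
    {"Effect": "Allow",
     "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
     "Resource": ["arn:aws:kms:eu-west-1:111122223333:key/dataset-key-id"]},
    {"Effect": "Allow",
     "Action": ["glue:GetTable", "glue:UpdateTable"],
     "Resource": ["arn:aws:glue:eu-west-1:111122223333:table/example_dataset_db/*"]},
]

own_object = is_allowed(
    dataset_statements, "s3:GetObject",
    "arn:aws:s3:::example-dataset-bucket/data/file.parquet")
other_bucket = is_allowed(
    dataset_statements, "s3:GetObject",
    "arn:aws:s3:::other-bucket/data/file.parquet")
other_key = is_allowed(
    dataset_statements, "kms:Decrypt",
    "arn:aws:kms:eu-west-1:111122223333:key/other-key-id")
other_table = is_allowed(
    dataset_statements, "glue:GetTable",
    "arn:aws:glue:eu-west-1:111122223333:table/other_db/some_table")

print(own_object, other_bucket, other_key, other_table)
```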

In addition, the Glue crawler and the profiling job of the dataset have
been modified to always use the dataset role instead of the pivotRole.
This breaks down the "super permissions" of the pivotRole and
distributes responsibilities. As a result, the dataset role can be
assumed:
- by the pivotRole -> used whenever users assume the role from the
data.all UI
- by Glue -> to run Glue profiling jobs and the Glue crawler
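
The two trusted principals described above could be expressed in a trust (assume-role) policy roughly as follows. This is a sketch under assumptions: the function name and the example pivot role ARN are hypothetical, and the actual data.all stack may define the trust relationship differently.

```python
import json


def build_dataset_role_trust_policy(pivot_role_arn: str) -> dict:
    """Trust policy letting the pivot role and the Glue service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Used when users assume the dataset role from the data.all UI.
                "Effect": "Allow",
                "Principal": {"AWS": pivot_role_arn},
                "Action": "sts:AssumeRole",
            },
            {
                # Used by Glue to run the profiling jobs and the crawler.
                "Effect": "Allow",
                "Principal": {"Service": "glue.amazonaws.com"},
                "Action": "sts:AssumeRole",
            },
        ],
    }


# Placeholder ARN; the real pivot role name depends on the deployment.
trust_policy = build_dataset_role_trust_policy(
    "arn:aws:iam::111122223333:role/example-pivot-role"
)
print(json.dumps(trust_policy, indent=2))
```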

### Relates
- #461 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
noah-paige added the status: in-review label and removed the status: in-progress label Jul 7, 2023
@noah-paige

Closing issue - implemented in v1.6

noah-paige removed the status: in-review label Jul 19, 2023