
IAM Policy enhancements - Split policy statements in chunks #884

Open
anushka-singh opened this issue Nov 17, 2023 · 7 comments
Comments

@anushka-singh
Contributor

Is your idea related to a problem? Please describe.
The limit for managed IAM policies is 6144 characters. Based on a rough approximation of statement sizes, each team can request access to around 35 buckets before a policy runs out of characters.

Describe the solution you'd like
In v2.1.0 we added some utilities in dataall/base/utils/iam_policy_utils.py. We can reuse them to limit the resources per policy and to split policies.
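
To make the idea concrete, here is a minimal sketch of the kind of chunk-splitting such utilities perform (a hypothetical helper, not the actual iam_policy_utils API; the function name and size accounting are assumptions):

```python
import json

# Managed policy size limit referenced above
POLICY_CHAR_LIMIT = 6144
# Characters consumed by the policy envelope around the statement list
POLICY_HEADER_CHARS = len(json.dumps({'Version': '2012-10-17', 'Statement': []}))

def split_statements_in_chunks(statements):
    """Greedily split IAM statements into chunks so that each chunk, wrapped
    in a policy document, stays under the managed policy character limit."""
    chunks, current, current_size = [], [], POLICY_HEADER_CHARS
    for statement in statements:
        size = len(json.dumps(statement))
        if current and current_size + size > POLICY_CHAR_LIMIT:
            chunks.append(current)
            current, current_size = [], POLICY_HEADER_CHARS
        current.append(statement)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Each chunk then becomes the Statement list of its own managed policy.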

@dlpzx
Contributor

dlpzx commented Nov 27, 2023

Thanks for opening the issue for visibility @anushka-singh. We will work together on introducing this feature because, as you noticed, it includes some scenarios beyond the ones considered in iam_policy_utils.

@zsaltys
Contributor

zsaltys commented Jul 5, 2024

@dlpzx @SofiaSazonova @anmolsgandhi we ran into this issue on IAM policies when a role accumulated 38+ shares. Would be nice to prioritize this one.

@github-project-automation github-project-automation bot moved this to Nominated in v2.7.0 Jul 8, 2024
@anmolsgandhi anmolsgandhi moved this from Nominated to Prioritized To do in v2.7.0 Jul 8, 2024
@anmolsgandhi anmolsgandhi moved this from Backlog to This Sprint in v2.7.0 Aug 12, 2024
@petrkalos petrkalos self-assigned this Sep 3, 2024
@petrkalos
Contributor

I started an investigation into it and here are my findings so far...

When using data.all generated IAM roles we already use all the policy slots (max 10 managed policies per role) for services, hence we cannot add any more. We can assume that not all deployments will have all the features enabled, so those 10 might be fewer and leave some space, but I don't consider this a good solution.

When using consumption roles we don't have those service policies, so in theory there is space for adding more managed policies. Keep in mind, however, that those roles are not data.all managed, so users might have already attached their own policies.

Overall, to tackle this issue we need to come up with a smarter (which probably means dynamic) policy manager that maximises capacity within the following limits:

  • 10 managed policies per role
  • 6144 chars per managed policy
  • 10240 chars in total across all the inline policies per role
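
As a sketch of what such a manager has to reason about, the check below encodes these limits (a hypothetical helper, not data.all code; note that IAM does not count whitespace against the size limits, hence the compact serialization):

```python
import json

# IAM quotas listed above (defaults; managed policies per role can be
# raised to 20 via a quota increase)
MAX_MANAGED_POLICIES_PER_ROLE = 10
MAX_MANAGED_POLICY_CHARS = 6144
MAX_INLINE_POLICY_CHARS_PER_ROLE = 10240  # aggregate across all inline policies

def role_has_capacity(attached_managed_policy_docs, new_statement):
    """Rough check: can this role still absorb `new_statement`, either inside
    an existing managed policy or in a brand-new one?"""
    compact = lambda doc: len(json.dumps(doc, separators=(',', ':')))
    stmt_size = compact(new_statement)
    for doc in attached_managed_policy_docs:
        if compact(doc) + stmt_size <= MAX_MANAGED_POLICY_CHARS:
            return True  # room left in an existing policy
    # Otherwise we need a free managed-policy slot on the role
    return len(attached_managed_policy_docs) < MAX_MANAGED_POLICIES_PER_ROLE
```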

@dlpzx dlpzx added this to v2.8.0 Sep 9, 2024
@github-project-automation github-project-automation bot moved this to Nominated in v2.8.0 Sep 9, 2024
@TejasRGitHub
Contributor

Hi @petrkalos ,

We are seeing the IAM policy limit getting exceeded on managed roles. I think an initial version where, if the policy limit is exceeded, another policy is created and attached would be helpful. data.all would somehow (maybe) have to remember the policy names it created and try to modify the policy which still has space left.

FYI, if needed the limit can be raised to 20 managed policies per role: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_iam-quotas.html#reference_iam-quotas-entity-length:~:text=Managed%20policies%20per%20role . Please correct me if I am wrong.

@anushka-singh anushka-singh changed the title Bucket Policy enhancement - Split policy statements in chunks IAM Policy enhancements - Split policy statements in chunks Sep 19, 2024
@TejasRGitHub
Contributor

TejasRGitHub commented Sep 20, 2024

[WIP]

Current IAM Roles and Policy management in data.all

Data.all creates managed roles for data.all constructs like Environment and Dataset. These managed roles contain managed and inline policies.

  1. Environment - Contains a Data Policy (inline), multiple service policies (customer managed), and a share policy [optional]
  2. Dataset - Contains one inline policy
  3. Teams attached to an environment - Contains a Data Policy (inline), multiple service policies (customer managed) [optional], and a share policy [optional]
  4. Consumption Roles - Irrespective of whether the role is data.all managed or not, consumption roles contain one customer managed policy
  5. PivotRole - Contains one inline policy and multiple customer managed policies (these policies are split into chunks to maximize space usage)

Problem with managing Customer Managed Service policies:

For environments and teams where roles are created by data.all and customer managed service policies are attached, these service policies are split into chunks (10 statements per managed policy). Usually these service policies are split into managed policies as shown below,

image

Now if there is some modification to the 1-indexed policy, that leaves the managed policy with some empty space which could be filled with other statements. Data.all would then have to rearrange all those policies so that there is a tight fit (i.e. minimal empty space across all the policies).

Thus the problem is: given statements = [s1, s2, s3, ...] with sizes (weights) = [size1, size2, size3, ...], place the statements into policies (service-policy-0, service-policy-1, service-policy-2, ...) such that the empty space left is minimized.

This problem exists because the service policies are diverse and dynamically added, which prevents us from efficiently splitting them into chunks the way functions like split_policy_with_resources_in_statements do. But this problem is not present for the managed policies used in dataset sharing. Here optimizations can be used to split the statements into chunks and create indexed managed policies.
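
The repacking described above is essentially a bin-packing problem; a common heuristic is first-fit decreasing, sketched here on plain statement sizes (illustrative only, not data.all code):

```python
def first_fit_decreasing(weights, capacity):
    """Pack item weights into bins of the given capacity, greedily placing
    each item (largest first) into the first bin with room.
    Returns a list of bins, each a list of weights."""
    bins = []
    for w in sorted(weights, reverse=True):
        for b in bins:
            if sum(b) + w <= capacity:
                b.append(w)
                break
        else:
            bins.append([w])  # no existing bin fits: open a new one
    return bins
```

First-fit decreasing is not optimal in general, but it is simple and known to use at most roughly 11/9 of the optimal number of bins, which is likely good enough for keeping empty space small across the indexed policies.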

Proposal

Stage 1:

The problem of managed policies getting full and resulting in a "limit exceeded" error most likely happens for roles involved in sharing, so this update can first target consumption roles and environment team roles (those involved in sharing). Currently there exists just one managed policy in data.all, with the naming convention <dataall-env>-<URI>-share-policy-<name of the role>.

Policy updates with bucket sharing in data.all -

The following resources are added to the requestor's IAM role

SID for S3 - BucketStatementS3

S3 permissions
s3_target_resources = [f'arn:aws:s3:::{self.bucket_name}', f'arn:aws:s3:::{self.bucket_name}/*']

SID for KMS - BucketStatementKMS

KMS permissions (if KMS is used to encrypt the bucket)
kms_target_resources = [f'arn:aws:kms:{self.bucket_region}:{self.source_account_id}:key/{kms_key_id}']
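
As an illustration, the bucket-share statements could be assembled like this (resource ARNs taken from the snippets above; the action lists are placeholders, not the exact data.all permission sets):

```python
def make_bucket_share_statements(bucket_name, bucket_region,
                                 source_account_id, kms_key_id=None):
    """Build the S3 (and optionally KMS) statements added to the requestor's
    IAM role for a bucket share. Actions shown are illustrative."""
    statements = [{
        'Sid': 'BucketStatementS3',
        'Effect': 'Allow',
        'Action': ['s3:List*', 's3:GetObject'],  # placeholder actions
        'Resource': [f'arn:aws:s3:::{bucket_name}',
                     f'arn:aws:s3:::{bucket_name}/*'],
    }]
    if kms_key_id:  # only when KMS encrypts the bucket
        statements.append({
            'Sid': 'BucketStatementKMS',
            'Effect': 'Allow',
            'Action': ['kms:Decrypt', 'kms:DescribeKey'],  # placeholder actions
            'Resource': [f'arn:aws:kms:{bucket_region}:{source_account_id}:key/{kms_key_id}'],
        })
    return statements
```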

Policy updates with folder sharing in data.all -

The following resources are added to the requestor's IAM role

SID - AccessPointsStatementS3

S3 permissions
s3_target_resources = [
            f'arn:aws:s3:::{self.bucket_name}',
            f'arn:aws:s3:::{self.bucket_name}/*',
            f'arn:aws:s3:{self.dataset_region}:{self.dataset_account_id}:accesspoint/{self.access_point_name}',
            f'arn:aws:s3:{self.dataset_region}:{self.dataset_account_id}:accesspoint/{self.access_point_name}/*',
]

SID - AccessPointsStatementKMS

KMS permissions (if KMS is used to encrypt the bucket)
kms_target_resources = [f'arn:aws:kms:{self.dataset_region}:{self.dataset_account_id}:key/{kms_key_id}']

Policy updates with table sharing in data.all -

Here the requestor's IAM role is not updated.


Design to handle splitting of policies

When a share is getting processed (either a bucket or a folder share):

  1. Check if the share has an old-format policy
    • If yes, then create a policy following the naming convention, with index 0
  2. Check if the share already has a "share-policy"; if so, create a new one with an index, copy all statements from the old policy to the new one, and delete the old policy.
  3. Read the statements from all the policies and split them similarly to the function split_policy_with_resources_in_statements.
    a. This function can be extended / modified so that, depending on whether it is a bucket share or an access point share, it picks up all the statements belonging to an SID pattern (AccessPointsStatement or BucketStatement). [UPDATED]
    b. Then the statements will be split again with split_policy_with_resources_in_statements.
    c. Fetch all policy statements from all the managed policies.
    d. Combine the statements from (b) and (c).
    e. Then split the statements into chunks with split_policy_statements_in_chunks.
  4. Then add the statement chunks into the managed policies, like here:

    for index, chunk in enumerate(statements_chunks):
        policies.append(
            iam.ManagedPolicy(
                self.stack,
                f'PivotRolePolicy-{index + 1}',
                managed_policy_name=f'{self.env_resource_prefix}-pivot-role-cdk-policy-{self.region}-{index + 1}',
                statements=chunk,
            )
        )

When a share is revoked,

  1. Check if the share has an old-format policy
    • If yes, then create a policy following the naming convention, with index 0
  2. Check if the share already has a "share-policy"; if so, create a new one with an index, copy all statements from the old policy to the new one, and delete the old policy.
  3. Fetch all the policy statements, remove the resources which have been revoked from the S3 and KMS SIDs, then take all the statements and call the split function split_policy_with_resources_in_statements again.
  4. Add the statement chunks into the managed policies as in step 4 above.
  5. Delete all the managed policies which are no longer used (empty).
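
The revocation step that strips revoked resource ARNs from the statements could be sketched as a pure helper (hypothetical, not data.all code); statements that end up with no resources are dropped, so their policy can later be deleted if it becomes empty:

```python
def remove_revoked_resources(statements, revoked_resources):
    """Return new statements with the revoked resource ARNs removed.
    Statements left with no resources are dropped entirely."""
    revoked = set(revoked_resources)
    kept = []
    for stmt in statements:
        resources = [r for r in stmt.get('Resource', []) if r not in revoked]
        if resources:
            # copy rather than mutate the input statement
            kept.append({**stmt, 'Resource': resources})
    return kept
```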

Migrating to managed policies with indexes.

Since we already have a managed policy, it will be converted to an index-based managed policy with the name format:
<dataall-env>-<URI>-share-policy-<name of the role>-<index> (e.g. 0, 1, 2, ...)

When attaching, all the policies which follow the above format will be picked up and attached to the role.
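
Picking up the indexed policies could be done with a simple name filter (a sketch; the base name below is just the convention above with placeholders filled in):

```python
import re

def filter_indexed_share_policies(policy_names, base_name):
    """Return (name, index) pairs for policies following the
    '<base_name>-<index>' convention, sorted by index.
    Old-format names without an index suffix are ignored."""
    pattern = re.compile(re.escape(base_name) + r'-(\d+)$')
    matches = []
    for name in policy_names:
        m = pattern.match(name)
        if m:
            matches.append((name, int(m.group(1))))
    return sorted(matches, key=lambda pair: pair[1])
```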


Questions

Q1. What happens when a role is not data.all managed ( in the case of consumption roles ) ?

If the role is not managed by data.all, then all managed policies created as part of the share will be exposed via the copy-to-clipboard button. The policies will be presented as an array of policies when copied from the clipboard icon.

Q2. Can we search for all the managed policies belonging to a consumption role (any role involved in a dataset share) with boto3 calls?

Seems like the policies can be filtered based on PathPrefix - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam/client/list_policies.html

Q3. What if we make separate managed policies for access point and bucket share ?

This will involve migrating the existing policy into two managed policies like "<dataall-env>-<URI>-access-share-policy-<name of the role>-<index>" and "<dataall-env>-<URI>-bucket-share-policy-<name of the role>-<index>"

Pros:

  1. Splitting is easy to handle and happens specifically per share type (access point and bucket)
  2. Easy for the user to view the managed policies and navigate to their statements

Cons

  1. If the role already has 9 policies, splitting into two sets of policies would be a problem (since 10 is the default limit on the number of managed policies a role can have)

Q4. What are the policies which are added by Redshift sharing ?

@dlpzx , any insights on this one ?


Previous GH issues related to policy -
#922

@petrkalos
Contributor

Thanks for the detailed analysis @TejasRGitHub. I agree with the proposed solution; let's tackle ConsumptionRoles (and groups, since the code is very similar) and think about the rest later. Also, the indexing solution is already utilised and proven to work fine.

@dlpzx
Contributor

dlpzx commented Sep 26, 2024

Hi @TejasRGitHub, very clear design. Answering Q4: luckily for us, we do not rely much on Redshift IAM permissions in the data.all integration. The only role that gets additional Redshift IAM permissions is the PivotRole; the rest will get access to Redshift directly via Redshift connections and Redshift permissions in the cluster (outside of IAM, managed by database admins).

@NickCorbett NickCorbett removed this from v2.7.0 Oct 10, 2024
TejasRGitHub pushed a commit to TejasRGitHub/aws-dataall that referenced this issue Oct 17, 2024
# Conflicts:
#	backend/dataall/modules/s3_datasets_shares/services/share_managers/s3_access_point_share_manager.py
#	backend/requirements.txt
#	docker-compose.yaml
TejasRGitHub pushed a commit to TejasRGitHub/aws-dataall that referenced this issue Oct 29, 2024
# Conflicts:
#	deploy/stacks/container.py
@dlpzx dlpzx assigned TejasRGitHub and unassigned petrkalos Oct 31, 2024
@dlpzx dlpzx removed this from v2.8.0 Nov 14, 2024
@dlpzx dlpzx added this to v2.7.0 Nov 14, 2024
@github-project-automation github-project-automation bot moved this to Nominated in v2.7.0 Nov 14, 2024
@dlpzx dlpzx moved this from Nominated to In progress in v2.7.0 Nov 14, 2024
Projects
Status: In progress
Development

No branches or pull requests

6 participants