Failed to import a bucket with a long name #1710

Open
zsaltys opened this issue Nov 21, 2024 · 1 comment

zsaltys commented Nov 21, 2024

Describe the bug

My bucket name is 50 characters long. When this bucket is imported into data.all, it tries to create a dataset admin role named in the format: dataall-TRUNCATED_BUCKET_NAME-30ydkjf9

The problem is that the truncation does not remove enough characters. data.all truncated my bucket name to 49 characters. Adding up the parts:

dataall- (8) + TRUNCATED_BUCKET_NAME (49) + -30ydkjf9 (9) = 8 + 49 + 9 = 66 characters

However, AWS IAM allows a maximum of 64 characters for a role name, so I received an error:

[Screenshot 2024-11-21 at 11 56 18]
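
The arithmetic above can be checked with a short sketch. The 49-character label is a placeholder (the real bucket name is not given in this issue), and the constant name is made up for illustration; only the prefix, the suffix, and the 64-character IAM limit come from the report.

```python
# IAM role names may be at most 64 characters long.
IAM_ROLE_NAME_MAX_LENGTH = 64

prefix = "dataall-"           # 8 characters, including the separator
truncated_label = "x" * 49    # placeholder for the 49-character truncated bucket name
suffix = "-30ydkjf9"          # 9 characters, including the separator

role_name = prefix + truncated_label + suffix
overflow = len(role_name) - IAM_ROLE_NAME_MAX_LENGTH

print(len(role_name), overflow)
```

The name ends up 2 characters over the limit, one for each `-` separator.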

How to Reproduce

Import a dataset with a bucket name that is 50 characters or longer.

Expected behavior

No response

Your project

No response

Screenshots

No response

OS

N/A

Python version

N/A

AWS data.all version

2.6

Additional context

No response

TejasRGitHub commented Nov 22, 2024

I was able to reproduce the issue and found the bug.

The name is calculated with the following expression:
f"{slugify(self.resource_prefix + '-' + self.target_label[:(max_length - len(self.resource_prefix + self.target_uri))] + suffix, regex_pattern=fr'{regex}', separator=separator, lowercase=True)}"

The slice bound, max_length - len(self.resource_prefix + self.target_uri), decides how many characters of the bucket name are kept so that the prefix and the suffix still fit (the suffix is built from the targetUri, which is the datasetUri in this case). The calculation misses 2 characters: the two '-' separators used in the name.

To correct this, the following should be used:
f"{slugify(self.resource_prefix + '-' + self.target_label[:(max_length - len(self.resource_prefix + suffix) - 1)] + suffix, regex_pattern=fr'{regex}', separator=separator, lowercase=True)}"

One '-' sits between the prefix and the label (self.resource_prefix + '-' + self.target_label...), and the other is part of the suffix, which has the form -{targetUri}. Measuring suffix instead of self.target_uri accounts for the second '-', and the extra - 1 accounts for the first.
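
As a rough illustration of the difference between the two calculations, here is a simplified sketch. It is not the data.all implementation: the function name `build_name` and the `fixed` flag are hypothetical, the `slugify` call is left out so the example runs with the standard library only, and `max_length=64` is assumed from the IAM role limit mentioned in this issue.

```python
def build_name(prefix: str, label: str, target_uri: str,
               max_length: int = 64, fixed: bool = True) -> str:
    """Simplified stand-in for the name calculation (no slugify step)."""
    suffix = f"-{target_uri}" if target_uri else ""
    if fixed:
        # Count the suffix including its '-', plus 1 for the '-' after the prefix.
        budget = max_length - len(prefix + suffix) - 1
    else:
        # Calculation described in this issue: neither '-' is counted.
        budget = max_length - len(prefix + target_uri)
    # Guard against a negative budget, which would slice from the end instead.
    return prefix + "-" + label[:max(budget, 0)] + suffix


long_label = "b" * 50  # placeholder for a 50-character bucket name
print(len(build_name("dataall", long_label, "30ydkjf9", fixed=False)))
print(len(build_name("dataall", long_label, "30ydkjf9", fixed=True)))
```

With the corrected budget the label is cut to 47 characters, so the prefix, both separators, and the suffix fit within the limit.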

Things to consider before making this change

Changing this logic will likely affect all dataset IAM role names, and not only those: it affects every place where NamingConventionService is used, e.g. for Environments and other places where Stacks are used. The same logic is also used for generating policies.
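
One way to scope the impact, as a sketch under assumptions: both calculations only slice the label before anything else happens, so a generated name can differ only when the label is longer than the new, smaller budget. The helper names below are hypothetical and `max_length=64` is assumed; other services that use NamingConventionService may have different limits.

```python
def label_budgets(prefix: str, target_uri: str, max_length: int) -> tuple[int, int]:
    """Return (old_budget, new_budget) for the label, per the two calculations."""
    suffix = f"-{target_uri}" if target_uri else ""
    old_budget = max_length - len(prefix + target_uri)
    new_budget = max_length - len(prefix + suffix) - 1
    return old_budget, new_budget


def name_would_change(prefix: str, label: str, target_uri: str,
                      max_length: int = 64) -> bool:
    """True if the truncated label differs between the old and new calculation."""
    old_budget, new_budget = label_budgets(prefix, target_uri, max_length)
    return label[:max(old_budget, 0)] != label[:max(new_budget, 0)]


print(name_would_change("dataall", "my-bucket", "30ydkjf9"))
print(name_would_change("dataall", "b" * 50, "30ydkjf9"))
```

Running a check like this over existing labels before rolling out the change would show which resources, if any, would get a different name.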
