Default data location for AWS Glue tables #8472
Labels
@aws-cdk/aws-glue
Related to AWS Glue
effort/small
Small work item – less than a day of effort
feature-request
A feature should be added or improved.
in-progress
This issue is being actively worked on.
Comments
@sam-goodwin - As the original author of this, mind if I pick your brain?
Marking this as a feature request to use an empty string as the default data location, which seems like a more reasonable default.
There was no intelligent reasoning. I agree it should be changed since it’s too opinionated.
DerkSchooltink pushed a commit to DerkSchooltink/aws-cdk that referenced this issue on Jul 10, 2020
fixes aws#8472 BREAKING CHANGE: the default location of glue data will be the root of an s3 bucket, instead of /data
DerkSchooltink pushed a commit to DerkSchooltink/aws-cdk that referenced this issue on Jul 13, 2020
fixes aws#8472 BREAKING CHANGE: the default location of glue data will be the root of an s3 bucket, instead of /data
The Question
I started rolling out a Glue table with the following CDK structs:
To my surprise the Glue table pointed towards my bucket using this URL: s3://<bucket>/data. Looking into the documentation of CDK, indeed /data is the default location for the Glue data to be discovered from (this is the s3Prefix property). But little explanation is given why this is the default. Is this done to follow certain guidelines or is this a randomly chosen path?
I would have expected the default to be just empty; no nested folder in the bucket but just the root. Defining the blank s3Prefix seems to be out of place to achieve this behavior:
Proposed solution
Remove the default that points to /data for the s3Prefix and use empty string instead
OR
Provide documentation that explains why /data is chosen as default
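To make the two behaviors concrete, here is a minimal sketch of how a bucket name and a prefix might combine into a table location. It is not the CDK implementation: the helper name tableLocation, the bucket name my-bucket, and the stripping of leading slashes are all assumptions made for illustration.

```typescript
// Hypothetical helper (NOT the CDK source): builds the S3 location a table
// would point at from a bucket name and an optional prefix.
function tableLocation(bucketName: string, s3Prefix: string = "data"): string {
  // Assumption: strip leading slashes so "/data" and "data" behave the same.
  const prefix = s3Prefix.replace(/^\/+/, "");
  return `s3://${bucketName}/${prefix}`;
}

// Behavior described in the issue: with no prefix given, "data" is used.
console.log(tableLocation("my-bucket")); // s3://my-bucket/data

// Proposed behavior: an empty prefix points at the bucket root.
console.log(tableLocation("my-bucket", "")); // s3://my-bucket/
```

With an empty-string default, the second call would be what users get without passing anything, and a nested folder would become opt-in rather than opt-out.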