-
Notifications
You must be signed in to change notification settings - Fork 2.9k
AWS: Support multiple storage credential prefixes #12799
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AWS: Support multiple storage credential prefixes #12799
Conversation
e12e4ff to
c443354
Compare
| private final AtomicBoolean isResourceClosed = new AtomicBoolean(false); | ||
| private transient StackTraceElement[] createStack; | ||
| private List<StorageCredential> storageCredentials = ImmutableList.of(); | ||
| private transient volatile Cache<String, PrefixedS3Client> s3ClientCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better if we use Execution interceptor doc which gives us the handle of the request, so that we can inspect the request and see which path it refers to ? rather than making a client per perfix ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really familiar with ExecutionInterceptor so I'll have to do some digging on how a potential solution would look with this. @singhpk234 is there an easy way to pass the storage credentials (that we currently pass to FileIO) to a given ExecutionInterceptor?
But generally speaking, we're only expecting a handful of different storage credentials in practice, so the cost of maintaining < 5 different client instances should be quite low.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is what i had in mind and have seen people doing in past
- Execution interceptor can be configured as part of ClientOverrideConfiguration when we are creating a S3 SDK client, I am assuming we wire the VendedCredProvider for the same way but giving the creds map to the class constructor, so we can do the same way here for the ExecutionInterceptor and pass it in SDK client creation
- Now how to provide / have refreshed creds is this
ExecutionInterceptorwill give us the handle of modify request and you can meddle with SDK header
something like this
@Override
public SdkHttpRequest modifyHttpRequest(Context.BeforeTransmission context, ExecutionAttributes executionAttributes) {
// you can also inspect from the request which path it has in this request
// and then based on the path you can wire in the creds which suits best for the path here :
SdkHttpRequest.Builder builder = context.httpRequest().toBuilder();
AwsCredentials credentials = credentialsProvider.get();
if (credentials != null && credentials.accessKeyId() != null && credentials.secretAccessKey() != null) {
builder.putHeader("X-Amz-Access-Key", credentials.accessKeyId());
builder.putHeader("X-Amz-Secret-Key", credentials.secretAccessKey());
if (credentials.sessionToken().isPresent()) {
builder.putHeader("X-Amz-Security-Token", credentials.sessionToken().get());
}
} else {
System.err.println("Warning: AWS credentials not found.");
}
return builder.build();
}
The main reason for not having one client per path is each client will have a connection pool and we may have to tune it like what is the min pool size for HTTP, if we can by gaurantee it will < 5 (which based on my understanding we can't right now) i think then we can think more.
Would love to know your thoughts for the same considering above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the suggestion @singhpk234 and I really like that idea. I've opened #12827 to explore using a custom ExecutionInterceptor
c3ad9bb to
c4080aa
Compare
c4080aa to
2c3df22
Compare
33fbac8 to
aa82932
Compare
aws/src/main/java/org/apache/iceberg/aws/s3/PrefixedS3Client.java
Outdated
Show resolved
Hide resolved
e36b267 to
f8470d5
Compare
f8470d5 to
395cf9f
Compare
danielcweeks
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 Thanks @nastra!
amogh-jahagirdar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I already approved last week for some reason, but this has looked right to me since the recent changes. Thank you @nastra!
|
Hey. I had a problem with this commit. As I understand my path start with s3a://, not s3://. How to configure it? |
|
@and124578963 can you please open a new issue with some additional details about your catalog configuration? |
No description provided.