Add Glacier skipping logic to S3FileSystem #18237
Conversation
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
LGTM % few suggestions
@@ -202,4 +211,34 @@ private static void validateS3Location(Location location)
{
    new S3Location(location);
}

private boolean shouldReadObject(S3Object object)
NIT: Maybe move the connected methods into S3ObjectStorageClassFilter and return the predicate from the context:

.filter(context.s3ObjectStorageClassFilter())

e.g.

public enum S3ObjectStorageClassFilter
{
    READ_ALL(o -> true),
    SKIP_ALL_GLACIER(S3ObjectStorageClassFilter::isNotGlacierObject),
    READ_RESTORED_GLACIER_OBJECTS(S3ObjectStorageClassFilter::isCompletedRestoredObject);

    private final Predicate<S3Object> filter;

    S3ObjectStorageClassFilter(Predicate<S3Object> filter)
    {
        this.filter = filter;
    }
private static boolean isCompletedRestoredObject(S3Object object)
{
/* There are 3 cases for the restore status:
*
* 1. The object is not restored, and has not been requested to be restored. We should have a null
* value for restoreStatus
* 2. The object is in the process of being restored. isRestoreInProgress will be true, but restoreExpiryDate
* will be null.
* 3. The object has completed restore. isRestoreInProgress will be false, restoreExpiryDate will be some date.
*
* Since we only care about distinguishing when it is case 3, we can just check restoreExpiryDate for a non-null value. */
return isNotGlacierObject(object) || Optional.ofNullable(object.restoreStatus())
.map(RestoreStatus::restoreExpiryDate)
.isPresent();
}
private static boolean isNotGlacierObject(S3Object object)
{
return !GLACIER_STORAGE_CLASSES.contains(object.storageClass());
}
}
I like this! Will update
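For reference, here is a minimal, self-contained sketch of the enum-as-predicate idea. The `S3Object` and `RestoreStatus` records below are simplified stand-ins for the AWS SDK types, and `toPredicate()` is an illustrative accessor name, not part of the real API:

```java
import java.time.Instant;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Stream;

class FilterDemo
{
    public static void main(String[] args)
    {
        S3Object standard = new S3Object("a", "STANDARD", null);
        S3Object glacier = new S3Object("b", "GLACIER", null);
        S3Object restored = new S3Object("c", "GLACIER", new RestoreStatus(false, Instant.now()));

        // The listing code can apply the configured enum value directly as a stream filter
        long kept = Stream.of(standard, glacier, restored)
                .filter(S3ObjectStorageClassFilter.READ_RESTORED_GLACIER_OBJECTS.toPredicate())
                .count();
        System.out.println(kept); // keeps the standard and the restored object
    }
}

// Simplified stand-ins for the AWS SDK's S3Object/RestoreStatus, for illustration only
record RestoreStatus(boolean isRestoreInProgress, Instant restoreExpiryDate) {}
record S3Object(String key, String storageClass, RestoreStatus restoreStatus) {}

enum S3ObjectStorageClassFilter
{
    READ_ALL(o -> true),
    SKIP_ALL_GLACIER(S3ObjectStorageClassFilter::isNotGlacierObject),
    READ_RESTORED_GLACIER_OBJECTS(S3ObjectStorageClassFilter::isCompletedRestoredObject);

    private static final List<String> GLACIER_STORAGE_CLASSES = List.of("GLACIER", "DEEP_ARCHIVE");

    private final Predicate<S3Object> filter;

    S3ObjectStorageClassFilter(Predicate<S3Object> filter)
    {
        this.filter = filter;
    }

    public Predicate<S3Object> toPredicate()
    {
        return filter;
    }

    private static boolean isNotGlacierObject(S3Object object)
    {
        return !GLACIER_STORAGE_CLASSES.contains(object.storageClass());
    }

    private static boolean isCompletedRestoredObject(S3Object object)
    {
        // Only fully restored objects have restoreExpiryDate set
        return isNotGlacierObject(object)
                || (object.restoreStatus() != null && object.restoreStatus().restoreExpiryDate() != null);
    }
}
```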
return switch (context.s3ObjectStorageClassFilter()) {
    case READ_ALL -> true;
    case SKIP_ALL_GLACIER -> !isGlacierObject(object);
    case READ_RESTORED_GLACIER_OBJECTS -> !isGlacierObject(object) || isCompletedRestoredObject(object);
That will read ALL non-Glacier objects and RESTORED objects? Should we adjust the name?
Maybe
READ_ALL
READ_NON_GLACIER
READ_NON_GLACIER_AND_RESTORED
Agreed, this is clearer. Will update.
Facing some dependency issues: it looks like the version of Netty used in the SDK conflicts with the version pinned in Trino:

<!-- Netty 4.1.94.Final breaks Apache Arrow -->
<dep.netty.version>4.1.93.Final</dep.netty.version>

Per this comment, we need to wait for the next release of Apache Arrow.
return s3ObjectStorageClassFilter;
}

@Config("s3.object-storage-class-filter")
If I understand correctly, this is applicable to the Hive connector only and should be a config there.
Also, I don't know why anyone would want to silently skip Glacier objects. That's just returning wrong data, isn't it?
Delta and Iceberg use listings for cleanups (e.g. DROP TABLE); in such cases I don't think we should be skipping Glacier objects.
That's a good point, although this is actually a problem with the legacy S3 code. Thoughts on how to improve this?
> if i understand correctly, this is applicable to hive connector only and should be a config there. delta and iceberg use listings for cleanups (eg DROP TABLE); in such case I don't think we should be skipping glacier objects.

Good point. What about overloading listFiles() and adding a function signature which accepts a filter (and defaults to the filesystem configuration for the regular signature of listFiles(Location location))? The S3AFileSystem does something similar.
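A rough sketch of what such an overload could look like. The names below are hypothetical and simplified, not the real TrinoFileSystem API; the point is that the existing single-argument signature delegates to the new one using the filesystem-level configuration:

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, simplified sketch of the proposed listFiles() overload
class ListingSketch
{
    private final List<FileEntry> entries;
    private final Predicate<FileEntry> configuredFilter;

    ListingSketch(List<FileEntry> entries, Predicate<FileEntry> configuredFilter)
    {
        this.entries = entries;
        this.configuredFilter = configuredFilter;
    }

    // Existing signature: defaults to the filter configured on the filesystem
    public Iterator<FileEntry> listFiles(String location)
    {
        return listFiles(location, configuredFilter);
    }

    // New overload: callers such as DROP TABLE cleanup pass an explicit
    // read-everything filter so Glacier objects are not silently skipped
    public Iterator<FileEntry> listFiles(String location, Predicate<FileEntry> filter)
    {
        return entries.stream()
                .filter(e -> e.location().startsWith(location))
                .filter(filter)
                .iterator();
    }

    public static void main(String[] args)
    {
        ListingSketch fs = new ListingSketch(
                List.of(new FileEntry("t/a", "STANDARD"), new FileEntry("t/b", "GLACIER")),
                e -> !e.storageClass().equals("GLACIER"));

        // Query path: the configured filter skips the Glacier object
        fs.listFiles("t/").forEachRemaining(e -> System.out.println("query: " + e.location()));

        // Cleanup path: an explicit read-everything filter sees both objects
        fs.listFiles("t/", e -> true).forEachRemaining(e -> System.out.println("cleanup: " + e.location()));
    }
}

// Simplified stand-in for Trino's file entry type, for illustration only
record FileEntry(String location, String storageClass) {}
```

The downside, as discussed later in this thread, is that a filter-accepting listing method leaks an S3-specific concern into the generic filesystem abstraction.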
> Also, i don't know why anyone would want to silently skip glacier objects. that's just returning wrong data, isn't it?
This is the default behavior in Athena. Lots of people have lifecycle policies on S3 buckets which transition their objects in place from standard storage to Glacier storage. This allows them to continue to read from that table.
Can you explain the use case for this feature? Silently skipping data is a correctness issue.
ListObjectsV2Iterable iterable = client.listObjectsV2Paginator(request);
return new S3FileIterator(s3Location, iterable.contents().iterator());
Iterator<S3Object> iterator = client.listObjectsV2Paginator(request).contents()
        .stream()
Nit: put stream on previous line
Will do
private static boolean isCompletedRestoredObject(S3Object object)
{
    /* There are 3 cases for the restore status:
How about simplifying this to
// Only restored objects will have the restoreExpiryDate set.
// Ignore not-restored objects and in-progress restores.
Will do
private static boolean isGlacierObject(S3Object object)
{
    return GLACIER_STORAGE_CLASSES.contains(object.storageClass());
Let's simplify this to

return (object.storageClass() == GLACIER) || (object.storageClass() == DEEP_ARCHIVE);

Or we could do

return switch (object.storageClass()) {
    case GLACIER, DEEP_ARCHIVE -> true;
    default -> false;
};
Will do
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
👋 @ericlgoodman @findepi @electrum - this PR has become inactive. We hope you are still interested in working on it. Please let us know, and we can try to get reviewers to help with that. We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.
Apologies for the delay on following up on this. As @findepi mentioned, putting the skipping logic directly in the file system is problematic for deletes, and we really only want this to apply to Hive file listing. Adding a new listing method just for this is ugly and breaks the abstractions. Looking at this with fresh eyes, I'm thinking we can implement this by adding a …
Sounds good to me. Sorry for the delay on this - we've had a lot of reshuffling lately. Planning on having someone on my team pick this up in the next few weeks.
Hi @electrum, I'll be picking this back up. I was wondering how we should go about implementing the skipping logic in TrinoFileStatusRemoteIterator. If we were to pass the filter to there, how would we go about the hasNext() and next() logic? hasNext() would only allow us to see if there is a next object, whereas in next() we could see if the object matches our filter but we wouldn't be able to just call next() again b/c we aren't sure if it hasNext(). Any help would be appreciated, thank you
Hi @bangtim, take a look at …
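One common way to handle the hasNext()/next() tension is a look-ahead iterator: hasNext() advances the underlying iterator until it finds a matching element and caches it for next() to return. This is a generic sketch of that pattern, not the actual TrinoFileStatusRemoteIterator code; Guava's Iterators.filter()/AbstractIterator implement the same idea:

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Look-ahead filtering iterator: hasNext() drains the delegate until it finds
// an element matching the filter, caching it so next() can return it.
class FilteringIterator<T> implements Iterator<T>
{
    private final Iterator<T> delegate;
    private final Predicate<T> filter;
    private T lookahead;

    FilteringIterator(Iterator<T> delegate, Predicate<T> filter)
    {
        this.delegate = delegate;
        this.filter = filter;
    }

    @Override
    public boolean hasNext()
    {
        // Advance past non-matching elements until a match is cached or input ends
        while (lookahead == null && delegate.hasNext()) {
            T candidate = delegate.next();
            if (filter.test(candidate)) {
                lookahead = candidate;
            }
        }
        return lookahead != null;
    }

    @Override
    public T next()
    {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = lookahead;
        lookahead = null;
        return result;
    }

    public static void main(String[] args)
    {
        Iterator<Integer> evens = new FilteringIterator<>(
                List.of(1, 2, 3, 4, 5).iterator(), n -> n % 2 == 0);
        while (evens.hasNext()) {
            System.out.println(evens.next()); // prints 2, then 4
        }
    }
}
```

Because all filtering happens inside hasNext(), callers keep the usual iterator contract: hasNext() is idempotent, and next() never needs to re-check the filter.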
To be completed via #21164
Description

TrinoS3FileSystem contains logic/configuration for skipping Glacier objects found in S3. This PR adds the missing functionality/configuration from TrinoS3FileSystem and also adds the capability to configure the filesystem to read restored Glacier objects.

Additional context and related issues

Recently, S3 added the RestoreStatus to the result of ListObjectsV2 - see https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-s3-restore-status-s3-glacier-objects-list-api/ for more details.

Release notes
( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: