HDDS-10521. ETag field should not be returned during GetObject if the key does not contain ETag field #6377
What changes were proposed in this pull request?
A user encountered an error when trying to download a key using the AWS S3 Java SDK (version 1.11.415), although the same key could be downloaded using AWS s3api.
The problem can be replicated even with the current version by uploading a file with ofs (which does not populate the ETag field) and downloading it using the AWS S3 Java SDK.
It seems that an object without an ETag field can be downloaded using the AWS CLI, but not using the AWS Java SDK.
After looking at the AWS SDK code, it seems that the SDK performs a post-processing step that validates the ETag field of the downloaded object against the object's content (see BinaryUtils#fromHex in AmazonS3Client#postProcessS3Object). If the ETag field is null, the post-processing step skips the validation.
Currently, S3G returns the string "null" for the ETag field if the key has no ETag. This should cause the AWS SDK to fail to parse the value, since an MD5 hex string is longer than the string "null". This is most probably why there is an ArrayIndexOutOfBoundsException.
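The client-side behavior described above can be illustrated with a simplified sketch. This is not the AWS SDK's code: the class ETagValidationSketch and its methods decodeHex and validate are hypothetical names, and the decoder here rejects non-hex input with an IllegalArgumentException, which is a property of this sketch only (the exception observed with the real SDK was an ArrayIndexOutOfBoundsException). It only shows the general shape: a null ETag skips validation, a real MD5 hex string is compared against the content, and the literal string "null" cannot be decoded as hex.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration of ETag validation on the client side.
class ETagValidationSketch {

  // Simplified hex decoder (assumption: not the SDK's implementation).
  // Throws IllegalArgumentException for odd-length or non-hex input.
  static byte[] decodeHex(String hex) {
    if (hex.length() % 2 != 0) {
      throw new IllegalArgumentException("Odd-length hex string: " + hex);
    }
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(hex.charAt(2 * i), 16);
      int lo = Character.digit(hex.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("Not a hex string: " + hex);
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }

  // Returns true when the check passes or is skipped because no ETag was sent.
  static boolean validate(String etag, byte[] content) {
    if (etag == null) {
      return true; // no ETag header: nothing to validate
    }
    byte[] expected = decodeHex(etag); // fails for the literal string "null"
    try {
      byte[] actual = MessageDigest.getInstance("MD5").digest(content);
      return Arrays.equals(expected, actual);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }
}
```

In this sketch, validate(null, data) returns true without touching the content, while validate("null", data) fails inside the decoder before any comparison happens.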
The current solution is to not return the ETag field at all if the key does not contain an ETag to begin with. This way, the post-processing step in the AWS SDK will not validate the MD5 hash. The change is applied only to GET operations, since other operations do not seem to have this post-processing step.
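A minimal sketch of the idea, assuming the key's metadata is available as a plain map: the ETag header is added to the response only when the key actually carries one. The class ETagHeaderSketch, the method buildGetObjectHeaders, and the "ETag" metadata key are hypothetical and do not reflect the actual S3G endpoint code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of omitting the ETag header for keys without one.
class ETagHeaderSketch {

  // Builds response headers for a GET; keyMetadata stands in for the key's stored metadata.
  static Map<String, String> buildGetObjectHeaders(
      Map<String, String> keyMetadata, long contentLength) {
    Map<String, String> headers = new LinkedHashMap<>();
    headers.put("Content-Length", Long.toString(contentLength));

    String eTag = keyMetadata.get("ETag");
    // Only emit the header when the key has an ETag. Converting a missing
    // value to a string would produce the literal "null" instead.
    if (eTag != null) {
      headers.put("ETag", eTag);
    }
    return headers;
  }
}
```

With this shape, a key written through ofs (no ETag in its metadata) yields a response without an ETag header, so a client that validates only when the header is present skips the check.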
Future improvement: We might need to add some test coverage for the AWS SDK (maybe in the integration tests), since its behavior might differ from that of the AWS CLI.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-10521
How was this patch tested?
Manual tests: Creating an S3 key with ofs and downloading it with the AWS Java SDK GetObject.
In the future, once S3G is supported in MiniOzoneCluster (HDDS-10390), we can add S3 compatibility tests using the AWS SDK (on top of the AWS CLI in the acceptance tests and the unit tests).
Clean CI run: https://github.com/ivandika3/ozone/actions/runs/8277395144