
Support api of COPY object #728

Merged 4 commits into sodafoundation:development on Nov 25, 2019

Conversation

sunfch
Contributor

@sunfch sunfch commented Nov 19, 2019

What this PR does / why we need it:
The requirement for this PR comes from the AWS S3 API: https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html

The PR provides an API to create a copy of an object that is already stored in S3. The object to copy must be smaller than 5 GB, because multipart Copy Object is not implemented yet.

The implemented features are as follows:

  1. Conditional requests: x-amz-copy-source-if-match, x-amz-copy-source-if-none-match, ....
  2. Use "x-amz-metadata-directive" to replace the metadata with that provided in the request when the source object is the same as the target object.
  3. Copy an object from one backend to another, or within the same backend.

The most common use case is the following example, which creates a copy of the object "test.txt" in the same bucket:
curl -H "Content-type: application/xml" -H "X-Amz-Copy-Source: /bucket001/test.txt" -X PUT http://127.0.0.1:8089/v1/s3/bucket001/test.txt2

Not implemented:

  1. Access permissions, ACL --- ACLs and access permissions are not handled by this PR when copying an object
  2. Versioning --- the same as above
  3. SSE --- the same as above

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #727

Special notes for your reviewer:

Release note:

sourceObjectName = splits[1]
}
// If source object is empty, reply back error.
if sourceObjectName == "" {
Collaborator

Do we need to check whether sourceBucketName equals "" too?

Contributor Author

I think it is necessary. What would you suggest?

Collaborator

I agree with you: if sourceBucketName equals "", that's invalid. What do you think?

Contributor Author

Thanks for the review, I will add a sourceBucketName check as well.

Collaborator

Agreed, we should check both the object and the bucket. The else branch of the len(splits) == 2 check should also return an error.

Contributor Author

@kumarashit Thanks, I added some lines to return an error when len(splits) is not equal to 2.
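The validation agreed on above can be sketched in Go. The helper name `parseCopySource` is illustrative only; the PR inlines this logic in the handler rather than using a function like this:

```go
package main

import (
	"fmt"
	"strings"
)

// parseCopySource splits an X-Amz-Copy-Source header of the form
// "/bucket/object" (or "bucket/object") into its bucket and object
// parts, rejecting anything that does not contain both.
// Illustrative sketch, not the PR's actual code.
func parseCopySource(header string) (bucket, object string, err error) {
	trimmed := strings.TrimPrefix(header, "/")
	splits := strings.SplitN(trimmed, "/", 2)
	if len(splits) != 2 {
		return "", "", fmt.Errorf("invalid copy source %q", header)
	}
	bucket, object = splits[0], splits[1]
	// Check both parts, as agreed in the review: an empty bucket
	// name is just as invalid as an empty object name.
	if bucket == "" || object == "" {
		return "", "", fmt.Errorf("invalid copy source %q", header)
	}
	return bucket, object, nil
}

func main() {
	b, o, err := parseCopySource("/bucket001/test.txt")
	fmt.Println(b, o, err) // bucket001 test.txt <nil>
}
```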

}

// Note that sourceObject and targetObject are pointers
targetObject := &pb.Object{}
Collaborator

It seems targetObject is not used after it is defined.

Contributor Author

You are right, I will delete it in the next commit, thanks.

@@ -21,9 +21,11 @@ import (
"net/http"
Collaborator

Can you please point me to the requirements document for the Copy Object feature?

  1. Specifically, is this only for AWS?
  2. If yes, how different is it from https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html?

Another question: what is our behavior during copy w.r.t. the following?
SSE
Versioning
CORS
ACL
Storage class

Contributor Author

@rhsakarpos, thanks for your review. The requirement for Copy Object comes from AWS; the PR implements a subset of its features. It includes copying objects between distinct backends, adding new attributes, and checking preconditions from the HTTP headers. The PR does not support SSE, versioning, CORS, ACL, or storage class.

Collaborator

Hi @sunfch, if the source object already has SSE, Versioning, etc. set, will those settings be copied, i.e. set on the new object too?

Contributor Author

@sfzeng, thanks for your review. SSE and Versioning are not included in the PR for now.

Signed-off-by: sunfch <sunspot0105@gmail.com>
@hopkings2008
Collaborator

LGTM

Signed-off-by: sunfch <sunspot0105@gmail.com>
return err
}
if res.Written < srcObject.Size {
// TODO: delete incomplete object at backend
Collaborator

Hi @sunfch, could you please add an issue to track those TODO items?

Contributor Author

I have added issue #740 to track it.
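The cleanup tracked in #740 amounts to something like the following sketch. The function `cleanupIfIncomplete` and its signature are hypothetical, not code from this PR; the delete callback stands in for the backend driver's delete call:

```go
package main

import "fmt"

// cleanupIfIncomplete mirrors the TODO traced in issue #740: when the
// backend wrote fewer bytes than the source size, the partial target
// object should be deleted. All names here are illustrative.
func cleanupIfIncomplete(written, size int64, deleteTarget func() error) error {
	if written >= size {
		return nil // full copy, nothing to clean up
	}
	if err := deleteTarget(); err != nil {
		return fmt.Errorf("incomplete copy and cleanup failed: %w", err)
	}
	return fmt.Errorf("incomplete copy: wrote %d of %d bytes", written, size)
}

func main() {
	err := cleanupIfIncomplete(3, 5, func() error { return nil })
	fmt.Println(err != nil) // true
}
```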

func (o *Object) GetUpdateMetaSql() (string, []interface{}) {
version := math.MaxUint64 - uint64(o.LastModified)
attrs, _ := json.Marshal(o.CustomAttributes)
sql := "update objects set customattributes =? where bucketname=? and name=? and version=?"
Collaborator

From objectcopy.go in the api, it seems the user can update customattributes, storage class, and content type, but here only customattributes will be updated. Is that a bug?

Contributor Author

@sfzeng, thanks for your review, this is a bug. I'll fix it in the next commit.

Contributor Author

@sunfch sunfch Nov 22, 2019

ACL and content type are updated in the latest commit. Storage class has not been updated, because there is no storage class column in the db schema. I added issue #742 to track it.
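A sketch of the widened UPDATE statement the review asked for. The struct fields and column names (`contenttype`, `acl`) are assumptions based on the discussion above; the project's real schema and `Object` type may differ:

```go
package main

import "fmt"

// Object carries the metadata fields the review asked to persist.
// Field and column names follow the discussion; the real schema
// in the PR may differ.
type Object struct {
	BucketName       string
	Name             string
	ContentType      string
	Acl              string
	CustomAttributes string // JSON-encoded in the real code
	Version          uint64
}

// GetUpdateMetaSql returns the UPDATE statement and its arguments,
// covering customattributes, contenttype, and acl rather than
// customattributes alone (the bug flagged in review).
func (o *Object) GetUpdateMetaSql() (string, []interface{}) {
	sql := "update objects set customattributes=?, contenttype=?, acl=? " +
		"where bucketname=? and name=? and version=?"
	args := []interface{}{o.CustomAttributes, o.ContentType, o.Acl,
		o.BucketName, o.Name, o.Version}
	return sql, args
}

func main() {
	o := &Object{BucketName: "bucket001", Name: "test.txt"}
	sql, args := o.GetUpdateMetaSql()
	fmt.Println(sql, len(args))
}
```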

log.Errorln("failed to create storage. err:", err)
return err
}
reader, err := srcSd.Get(ctx, srcObject.Object, 0, srcObject.Size)
Collaborator

IMHO, it's better to get the target bucket and target backend, and create the target storage driver, before getting data from the source backend.

Contributor Author

Thanks for the review, I will do it as you proposed in the next commit.

and add contenttype and acl in update sql
and adjust the sequence between source and target backend

Signed-off-by: sunfch <sunspot0105@gmail.com>
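The ordering proposed in review, resolve the target side before reading any source data so a misconfigured target fails fast, can be sketched with stub types. `StorageDriver`, `memDriver`, and `copyObject` are all illustrative stand-ins for the project's real driver interface:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// StorageDriver is a minimal stand-in for the project's backend
// driver interface; the real interface carries more context.
type StorageDriver interface {
	Get(object string) (io.Reader, error)
	Put(object string, r io.Reader) (int64, error)
}

// memDriver is a toy in-memory backend for demonstration.
type memDriver struct{ objects map[string][]byte }

func (d *memDriver) Get(object string) (io.Reader, error) {
	data, ok := d.objects[object]
	if !ok {
		return nil, fmt.Errorf("no such object %q", object)
	}
	return bytes.NewReader(data), nil
}

func (d *memDriver) Put(object string, r io.Reader) (int64, error) {
	data, err := io.ReadAll(r)
	if err != nil {
		return 0, err
	}
	d.objects[object] = data
	return int64(len(data)), nil
}

// copyObject follows the order agreed in review: verify the target
// driver first, so a missing target backend fails before any data
// is read from the source backend.
func copyObject(src, dst StorageDriver, srcObj, dstObj string) (int64, error) {
	if dst == nil {
		return 0, fmt.Errorf("target backend unavailable")
	}
	reader, err := src.Get(srcObj)
	if err != nil {
		return 0, err
	}
	return dst.Put(dstObj, reader)
}

func main() {
	src := &memDriver{objects: map[string][]byte{"test.txt": []byte("hello")}}
	dst := &memDriver{objects: map[string][]byte{}}
	n, err := copyObject(src, dst, "test.txt", "test.txt2")
	fmt.Println(n, err) // 5 <nil>
}
```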
return
}

//TODO: In a versioning-enabled bucket, you cannot change the storage class of a specific version of an object. When you copy it, Amazon S3 gives it a new version ID.
Collaborator

@nguptaopensds @rhsakarpos Can we please check this with respect to the versioning we are implementing?

response.AddHeader("x-amz-version-id", result.VersionId)
}

log.Info("COPY object successfully.")
Collaborator

Should the log say "REPLACE object successfully"? This path is REPLACE, not exactly copy. And at the start of the loop, log that since it's a replace we are only updating metadata. Thinking from a debugging perspective.

Contributor Author

Thanks for the review, I will fix it as you suggest in the next commit.

Add new comment when api of COPY return successfully
fix the error for const number of max object size

Signed-off-by: sunfch <sunspot0105@gmail.com>
Collaborator

@kumarashit kumarashit left a comment

LGTM

@@ -30,7 +30,7 @@ const (
MaxPaginationLimit = 1000
DefaultPaginationLimit = MaxPaginationLimit
DefaultPaginationOffset = 0
-	MaxObjectSize = 5 * 1024 * 1024 // 5GB
+	MaxObjectSize = 5 * 1024 * 1024 * 1024 // 5GB
Collaborator

Just confirming: 1024 is the standard multiplier, right? Not 1000. Please confirm GB or GiB.

Contributor Author

@kumarashit, thanks for your review. I created a 5*1024*1024*1024-byte file and put it to AWS, and AWS responded with OK, so it should be 1024.
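The arithmetic behind the fix is easy to double-check: with 1024-based multipliers the corrected constant is 5,368,709,120 bytes (5 GiB), while the old value was only 5 MiB. A minimal sketch:

```go
package main

import "fmt"

const (
	// MaxObjectSize matches the corrected constant in the diff above:
	// 5 GiB, using 1024-based multipliers.
	MaxObjectSize = 5 * 1024 * 1024 * 1024
	// The original (buggy) value was only 5 MiB, despite the "5GB" comment.
	oldMaxObjectSize = 5 * 1024 * 1024
)

func main() {
	fmt.Println(MaxObjectSize)    // 5368709120
	fmt.Println(oldMaxObjectSize) // 5242880
}
```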

Collaborator

@kumarashit kumarashit left a comment

LGTM

@sfzeng sfzeng merged commit 7045d23 into sodafoundation:development Nov 25, 2019
@sfzeng sfzeng mentioned this pull request Jun 11, 2020