Skip to content
This repository has been archived by the owner on Feb 8, 2024. It is now read-only.

rgw_sal_motr:[CORTX-33799] Handle progress_cb response in case of copy-object operation failure. #386

Merged
merged 3 commits into from
Aug 12, 2022

Conversation

shriya-deshmukh
Copy link

@shriya-deshmukh shriya-deshmukh commented Aug 9, 2022

Problem:
If copy operation failed after copying some data, its throwing an invalid XML error due to partial progress response. i.e rgw was not closing 'CopyObjectResult' tag in case of failure.

progress_cb() is not sending copy progress percentage or the total size of the object being copied during the copy operation, but periodically keeps sending copied bytes till now, which is not enough to know copy progress anyways as total obj size not known, and in negative cases, no proper response is sent by Copy object, instead progress_cb() response is seen, which is originally meant as partial response only.

Solution:
Replace progress_cb() call from rgw_sal_motr layer, with dump_continue() signal, which responds with HTTP 100 Continue header to avoid client timeout during the long copy operation.

Behavior comparison with AWS S3 & RGW(RADOS)

  1. AWS/RGW not sending the copy progress during copy operation, still no client connection timeout seen.
  2. In MGW(Motr) case, progress_cb() is using chunked encoding, which is keeping the client connection alive, though error handling in in negative cases is missing.

Signed-off-by: Shriya Deshmukh shriya.deshmukh@seagate.com

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

 copy-object operation failure.

Problem:
If copy operation failed after copying some data, then it was throwing
an invalid XML error due to partial progress response. i.e rgw was not closing
'CopyObjectResult' tag in case of failure.

Solution:
Remove progress_cb call from rgw_sal_motr layer, and send 'dump_continue' signal
to avoid client timeout during copy operation.

Signed-off-by: Shriya Deshmukh <shriya.deshmukh@seagate.com>
Copy link

@sachitanands sachitanands left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good.
please test with ceph test tool job

@shriya-deshmukh
Copy link
Author

Copy link

@cdeshmukh cdeshmukh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In PR solution, also mention what is the AWS & RADOS behavior and if anything expected from copy progress perspective, like copy % finished or remaining, todays progress_cb doesnt serve that since it doesnt tell what is the total size of object being copied, so remaining copy cant be determined anyways

src/rgw/rgw_sal_motr.cc Outdated Show resolved Hide resolved
Signed-off-by: Shriya Deshmukh <shriya.deshmukh@seagate.com>
@shriya-deshmukh
Copy link
Author

In PR solution, also mention what is the AWS & RADOS behavior and if anything expected from copy progress perspective, like copy % finished or remaining, todays progress_cb doesnt serve that since it doesnt tell what is the total size of object being copied, so remaining copy cant be determined anyways

done

Copy link

@sjain09 sjain09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment.

src/rgw/rgw_sal_motr.cc Outdated Show resolved Hide resolved
Signed-off-by: Shriya Deshmukh <shriya.deshmukh@seagate.com>
Copy link

@sjain09 sjain09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@cdeshmukh cdeshmukh merged commit 79cb730 into Seagate:main Aug 12, 2022
@shriya-deshmukh shriya-deshmukh deleted the CORTX-33799 branch August 12, 2022 06:53
jjxsg pushed a commit that referenced this pull request Aug 22, 2022
…y-object operation failure. (#386)

Problem: If copy operation failed after copying some data, then it was throwing an invalid XML error due to partial progress response. i.e rgw was not closing 'CopyObjectResult' tag in case of failure.

Solution: Remove progress_cb call from rgw_sal_motr layer, and send 'dump_continue' signal to avoid client timeout during copy operation.

Signed-off-by: Shriya Deshmukh <shriya.deshmukh@seagate.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants