* rfc: Multipart
* Assign number
* Fix typo
* Add guide-level explanation
* Fix other parts

Signed-off-by: Xuanwo <github@xuanwo.io>
Showing 3 changed files with 166 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,163 @@ | ||
- Proposal Name: `multipart` | ||
- Start Date: 2022-07-11 | ||
- RFC PR: [datafuselabs/opendal#438](https://github.com/datafuselabs/opendal/pull/438) | ||
- Tracking Issue: [datafuselabs/opendal#439](https://github.com/datafuselabs/opendal/issues/439) | ||
|
||
# Summary | ||
|
||
Add multipart support in OpenDAL. | ||
|
||
# Motivation | ||
|
||
[Multipart Upload](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html) APIs are widely used in object storage services to upload large files concurrently and to resume interrupted uploads. | ||
|
||
A successful multipart upload includes the following steps: | ||
|
||
- `CreateMultipartUpload`: Start a new multipart upload. | ||
- `UploadPart`: Upload a single part using the previously created upload id. | ||
- `CompleteMultipartUpload`: Complete a multipart upload to get a regular object. | ||
|
||
To cancel a multipart upload, users need to call `AbortMultipartUpload`. | ||
|
||
Apart from those APIs, most object storage services also provide list APIs to inspect the status of multipart uploads: | ||
|
||
- `ListMultipartUploads`: List ongoing multipart uploads. | ||
- `ListParts`: List already uploaded parts. | ||
|
||
Before `CompleteMultipartUpload` has been called, users can't read already uploaded parts. | ||
|
||
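After `CompleteMultipartUpload` or `AbortMultipartUpload` has been called, the uploaded parts no longer exist as separate parts: completing the upload assembles them into the final object, and aborting it discards them. | ||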
|
||
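The lifecycle described above can be sketched with a small in-memory model. This is only an illustration: `MockStore` and its methods are hypothetical names that mirror the service-side API names, not OpenDAL types, and real services additionally require the caller to pass the list of parts when completing an upload.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical in-memory model of a multipart-capable store (not an OpenDAL type).
#[derive(Default)]
pub struct MockStore {
    objects: HashMap<String, Vec<u8>>,
    uploads: HashMap<String, Upload>,
    next_id: u64,
}

/// State of one ongoing upload: target path plus parts keyed by part number.
struct Upload {
    path: String,
    parts: BTreeMap<usize, Vec<u8>>,
}

impl MockStore {
    /// Start a new upload and return its upload id.
    pub fn create_multipart_upload(&mut self, path: &str) -> String {
        self.next_id += 1;
        let id = format!("upload-{}", self.next_id);
        let upload = Upload { path: path.to_string(), parts: BTreeMap::new() };
        self.uploads.insert(id.clone(), upload);
        id
    }

    /// Store one part; parts may arrive in any order.
    pub fn upload_part(&mut self, upload_id: &str, part_number: usize, data: &[u8]) -> Result<(), String> {
        let upload = self
            .uploads
            .get_mut(upload_id)
            .ok_or_else(|| format!("unknown upload id: {}", upload_id))?;
        upload.parts.insert(part_number, data.to_vec());
        Ok(())
    }

    /// Concatenate the parts in part-number order into a regular object.
    pub fn complete_multipart_upload(&mut self, upload_id: &str) -> Result<(), String> {
        let upload = self
            .uploads
            .remove(upload_id)
            .ok_or_else(|| format!("unknown upload id: {}", upload_id))?;
        let content: Vec<u8> = upload.parts.into_values().flatten().collect();
        self.objects.insert(upload.path, content);
        Ok(())
    }

    /// Discard the upload and every part stored for it.
    pub fn abort_multipart_upload(&mut self, upload_id: &str) -> Result<(), String> {
        self.uploads
            .remove(upload_id)
            .map(|_| ())
            .ok_or_else(|| format!("unknown upload id: {}", upload_id))
    }

    /// Only completed objects are readable; pending parts are invisible.
    pub fn read(&self, path: &str) -> Option<&[u8]> {
        self.objects.get(path).map(Vec::as_slice)
    }
}
```

In this model, reading the target path returns nothing until the upload is completed, and the upload id stops being valid after either completion or abort.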
Object storage services commonly allow up to 10,000 parts per upload, with each part up to 5 GiB. This way, users can upload a single file of up to about 48.8 TiB (10,000 × 5 GiB). | ||
|
||
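The size limit can be checked with a few lines of arithmetic. The sketch below assumes the S3-style limits quoted in this RFC (10,000 parts, 5 GiB per part); the constant and function names are hypothetical, and other services may use different limits.

```rust
/// Assumed S3-style limits; other services may differ.
const MAX_PARTS: u64 = 10_000;
const MAX_PART_SIZE: u64 = 5 * 1024 * 1024 * 1024; // 5 GiB

/// Largest object that can be assembled from `MAX_PARTS` full-size parts.
pub fn max_object_size() -> u64 {
    MAX_PARTS * MAX_PART_SIZE
}

/// Number of parts needed to upload `total` bytes in parts of `part_size` bytes,
/// or `None` if the plan would violate the assumed limits.
pub fn parts_needed(total: u64, part_size: u64) -> Option<u64> {
    if part_size == 0 || part_size > MAX_PART_SIZE {
        return None;
    }
    // Ceiling division written so that it cannot overflow.
    let n = total / part_size + u64::from(total % part_size != 0);
    if n > MAX_PARTS {
        None
    } else {
        Some(n)
    }
}
```

Dividing `max_object_size()` by 2^40 gives 48.828125 TiB, which is where the rounded 48.8 TiB figure comes from.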
By supporting multipart uploads, OpenDAL lets users upload objects larger than 5 GiB. | ||
|
||
# Guide-level explanation | ||
|
||
Users can start a multipart upload via: | ||
|
||
```rust | ||
let mp = op.object("path/to/file").create_multipart().await?; | ||
``` | ||
|
||
Or build a `Multipart` from an already known upload id: | ||
|
||
```rust | ||
let mp = op.object("path/to/file").into_multipart("<upload_id>"); | ||
``` | ||
|
||
With `Multipart`, we can upload a new part: | ||
|
||
```rust | ||
let part = mp.write(part_number, content).await?; | ||
``` | ||
|
||
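Before calling `write`, the caller has to decide how the content maps to `(part_number, content)` pairs. A minimal sketch of one way to do that is shown here. `split_into_parts` is a hypothetical helper, not part of the proposed API, and numbering parts from 1 is an assumption borrowed from S3, since this RFC does not specify the numbering.

```rust
/// Hypothetical helper: split `data` into consecutive chunks of at most
/// `part_size` bytes and pair each chunk with a part number.
/// Part numbers start at 1 (an S3 convention, assumed here).
pub fn split_into_parts(data: &[u8], part_size: usize) -> Vec<(usize, &[u8])> {
    assert!(part_size > 0, "part_size must be non-zero");
    data.chunks(part_size)
        .enumerate()
        .map(|(index, chunk)| (index + 1, chunk))
        .collect()
}
```

Each resulting pair could then be passed to `mp.write(part_number, content)`, and the returned `Part` values collected for the final `complete` call.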
After all parts have been uploaded, we can finish this upload: | ||
|
||
```rust | ||
let _ = mp.complete(parts).await?; | ||
``` | ||
|
||
Or we can abort the upload and discard the parts uploaded so far: | ||
|
||
```rust | ||
let _ = mp.abort().await?; | ||
``` | ||
|
||
# Reference-level explanation | ||
|
||
`Accessor` will add the following APIs: | ||
|
||
```rust | ||
pub trait Accessor: Send + Sync + Debug { | ||
async fn create_multipart(&self, args: &OpCreateMultipart) -> Result<String> { | ||
let _ = args; | ||
unimplemented!() | ||
} | ||
|
||
async fn write_multipart(&self, args: &OpWriteMultipart) -> Result<PartWriter> { | ||
let _ = args; | ||
unimplemented!() | ||
} | ||
|
||
async fn complete_multipart(&self, args: &OpCompleteMultipart) -> Result<()> { | ||
let _ = args; | ||
unimplemented!() | ||
} | ||
|
||
async fn abort_multipart(&self, args: &OpAbortMultipart) -> Result<()> { | ||
let _ = args; | ||
unimplemented!() | ||
} | ||
} | ||
``` | ||
|
||
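Because the new methods come with default bodies, existing backends keep compiling without changes, and only services that support multipart uploads need to override them. The sketch below illustrates that pattern with a simplified, synchronous stand-in for `Accessor`. The argument and error types and both backend structs are hypothetical and do not match the real `Op*` argument structs.

```rust
use std::fmt::Debug;

/// Simplified, synchronous stand-in for `Accessor` (hypothetical signatures).
pub trait Accessor: Send + Sync + Debug {
    /// Default body: backends without multipart support inherit this.
    fn create_multipart(&self, path: &str) -> Result<String, String> {
        let _ = path;
        unimplemented!()
    }

    /// Default body: backends without multipart support inherit this.
    fn abort_multipart(&self, path: &str, upload_id: &str) -> Result<(), String> {
        let _ = (path, upload_id);
        unimplemented!()
    }
}

/// A backend written before multipart existed: an empty impl still compiles.
#[derive(Debug)]
pub struct LegacyBackend;

impl Accessor for LegacyBackend {}

/// A backend that opts in by overriding the defaults.
#[derive(Debug)]
pub struct MultipartBackend;

impl Accessor for MultipartBackend {
    fn create_multipart(&self, path: &str) -> Result<String, String> {
        // A real backend would ask the service for an upload id here.
        Ok(format!("upload-for-{}", path))
    }

    fn abort_multipart(&self, _path: &str, upload_id: &str) -> Result<(), String> {
        if upload_id.is_empty() {
            Err("empty upload id".to_string())
        } else {
            Ok(())
        }
    }
}
```

The trade-off of this pattern is that calling a multipart method on a backend that has not overridden it panics at runtime instead of failing at compile time.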
Closing a `PartWriter` generates a `Part`. | ||
|
||
`Operator` will build APIs based on `Accessor`: | ||
|
||
```rust | ||
impl Object { | ||
async fn create_multipart(&self) -> Result<Multipart> {} | ||
fn into_multipart(&self, upload_id: &str) -> Multipart {} | ||
} | ||
|
||
impl Multipart { | ||
async fn write(&self, part_number: usize, bs: impl AsRef<[u8]>) -> Result<Part> {} | ||
async fn writer(&self, part_number: usize, size: u64) -> Result<impl PartWrite> {} | ||
async fn complete(&self, ps: &[Part]) -> Result<()> {} | ||
async fn abort(&self) -> Result<()> {} | ||
} | ||
``` | ||
|
||
# Drawbacks | ||
|
||
None. | ||
|
||
# Rationale and alternatives | ||
|
||
## Why not add new object modes? | ||
|
||
It seems natural to add a new object mode like `multipart`. | ||
|
||
```rust | ||
pub enum ObjectMode { | ||
FILE, | ||
DIR, | ||
MULTIPART, | ||
Unknown, | ||
} | ||
``` | ||
|
||
However, making this work would require large breaking API changes to introduce `mode` into `Object`. | ||
|
||
We would also need to change every API call to accept `mode` as an argument. | ||
|
||
For example: | ||
|
||
```rust | ||
let _ = op.object("path/to/dir/").list(ObjectMode::MULTIPART); | ||
let _ = op.object("path/to/file").stat(ObjectMode::MULTIPART); | ||
``` | ||
|
||
## Why not split Object into File and Dir? | ||
|
||
We could split `Object` into `File` and `Dir` to avoid requiring `mode` in the API, but this would also be a vast breaking change. | ||
|
||
# Prior art | ||
|
||
None. | ||
|
||
# Unresolved questions | ||
|
||
None. | ||
|
||
# Future possibilities | ||
|
||
## Support list multipart uploads | ||
|
||
We can support listing ongoing multipart uploads so that users can resume or abort them. | ||
|
||
## Support list parts | ||
|
||
We can support listing the parts that have already been uploaded for an upload. |
1c9f9d7
Deploy preview for opendal ready!
✅ Preview
https://opendal-1y2c4tvhw-databend.vercel.app
Built with commit 1c9f9d7.
This pull request is being automatically deployed with vercel-action