This repository has been archived by the owner on Oct 18, 2023. It is now read-only.

proposal: Add more storage services support for bottomless #711

Open
Xuanwo opened this issue Sep 28, 2023 · 3 comments

Comments

Xuanwo commented Sep 28, 2023

Summary

Add more storage services support for bottomless

Motivation

bottomless implements a virtual write-ahead log (WAL) which continuously backs up the data to S3-compatible storage and is able to restore it later. It's natural to consider extending this feature to other storage services such as GCS, AzBlob, HDFS, and more.

Guide-level explanation

Users can serve and replicate SQLite files stored on GCS or Azblob in the same way as they do on AWS S3:

LIBSQL_BOTTOMLESS_GCS_BUCKET=<bucket>

or

LIBSQL_BOTTOMLESS_AZBLOB_BUCKET=<bucket>
LIBSQL_BOTTOMLESS_AZBLOB_ACCOUNT_NAME=<account_name>
LIBSQL_BOTTOMLESS_AZBLOB_ACCOUNT_KEY=<account_key>
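
As a rough illustration of how these variables might be consumed, here is a minimal sketch that picks a backend from environment-style key/value pairs. The `Backend` enum and `backend_from_env` function are hypothetical names invented for this example, not part of bottomless, and the precedence (GCS first, then Azblob) is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical backend selection; these are not bottomless's actual types.
#[derive(Debug, PartialEq)]
enum Backend {
    Gcs { bucket: String },
    Azblob { bucket: String, account_name: String, account_key: String },
}

/// Picks a backend from environment-style key/value pairs.
/// Assumption: GCS wins if both GCS and Azblob variables are set.
fn backend_from_env(env: &HashMap<String, String>) -> Result<Backend, String> {
    let get = |key: &str| env.get(key).cloned();

    if let Some(bucket) = get("LIBSQL_BOTTOMLESS_GCS_BUCKET") {
        return Ok(Backend::Gcs { bucket });
    }
    if let Some(bucket) = get("LIBSQL_BOTTOMLESS_AZBLOB_BUCKET") {
        // Azblob needs the account name and key in addition to the bucket.
        let account_name = get("LIBSQL_BOTTOMLESS_AZBLOB_ACCOUNT_NAME")
            .ok_or("LIBSQL_BOTTOMLESS_AZBLOB_ACCOUNT_NAME is not set")?;
        let account_key = get("LIBSQL_BOTTOMLESS_AZBLOB_ACCOUNT_KEY")
            .ok_or("LIBSQL_BOTTOMLESS_AZBLOB_ACCOUNT_KEY is not set")?;
        return Ok(Backend::Azblob { bucket, account_name, account_key });
    }
    Err("no storage backend configured".to_string())
}
```

In a real process the map could be built with `std::env::vars().collect()`; taking a map as input keeps the function easy to test.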

Reference-level explanation

Introduce OpenDAL to handle the I/O operations.

OpenDAL is a data access layer that allows users to easily and efficiently retrieve data from various storage services in a unified way. It natively supports s3, gcs, azblob, oss, hdfs, and over 20 other storage services. OpenDAL is used in many cloud-native databases such as Databend, RisingWave, and GreptimeDB.

I'm one of the maintainers of this project 💌

The general usage of OpenDAL will be like:

use opendal::{services, Operator};

// Init an operator backed by S3.
let mut builder = services::S3::default();
builder.bucket("test");
let op = Operator::new(builder)?.finish();

// A reader implements AsyncRead & AsyncSeek.
let r = op.reader("path/to/file").await?;

// A writer implements AsyncWrite.
let w = op.writer("path/to/file").await?;

// A lister implements Stream<Item = Result<Entry>>.
let l = op.lister("path/to/dir").await?;

We can add OpenDAL in the following steps:

  • Move S3-related config to a separate S3Options instead of one large Options.
  • Add gcs or azblob support as a PoC.
  • Migrate the s3 implementation to OpenDAL too (it depends)
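
The first step could look roughly like the sketch below. Apart from the `S3Options` and `Options` names taken from the list above, everything here is hypothetical: `GcsOptions`, `StorageOptions`, the `service` helper, and every field are placeholders, not the actual bottomless configuration.

```rust
/// Hypothetical S3-specific settings, split out of one large `Options`.
#[derive(Debug, Clone, PartialEq)]
struct S3Options {
    bucket: String,
    region: Option<String>,
}

/// Hypothetical GCS settings for the proof of concept.
#[derive(Debug, Clone, PartialEq)]
struct GcsOptions {
    bucket: String,
}

/// Exactly one storage service is configured at a time.
#[derive(Debug, Clone, PartialEq)]
enum StorageOptions {
    S3(S3Options),
    Gcs(GcsOptions),
}

/// Service-independent settings stay here; `db_id` is illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Options {
    db_id: Option<String>,
    storage: StorageOptions,
}

impl Options {
    /// Short service name, for example for log messages.
    fn service(&self) -> &'static str {
        match &self.storage {
            StorageOptions::S3(_) => "s3",
            StorageOptions::Gcs(_) => "gcs",
        }
    }
}
```

With this shape, adding another service means adding one struct and one enum variant, and the compiler flags every `match` that still needs a new arm.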

Drawbacks

This makes the code and testing more complex, because we need to ensure that bottomless works on all storage services, even though OpenDAL has already tested all those services.

Rationale and alternatives

Use storage vendors' SDKs

We can use the SDK provided by storage vendors to implement the same features.

Good:

Access storage features directly instead of adding a unified abstraction like OpenDAL.

Bad:

  • More dependencies to be added (OpenDAL implements all features without those SDKs)
  • Harder to alter their behaviors (for example, adding logging/metrics/tracing for all services)
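
To illustrate the second point, the sketch below shows the general idea of altering behavior once for every service through a wrapper. OpenDAL exposes this idea as layers; the `Storage` trait, `MemoryStorage`, and `Logged` here are simplified, hypothetical stand-ins written only with the standard library, not OpenDAL's API.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical minimal storage interface; a real one would be async and fallible.
trait Storage {
    fn write(&mut self, path: &str, data: &[u8]);
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}

/// In-memory stand-in for a real storage service.
#[derive(Default)]
struct MemoryStorage {
    objects: HashMap<String, Vec<u8>>,
}

impl Storage for MemoryStorage {
    fn write(&mut self, path: &str, data: &[u8]) {
        self.objects.insert(path.to_string(), data.to_vec());
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(path).cloned()
    }
}

/// Wraps any `Storage` and records every operation before delegating to it.
struct Logged<S> {
    inner: S,
    log: RefCell<Vec<String>>,
}

impl<S: Storage> Logged<S> {
    fn new(inner: S) -> Self {
        Logged { inner, log: RefCell::new(Vec::new()) }
    }
}

impl<S: Storage> Storage for Logged<S> {
    fn write(&mut self, path: &str, data: &[u8]) {
        self.log.borrow_mut().push(format!("write {} ({} bytes)", path, data.len()));
        self.inner.write(path, data);
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.log.borrow_mut().push(format!("read {}", path));
        self.inner.read(path)
    }
}
```

Because `Logged<S>` works for any `S: Storage`, the logging code is written once. With separate vendor SDKs, the same hook would have to be wired into each SDK's own client type.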

Stick to S3-compatible storage

We can stick to S3-compatible storage since most storage services provide S3 API.

Good:

Easy to maintain

Bad:

Users have to access the bucket with static keys, because they cannot use IAM, which is a native feature of the storage service.

For instance, OpenDAL users on GCP can utilize Application Default Credentials (ADC) without the need for manual configuration of credentials.

Contributor

psarna commented Sep 29, 2023

Good direction, we want to support more providers anyway 👌

Author

Xuanwo commented Oct 7, 2023

Hi, @psarna. Since there are no objections to this proposal, it's highly likely to be accepted. Would you like to discuss the details of this plan? Does it seem good to you?

  • Move S3-related config to a separate S3Options instead of one large Options.
  • Add gcs or azblob support as a PoC.
  • Migrate the s3 implementation to OpenDAL too (it depends)

Contributor

psarna commented Oct 9, 2023

Yes, sounds reasonable

Migrate the s3 implementation to OpenDAL too (it depends)

Right, at first we want to keep our existing implementation as is, in case there's an implementation detail in the aws-sdk-s3 crate that we depend on. But once we test OpenDAL's S3 impl enough, we can just switch 100% to the unified approach.
