Sharding middleware #325

Closed
gaul opened this issue Jul 7, 2020 · 0 comments

gaul commented Jul 7, 2020

Some S3 implementations develop hotspots when key names distribute poorly, e.g., keys that share date or timestamp prefixes:

https://aws.amazon.com/blogs/aws/amazon-s3-performance-tips-tricks-seattle-hiring-event/

S3Proxy could support a middleware that prepends a hash of the key name to the key name. This complicates listing objects, but that operation could either be disabled or return the prefixed name instead of the user-provided name.
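
To make the idea concrete, here is a minimal Python sketch of hash-prefixing a key, not S3Proxy code. The function names, the choice of SHA-256, the 4-character hex prefix, and the `/` separator are all assumptions made for illustration; the issue does not specify any of them.

```python
import hashlib

# Hypothetical helpers: none of these names come from S3Proxy.
PREFIX_LEN = 4  # assumed number of hex characters taken from the hash


def prefixed_key(key: str) -> str:
    """Return the backend key: a short hash of the key, then the key itself."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:PREFIX_LEN]}/{key}"


def original_key(stored_key: str) -> str:
    """Strip the hash prefix and separator to recover the user-provided key."""
    return stored_key[PREFIX_LEN + 1:]
```

Because the prefix is derived from the key, a GET or PUT can recompute it without any lookup. Listing is the hard part: backend keys are ordered by hash rather than by the user-provided name, which is why the issue suggests disabling listing or returning the prefixed names.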

@gaul gaul added the middleware label Jul 7, 2020
timuralp added a commit to timuralp/s3proxy that referenced this issue Apr 8, 2021
Adds the sharded bucket middleware, which allows for splitting objects
across multiple backend buckets for a given virtual bucket. The
middleware should be configured with the following properties:
s3proxy.sharded-blobstore.<bucket name>.shards=<number of shards>
s3proxy.sharded-blobstore.<bucket name>.prefix=<prefix>

All shards are named <prefix>-<index>, where index is an
integer from 0 to <number of shards> - 1. If the <prefix> is not
supplied, the <bucket name> is used as the prefix.

Listing the virtual bucket and multipart uploads are not supported. When
listing all containers, the shards are elided from the result.

Fixes gaul#325
Fixes gaul#351
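
The shard naming rule in the commit message above can be sketched in Python as follows. This is an illustration under stated assumptions, not the middleware's implementation: `shard_names` follows the `<prefix>-<index>` rule as described, while `shard_for_key` assumes that a key is mapped to a shard by hashing it (SHA-256 here) modulo the shard count, which the commit message does not confirm. Both function names are hypothetical.

```python
import hashlib
from typing import List, Optional


def shard_names(bucket: str, shards: int, prefix: Optional[str] = None) -> List[str]:
    """Backend bucket names <prefix>-<index> for index 0 .. shards - 1.

    Falls back to the virtual bucket name when no prefix is supplied.
    """
    base = prefix if prefix else bucket
    return [f"{base}-{i}" for i in range(shards)]


def shard_for_key(key: str, bucket: str, shards: int,
                  prefix: Optional[str] = None) -> str:
    """Pick a backend bucket for a key.

    Assumption: hash the key and take it modulo the shard count.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % shards
    return shard_names(bucket, shards, prefix)[index]
```

For example, a virtual bucket `logs` configured with `s3proxy.sharded-blobstore.logs.shards=3` and no prefix would map to the backend buckets `logs-0`, `logs-1`, and `logs-2`.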
timuralp added a commit to timuralp/s3proxy that referenced this issue May 3, 2021
timuralp added a commit to timuralp/s3proxy that referenced this issue May 8, 2021
@gaul gaul closed this as completed in 0d8f9aa Jun 5, 2021