Commit

Update documentation for new read/write specific part size arguments (#960)

* Update documentation for new read/write specific part size arguments

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Add changelog entry

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Update changelog entry

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

---------

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
dannycjones authored Aug 1, 2024
1 parent 3efed3d commit 0db2844
Showing 4 changed files with 10 additions and 4 deletions.
4 changes: 2 additions & 2 deletions doc/CONFIGURATION.md
@@ -235,13 +235,13 @@ Despite these configurations, [IAM permissions](#iam-permissions) still always a
At mount time, Mountpoint automatically selects appropriate defaults to provide high-performance access to Amazon S3. These defaults include [Amazon S3 performance best practices](https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html) such as scaling requests across multiple S3 connections, using range `GET` requests to parallelize sequential reads, and using request timeouts and retries. Most applications should not need to adjust these defaults, but if necessary, you can change them in several ways:
* Mountpoint scales the number and rate of parallel requests to meet a targeted maximum network throughput. This maximum is shared across all file and directory accesses made by a single Mountpoint process. By default, Mountpoint sets this maximum network throughput to the [available network bandwidth](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html) when running on an EC2 instance or to 10 Gbps elsewhere. To change this default, use the `--maximum-throughput-gbps` command-line argument, providing a value in gigabits-per-second (Gbps). For example, if you have multiple Mountpoint processes on the same instance, you can adjust this argument to partition the available network bandwidth between them.
* By default, Mountpoint can serve up to 16 concurrent file or directory operations, and automatically scales up to reach this limit. If your application makes more than this many concurrent reads and writes (including to the same or different files), you can improve performance by increasing this limit with the `--max-threads` command-line argument. Higher values of this flag might cause Mountpoint to use more of your instance's resources.
* When reading or writing files to S3, Mountpoint divides them into parts and uses parallel requests to improve throughput. You can change the part size Mountpoint uses for these parallel requests using the `--part-size` command-line argument, providing a maximum number of bytes per part. The default value of this argument is 8 MiB (8,388,608 bytes), which in our testing is the highest value that achieves maximum throughput. Higher values of this argument can reduce the number of billed requests Mountpoint makes, but also reduce the throughput of object reads and writes to S3.
* When reading or writing files to S3, Mountpoint divides them into parts and uses parallel requests to improve throughput. You can change the part size Mountpoint uses for these parallel requests using the `--read-part-size` and `--write-part-size` command-line arguments, providing a maximum number of bytes per part for reading or writing respectively. For Mountpoint v1.7.2 or earlier, use `--part-size` instead. The default value for these arguments is 8 MiB (8,388,608 bytes), which in our testing is the largest value that achieves maximum throughput. Larger values can reduce the number of billed requests Mountpoint makes, but also reduce the throughput of object reads and writes to S3.
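
As a rough illustration of the request-count side of this trade-off, the sketch below counts how many parts an object splits into for a given part size. The function name `parts_needed` is hypothetical and not part of Mountpoint, and the number of requests Mountpoint actually issues also depends on its internal behavior, so treat the result as an estimate only.

```rust
/// Number of parts an object of `object_size` bytes splits into when each
/// part holds at most `part_size` bytes (ceiling division).
/// Hypothetical helper for illustration only; `part_size` must be nonzero.
fn parts_needed(object_size: u64, part_size: u64) -> u64 {
    // Whole parts, plus one extra part if there is a remainder.
    object_size / part_size + u64::from(object_size % part_size != 0)
}
```

For example, `parts_needed(1 << 30, 8 << 20)` gives 128 parts for a 1 GiB object at the default 8 MiB part size, and doubling the part size to 16 MiB halves that to 64.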

### Maximum object size

In its default configuration, there is no maximum on the size of objects Mountpoint can read. However, Mountpoint uses [multipart upload](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html) when writing new objects, and multipart upload allows a maximum of 10,000 parts for an object. This means Mountpoint can only upload objects up to 80,000 MiB (78.1 GiB) in size. If your application tries to write objects larger than this limit, writes will fail with an out of space error.

To increase the maximum object size for writes, use the `--part-size` command-line argument to specify a maximum number of bytes per part, which defaults to 8 MiB. The maximum object size will be 10,000 multiplied by the value you provide for this argument. Even with multipart upload, S3 allows a maximum object size of 5 TiB, and so setting this argument higher than 524.3 MiB will not further increase the object size limit.
To increase the maximum object size for writes, use the `--write-part-size` command-line argument to specify a maximum number of bytes per part, which defaults to 8 MiB. The maximum object size will be 10,000 multiplied by the value you provide for this argument. Even with multipart upload, S3 allows a maximum object size of 5 TiB, and so setting this argument higher than 524.3 MiB will not further increase the object size limit.
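
The limit described above can be sketched as a small calculation. This is an illustration under the numbers stated in this section (10,000 parts, 5 TiB maximum object size); the names `MAX_PARTS`, `S3_MAX_OBJECT_SIZE`, and `max_upload_size` are hypothetical and are not taken from Mountpoint's source.

```rust
/// Multipart upload allows at most 10,000 parts per object.
const MAX_PARTS: u64 = 10_000;
/// S3's maximum object size of 5 TiB, expressed in bytes.
const S3_MAX_OBJECT_SIZE: u64 = 5 * (1u64 << 40);

/// Largest object (in bytes) that can be written with the given write part
/// size: 10,000 parts of that size, capped at the 5 TiB S3 limit.
/// Hypothetical helper for illustration only.
fn max_upload_size(write_part_size: u64) -> u64 {
    write_part_size
        .saturating_mul(MAX_PARTS) // avoid overflow for very large part sizes
        .min(S3_MAX_OBJECT_SIZE)
}
```

With the default 8 MiB part size, `max_upload_size(8 << 20)` evaluates to 83,886,080,000 bytes, which is the 80,000 MiB limit quoted above. Any part size above roughly 524.3 MiB yields the same 5 TiB cap.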

### Automatically mounting an S3 bucket at boot

2 changes: 1 addition & 1 deletion doc/TROUBLESHOOTING.md
@@ -259,5 +259,5 @@ WARN write{req=100 ino=5 fh=2 offset=83886080000 length=1048576 pid=100 name="20
mountpoint_s3::fuse: write failed: upload error: object exceeded maximum upload size of 83886080000 bytes
```

For workloads uploading files larger than 78GiB, we recommend configuring a larger part size using the `--part-size <MiB>` command-line argument.
For workloads uploading files larger than 78GiB, we recommend configuring a larger part size using the `--write-part-size <MiB>` command-line argument.
For more information, see [Mountpoint's configuration documentation](https://github.com/awslabs/mountpoint-s3/blob/main/doc/CONFIGURATION.md#maximum-object-size).
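
To pick a part size for a known file size, the relationship can be inverted. The sketch below assumes the limits given in the configuration documentation (10,000 parts per upload, 5 TiB maximum object size); `min_write_part_size` and the constants are hypothetical names used for illustration, not Mountpoint APIs.

```rust
/// Multipart upload allows at most 10,000 parts per object.
const MAX_PARTS: u64 = 10_000;
/// S3's maximum object size of 5 TiB, expressed in bytes.
const S3_MAX_OBJECT_SIZE: u64 = 5 * (1u64 << 40);

/// Smallest part size in bytes that lets a file of `file_size` bytes fit
/// into 10,000 parts. Returns `None` if the file exceeds the 5 TiB S3 limit,
/// since no part size can help in that case.
/// Hypothetical helper for illustration only.
fn min_write_part_size(file_size: u64) -> Option<u64> {
    if file_size > S3_MAX_OBJECT_SIZE {
        return None;
    }
    // Ceiling division, with a floor of one byte per part.
    let parts = file_size / MAX_PARTS + u64::from(file_size % MAX_PARTS != 0);
    Some(parts.max(1))
}
```

For a 100 GiB file this gives 10,737,419 bytes. In practice you would likely round up to a convenient value such as 16 MiB (16,777,216 bytes) to leave headroom.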
6 changes: 6 additions & 0 deletions mountpoint-s3/CHANGELOG.md
@@ -1,7 +1,13 @@
## Unreleased

### New features

* Mountpoint now offers two new command-line arguments, `--read-part-size <SIZE>` and `--write-part-size <SIZE>`, which allow you to specify different part sizes to be used when reading and writing respectively.

## v1.7.2 (June 17, 2024)

### Other changes

* Fix an issue where reading a file through Mountpoint could fail, even if the corresponding S3 GetObject request had succeeded. ([#917](https://github.com/awslabs/mountpoint-s3/pull/917))

## v1.7.1 (June 14, 2024)
2 changes: 1 addition & 1 deletion mountpoint-s3/src/cli.rs
@@ -168,7 +168,7 @@ pub struct CliArgs {

#[clap(
long,
help = "Part size for multi-part GET in bytes [default: 8388608]",
help = "Part size for GET in bytes [default: 8388608]",
value_name = "SIZE",
value_parser = value_parser!(u64).range(1..usize::MAX as u64),
help_heading = CLIENT_OPTIONS_HEADER,