
Asynchronously upload pack files #3489

Merged
7 commits merged into restic:master from async-pack-uploads on Jul 3, 2022

Conversation

MichaelEischer
Member

What does this PR change? What problem does it solve?

The backup pipeline can stall if lots of small files are backed up while using a high-latency backend, see #2696. This is caused by the fileSaver and blobSaver having to wait for a pack file to be uploaded whenever a blob fills it up. Blocking the fileSaver prevents further files from being processed and thus effectively limits the upload concurrency to 2 for small files.

This PR lets the repository upload finished packs in the background (in SaveAndEncrypt), without blocking the caller. The caller is only blocked if all uploader goroutines are busy. That way the fileSaver is no longer blocked when backing up small files. The archiver concurrency has been adapted to remove excess goroutines.

The concurrency of the pack uploader is based on the number of connections configured for each backend: the local backend defaults to 2 connections and sftp to 5. Configurable connections for these two backends are implemented in #3475.

To keep the number of temporary pack files low despite the increased upload concurrency, the packer manager now maintains only a single incomplete pack file. The pack file hash calculation is deferred to the uploader goroutines, as otherwise the single pack file would become a bottleneck. Simply dumping data chunks into the temporary pack file is rather fast: for me, BenchmarkPackerManager, which measures the pack assembly step, reports a throughput of 3 GB/s.
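A minimal sketch of the background upload scheme, with illustrative names (`uploadRequest`, `startPackUploaders`, and the `upload` callback are not restic's actual API): completed packs are handed off over a channel to a fixed pool of uploader goroutines sized by the backend's connection count, and failed uploads are reported through an errgroup.

```go
package uploader

import (
	"context"
	"os"

	"golang.org/x/sync/errgroup"
)

// uploadRequest describes a finished pack file waiting to be uploaded.
type uploadRequest struct {
	tmpfile *os.File
}

// startPackUploaders launches `connections` background uploader goroutines.
// Callers enqueue finished packs on the returned channel and only block once
// every uploader is busy; an upload error is returned through the errgroup,
// whose cancelled context signals the failure back to the producers.
func startPackUploaders(ctx context.Context, wg *errgroup.Group, connections int,
	upload func(ctx context.Context, req uploadRequest) error) chan<- uploadRequest {

	queue := make(chan uploadRequest)
	for i := 0; i < connections; i++ {
		wg.Go(func() error {
			for req := range queue {
				if err := upload(ctx, req); err != nil {
					return err
				}
			}
			return nil
		})
	}
	// The producer is expected to close(queue) once all packs are enqueued
	// and then wait on the errgroup to collect upload errors.
	return queue
}
```

With an unbuffered queue, the caller only stalls when all `connections` uploaders are already uploading, which matches the behavior described above.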

Was the change previously discussed in an issue or on the forum?

Fixes #2696

Should be combined with #3475.

Checklist

  • I have read the contribution guidelines.
  • I have enabled maintainer edits.
  • I have added tests for all code changes. Only the pack uploader has no tests of its own; everything else is already covered by existing tests.
  • I have added documentation for relevant changes (in the manual). I think we should document the `-o <backend>.connections=42` parameter somewhere.
  • There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
  • I have run gofmt on the code in all commits.
  • All commit messages are formatted in the same style as the other commits in the repo.
  • I'm done! This pull request is ready for review.

@MichaelEischer MichaelEischer force-pushed the async-pack-uploads branch 2 times, most recently from ea2055c to 3fc0e8a on August 22, 2021 17:22
@metalsp0rk
Contributor

metalsp0rk commented Dec 3, 2021

After testing this, I'm seeing much better and more consistent usage of my network connection. Hope to see this implemented soon.

The SFTP backend on my local network goes from 550 Mbps to 875 Mbps on average when combined with #3475 (non-scientific anecdote).

@Slind14

Slind14 commented Jun 28, 2022

We ran a benchmark with this in comparison to 0.13.1:

| 10G Uplink | 0.13.1 | MichaelEischer:async-pack-uploads | diff |
| --- | --- | --- | --- |
| ext. S3 | 550M | 550M | - |
| Minio | 800M | 1.4G | 👍 |
| Rest Server | 2.2G | 1.6G | 👎 |

@MichaelEischer
Member Author

I've rebased the PR onto the current master. The upload statistics introduced by #3733 required quite a few changes, as the asynchronous upload no longer allows retrieving the pack header overhead from the call to Finalize. As a replacement, there's now a new method Packer.HeaderOverhead which estimates the pack header overhead before finalization. The removal of the tomb package from the archiver by #3785 also required a few changes.

Previously, SaveAndEncrypt would assemble blobs into packs and either
return immediately if the pack was not yet full or upload the pack file
otherwise. The upload blocked the current goroutine until it finished.

Now, the upload is done using separate goroutines. This requires changes
to the error handling. As uploads are no longer tied to a SaveAndEncrypt
call, failed uploads are signaled using an errgroup.

To count the uploaded amount of data, the pack header overhead is no
longer returned by `packer.Finalize` but rather by
`packer.HeaderOverhead`. This helper method is necessary to continue
returning the pack header overhead directly to the responsible call to
`repository.SaveBlob`. Without the method this would not be possible,
as packs are finalized asynchronously.
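
A hedged sketch of the HeaderOverhead idea (the constants and fields below are illustrative, not restic's actual on-disk format): because the header grows by one fixed-size entry per blob plus a constant tail, its size can be estimated from the blob count alone, before the pack is finalized.

```go
package pack

// Illustrative constants; restic's real entry and crypto sizes live in its
// pack and crypto packages and may differ.
const (
	entrySize      = 37 // per-blob header entry (type + length + ID), assumed
	cryptoOverhead = 32 // nonce + MAC added when the header is encrypted, assumed
	lengthField    = 4  // trailing header-length field, assumed
)

// Packer collects blobs for a single pack file (simplified).
type Packer struct {
	blobCount int
}

// HeaderOverhead estimates the size of the not-yet-written pack header. It
// can be called before Finalize, which lets SaveBlob report the overhead to
// its caller even though packs are now finalized asynchronously.
func (p *Packer) HeaderOverhead() int {
	return p.blobCount*entrySize + cryptoOverhead + lengthField
}
```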
Now, with the asynchronous uploaders, there is no longer any benefit from
using more blob savers than we have CPUs. Thus use just one blob saver for
each CPU we are allowed to use. A large number of tree savers has no obvious
benefit either; it can, however, increase the number of (potentially large)
trees kept in memory.
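
A small illustration of that concurrency choice, under the assumption that the saver count is derived from the CPUs the process may use (the function name is made up for this sketch):

```go
package archiver

import "runtime"

// saverConcurrency returns how many blob/tree saver goroutines to start.
// With pack uploads running in the background the savers are CPU-bound, so
// one goroutine per usable CPU suffices; extra savers would mainly keep more
// (potentially large) trees in memory.
func saverConcurrency() int {
	return runtime.GOMAXPROCS(0)
}
```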
Use only a single incomplete pack file to keep the number of open and
active pack files low. The main change here is to defer hashing the pack
file to the upload step. This prevents the pack assembly step from becoming
a bottleneck, as its only task is now to write data to the temporary pack
file.

The tests are cleaned up to no longer reimplement packer manager
functions.
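
A rough sketch of the deferred hashing, with assumed names rather than restic's actual code: pack assembly only appends bytes to the temporary file, and the uploader goroutine computes the pack's SHA-256 ID later by reading the finished file back.

```go
package repository

import (
	"crypto/sha256"
	"io"
	"os"
)

// hashPackFile computes the pack ID in the uploader goroutine by re-reading
// the finished temporary pack file from the start. Because hashing happens
// here rather than during assembly, writing blobs into the single
// in-progress pack stays cheap and does not become a bottleneck.
func hashPackFile(tmpfile *os.File) ([sha256.Size]byte, error) {
	var id [sha256.Size]byte
	if _, err := tmpfile.Seek(0, io.SeekStart); err != nil {
		return id, err
	}
	h := sha256.New()
	if _, err := io.Copy(h, tmpfile); err != nil {
		return id, err
	}
	copy(id[:], h.Sum(nil))
	return id, nil
}
```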
@MichaelEischer
Member Author

@Slind14 Thanks for testing. Does adding `-o s3.connections=10` / `-o rest.connections=10` improve the upload performance?

I'm going to merge this PR despite the rest-server performance regression, but we should still investigate it separately. The problem for me right now is that I have no way to reproduce the performance issues: a local backup to an NVMe disk achieves 500 MB/s throughput for me with several 1 GB files and apparently does not suffer from the performance problem.

@MichaelEischer MichaelEischer merged commit cd50feb into restic:master Jul 3, 2022
@MichaelEischer MichaelEischer deleted the async-pack-uploads branch July 3, 2022 09:56
@Slind14

Slind14 commented Jul 3, 2022

Going into the hundreds of connections, we can reach 3G.

3G seems to be the max restic can achieve; we get the same with a server right next to it on the LAN, no matter whether we use 10 or 100 connections.
