Stream the parallel xz/gz tarball generation #76

Merged 1 commit on Jan 17, 2018

Commits on Jan 17, 2018

  1. Stream the parallel xz/gz tarball generation

    This melds the serial-`Tee` and parallel-batched approaches from before
    and after commit adea17e.  Now we can get the same multithreaded speedup
    without having to build the entire uncompressed tarball in memory first.
    
    The new `impl Write for RayonTee` uses `rayon::join` to split the
    compression work for each buffer across separate threads.  This is scoped,
    so it can be fully zero-copy, sharing the input buffer directly.  This
    is all wrapped in a 1 MiB `BufWriter` to balance the cost of thread
    wake-ups and synchronization.
    
    The net performance is unchanged, using around 125% CPU -- approximately
    4:1 time spent in xz versus gz.  The overall memory use is much reduced,
    now independent of the tarball size -- just a few MiB on top of the
    fixed-cost 674 MiB compressor memory requirements of `xz -9`.
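
    As a rough consistency check on the two figures above, assume the xz and
    gz streams overlap fully and that wall-clock time is set by the slower xz
    stream (an assumption; the commit message does not spell this out).  With
    time spent in the ratio 4:1, the expected CPU utilization is

    $$
    \frac{t_{\mathrm{xz}} + t_{\mathrm{gz}}}{\max(t_{\mathrm{xz}},\, t_{\mathrm{gz}})}
      = \frac{4 + 1}{4} = 1.25,
    $$

    which agrees with the reported figure of around 125% CPU.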
    cuviper committed Jan 17, 2018
    5926023
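
The tee writer described in the commit message can be sketched roughly as follows. This is not the code from the commit: `ParallelTee` and `tee_to_vecs` are hypothetical names, and `std::thread::scope` is swapped in for `rayon::join` so the example depends only on the standard library. The `Vec<u8>` sinks stand in for the xz and gz encoders. What it illustrates is the shape of the idea: each buffer is borrowed by both writers at once (no copy), and a 1 MiB `BufWriter` in front keeps the per-call thread hand-off from happening on every small write.

```rust
use std::io::{self, BufWriter, Write};
use std::thread;

/// Hypothetical stand-in for the commit's `RayonTee`: forwards every buffer
/// to two writers, running the two writes on separate threads.
struct ParallelTee<A, B> {
    a: A,
    b: B,
}

impl<A: Write + Send, B: Write + Send> Write for ParallelTee<A, B> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Always consume the whole buffer so both writers stay in lockstep.
        self.write_all(buf)?;
        Ok(buf.len())
    }

    fn write_all(&mut self, buf: &[u8]) -> io::Result<()> {
        let (a, b) = (&mut self.a, &mut self.b);
        // Scoped threads may borrow `buf` directly, so nothing is copied.
        // (The commit uses `rayon::join` here; this is a std-only substitute.)
        let (ra, rb) = thread::scope(|s| {
            let handle = s.spawn(move || a.write_all(buf));
            let rb = b.write_all(buf);
            (handle.join().expect("writer thread panicked"), rb)
        });
        ra.and(rb)
    }

    fn flush(&mut self) -> io::Result<()> {
        let (a, b) = (&mut self.a, &mut self.b);
        let (ra, rb) = thread::scope(|s| {
            let handle = s.spawn(move || a.flush());
            let rb = b.flush();
            (handle.join().expect("writer thread panicked"), rb)
        });
        ra.and(rb)
    }
}

/// Streams `data` through a 1 MiB buffer into two in-memory sinks.
/// In real use the sinks would be compressors rather than `Vec<u8>`.
fn tee_to_vecs(data: &[u8]) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let (mut out_a, mut out_b) = (Vec::<u8>::new(), Vec::<u8>::new());
    {
        let tee = ParallelTee { a: &mut out_a, b: &mut out_b };
        // 1 MiB buffer: fewer, larger writes reach the tee.
        let mut writer = BufWriter::with_capacity(1 << 20, tee);
        writer.write_all(data)?;
        writer.flush()?;
    }
    Ok((out_a, out_b))
}
```

Calling `tee_to_vecs(&bytes)` returns two copies of the input, one per sink. Unlike this sketch, `rayon::join` reuses a thread pool instead of spawning a thread per call, which is presumably part of why the commit pairs it with a large buffer.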