Add support for Zstandard to System.IO.Compression #59591
Comments
Tagging subscribers to this area: @dotnet/area-system-io-compression
Issue Details
Zstandard (or Zstd) is a fast compression algorithm that was published by Facebook in 2015 and reached version 1.0 in August 2016.
Their official repo offers a C implementation: https://github.com/facebook/zstd
Data compression mechanism specification: https://datatracker.ietf.org/doc/html/rfc8478
Features:
It's used by:
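We could offer a stream-based class, like we do for Deflate with DeflateStream or GZipStream, but we should also consider offering a stream-less static class, since it's a common request.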
|
It would be a great enhancement for .NET, but also for the public visibility of this impressive compression algorithm.
Open questions:
|
Thank you, @manandre for your offer! Let's start by discussing the stream API. I think it makes sense for the stream class to look very similar to Deflate, since both would only wrap a compression algorithm (unlike the Zip, GZip, ZLib APIs, which additionally represent a compression/archiving format). I am thinking we can avoid creating too many constructors by creating a separate The
Questions
var options = new ZStandardOptions(level: CompressionLevel.SmallestSize) { CompressionLevel = -5 };
namespace System.IO.Compression
{
    public class ZStandardOptions
    {
        /// <summary>Allow mapping the CompressionLevel enum to predefined levels for ZStandard:
        /// - CompressionLevel.NoCompression = 1, // Official normal minimum
        /// - CompressionLevel.Fastest = 1, // Official normal minimum
        /// - CompressionLevel.Optimal = 3, // Official default: ZSTD_CLEVEL_DEFAULT
        /// - CompressionLevel.SmallestSize = 22 // Official maximum: ZSTD_MAX_CLEVEL
        /// </summary>
        public ZStandardOptions(CompressionLevel level);
        // Min = ZSTD_minCLevel() which can be negative, Max=ZSTD_maxCLevel()=22, Default=ZSTD_CLEVEL_DEFAULT=3, throw if out-of-bounds
        public int CompressionLevel { get; set; }
        public CompressionMode Mode { get; set; }
        public bool LeaveOpen { get; set; }
        public static int MaxCompressionLevel { get; } // P/Invoke for current maximum: 22
    }
    public class ZStandardStream : Stream
    {
        public ZStandardStream(Stream stream, ZStandardOptions? options); // If options null, then use default values
        public Stream BaseStream { get; }
        public override bool CanRead { get; }
        public override bool CanSeek { get; }
        public override bool CanWrite { get; }
        public override long Length { get; }
        public override long Position { get; set; }
        public override IAsyncResult BeginRead(byte[] buffer, int offset, int count, AsyncCallback? asyncCallback, object? asyncState);
        public override IAsyncResult BeginWrite(byte[] buffer, int offset, int count, AsyncCallback? asyncCallback, object? asyncState);
        public override void CopyTo(Stream destination, int bufferSize);
        public override Task CopyToAsync(Stream destination, int bufferSize, CancellationToken cancellationToken);
        protected override void Dispose(bool disposing);
        public override ValueTask DisposeAsync();
        public override int EndRead(IAsyncResult asyncResult);
        public override void EndWrite(IAsyncResult asyncResult);
        public override void Flush();
        public override Task FlushAsync(CancellationToken cancellationToken);
        public override int Read(byte[] buffer, int offset, int count);
        public override int Read(Span<byte> buffer);
        public override Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken);
        public override ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken cancellationToken = default(CancellationToken));
        public override int ReadByte();
        public override long Seek(long offset, SeekOrigin origin);
        public override void SetLength(long value);
        public override void Write(byte[] buffer, int offset, int count);
        public override void Write(ReadOnlySpan<byte> buffer);
        public override void WriteByte(byte value); // ZLibStream overrides it, but not Deflate/GZipStream
        public override Task WriteAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken);
        public override ValueTask WriteAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken = default(CancellationToken));
    }
} |
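To make the level mapping in the doc comment above concrete, here is a minimal sketch of how the enum-to-level translation and the out-of-bounds check might look. `ZstdLevelSketch`, `FromCompressionLevel` and `Validate` are hypothetical names used only for illustration and are not part of the proposal. The minimum level is passed in as a parameter because the real value would come from `ZSTD_minCLevel()`, which this sketch does not call.

```csharp
using System;
using System.IO.Compression;

// Hypothetical helper, not part of the proposed API: it only spells out
// the mapping described in the ZStandardOptions doc comment.
public static class ZstdLevelSketch
{
    public const int DefaultLevel = 3;  // ZSTD_CLEVEL_DEFAULT, per the proposal
    public const int MaxLevel = 22;     // ZSTD_MAX_CLEVEL, per the proposal

    // Translate the existing CompressionLevel enum into a Zstandard level.
    public static int FromCompressionLevel(CompressionLevel level) => level switch
    {
        CompressionLevel.NoCompression => 1,
        CompressionLevel.Fastest => 1,
        CompressionLevel.Optimal => DefaultLevel,
        CompressionLevel.SmallestSize => MaxLevel,
        _ => throw new ArgumentOutOfRangeException(nameof(level)),
    };

    // Throw if the requested level is out of bounds. minLevel is an
    // assumption supplied by the caller; natively it would be ZSTD_minCLevel().
    public static int Validate(int level, int minLevel)
    {
        if (level < minLevel || level > MaxLevel)
        {
            throw new ArgumentOutOfRangeException(
                nameof(level), level, $"Level must be between {minLevel} and {MaxLevel}.");
        }
        return level;
    }
}
```

Under these assumptions, `ZstdLevelSketch.Validate(-5, minLevel: -7)` accepts a negative level (the `-7` bound is arbitrary), while `ZstdLevelSketch.Validate(23, minLevel: -7)` throws. Whether a raw level set after construction should override the enum-derived one is still the open question raised above.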
|
FYI @VSadov this may be particularly interesting to single-file compression as it is supposed to be very fast for decompression. This might mean we would need deeper runtime integration to be usable during bundler loading. |
How does the multi-threading work internally? Does it integrate somehow with the usual .NET infrastructure ( I wonder about that because sometimes you need threading to play nicely with whatever else lives in the same process. In a web app, multi-threading could cause load spikes that crowd request work off the CPU. Reducing the DOP is only a partial fix, because multiple parallel compression jobs would again saturate all cores and the problem would reappear. Isolating such work onto a custom thread pool can be a solution, but it would not work if the library starts its own threads. Another concern is the startup overhead of multi-threading inside the library. Is there thread pooling? It seems to me that
|
About thread pooling, the
/* ! Thread pool :
* These prototypes make it possible to share a thread pool among multiple compression contexts.
* This can limit resources for applications with multiple threads where each one uses
* a threaded compression mode (via ZSTD_c_nbWorkers parameter).
* ZSTD_createThreadPool creates a new thread pool with a given number of threads.
* Note that the lifetime of such pool must exist while being used.
* ZSTD_CCtx_refThreadPool assigns a thread pool to a context (use NULL argument value
* to use an internal thread pool).
* ZSTD_freeThreadPool frees a thread pool, accepts NULL pointer.
*/
typedef struct POOL_ctx_s ZSTD_threadPool;
ZSTDLIB_API ZSTD_threadPool* ZSTD_createThreadPool(size_t numThreads);
ZSTDLIB_API void ZSTD_freeThreadPool (ZSTD_threadPool* pool); /* accept NULL pointer */
ZSTDLIB_API size_t ZSTD_CCtx_refThreadPool(ZSTD_CCtx* cctx, ZSTD_threadPool* pool); |
Zstandard would be very useful to single-file compression. We currently use ZLib/Deflate as it is available in the runtime, but would prefer something faster, as the impact of decompression is very noticeable at startup. We examined lz4 and Zstd as alternatives: lz4 is faster at decompression, but Zstd would allow us to keep the same compression ratio as with Deflate. If there is Zstd support in the runtime, single-file compression will definitely switch to it. |
Here are some interesting benchmarks: google/brotli#553. ZStandard offers a really nice trade-off for speed and compression ratio. |
It looks like Chrome may also be getting support for decoding zstd-encoded content, making this also relevant to web / cloud scenarios. https://chromestatus.com/feature/6186023867908096 Putting in my vote of support, and hoping to see this prioritized in the .NET 9.0 planning. UPDATE: Chrome has confirmed that they are shipping zstd support in v123. |
I have opened dotnet/aspnetcore#50643 to support the zstd Content-Encoding in ASP.NET Core. |
+1 |
Is there any plan to support it in .NET 9.0? |
The Chrome 123 release supports zstd.
Could you consider it for .NET 9? |
It's super cool to see they released this in Chrome. I think the biggest motivating factor for getting this work done is that ASP.NET can support It looks like Facebook.com is already serving webpages with Most implementations bind to the native Facebook libs, but there are a few existing C# projects that are ports, like: |
Since the 126 release Mozilla Firefox also supports zstd compression: https://www.mozilla.org/en-US/firefox/126.0/releasenotes/ |
Having this in .NET 9 would be awesome. It would also be great to have other algorithms, like lzma2. |
I noticed that this issue has been open for a few years now, and I was wondering if there are any plans to add Zstandard (Zstd) support to .NET. If not, I’d be happy to contribute to help implement this feature. Given the performance benefits and the wide adoption of Zstd, I think it would be a great addition to the framework. If there are any steps or guidelines you can share, I’d love to assist in moving this forward. Looking forward to your feedback and guidance! Thanks! |
@siyavash1984 thank you! We still need to propose the APIs first. Here's the process: https://github.com/dotnet/runtime/blob/43813ac73242fa78c463d456bf755e3a6622b5d7/docs/project/api-review-process.md At the moment we have this initial proposal #59591 (comment) and one reply discussing it. Additional feedback and discussion is welcome on these APIs (or additional proposed ones) to keep this moving. |
In terms of API proposal:
|
Tagging subscribers to this area: @dotnet/area-system-io-compression |
Zstandard (or Zstd) is a fast compression algorithm that was published by Facebook in 2015 and reached version 1.0 in August 2016.
Their official repo offers a C implementation. https://github.com/facebook/zstd
Data compression mechanism specification: https://datatracker.ietf.org/doc/html/rfc8478
Features:
It's used by:
We could offer a stream-based class, like we do for Deflate with DeflateStream or GZipStream, but we should also consider offering a stream-less static class, since it's a common request.
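As a rough illustration of the two API shapes mentioned here, the sketch below uses only types that already ship in System.IO.Compression: `DeflateStream` for the stream-based shape, and the static `BrotliEncoder.TryCompress` as an existing example of a stream-less, span-based call. Choosing Brotli as the stream-less precedent is an assumption made for illustration; nothing here is a proposed Zstandard API, and `CompressionShapes` is a hypothetical class name.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical container class; every call inside is an existing .NET API.
public static class CompressionShapes
{
    // Stream-based shape: wrap a destination stream, write, dispose to flush.
    public static byte[] DeflateCompress(byte[] data)
    {
        using var output = new MemoryStream();
        using (var deflate = new DeflateStream(output, CompressionLevel.Optimal, leaveOpen: true))
        {
            deflate.Write(data, 0, data.Length);
        }
        return output.ToArray();
    }

    // Stream-based shape, read side: wrap the source stream and copy out.
    public static byte[] DeflateDecompress(byte[] compressed)
    {
        using var input = new MemoryStream(compressed);
        using var deflate = new DeflateStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();
        deflate.CopyTo(output);
        return output.ToArray();
    }

    // Stream-less shape: a single static call over spans, no Stream involved.
    public static byte[] BrotliCompress(ReadOnlySpan<byte> data)
    {
        byte[] buffer = new byte[BrotliEncoder.GetMaxCompressedLength(data.Length)];
        if (!BrotliEncoder.TryCompress(data, buffer, out int written))
        {
            throw new InvalidOperationException("Destination buffer was too small.");
        }
        return buffer.AsSpan(0, written).ToArray();
    }
}
```

The stream-less shape avoids allocating a `MemoryStream` when the whole payload is already in memory, which is presumably why it is a common request. A Zstandard equivalent would still need its own API review.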