
Brotli Decompression issue, when not using compressionlevel.fastest #243

Closed
FilixDark opened this issue Mar 10, 2024 · 5 comments

@FilixDark

I am having an issue with Brotli compression and decompression when not using Fastest. The code works fine with Fastest, but I would really love to use the smallest size for archiving stuff.

This is how I implemented it.

Compression:

using (var fs = new FileStream(filename+".compressed", FileMode.CreateNew)) {
    using (var cp = new BrotliCompressor(System.IO.Compression.CompressionLevel.Optimal)) {
        MemoryPackSerializer.Serialize(cp, log);
        await cp.CopyToAsync(fs);
    }
}

Decompression:

using (var fs = new FileStream(filename, FileMode.Open)) { 
    var buffer = new byte[fs.Length];
    await fs.ReadAsync(buffer, 0, buffer.Length);
    using (var dc = new BrotliDecompressor()) {
        // The next line throws:
        // MemoryPack.MemoryPackSerializationException: 'Failed in Brotli compression/decompression process, status: NeedMoreData.'
        var decomp = dc.Decompress(new ReadOnlySpan<byte>(buffer));
        var log = MemoryPackSerializer.Deserialize<Tradelog>(decomp);
        return log;
    }
}
@hadashiA
Contributor

hadashiA commented Mar 14, 2024

Your sample seems to work on my end.

I would like you to confirm the following:

await fs.ReadAsync(buffer, 0, buffer.Length);

Here, does ReadAsync actually read buffer.Length bytes?
The third argument is the maximum number of bytes to read, so ReadAsync may read fewer bytes than requested in a single call.

I think you need to check the return value of ReadAsync and read repeatedly until all bytes have been read. For example:

var readBytes = 0;
while (readBytes < buffer.Length)
{
    var read = await fs.ReadAsync(buffer, readBytes, buffer.Length - readBytes);
    if (read == 0) break; // end of stream
    readBytes += read;
}
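As a side note, when the goal is simply to load a whole file into memory, the standard .NET `File.ReadAllBytesAsync` helper handles the read loop internally (a sketch using only built-in APIs, not something from the thread's code):

```csharp
using System.IO;

// Reads the entire file into a byte[]; no manual loop over
// partial ReadAsync results is needed.
var buffer = await File.ReadAllBytesAsync(filename);
```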

If the problem persists, please let us know the environment in which you ran each serialization and deserialization. Thanks.

@FilixDark
Author

FilixDark commented Mar 14, 2024

Hi, thanks for the reply. I tried this, and sadly it was always reading the full size, but I implemented it anyway. I made a small test project with a sample file that causes the issue. I hope that makes it easier to reproduce.

brotliissue.zip

And I am running this on Windows 10 Pro 64
Microsoft Visual Studio Community 2022 (64-bit) - Current
Version 17.9.2

@hadashiA
Contributor

hadashiA commented Mar 15, 2024

Thanks a lot for the reproduction code.
Apparently the problem is in BrotliCompressor.CopyToAsync.

If you pass a very large MemoryStream to CopyToAsync at once, it seems that not everything gets written through.

In your code, that happens here:

static async Task SaveLogfile(TradeLog log, MemoryStream ms, CompressionLevel compressionLevel) {
    using (var cp = new BrotliCompressor(compressionLevel)) {
        MemoryPackSerializer.Serialize(cp, log);
        await cp.CopyToAsync(ms);
    }
    await ms.FlushAsync();
}

I will attempt to fix this in #249.

NOTE:
Since the buffer size can be specified in the second argument of CopyToAsync, I think this problem can be avoided by enlarging it.
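The workaround mentioned in the note would look like this (the buffer size value here is illustrative, not a recommendation from the thread):

```csharp
// Workaround before the fix: pass a larger buffer size as the second
// argument of CopyToAsync so the whole payload is written through.
// 1 << 20 (1 MiB) is an illustrative value.
await cp.CopyToAsync(fs, 1 << 20);
```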

By the way, I would like to recommend the following:
If you want to serialize in memory in one go, don't use Stream. Stream support is primarily for I/O destinations such as files.

The above code could use IBufferWriter<byte> instead.

var buffer = new ArrayBufferWriter<byte>(inputBytes.Length);
compressor.CopyTo(buffer);      
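Putting the pieces together, an in-memory round trip along these lines could look as follows. This is a sketch assuming the MemoryPack APIs already used in this thread (BrotliCompressor, BrotliDecompressor, MemoryPackSerializer) and the `Tradelog` type from the original post:

```csharp
using System.Buffers;
using System.IO.Compression;
using MemoryPack;
using MemoryPack.Compression;

// Serialize and compress into an IBufferWriter<byte> instead of a Stream.
using var compressor = new BrotliCompressor(CompressionLevel.Optimal);
MemoryPackSerializer.Serialize(compressor, log);

var buffer = new ArrayBufferWriter<byte>();
compressor.CopyTo(buffer);

// Decompress and deserialize from the written bytes.
using var decompressor = new BrotliDecompressor();
var decompressed = decompressor.Decompress(buffer.WrittenSpan);
var restored = MemoryPackSerializer.Deserialize<Tradelog>(decompressed);
```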

@hadashiA
Contributor

This fix is released in 1.20.1.
Please let us know if there is anything else. Thanks.

@FilixDark
Author


Hi, thanks for the fix. Usually I write it to a file; I just used a MemoryStream so you don't have to wear out your SSD when testing it. I hope this comment doesn't reopen the issue now, I am not really familiar with GitHub.
