Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zstd: Another data corruption on SpeedBestCompression level #922

Closed
MichaelEischer opened this issue Feb 2, 2024 · 7 comments · Fixed by #923
Closed

zstd: Another data corruption on SpeedBestCompression level #922

MichaelEischer opened this issue Feb 2, 2024 · 7 comments · Fixed by #923

Comments

@MichaelEischer
Copy link

After a bug report at restic/restic#4677, we've identified that it is caused by the zstd library at the SpeedBestCompression level. Compressing and decompressing it afterwards results in corrupted data. There is no data corruption at the default compression level. I've tested versions v1.17.4 (via restic using Go 1.21.6) and v1.17.5 (using Go 1.21.6).

minimized.txt (Minimized using the script from #875)

Sorry for only bringing bad news, and that just before the weekend 😞 .

@MichaelEischer
Copy link
Author

Apparently, the data corruption was introduced between v1.17.2 and v1.17.3.

@klauspost
Copy link
Owner

Thanks! I will investigate ASAP

@klauspost
Copy link
Owner

(got it replicated, but you are catching me at an unfortunate time and I don't have time to work on a fix until Sunday/Monday)

klauspost added a commit that referenced this issue Feb 5, 2024
When extending back repeats we may encounter the [`literals_length = 0`](https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#repeat-offsets]) exception,  which changes the meaning of offsets. This means the repeat matches the wrong offset.

There is little gain extending repeats back anyway, so just avoid it.

Fixes #922
@klauspost
Copy link
Owner

@MichaelEischer I found the issue and have a fix in #923

This is a recoverable encoding error for once, so if you or anyone else have critical data compressed with this I will be able to detect and correct this issue. If you have such data feel free to attach it here. If there is enough I can build a standalone tool to read and decompress it.

klauspost added a commit that referenced this issue Feb 5, 2024
When extending back repeats we may encounter the [`literals_length = 0`](https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#repeat-offsets]) exception,  which changes the meaning of offsets. This means the repeat matches the wrong offset.

There is little gain extending repeats back anyway, so just avoid it.

Fixes #922
@klauspost
Copy link
Owner

I will release when I've investigated #920

@klauspost
Copy link
Owner

New release out.

@MichaelEischer
Copy link
Author

Thanks for the fix :-) !

ItalyPaleAle added a commit to ItalyPaleAle/dapr-components-contrib that referenced this issue Feb 24, 2024
Fixes 2 possible data corruption issues. See:

- klauspost/compress#875
- klauspost/compress#922

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
ItalyPaleAle added a commit to ItalyPaleAle/dapr-components-contrib that referenced this issue Feb 24, 2024
Fixes 2 possible data corruption issues. See:

- klauspost/compress#875
- klauspost/compress#922

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants