Seemingly valid lzma files result in "Corrupted range coding" #15

Closed
dragly opened this issue Dec 9, 2019 · 0 comments · Fixed by #16

dragly commented Dec 9, 2019

Processing some files results in an

LZMAError("Corrupted range coding")

even though the file is decoded fine by unlzma from XZ utils.

It seems like the assumption that self.code should not equal self.range in RangeDecoder::get_bit might be wrong:

if self.code == self.range {

I have attached a file that reproduces the issue:

bad-random-data.tar.gz (ironically I had to compress it as a .tar.gz to be allowed to upload it to GitHub 🙂 )

You can verify that XZ Utils decodes it successfully using

unlzma -k -f bad-random-data.lzma

I will create a PR with a suggested fix that simply drops the error and changes the definition of bit in that function.
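
To make the boundary case concrete, here is a minimal sketch of direct-bit decoding in a range decoder. It is not the lzma-rs source: the `DirectBitDecoder` type, its fields, and the `Option`-based error handling are hypothetical names chosen for illustration. It assumes that "changing the definition of bit" means a code exactly equal to the halved range counts as the upper half (bit 1). As far as I understand the LZMA reference decoder, that is how it treats direct bits.

```rust
/// Hypothetical, simplified range decoder state (not the lzma-rs type).
struct DirectBitDecoder<'a> {
    range: u32,
    code: u32,
    input: &'a [u8],
}

impl<'a> DirectBitDecoder<'a> {
    /// Decode one equiprobable ("direct") bit.
    /// Returns `None` if the input runs out during normalization.
    fn get_bit(&mut self) -> Option<bool> {
        // Halve the range: the lower half encodes 0, the upper half encodes 1.
        self.range >>= 1;

        // Assumption: `>=` rather than `>`, so a code sitting exactly on the
        // boundary is decoded as 1 instead of being reported as corruption.
        let bit = self.code >= self.range;
        if bit {
            self.code -= self.range;
        }

        self.normalize()?;
        Some(bit)
    }

    /// Shift in one input byte once the range drops below 2^24.
    fn normalize(&mut self) -> Option<()> {
        if self.range < 0x0100_0000 {
            let (&byte, rest) = self.input.split_first()?;
            self.input = rest;
            self.range <<= 8;
            self.code = (self.code << 8) | u32::from(byte);
        }
        Some(())
    }
}
```

With `range = 0x8000_0000` and `code = 0x4000_0000`, the halved range equals the code, so this sketch returns bit 1 and leaves `code` at 0. Under a strict `>` comparison the same state would yield bit 0 with `code == range`, which is the condition the error check trips on.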

Also, thanks for creating this library!

dragly added a commit to dragly/lzma-rs that referenced this issue Dec 9, 2019
It appears that this is not an invalid state and that some files
can have data encoded like this. See gendx#15 for an example of such a file.

Fixes gendx#15
@gendx gendx added the bug label Dec 10, 2019
bors bot added a commit that referenced this issue Dec 16, 2019
16: Do not raise error if code equals range in get_bit r=gendx a=dragly

It appears that code being equal to range is a valid state and that some files can have 
data encoded like this. An example of such a file has been added to the
tests.

### Pull Request Overview

This pull request fixes #15 

### Testing Strategy

This pull request was tested by...

- [ ] Added relevant unit tests.
- [x] Added relevant end-to-end tests (such as `.lzma`, `.lzma2`, `.xz` files).


### Supporting Documentation and References

The original data was produced by processing a randomly generated geometry with OpenCTM. OpenCTM files can contain embedded LZMA compressed data. This appears to be generated with the liblzma library. The data was then extracted to make a standalone file that is attached in this pull request.

### TODO or Help Wanted

None


Co-authored-by: Svenn-Arne Dragly <dragly@cognite.com>
Co-authored-by: gendx <gendx@users.noreply.github.com>
Co-authored-by: Svenn-Arne Dragly <s@dragly.com>
@bors bors bot closed this as completed in 30b4b89 Dec 16, 2019