Add extra sanity checks in bmp and ico decoders #632

Merged: 8 commits, Jul 30, 2017
Conversation

oyvindln (Contributor) commented Apr 2, 2017

This PR adds some extra sanity checks to the BMP and ICO decoders as requested in #622, #623 and #624. E.g. images with an x and/or y size higher than 65535 are rejected, which is what the decoder in Firefox does, and the temporary palette buffer is limited to 256 entries (palette indices above 255 can't be referenced by 8-bit values anyhow). Additionally, for the BMP decoder, the initial image buffer is limited to around 100 MB and only expanded to the full size if needed, so that images with a bogus size won't cause memory-usage issues as easily.
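Roughly, the kind of checks described above could look like the following minimal sketch; the names and constants are illustrative, not the PR's actual code:

```rust
// Illustrative limits; the actual values live in the decoder.
const MAX_DIMENSION: u32 = 65_535; // matches the Firefox BMP limit mentioned above
const MAX_PALETTE_ENTRIES: usize = 256; // 8-bit indices can't address more entries
const MAX_INITIAL_BUFFER: usize = 100 * 1024 * 1024; // ~100 MB starting cap

fn check_dimensions(width: u32, height: u32) -> Result<(), String> {
    if width > MAX_DIMENSION || height > MAX_DIMENSION {
        return Err(format!("dimensions {}x{} exceed the sanity limit", width, height));
    }
    Ok(())
}

fn initial_buffer_len(width: u32, height: u32, bytes_per_pixel: u32) -> usize {
    // Allocate at most the cap up front; the buffer is only grown towards the
    // full size if the file actually contains that much pixel data.
    let full = (width as usize)
        .saturating_mul(height as usize)
        .saturating_mul(bytes_per_pixel as usize);
    full.min(MAX_INITIAL_BUFFER)
}
```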

The changes do seem to have slightly degraded the performance of BMPs using bitfields (which are primarily 16-bit images), but on the flip side, a small change I made makes decoding standard 24/32-bit BMPs (which I think is the most common type) a bit faster.

This PR also alters the decoding of RLE-encoded BMPs with delta, end-of-row, and end-of-file codes so that the colour value of the "skipped" pixels is set to [0, 0, 0(, 0)], as I had to alter some things there anyhow. Before, the pixels ended up white ([FF, FF, FF(, FF)]) since that was the default value for the bytes in the image buffer, which didn't seem to be the correct behaviour. Now the behaviour is similar to what the BMP decoder used in Windows seems to do. There doesn't seem to be a clear definition of how to handle the skipped pixels, so I figured that following what Microsoft themselves do would be the best option, and I added a test that checks this works correctly using two example images from the BMP suite (other images from there are already used in the project for other tests, and from what I can gather they are public domain). The behaviour does slightly differ from Firefox and Chrome for these types of files, which seem to make the pixels transparent, while in image the output buffer won't have an alpha channel except when the BMP is embedded in an ICO.
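As a rough illustration of that skipped-pixel behaviour (a hypothetical helper, not the decoder's actual code), the pixels jumped over by a delta, end-of-row, or end-of-file code are zeroed instead of being left at the buffer's default value:

```rust
/// Zero out the `count` pixels skipped by a delta, end-of-row, or end-of-file
/// code, so they come out as [0, 0, 0] (or [0, 0, 0, 0] with an alpha channel)
/// rather than whatever the buffer was initialised with.
fn fill_skipped_pixels(row: &mut [u8], start_pixel: usize, count: usize, channels: usize) {
    let start = (start_pixel * channels).min(row.len());
    let end = ((start_pixel + count) * channels).min(row.len());
    for byte in &mut row[start..end] {
        *byte = 0;
    }
}
```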

Lastly, this PR adds two use statements to fix compilation of the benchmarks, as I wanted to make sure I didn't significantly negatively impact performance.

…inor optimization.

This commit adds some size limits to the decoder, limits the palette buffer size, and limits the starting pixel buffer size to avoid OOM issues.
Additionally, this adds a small optimisation to read_full_byte_pixel_data by reducing the number of read calls, which improves the performance of decoding images with plain 8-bit r/g/b/a values.
This commit also fixes the compilation of the benchmarks, which was broken.
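The read-call reduction could look roughly like this minimal sketch, assuming pixel data is read one row at a time; real BMP rows are padded to 4-byte boundaries, which is ignored here, and the names are illustrative:

```rust
use std::io::{self, Read};

/// One `read_exact` call per row instead of one small read per pixel; the
/// pixel values are then sliced out of the in-memory row buffer, which is
/// much cheaper than going through the reader for every pixel.
/// (Row padding to 4-byte boundaries is ignored here for brevity.)
fn decode_rows<R: Read>(
    reader: &mut R,
    width: usize,
    height: usize,
    bytes_per_pixel: usize,
) -> io::Result<Vec<u8>> {
    let mut row_buf = vec![0u8; width * bytes_per_pixel];
    let mut pixels = Vec::with_capacity(width * height * bytes_per_pixel);
    for _ in 0..height {
        reader.read_exact(&mut row_buf)?;
        pixels.extend_from_slice(&row_buf);
    }
    Ok(pixels)
}
```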
oyvindln (Contributor, Author) commented Apr 2, 2017

Seems I broke the truncate tests for BMP and ICO; I didn't notice as they are ignored by default, so I need to fix that before any merging.

EDIT: Should be fixed in 487be3e

philipc (Contributor) commented Apr 5, 2017

Adding more sanity checks is always good.

However, I'm not sure about the changes to allocate a smaller initial buffer. After these changes, is it still possible to craft a malformed BMP that will cause an out of memory error or use large amounts of memory? For example, is it possible to craft a RLE BMP with large dimensions but small file size using an end-of-file marker? If so, this does not fix the denial of service attack.

philipc (Contributor) commented on this part of the diff:

try!(self.r.seek(SeekFrom::Start(self.data_offset)));
for row in self.rows(&mut pixel_data) {
try!(reader.seek(SeekFrom::Start(self.data_offset)));
//for row in self.rows(&mut pixel_data) {

Delete this.

oyvindln (Contributor, Author): Will fix this in the next commit.

oyvindln (Contributor, Author) commented Apr 5, 2017

However, I'm not sure about the changes to allocate a smaller initial buffer. After these changes, is it still possible to craft a malformed BMP that will cause an out of memory error or use large amounts of memory?

The limited starting buffer makes this less severe, as an image will have to actually contain more than 100 MB of data to allocate that much memory, and the decoder will now reject BMPs with x or y values larger than 2^15 and limits the temporary palette buffer to 256 entries (before, it wasn't limited). Though it's by no means a full solution to the issue; this PR is merely a slight improvement on what's currently there.

For example, is it possible to craft a RLE BMP with large dimensions but small file size using an end-of-file marker?

Good catch, that's not something I had thought of, and RLE-encoded images present an additional problem as it's difficult to estimate their size. The BMP header does have a size field, but it's valid for it to be 0 (meaning it's not set), so it's not something that can be relied on. The documentation doesn't seem to specify what should happen if an EOF marker is hit prematurely and doesn't match up with the width/height values; maybe the file should simply end at the end of the row containing the EOF marker. I will have a look at how other decoders deal with this and see if I can improve on it a bit in this PR.

Ideally we would supply the decoder with the actual file size and/or some maximum size, like many other decoders do (see the comments in #622), and reject files where the dimensions and sizes don't match up. However, that would require an API change (or alternatively specialization for some types of readers, but that's still unstable and not an ideal solution), which should probably be planned out before actually being implemented, so that is probably better left for a later PR.

A general issue with BMP images is that they are stored upside-down, so incrementally growing the buffer is a bit slow and clunky; we might have to make some trade-offs here depending on the use case. For a web browser it's good to be conservative, as we wouldn't expect websites to have large BMPs anyhow, while for an image editor, loading a large BMP might actually be something that would be expected to work.

oyvindln (Contributor, Author) commented Apr 18, 2017

Sorry for taking a bit of time to reply. It seems that the behaviour varies a bit. I tested one of the test images that had RLE codes, and modified the size attribute in the header to be very large (12543x23295).

  • Firefox (Ubuntu 16.04 x86-64) loaded the image, but not at full scale; clicking zoom in shows the downscaled image scaled up.
  • Chrome loaded the whole thing, though it seems to be able to somehow keep only part of the image loaded, only using a lot of memory while loading or moving around, though moving around is very slow.
  • ImageMagick's display loads the whole thing into memory at once.

For Windows applications (ran Windows 7 32-bit in a VM with 3 GB of memory; I think fully loading the image I modified would require something like 2 GB):
  • Internet Explorer 11 doesn't load the image.
  • Paint says the file is unsupported or not valid.
  • The image viewer loads something (slowly), though the resulting image displayed is all black (missing the parts of the image where the pixels are supposed to be coloured).
  • Firefox on this system didn't load the image.

Not sure what the best approach to this is. I think the RLE skipping features are mostly used for icons with transparent pixels (which are limited to 256x256 pixels), so I'm not sure that loading these types of huge images with EOF RLE codes is very relevant anyhow.

philipc (Contributor) commented Apr 18, 2017

Not sure what the best approach to this is. I think the RLE skipping features are mostly used for icons with transparent pixels (which are limited to 256x256 pixels), so I'm not sure that loading these types of huge images with EOF RLE codes is very relevant anyhow.

I don't think it's something we need to support loading successfully. But we do need to ensure we don't run out of memory trying to load them.

My preference would be for the crate to add an API to specify the maximum image size (and support this for all decoders), but that's a decision for others to make. Then once that is added, the limited starting buffer that this PR implements isn't quite as necessary. Another option would be to implement something like #462.

oyvindln (Contributor, Author) commented May 2, 2017

Sorry for taking so long. I removed the commented-out code and added a very basic check for RLE files with an EoF code, so that a file won't extend the buffer from the initial size if an EoF code is encountered. I'm not sure there is an easy way to add anything more without having a continuously expanding buffer, which is a bit complicated since BMP files are stored upside-down. For now at least, pending any API adjustments, I think it would be advisable for users of the library to check the size of an image using GenericImage::dimensions() before fully loading it, for applications where malformed images could be a risk.
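Very roughly, the idea behind that check (hypothetical structure and field names, not the actual patch):

```rust
struct RleDecodeState {
    saw_eof_code: bool,
    buffer: Vec<u8>,
    full_size: usize, // width * height * channels according to the header
}

impl RleDecodeState {
    /// Grow towards the header's full size only while no end-of-file code has
    /// been hit; once an EoF code is seen, the buffer stays at its current
    /// (capped) size and the remaining pixels are treated as skipped.
    fn ensure_capacity(&mut self, needed: usize) {
        if !self.saw_eof_code && needed > self.buffer.len() {
            self.buffer.resize(needed.min(self.full_size), 0);
        }
    }
}
```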

EDIT: A simple change could be to add a load_with_max_size (don't know what to name it) function to the image loading API that does this automatically.
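Such a helper could, for instance, look something like the sketch below; the name load_with_max_size and the closures standing in for the decoder's internals are purely illustrative and not part of the crate's API:

```rust
/// Hypothetical wrapper in the spirit of the suggestion above: refuse to
/// decode the pixel data if the dimensions reported by the header would need
/// more than `max_bytes` of memory. The two closures stand in for whatever
/// the decoder actually exposes; they are placeholders, not crate functions.
fn load_with_max_size<D, P, E>(
    read_dimensions: D,
    decode_pixels: P,
    bytes_per_pixel: u64,
    max_bytes: u64,
) -> Result<Vec<u8>, String>
where
    D: FnOnce() -> Result<(u64, u64), E>,
    P: FnOnce() -> Result<Vec<u8>, E>,
    E: std::fmt::Debug,
{
    let (width, height) = read_dimensions().map_err(|e| format!("{:?}", e))?;
    if width.saturating_mul(height).saturating_mul(bytes_per_pixel) > max_bytes {
        return Err(format!("{}x{} exceeds the configured memory limit", width, height));
    }
    decode_pixels().map_err(|e| format!("{:?}", e))
}
```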

oyvindln (Contributor, Author) commented Jun 6, 2017

Ping @philipc
Anything else needed for this to be merged?

philipc (Contributor) commented Jun 6, 2017

I'd like @nwin to review this.

I still think that specifying maximum image sizes is a better approach, but that's not a decision for me to make. Otherwise this looks fine to me.

nwin (Contributor) commented Jul 24, 2017

Seems like I produced some conflicts, sorry for that. Can you please resolve them? Looks good otherwise; ready to be merged.

oyvindln (Contributor, Author): Yup, I'll try to sort out those conflicts.

oyvindln (Contributor, Author): @nwin I've updated it. Let me know if I need to squash it (if GitHub won't do it automatically).

bvssvni (Contributor) commented Jul 30, 2017

Merging.

bvssvni merged commit 962f027 into image-rs:master on Jul 30, 2017