Pdfium's FPDFImageObj_GetRenderedBitmap() function fails if the image object's transformation matrix includes negative a and/or d values. #52

ajrcarey · 2022-10-01T18:48:12Z

Follow-on from #50. PdfPageImageObject::get_processed_image() uses Pdfium's FPDFImageObj_GetRenderedBitmap() function to return a bitmap, which is then converted into an image::DynamicImage. The call to FPDFImageObj_GetRenderedBitmap() fails if the image object's transformation matrix includes negative values for matrix variables a or d.

ajrcarey · 2022-10-01T18:48:44Z

https://groups.google.com/g/pdfium/c/V-H9LpuHpPY gave a tiny clue to the source of the problem. Seems to be an upstream issue; the question is, can pdfium-render implement a work-around?

ajrcarey · 2022-10-04T21:04:12Z

Attempted to work around by flipping transformation matrix a and d values on the fly if necessary before calling FPDFImageObj_GetRenderedBitmap(), then flipping them back after the call returns. This works, but the obvious question is: does flipping those values affect the rendered image?
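The flip-and-restore idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in pdfium-render: the `Matrix` struct, `with_positive_scale()`, and `render_with_flipped_matrix()` are hypothetical names, and the `render` closure stands in for the call to FPDFImageObj_GetRenderedBitmap(). Taking the absolute value of a and d discards any mirroring they encoded, which is exactly the open question about whether the rendered image is affected.

```rust
/// Hypothetical stand-in for the six values of a PDF transformation matrix.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Matrix {
    pub a: f32,
    pub b: f32,
    pub c: f32,
    pub d: f32,
    pub e: f32,
    pub f: f32,
}

impl Matrix {
    /// Returns a copy in which `a` and `d` are non-negative.
    /// Any mirroring expressed by a negative `a` or `d` is lost.
    pub fn with_positive_scale(self) -> Matrix {
        Matrix {
            a: self.a.abs(),
            d: self.d.abs(),
            ..self
        }
    }
}

/// Runs `render` while `matrix` temporarily holds non-negative `a` and `d`,
/// then restores the original values whether or not `render` succeeded.
/// `render` is a placeholder for the real Pdfium call (an assumption).
pub fn render_with_flipped_matrix<T, E>(
    matrix: &mut Matrix,
    render: impl FnOnce(&Matrix) -> Result<T, E>,
) -> Result<T, E> {
    let original = *matrix;
    *matrix = original.with_positive_scale();
    let result = render(&*matrix);
    *matrix = original;
    result
}
```

Restoring the matrix before returning the result means a failed render does not leave the image object in its modified state.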

Additionally, https://github.com/Victor-N-Suadicani reported in #50 (comment) that the returned image size is often smaller than expected. Since the returned image size is determined as part of FPDFImageObj_GetRenderedBitmap(), this is again an upstream issue. However, it may be possible to use the width and height reported by FPDFImageObj_GetImageMetadata() in conjunction with FPDFImageObj_GetImageDataDecoded() to return a usable image at the expected dimensions. Experimenting with this. At the moment, I can retrieve a byte buffer of the expected size that (presumably) contains the image data, but image::RgbaImage::from_raw() will not accept it as valid data.

ajrcarey · 2022-10-05T18:17:56Z

image::RgbaImage::from_raw() does not accept the output from FPDFImageObj_GetImageDataDecoded() (at least in the case of images in image-test.pdf) because the data returned from Pdfium is expressed in 24 bits per pixel, but image::RgbaImage expects 32 bits per pixel. The buffers provided by Pdfium are therefore smaller than RgbaImage expects.
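The size mismatch can be checked with simple arithmetic. The helpers below are hypothetical and assume tightly packed rows with no padding: a 24 bpp buffer holds width × height × 3 bytes, whereas a 32 bpp RGBA buffer needs width × height × 4, so the Pdfium buffer is too small for from_raw(), which returns None in that case.

```rust
/// Bytes needed for a tightly packed pixel buffer.
/// Assumes rows carry no padding, which may not hold for every source.
pub fn packed_len(width: u32, height: u32, bytes_per_pixel: usize) -> usize {
    width as usize * height as usize * bytes_per_pixel
}

/// Infers the bytes per pixel of a tightly packed buffer from its length.
/// Returns `None` if the image is empty or the length is not a whole
/// multiple of the pixel count.
pub fn infer_bytes_per_pixel(width: u32, height: u32, buffer_len: usize) -> Option<usize> {
    let pixels = width as usize * height as usize;
    if pixels == 0 || buffer_len % pixels != 0 {
        return None;
    }
    Some(buffer_len / pixels)
}
```

For a 4 × 2 image, `packed_len(4, 2, 3)` is 24 bytes while `packed_len(4, 2, 4)` is 32 bytes, so a check like `infer_bytes_per_pixel()` could be used to detect a three-channel buffer before attempting to build an RgbaImage from it.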

ajrcarey · 2022-10-06T19:07:15Z

The image data returned from FPDFImageObj_GetImageDataDecoded() being 24 bpp, i.e. three channels, is a little confusing, given that Pdfium asserts (via the image object's metadata) that the image format is BGRA, i.e. 32 bpp in four channels. Hmm. Anyway, running the buffer returned from FPDFImageObj_GetImageDataDecoded() through PdfiumLibraryBindings::bgr_to_rgba() creates a new buffer that image::RgbaImage::from_raw() accepts.
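For illustration, a BGR-to-RGBA conversion of this kind could look like the sketch below. It is a standalone, hypothetical function (`bgr_to_rgba_sketch`), not the actual body of PdfiumLibraryBindings::bgr_to_rgba(); it assumes tightly packed 3-byte pixels and fills in a fully opaque alpha channel.

```rust
/// Converts tightly packed BGR pixels (3 bytes each) into RGBA pixels
/// (4 bytes each), swapping the red and blue channels and setting alpha
/// to fully opaque. Trailing bytes that do not form a whole pixel are
/// ignored.
pub fn bgr_to_rgba_sketch(bgr: &[u8]) -> Vec<u8> {
    let mut rgba = Vec::with_capacity(bgr.len() / 3 * 4);
    for px in bgr.chunks_exact(3) {
        // px is [blue, green, red]; emit [red, green, blue, alpha].
        rgba.extend_from_slice(&[px[2], px[1], px[0], 255]);
    }
    rgba
}
```

The output buffer is 4/3 the size of the input, which matches the width × height × 4 length that a 32 bpp RGBA image requires.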

The resulting image appears to be sized correctly, which is an improvement over FPDFImageObj_GetRenderedBitmap(). It's also colored correctly, as far as I can tell; this suggests that the internal image representation is BGR, irrespective of what the image object's metadata claims. The image does not take any other transforms into account, however. So it is not usable as a replacement for FPDFImageObj_GetRenderedBitmap(), since the result of calling that function must take all transformations and image filters into account.

The next thing to try would be dynamically adjusting the image object's transformation matrix immediately prior to calling FPDFImageObj_GetRenderedBitmap() to try to upscale the image.

ajrcarey · 2022-10-09T13:38:07Z

Implemented dynamic adjustment of the image object's transformation matrix to coerce Pdfium into returning a bitmap closer in size to that suggested by the image object's metadata. It's slower, since we have to generate the bitmap twice - once to measure its actual dimensions, then a second time at the transformed scale - but it appears to otherwise work well.
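The two-pass approach can be outlined as: render once, compare the bitmap's dimensions with those in the metadata, derive scale factors, apply them to the matrix, and render again. The sketch below covers only the arithmetic. Both functions are hypothetical, and scaling all six matrix values in output space is an assumption about how the adjustment might be done, not a description of the committed code.

```rust
/// Computes the horizontal and vertical factors needed to grow a bitmap of
/// size `rendered` (width, height) to size `expected` (width, height).
/// Returns `None` if the first render produced an empty bitmap.
pub fn scale_factors(rendered: (u32, u32), expected: (u32, u32)) -> Option<(f32, f32)> {
    if rendered.0 == 0 || rendered.1 == 0 {
        return None;
    }
    Some((
        expected.0 as f32 / rendered.0 as f32,
        expected.1 as f32 / rendered.1 as f32,
    ))
}

/// Scales a PDF transformation matrix `[a, b, c, d, e, f]` in output space,
/// using the convention x' = a*x + c*y + e and y' = b*x + d*y + f.
/// Values contributing to x' are multiplied by `sx`, those contributing
/// to y' by `sy`.
pub fn scale_matrix(m: [f32; 6], sx: f32, sy: f32) -> [f32; 6] {
    [m[0] * sx, m[1] * sy, m[2] * sx, m[3] * sy, m[4] * sx, m[5] * sy]
}
```

The second render is where the extra cost comes from: the first bitmap is generated only to be measured and is then discarded.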

ajrcarey · 2022-10-09T13:55:15Z

Updated README.md. Bumped crate version to 0.7.22. Waiting for any comments from https://github.com/Victor-N-Suadicani before publishing to crates.io.

ajrcarey · 2022-10-13T14:14:51Z

No further comments received. Published to crates.io as version 0.7.22.

ajrcarey mentioned this issue Oct 1, 2022

get_processed_image gives reversed byte order #50

Closed

ajrcarey pushed a commit that referenced this issue Oct 2, 2022

Progressing #52

51996c1

ajrcarey self-assigned this Oct 7, 2022

ajrcarey pushed a commit that referenced this issue Oct 9, 2022

Progressing #52

9ccde8a

ajrcarey pushed a commit that referenced this issue Oct 9, 2022

Progressing #52

1dc627d

ajrcarey closed this as completed in 75e26f6 Oct 13, 2022
