Pdfium's FPDFImageObj_GetRenderedBitmap() function fails if the image object's transformation matrix includes negative a and/or d values. #52

Closed
ajrcarey opened this issue Oct 1, 2022 · 7 comments

@ajrcarey
Owner

ajrcarey commented Oct 1, 2022

Follow-on from #50. PdfPageImageObject::get_processed_image() uses Pdfium's FPDFImageObj_GetRenderedBitmap() function to return a bitmap, which is then converted into an image::DynamicImage. The call to FPDFImageObj_GetRenderedBitmap() fails if the image object's transformation matrix includes negative values for matrix variables a or d.

@ajrcarey
Owner Author

ajrcarey commented Oct 1, 2022

https://groups.google.com/g/pdfium/c/V-H9LpuHpPY gave a tiny clue to the source of the problem. Seems to be an upstream issue; the question is, can pdfium-render implement a work-around?

@ajrcarey
Owner Author

ajrcarey commented Oct 4, 2022

Attempted to work around by flipping transformation matrix a and d values on the fly if necessary before calling FPDFImageObj_GetRenderedBitmap(), then flipping them back after the call returns. This works, but the obvious question is: does flipping those values affect the rendered image?
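A minimal sketch of the flip-and-restore idea described above, not the actual pdfium-render code. The `Matrix` struct and the `with_positive_scale` helper are hypothetical names introduced only for illustration, and the `render` closure stands in for the call to FPDFImageObj_GetRenderedBitmap(). The sketch shows only the bookkeeping; it does not answer whether the rendered output changes.

```rust
/// Hypothetical stand-in for a PDF transformation matrix [a b c d e f].
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Matrix {
    pub a: f32,
    pub b: f32,
    pub c: f32,
    pub d: f32,
    pub e: f32,
    pub f: f32,
}

/// Runs `render` while `a` and `d` are temporarily forced non-negative,
/// then restores the original matrix before returning the result.
/// `render` is a placeholder for the real bitmap-rendering call.
pub fn with_positive_scale<T>(
    matrix: &mut Matrix,
    render: impl FnOnce(&Matrix) -> T,
) -> T {
    let original = *matrix; // remember the untouched values
    matrix.a = matrix.a.abs();
    matrix.d = matrix.d.abs();
    let result = render(&*matrix);
    *matrix = original; // flip back after the call returns
    result
}
```

Passing the render step as a closure keeps the restore on the same code path as the flip, so the matrix is put back whenever the closure returns normally.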

Additionally, https://github.com/Victor-N-Suadicani reported in #50 (comment) that the returned image size is often smaller than expected. Since the returned image size is determined as part of FPDFImageObj_GetRenderedBitmap(), this is again an upstream issue. However, it may be possible to use the width and height reported by FPDFImageObj_GetImageMetadata() in conjunction with FPDFImageObj_GetImageDataDecoded() to return a usable image at the expected dimensions. Experimenting with this. At the moment, I can retrieve a byte buffer of the expected size that (presumably) contains the image data, but image::RgbaImage::from_raw() will not accept it as valid data.

@ajrcarey
Owner Author

ajrcarey commented Oct 5, 2022

image::RgbaImage::from_raw() does not accept the output from FPDFImageObj_GetImageDataDecoded() (at least in the case of images in image-test.pdf) because the data returned from Pdfium is expressed in 24 bits per pixel, but image::RgbaImage expects 32 bits per pixel. The buffers provided by Pdfium are therefore smaller than RgbaImage expects.
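The size mismatch can be illustrated with some simple arithmetic. This is a sketch that assumes tightly packed rows with no padding, which may not hold for every buffer Pdfium returns; `packed_len` and `fits_rgba` are hypothetical helper names, not part of pdfium-render or the image crate.

```rust
/// Bytes needed for a tightly packed image (assumption: no row padding).
pub fn packed_len(width: u32, height: u32, bytes_per_pixel: usize) -> usize {
    width as usize * height as usize * bytes_per_pixel
}

/// True if a buffer of `len` bytes is large enough to hold a
/// 4-bytes-per-pixel (RGBA) image of the given dimensions.
pub fn fits_rgba(len: usize, width: u32, height: u32) -> bool {
    len >= packed_len(width, height, 4)
}
```

For a 100 x 50 image, a 24 bpp buffer holds 15,000 bytes while a 32 bpp image needs 20,000, so the check fails in the same way the from_raw() call is rejected.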

@ajrcarey
Owner Author

ajrcarey commented Oct 6, 2022

The image data returned from FPDFImageObj_GetImageDataDecoded() being 24 bpp, i.e. three channels, is a little confusing, given that Pdfium asserts (via the image object's metadata) that the image format is BGRA, i.e. 32 bpp in four channels. Hmm. Anyway, running the buffer returned from FPDFImageObj_GetImageDataDecoded() through PdfiumLibraryBindings::bgr_to_rgba() creates a new buffer that image::RgbaImage::from_raw() accepts.
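For readers unfamiliar with this kind of conversion, here is a rough sketch of what a BGR to RGBA expansion could look like. It is not the body of PdfiumLibraryBindings::bgr_to_rgba(); the function name `bgr_to_rgba_sketch` is hypothetical, and the sketch assumes tightly packed 3-byte pixels and a fully opaque alpha channel.

```rust
/// Expands tightly packed BGR bytes into RGBA bytes.
/// Assumptions: 3 bytes per input pixel, no row padding, alpha set to 255.
/// Any trailing bytes that do not form a whole pixel are ignored.
pub fn bgr_to_rgba_sketch(bgr: &[u8]) -> Vec<u8> {
    let mut rgba = Vec::with_capacity(bgr.len() / 3 * 4);
    for px in bgr.chunks_exact(3) {
        // Swap the blue and red channels and append an opaque alpha byte.
        rgba.extend_from_slice(&[px[2], px[1], px[0], 255]);
    }
    rgba
}
```

The output is four thirds the length of the input, which is the buffer size a 32 bpp image container expects for the same dimensions.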

The resulting image appears to be sized correctly, which is an improvement over FPDFImageObj_GetRenderedBitmap(). It's also colored correctly, as far as I can tell; this suggests that the internal image representation is BGR, irrespective of what the image object's metadata claims. However, the image does not take any other transforms into account, so it is not usable as a replacement for FPDFImageObj_GetRenderedBitmap(), since the result of calling that function must take all transformations and image filters into account.

The next thing to try would be dynamically adjusting the image object's transformation matrix immediately prior to calling FPDFImageObj_GetRenderedBitmap() to try to upscale the image.

@ajrcarey ajrcarey self-assigned this Oct 7, 2022
@ajrcarey
Owner Author

ajrcarey commented Oct 9, 2022

Implemented dynamic adjustment of the image object's transformation matrix to coerce Pdfium into returning a bitmap closer in size to that suggested by the image object's metadata. It's slower, since we have to generate the bitmap twice - once to measure its actual dimensions, then a second time at the transformed scale - but it appears to otherwise work well.
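The two-pass approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed implementation: `correction_factors` and `scale_matrix` are hypothetical names, the matrix is represented as a plain `[a, b, c, d, e, f]` array, and scaling only the linear part while leaving the translation alone is one possible choice among several.

```rust
/// Ratio between the size the metadata reports and the size actually
/// rendered in the first pass. Returns None for a zero-sized bitmap.
pub fn correction_factors(
    rendered: (u32, u32),
    expected: (u32, u32),
) -> Option<(f32, f32)> {
    if rendered.0 == 0 || rendered.1 == 0 {
        return None;
    }
    Some((
        expected.0 as f32 / rendered.0 as f32,
        expected.1 as f32 / rendered.1 as f32,
    ))
}

/// Scales the linear part of a matrix [a, b, c, d, e, f] by (sx, sy),
/// leaving the translation (e, f) unchanged. Assumption: this is only
/// one way to apply the correction before the second render pass.
pub fn scale_matrix(m: [f32; 6], sx: f32, sy: f32) -> [f32; 6] {
    [m[0] * sx, m[1] * sy, m[2] * sx, m[3] * sy, m[4], m[5]]
}
```

In this sketch the first render supplies `rendered`, the image metadata supplies `expected`, and the second render would use the matrix returned by `scale_matrix`.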

ajrcarey pushed a commit that referenced this issue Oct 9, 2022
@ajrcarey
Owner Author

ajrcarey commented Oct 9, 2022

Updated README.md. Bumped crate version to 0.7.22. Waiting for any comments from https://github.com/Victor-N-Suadicani before publishing to crates.io.

ajrcarey pushed a commit that referenced this issue Oct 9, 2022
@ajrcarey
Owner Author

No further comments received. Published to crates.io as version 0.7.22.
