save_image_file should set DPI for derived images #343
Comments
Why can't we check whether such information is available? Copy it, if that's the case and delete the default value if not? We should definitely not write default values which we know to be wrong (I added the label |
We could, but as I said: usually it is not. Or should I say: almost always. Any non-mutating PIL.Image operation (even those that wouldn't affect the meta-data) omits
Okay, but that's probably a bug in Pillow's |
Oh, just found that this appears to have been fixed in Pillow 6.2.1. Since we likely cannot do any better, I will close for now.
Should we update to 6.2.1 then? Do you have a sample for me to test? Thanks!
Sorry, I was imprecise: it could have been fixed in an earlier version already. I saw the following with 5.4.1 (leading up to this issue):
python -c "import PIL.Image; PIL.Image.open('repo/data/assets/scribo-test/data/OCR-D-IMG/OCR-D-IMG-orig_tiff.tif').save('test.png')"
identify -verbose test.png | grep Resolution:
Resolution: 72x72
However, this does not happen any more with 6.2.1 (where correctly no DPI is saved).
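For anyone who wants to repeat this check without ImageMagick, here is a minimal sketch in Python, assuming Pillow is installed. The helper name saved_png_dpi is made up for illustration and is not part of core or Pillow. It writes a PNG into memory, reopens it, and reports whatever DPI the file actually carries (rounded, since PNG stores resolution in pixels per metre).

```python
from io import BytesIO

from PIL import Image


def saved_png_dpi(image, **save_params):
    """Save `image` as PNG in memory and return the DPI found on re-reading.

    Hypothetical helper for illustration. Returns None when the written
    file carries no resolution information at all.
    """
    buffer = BytesIO()
    image.save(buffer, format="PNG", **save_params)
    buffer.seek(0)
    with Image.open(buffer) as reloaded:
        dpi = reloaded.info.get("dpi")
    if dpi is None:
        return None
    # PNG stores pixels per metre, so the value read back may be fractional.
    return tuple(round(value) for value in dpi)


blank = Image.new("L", (10, 10))
print(saved_png_dpi(blank))                   # no DPI passed to save()
print(saved_png_dpi(blank, dpi=(300, 300)))   # DPI passed explicitly
```

On a Pillow version that behaves like the 6.2.1 result described above, the first call should report no DPI, and the second should report the value that was passed in.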
@wrznr just convinced me that we should indeed take action to ensure core and modules comply with the spec (which requires PPI information to be kept for derived images, cf. OCR-D/spec#137).
Not so sure about the time frame for this though. Since it involves patching all modules, I guess the final workshop is out of the question. Setting |
If we just assume the derived image passed to |
Alternatively, we could inject DPI info into the coords dict (along with affine transform and image features) at the top |
Currently, any information on image resolution provided in the original image (and made available via OcrdExif in Workspace.image_from_page) is ignored when saving derived images in the workspace (via Workspace.save_image_file). Due to PIL.Image format internals, however, the PNG then contains a setting of 72 DPI. This might create problems for processors that look at the derived image files alone.
But this is hard to fix in core: the image passed to save_image_file could come from anywhere (and usually does not have an info['dpi']; even simple PIL.Image operations omit that in the result).
Realistically though, it will have been created some way from the source image file under the same pageId, and since rescaling is currently not permitted in the spec, one could assume the same DPI for all derived images.
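To make the "assume the same DPI as the source" idea concrete, here is a rough sketch, not a proposal for the actual save_image_file signature. The names pick_dpi and save_derived are hypothetical, and both arguments are assumed to be PIL.Image objects (anything with an info dict and a save method would do). The derived image's own info['dpi'] wins if present; otherwise the source image's value is used; if neither is known, no DPI is passed at all, so no default gets written.

```python
def pick_dpi(derived_info, source_info):
    """Return the DPI to write for a derived image, or None if unknown.

    Hypothetical helper: prefers the derived image's own value and falls
    back to the source image's value (assumes no rescaling happened).
    """
    for info in (derived_info, source_info):
        dpi = info.get("dpi")
        # Treat missing or non-positive values as "unknown".
        if dpi and all(value > 0 for value in dpi):
            return tuple(dpi)
    return None


def save_derived(derived, source, target, fmt="PNG"):
    """Save `derived`, carrying over the DPI of `source` when needed.

    `derived` and `source` are assumed to be PIL.Image objects.
    Returns the DPI that was passed to save(), or None.
    """
    dpi = pick_dpi(derived.info, source.info)
    # Pass `dpi` only when it is known, so no default value is written.
    kwargs = {"dpi": dpi} if dpi else {}
    derived.save(target, format=fmt, **kwargs)
    return dpi
```

In this sketch the caller has to supply the source image, which sidesteps the problem that the image passed in "could come from anywhere". Whether core should look the source up itself (for example via the pageId) is left open here.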