tesseract process never finishes with specific gif image #3369
Comments
That GIF file is special: it includes 125 images. How should Tesseract handle animated GIF images? Create OCR for all images, or only for the first one, or refuse to process such files?
This is a gif animation. Convert it to static images and give them to
The static images work fine. Nevertheless handling of animated GIF images has to be well defined, see my question above.
OK, thanks for the advice. I should handle this on my side: check for a GIF, slice it, and analyze the frames.
Can you maybe tell me what tool you used to create the static images?
My answer is 'Create OCR for all images'
I used
It could be configurable through arguments, with the default being to do OCR for all images.
It should be something like
So the handling of animated GIF should be similar to multipage TIFF (which either processes all pages or a selected page as far as I remember). Maybe in a first step throwing an "unimplemented" error is easier. I am not sure how Leptonica supports animated GIF.
I was mistaken. Not all static images work fine. The first one which looks empty
We depend on Leptonica for image IO. Can it handle gif animation? @DanBloomberg What we need from Leptonica:
This way we can treat it like we treat multi-page tiff.
This image is the offender: a blank page with a specific color.
So Leptonica probably only sees the first image and returns it as pix.
Please attach the first image.
I've recorded the first N GBs of debug logs from the infinite loop.
Is it a tess-specific thing or a bug? Negative numbers in bbox
I now have run latest Tesseract production code on the original animated GIF image. The image is processed, and Tesseract returns a "result" for the first included image. This takes 4:26 minutes, so it finishes, but takes rather long for an image which looks empty to me but obviously includes lots of small colour variations (otherwise the PNG file would be much smaller).
@wix-andriusb, how long did you wait for "never finished"? Depending on your machine, it might take at least 4 minutes, but maybe also 20 minutes. Of course this can nevertheless be considered a bug.
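To tell a hang from a slow run when reproducing this, one option is to wrap the command in a timeout and record the elapsed time. A minimal Python sketch follows; `run_with_timeout` is a hypothetical helper, not part of Tesseract, and the `tesseract <image> stdout` command in the comment is only an example invocation with a made-up file name:

```python
import subprocess
import time


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return (returncode, elapsed_seconds).

    returncode is None if the process was killed after timeout_s seconds.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
        return proc.returncode, time.monotonic() - start
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return None, time.monotonic() - start


# Example (assumes tesseract is on PATH and frame_000.png exists):
# rc, elapsed = run_with_timeout(["tesseract", "frame_000.png", "stdout"], 30 * 60)
# print("timed out" if rc is None else f"exit {rc}", f"after {elapsed:.0f}s")
```

A generous limit (for example 30 minutes) separates "slow but finishes" from "still running" without having to watch the process by hand.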
The original image is 1080 x 1920, so those box coordinates definitely look strange, not only because they are negative, but also because the absolute x values exceed the image width.
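This kind of sanity check is easy to script when scanning a large log. The sketch below assumes boxes are given as (left, top, right, bottom) in pixel coordinates; `box_fits_image` is a hypothetical helper, and the sample boxes are made up for illustration, not taken from the log:

```python
def box_fits_image(box, width, height):
    """Return True if the box lies inside a width x height image."""
    left, top, right, bottom = box
    return 0 <= left <= right <= width and 0 <= top <= bottom <= height


IMAGE_SIZE = (1080, 1920)  # width, height of the original image

print(box_fits_image((10, 20, 200, 60), *IMAGE_SIZE))   # True
print(box_fits_image((-5, 20, 200, 60), *IMAGE_SIZE))   # False: negative x
print(box_fits_image((10, 20, 1500, 60), *IMAGE_SIZE))  # False: x exceeds width
```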
We can try to cut the image to, let's say, 50x50 and check it.
For 50x100 reduced image -
Full log of that loop -
Referring to Amit's comment, I attempted to implement writing of gif anim about 4 years ago, but failed. Never tried reading animated gif into a pixa.
And if someone shows me how to tell if a gif file is an animated gif, I'll use it in the gif reader to skip ("not supported") reading.
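One way to tell is to walk the GIF block structure and count image descriptors (blocks introduced by byte 0x2C); a file holding more than one image is treated as animated here, which is a simplification. A minimal sketch in Python rather than C, following the GIF89a layout; the function names are hypothetical and this is not Leptonica or giflib code:

```python
import io


def _skip_sub_blocks(stream):
    """Skip a chain of data sub-blocks, which ends with a zero-length block."""
    while True:
        size = stream.read(1)
        if not size or size[0] == 0:
            return
        stream.seek(size[0], io.SEEK_CUR)


def count_gif_frames(data):
    """Count the image descriptors in GIF data given as bytes."""
    stream = io.BytesIO(data)
    if stream.read(6) not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    screen = stream.read(7)  # logical screen descriptor
    if len(screen) < 7:
        raise ValueError("truncated GIF header")
    if screen[4] & 0x80:  # global color table present
        stream.seek(3 * (2 << (screen[4] & 0x07)), io.SEEK_CUR)
    frames = 0
    while True:
        block = stream.read(1)
        if not block or block == b";":  # end of data or trailer (0x3B)
            return frames
        if block == b",":  # image descriptor (0x2C)
            desc = stream.read(9)
            if len(desc) < 9:
                return frames
            frames += 1
            if desc[8] & 0x80:  # local color table present
                stream.seek(3 * (2 << (desc[8] & 0x07)), io.SEEK_CUR)
            stream.seek(1, io.SEEK_CUR)  # LZW minimum code size
            _skip_sub_blocks(stream)  # compressed image data
        elif block == b"!":  # extension introducer (0x21)
            stream.seek(1, io.SEEK_CUR)  # extension label
            _skip_sub_blocks(stream)
        else:
            raise ValueError("unexpected block type in GIF data")


def is_animated_gif(data):
    return count_gif_frames(data) > 1
```

No pixel data is decoded, so the check only costs a pass over block headers. A reader could stop at the second image descriptor if only a yes/no answer is needed.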
I don't think there is a high desire to have advanced OCR support for animated GIF files. That's a very special rare need. Obviously the first image in an animated GIF is already read and processed with the current code. Processing all images in a file can be done with a simple external conversion. So the animated GIF issue has very low priority for me. The huge time which is required to process an image without visible content is more important for me, as I expect that "normal" scans with text can suffer from extended processing time, too. And OCR processing time has high priority.
Both pixRasterop() and pixCountPixels() are optimized, so using them together -- first cropping the rectangle with rasterop and then counting the ON pixels -- is very efficient.
But is it possible to count pixels directly on the original pix?
Yes, of course, but it would be a bit complicated to do it efficiently. You are welcome to extend pixCountPixels() to take an arbitrary rectangle :-)
I'm thinking here also about 8bpp b/w image to speed up such calcs. The question is how will this increase overall memory consumption. Do we really need 1bpp in tess?
All this pixel counting is for 1 bpp.
@egorpugin, are you sure that It is possible to optimize the code and use only
On Windows I see that pixCountPixels is the slowest part.
Make sure that you are calling pixCountPixelsInRect() with tab8 defined as the 4th arg.
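To illustrate why a precomputed table helps: counting ON pixels in packed 1 bpp data boils down to a 256-entry popcount lookup per byte, and building that table once avoids redoing the work on every call. The Python below is only a sketch of the idea, not Leptonica's implementation; it assumes rows are `bytes` objects packed MSB-first, that the rectangle lies inside the image, and all names are hypothetical:

```python
def make_tab8():
    """256-entry table: number of ON bits in each possible byte value."""
    return [bin(value).count("1") for value in range(256)]


def count_on_pixels_in_rect(rows, x, y, w, h, tab8=None):
    """Count ON pixels in the rectangle (x, y, w, h) of a packed 1 bpp image."""
    if w <= 0 or h <= 0:
        return 0
    if tab8 is None:
        tab8 = make_tab8()  # rebuilt on every call unless the caller passes one
    x_end = x + w  # exclusive
    first_byte, last_byte = x // 8, (x_end - 1) // 8
    first_mask = 0xFF >> (x % 8)                       # drop bits left of x
    last_mask = (0xFF << (7 - (x_end - 1) % 8)) & 0xFF  # drop bits right of the rect
    total = 0
    for row in rows[y:y + h]:
        for i in range(first_byte, last_byte + 1):
            byte = row[i]
            if i == first_byte:
                byte &= first_mask
            if i == last_byte:
                byte &= last_mask
            total += tab8[byte]
    return total


tab8 = make_tab8()  # build once, reuse for many rectangles
rows = [bytes([0xFF, 0xFF]), bytes([0x0F, 0xF0]), bytes([0x00, 0x01])]
print(count_on_pixels_in_rect(rows, 0, 0, 16, 3, tab8))  # 25
print(count_on_pixels_in_rect(rows, 4, 1, 8, 1, tab8))   # 8
```

Counting directly inside the rectangle, as here, skips the intermediate cropped image; only the partial bytes at the left and right edges need masking.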
https://bentkus.eu/ocr_while_loop.png With the code from #3418, the processing ends after less than half a second when Sauvola binarization is used.
Environment
tesseract 4.1.1
reproduced on macosx and linux
Current Behavior:
running tesseract in command line on this image https://bentkus.eu/ocr_while_true.gif does not finish after 1h
Expected Behavior:
process should finish in 2 minutes
Suggested Fix:
I'll try to build and see why it never stops
upd. (by @egorpugin):
test png - https://bentkus.eu/ocr_while_loop.png