-
Notifications
You must be signed in to change notification settings - Fork 9.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrong model causes negative array index when calling DetectOrientationScript() #4062
Comments
Is this a regression, or do you see the same issue also with older releases? How did you start |
I'm using my own code that just calls SetImage and then DetectOrientationScript, but if you look at the diff, it is quite trivial so inspection should be enough. "id" can be -1 from choice->script_id() and when it is there is no check before using it as an array index. GitHub "blame" shows it back from 3.0 on a side note, this method is much, must faster than using Recognize and iterator calls to determine image rotation as shown in the tesseract samples page |
Can you please also provide some example image to test the issue? |
scan-90.zip |
I tried Maybe you can also test this patch which uses a packed vector of boolean values:
|
your new patch still calls done.at(-1) which is invalid The code to repro is quite simple. The returned "degree" value is indeterminate because the -1 index randomly breaks the loop early depending on an invalid stack value
|
Calling I assume that your code is incomplete and misses some initialisation which is part of |
? by definition the Init call succeeds (using only 'eng' data) because 80% of the time rotation is detected and text is extracted. The index is absolutely -1 on every call, but dereferencing an array index of -1 is ALWAYS a bug with indeterminate value ocr->Init(ocrdir.c_str(), 'eng') |
Is it possible to provide a complete working example code including It's not necessary to repeat that a negative index is not okay because nobody questions that. But according to my tests the Supporting your use case might be useful (maybe with a different patch). But it might also be undesired behaviour which requires better documentation how that function should be used to avoid negative indexes. Raising an exception when an API is not used according to the documentation would be a valid fix. I still don't know which of the two options is the better one. |
you can paste those 3 lines into any test snippet that reads a Pix object (such as the ones on tesseract site) - there really is nothing special here other than init(), setImage() and DetectOrientationScript() in order There are quite a few stackoverflow and other internet questions asking how to detect image rotation. They all recommend Recognize() with a 0 PSM, but that is incredibly slow. This solution runs dramatically faster and since you have to rotate before calling Recognize with a non-0 PSM, it can lead to dramatic improvements on bulk OCR If this API solution is ok, it would help to add internal image rotation as an option to Tesseract API enabled through config or some other option? It's a common problem with a simple solution, but very hard to track down (took me a couple days to find and debug) |
Sorry, I don't have the time to collect code snippets to build a test program for this issue. I am afraid that without a complete working example code I cannot help. |
here is sample code
|
https://github.com/tesseract-ocr/tessdoc/blob/main/examples/OSD_example.cc The code currently does not check the returned values from the methods it calls. |
My sample code does check error codes, but that doesn't matter because it returns correct results with my trivial 2 line patch - never a failure |
You don't need to rotate with PSM 1. |
@todd-richmond, add destructors to your sample code to avoid memory leaks:
The examples were also incomplete. This will be fixed with tesseract-ocr/tessdoc#110. Even with your patched code, there remains an illegal memory read:
|
my runtime code also has deletes - this is just a snippet that was asked for. I don't have valgrind or other errors with my patch in my app and the API returns expected results 100% of the time. I see the vector<> change was merged so I can't speak to what happens with that since it doesn't check the subscript either |
I'm not quite sure why it has been so hard to get this patch accepted. I contribute small patches to dozens of open source projects we use at work and this example solves a problem that a LOT of people on the net have asked about. I found the alternative reading Tesseract headers after finding the common suggestion far too slow. The API call is documented as official, but fails in the same code + image I gave. The trivial patch resolves the bug w/o changing any internal execution paths |
Your patch does not fix the problem, because there remains an array access with index -1 (see my previous comment). You can reproduce the problem by using a sanitizer build ( In addition I get wrong results for an image without rotation and with mostly non-Latin script:
If I apply your code to your test image, I get a rotation of 270°. I get the same result when I rotate that image by 90° (which makes it upright). |
That's because he used "eng", which is fine if the user knows that the text in the image is written in English and only wants to find the orientation of the page. |
Then it should work for the image provided which only contains English text, but it does not. If I replace "eng" by "osd", all tests succeed and there is no longer a negative array index. |
You need to use an old It will probably also work with a trainedata from the |
Thank you. Indeed, the test code works when I replace So the rule is to use "osd" or any other model which supports the legacy engine. Then there is no negative index. Otherwise raise an exception to avoid undefined behaviour. It looks like this issue can be closed. |
I am not sure if we are able to check the presence of the legacy model in OSD functions. |
|
my sample code was run using tessdata_fast, not tessdata or tessdata_best because I care more about speed than recognition at scale. It tries to detect rotation before setting the PSM and calling recognize so that I can use different configurable modes that don't support OSD w/o having to use slow PSM 1 first. @stweil , there is no subsequent valgrind issue using my sample and patched library (but I wasn't using debug mode). However, you are correct that the 0 degree rotation is still returning the same results so my code does fail I tried PSM 0 with Recognize(), but it is always returning 270 degrees no matter what. I tried DetectOS, DetectOrientationScript and GetOsdText, but all fail on unrotated content even when tesseract cmdline program returns correct rotation. In short, I need fast scanning with configurable PSM, but with 90 degree rotation handling and running Recognize twice is too slow. It seems I need to use only PSM modes that enable OSD |
Regarding the speed you can have a look at leptonica (see e.g. https://bucket401.blogspot.com/2023/05/detecting-page-orientation-c.html, there is also example for fixing a skew). DanBloomberg (Leptonica author) is very experienced and he provide excellent support on leptonica (just create issue if you have problem or questions). |
I also struggled with this for several days but finally landed on this post... I am not sure, it still says open, but if there is some way I can help I am happy to provide an example of what I was doing. I think if it only works with certain models it should give me an error if I am using a model that does not work, right? |
Current Behavior
Latest Tesseract 5.3.1 has a bug in DetectOrientationScript() that causes it to randomly fail when attempting to detect image orientation for 90 degree image scans. The problem is that getting a script id can return -1 which then indexes into a "done" array
On my system this fails about 20% of the time on the exact same image input
Error found with valgrind
Expected Behavior
Trivial fix to not write to done[-1] which then allows DetectOrientationScript() to always return the same results
Suggested Fix
tesseract -v
No response
Operating System
RHEL 8
Other Operating System
No response
uname -a
No response
Compiler
gcc 12.2
CPU
No response
Virtualization / Containers
No response
Other Information
No response
The text was updated successfully, but these errors were encountered: