How to Improve the Detection Accuracy of This Character? #1273
Replies: 1 comment
-
It may be possible to fine-tune Tesseract on labeled examples to improve its recognition of this character, although doing so may increase errors in other cases. OCRmyPDF itself doesn't provide anything to help with this.
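One way to check whether a fine-tuned model has made other cases worse is to score OCR output against the labeled examples before and after the change. The following is only a minimal sketch of that idea, not something OCRmyPDF or Tesseract provides: the function names are hypothetical, and the metric is a plain character error rate (Levenshtein distance divided by the length of the expected text).

```python
def edit_distance(expected: str, recognized: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions each cost 1."""
    prev = list(range(len(recognized) + 1))
    for i, ch_e in enumerate(expected, start=1):
        curr = [i]
        for j, ch_r in enumerate(recognized, start=1):
            cost = 0 if ch_e == ch_r else 1
            curr.append(min(prev[j] + 1,          # drop a character of expected
                            curr[j - 1] + 1,      # extra character in recognized
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def char_error_rate(expected: str, recognized: str) -> float:
    """Edit distance normalized by the length of the expected (ground-truth) text."""
    if not expected:
        return 0.0 if not recognized else 1.0
    return edit_distance(expected, recognized) / len(expected)


def mean_cer(pairs: list[tuple[str, str]]) -> float:
    """Average character error rate over (expected, recognized) pairs."""
    return sum(char_error_rate(e, r) for e, r in pairs) / len(pairs)
```

Running `mean_cer` over the same labeled set with the stock model and with the fine-tuned one gives a rough indication of whether the fix for the '9' came at the cost of other characters.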
-
Hello,
I have been using this great library to perform OCR on invoices, and it has been working well so far.
I'm having difficulty trying to get OCRmyPDF to recognize the '9' in this picture (part of an invoice) correctly:
I have tried different `oversample` values (None, 200, 350, 600, 800, 1000) combined with different `tesseract_pagesegmode` values (1, 4, 6, 11, 12), but the generated text always recognizes the '9' as $, €, or 6, or omits it altogether.
What can be done to improve the detection accuracy in this case and similar cases?
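For reference, the parameter sweep described above can be sketched roughly as follows. This is an illustrative reconstruction, not the exact script that was used: `build_grid` and `run_sweep` are hypothetical helper names, the file naming scheme is made up, and it assumes the input PDF has no existing text layer. The `oversample`, `tesseract_pagesegmode`, and `sidecar` arguments are keyword parameters of `ocrmypdf.ocr`.

```python
from itertools import product
from pathlib import Path

# Values taken from the description above.
OVERSAMPLE_VALUES = [None, 200, 350, 600, 800, 1000]
PAGESEGMODE_VALUES = [1, 4, 6, 11, 12]


def build_grid(oversamples=OVERSAMPLE_VALUES, pagesegmodes=PAGESEGMODE_VALUES):
    """Return every (oversample, pagesegmode) combination to try."""
    return list(product(oversamples, pagesegmodes))


def run_sweep(input_pdf, out_dir):
    """Run OCRmyPDF once per combination and collect the sidecar text of each run."""
    import ocrmypdf  # imported here so build_grid() works without OCRmyPDF installed

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for oversample, psm in build_grid():
        tag = f"os{oversample}_psm{psm}"  # hypothetical naming scheme
        sidecar = out_dir / f"{tag}.txt"
        kwargs = {"tesseract_pagesegmode": psm, "sidecar": sidecar}
        if oversample is not None:
            # Only pass oversample when a DPI was actually chosen.
            kwargs["oversample"] = oversample
        ocrmypdf.ocr(input_pdf, out_dir / f"{tag}.pdf", **kwargs)
        results[(oversample, psm)] = sidecar.read_text(encoding="utf-8")
    return results
```

The returned dictionary maps each combination to the recognized text, which makes it easy to see which settings, if any, produce the '9'.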
Thank you.