Tesseract trained on handwriting data - when being tested it only outputs the letter "e" #395

Open
KPLinux opened this issue Jul 15, 2024 · 4 comments


KPLinux commented Jul 15, 2024

Trained using this dataset: https://www.kaggle.com/datasets/nibinv23/iam-handwriting-word-database/data

I am creating an OCR model to recognize human handwriting. I extracted the image files, created a separate ground truth text file for each image, and followed the training process described in the README. The makefile ran without errors except for one corrupted file that could not be read. I deleted that file and continued the training, and the process then finished successfully.
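
For reference, a quick sanity check of the ground truth folder can rule out pairing problems before training. The sketch below assumes the layout described in the tesstrain README, where each line image (`.png` or `.tif`) sits next to a single-line transcription with the same base name and a `.gt.txt` extension. The function name `check_ground_truth` is hypothetical and not part of tesstrain.

```python
from pathlib import Path

IMAGE_SUFFIXES = (".png", ".tif")  # image types assumed for this sketch


def check_ground_truth(folder):
    """Return (image name, problem) tuples for images with unusable transcriptions."""
    problems = []
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip the .gt.txt files and anything else
        gt = image.parent / (image.stem + ".gt.txt")
        if not gt.exists():
            problems.append((image.name, "missing .gt.txt"))
            continue
        text = gt.read_text(encoding="utf-8").strip()
        if not text:
            problems.append((image.name, "empty transcription"))
        elif "\n" in text:
            problems.append((image.name, "transcription spans several lines"))
    return problems
```

An empty result only means the files are paired and non-empty; it says nothing about whether the transcriptions match the images.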

However, when I tested Tesseract on some handwriting samples, it always returned "e". To check whether the engine itself was at fault, I ran the same samples with the English traineddata; the output was gibberish, but each image returned a different value. My trained model outputs only "e" for every input image.
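
The comparison described above (same output for every image versus a different output per image) can be made explicit with a small summary. This is only a sketch: `summarize_predictions` is a hypothetical helper, and it assumes the recognized text for each test image has already been collected into a dictionary.

```python
from collections import Counter


def summarize_predictions(predictions):
    """Summarize how varied the recognized texts are.

    `predictions` maps an image name to the text recognized for it.
    """
    counts = Counter(text.strip() for text in predictions.values())
    most_common_text, frequency = counts.most_common(1)[0]
    return {
        "distinct": len(counts),            # number of different outputs
        "most_common": most_common_text,    # the output seen most often
        "share": frequency / len(predictions),  # fraction of images with that output
    }
```

A `share` of 1.0 with `distinct` equal to 1 corresponds to the behaviour reported here, where every image yields the same text.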

Is there a way to fix this? I am quite new to AI, ML, and programming in general, so any tips or suggestions would be appreciated.

Collaborator

stweil commented Jul 15, 2024

Tesseract's layout detection cannot separate the text lines in handwritten text; it was designed only for printed text. Your newly trained model would therefore work for single lines (with the right --psm parameter), but not for typical handwritten text.

Use kraken or other software that supports handwriting recognition.

Author

KPLinux commented Jul 15, 2024

Thanks for the info. The model I am training is only meant to detect single lines of text, not multi-line sentences or paragraphs, so I thought tesstrain would help me train a custom Tesseract model for this purpose.

Collaborator

stweil commented Jul 15, 2024

Then try --psm 7 or --psm 13 and pass the line image to Tesseract.
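
A minimal sketch of that suggestion, driven from Python, might look like the following. It assumes the `tesseract` executable is on the PATH; the model name `handwriting`, the file `line.png`, and both function names are placeholders chosen for illustration. Page segmentation mode 7 treats the image as a single text line, and mode 13 treats it as a raw line without Tesseract-specific preprocessing hacks.

```python
import subprocess


def build_tesseract_command(image_path, model, psm, tessdata_dir=None):
    """Build the argument list for recognizing one line image, printing to stdout."""
    command = ["tesseract", str(image_path), "stdout", "-l", model, "--psm", str(psm)]
    if tessdata_dir is not None:
        # Folder that contains <model>.traineddata, if it is not the default one.
        command += ["--tessdata-dir", str(tessdata_dir)]
    return command


def recognize_line(image_path, model, psm=7, tessdata_dir=None):
    """Run Tesseract on a single line image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, model, psm, tessdata_dir),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract exits with an error
    )
    return result.stdout.strip()


# Example call (placeholder names, requires Tesseract to be installed):
# for psm in (7, 13):
#     print(psm, recognize_line("line.png", "handwriting", psm))
```

Keeping the command construction separate from the call makes it easy to print the exact command and run it by hand when the output looks wrong.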

Author

KPLinux commented Jul 15, 2024

I tried both; they still return "e" and nothing else. I guess I'll have to learn how to use kraken, unless there is some other way to go about this. Either way, thanks for your help.
