I like the hocr renderer very much, as it is the only way to influence the outcome.
However, for a while now it has rendered some letters incorrectly, especially the "s" in old texts.
Steps to reproduce
Running these two commands:
1.) ocrmypdf -l frk --pdf-renderer hocr inmidst.pdf inmidst_hocr.pdf
2.) ocrmypdf -l frk inmidst.pdf inmidst_sandwich.pdf
leads to these two underlying texts:
1.) That's a tent with „3“ inmidnt a word.
2.) That's a test with „3“ inmidst a word.
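To make the comparison repeatable, here is a rough sketch of how the two text layers could be checked against each other. It only rebuilds the two command lines shown above and compares the extracted text word by word. The helper names (`ocr_commands`, `extract_text`, `differing_words`) are made up for this illustration, and `extract_text` assumes that poppler's `pdftotext` is installed and on the PATH.

```python
import difflib
import subprocess


def ocr_commands(src="inmidst.pdf", lang="frk"):
    """Build the two ocrmypdf command lines from the steps above."""
    return {
        "hocr": ["ocrmypdf", "-l", lang, "--pdf-renderer", "hocr",
                 src, "inmidst_hocr.pdf"],
        "sandwich": ["ocrmypdf", "-l", lang, src, "inmidst_sandwich.pdf"],
    }


def extract_text(pdf_path):
    """Return the text layer of a PDF (assumption: pdftotext is on PATH)."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],  # "-" writes the text to stdout
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def differing_words(text_a, text_b):
    """Return (words_a, words_b) pairs where the two texts disagree."""
    words_a, words_b = text_a.split(), text_b.split()
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            pairs.append((" ".join(words_a[i1:i2]), " ".join(words_b[j1:j2])))
    return pairs
```

Each command list can be run with `subprocess.run(cmd, check=True)`. Afterwards, `differing_words(extract_text("inmidst_hocr.pdf"), extract_text("inmidst_sandwich.pdf"))` lists the words on which the two renderers disagree.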
With an older version of hocrtransform.py there is no such difference. (I don't know how old it has to be.)
Could you please have a look at this? Thanks a lot!
Files
inmidst.pdf
How did you download and install the software?
PyPI (pip, poetry, pipx, etc.)
OCRmyPDF version
15.4.2
Relevant log output
No response