Issue 1
This gem sometimes fails to extract the line just before a PDF page break.
I have attached the PDF file.
Extract the text and check page 2: the last line (just before the page break) is not extracted.
The missing line on page 2 of the attached PDF is:
**"TA PIMCO Total Return- Service Class 3 5/1/2002 -0.36 3.72 0.92 1.68 0.62 3.13 3.22"**
Issue 2
If the text contains a subscript, it is sometimes appended to the preceding word and sometimes placed on a new line (\n). Please extract the content of the attached file abc.pdf and check the resulting text.
Issue one seems to have been resolved - I can't reproduce it on the latest release (v2.2.1).
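For anyone who wants to re-check issue one on their own copy of the file, here is a minimal sketch. The `normalize` and `line_extracted?` helpers are names made up for this example, not part of the gem; only `PDF::Reader.new`, `#page` and `#text` are gem calls. The match is whitespace-insensitive and could still miss the line if the small "3" is extracted onto a separate line.

```ruby
# Collapse runs of whitespace so column spacing differences do not matter.
def normalize(text)
  text.gsub(/\s+/, " ").strip
end

# True if the expected line appears anywhere in the page text after
# whitespace normalisation. (Hypothetical helper, not a gem API.)
def line_extracted?(page_text, expected_line)
  normalize(page_text).include?(normalize(expected_line))
end

# Only runs when invoked as: ruby check.rb path/to/file.pdf
if __FILE__ == $PROGRAM_NAME && ARGV.length == 1
  require "pdf-reader" # gem install pdf-reader
  expected = "TA PIMCO Total Return- Service Class 3 5/1/2002 " \
             "-0.36 3.72 0.92 1.68 0.62 3.13 3.22"
  page = PDF::Reader.new(ARGV[0]).page(2)
  puts line_extracted?(page.text, expected) ? "found" : "missing"
end
```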
Issue two will be harder to address in a consistent way.
In this particular PDF, the superscript numbers are regular digits printed in a smaller font (not Unicode superscript codepoints). That makes it hard to reliably identify them as superscripts.
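To illustrate why this is only a heuristic, here is a rough sketch of guessing superscripts from font size and baseline. The `Run` struct and `likely_superscripts` are hypothetical and are not how the gem represents text internally. It assumes PDF-style coordinates where y grows upward, and the 0.8 size ratio is an arbitrary threshold.

```ruby
# Hypothetical positioned text run (not a gem class).
Run = Struct.new(:text, :x, :y, :font_size)

def median(values)
  sorted = values.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Guess which runs are superscripts: all digits, noticeably smaller than
# the median (body) font size, and with a baseline above the previous run.
def likely_superscripts(runs, size_ratio: 0.8)
  body_size = median(runs.map(&:font_size))
  runs.each_cons(2).select do |prev, run|
    smaller = run.font_size < body_size * size_ratio
    raised  = run.y > prev.y # assumes y increases upward
    smaller && raised && run.text.match?(/\A\d+\z/)
  end.map(&:last)
end
```

A footnote marker and a genuinely small number (say, in a caption) look identical to this check, which is the reliability problem described above.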
With a bit of tweaking to the page layout algorithm it'd probably be possible to have them rendered on the same line as the text they're associated with, but they'd appear as full-height normal numbers.
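As a rough sketch of that kind of tweak (not the gem's actual layout code), runs can be grouped into lines with a baseline tolerance scaled by the line's font size, so a small raised digit joins the adjacent body text. `TextRun` and `group_into_lines` are hypothetical names, and the 0.5 tolerance is an assumption.

```ruby
# Hypothetical positioned text run (not a gem class).
TextRun = Struct.new(:text, :x, :y, :font_size)

# Group runs into lines. A run joins an existing line when its baseline
# is within tolerance * (line font size) of that line's baseline.
def group_into_lines(runs, tolerance: 0.5)
  lines = []
  # Place larger fonts first so body text establishes each baseline.
  runs.sort_by { |r| -r.font_size }.each do |run|
    line = lines.find { |l| (l[:y] - run.y).abs <= l[:size] * tolerance }
    if line
      line[:runs] << run
    else
      lines << { y: run.y, size: run.font_size, runs: [run] }
    end
  end
  # Emit lines top to bottom (y grows upward), runs left to right.
  lines.sort_by { |l| -l[:y] }.map do |l|
    l[:runs].sort_by(&:x).map(&:text).join(" ")
  end
end
```

With this grouping the raised "3" comes out inline as a normal digit, as in "Service Class 3 5/1/2002", rather than on a line of its own.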