`KeyError` raised when `laparams` set #383
Thanks for flagging. I'll take a look.
jsvine added a commit that referenced this issue Mar 19, 2021
pdfminer.six's `LTAnno` objects are not PDF annotations (which we already provide access to via `.annots`, regardless of whether `laparams` is set), but rather layout annotations. Per pdfminer.six codebase:

> Note that, while a LTChar object has actual boundaries, LTAnno objects
> does not, as these are "virtual" characters, inserted by a layout
> analyzer according to the relationship between two characters (e.g. a
> space).

Because they have no boundaries, they cause problems for pdfplumber, which expects bounding-box coordinates for all objects. See, e.g., issue #383, which this commit should fix.
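To illustrate the failure mode described in the commit message, here is a minimal, self-contained sketch. It does not use pdfplumber or pdfminer.six: `FakeChar`, `FakeAnno`, `to_record`, and `extract_chars` are hypothetical stand-ins invented for this example, and skipping the virtual characters is only one plausible way to handle them, not necessarily what the commit does.

```python
from dataclasses import dataclass


@dataclass
class FakeChar:
    """Hypothetical stand-in for an LTChar: text plus a bounding box."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class FakeAnno:
    """Hypothetical stand-in for an LTAnno: a "virtual" character, no bounding box."""
    text: str


def to_record(obj):
    """Convert a layout object to a dict, assuming it has bounding-box attributes."""
    attrs = vars(obj)
    # Raises KeyError when a coordinate is missing, as it is for FakeAnno.
    return {key: attrs[key] for key in ("text", "x0", "y0", "x1", "y1")}


def extract_chars(layout_objects, skip_virtual=True):
    """Build character records, optionally skipping objects without boundaries."""
    records = []
    for obj in layout_objects:
        if skip_virtual and isinstance(obj, FakeAnno):
            continue
        records.append(to_record(obj))
    return records


# A line of text in which the layout analyzer has inserted a virtual space.
line = [FakeChar("a", 0, 0, 5, 10), FakeAnno(" "), FakeChar("b", 8, 0, 13, 10)]
chars = extract_chars(line)
```

With `skip_virtual=False`, the same input raises a `KeyError` on the first missing coordinate, which mirrors the symptom reported in this issue.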
Thanks for flagging this, @alexreg. Commit/PR above should handle this. I'll close this issue when/if the PR is merged.
This was fixed by the PR above; belatedly closing this issue.
Describe the bug
I get the following exception when calling `page.objects["char"]`. Note, this only occurs when I open the PDF with `laparams` set.

Code to reproduce the problem
PDF file
https://mathscinet.ams.org/msnhtml/serials.pdf
Expected behavior
No error (exception) should be raised.
Actual behavior
The above exception (`KeyError`) is raised.

Screenshots
left_column:
right_column:
Environment
Additional context
None