Hi. I'm not sure whether this is a bug or expected behaviour in newer versions, but the pdfminer layout objects are missing from the page.objects dict when a laparams configuration is passed.
Code to reproduce the problem
import pdfplumber

with pdfplumber.open(<path>, laparams={}) as pdf:
    page = pdf.pages[0]
    assert "textboxhorizontal" in page.objects.keys()  # This will fail
Expected behavior
As in previous versions, passing the laparams configuration when opening the PDF should make the layout objects available.
Actual behavior
The custom objects "textboxhorizontal" and "textlinehorizontal", which come from pdfminer's layout analysis and were available by default in previous versions (0.5.16 at least), are missing.
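To make the symptom easier to check across versions, here is a minimal sketch of a helper that reports which of the two layout keys are absent. It only assumes that page.objects behaves like a dict keyed by object type; the helper name `missing_layout_kinds` and the sample dicts are hypothetical and not part of pdfplumber.

```python
from typing import Dict, Iterable, List

# The two layout-analysis keys named in this report.
LAYOUT_KINDS = ("textboxhorizontal", "textlinehorizontal")


def missing_layout_kinds(
    objects: Dict[str, list], expected: Iterable[str] = LAYOUT_KINDS
) -> List[str]:
    """Return the expected object types that are not keys of `objects`."""
    return [kind for kind in expected if kind not in objects]


# Plain dicts standing in for page.objects (hypothetical sample data).
with_layout = {"char": [], "textboxhorizontal": [], "textlinehorizontal": []}
without_layout = {"char": [], "rect": []}

print(missing_layout_kinds(with_layout))
print(missing_layout_kinds(without_layout))
```

With a real page, the call would be `missing_layout_kinds(page.objects)`; an empty list means both layout keys are present.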
Environment
pdfplumber version: 0.5.26
Python version: 3.7.6
OS: Mac
Additional context
I think the else branch in the function iter_layout_objects discards all high-level objects that could contain textual info (horizontal text boxes and horizontal text lines).
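To illustrate the suspected mechanism, here is a small sketch of how a recursive traversal can lose container objects. This is not the actual pdfplumber code: `Node`, `iter_leaves_only` and `iter_with_containers` are hypothetical stand-ins for a layout tree and two ways of walking it.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical stand-in for a layout object; not a pdfminer class."""
    kind: str
    children: List["Node"] = field(default_factory=list)


def iter_leaves_only(nodes: List[Node]) -> Iterator[str]:
    """Recurse into containers but only yield objects without children."""
    for node in nodes:
        if node.children:
            yield from iter_leaves_only(node.children)
        else:
            # Only this branch yields, so containers never appear.
            yield node.kind


def iter_with_containers(nodes: List[Node]) -> Iterator[str]:
    """Yield every object, then descend into its children."""
    for node in nodes:
        yield node.kind
        if node.children:
            yield from iter_with_containers(node.children)


# A text box containing one text line with two characters, plus a rectangle.
layout = [
    Node("textboxhorizontal", [
        Node("textlinehorizontal", [Node("char"), Node("char")]),
    ]),
    Node("rect"),
]

leaves = list(iter_leaves_only(layout))
everything = list(iter_with_containers(layout))
print(leaves)
print(everything)
```

In the first traversal the text box and text line are visited but never emitted, which matches the symptom described here; the second keeps them alongside their children.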
This commit reinstates access to higher-level layout objects (such as
`textboxhorizontal`) when `laparams` is passed to
`pdfplumber.open(...)`. That access had been removed in `0.5.24` via 1f87898.
Also adds a test for this behavior.
Code referenced in the issue: pdfplumber/pdfplumber/page.py, line 199 in 0027278