char bounding box #559
Very interesting test cases you have here, @hhslepicka, and thank you for sharing! Practically speaking, this question would be better suited for the But I do wonder: What bounding boxes do you get when you remove the
Thank you, @jsvine.
Here is the debug with pdfplumber/pdfminer.six:
Here is the debug with PDFBox:
I will try to reach out to the pdfminer.six team as well. If I get any input, I will link it here. Thanks again for your help.
Hi,
I am trying to fetch each char's bounding box and compare that against the result generated by Apache PDFBox.
My goal is to obtain the cyan boxes surrounding the `g` and the `A` displayed below:
Using pdfplumber, I was able to fetch the information, but it seems like the box is being affected by the other chars on the same text line.
My code with pdfplumber:
The code from Apache PDFBox which generated the first image is located here.
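A minimal sketch of this kind of check (not the original snippet) could look like the following. It assumes the character dictionaries exposed by pdfplumber's `page.chars`, with the keys `text`, `x0`, `x1`, `top`, and `bottom`. The helper names `char_boxes`, `shared_vertical_extents`, and `load_chars` are hypothetical, and the sample values are made up for illustration. If every character on a line reports the same `top` and `bottom`, the boxes are probably derived from font-level metrics rather than from each glyph's outline; that is a guess, not something confirmed in this thread.

```python
from collections import defaultdict


def char_boxes(chars):
    """Return (text, (x0, top, x1, bottom)) for each pdfplumber-style char dict."""
    return [(c["text"], (c["x0"], c["top"], c["x1"], c["bottom"])) for c in chars]


def shared_vertical_extents(chars, ndigits=1):
    """Group characters by their rounded (top, bottom) pair.

    A group containing glyphs of very different shapes (e.g. "g" and "A")
    suggests the reported boxes are not tight around each glyph.
    """
    groups = defaultdict(list)
    for c in chars:
        key = (round(c["top"], ndigits), round(c["bottom"], ndigits))
        groups[key].append(c["text"])
    return dict(groups)


def load_chars(pdf_path, page_index=0):
    """Read the char dicts of one page; pdfplumber is imported lazily."""
    import pdfplumber  # third-party; only needed when reading a real PDF

    with pdfplumber.open(pdf_path) as pdf:
        return pdf.pages[page_index].chars


# Made-up sample data in the same shape as pdfplumber's char dicts.
sample = [
    {"text": "g", "x0": 10.0, "x1": 16.0, "top": 100.0, "bottom": 112.0},
    {"text": "A", "x0": 16.0, "x1": 24.0, "top": 100.0, "bottom": 112.0},
]

for text, box in char_boxes(sample):
    print(text, box)
print(shared_vertical_extents(sample))
```

With a real file, `load_chars("test.pdf")` (the path is a placeholder) would replace `sample`.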
Here is my test PDF file:
https://drive.google.com/file/d/16V0BzcwP50f-iWhR-eRfGkZ5_o83BD6a/view?usp=sharing
Here is the full image from the first crop:
Thank you for your help and for developing pdfplumber.