Hi!
First of all, thank you for this great tool.
Let me ask you a question:
I would like to be able to extract a table (or a list) containing the text objects with their properties, is that possible?
Thanks
@Banguiskode thank you for your interest in the library. Your expectations are captured as enhancements #2, #7, #11 and #17.
The PDF specification has no simple mechanism for marking tabular structures as tables, so a table can only be recovered by post-processing the text positions extracted from the PDF file. The library does not provide an explicit API for this, but pdPageEvaluate can be extended to extract the text data and their positions. Tagged PDF does support describing tabular structure, but only a very small portion of the PDF files in circulation implement that part of the specification to any great extent. If you would like to contribute to PDFIO by implementing any of these features, we will be happy to accept PRs.
Since the intent of this issue is already captured in other issues, I will close it with this comment.
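To illustrate what post-processing text positions could look like, here is a minimal sketch in Python. It is not PDFIO code and does not use any PDFIO API: the `TextFragment` type, the `group_into_rows` function, and the `y_tolerance` parameter are all hypothetical names chosen for illustration. It assumes some extractor has already produced each piece of text together with its x and y coordinates, with y increasing towards the top of the page as in default PDF user space.

```python
from typing import NamedTuple


class TextFragment(NamedTuple):
    """Hypothetical record: one piece of text and its position on the page."""
    x: float
    y: float
    text: str


def group_into_rows(fragments, y_tolerance=2.0):
    """Group fragments into rows by similar y, then order each row by x.

    Assumes y grows upwards, so the largest y is the top of the page.
    """
    rows = []  # list of (y of first fragment in the row, fragments in the row)
    for frag in sorted(fragments, key=lambda f: -f.y):
        if rows and abs(rows[-1][0] - frag.y) <= y_tolerance:
            rows[-1][1].append(frag)  # close enough to the current row
        else:
            rows.append((frag.y, [frag]))  # start a new row
    return [[f.text for f in sorted(frs, key=lambda f: f.x)] for _, frs in rows]


fragments = [
    TextFragment(200.0, 700.5, "Qty"),
    TextFragment(72.0, 700.0, "Item"),
    TextFragment(72.0, 680.0, "Bolt"),
    TextFragment(200.0, 679.2, "40"),
]
table = group_into_rows(fragments)
print(table)
```

A fixed tolerance is the simplest possible choice; real documents with mixed font sizes, wrapped cells, or rotated text would likely need something more careful, such as clustering on baselines and detecting column boundaries from the x positions.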