Is your feature request related to a problem? Please describe.
I recently came across tables with only one column in form- and FAQ-style documents (see the example document below).
It still makes sense to treat this data as a table, since the cells help to split the text properly.
However, the following snippet appears to remove such tables, even though their cells are recognized correctly.
    # PyMuPDF modification:
    # Remove tables without text or having only 1 column
    for i in range(len(tables) - 1, -1, -1):
        r = EMPTY_RECT()
        x1_vals = set()
        x0_vals = set()
        for c in tables[i]:
            r |= c
            x1_vals.add(c[2])
            x0_vals.add(c[0])
        if (
            len(x1_vals) < 2
            or len(x0_vals) < 2
            or white_spaces.issuperset(
                page.get_textbox(
                    r,
                    textpage=TEXTPAGE,
                )
            )
        ):
            del tables[i]
Describe the solution you'd like
Would it be possible to make the minimum number of columns / rows an attribute of the TableFinder settings?
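To make the request concrete, here is a minimal sketch of what a configurable threshold could look like. It is not PyMuPDF code: the names `TableFilterSettings`, `min_cols`, `min_rows`, and `filter_tables` are hypothetical, cells are plain `(x0, y0, x1, y1)` tuples, and the "table without text" check from the snippet above is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class TableFilterSettings:
    # Hypothetical settings; these names are NOT existing TableFinder options.
    min_cols: int = 2  # default mirrors the current "at least 2 columns" rule
    min_rows: int = 1


def filter_tables(tables, settings):
    """Keep tables that have at least min_cols columns and min_rows rows.

    Each table is a list of cell boxes given as (x0, y0, x1, y1) tuples.
    """
    kept = []
    for cells in tables:
        x0_vals = {c[0] for c in cells}
        x1_vals = {c[2] for c in cells}
        y0_vals = {c[1] for c in cells}
        # Same idea as the original check: count distinct left and right edges.
        n_cols = min(len(x0_vals), len(x1_vals))
        n_rows = len(y0_vals)
        if n_cols >= settings.min_cols and n_rows >= settings.min_rows:
            kept.append(cells)
    return kept


# A one-column table (three stacked cells) and a two-column table (one row).
one_col = [(0, 0, 100, 20), (0, 20, 100, 40), (0, 40, 100, 60)]
two_col = [(0, 0, 50, 20), (50, 0, 100, 20)]

# The default settings drop the one-column table; min_cols=1 keeps it.
default_result = filter_tables([one_col, two_col], TableFilterSettings())
relaxed_result = filter_tables([one_col, two_col], TableFilterSettings(min_cols=1))
```

Keeping the default at two columns would preserve the current behaviour, so only users who opt in would get one-column tables back.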
Describe alternatives you've considered
I can make a PR for that if you are ok with the idea.
Additional context
Basic example
Question.pdf