An invoice2data area plugin extracts text by area coordinates, using the area-cropping options of pdftotext. It is a customization to invoice2data for defining a crop area with coordinates. The coordinates defined in a template can vary from PDF to PDF. You add a normal YAML template, which can already contain other plugins for fields and tables, include an area plugin in it, and it works on every PDF.
Just write the multi-line field you want to extract and give the coordinates of that field, that is (x=? y=? r=? H=? W=?).
Area Plugin Options:
Name: field name to map with extracted text
Area: takes a dict as input, used to crop an area of the PDF and extract the text inside it.
Regex: optional parameter, used to further extract text from the cropped area.
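To make the options above concrete, here is a minimal Python sketch of how an area dict could be turned into a pdftotext call and how the optional regex could be applied to the cropped text. It is not the plugin's actual code: the function names `build_pdftotext_command` and `apply_optional_regex`, the use of `-layout`, and the sample coordinate values are assumptions for illustration. The flags `-x`, `-y`, `-r`, `-W` and `-H` are standard pdftotext options (top-left corner of the crop area, resolution in DPI, and crop width and height).

```python
import re

# Keys of the area dict, named after the pdftotext flags they map to.
AREA_KEYS = ("x", "y", "r", "W", "H")


def build_pdftotext_command(pdf_path, area):
    """Build a pdftotext command that crops to the given area.

    Hypothetical helper: the real plugin may assemble its call differently.
    """
    cmd = ["pdftotext", "-layout"]  # -layout is an assumption, not required
    for key in AREA_KEYS:
        if key in area:
            cmd += [f"-{key}", str(area[key])]
    cmd += [pdf_path, "-"]  # "-" sends the extracted text to stdout
    return cmd


def apply_optional_regex(text, regex=None):
    """Return the cropped text, narrowed by the regex when one is given."""
    if regex is None:
        return text.strip()
    match = re.search(regex, text, re.DOTALL)
    if match is None:
        return None
    # Prefer the first capture group, fall back to the whole match.
    return match.group(1) if match.groups() else match.group(0)


# Sample values only; real coordinates depend on the PDF being parsed.
area = {"x": 110, "y": 50, "r": 100, "W": 200, "H": 100}
command = build_pdftotext_command("invoice.pdf", area)
# The command could then be run with subprocess.run(command, capture_output=True).
```

Because the coordinates are interpreted at the resolution given by `r`, the same field may need different `x`, `y`, `W` and `H` values if the resolution changes.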
Sample Invoice
Here is a sample of an invoice template including the area plugin that helps you extract multiple lines.
Output: