How do I generate a dataframe after identifying the table structure? #112
@hyshandler You can either use the implementation in this repo directly, since it contains all the post-processing steps up to the construction of the dataframe, or combine the output of that feature_extractor from Hugging Face with the post-processing functions you will find in this repo. Unfortunately, the post-processing steps that build the dataframe do not ship with Hugging Face transformers.
Thanks @WalidHadri-Iron for the response. Can you point me to the specific functions to construct that process? I'm having trouble finding the right ones in this repo. Thanks!
@hyshandler The post-processing code is here: https://github.com/microsoft/table-transformer/blob/main/src/postprocess.py, and the steps are grouped here: https://github.com/microsoft/table-transformer/blob/main/src/inference.py. What you have in "results" is a set of bounding boxes with labels and scores; you need to post-process those boxes based on their score, their label, and the position of the text in the image. The function objects_to_structures groups almost the whole post-processing (src/inference.py, line 295 at commit 235ad51). Then there are three functions to get the output format you want (src/inference.py, lines 359, 540, and 512 at commit 235ad51). A sketch of how these chain together follows below.
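For readers following along, here is a minimal sketch of that chain, assuming the repo's src/ directory is on your PYTHONPATH and that `objects` (model detections), `tokens` (OCR words), and `structure_class_thresholds` are already prepared (both are discussed later in this thread). Function names follow src/inference.py at the commit referenced above; check your checkout for the exact signatures.

```python
# Minimal sketch of the post-processing chain from src/inference.py.
# Assumes `objects`, `tokens`, and `structure_class_thresholds` exist;
# exact signatures may differ in your checkout.
from inference import (objects_to_structures, structure_to_cells,
                       cells_to_csv, cells_to_html)

table_structures = objects_to_structures(objects, tokens, structure_class_thresholds)
cells, confidence = structure_to_cells(table_structures[0], tokens)
csv_string = cells_to_csv(cells)    # plain CSV text
html_string = cells_to_html(cells)  # HTML <table> markup
```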
Can the results from the Hugging Face model be passed into structure_to_cells? If yes, how can I find the tokens that need to be passed as parameters, and how can I get them into the desired data structure?
You can keep the tokens argument as None if you don't have the text data of the cells. It works without tokens too.
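As a one-line illustration of that comment: if your version of the code does not accept None, passing an empty list is a reasonable fallback (this is an assumption, not confirmed behavior); the cell geometry is still computed, only the text fields stay empty.

```python
# Sketch: compute cell geometry without OCR text. An empty token list
# is assumed to behave like tokens=None; cells come back with empty
# text fields either way.
cells, confidence = structure_to_cells(table_structure, [])
```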
@amish1706 I managed to get the text and its bounding boxes using pytesseract, and now I am able to get the results into a CSV file, but sometimes the entries from two rows are concatenated into a single row entry spanning 2 or 3 columns. I manually looked up the bounding boxes of that text and there was a significant difference in the locations, so I am not able to understand why it happened. If there is a way to do the entire work of creating the CSV file without passing the tokens, that would be very helpful. Otherwise, I would like a workaround for the concatenation error described above.
@Ashwani-Dangwal Hi, I'm trying to generate the tokens list in the format described in Inference.MD. There it says we only need the bounding box and text, not the span. Were you able to figure this out? I generated tokens in the suggested format but I end up with empty cells.
@lionely Other than the bounding box and the text you also need span_num, block_num and line_num. |
@Ashwani-Dangwal Thank you so much for your reply. I see how to get block_num and line_num using pytesseract, but which one would be the span_num? Thanks again!
@lionely span_num would be the word_num in pytesseract. You also need to convert the bounding box coordinates to xmin, ymin, xmax, ymax format, since pytesseract gives bounding boxes in x, y, w and h format. A sketch of this conversion follows below.
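To make the mapping concrete, here is a hedged sketch of building the tokens list from pytesseract's word-level output, using the span_num = word_num mapping suggested above; the exact dict keys expected downstream should be checked against Inference.MD.

```python
import pytesseract
from pytesseract import Output

# Word-level OCR output. pytesseract reports boxes as
# (left, top, width, height), so convert to [xmin, ymin, xmax, ymax].
data = pytesseract.image_to_data(image, output_type=Output.DICT)

tokens = []
for i, text in enumerate(data["text"]):
    if not text.strip():
        continue  # skip empty OCR entries
    x, y, w, h = data["left"][i], data["top"][i], data["width"][i], data["height"][i]
    tokens.append({
        "bbox": [x, y, x + w, y + h],
        "text": text,
        "span_num": data["word_num"][i],  # word_num stands in for span_num
        "line_num": data["line_num"][i],
        "block_num": data["block_num"][i],
    })
```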
I am also trying to use this, but what is class_thresholds?
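class_thresholds maps each predicted label to a minimum confidence score; detections scoring below their class's threshold are dropped during post-processing. A sketch modeled on the defaults in src/inference.py (the values here are illustrative, so check your checkout for the actual ones):

```python
# Per-class score thresholds used to filter model detections.
# Values are illustrative; see src/inference.py for the real defaults.
structure_class_thresholds = {
    "table": 0.5,
    "table column": 0.5,
    "table row": 0.5,
    "table column header": 0.5,
    "table projected row header": 0.5,
    "table spanning cell": 0.5,
    "no object": 10,  # a threshold above 1 means these are never kept
}
```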
I'm trying to generate a dataframe from a table in an image. After running the following code I was able to return the original image with (mostly) correct grid lines drawn on the table. How can I turn the coordinates in the image (gridlines and boxes) into a pandas dataframe or CSV? Thanks in advance!
```python
import torch
from transformers import TableTransformerForObjectDetection, DetrFeatureExtractor

model = TableTransformerForObjectDetection.from_pretrained(
    "microsoft/table-transformer-structure-recognition"
)
feature_extractor = DetrFeatureExtractor()

encoding = feature_extractor(image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**encoding)

# Rescale predicted boxes to the original image size (PIL size is (w, h))
target_sizes = [image.size[::-1]]
results = feature_extractor.post_process_object_detection(
    outputs, threshold=0.8, target_sizes=target_sizes
)[0]
```
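To close the loop on the original question, here is a hedged sketch that bridges the Hugging Face results above to the repo's post-processing and finally to a pandas dataframe. It assumes objects_to_structures, structure_to_cells, cells_to_csv, tokens, and structure_class_thresholds from the earlier comments, and uses the model's id2label config to turn class indices into the label strings the post-processing expects.

```python
from io import StringIO
import pandas as pd

# Convert the Hugging Face detections into the list-of-dicts format the
# repo's post-processing expects: label string, score, [x0, y0, x1, y1].
objects = [
    {
        "label": model.config.id2label[label.item()],
        "score": score.item(),
        "bbox": box.tolist(),
    }
    for score, label, box in zip(results["scores"], results["labels"], results["boxes"])
]

# Run the repo's post-processing (see the earlier comments for where
# these functions and structure_class_thresholds come from), then load
# the CSV text into a dataframe.
table_structures = objects_to_structures(objects, tokens, structure_class_thresholds)
cells, _ = structure_to_cells(table_structures[0], tokens)
df = pd.read_csv(StringIO(cells_to_csv(cells)))
```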