Add Google Vision OCR block to workflows #709

brunopicinin · 2024-10-02T00:27:21Z

Description

Adds a new Workflows OCR block based on the Google Vision API. The block outputs the text of the whole image, the detected language, and an sv.Detections(...) object holding the detected text blocks with their labels.
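As a rough illustration of the three outputs described above, the sketch below pulls the full text, a language code, and block-level boxes out of a single entry of a Vision REST response. It is not the code from this PR. The function names `parse_vision_response` and `block_text` are hypothetical, taking the language from the `locale` of the first `textAnnotations` entry is an assumption, and plain tuples stand in for sv.Detections to keep the example dependency-free.

```python
from typing import Any, Dict, List, Tuple


def block_text(block: Dict[str, Any]) -> str:
    """Rebuild a block's text (simplified: spaces between words, newlines between paragraphs)."""
    paragraphs = []
    for paragraph in block.get("paragraphs", []):
        words = [
            "".join(symbol.get("text", "") for symbol in word.get("symbols", []))
            for word in paragraph.get("words", [])
        ]
        paragraphs.append(" ".join(words))
    return "\n".join(paragraphs)


def parse_vision_response(
    response: Dict[str, Any],
) -> Tuple[str, str, List[Dict[str, Any]]]:
    """Return (full_text, language, blocks) for one `responses[i]` entry."""
    annotation = response.get("fullTextAnnotation")
    if not annotation:
        # An image without text typically comes back with no fullTextAnnotation.
        return "", "", []

    full_text = annotation.get("text", "")

    # Assumption: the first textAnnotations entry carries the detected locale.
    text_annotations = response.get("textAnnotations", [])
    language = text_annotations[0].get("locale", "") if text_annotations else ""

    blocks = []
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            vertices = block["boundingBox"]["vertices"]
            # Zero-valued coordinates may be omitted from the JSON, so default to 0.
            xs = [vertex.get("x", 0) for vertex in vertices]
            ys = [vertex.get("y", 0) for vertex in vertices]
            blocks.append(
                {
                    "xyxy": (min(xs), min(ys), max(xs), max(ys)),
                    "confidence": block.get("confidence"),
                    "text": block_text(block),
                }
            )
    return full_text, language, blocks
```

Each entry in `blocks` holds what one row of an sv.Detections object would need: an axis-aligned box, a confidence, and the text to use as a label.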

Type of change

New feature (non-breaking change which adds functionality)

How has this change been tested, please provide a testcase or example of how you tested the change?

Tested locally with an image of pure text, using ocr_text_detection mode:

Tested locally with a picture containing some text, using text_detection mode:

Tested locally with an image without text:
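For reference, a minimal sketch of the request body that the two modes named above would send to the Vision REST endpoint (`POST https://vision.googleapis.com/v1/images:annotate`). The mapping from the block's mode names to Vision feature types is an assumption based on the names used in this thread, and `build_annotate_request` is a hypothetical helper, not a function from the PR.

```python
import base64
from typing import Any, Dict

# Assumption: the block's mode names map onto Vision feature types like this.
MODE_TO_FEATURE = {
    "text_detection": "TEXT_DETECTION",
    "ocr_text_detection": "DOCUMENT_TEXT_DETECTION",
}


def build_annotate_request(image_bytes: bytes, mode: str) -> Dict[str, Any]:
    """Build the JSON body for a single-image images:annotate call."""
    if mode not in MODE_TO_FEATURE:
        raise ValueError(f"Unknown OCR mode: {mode!r}")
    return {
        "requests": [
            {
                # The REST API expects the image bytes as a base64 string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": MODE_TO_FEATURE[mode]}],
            }
        ]
    }
```

The returned dictionary can be serialized with `json.dumps` and posted with any HTTP client; authentication (for example an API key) is left out of the sketch.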

CLAassistant · 2024-10-02T00:27:27Z

All committers have signed the CLA.

brunopicinin · 2024-10-02T00:40:09Z

I ended up using fullTextAnnotation for the blocks. In my tests it gives more natural text blocks than textAnnotations. It also comes with a confidence score in DOCUMENT_TEXT_DETECTION mode.
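To make the comparison concrete, here is a small sketch that counts the regions each annotation style yields for one response entry. It relies on the documented response layout, where the first `textAnnotations` entry covers the whole image and the remaining entries are individual words, while `fullTextAnnotation` groups text into pages and blocks. The helper name `summarize_annotations` is hypothetical.

```python
from typing import Any, Dict


def summarize_annotations(response: Dict[str, Any]) -> Dict[str, Any]:
    """Count the regions each annotation style would produce for one response entry."""
    # textAnnotations[0] spans the whole image; the rest are word-level entries.
    word_entries = response.get("textAnnotations", [])[1:]

    # fullTextAnnotation nests blocks inside pages.
    blocks = [
        block
        for page in response.get("fullTextAnnotation", {}).get("pages", [])
        for block in page.get("blocks", [])
    ]
    return {
        "text_annotation_regions": len(word_entries),
        "full_text_blocks": len(blocks),
        # Confidence is None when the API did not return one for a block.
        "block_confidences": [block.get("confidence") for block in blocks],
    }
```

On a typical photo with a few lines of text, the word-level count is much larger than the block count, which is one way to read the "more natural" grouping mentioned above.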

As an example, here is the difference in the Google Vision API output for the following image:

PawelPeczek-Roboflow

Great contribution 💪
I am very positively surprised by the speed and quality. The comments are minor. I am willing to accept the contribution and apply them myself, or you can do it if you want.

Do you have any comments / suggestions / feature requests regarding Workflows ecosystem as a result of this contribution?

inference/core/workflows/core_steps/models/foundation/google_vision_ocr/v1.py

Add Google Vision OCR block to workflows

0125f42

brunopicinin requested review from PawelPeczek-Roboflow, grzegorz-roboflow, yeldarby, probicheaux and hansent as code owners October 2, 2024 00:27

brunopicinin mentioned this pull request Oct 2, 2024

Hacktoberfest 2024 | Google Vision OCR 🤝 Workflows #692

Closed

PawelPeczek-Roboflow reviewed Oct 2, 2024

PawelPeczek-Roboflow added the release 0.22.0 label Oct 2, 2024

Fix block input metadata and attach parent metadata to detections

1676503

brunopicinin requested a review from PawelPeczek-Roboflow October 2, 2024 18:21

Merge branch 'main' into google-vision-ocr

2595510

PawelPeczek-Roboflow approved these changes Oct 2, 2024

PawelPeczek-Roboflow merged commit bba3742 into roboflow:main Oct 2, 2024
25 of 54 checks passed

brunopicinin deleted the google-vision-ocr branch October 2, 2024 20:36
