Add barcode detection to OCR #651

Open
wants to merge 2 commits into
base: main
from

Conversation

josegomezr

@josegomezr josegomezr commented Aug 31, 2024

LocalOCRService can now detect 1D barcodes and QR Codes via pyzbar.

In our investigations we found that QR detection can be impaired if the QR code contains artifacts or noise. We've included a parametrizable processing pipeline to minimize artifacts and improve scanner performance.

Before (not detected by zbar):
image

After (detected):
image

The results of pyzbar are appended at the end of the OCR output in the form of:

--- QRCODE CODE ---
QUALITY: 1
ORIENTATION: UP
POSITION: [278, 18549, 1352, 1386]
DATA: 190401021.01.1.0001!54,3,0,0,2,0,0,0,3,0,0,1,4,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,2,2,68,1,0!0!0

Note: result extracted from the sample images above
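
As a rough illustration of the output layout above, here is a minimal sketch of a formatter. `DetectedCode` and `format_code` are hypothetical names invented for this example and are not taken from the PR or from pyzbar; the payload is a placeholder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DetectedCode:
    """Hypothetical stand-in for one decoder result; not pyzbar's own type."""
    kind: str            # e.g. "QRCODE"
    quality: int
    orientation: str     # e.g. "UP"
    position: List[int]  # four integers, as in the sample output
    data: str


def format_code(code: DetectedCode) -> str:
    """Render one detected code in the block layout shown in the PR description."""
    return "\n".join([
        f"--- {code.kind} CODE ---",
        f"QUALITY: {code.quality}",
        f"ORIENTATION: {code.orientation}",
        f"POSITION: {code.position}",
        f"DATA: {code.data}",
    ])


# Placeholder payload; the real sample data is much longer.
sample = DetectedCode("QRCODE", 1, "UP", [278, 18549, 1352, 1386], "EXAMPLE-PAYLOAD")
print(format_code(sample))
```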

We also have some preliminary timings for this change. In the best case (pyzbar finds codes on the first try), it takes ~450ms longer than current master; if it has to go through the processing pipeline, every attempt takes ~1.5s.

Using python-opencv and numpy improves the speed of the image filters, but we wanted to keep the changes to a minimum at first.

Open Questions:

  • Are log messages ok?
  • Is the logic aligned with your style? I tried my best to match it

- The barcode detection routine tries multiple times (seven, to be
  precise) to find a code, preprocessing the image with the
  following filters:

    1. Preserve luminance channel
    2. Gaussian blur (pre)
    3. Parametrizable Binary filter (this is the filter adjusted on every
      iteration)
    4. [Dilation & Erosion](https://docs.opencv.org/4.x/db/df6/tutorial_erosion_dilatation.html)
    5. 2x Resize
    6. 1/2 downsize with linear interpolation
    7. Gaussian blur (post)

  It then appends the detected code to the end of the OCR output for
  the image.
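
The retry strategy described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the PR's code: it models only the binary filter step (blur, dilation/erosion and resizing are omitted), it represents an image as a plain list of luminance rows instead of a Pillow image, and `binarize`, `detect_with_retries` and `fake_decode` are hypothetical names. Only the `THRESHOLDS` expression comes from the diff.

```python
from typing import Callable, List, Sequence

Image = List[List[int]]  # rows of 0-255 luminance values (toy representation)

THRESHOLDS = list(range(32, 230, 32))  # seven cut-off values, as in the diff


def binarize(image: Image, threshold: int) -> Image:
    """Binary filter: pixels at or above the threshold become white, the rest black."""
    return [[255 if px >= threshold else 0 for px in row] for row in image]


def detect_with_retries(
    image: Image,
    decode: Callable[[Image], Sequence[str]],
    thresholds: Sequence[int] = THRESHOLDS,
) -> List[str]:
    """Try the untouched image first, then one binarized variant per threshold."""
    results = list(decode(image))
    if results:
        return results
    for threshold in thresholds:
        results = list(decode(binarize(image, threshold)))
        if results:
            return results
    return []  # no results found


def fake_decode(image: Image) -> List[str]:
    """Hypothetical decoder: only 'reads' a clean black/white checker pattern."""
    return ["demo"] if image == [[0, 255], [255, 0]] else []


# A noisy checker that becomes readable once a suitable threshold is applied.
noisy = [[40, 200], [210, 90]]
print(detect_with_retries(noisy, fake_decode))
```

Passing the decoder in as a callable keeps the loop testable without `libzbar0` installed; in the PR the decoder is `pyzbar.decode`.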
Rewrote the opencv & numpy based image processing filters with Pillow
instead. It's a bit slower, but it reduces the dependencies to only
`libzbar0`.
@catileptic catileptic removed their request for review October 22, 2024 10:31
@stchris
Contributor

stchris commented Oct 22, 2024

Hi @josegomezr and thanks for your PR. First of all, I want to apologize for the late reply and thank you for a very interesting addition. I'm fine with the changes overall. The main problem is that ingest-file is not (yet) configurable with feature flags. Everyone's data is different, and I'd be hesitant to make ingest times longer.

Would you be up for adding a setting along the lines of

ENABLE_ZBAR=0 # uses pyzbar to detect bar codes and QR codes

and then only doing the detection if that setting is enabled?
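
One way such a flag could be wired in is sketched below. It assumes the setting is read from an environment variable named `ENABLE_ZBAR` and defaults to off; `zbar_enabled` and `ocr_with_optional_barcodes` are hypothetical helpers, and the project may well prefer its own settings mechanism.

```python
import os
from typing import Callable, Mapping


def zbar_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when ENABLE_ZBAR is set to a truthy value (default: off)."""
    value = environ.get("ENABLE_ZBAR", "0")
    return value.strip().lower() in {"1", "true", "yes", "on"}


def ocr_with_optional_barcodes(
    ocr_text: str,
    detect: Callable[[], str],  # hypothetical callable that runs barcode detection
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Append barcode text to the OCR text only if the flag is enabled."""
    if not zbar_enabled(environ):
        return ocr_text  # detection is skipped entirely, so ingest time is unchanged
    return ocr_text + detect()
```

With the flag off, `detect` is never called, which addresses the concern about longer ingest times for users who do not need barcode detection.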

I will leave other comments inline.

# no results found then
return []

def extract_barcodes(self, image):

  • I'd appreciate type hints here. I assume image is a PIL.Image and the return type is str?
  • (minor) Since this is public, it would be great to add a docstring saying what it does; at first glance I wouldn't necessarily expect it to return text. (Perhaps extract_text_from_barcodes is more appropriate?)
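
A sketch of what the requested signature and docstring might look like follows. It is written as a standalone function with an injected decoder so that it runs without Pillow or pyzbar; the parameter `decode` and the choice to join payloads with newlines are assumptions made for this example, not the PR's behaviour.

```python
from typing import Any, Callable, Sequence


def extract_text_from_barcodes(
    image: Any,  # assumed to be a PIL.Image.Image in the PR; left as Any here
    decode: Callable[[Any], Sequence[bytes]],  # hypothetical injected decoder
) -> str:
    """Decode every barcode or QR code found in `image` and return the payloads as text.

    Returns an empty string when no code is detected. Payloads are decoded as
    UTF-8, with undecodable bytes replaced, and joined with newlines.
    """
    payloads = decode(image)
    return "\n".join(p.decode("utf-8", errors="replace") for p in payloads)
```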

@@ -45,6 +48,81 @@ def extract_ocr_text(self, data, languages=None):
return stringify(text)


class ZBarDetectorService(object):
    THRESHOLDS = list(range(32, 230, 32))

Would you be able to document where these values come from and what they represent?
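
For reference, the expression evaluates to seven values spaced 32 apart. Reading them as cut-off values for the binary filter is an interpretation based on the PR description (one filter setting per attempt, seven attempts); the diff itself does not say so.

```python
# Presumably the cut-off values for the binary filter, one per retry (assumption).
THRESHOLDS = list(range(32, 230, 32))

print(THRESHOLDS)       # [32, 64, 96, 128, 160, 192, 224]
print(len(THRESHOLDS))  # 7
```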

results = pyzbar.decode(image)
# Found it at first try
if len(results) > 0:
    log.info("OCR: zbar found (%d) results at first shot", len(results))

(minor thing but) I'd lower all of the logging calls to log.debug.

@@ -45,6 +48,81 @@ def extract_ocr_text(self, data, languages=None):
return stringify(text)


class ZBarDetectorService(object):

  • It would be great to have a few test cases for this
