Commit

Fixed a rare bug in the OCR system where self-intersecting bounding boxes caused crashes during multi-pass OCR. Implemented a quick fix in the bbox_intersection_area_ratio function to return 0.0 when polygons are invalid due to self-intersection.
Paethon committed Sep 7, 2023
1 parent 5c828e0 commit ae6c282
Showing 2 changed files with 4 additions and 1 deletion.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -10,7 +10,7 @@ The version numbers are according to [Semantic Versioning](http://semver.org/).

### Fixed
- Adds forced conversion to RGB in pillow before sending data to OpenCV to fix a possible bug in Studio

- Fixes a rare bug where self-intersecting bounding boxes caused the OCR system to crash when using multi-pass OCR
### Removed


3 changes: 3 additions & 0 deletions ocr_wrapper/bbox_utils.py
@@ -15,6 +15,9 @@ def bbox_intersection_area_ratio(bb1: BBox, bb2: BBox) -> float:
    """
    self_poly = bb1.get_shapely_polygon()
    that_poly = bb2.get_shapely_polygon()
    # Sometimes the polygons are invalid (usually because of self-intersection), in which case we return 0.0. We should take a closer look at why this happens, but for now it seems to be a rare occurrence, and this quick fix doesn't seem to affect the results much.
    if not self_poly.is_valid or not that_poly.is_valid:
        return 0.0
    if self_poly.intersects(that_poly):
        inter_poly = self_poly.intersection(that_poly)
        return inter_poly.area / self_poly.area
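
Note on the failure mode: one plausible way a four-corner bounding box becomes self-intersecting is that its corners are listed in an order that crosses two opposite edges (a "bow-tie"). The sketch below is a hand-rolled illustration of that case only. It does not use shapely, it is not the check this commit adds (the commit relies on the polygon's `is_valid` attribute), and the helper names `_orientation`, `_segments_cross`, and `quad_self_intersects` are hypothetical. It also ignores degenerate cases such as touching or collinear edges.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _orientation(p: Point, q: Point, r: Point) -> float:
    """Twice the signed area of triangle p, q, r (positive if counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _segments_cross(a: Point, b: Point, c: Point, d: Point) -> bool:
    """True if segments a-b and c-d cross at a single interior point."""
    return (
        _orientation(a, b, c) * _orientation(a, b, d) < 0
        and _orientation(c, d, a) * _orientation(c, d, b) < 0
    )


def quad_self_intersects(corners: List[Point]) -> bool:
    """Check whether a four-corner box, traversed in the given order, is a bow-tie."""
    p0, p1, p2, p3 = corners
    # In a quadrilateral, only the two pairs of opposite edges can cross.
    return _segments_cross(p0, p1, p2, p3) or _segments_cross(p1, p2, p3, p0)


# Corners listed around the perimeter: a simple (valid) box.
well_ordered = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Same corners, but two of them swapped: the edges cross in the middle.
bow_tie = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]

print(quad_self_intersects(well_ordered))  # False
print(quad_self_intersects(bow_tie))       # True
```

If corner ordering does turn out to be the cause, a check along these lines could help locate the offending boxes upstream, rather than returning 0.0 for them at comparison time.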