A Practitioner's View on Binarization
This page aims to give an idea of how to approach binarization and image enhancement from the OCR-D user's perspective. It will aggregate information on possible processors and parameters, general recommendations, and a few selected pathological or hard examples, with hints on interdependencies with other processing steps.
As discussed in the workflow guide, it can be beneficial to do binarization repeatedly on both the raw (uncropped) page and the cropped page, the text/table regions, or the text lines. However, not all processors are capable of running on all hierarchy levels or when some binarization is already present.
Also, not all algorithms cope well with unnormalized/unenhanced images (with too low/high contrast, too low/high brightness, far-off white point, noise, optical distortion). Thus, there's a strong connection with image preprocessing (like raw denoising).
Moreover, some algorithms produce quite noisy results. This can be favourable for recognition (if the model is trained appropriately), but almost always degrades segmentation quality. Also, after binarization there is usually no need for very high pixel density images. Thus, there's a strong connection with image postprocessing (like binary denoising and downscaling).
Furthermore, at least for the class of adaptive algorithms, there is an additional interdependency. Methods like Sauvola or Niblack require setting an adequate window size (such that windows are always likely to include both text and background, of sufficient variance, but are still local "enough"). Thus, there's a strong connection with DPI (pixel density of the input images), especially DPI metadata.
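The dependency between window size and pixel density can be sketched as follows. This is only an illustration, not the behaviour of any particular OCR-D processor: the function name `window_size_for_dpi`, the default span of one inch, and the fallback of 300 DPI for missing metadata are all assumptions chosen for the example.

```python
def window_size_for_dpi(dpi, span_inches=1.0, fallback_dpi=300):
    """Return an odd window size in pixels covering `span_inches` of the page.

    Hypothetical helper: the span and the fallback DPI are illustrative
    assumptions, not defaults of any OCR-D processor.
    """
    # Missing or implausible DPI metadata: fall back to an assumed density.
    if not dpi or dpi <= 0:
        dpi = fallback_dpi
    size = max(3, int(round(dpi * span_inches)))
    # Local window filters are centred on a pixel, so keep the size odd.
    if size % 2 == 0:
        size += 1
    return size


if __name__ == "__main__":
    for dpi in (150, 300, 600, 0):
        print(dpi, window_size_for_dpi(dpi))
```

The point of the sketch is that a window fixed in pixels covers very different physical areas at 150 and 600 DPI, so wrong or missing DPI metadata directly changes what an adaptive algorithm sees as "local".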
... list processors as in the workflow table, with short remarks; also available now: ocrd-preprocess-image, wrapping arbitrary external binarization or image enhancement tools.
... what to do under which circumstances, source media, resolution etc.
This image has been provided by Jochen Barth (UB Heidelberg):
This image has been provided by Kay-Michael Würzner (SLUB):
... processor/parameter options, preprocessing options, discussion, more examples
sauvola + kim is recommended in https://ocr-d.de/en/workflows for binarization, but the sauvola parameter k depends on the black and white levels of the image. The wolf algorithm works independently of the black and white levels (it does not need an individually tuned parameter). singh is second best, and sauvola-ms-split produces the thinnest of the acceptable results. See https://digi.ub.uni-heidelberg.de/diglitData/v/olena-20200807.tif (a multi-layer TIFF showing the different methods, with the method label at the bottom, at varying black, white, and noise levels).
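Why k matters for Sauvola but less so for Wolf can be seen from the threshold formulas as they are commonly stated in the literature. The sketch below is an assumption-laden illustration, not the code of any OCR-D processor: the function names, the values of k, and the fixed dynamic range R = 128 are chosen for the example, and `max_std` is assumed to be greater than zero.

```python
def sauvola_threshold(mean, std, k=0.34, r=128.0):
    """Sauvola: T = m * (1 + k * (s / R - 1)), with R a fixed dynamic range.

    The values of k and R here are illustrative assumptions.
    """
    return mean * (1.0 + k * (std / r - 1.0))


def wolf_threshold(mean, std, min_gray, max_std, k=0.5):
    """Wolf-Jolion: T = (1 - k) * m + k * M + k * (s / R) * (m - M).

    M (min_gray) is the darkest gray value of the image and R (max_std) the
    largest local standard deviation, so both are measured, not configured.
    """
    return (1.0 - k) * mean + k * min_gray + k * (std / max_std) * (mean - min_gray)


if __name__ == "__main__":
    # The same flat background window (no local contrast) on a bright and a dim scan.
    for label, mean, min_gray in (("bright", 220.0, 30.0), ("dim", 120.0, 30.0)):
        print(label,
              sauvola_threshold(mean, 0.0),
              wolf_threshold(mean, 0.0, min_gray, 40.0))
```

In the Sauvola formula the threshold scales with the local mean and a fixed R, so a given k only suits a certain range of black and white levels. The Wolf formula instead normalizes by the image's own darkest value and maximum local deviation.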