2020 08 06
OCR-D has focused on developing consistent command line interfaces, based on processor-provided metadata and a convention that maps mets:fileGrp/mets:file to directories and files in the file system.
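To illustrate the mapping convention, here is a minimal sketch using only the Python standard library. It assumes that each `mets:file` in a `mets:fileGrp` is stored in a directory named after the group's `USE` attribute; the sample METS document and the helper names `files_by_group` and `follows_convention` are hypothetical and not part of OCR-D.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical, heavily trimmed METS document for illustration only.
SAMPLE_METS = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="OCR-D-IMG">
      <mets:file ID="OCR-D-IMG_0001" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="FILE"
                     xlink:href="OCR-D-IMG/OCR-D-IMG_0001.tif"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="OCR-D-BIN">
      <mets:file ID="OCR-D-BIN_0001" MIMETYPE="image/png">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="FILE"
                     xlink:href="OCR-D-BIN/OCR-D-BIN_0001.png"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>
"""


def files_by_group(mets_xml):
    """Return {fileGrp USE: [file locations]} for a METS document."""
    root = ET.fromstring(mets_xml)
    mapping = {}
    for grp in root.iterfind(".//mets:fileGrp", NS):
        hrefs = []
        for mets_file in grp.iterfind("mets:file", NS):
            flocat = mets_file.find("mets:FLocat", NS)
            if flocat is not None:
                hrefs.append(flocat.get(XLINK_HREF))
        mapping[grp.get("USE")] = hrefs
    return mapping


def follows_convention(mapping):
    """Check that every file lives in a directory named after its group."""
    return all(
        PurePosixPath(href).parent.name == use
        for use, hrefs in mapping.items()
        for href in hrefs
    )


if __name__ == "__main__":
    groups = files_by_group(SAMPLE_METS)
    for use, hrefs in groups.items():
        print(use, "->", hrefs)
    print("convention holds:", follows_convention(groups))
```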
While this has proven to be an effective set of design patterns that is comfortable to develop in, it is a very low-level API. With phase 3 of OCR-D, there will be much more emphasis on scalable solutions that are easy to deploy and integrate into existing software and workflows.
There will be a need for a client-server API that is more abstract than the "low-level" command line interface. We should plan and implement such APIs cooperatively to ensure interoperability.
Workspaces can be placed in a specific directory via FTP or similar.
Workers try to move the workspace to their local storage.
Workspaces are stored at a location with a URL.
A message is posted on a queue with the command line and the URL of the workspace.
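The queue-based variant could look roughly like the following sketch. It is not an agreed message format: the JSON field names `command_line` and `workspace_url`, the helper names and the example URL are assumptions, and the standard library's `queue.Queue` merely stands in for a real message broker.

```python
import json
import queue
import shlex


def make_job_message(command_line, workspace_url):
    """Serialize a job as JSON (field names are hypothetical)."""
    return json.dumps(
        {"command_line": command_line, "workspace_url": workspace_url}
    )


def take_next_job(job_queue):
    """Pop one message and split its command line into an argv list.

    A real worker would now fetch the workspace from the URL and run the
    command; this sketch only returns what it would act on.
    """
    message = json.loads(job_queue.get_nowait())
    return message["workspace_url"], shlex.split(message["command_line"])


if __name__ == "__main__":
    jobs = queue.Queue()  # stand-in for a real message broker
    jobs.put(
        make_job_message(
            "ocrd-dummy -I OCR-D-IMG -O OCR-D-DUMMY",
            "https://example.org/workspaces/123/mets.xml",  # hypothetical URL
        )
    )
    url, argv = take_next_job(jobs)
    print(url, argv)
```

Keeping the command line as a single string in the message preserves the CLI as the lowest-level API; the worker only needs to know how to fetch a workspace and run a command.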
- How do we translate CLI to HTTP/REST calls?
- How should workspace storage work - would a reference implementation of a "workspace repository" help?
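For the first question above, one possible direction is to derive the command line mechanically from a JSON request body. The sketch below assumes a hypothetical body layout (`mets`, `input_file_grps`, `output_file_grps`, `parameters`) and a hypothetical processor name; only the `-m`, `-I`, `-O` and `-p` options are taken from the existing processor command line interface.

```python
import json


def request_to_argv(processor, body):
    """Translate a (hypothetical) JSON request body into a processor argv."""
    argv = [
        processor,
        "-m", body.get("mets", "mets.xml"),
        "-I", ",".join(body["input_file_grps"]),
        "-O", ",".join(body["output_file_grps"]),
    ]
    parameters = body.get("parameters")
    if parameters:
        # Parameters are passed as a JSON string on the command line.
        argv += ["-p", json.dumps(parameters, sort_keys=True)]
    return argv


if __name__ == "__main__":
    # "ocrd-example-binarize" and its "threshold" parameter are made up.
    body = {
        "input_file_grps": ["OCR-D-IMG"],
        "output_file_grps": ["OCR-D-BIN"],
        "parameters": {"threshold": 0.5},
    }
    print(request_to_argv("ocrd-example-binarize", body))
```

Returning an argv list rather than a shell string avoids quoting problems when the server hands the call to a subprocess or a job queue.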
In-house solution at Stabi Berlin:
- Python
- flask for web interface
- celery/redis for jobs
- OpenAPI (Swagger) definitions
- CLI is the lowest-level API
- Technology and architecture for scaling and load-balancing CLI processes is outside OCR-D's scope (backend)
- Consider common integration guidelines:
- Specify file format for exchangeable workflows
- Mechanism for parameterizing workflows (for workflow "inheritance")
- Define standard workflows that work well for certain groups of works
- Keep standard workflows in a Git repository
- Digitization software user interfaces should offer both workflow and language selection
- "Language" is to be understood more broadly than in ABBYY backends (which Kitodo/Goobi, Visual Library and DWork all support, so we can build on that); it is defined more by the training material
- Design an OCR-D HTTP interface
- discovery of available processors, workflows, parameter sets, models
- discovery of RAM, CPU cores, GPUs, available slots, load
- auth: authentication and authorization
- jobs: list all, list one
- processor: list available, run one
- workspace: list all, search by job/processor
- => Define as OpenAPI/Swagger
- => All projects with API needs should consider API development/maintenance a low-effort but continuous task
- An OCR-D Training HTTP interface to provide training services:
- Revisit okralact's API and implementation
- Specify how (work-specific) ground truth should be serialized in a workspace
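As a starting point for the OpenAPI/Swagger definition of the HTTP interface outlined above, the resources (discovery, jobs, processor, workspace) could be laid out as in the following sketch. All paths, parameter names and summaries are assumptions for discussion, not an agreed specification; the script only builds an OpenAPI 3 document as a Python dictionary and prints it as JSON.

```python
import json


def operation(summary, path_params=()):
    """Build a minimal OpenAPI operation object."""
    op = {"summary": summary, "responses": {"200": {"description": "OK"}}}
    if path_params:
        op["parameters"] = [
            {
                "name": name,
                "in": "path",
                "required": True,
                "schema": {"type": "string"},
            }
            for name in path_params
        ]
    return op


def build_spec():
    """Return a draft OpenAPI document; every path here is hypothetical."""
    return {
        "openapi": "3.0.3",
        "info": {"title": "OCR-D HTTP interface (sketch)", "version": "0.0.1"},
        "paths": {
            "/discovery": {
                "get": operation("Processors, workflows, models, resources")
            },
            "/job": {"get": operation("List all jobs")},
            "/job/{jobId}": {"get": operation("List one job", ["jobId"])},
            "/processor": {"get": operation("List available processors")},
            "/processor/{name}": {
                "post": operation("Run one processor", ["name"])
            },
            "/workspace": {"get": operation("List or search workspaces")},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_spec(), indent=2))
```

Authentication and authorization are left out here; in OpenAPI they would typically be declared under `components.securitySchemes`.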