Stefan Weil edited this page Jan 15, 2024 · 10 revisions

Welcome to the quiver-benchmarks wiki!

Using quiver-benchmarks with podman

The current code uses docker compose, which requires Docker Desktop. Docker Desktop is non-free software, and its license requires fees. Free usage is restricted to a few cases which might not apply, so an alternative may be desirable.

Older versions of Docker, which had a more liberal license, worked with a separate docker-compose, but they are no longer available. The free alternative Podman uses podman-compose. Neither docker-compose nor podman-compose can be used with the current code without modifications.

Luckily it is possible to add support for podman / podman-compose. The Git branch https://github.com/UB-Mannheim/quiver-benchmarks/tree/fixes contains the necessary modifications.

Install podman and podman-compose

podman is provided by Debian / Ubuntu, so simply install it:

apt install podman

podman-compose is also provided by Debian / Ubuntu, but the version from Debian bookworm is too old (see Debian bug report). Therefore get a newer version either from bookworm-backports or from PyPI. The PyPI version can be installed in a virtual environment:

python3 -m venv venv
source venv/bin/activate
pip install -U pip setuptools
pip install podman-compose
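
Before continuing, it can help to confirm that both tools are found on PATH. The following is a minimal sketch; the function name missing_tools is made up for illustration and is not part of quiver-benchmarks.

```shell
# Print the name of every given command that is not found on PATH.
# Returns 0 if all commands are available, 1 otherwise.
# (missing_tools is a hypothetical helper, not part of quiver-benchmarks.)
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}
```

Calling missing_tools podman podman-compose inside the activated virtual environment should print nothing if both commands are available.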

Get quiver-benchmarks

Here the fork from UB Mannheim is used to get the required modifications for podman.

mkdir -p ~/src/github/OCR-D
cd ~/src/github/OCR-D
git clone --branch fixes https://github.com/UB-Mannheim/quiver-benchmarks.git
cd quiver-benchmarks

Build container image

make build

Start container app

make start

Install ground truth data

make prepare-default-gt

This step currently fails to install any GT data from https://github.com/tboenig/ because those repositories have since published new releases which are incompatible.

Therefore that GT data must be installed manually using older releases:

cd gt
curl -Lo 16_frak_simple.zip https://github.com/tboenig/16_frak_simple/releases/download/v1.0.0/bagitDump-v79.zip
curl -Lo 17_frak_simple.zip https://github.com/tboenig/17_frak_simple/releases/download/v1.0.4/bagitDump-v16.zip
curl -Lo 17_frak_complex.zip https://github.com/tboenig/17_frak_complex/releases/download/v1.0.1/bagitDump-v10.zip
curl -Lo 18_frak_simple.zip https://github.com/tboenig/18_frak_simple/releases/download/v10/bagitDump-v10.zip
curl -Lo 18_frak_complex.zip https://github.com/tboenig/18_frak_complex/releases/download/v1.0.0/bagitDump-v11.zip
curl -Lo 19_frak_simple.zip https://github.com/tboenig/19_frak_simple/releases/download/v1.0.0/bagitDump-v10.zip
curl -Lo 16_ant_simple.zip https://github.com/tboenig/16_ant_simple/releases/download/v1.0.0/bagitDump-v11.zip
curl -Lo 16_ant_complex.zip https://github.com/tboenig/16_ant_complex/releases/download/v1.0.11/bagitDump-v53.zip
curl -Lo 18_ant_simple.zip https://github.com/tboenig/18_ant_simple/releases/download/v1.0.0/bagitDump-v11.zip
curl -Lo 19_ant_simple.zip https://github.com/tboenig/19_ant_simple/releases/download/v1.1.5/bagitDump-v19.zip
curl -Lo 17_fontmix_simple.zip https://github.com/tboenig/17_fontmix_simple/releases/download/v1.0.0/bagitDump-v17.zip
curl -Lo 18_fontmix_complex.zip https://github.com/tboenig/18_fontmix_complex/releases/download/v1.0.0/bagitDump-v10.zip
cd ..
make prepare-default-gt
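
Because the curl commands above do not use --fail, a failed download (for example a 404) still leaves a file behind which contains the error message instead of an archive. The following sketch checks that every downloaded file starts with the ZIP signature "PK". The function name check_zips is hypothetical, and the check only looks at the first two bytes, so it does not prove that an archive is complete.

```shell
# Report files in the given directory (default: gt) that do not start with
# the ZIP signature "PK". Returns 1 if any such file is found, else 0.
# (check_zips is a hypothetical helper, not part of quiver-benchmarks.)
check_zips() {
    dir=${1:-gt}
    rc=0
    for f in "$dir"/*.zip; do
        [ -e "$f" ] || continue   # no match: the glob pattern stays literal
        if [ "$(head -c 2 "$f")" != "PK" ]; then
            echo "not a ZIP archive: $f"
            rc=1
        fi
    done
    return "$rc"
}
```

Running check_zips from the top-level directory before make prepare-default-gt lists any download that needs to be repeated.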

Run the benchmark tests

Running all benchmark tests takes several hours:

make run

Repeated executions of this command skip tests which already have results and only run new tests. Each execution writes a new log file to the directory logs/.
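
Since a run takes hours, it is handy to find the log file of the most recent execution. The sketch below picks the newest file by modification time, because the naming scheme of the log files is not described here. The function name latest_log is hypothetical.

```shell
# Print the path of the most recently modified file in a log directory
# (default: logs). Prints nothing if the directory is empty or missing.
# (latest_log is a hypothetical helper, not part of quiver-benchmarks.)
latest_log() {
    dir=${1:-logs}
    newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
    if [ -n "$newest" ]; then
        echo "$dir/$newest"
    fi
}
```

For example, tail -f "$(latest_log)" follows the output of a running benchmark.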

Add more benchmark tests

Additional tests can either clone existing tests and change process parameters (for example use a different OCR model) or implement a completely different OCR-D workflow. Add them to the directory workflows/ocrd_workflows/ which already contains some benchmark tests. Make sure that the filename of an OCR benchmark ends with _ocr.txt.
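
Cloning an existing OCR test and changing one parameter can be sketched as follows. The function name clone_ocr_workflow is hypothetical, and it assumes that the parameter to change appears as a plain string in the workflow file. The old and new strings are passed to sed, so they must not contain the character | or other characters which are special in sed patterns.

```shell
# Copy an OCR workflow file and replace every occurrence of one string
# (for example a model name). Refuses target names without the suffix
# _ocr.txt. (clone_ocr_workflow is a hypothetical helper.)
clone_ocr_workflow() {
    src=$1 dst=$2 old=$3 new=$4
    case "$dst" in
        *_ocr.txt) ;;
        *) echo "target name must end with _ocr.txt: $dst" >&2
           return 1 ;;
    esac
    sed "s|$old|$new|g" "$src" > "$dst"
}
```

A call would look like clone_ocr_workflow workflows/ocrd_workflows/existing_ocr.txt workflows/ocrd_workflows/variant_ocr.txt OldModel NewModel, where both file names and both model names are placeholders.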

Create reports

A summary of all test results is written to data/*.json. TODO: how to create a nice report from the JSON file.
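
Until a proper report exists, the JSON files can at least be pretty-printed for manual inspection. This sketch makes no assumption about the structure of the JSON data. It uses jq if installed, falls back to Python's json.tool module, and otherwise prints the file unchanged. The function name show_results is hypothetical.

```shell
# Pretty-print every JSON file in a directory (default: data).
# (show_results is a hypothetical helper, not part of quiver-benchmarks.)
show_results() {
    dir=${1:-data}
    for f in "$dir"/*.json; do
        [ -e "$f" ] || continue   # no match: the glob pattern stays literal
        echo "== $f"
        if command -v jq >/dev/null 2>&1; then
            jq . "$f"
        elif command -v python3 >/dev/null 2>&1; then
            python3 -m json.tool "$f"
        else
            cat "$f"
        fi
    done
}
```

Piping the output through a pager, as in show_results | less, keeps large result files readable.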