From d886e682bff8d871f770a427f300898193154f87 Mon Sep 17 00:00:00 2001
From: Konstantin Baierer
Date: Thu, 21 Jan 2021 16:21:28 +0100
Subject: [PATCH] models: document mounting models in docker

---
 site/en/models.md | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/site/en/models.md b/site/en/models.md
index c2c204e41..203563f6d 100644
--- a/site/en/models.md
+++ b/site/en/models.md
@@ -231,6 +231,31 @@ ocrd-tesserocr-recognize -I OCR-D-SEG-LINE -O OCR-D-OCR-TESS -P model 'deut+frk'
 ocrd-tesserocr-recognize -I OCR-D-SEG-LINE -O OCR-D-OCR-TESS -P Fraktur
 ```
 
+# Models and Docker
+
+We recommend a two-step process to make models available in Docker. First,
+download all the models that you want to use to the host system. Then, when
+running the Docker container, mount that local directory into the container
+alongside the data you want to process.
+
+Download the models to `$HOME/.local/share/ocrd-resources`:
+
+```sh
+ocrd resmgr download --location data ocrd-tesserocr-recognize eng.traineddata
+ocrd resmgr download --location data ocrd-calamari-recognize default
+# ...
+```
+
+Run the `ocrd_all` Docker container:
+
+```sh
+docker run --user $(id -u) --workdir /data \
+    --volume $PWD:/data \
+    --volume $HOME/.local/share/ocrd-resources:/ocrd-resources \
+    ocrd_all ocrd-tesserocr-recognize -I IN -O OUT -P model eng
+```
+
+
 # Model training
 
 With the pretrained models mentioned above, good results can be obtained for many originals. Nevertheless, the