Merge pull request #197 from OCR-D/model-fixes
models: formatting, new section on download-on-demand
kba authored Jan 26, 2021
2 parents 580a507 + ccfeee8 commit 9863e1d
Showing 1 changed file with 28 additions and 13 deletions.
41 changes: 28 additions & 13 deletions site/en/models.md
@@ -90,7 +90,9 @@ ocrd resmgr download ocrd-tesserocr-recognize '*'
**NOTE:** Equally, the special processor `*` can be used instead of a processor and a resource
to download *all* known resources for *all* installed processors:

ocrd resmgr download '*'
```sh
ocrd resmgr download '*'
```

(In either case, `*` must be in quotes or escaped to avoid wildcard expansion by the shell.)
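
To see why the quoting matters, here is a minimal sketch that uses a hypothetical helper, `show_args`, as a stand-in for `ocrd resmgr download`, so it can be run safely anywhere. It only prints the arguments the shell actually hands over:

```sh
# show_args is a hypothetical stand-in for a command such as `ocrd resmgr download`:
# it prints each argument it receives, wrapped in brackets.
show_args() { printf '[%s]' "$@"; printf '\n'; }

show_args '*'   # quoted: the literal asterisk is passed through
show_args \*    # escaped: same effect as quoting
show_args *     # unquoted: the shell substitutes matching file names from the current directory (if any)
```

In the first two calls the command sees a single `*` argument. In the third, it typically sees the names of the files in the working directory instead.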

@@ -164,7 +166,13 @@ Moreover, that variable can easily be overridden during installation.
However, there are use cases where `system` or even `cwd` should be
used as location to store resources, hence the `--location` option.

## Downloading on-demand

When you provide a value to a file parameter, such as ocrd_calamari's `checkpoint_dir`
parameter, the value will be resolved by OCR-D/core. If the resource is not
found in the filesystem, OCR-D/core will try to find a matching resource in
its list of bundled resources. If the parameter value matches the `name` of one
of those resources, it will be **downloaded on-demand**.
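
The lookup order described above can be sketched roughly as follows. This is an illustration only, not OCR-D/core's actual implementation: the function `resolve_resource`, the variable `KNOWN_RESOURCES` and its contents are assumptions made up for this example, and the filesystem check is simplified to a single existence test.

```sh
# Hypothetical sketch of the resolution order; not OCR-D/core's real code.
# KNOWN_RESOURCES stands in for the list of bundled resources (names are illustrative).
KNOWN_RESOURCES="default eng.traineddata"

resolve_resource() {
    value=$1
    # 1. A value that exists in the filesystem is used as-is.
    if [ -e "$value" ]; then
        echo "found: $value"
        return 0
    fi
    # 2. Otherwise, compare the value against the name of each known resource.
    for name in $KNOWN_RESOURCES; do
        if [ "$name" = "$value" ]; then
            echo "download: $value"
            return 0
        fi
    done
    # 3. No file and no matching name: nothing can be downloaded.
    echo "unresolved: $value"
    return 1
}
```

Under these assumptions, `resolve_resource default` would report a download (provided no file named `default` exists in the working directory), while a value that is neither an existing path nor a known name is reported as unresolved.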

## Notes on specific processors

@@ -245,18 +253,25 @@ additional models into that location using `ocrd resmgr`.
The following will assume (without loss of generality) that your host-side data
path is under `./data`, and the host-side resource path is under `./models`:

- To download models to `./models` in the host FS and `/usr/local/share/ocrd-resources` in Docker:
docker run --user $(id -u) \
--volume $PWD/models:/usr/local/share/ocrd-resources \
ocrd/all \
ocrd resmgr download ocrd-tesserocr-recognize eng.traineddata\; \
ocrd resmgr download ocrd-calamari-recognize default\; \
...
- To run processors, as usual do:
docker run --user $(id -u) --workdir /data \
--volume $PWD/data:/data \
--volume $PWD/models:/usr/local/share/ocrd-resources \
ocrd/all ocrd-tesserocr-recognize -I IN -O OUT -P model eng
To download models to `./models` in the host FS and `/usr/local/share/ocrd-resources` in Docker:

```sh
docker run --user $(id -u) \
--volume $PWD/models:/usr/local/share/ocrd-resources \
ocrd/all \
ocrd resmgr download ocrd-tesserocr-recognize eng.traineddata\; \
ocrd resmgr download ocrd-calamari-recognize default\; \
...
```

To run processors, as usual do:

```sh
docker run --user $(id -u) --workdir /data \
--volume $PWD/data:/data \
--volume $PWD/models:/usr/local/share/ocrd-resources \
ocrd/all ocrd-tesserocr-recognize -I IN -O OUT -P model eng
```

This principle applies to all `ocrd/*` Docker images, e.g. you can replace `ocrd/all` above with `ocrd/tesserocr` as well.
