Skip to content

dinurp/tesseract-in-cf

Repository files navigation

tesseract-in-cf

  • Deployent of tesseract in CF
  • Sample Flask app using pytesseract
  • logging to app logger

how to create tesseract-ocr-mrz.deb?

mkdir -p tesseract-ocr-mrz/DEBIAN
cat - > tesseract-ocr-mrz/DEBIAN/control <<EOF
Package: tesseract-ocr-mrz
Source: tesseract-lang
Version: 4.00~git-0f039b
Architecture: all
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Recommends: tesseract-ocr (>= 3.99)
Breaks: tesseract-ocr (<< 3.99)
Replaces: tesseract-ocr-data (<< 2)
Provides: tesseract-ocr-lang, tesseract-ocr-language
Section: graphics
Priority: optional
Homepage: https://github.com/DoubangoTelecom/tesseractMRZ
Description: tesseract-ocr language files for Machine Readable Zone (MRZ)
 Tesseract is an open source Optical Character Recognition (OCR)
 Engine. It can be used directly, or (for programmers) using an API to
 extract printed text from images. This package contains the data
 needed for processing images of MRZ in ID documents..
EOF

mkdir -p tesseract-ocr-mrz/usr/share/tesseract-ocr/4.00/tessdata
wget  -O tesseract-ocr-mrz/usr/share/tesseract-ocr/4.00/tessdata/mrz.traineddata \
         https://github.com/DoubangoTelecom/tesseractMRZ/raw/master/tessdata_best/mrz.traineddata

find tesseract-ocr-mrz -type d | xargs chmod 755
sudo chown root:root -R tesseract-ocr-mrz
sudo dpkg-deb --build tesseract-ocr-mrz

upload tesseract-ocr-mrz.deb somewhere where .deb extension can be retained and refer to it from apt.yaml.

Reference:

Enabling application logging

  • create an instance of application-logs
  • bind this instance to the app
  • open Kibana (logs.< cf Landscape domain >) to view the logs

Reference:

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

No packages published