tesseract-in-cf

  • Deployment of Tesseract in CF
  • Sample Flask app using pytesseract
  • Logging to the app logger

How to create tesseract-ocr-mrz.deb

mkdir -p tesseract-ocr-mrz/DEBIAN
cat - > tesseract-ocr-mrz/DEBIAN/control <<EOF
Package: tesseract-ocr-mrz
Source: tesseract-lang
Version: 4.00~git-0f039b
Architecture: all
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Recommends: tesseract-ocr (>= 3.99)
Breaks: tesseract-ocr (<< 3.99)
Replaces: tesseract-ocr-data (<< 2)
Provides: tesseract-ocr-lang, tesseract-ocr-language
Section: graphics
Priority: optional
Homepage: https://github.com/DoubangoTelecom/tesseractMRZ
Description: tesseract-ocr language files for Machine Readable Zone (MRZ)
 Tesseract is an open source Optical Character Recognition (OCR)
 Engine. It can be used directly, or (for programmers) using an API to
 extract printed text from images. This package contains the data
 needed for processing images of MRZ in ID documents.
EOF

mkdir -p tesseract-ocr-mrz/usr/share/tesseract-ocr/4.00/tessdata
wget  -O tesseract-ocr-mrz/usr/share/tesseract-ocr/4.00/tessdata/mrz.traineddata \
         https://github.com/DoubangoTelecom/tesseractMRZ/raw/master/tessdata_best/mrz.traineddata

find tesseract-ocr-mrz -type d -exec chmod 755 {} +
sudo chown root:root -R tesseract-ocr-mrz
sudo dpkg-deb --build tesseract-ocr-mrz
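
Before uploading, it can be worth confirming that the package built above contains what is expected. The sketch below is an optional check, not part of the original steps: `check_deb` is a hypothetical helper name, and it assumes the build left `tesseract-ocr-mrz.deb` in the current directory. It prints the control fields with `dpkg-deb --info` and checks the file listing from `dpkg-deb --contents` for `mrz.traineddata`.

```shell
#!/bin/sh
# Sanity-check the package produced by `dpkg-deb --build`.
# Assumption: the .deb sits in the current directory under this name.
deb=tesseract-ocr-mrz.deb

# check_deb is a hypothetical helper, not part of any tool.
check_deb() {
    # Show the control fields so they can be compared with DEBIAN/control.
    dpkg-deb --info "$1" || return 1
    # Confirm the traineddata file made it into the payload.
    dpkg-deb --contents "$1" | grep -q 'tessdata/mrz\.traineddata$'
}

if [ -f "$deb" ]; then
    check_deb "$deb" && echo "package looks complete"
else
    echo "$deb not found; build it first" >&2
fi
```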

Upload tesseract-ocr-mrz.deb to a location whose URL keeps the .deb extension, then reference that URL from apt.yaml.
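
As a rough illustration of that last step, the snippet below writes a minimal apt.yaml. The download URL is a placeholder, and the layout (a top-level `packages` list that mixes distribution package names with direct .deb URLs) is an assumption based on how the Cloud Foundry apt buildpack is commonly configured. Check it against the buildpack you actually use.

```shell
#!/bin/sh
# Hypothetical download location; replace with wherever the .deb was uploaded.
DEB_URL="https://example.com/downloads/tesseract-ocr-mrz.deb"

# Write apt.yaml in the app root.
# Assumed layout: a "packages" list holding package names and .deb URLs.
cat > apt.yaml <<EOF
---
packages:
  - tesseract-ocr
  - ${DEB_URL}
EOF
```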

Reference:

Enabling application logging

  • Create an instance of the application-logs service
  • Bind this instance to the app
  • Open Kibana (logs.<CF landscape domain>) to view the logs
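
The first two steps can be sketched with the cf CLI as follows. The app name, instance name, and plan name are all placeholders (run `cf marketplace` to see which plans your landscape offers), and the final restage is an assumption about how the binding is picked up by the app. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical names; substitute your own app and service instance names.
APP_NAME="${APP_NAME:-tesseract-in-cf}"
LOG_INSTANCE="${LOG_INSTANCE:-app-logs}"
LOG_PLAN="${LOG_PLAN:-lite}"   # assumed plan name; verify with `cf marketplace`

# Print each command; execute it only when DRY_RUN is set to 0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Create the logging service instance and bind it to the app.
run cf create-service application-logs "$LOG_PLAN" "$LOG_INSTANCE"
run cf bind-service "$APP_NAME" "$LOG_INSTANCE"
# Restage so the app sees the new binding in its environment.
run cf restage "$APP_NAME"
```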

Reference: