-
Notifications
You must be signed in to change notification settings - Fork 1
MacOSX command line program to perform optical character recognition (OCR) on images and PDFs
License
srirangav/vocr
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
README ------ vocr v0.3.1 Homepage: https://github.com/srirangav/vocr About: vocr is a MacOSX command line program that can perform optical character recognition (OCR) on images and PDF files. It outputs any text found in the input files to stdout. vocr relies on, and derives its names from, the Vision framework (v for [V]ision). Usage: vocr [-v] [-f] [-p] [-a [accurate|fast]] [-i [no|tab]] [-l [lang]] [files] If -v is specified, vocr runs in [v]erbose mode and outputs errors and informational messages. If -f is specified, vocr will perform ocr on each page in a PDF. By default, if a PDF already contains a text representation of a given page, vocr will output that text. If -p is specified, when OCR'ing a PDF, a page break (^L) will be inserted at the end of each page. if -a is specified with the 'fast' option, vocr will use the 'fast' ocr algorithm, which may be useful for non-English languages, such as German. If -a is specified with the 'accurate' option, vour will use the 'accurate' ocr algorithm (which is the default). If -i is specified with the 'no' option, vocr will not attempt to indent any text that is OCR'ed. If -i is specified with the 'tab' option, vocr will indent using tabs instead of spaces (by default vocr indents using spaces). If -l is specified, on MacOSX 11.x (BigSur) and newer, vocr will ask the Vision framework to recognize the text in the specified language. The supported language options are: 'de' - German 'en' - English 'fr' - French 'it' - Italian 'pt' - Portuguese 'es' - Spanish 'zh' - Chinese Build: $ ./configure $ make Install: $ ./configure $ make $ make install By default, vocr is installed in /usr/local/bin. To install it in a different location, the alternate installation prefix can be supplied to configure: $ ./configure --prefix="<prefix>" or, alternately to make: $ make install PREFIX="<prefix>" For example, the following will install vocr in /opt/local: $ make PREFIX=/opt/local install A DESTDIR can also be specified for staging purposes (with or without an alternate prefix): $ make DESTDIR="<destdir>" [PREFIX="<prefix>"] install Dependencies: vocr relies on VNRecognizeTextRequest in Apple's Vision framework, which is available on MacOSX 10.15 (Catalina) and newer: https://developer.apple.com/documentation/vision/vnrecognizetextrequest History: v. 0.3.1 - updates for Monterey (MacOSX 12) v. 0.3.0 - switch to PDFKit v. 0.2.3 - fix manpage formatting v. 0.2.2 - move source files into configure.ac v. 0.2.1 - update configure with additional compiler options related to security v. 0.2.0 - print text as soon as it has been recognized, default to quiet mode, add support for languages other than English v. 0.1.0 - initial release Platforms: vocr has been tested on MacOSX 11 (BigSur) and 12 (Monterey) on M1 and x86_64. It should also work on MacOSX 10.15+ (Catalina) x86_64. License: See LICENSE.txt
About
MacOSX command line program to perform optical character recognition (OCR) on images and PDFs