SANE Command-Line Scan to PDF

SANE command-line scanning bash shell script for Linux with OCR and deskew support. The script automates common scan-to-PDF operations for scanners with an automatic document feeder, such as the awesome Fujitsu ScanSnap S1500.

Tested and run regularly on Fedora, but should work on other distributions with the requirements below.

Features

Join scanned pages into a single output file, or specify a name for each page
Deskew (if supported by scanner driver, or software-based via unpaper)
Crop (if supported by scanner driver)
Create searchable PDFs (with tesseract)
Duplex (if scanner supports it)
Specify resolution
Truncate n pages explicitly from the end of a scan, e.g. to drop the last page of a duplex scan
Skip white-only pages automatically (with ImageMagick)
Specify page width and height for odd size pages, or common sizes (Letter, Legal, A4)
Performance: the scanner runs in parallel with page post-processing
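
The last feature can be illustrated with a small bash sketch. It is not the script's actual implementation: `fake_scanner` and `process_page` are hypothetical stand-ins for the scanner and for per-page post-processing (deskew, OCR, PDF conversion). The point is only the pattern, in which the producer keeps emitting pages while earlier pages are processed as background jobs.

```shell
# Hypothetical stand-in for the scanner: writes one file per page and
# prints the file name as soon as that page is complete.
fake_scanner() {
  local dir=$1 pages=$2 i=1 f
  while [ "$i" -le "$pages" ]; do
    f=$(printf '%s/page-%04d.txt' "$dir" "$i")
    printf 'raw page %d\n' "$i" > "$f"
    printf '%s\n' "$f"
    i=$((i + 1))
  done
}

# Hypothetical stand-in for per-page post-processing: upper-cases the page
# text and writes the result next to the input with a .done suffix.
process_page() {
  tr 'a-z' 'A-Z' < "$1" > "${1%.txt}.done"
}

# Read page names as the "scanner" emits them and post-process each page
# in the background, so scanning and processing overlap.
scan_and_process() {
  local dir=$1 pages=$2 page
  fake_scanner "$dir" "$pages" | {
    while IFS= read -r page; do
      process_page "$page" &   # do not block the reader on this page
    done
    wait                       # let every background job finish
  }
  cat "$dir"/page-*.done       # zero-padded names keep the page order
}
```

Calling `scan_and_process "$(mktemp -d)" 3` prints the three processed pages in order. Zero-padded file names are one simple way to keep the join step ordered even when background jobs finish out of order.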

Requirements

bash
pnmtops (netpbm-progs)
ps2pdf (ghostscript)
pdfunite (poppler-utils)
ImageMagick (if --skip-empty-pages is used)

Optional

unpaper (for software deskew)
flock (usually provided by util-linux; for properly ordered verbose logs)
tesseract (to make searchable PDFs)
sem (via GNU parallel, to constrain CPU usage during page processing for fast scanners)
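
Before the first scan it can help to check which of these tools are installed. The following is a minimal sketch, not part of the script: the function names are made up for illustration, and ImageMagick is left out because the command it installs differs between versions.

```shell
# Print every command named in the arguments that is not found on PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    # `command -v` succeeds only if the shell can resolve the command.
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Tools from the Requirements list (bash itself is already running).
missing_required() { missing_commands pnmtops ps2pdf pdfunite; }

# Tools from the Optional list.
missing_optional() { missing_commands unpaper flock tesseract sem; }
```

Running `missing_required` prints one line per missing tool and nothing when all of them are available; `missing_optional` does the same for the optional tools.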

Getting Started

# scan --help
scan [OPTIONS]... [OUTPUT]

OPTIONS
 -v, --verbose
   Verbose output (this will slow down the scan due to the need to prevent interleaved output)
 -d, --duplex
   Duplex scanning
 -m, --mode
   Mode e.g. Lineart (default), Halftone, Gray, Color, etc.
 -r, --resolution
   Resolution e.g. 300 (default)
 -a, --append
   Append output to existing scan
 -e, --max <pages>
   Max number of pages e.g. 2 (default is all pages)
 -t, --truncate <pages>
   Truncate number of pages from end e.g. 1 (default is none)
 -s, --size
   Page size as type, e.g. Letter (default), Legal, A4; no effect if --crop is specified
 -ph, --page-height
   Custom Page Height in mm
 -pw, --page-width
   Custom Page Width in mm
 -x, --device
   Override scanner device name, defaulting to `fujitsu`
 --crop
   Crop to contents (driver must support this)
 --deskew
   Run driver deskew (driver must support this)
 --unpaper
   Run post-processing deskew and black edge detection (requires unpaper)
 --ocr
   Run OCR to make the PDF searchable (requires tesseract)
 --skip-empty-pages
   Remove empty pages from the resulting PDF document (e.g. a one-sided document scanned in duplex mode)

OUTPUT
 -o, --output <outputfile>
   Output to named file (default is scan.pdf)
 -l, --outputlist <outputfile-1...outputfile-n>
   Output to named files for each scanned page; can be used with --append
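
A few typical invocations, wrapped in small bash functions, might look like the sketch below. It uses only the options listed above. The function names and the `SCAN` variable are illustrative, and it assumes that option values are passed as separate arguments (as in `--max <pages>`).

```shell
# Path to the scan script; override SCAN if it is not on PATH (assumption).
SCAN=${SCAN:-scan}

# Duplex colour scan at 300 dpi, dropping blank sides and adding OCR text.
scan_duplex_searchable() {
  "$SCAN" --duplex --mode Color --resolution 300 --skip-empty-pages --ocr --output "$1"
}

# Append newly scanned A4 pages to an existing PDF.
scan_append_a4() {
  "$SCAN" --append --size A4 --output "$1"
}

# Scan an odd-sized page; width and height are given in mm.
scan_custom_size() {
  "$SCAN" --page-width "$2" --page-height "$3" --output "$1"
}
```

For example, `scan_duplex_searchable invoice.pdf` scans a stack of double-sided sheets into one searchable file, and `scan_custom_size receipt.pdf 100 150` scans a 100 mm by 150 mm page.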

Author

Raman Gupta

Contributors

Stefan Armbruster
