This project implements a microservice that reads text from images using Tesseract OCR. It exposes a simple API built with FastAPI, so it can be incorporated into any application. The whole microservice is containerized with Docker, which makes it easy to set up a local copy and adapt it to your needs.
The microservice also cleans and preprocesses the uploaded images with OpenCV, which improves the OCR results produced by Tesseract.
git clone https://github.com/mubasharkk/fastapi_ocr.git
cd fastapi_ocr
pip install -r requirements.txt
cd app
uvicorn pdfapi:app --host 0.0.0.0 --port 8000 --reload
docker build -t fastapi_ocr .
docker run -d --name my_container fastapi_ocr
The API exposes the following endpoints:
cd /var/www/app
export PYTHONPATH=$PWD
pytest -q tests