AI OCR with Azure

This project demonstrates 'autopilot' OCR with Azure Cognitive Services

Classic OCR models need training to extract structured information from documents. In this project I demonstrate how to use hybrid approach with LLM (multimodal) to get better results without any pre-training.

The project uses Azure Document Intelligence combined with GPT4 and GPT-Vision. Each of the tools have their strong points and the hybrid approach is better than any of them alone.

Notes:

The document-intelligence needs to be using the markdown preview (limited regions).
The openai model needs to be vision capable.

How to use

Run example projects in examples-* folder.

The examples need docker to run. Each folder has a script that you can execute to run the complete example. Each folder has also .env file that needs to be filled with your Azure service credentials.

Complete the .env files in each example folder before running.

Note: The powershell scripts don't work very well, the bash scripts are better...

Example 1

example 1 - Sample collection Extract process of water sample providing from an information flyer.

Example 2

example 2 - Complex tables Let's find some insurance products from a more complex table.

Notes on the examples

I used https://bjdash.github.io/JSON-Schema-Builder/ to create the json-schemas in the example folders. If the keys in the json model are not self-explanatory, you should use description fields to tell the LLM model what you mean by each key to increase accuracy.

User interface

User interface is provided for testing purposes only. To run it locally, install

poetry install --with ui

then run ./ui.sh in the root folder. (env is picked from .env file in the root folder)

Develop

Install with poetry

poetry install

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
.github/workflows		.github/workflows
ai_ocr		ai_ocr
example-1-sample-collection		example-1-sample-collection
example-2-tables		example-2-tables
test_ui		test_ui
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
DockerfileUi		DockerfileUi
LICENSE		LICENSE
README.md		README.md
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml
ui-docker.sh		ui-docker.sh
ui.sh		ui.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AI OCR with Azure

This project demonstrates 'autopilot' OCR with Azure Cognitive Services

How to use

Example 1

Example 2

Notes on the examples

User interface

Develop

About

Releases 5

Packages

Languages

License

piizei/azure-ai-ocr

Folders and files

Latest commit

History

Repository files navigation

AI OCR with Azure

This project demonstrates 'autopilot' OCR with Azure Cognitive Services

How to use

Example 1

Example 2

Notes on the examples

User interface

Develop

About

Resources

License

Stars

Watchers

Forks

Releases 5

Packages 0

Languages

Packages