Multilingual_LayoutLM

This project has been derived from microsoft's LayoutLM project with dependency for transformers removed. It also includes support for 140 languages.This is a completed project for training and prediction of multilingual documents as there are limitations on labelled dataset kindly prepare data for your respective languages.I have currently tested it for hindi, malayalam, english combinations. I have released the training flow and model accordingly for these languages that have been trained on adhaar dataset

Model Link

path and config file for multilingual bert model for producing embeddings
https://drive.google.com/drive/folders/1t5Ktz94YTSrE_JHdrfiPc4Moi-K4GxHz?usp=sharing

Training flow

Training flow is in the train directory
Do alter and go through the parameters in the config.yml inside train directory to suit your requirements.

Steps

clone the repository
run pip install -r requirements.txt
To train go to train folder and run python train.py after making changes in the config file
you should also download the pretrained model from the given link and place it in the folder models
similarly you should prepare the data in the format as in folder annotated_adhaar_data \
To predict alter the config file outside the train folder and run python parser.py with the image path
after putting it in the parser.py file

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
annotated_adhaar_data		annotated_adhaar_data
layoutlm		layoutlm
sample_img		sample_img
train		train
.gitignore		.gitignore
README.md		README.md
config.yaml		config.yaml
objectify.py		objectify.py
parser.py		parser.py
predictLM.py		predictLM.py
requirements.txt		requirements.txt
tesser.py		tesser.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multilingual_LayoutLM

Model Link

Training flow

Steps

About

Releases

Packages

Languages

bjorz/Multilingual_LayoutLM

Folders and files

Latest commit

History

Repository files navigation

Multilingual_LayoutLM

Model Link

Training flow

Steps

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages