object detection based
Codes are still under development
Hi, This is Mohamed Fawzy.
For the stubborn case of arabic handwritten words recognition which is so hard due to the hierearchy of the word itself...
Arabic words are written continously without pauses unlike English words!
But English words:
Another problem is that every single character has three forms/shapes
How I tried to solve this out:
The most popular approach to acheive characters recognition which is by finding contours using OpenCV doesn't work for arabic words!
Why?? Because contouring a single character is not possible as every character is connected with its neighbors.
While in English:
This approach is kinda limited as Arabic has 12.3 Million words.
I trained a VGG16 model with input of handwritten augmented words and their labels using AlexU dataset that contains 109 unique words.
with 85% of validation accuracy.
which I found here https://github.com/HusseinYoussef/Arabic-OCR
Thanks to Eng. Hussein Osama!
here's a sample of ب :
I also thought of splitting the image to vertical rectangles where every rectangle may contain a character.
So to implement it i used the code here: https://pyimagesearch.com/2020/06/22/turning-any-cnn-image-classifier-into-an-object-detector-with-keras-tensorflow-and-opencv/
which splits the image into ROI's (Region of interests) and search for the characters.
we can actually make an ocr that already works perfectly with arabic to giveback its results to the model for every batch!
this way we don't have to worry about tuning deep nets.
reinforcement learning is dynamically learning by adjusting actions based on continuous feedback to maximize a reward.
It will just take ages to train it.