Multilingual OCR Development Plan #12734
Replies: 76 comments 4 replies
-
Traditional Mongolian
-
I would love to work on "Bangla"
-
I would be very happy if you did that for Vietnamese.
-
How about Arabic? That would be great.
-
I've found that the PaddleOCR algorithm cannot recognize some special characters (such as the comma, semicolon, or period) when the language is English. Is there any way I can fix this problem?
-
I would like to contribute by adding the Burmese language. Is it only necessary to submit two text files, the dict and the corpus? What further steps do we need to take?
-
Adding "Bangla" will be great for the people in South Asia.
-
Adding "Traditional Chinese (zh-TW)" support would be great.
-
Do you have a pretrained Russian recognition model?
-
Hi, I would be very grateful if you added the "Tamil" language. Tamil_dict.txt. If you need more help, please refer to this issue:
-
I can help with the Polish language.
-
@GmGniap Hello, can you provide the corpus file for the Burmese language?
-
@shahidul56 Hello, can you provide the corpus file for the Bangla language?
-
None of the models updated on 2021.1.21 can be downloaded; they fail with the following error:
-
Sorry for the invalid links. All of them have been fixed now, so you can try again.
-
Hi, please add Bangla and English support. I have attached both files for Bangla.
-
Hi team. Great work on Paddle, it's an amazing OCR engine! Can we please have Hebrew support in the multilanguage models? Thanks!
-
Dear Team, thanks for your reply. I am from Bangladesh. I have already
submitted both files, the dict and the corpus, for Bangla. I would appreciate it if
you could add Bangla support.
Thank you.
Zahir
-
Can you provide support for any ancient scripts?
-
I'm trying with my private data, but the results are very poor.
-
Sorry for my stupid question, I am a novice at DL: what is the difference between an inference model and a trained model?
-
I created a PR for Bangla
-
Does this list contain the latest models? If I want to fine-tune, for example, the German model, do I use the link on this page to download the pretrained model? If so, which yml file should I use? How do I know the architecture of these models?
-
Please add the Tajik language.
-
I want to work on the Central Kurdish language.
-
I sent a PR for Bangla support #13373
-
Please add the Turkish language. Thank you.
-
I have a copy on my GitHub. I need to commit the dictionary text to "ppocr/utils/dict" and name it "vi_dict.txt"; it contains a list of all characters, taken from a Vietnamese dictionary from Wikipedia. I did not find the corpus in the folder "ppocr/utils/corpus"; it can only be viewed on my GitHub: https://github.com/lingskr/Vietnamese-Corpus-and-Dictionary
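Before committing a dictionary and corpus pair like this one, it can help to check that every character used in the corpus also appears in the dictionary. The following is a minimal sketch, not part of PaddleOCR: the function name find_uncovered_chars is hypothetical, and it assumes that the dict file lists one character per line and that both files are UTF-8 text.

```python
from collections import Counter
from pathlib import Path


def find_uncovered_chars(dict_path, corpus_path):
    """Count corpus characters that are missing from the dict file.

    Assumption: the dict file holds one character per line.
    Whitespace in the corpus is ignored.
    """
    dict_text = Path(dict_path).read_text(encoding="utf-8")
    # Each non-empty line of the dict file is treated as one entry.
    dict_chars = {line for line in dict_text.splitlines() if line}

    missing = Counter()
    for ch in Path(corpus_path).read_text(encoding="utf-8"):
        if not ch.isspace() and ch not in dict_chars:
            missing[ch] += 1
    return missing
```

An empty result means the dictionary covers the corpus. Any character that is reported needs to be added to the dictionary or removed from the corpus.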
-
I did not ask for Vietnamese. I asked for the Turkish language.
(In reply to Songling Huang's comment of Mon, 19 Aug 2024, 10:57, which links to https://github.com/lingskr/Vietnamese-Corpus-and-Dictionary.)
-
How do I use these models?
-
Guideline for new language requests
If you want to request support for a new language, a PR with the following 2 files is needed:
1. In the folder ppocr/utils/dict, submit the dict text and name it {language}_dict.txt. It contains a list of all characters. Please see the other files in that folder for an example of the format.
2. In the folder ppocr/utils/corpus, submit the corpus and name it {language}_corpus.txt. It contains a list of words in your language. At least 50,000 words per language are probably necessary. Of course, the more, the better.
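The two files described above can be prepared with a short script. The following is a minimal sketch, not an official PaddleOCR tool: the function names build_char_dict and count_corpus_words are hypothetical, and it assumes that the dict file holds one character per line and that the corpus is UTF-8 text with whitespace-separated words. Compare the output with the existing files in ppocr/utils/dict before submitting.

```python
from pathlib import Path


def build_char_dict(corpus_path, dict_path):
    """Write every distinct non-whitespace character of the corpus to
    dict_path, one per line (assumed dict format), and return the list."""
    text = Path(corpus_path).read_text(encoding="utf-8")
    chars = sorted({ch for ch in text if not ch.isspace()})
    Path(dict_path).write_text("\n".join(chars) + "\n", encoding="utf-8")
    return chars


def count_corpus_words(corpus_path):
    """Return the number of distinct whitespace-separated words."""
    text = Path(corpus_path).read_text(encoding="utf-8")
    return len(set(text.split()))
```

For example, build_char_dict("vi_corpus.txt", "vi_dict.txt") derives a dictionary from a Vietnamese corpus, and count_corpus_words("vi_corpus.txt") can be compared with the suggested minimum of about 50,000 words. For scripts that do not separate words with spaces, the word count would need a language-specific tokenizer instead of str.split.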
Call for contributions to add new language support for PaddleOCR.
For anyone who might be interested in training a new language model, guidance on training the model is provided.
If your language has unique elements, please tell me in advance by any means, such as useful links, Wikipedia articles, and so on.