Terrible OCR results with Channel 5 (UK) #929
Comments
GSOC qualification: 5 points |
The problem is in By the way, for reference of my own and maybe someone who wants to look into OCR, I suggest to add |
This is pretty interesting. The quantize_map() function itself was important (from my discussions with @anshul1912 ) for improving the DVB results. What the function essentially does is 'binarize' the input image into text and non-text regions, ignoring the gradient grayscale values at the edges between them. With this particular set of subtitles, it seems the binarization is producing unwanted noisy artifacts around the text regions, which throws off the OCR results. This could probably be solved by an additional filtering step to remove the 'salt noise' present in the current images. |
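To make the idea above concrete, here is a minimal sketch of thresholding a grayscale bitmap and then dropping isolated foreground pixels. It is not CCExtractor's quantize_map(); the function names `binarize` and `remove_salt_noise`, the threshold of 128, and the list-of-lists image layout are all assumptions chosen for illustration.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to 1 (text) or 0 (background).

    The threshold value is an illustrative assumption, not a value
    taken from CCExtractor.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def remove_salt_noise(binary):
    """Clear foreground pixels that have no foreground 8-neighbour."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    cleaned = [row[:] for row in binary]  # work on a copy
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue
            has_neighbour = False
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] == 1:
                        has_neighbour = True
            if not has_neighbour:
                cleaned[y][x] = 0  # isolated speck: treat as noise
    return cleaned
```

A filter like this only removes single-pixel specks; clusters of two or more noisy pixels would survive it, which is one reason a morphological approach may be a better fit.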
@Abhinav95 I think I can look in this issue (with your help) if you don't mind :) |
@thealphadollar Go right ahead :) |
Things we could do:
|
@cfsmp3 For now, I'll be trying to incorporate the mentioned library. Let's hope something good turns up :) |
TL;DR: I do not think it would be wise to use the libimagequant library. I also tried a library called exoquant, but it does not seem to be compatible with the way libpng decodes PNG files. I will search a little more for a better library; otherwise I will resort to making quantize_map() optional. Using libimagequant makes the process highly inefficient, as can be seen in the screenshot below. This implements the processes involved, but they were not fully integrated into the OCR system. Even so, a full implementation has two issues, which are elaborated after the screenshot.
Code after full implementation: I looked on the web and it seemed that this is a problem with the latest version, but even downgrading the version did not make any difference. Also, after the partial implementation I could still see the "salt noise" (less than with quantize_map(), though) in the raw_image, which might indicate that even a full implementation would leave the same errors that the current function produces. Hence I think it's better to offer quantization as an option (though it increases the argument count... sadly :( ) @Abhinav95 Please see if I'm wrong somewhere, and suggest if there's a better way to go about this :) |
@cfsmp3 |
@adarshshukla19 that link is not public |
So are you guys working further on the project or not and are all the
issues resolved?
Regards.
Adarsh
|
I'll be working on adding one more quantisation option, probably from the
leptonica library since we already use it.
One small addition was done a while back which reduces the number of
colours in the colour palette and improves the output slightly.
|
@adarshshukla19 The issues are not yet solved, so yes, we're definitely going to continue working on this until we get really reliable results. |
After the last commit, results on my side are good, but I still have this French channel with terrible output:
Here are the materials: https://goo.gl/kncQUn |
The salt noise present in the images can be removed by erosion followed by dilation. P.S.: I am not very familiar with the codebase or the Tesseract API, so I might take some time to implement it. If anyone else wants to go ahead, though, this might help to solve it. |
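As a rough illustration of the erosion-and-dilation suggestion, the sketch below applies a morphological opening (erosion, then dilation) with a 3x3 square structuring element to a binary image. The function names, the list-of-lists image layout, and the choice of a 3x3 element are assumptions for the example; this is not code from CCExtractor, Tesseract, or Leptonica.

```python
def _morph(img, combine):
    """Apply `combine` (all or any) over each pixel's 3x3 neighbourhood.

    Pixels outside the image are treated as background (0).
    """
    height = len(img)
    width = len(img[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    window.append(img[ny][nx] == 1 if inside else False)
            out[y][x] = 1 if combine(window) else 0
    return out


def erode(img):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return _morph(img, all)


def dilate(img):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return _morph(img, any)


def opening(img):
    """Erosion followed by dilation: removes specks smaller than 3x3."""
    return dilate(erode(img))
```

One caveat with this choice: an opening also erases genuine strokes thinner than the structuring element, so the element size would need to be picked relative to the stroke width of the subtitle font.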
@krushanbauva I thought of implementing this, but I ran into certain issues, so I'll take it up again when I have a little time on hand.
You can certainly try to implement it: go through the codebase and ask questions. I'll look back into this when I have a couple of days free. I spent around a week on this, so I can support you a bit on the codebase part :) |
|
Sounds amazing :) @krushanbauva |
Good luck @krushanbauva :-) |
I've got some prior experience in Tesseract and morphological operations, do you guys mind if I join in? :) |
You're more than welcome :-)
|
@cyberdrk You can go through the articles on the official CCExtractor page, which will get you started with the codebase; going through the recent PRs also gives you a lot of intuition as to where things are. 😄 P.S.: You are always welcome to collaborate!! 😋 |
Tesseract uses Leptonica for image IO and image processing. |
I would like to work on this. |
No need to ask, just go for it :-)
|
@cyberdrk @krushanbauva Any leads you guys would like to share? I'm starting back my work on this. |
@cfsmp3 For the past few days I have tried integrating some more libraries (including leptonica) without success; the main problem is incorporating the libraries without changing the structure of the PNG file we currently use. Changing it would, I believe, be inefficient, since we already have three methods which work pretty much perfectly for most types of videos. If I'm not wrong about the format compatibility, I think we can close the issue since we have already solved the problem this issue raised :) |
@thealphadollar PNG here is an output format, but this is totally unrelated to the OCR, which just takes a bitmap. |
What happened to the suggestion of implementing dilation and erosion? |
If it's not done, then no one has sent a PR yet.
Go for it.
|
Closing - confirmed fixed for the sample on the description. Great job @ziexess ! |
(current master, pre 0.87)
This file (but well, all of channel 5)
https://drive.google.com/open?id=1Etq-pv5G3jGqVhhRl7cNrfuw4gaKkLoV
Produces terrible results in the OCR, even though the bitmaps seem normal. What's going on?