How to improve this code? #45

Open
panos78 opened this issue Jan 13, 2020 · 3 comments

panos78 commented Jan 13, 2020

Hello, I tried to use cbl-js on the following images:
[12 sample CAPTCHA images, labeled 0 to 11]
I implemented the code below:

var cbl = new CBL(
{
	preprocess: function(img)
	{
		img.binarize(190);
		img.blur();
		img.binarize(32);
		img.colorRegions(50,true,0);
	},
	character_set: "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789",
	blob_min_pixels: 50,
	blob_max_pixels: 400,
	pattern_width: 25,
	pattern_height: 25,
	perceptive_colorspace: true,
});

Any idea how to improve it?

Owner

skotz commented Jan 15, 2020

It looks like the colors are pretty distinct, so I'd try removing the binarize calls and going straight to colorRegions. Giving each letter a different color is the primary weakness of this CAPTCHA, so you don't want to make it black and white if you can help it.

The third parameter of colorRegions is the "pixel jump", which you can set higher to effectively keep letters together even when there's a line through them. Maybe try something like img.colorRegions(5, true, 1) for starters.
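
To make the "pixel jump" idea concrete, here is a minimal, generic sketch of region labeling in which two same-colored pixels still count as connected when a small gap lies between them. It is only an illustration of the concept and is not the cbl-js implementation; `labelRegions` and its grid format (0 = background, other numbers = color values) are assumptions made for this example.

```javascript
// Illustration only: a generic region labeler with a "pixel jump".
// This is NOT how cbl-js is implemented; names and grid format are hypothetical.
function labelRegions(grid, pixelJump) {
    const height = grid.length;
    const width = grid[0].length;
    const reach = pixelJump + 1; // how far apart two pixels may be and still connect
    const labels = grid.map(row => row.map(() => 0));
    let regionCount = 0;

    for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
            // Skip background (0) and pixels that already belong to a region.
            if (grid[y][x] === 0 || labels[y][x] !== 0) continue;

            regionCount++;
            labels[y][x] = regionCount;
            const stack = [[x, y]];

            while (stack.length > 0) {
                const [cx, cy] = stack.pop();
                for (let dy = -reach; dy <= reach; dy++) {
                    for (let dx = -reach; dx <= reach; dx++) {
                        const nx = cx + dx;
                        const ny = cy + dy;
                        if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                        // Join only unlabeled pixels with the same color value.
                        if (labels[ny][nx] !== 0 || grid[ny][nx] !== grid[cy][cx]) continue;
                        labels[ny][nx] = regionCount;
                        stack.push([nx, ny]);
                    }
                }
            }
        }
    }
    return { labels, regionCount };
}

// One letter (color 1) split in two by a 1px gap (0 = background).
const letter = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
];
const noJump = labelRegions(letter, 0);  // the two halves are separate regions
const oneJump = labelRegions(letter, 1); // the gap is bridged, one region
console.log(noJump.regionCount, oneJump.regionCount);
```

A larger jump bridges wider gaps, but it can also merge neighboring letters that share a color, so it is a value to tune rather than maximize.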

There's a lot going on here with fonts, character counts, and colored background blobs, so it'll be hard to get a high accuracy.

Good luck.
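
Put together, the suggestion above might look like the following sketch. The option names and values are copied from the original post, and the colorRegions arguments are the ones proposed in this comment; they are starting points to tune, not known-good settings. `fakeImg` is a hypothetical stand-in that only records calls, so the snippet runs without cbl-js loaded.

```javascript
// Options following the suggestion above: no binarize, colorRegions only.
const options = {
    preprocess: function (img) {
        // Third argument is the "pixel jump" described in this thread.
        img.colorRegions(5, true, 1);
    },
    character_set: "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789",
    blob_min_pixels: 50,
    blob_max_pixels: 400,
    pattern_width: 25,
    pattern_height: 25,
    perceptive_colorspace: true
};

// Hypothetical stand-in for the image object passed to preprocess;
// it records which methods were called and with which arguments.
const calls = [];
const fakeImg = {
    binarize: (...args) => calls.push(["binarize", ...args]),
    colorRegions: (...args) => calls.push(["colorRegions", ...args])
};
options.preprocess(fakeImg);
console.log(calls);

// In a page that has loaded cbl-js, the options would be passed as:
// var cbl = new CBL(options);
```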

Author

panos78 commented Jan 15, 2020

I removed the binarize calls, but without them nothing was identified. I then tried more than 200 different colorRegions combinations, on just the first image, to see whether I could solve it, but had no luck.
With the following code:

img.binarize(155);
img.colorRegions(16,true);

I created a new model, which gives the following:
[image: screenshot of the model's result]
The result is wrong, but not completely: it identifies all the characters plus two extra A's.
And now I am stuck and don't know how to continue.
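
A manual search over many combinations like the one described above can be organized as a parameter grid. The sketch below only generates the candidate settings and a preprocess function for each; how each candidate is scored depends on how solutions are checked, so that step is left out. `parameterGrid`, `makePreprocess`, and the names `threshold`, `tolerance`, and `pixelJump` are hypothetical labels for this example, and the value lists are arbitrary.

```javascript
// Yield every combination of the given parameter values.
function* parameterGrid(thresholds, tolerances, pixelJumps) {
    for (const threshold of thresholds) {
        for (const tolerance of tolerances) {
            for (const pixelJump of pixelJumps) {
                yield { threshold, tolerance, pixelJump };
            }
        }
    }
}

// Build a preprocess function for one candidate.
// A threshold of null means "skip binarize entirely".
function makePreprocess(params) {
    return function (img) {
        if (params.threshold !== null) {
            img.binarize(params.threshold);
        }
        img.colorRegions(params.tolerance, true, params.pixelJump);
    };
}

// Example value lists (arbitrary): 3 x 3 x 3 = 27 candidates.
const candidates = Array.from(
    parameterGrid([null, 155, 190], [5, 16, 50], [0, 1, 2])
);
console.log(candidates.length);
```

Each candidate's preprocess function can then be placed in the options object, and the resulting model compared against a few images with known solutions rather than only the first one.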

Owner

skotz commented Jan 16, 2020

Honestly it'll be hard to get high accuracy on this CAPTCHA with the methods in this library. There's a lot of variation. To improve the results you might need to write something more custom to specifically deal with the constants: background circles and 1px foreground lines. The letters themselves seem to be from a finite set of fonts, and they're rotated but not distorted. That might help in some way.

Sorry, I won't be able to help too much more on this one.
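
As one possible starting point for the custom handling of 1px foreground lines mentioned above, here is a minimal sketch of a morphological opening with a 2x2 square: a foreground pixel survives only if it is part of some fully filled 2x2 block, which a 1px-wide line never is. This is a generic image-processing technique, not part of cbl-js; it assumes the image has already been turned into a grid of 0 (background) and 1 (foreground), and `removeThinLines` is a hypothetical name.

```javascript
// Keep only foreground pixels that belong to a fully filled 2x2 block.
// grid: array of rows, each value 0 (background) or 1 (foreground).
function removeThinLines(grid) {
    const height = grid.length;
    const width = grid[0].length;
    const out = grid.map(row => row.map(() => 0));

    for (let y = 0; y + 1 < height; y++) {
        for (let x = 0; x + 1 < width; x++) {
            const full = grid[y][x] && grid[y][x + 1] &&
                         grid[y + 1][x] && grid[y + 1][x + 1];
            if (full) {
                out[y][x] = 1;
                out[y][x + 1] = 1;
                out[y + 1][x] = 1;
                out[y + 1][x + 1] = 1;
            }
        }
    }
    return out;
}

const sample = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1], // 1px horizontal line
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0], // thicker blob standing in for a letter stroke
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
];
const cleaned = removeThinLines(sample);
console.log(cleaned.map(row => row.join("")).join("\n"));
```

The trade-off is that any letter stroke that is itself only 1px wide would be erased as well, so this would need checking against the actual fonts in the images.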
