Add new Augmentation - Text Strikethrough #63

shaheryar1 · 2021-08-26T11:52:18Z

I just noticed this in some report-oriented documents; right now we cannot achieve it with the current pipeline. Maybe we can add this kind of augmentation to Augraphy to support a hand-cut lines feature. What do you guys think?

proofconstruction · 2021-08-26T14:15:56Z

I think this is a pretty interesting idea, and probably has a similar approach to how we'd solve the redaction effect mentioned here in Issue #43.

The hardest part would be identifying regions of text to add the ~~strikethrough~~ effect to, but we can use Sobel edge detection etc., randomly choose a y-value in the image, then check for edges at that height (so, varying the x-value and keeping that y-value constant). If we find an edge at that height, we can keep searching for more edges at that height, and if we find a second one, we have a word there. We can then generate a line connecting the two edge-points we found, and overlay that on the ink layer.

For maximum realism, when we find two such edge points in the image, we can check y-values above and below to find the top and bottom of the characters, then average these values to find the middle of the character. We could draw the line connecting the midpoints, like someone would in person.
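A rough sketch of the row-scan idea described above, under a few assumptions: the image is a grayscale NumPy array with dark ink on light paper, and a plain horizontal intensity difference stands in for a real Sobel gradient. The function names and thresholds are hypothetical, not part of Augraphy. It returns the leftmost and rightmost edge on the chosen row, so splitting that span into individual words would need an extra gap check that is not shown.

```python
import numpy as np


def find_ink_span(gray, y, threshold=50):
    """Return (x_left, x_right) of the outermost strong edges on row y, or None."""
    row = gray[y].astype(np.int32)
    # Horizontal intensity difference: a crude stand-in for a Sobel x-gradient.
    grad = np.abs(np.diff(row))
    xs = np.flatnonzero(grad > threshold)
    if xs.size < 2:
        return None  # fewer than two edges, so no text crosses this row
    return int(xs[0]), int(xs[-1])


def vertical_midpoint(gray, x, y, ink_threshold=128):
    """Walk up and down from (x, y) through dark pixels; return the middle y."""
    top = bottom = y
    while top > 0 and gray[top - 1, x] < ink_threshold:
        top -= 1
    while bottom < gray.shape[0] - 1 and gray[bottom + 1, x] < ink_threshold:
        bottom += 1
    return (top + bottom) // 2
```

If the pixel at `(x, y)` is not ink, `vertical_midpoint` simply returns `y`, so a caller would probably want to sample a column that actually lies inside a character.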

There are some straightforward algorithms for generating the drawn line (e.g. advance five pixels in the x-direction and randomly +1 or -1 in the y-direction), and we could continue it past the endpoints to be even more realistic.
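The line-generation rule mentioned here (advance five pixels in x, nudge y by +1 or -1, and run a little past the endpoints) could look roughly like this. The function name, the `overshoot` parameter, and the seed handling are illustrative assumptions, not code from the project.

```python
import random


def wobbly_line(x_start, x_end, y, step=5, overshoot=10, seed=None):
    """Generate (x, y) points for a hand-drawn-looking horizontal stroke."""
    rng = random.Random(seed)
    points = []
    x = x_start - overshoot          # start a little before the text
    while x <= x_end + overshoot:    # and stop a little after it
        points.append((x, y))
        x += step
        y += rng.choice((-1, 1))     # random one-pixel drift up or down
    return points
```

The resulting points can then be joined with straight segments or smoothed before being drawn on the ink layer.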

jboarman · 2021-08-26T15:00:21Z

This technique to find contours may help identify where the text is located. From my experience with this, it can be hit or miss. But, it should work decently well for Augraphy's purpose.

Stack Overflow: Finding contours with lines of text in OpenCV
https://stackoverflow.com/a/50777937/764307

shaheryar1 · 2021-08-26T20:38:15Z

> This technique to find contours may help identify where the text is located. From my experience with this, it can be hit or miss. But, it should work decently well for Augraphy's purpose.

Yes, I was thinking the same, and I also have experience with this kind of work, so it won't take long to add this.

shaheryar1 · 2021-08-29T17:52:13Z

I have started doing some experiments on this. It's not fully optimized yet.

> There are some straightforward algorithms for generating the drawn line (e.g. advance five pixels in the x-direction and randomly +1 or -1 in the y-direction), and we could continue it past the endpoints to be even more realistic.

I will try to implement this one too.
https://colab.research.google.com/drive/11-_ne7ZJuqH8mkcmtM6WrwWGgD0v3VEt?usp=sharing

jboarman · 2021-08-30T19:47:56Z

This looks really nice! I'm curious if the effect works well with the same techniques used in the PencilScribbles augmentation. We might need to pull that pencil effect out into a shared lib as I sense we might use it in more than one place.

proofconstruction · 2021-08-31T12:45:42Z

This looks great, and I agree it'd look even better with PencilScribbles generating the strikethrough line. I'm definitely in favor of pulling that code out into augraphy/augmentations/lib.py. In the future, we could expand on this and PencilScribbles to create an Annotation augmentation that accepts a string and some font and writes that text onto the page using "pencil" in that style, so we could replicate the added text in the picture above too.

shaheryar1 · 2021-09-05T11:03:12Z

I have applied Chaikin's algorithm to smooth the Text Strikethrough effect. It now looks more realistic to me. Kindly have a look and provide feedback.

https://colab.research.google.com/drive/11-_ne7ZJuqH8mkcmtM6WrwWGgD0v3VEt?usp=sharing
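For context, Chaikin's corner-cutting algorithm replaces each segment of a polyline with two points at 1/4 and 3/4 of its length, which rounds off the corners a little more on every pass. The sketch here is a generic version, not the code from the notebook; keeping the first and last points fixed is a common variant for open curves and is an assumption on my part.

```python
def chaikin_smooth(points, iterations=2):
    """Smooth an open polyline with Chaikin's corner-cutting scheme."""
    pts = [(float(x), float(y)) for x, y in points]
    for _ in range(iterations):
        if len(pts) < 3:
            break  # nothing to smooth on a single segment
        new_pts = [pts[0]]  # keep the first endpoint fixed
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            # Cut each segment at 1/4 and 3/4 of its length.
            new_pts.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            new_pts.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        new_pts.append(pts[-1])  # keep the last endpoint fixed
        pts = new_pts
    return pts
```

Two or three iterations are usually enough to turn a jagged stroke into something that reads as a pen line; each pass roughly doubles the number of points.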

jboarman · 2021-09-05T21:48:41Z

This looks quite amazing!

I think we’re ready to see the addition of the pencil effect used in PencilScribbles.

Is that effect pulled out into a shared lib yet?

It might be helpful to add antialiasing beforehand (via CV_AA), but I’m not sure. See the example on Stack Overflow.

proofconstruction · 2021-09-10T12:26:53Z

Merged #86

jboarman · 2021-09-13T20:32:11Z

I came across this underline example, and it made me think that perhaps we could adapt code from Strikethrough to create an Underline augmentation. Eventually, even a Highlight augmentation could evolve from this code. Any thoughts on the best way to share that code between augmentations? Is there a generalization of the approach that could be extracted into the shared library functions?

shaheryar1 · 2021-09-14T08:52:44Z

Exactly. Yesterday I was also going through memos and reports looking for sample documents containing text strikethrough, and came across multiple samples where underlining was used extensively. I think I can make the code generic to cater for this underline effect.

kwcckw · 2021-09-14T09:41:47Z

> Exactly. Yesterday I was also going through memos and reports looking for sample documents containing text strikethrough, and came across multiple samples where underlining was used extensively. I think I can make the code generic to cater for this underline effect.

Or maybe a flag to insert the line at the center of the text or under the text?

shaheryar1 · 2021-09-14T09:41:53Z

Here are the results with minor changes in the existing strikethrough code

proofconstruction · 2021-09-14T10:40:22Z

Does this work with different font sizes?

If we pick a standard width y for the highlighter effect, we could use the Chaikin algorithm to generate the points p for the underline, then add e.g. a yellow or pink highlighter tint (we could do this in RGB and convert back to grayscale for the most faithful effect) to all the points in the slice [p:p + y]. The underlines are already not perfectly straight, so this should look pretty natural too, like an unsteady hand highlighting a text area.
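One way to read the highlighter idea is sketched here, with several assumptions: the image is a BGR `uint8` array, the stroke points run left to right, and the tint is applied multiplicatively so that white paper picks up the colour while black ink stays black. The function name and parameters are hypothetical, and the optional conversion back to grayscale is left out.

```python
import numpy as np


def add_highlight(image_bgr, points, width=12, tint_bgr=(0, 255, 255), alpha=0.5):
    """Tint a band of `width` pixels below the stroke given by `points`."""
    out = image_bgr.astype(np.float32)
    height, img_width = out.shape[:2]
    xs = np.array([p[0] for p in points], dtype=np.float32)
    ys = np.array([p[1] for p in points], dtype=np.float32)
    # Interpolate a top edge for every column between the first and last point
    # (assumes the x-coordinates are increasing).
    cols = np.arange(int(xs[0]), int(xs[-1]) + 1)
    tops = np.rint(np.interp(cols, xs, ys)).astype(int)
    # Multiplicative tint: 1.0 leaves a channel alone, smaller values darken it.
    factor = (1.0 - alpha) + alpha * np.array(tint_bgr, dtype=np.float32) / 255.0
    for x, top in zip(cols, tops):
        if 0 <= x < img_width:
            y0, y1 = max(top, 0), min(top + width, height)
            out[y0:y1, x] *= factor  # the slice [p : p + width] in column x
    return np.clip(out, 0, 255).astype(np.uint8)
```

Because the top edge follows the wobbly stroke column by column, the band inherits the same unsteadiness as the underline.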

Underline/Strikethrough, Highlight, and PencilScribbles are all created in the real world by someone using a writing tool on the document. I think what we really have here are different instances of a more general augmentation; we could have a Markup class with some flags to pick one of these sub-effects, or we could (probably) pull a lot of the code out into lib.py and share it between smaller classes for each of these effects.

I can see good reasons to do either of these.
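To make the "flag" option concrete, here is a tiny hypothetical sketch of how one shared helper could place the line for either effect from a text-line bounding box. `MarkupType`, `line_endpoints`, and `underline_gap` are invented names for illustration and do not reflect an existing Augraphy API.

```python
from enum import Enum


class MarkupType(Enum):
    STRIKETHROUGH = "strikethrough"
    UNDERLINE = "underline"


def line_endpoints(box, markup_type, underline_gap=2):
    """Return ((x0, y), (x1, y)) for a markup line over a box (x, y, w, h)."""
    x, y, w, h = box
    if markup_type is MarkupType.STRIKETHROUGH:
        line_y = y + h // 2              # through the middle of the text
    elif markup_type is MarkupType.UNDERLINE:
        line_y = y + h + underline_gap   # just below the text
    else:
        raise ValueError(f"unsupported markup type: {markup_type}")
    return (x, line_y), (x + w, line_y)
```

Either design in the comment above could sit on top of a helper like this: a single Markup class would pass the flag through, while separate classes would each call it with a fixed value.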

shaheryar1 · 2021-09-14T13:54:34Z

> Does this work with different font sizes?

Yes, it does.

> Underline/Strikethrough, Highlight, and PencilScribbles are all created in the real world by someone using a writing tool on the document. I think what we really have here are different instances of a more general augmentation; we could have a Markup class with some flags to pick one of these sub-effects, or we could (probably) pull a lot of the code out into lib.py and share it between smaller classes for each of these effects

I guess for end users we should create a Markup class with multiple options, and for contributors the core module should be in lib.py, especially Chaikin's algorithm and the script that extracts the position of each text line.

proofconstruction · 2021-09-16T02:34:03Z

@shaheryar1

Is it necessary to blur the image when making the strikethrough lines? https://github.com/shaheryar1/augraphy/blob/dev/augraphy/augmentations/strikethrough.py#L72

Does this help with contour detection?

proofconstruction · 2021-09-16T02:51:45Z

I removed the blur in testing and it doesn't seem to affect the number or placement of the resulting contours. Unless you think we should keep it in, I'm going to remove the blur from this augmentation so the Strikethrough only draws lines on text, without blurring the original image.

shaheryar1 · 2021-09-16T08:14:13Z

> Is it necessary to blur the image when making the strikethrough lines? https://github.com/shaheryar1/augraphy/blob/dev/augraphy/augmentations/strikethrough.py#L72

Yes, it helps when the word spacing is relatively large. I noticed that it is blurring the original image rather than making a copy of it and applying the blur operation to that. I'll fix this in my next PR.

proofconstruction · 2021-09-16T09:11:49Z

Can we blur a copy of the image to detect the contours, then use the contours to draw lines on a copy without blur? Otherwise this augmentation is really Blur + Strikethrough 😕

shaheryar1 · 2021-09-16T09:51:36Z

> Can we blur a copy of the image to detect the contours, then use the contours to draw lines on a copy without blur?

Yes, this is exactly what it is supposed to do. But I guess in the currently deployed code I forgot to make a copy of the image.

proofconstruction changed the title ~~Add new Augmentation - Hand cut text lines~~ Add new Augmentation - Text Strikethrough Aug 26, 2021

proofconstruction assigned shaheryar1 Aug 27, 2021

This was referenced Sep 8, 2021

Failed Test case #73

Closed

Added Text Strikethrough #75

Closed

proofconstruction closed this as completed Sep 10, 2021

shaheryar1 mentioned this issue Oct 1, 2021

Markup class for strikethrough/underline #100

Closed
