A better page number-crop algorithm with NumPy. (Manga) #709

Merged
merged 3 commits into from
Nov 11, 2024

Conversation

neyney10
Contributor

Summary

  1. I wrote a new method to crop pages, as the current one fails.
  2. Attached some cool gifs.

Motivation

E-reader screen real estate is an important resource.
More available screen size means more details can be better seen, especially text.
Text is one of the most important elements that need to be clearly readable on e-readers,
which mostly are smaller devices where the need to zoom is unwanted.

Cropping the page number at the bottom of the page regains 2%-5% of the page height,
which allows us to upscale the image even more.

  • Most of the time the screen height is the limiting factor in upscaling, rather than its width.
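To make the arithmetic concrete, here is a minimal sketch of how a small height crop translates into a larger fit-to-screen scale when height is the limiting dimension. The screen and page sizes are hypothetical numbers chosen for illustration, and `fit_scale` is an illustrative helper, not a KCC function.

```python
def fit_scale(page_w, page_h, screen_w, screen_h):
    """Largest uniform scale at which the page still fits on the screen."""
    return min(screen_w / page_w, screen_h / page_h)


# Hypothetical numbers: a 1072x1448 screen and a 1400x2000 page scan.
screen_w, screen_h = 1072, 1448
page_w, page_h = 1400, 2000

# Before cropping, the height ratio (0.724) is smaller than the width
# ratio (about 0.766), so height limits the scale.
before = fit_scale(page_w, page_h, screen_w, screen_h)

# Crop a page-number strip worth 4% of the page height.
cropped_h = round(page_h * 0.96)
after = fit_scale(page_w, cropped_h, screen_w, screen_h)

gain_percent = (after / before - 1) * 100
print(f"scale before: {before:.3f}, after: {after:.3f}, gain: {gain_percent:.1f}%")
```

As long as height stays the limiting dimension after the crop, the relative gain in scale is 1 / (1 - cropped fraction) - 1, so a 4% crop yields a little over 4% more magnification.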

The Current Algorithm Fails

While it is possible to adjust the 'power' of the cropping, I only had some success with a power of at least 4 (using the power-user EXE), whereas the GUI itself only allows up to 2.

The current algorithm fails in all of the cases that I have at hand.

  • I also don't want to crop the page number in case doing so would also crop some manga content. Some high power levels cropped some of my pages too much.
  • But I allow some level of cropping to achieve a tighter bounding box around the main manga-content with higher power levels.

A Better Algorithm

I wrote a Python function using PIL and NumPy (OpenCV might perform faster, but I fell back to PIL since it already exists as a dependency, and including OpenCV is heavy in terms of package size).

OUTPUT EXAMPLES OF THE NEW ALGORITHM

(I extended the power level limit to 3, from the previous limit of 2)

Page Number Cropping

ex1h001
ex5ga002

Showcasing Its Ability To Retain Manga Content

It does not crop the page number if it detects other things that might be cropped along with the page number.

ex8n003
ex2h010
ex3b001
ex4bl001
ex6h006
ex7h009
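The behaviour described above (skip the crop when other content might be lost) can be illustrated with a rough sketch. This is not the algorithm from this pull request. It is a simplified stand-in that assumes a grayscale page held in a 2-D NumPy array (for example from `np.asarray(img.convert("L"))` with PIL), and the function name, thresholds, and fractions are all hypothetical.

```python
import numpy as np


def bottom_cut_row(gray, ink_threshold=128, max_band_frac=0.05,
                   max_width_frac=0.2, min_gap_frac=0.01):
    """Return the row at which to cut the page bottom, or the full height
    when cutting looks unsafe. `gray` is a 2-D uint8 array (0 = black)."""
    height, width = gray.shape
    ink = gray < ink_threshold                 # True where a pixel is dark
    rows_with_ink = ink.any(axis=1)
    if not rows_with_ink.any():                # blank page: nothing to crop
        return height

    ink_rows = np.flatnonzero(rows_with_ink)
    last = int(ink_rows[-1])
    # Walk up from the last inked row to the top of the bottom-most band.
    band_start = last
    while band_start > 0 and rows_with_ink[band_start - 1]:
        band_start -= 1

    above = ink_rows[ink_rows < band_start]
    if above.size == 0:                        # the band is the only content
        return height
    gap = band_start - int(above[-1]) - 1      # blank rows above the band

    band_cols = np.flatnonzero(ink[band_start:last + 1].any(axis=0))
    band_width = int(band_cols[-1] - band_cols[0]) + 1
    band_height = last - band_start + 1

    # Only cut when the band is short, narrow, and clearly separated,
    # which is what a lone page number tends to look like.
    looks_like_page_number = (
        band_height <= max_band_frac * height
        and band_width <= max_width_frac * width
        and gap >= min_gap_frac * height
    )
    return band_start if looks_like_page_number else height


# Synthetic 1000x700 page: a large panel plus a small "page number" blob.
page = np.full((1000, 700), 255, dtype=np.uint8)
page[50:901, 50:651] = 0
page[950:971, 330:371] = 0
cut = bottom_cut_row(page)

# Same page, but the bottom band is wide artwork instead of a page number.
art_page = np.full((1000, 700), 255, dtype=np.uint8)
art_page[50:901, 50:651] = 0
art_page[950:971, 50:651] = 0
art_cut = bottom_cut_row(art_page)
```

In this sketch the first page is cut at the top of the small blob, while the second page is left at full height because the bottom band is too wide to be treated as a page number.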

Issues

  • My new algorithm uses the 'power' parameter differently than the other algorithm that simply crops the margins without the page number. A 'power level' in "Crop margins" is different than a 'power level' in "Crop margins & page numbers". This might lead to inconsistent expectations from the user and confuse them.

@axu2
Collaborator

axu2 commented Jun 20, 2024

Wow this looks incredible, might take me a while to fully review it though.

For reference, my perspective is mostly using a Kindle Scribe 10" which doesn't need to zoom or crop at all. But when I use smaller devices like an android Hisense a9 eink phone cropping is essential.

Have you looked into any android manga reading software? I bet they might like a feature like this, and maybe they have something similar.

And what platform did you develop on? I use macOS (arm64/apple silicon).

And do you have a general idea of the performance difference and how much it increases the binary size? Aka

’python setup.py build_binary’

@neyney10
Contributor Author

Wow this looks incredible, might take me a while to fully review it though.
For reference, my perspective is mostly using a Kindle Scribe 10" which doesn't need to zoom or crop at all. But when I use smaller devices like an android Hisense a9 eink phone cropping is essential.

I hope it would actually work for others, I tested it on 30+ images from 5-6 different Mangas. I want to try it on some more Mangas before I'll continue.

Have you looked into any android manga reading software? I bet they might like a feature like this, and maybe they have something similar.

Very shallowly. I searched Google for existing manga or comic crop algorithms, and even for general manga-reader apps, but couldn't find anything meaningful (aside from the standard cropping). I usually read on my PC or on the first-gen Kindle Paperwhite 6" that I used to have (so you can understand my constraint here).

Recently, my Kindle's screen shattered and I've ordered a new e-reader with a 7.8" screen. I've yet to receive it so in the meantime I thought I would at least try to optimize the Mangas that I plan to read/load when I get it. Hence, I tried to write this algorithm.

And what platform did you develop on? I use macOS (arm64/apple silicon).

I use an old Windows 10 laptop (intel x86-64)

And do you have a general idea of the performance difference and how much it increases the binary size? Aka
’python setup.py build_binary’

TBH, I forgot to measure the binary, so I ran it just now to see.
It does increase the EXE file considerably, by about 20 MB (roughly a 40% increase), from 48.3 MB to 68 MB. The only package that I've added is NumPy. I'll see how I can optimize it. In the worst case, I can re-implement it in basic Python math, probably with some performance hit, although I didn't use anything complicated that necessitates NumPy.

Regarding performance, to my surprise, it didn't suffer much of a hit. I ran several types of benchmarks.

  • On my PC the algorithm alone takes roughly 0.15 sec per image (when running it outside of KCC).
  • Compared to the existing "crop margins" (which is similar to the existing "crop margins & page numbers"), processing 10 chapters of 'Blame!' (300 MB, about 440 images) takes a similar time on my machine: 6 minutes from clicking "start" until I get the output file. Hence I assume that the algorithm's processing time is negligible compared to everything else that is happening (opening the file, extracting, reading images, upscaling, rotating, converting, building the book...).
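For anyone who wants to reproduce a per-image measurement like the one above, a minimal timing sketch could look like this. `seconds_per_item` is a hypothetical helper and `fake_crop` is a stand-in workload, to be replaced with the real cropping function and loaded page images.

```python
import time


def seconds_per_item(func, items, repeats=3):
    """Best-of-`repeats` average wall-clock time of func(item) per item."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for item in items:
            func(item)
        best = min(best, time.perf_counter() - start)
    return best / len(items)


# Stand-in workload; swap in the cropping function and real pages.
def fake_crop(page):
    return sum(page)


pages = [list(range(10_000)) for _ in range(20)]
per_page = seconds_per_item(fake_crop, pages)

# Project the per-page cost onto a 440-page batch.
projected = per_page * 440
print(f"{per_page * 1000:.3f} ms per page, {projected:.2f} s for 440 pages")
```

With the figures quoted above, 0.15 s per image over 440 images would come to about 66 s if the pages were processed strictly one at a time, which is worth keeping in mind when comparing against the 6-minute end-to-end run.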

By the way,
I found a small bug in one of my functions, I'll upload a fix soon.
I think I might want to postpone the pull request a bit until I try it a bit more on other types of Mangas.

@neyney10
Contributor Author

While I agree, I thought we wanted to optimize output file size (as NumPy increases it by about 20 MB).
Regardless, I only used trivial NumPy functions, and upon first inspection, the performance of the non-NumPy replacement methods is on par (and sometimes faster, in my quite shallow tests).

Do you want me to add the NumPy version back?

Unrelated - I've detected that the algorithm sometimes fails on thin fonts and single-digit page numbers. Also, there is an issue with using higher power levels that causes the page number to become distorted and the algorithm fails to detect it (currently I'm optimizing power=1 to work the best in most cases).

@zhaohengkang

That's great! Excuse me, does "Cropping Power: 2" in the cropping mode of KCC's current version correspond to the "Power 3" you're demonstrating?

@neyney10
Contributor Author

neyney10 commented Aug 1, 2024

That's great! Excuse me, does "Cropping Power: 2" in the cropping mode of KCC's current version correspond to the "Power 3" you're demonstrating?

Hi! No, the power levels [0...3] that I'm using are entirely different from the power levels [0...2] that currently exist in KCC v6.1.0.

I've noted that in the issues at the bottom of OP.

My new algorithm uses the 'power' parameter differently than the other algorithm that simply crops the margins without the page number. A 'power level' in "Crop margins" is different than a 'power level' in "Crop margins & page numbers". This might lead to inconsistent expectations from the user and confuse them.

@zhaohengkang

I have used the better optimization algorithm you provided and it is really great! Thank you very much!

@zhaohengkang

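Enable GitHub Actions on your branch. Then create a release on your branch, which will trigger GitHub Actions to build binaries for all supported platforms, just like here: https://github.com/axu2/kcc/releases

Also, feel free to build a numpy version so that we can more easily see the size difference across platforms.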

I did it. Thank you

@axu2
Collaborator

axu2 commented Aug 2, 2024

@zhaohengkang feel free to upload some comparison photos.

@zhaohengkang

Feel free to upload some comparison photos.

The following images come from the mobi file generated by importing PDF files into KCC. You can see that the original algorithm only extracts the pictures without cropping them, while the better algorithm also crops them.

1
2
3
4
5
6

@zhaohengkang

Feel free to upload some comparison photos.

Here are the results for the mobi files generated by importing a folder, and these are the comparison images.

7
8
9
10
11

@neyney10 neyney10 force-pushed the master branch 2 times, most recently from 320d056 to 38ff5e3 Compare August 6, 2024 13:46

@neyney10
Contributor Author

neyney10 commented Aug 7, 2024

I think the size difference of the final distributable files is negligible, I'd rather keep numpy. I've done lots of image processing work in the past with numpy and know how useful it is, it may come in handy in the future.

I also tested startup times and didn't see any significant difference. (maybe 1 second?)

I see. Then NumPy version it is.

Regarding the algorithm - there are still issues with thin fonts of page numbers and instances of a single-digit page num that the algorithm fails to detect with higher power levels (but less of a problem with lower power levels).

As Zhao demonstrated, the algorithm works fine in most cases and I think it is ready for general use, even if more tweaks and improvements are needed for some edge cases or maybe some scenarios that I haven't tested yet.

What about the issue I have with the inconsistency with power levels of the current "crop margins" versus the new "crop margin + page num"?
Should we modify the existing "crop margins" algorithm to make it consistent?


@zhaohengkang Very nice comparison images you compiled there! It seems like you worked hard on sharing these results!
I was wondering whether you tried running the algorithm with power=1, as it seems you only shared comparisons with higher power levels such as 2 and 3. I assume that lower power levels are too weak to crop the watermark?

@zhaohengkang

@neyney10 Sorry, when I turned it on, the default was power=2, and that power level met my requirements exactly, so I did not test the lower power levels.

@axu2
Collaborator

axu2 commented Aug 18, 2024

What about the issue I have with the inconsistency with power levels of the current "crop margins" versus the new "crop margin + page num"?
Should we modify the existing "crop margins" algorithm to make it consistent?

Sorry, totally forgot you asked a question here.

Is it easy to modify the existing "crop margins" algorithm? And is it unlikely to cause issues for users? If so, do it.

Otherwise, just put a note in the tooltip that the power is used differently.

If you really want to offer the old behavior too, you can add a LEGACY checkbox.

@keruiter

tmp3EFE
tmp14F4
image

Hi, thank you for providing such an amazing algorithm! However, I encountered an issue while using it: when I select the Cropping mode (i.e., when it is "checked"), an error often occurs. A screenshot of the error is attached. It seems that the issue might be related to the algorithm's inability to handle blank images within comic files.

To be honest, I’m not very familiar with the technical aspects, so I’ve been testing things out bit by bit. Initially, I suspected that the problem was due to the format conversion software I was using, so I tried several different ones, but the error persisted. Then, I thought it might be related to the PDF format, so I converted it to a folder containing image files and tested again, but the issue remained. Eventually, I discovered that the problem occurs when there are completely blank images within the comic files.

I further tested and found that the error only happens when the Cropping mode is checked. When it’s set to "indeterminate" or when cropping is not used, everything works fine. Additionally, using the original cropping algorithm version of KCC does not have this issue.

@neyney10
Contributor Author

neyney10 commented Oct 18, 2024

Thank you for the heads up, it seems like you did a nice investigation.

It could be that I overlooked an edge-case of blank images.

I'll confirm it and see what's going on.

@keruiter

Thanks for the quick response! Looking forward to your solution~

1. Replaced both crop margins and crop margins & page num with newer algorithm.
2. Crop max power level increased to 3.0
3. Adds NumPy as a new dependency.

@neyney10 neyney10 reopened this Oct 18, 2024
@neyney10
Contributor Author

For the update:

  1. Replaced the crop margins method with a dumber version of the crop page number algorithm. Frankly, it is quite similar to what existed until now in terms of the idea behind it.
  2. Fixed the issue of blank images - as mentioned by @keruiter.
  3. Removed the "getBoundingBox" method. Please notify me if this is problematic. I've only seen it used in the crop margins and page number methods, where it seems to limit the minimum and maximum crop. If such limits are needed, let me know.
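As an illustration of the blank-image edge case, here is a hedged sketch of a bounding-box helper that falls back to the full page when no dark pixels are found. It is not the fix that was committed here. `content_box` and `ink_threshold` are hypothetical names, and the failure mode shown (indexing into an empty result) is only one plausible way a blank page can break this kind of code.

```python
import numpy as np


def content_box(gray, ink_threshold=128):
    """Bounding box (left, top, right, bottom) of the dark pixels in a 2-D
    uint8 array, or the full-page box when the page is blank."""
    height, width = gray.shape
    ink = gray < ink_threshold
    rows = np.flatnonzero(ink.any(axis=1))
    cols = np.flatnonzero(ink.any(axis=0))
    if rows.size == 0:
        # Blank page: rows[0] below would raise IndexError, so keep the page.
        return (0, 0, width, height)
    return (int(cols[0]), int(rows[0]), int(cols[-1]) + 1, int(rows[-1]) + 1)


blank = np.full((300, 200), 255, dtype=np.uint8)
blank_box = content_box(blank)

inked = blank.copy()
inked[40:260, 30:170] = 0
inked_box = content_box(inked)
```

The returned tuple uses the same (left, upper, right, lower) order that PIL's `Image.crop` expects, so the box can be passed to it directly.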

@axu2 axu2 merged commit 6ba6906 into ciromattia:master Nov 11, 2024
@axu2 axu2 changed the title A better page number-crop algorithm. (Manga) A better page number-crop algorithm with NumPy. (Manga) Nov 11, 2024
@axu2
Collaborator

axu2 commented Nov 12, 2024

Alright, I'm ready to merge as a tentative 7.0.0 release, because the numpy dependency is a big dependency change. Get ready for more user feedback since this is now in the actual releases tab instead of being linked in the release notes.

Is this alright as the first line in the release notes?

A new page number crop algorithm has been implemented to optimize vertical screen real estate, please test and provide feedback. It is optimized for power = 1.0

code quality is great, I only have some minor nitpicks about defining functions before you use them and the fact that the margin crop algorithm should be defined first to make reviewing easier. Maybe the page# crop algo should call the margin crop algo to reduce duplicated code/comments.

People in the discord have reported good results with the new algo.

I have merged this for now, but I'll keep looking into it, I wonder if any python or numpy built ins can do what the helper functions are doing? Since kcc has no unit tests, always unsure if we miss a corner case.

And armv7 now takes forever to build due to numpy source build...

image

@neyney10
Contributor Author

neyney10 commented Nov 18, 2024

@axu2

Alright, I'm ready to merge as a tentative 7.0.0 release, because the numpy dependency is a big dependency change. Get ready for more user feedback since this is now in the actual releases tab instead of being linked in the release notes.

Is this alright as the first line in the release notes?

A new page number crop algorithm has been implemented to optimize vertical screen real estate, please test and provide feedback. It is optimized for power = 1.0

Ye, it's okay.

code quality is great, I only have some minor nitpicks about defining functions before you use them and the fact that the margin crop algorithm should be defined first to make reviewing easier. Maybe the page# crop algo should call the margin crop algo to reduce duplicated code/comments.

I guess we can somehow reduce duplicated code by reusing a more open function of the margin cropping algo.

People in the discord have reported good results with the new algo.

I have merged this for now, but I'll keep looking into it, I wonder if any python or numpy built ins can do what the helper functions are doing? Since kcc has no unit tests, always unsure if we miss a corner case.

During the development of the algorithm, I've created a small test system. It applies the algorithm on dozens of images and checks if it cropped them okay and matches the expected boxes.
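A test system of the kind described above could be sketched roughly as follows. The actual test setup is not shown in this thread, so every name here (`boxes_match`, `run_cases`, the pixel `tolerance`) is an assumption, and `fake_crop` merely stands in for the real cropping function.

```python
def boxes_match(actual, expected, tolerance=3):
    """True when every box edge is within `tolerance` pixels of the expectation."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))


def run_cases(crop_func, cases, tolerance=3):
    """Run crop_func on each case and collect (name, actual, expected) failures.

    `cases` maps a case name to (page, expected_box); boxes are
    (left, top, right, bottom) tuples.
    """
    failures = []
    for name, (page, expected) in cases.items():
        actual = crop_func(page)
        if not boxes_match(actual, expected, tolerance):
            failures.append((name, actual, expected))
    return failures


# Stand-in "algorithm": trims a fixed 10 px border from a (width, height) page.
def fake_crop(page):
    width, height = page
    return (10, 10, width - 10, height - 10)


cases = {
    "small_page": ((200, 300), (10, 10, 190, 290)),
    "off_by_two": ((400, 600), (12, 10, 390, 588)),  # within tolerance
    "wrong_box": ((400, 600), (10, 10, 390, 540)),   # should be reported
}
failures = run_cases(fake_crop, cases)
for name, actual, expected in failures:
    print(f"{name}: got {actual}, expected {expected}")
```

Comparing with a small tolerance rather than exact equality keeps such a harness usable while thresholds are still being tuned, at the cost of hiding off-by-a-few-pixels regressions.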

I've used NumPy in the helper functions... I don't know if there are better-fitting functions that cover more of the logic contained in the helper functions... But I agree, I'm worried about corner cases as well.

And armv7 now takes forever to build due to numpy source build...

image

Oh. Why is that? Is it because the NumPy package isn't optimized for ARM? Maybe NumPy v2+ is more optimized?

@axu2
Collaborator

axu2 commented Nov 18, 2024

Lots of packages, including numpy, don't have precompiled armv7 wheels. Don't worry about it, it's just a curious observation. armv7 is typically just Raspberry Pi.

@axu2 axu2 mentioned this pull request Dec 31, 2024