#549 Fix soft hyphen characters are visible #550
Looks like the PR has a failing test: https://travis-ci.org/github/danfickle/openhtmltopdf/builds/725230397
(By the way, it seems the Travis CI <-> GitHub integration is not completely configured? Normally I would have expected the feedback directly in the comments.) |
It's fixed now. I forgot to remove a font used during testing, which wasn't needed in the end. Thanks for looking into it! |
Hi @StephanSchrader,
Firstly, a big thanks for contributing to this project! I think you want
The bigger question is the performance impact of running every character through the Unicode character database (probably multiple times). I'll have to do some profiling before I accept this PR. If it does prove to be a bottleneck, maybe we could just test for a few common problematic characters? Can you think of any others?
Thanks again, |
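For readers following the performance concern, here is a minimal sketch of what a category-based check could look like. It is not the code from this PR: the class and method names are hypothetical, and the choice to treat the Unicode general categories Cc (control) and Cf (format) as invisible while keeping tab, line feed, and carriage return is an assumption made for illustration. Only standard `java.lang.Character` calls are used.

```java
final class InvisibleByCategory {

    /** Hypothetical helper: true for control (Cc) and format (Cf) code points, except common whitespace. */
    static boolean isInvisible(int codePoint) {
        if (codePoint == '\n' || codePoint == '\r' || codePoint == '\t') {
            return false; // keep line breaks and tabs, since they affect layout
        }
        int type = Character.getType(codePoint); // lookup in the JDK's Unicode tables
        return type == Character.CONTROL || type == Character.FORMAT;
    }

    /** Returns the text with every code point matching isInvisible removed. */
    static String strip(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); /* advanced in the body */) {
            int cp = text.codePointAt(i);
            if (!isInvisible(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // 1 or 2 chars per code point
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The string literal contains a soft hyphen (U+00AD) between "co" and "operate".
        System.out.println(strip("co\u00ADoperate")); // prints "cooperate"
    }
}
```

Every code point passes through `Character.getType`, which is the per-character lookup whose cost is being questioned. Note also that the Cf category covers characters such as zero-width joiners, so a real renderer would probably need a narrower rule than this sketch.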
Hey Daniel aka @danfickle
The expression:
Good point. I can provide a JMH benchmark for the text rendering.
That's a nice idea. Actually, the soft hyphen is the only case I have. Other characters I can think of are Unit Separator and No Break Here. Source: https://en.wikipedia.org/wiki/C0_and_C1_control_codes |
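The "few common problematic characters" alternative could be sketched as below. This is an illustration only, not the implementation that was merged: the class, constant, and method names are hypothetical, and the set holds just the three characters named in this thread (soft hyphen U+00AD, Unit Separator U+001F, No Break Here U+0083).

```java
final class ProblemCharacters {

    static final int SOFT_HYPHEN = 0x00AD;
    static final int UNIT_SEPARATOR = 0x001F; // C0 control code
    static final int NO_BREAK_HERE = 0x0083;  // C1 control code

    /** Hypothetical check against a fixed set, with no Unicode table lookup. */
    static boolean isKnownInvisible(int codePoint) {
        switch (codePoint) {
            case SOFT_HYPHEN:
            case UNIT_SEPARATOR:
            case NO_BREAK_HERE:
                return true;
            default:
                return false;
        }
    }

    /** True if any char in the text is one of the known characters. */
    static boolean containsKnownInvisible(String text) {
        // All three characters are in the Basic Multilingual Plane, so a plain
        // char loop is sufficient here; no surrogate handling is needed.
        for (int i = 0; i < text.length(); i++) {
            if (isKnownInvisible(text.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsKnownInvisible("co\u00ADoperate")); // prints "true"
        System.out.println(containsKnownInvisible("cooperate"));       // prints "false"
    }
}
```

The trade-off is coverage versus cost: a fixed set is cheap to test but silently misses any invisible character that is not listed.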
- fix PdfBoxTextRenderer string width calculation
Hi @danfickle,
I've pushed the changes for
I also added a benchmark that renders a simple text and another one that renders text with hyphens. Please check whether I've missed something. To run the benchmarks, build the project and run:
Use
The branch: benchmark in my fork also contains the benchmarks for comparison. My results are:
Thanks for reviewing and all the work! Cheers, Stephan |
Thanks Stephan,
When manually iterating over Unicode code points (held in a 32-bit int, as opposed to Java's 16-bit chars), one has to be careful to increment the loop counter correctly to handle surrogate pairs (two 16-bit chars that make up one code point).
    for (int i = 0; i < str.length(); /* No-op: i is advanced in the body */) {
        int codePoint = str.codePointAt(i);
        i += Character.charCount(codePoint); /* Returns 1 or 2 */
    }
However, I'd recommend the use of
Other than that:
Thanks again, |
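To make the surrogate-pair pitfall concrete, the following sketch compares the loop above with a naive `i++` loop on a string that contains one supplementary character (U+1F600). The class and method names are made up for this illustration; only standard `String` and `Character` methods are used.

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointIteration {

    /** Correct: advances by the number of chars the code point occupies. */
    static List<Integer> codePoints(String str) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < str.length(); /* advanced in the body */) {
            int codePoint = str.codePointAt(i);
            result.add(codePoint);
            i += Character.charCount(codePoint); // 1 or 2
        }
        return result;
    }

    /** Buggy on purpose: always advances by one char. */
    static List<Integer> codePointsNaive(String str) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < str.length(); i++) {
            // Landing on the low half of a surrogate pair yields that lone surrogate.
            result.add(str.codePointAt(i));
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "a\uD83D\uDE00b"; // 'a', U+1F600 as a surrogate pair, 'b'
        System.out.println(codePoints(str).size());      // prints "3"
        System.out.println(codePointsNaive(str).size()); // prints "4"
    }
}
```

The naive loop reports a fourth, spurious code point (the lone low surrogate U+DE00), which would then be measured or filtered as if it were a real character.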
Hi @danfickle
You're absolutely right and I wasn't aware of the method:
My intention was simply to keep this aspect visible, and the examples module already contains parts with different responsibilities: more or less integration tests, documentation generation, and performance-related things ;) I moved the JMH benchmarks and dependency to
The package
Changes to
My first thought was
Done. Cheers Stephan |
Mostly complete except for index page and testing.
optimize formatting of diagnostic messages
…ccount for negative margins.
Will create a new PR. |
Fix for issue #549: all invisible characters are filtered out and are not printed in any way.