Substitute Non-Breaking-Space with Normal-Space for PDF font character lookup #21
Comments
…h normal space if not present in font. Thanks @rototor
I think (hope) using |
@danfickle I think using |
@danfickle is there a timeframe for having this fixed (in a non-snapshot version)? We have an application using your library that is supposed to go into production, but the customer ran into this problem in user acceptance testing and is not likely to approve this moving to production the way it is. Thanks! |
Can you give me the weekend to clean up some svg code before deploying a release or do you need it immediately? It's nice to hear that people are using this. |
Yeah that's no problem. Thanks for the quick response! |
Just FYI, I came across another character that causes a "#" to show up, one that is classified as a zero-width space: https://en.wikipedia.org/wiki/Zero-width_space I've put some character replacement into our code to deal with this for the time being, but thought you'd like to know. Thanks again for the fast turnaround. |
At least we're not the only ones having trouble with this. |
Sorry that was supposed to be |
@scoldwell - If you are pre-filtering as a temporary fix, you may wish to use this function:

/**
 * Checks if a code point is printable. If false, it can be safely discarded at the
 * rendering stage, else it should be replaced with the replacement character,
 * if a suitable glyph cannot be found.
 * @param codePoint the Unicode code point to check
 * @return whether codePoint is printable
 */
public static boolean isCodePointPrintable(int codePoint) {
    // ISO control characters (U+0000..U+001F, U+007F..U+009F) are never printable.
    if (Character.isISOControl(codePoint))
        return false;
    // Reject the remaining general categories that have no visible glyph.
    int category = Character.getType(codePoint);
    return !(category == Character.CONTROL ||
             category == Character.FORMAT ||
             category == Character.UNASSIGNED ||
             category == Character.PRIVATE_USE ||
             category == Character.SURROGATE);
}

As an implementation note, behavior will differ between Java 6 and later versions, as the Unicode version was changed. I'll close this issue now, as I think it is finally solved. Feel free to re-open if you find any other issues. |
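For the pre-filtering use case mentioned above, here is a minimal sketch of how that check might be applied to a whole string before it is handed to the renderer. The class name `PrintableFilter` and the method `stripNonPrintable` are hypothetical and not part of the library; only the check itself is copied from the comment above.

```java
final class PrintableFilter {

    /** Same check as in the comment above, repeated so the sketch compiles on its own. */
    static boolean isCodePointPrintable(int codePoint) {
        if (Character.isISOControl(codePoint))
            return false;
        int category = Character.getType(codePoint);
        return !(category == Character.CONTROL ||
                 category == Character.FORMAT ||
                 category == Character.UNASSIGNED ||
                 category == Character.PRIVATE_USE ||
                 category == Character.SURROGATE);
    }

    /**
     * Hypothetical helper: drops every non-printable code point from text.
     * Iterates by code point so that surrogate pairs stay intact.
     */
    static String stripNonPrintable(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isCodePointPrintable(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

On Java 7 and later, the zero-width space (U+200B) falls into the `FORMAT` category and is dropped, while the non-breaking space (U+00A0) is a `SPACE_SEPARATOR` and is kept. Note that tabs and line breaks are ISO control characters, so this sketch removes them too; a real pre-filter would probably want to handle those before calling it.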
* Change groupid to reflect the transition into organization
* Doing builds and especially releases both on push and PR leads to duplicate builds. We should choose one of them, and I think PR should suffice
* Release process (danfickle#21)
* The first commit in the repo is from 2004, so I find it correct to state that as the inception year
* Updated the Maven compiler plugin as well
* Updated the Maven source and javadocs plugins
* Minor tweaks
* First take on a release pipeline
* Getting there
* Switching to using semver instead
* Updated groupid to adhere with what Maven central expects and accepts
I've investigated the "#" problem I described in #19 a bit further. The problem is that a non-breaking space (U+00A0) is rendered as '#'. The # comes from the default xhtmlrenderer.conf:
The replacement character is used even if xr.renderer.replace-missing-characters=false. It seems no font has a glyph for the non-breaking space. This makes some sense, as it is visually the same character as a normal space.
Just replacing the non-breaking space (character 160) with ' ' would fix the problem, but it does not feel like a correct fix to me:
Especially because there are more spaces than just space and non-breaking space. For examples, see https://www.cs.tut.fi/~jkorpela/chars/spaces.html
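To illustrate the more general idea, here is a rough sketch of a fallback that treats every Unicode space separator alike instead of special-casing character 160. It is not the fix that went into the library. `SpaceFallback`, `resolve`, and the `fontHasGlyph` predicate are hypothetical names; the predicate stands in for whatever glyph lookup the PDF font actually exposes.

```java
import java.util.function.IntPredicate;

final class SpaceFallback {

    /**
     * Picks the code point to draw for cp.
     *
     * @param cp                 the code point found in the text
     * @param fontHasGlyph       assumption: answers whether the font has a glyph for a code point
     * @param missingReplacement the configured replacement for missing glyphs, e.g. '#'
     */
    static int resolve(int cp, IntPredicate fontHasGlyph, int missingReplacement) {
        if (fontHasGlyph.test(cp)) {
            return cp; // the font can draw it directly
        }
        // SPACE_SEPARATOR (Unicode category Zs) covers U+00A0, U+2003 (em space),
        // U+3000 (ideographic space) and the other visible-width spaces.
        if (Character.getType(cp) == Character.SPACE_SEPARATOR && fontHasGlyph.test(' ')) {
            return ' ';
        }
        return missingReplacement;
    }
}
```

Substituting at glyph lookup time, rather than rewriting the text, would leave the original code point available to the line breaker, so a non-breaking space could still suppress a line break. The sketch ignores width differences: an em space drawn as a normal space comes out narrower than intended.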