
Surrogate characters are decoded wrongly in makeJustificationArray #605

Closed
EmanuelCozariz opened this issue Nov 20, 2020 · 4 comments

@EmanuelCozariz

EmanuelCozariz commented Nov 20, 2020

The string 𧙗 (U+27657) lies outside the Basic Multilingual Plane, so in Java it is encoded as the surrogate pair '\uD85D\uDE57'.

The font CODE2002.ttf accepts this string:

PDFont font = PDType0Font.load(doc, new File("CODE2002.ttf"));
cs.showText("\uD85D\uDE57");

However, it is decoded incorrectly.

The makeJustificationArray method of PdfBoxFastOutputDevice uses Character.toString(c) to add each char to the data array, so the two halves of the pair are converted separately:

\uD85D => Character.toString(c) will decode as �
\uDE57 => Character.toString(c) will decode as �
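
To illustrate the behaviour described above, here is a minimal standalone sketch. It does not use the project's code: `SurrogateSplitDemo` and `splitByChar` are hypothetical names that only mimic a per-char loop calling `Character.toString(c)`. It shows that the single code point becomes two strings, each holding a lone surrogate that is not a valid character on its own.

```java
import java.util.ArrayList;
import java.util.List;

class SurrogateSplitDemo {
    // Hypothetical helper (assumption): mimics a loop that converts
    // every UTF-16 code unit to its own String via Character.toString(c).
    static List<String> splitByChar(String text) {
        List<String> parts = new ArrayList<>();
        for (char c : text.toCharArray()) {
            parts.add(Character.toString(c));
        }
        return parts;
    }

    public static void main(String[] args) {
        String text = "\uD85D\uDE57"; // one code point, U+27657

        System.out.println(text.length());                         // prints 2 (UTF-16 code units)
        System.out.println(text.codePointCount(0, text.length())); // prints 1 (code points)

        // Each resulting string holds only half of the surrogate pair.
        for (String part : splitByChar(text)) {
            char c = part.charAt(0);
            System.out.printf("U+%04X surrogate=%b%n", (int) c, Character.isSurrogate(c));
        }
    }
}
```

Because each half is emitted as a separate string, a consumer that encodes or draws the strings one at a time never sees the complete pair.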

@danfickle
Owner

Hi @EmanuelCozariz,

You're right, the justification code was not surrogate-pair aware. I have made the fix but have not added a test, as fonts with surrogate pair coverage tend to be too large to add to the repository.

Hopefully, time permitting, you could download the repository and test with your use case before the next release, which should be soon?

Anyway, thanks again for reporting and debugging this issue.
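
For readers following along, one common way to make a character-splitting loop surrogate-pair aware is to iterate by code point instead of by char. The sketch below is an assumption about the general technique, not the actual change made in the repository; `CodePointSplitSketch` and `splitByCodePoint` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointSplitSketch {
    // Hypothetical helper (assumption, not the project's fix):
    // splits text into one String per Unicode code point, so a
    // surrogate pair stays together in a single element.
    static List<String> splitByCodePoint(String text) {
        List<String> parts = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            parts.add(new String(Character.toChars(cp)));
            i += Character.charCount(cp); // advance 1 or 2 chars
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> parts = splitByCodePoint("a\uD85D\uDE57b");
        System.out.println(parts.size());                         // prints 3
        System.out.println(parts.get(1).equals("\uD85D\uDE57"));  // prints true
    }
}
```

Advancing the index by `Character.charCount(cp)` keeps the loop correct for both BMP characters (one char) and supplementary characters (two chars).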

@danfickle
Owner

I think this is fixed. Please feel free to re-open if required. Release soon.

@EmanuelCozariz
Author

Do you have a release timeline? We are blocked by this issue at the moment. Thank you.

@danfickle
Owner

Hopefully on the weekend (Sunday 29 Nov) if no blocking issues come up.
