Surrogate characters are decoded wrongly in makeJustificationArray #605

EmanuelCozariz · 2020-11-20T09:20:41Z

Given the string 𧙗, which lies outside the Basic Multilingual Plane, Java encodes it as the surrogate pair '\uD85D\uDE57'.

The font CODE2002.ttf accepts this string:

PDFont font = PDType0Font.load(doc, new File("CODE2002.ttf"));
cs.showText("\uD85D\uDE57");

But it is decoded incorrectly.

The makeJustificationArray method of PdfBoxFastOutputDevice
calls Character.toString(c) on each char when adding to the data array, which splits the surrogate pair into two lone surrogates:

\uD85D => Character.toString(c) decodes as �
\uDE57 => Character.toString(c) decodes as �
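
To illustrate the behaviour described above, here is a minimal, self-contained sketch. It does not reproduce the actual makeJustificationArray code; the class SurrogateSplitDemo and the helper splitByChar are hypothetical names that only mimic a per-char Character.toString(c) loop.

```java
import java.util.ArrayList;
import java.util.List;

public class SurrogateSplitDemo {
    // Hypothetical helper: splits a string per UTF-16 code unit,
    // the way a loop over chars with Character.toString(c) would.
    static List<String> splitByChar(String s) {
        List<String> out = new ArrayList<>();
        for (char c : s.toCharArray()) {
            out.add(Character.toString(c));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "\uD85D\uDE57"; // one code point, two chars

        // Each piece is a lone surrogate, which is not a valid character by itself.
        for (String piece : splitByChar(text)) {
            char c = piece.charAt(0);
            System.out.printf("U+%04X surrogate=%b%n", (int) c, Character.isSurrogate(c));
        }

        // Reading the string by code point recovers the intended character.
        System.out.printf("code point U+%X%n", text.codePointAt(0));
    }
}
```

Each resulting one-char string holds only half of the pair, so a font has no glyph to map it to, which matches the � output reported here.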


danfickle · 2020-11-20T12:28:13Z

Hi @EmanuelCozariz,

You're right, the justification code was not surrogate pair aware. I have made the fix but have not added a test as fonts with surrogate pair coverage tend to be too large to add to the repository.
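
The fix commit itself is not shown in this thread, so the following is only a sketch of what a surrogate pair aware split could look like, not the code that was committed. CodePointSplitDemo and splitByCodePoint are hypothetical names; the Character methods used are standard Java.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointSplitDemo {
    // Hypothetical helper: splits a string into one String per Unicode
    // code point, keeping surrogate pairs together.
    static List<String> splitByCodePoint(String s) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out.add(new String(Character.toChars(cp)));
            // Advance by 2 chars for supplementary code points, 1 otherwise.
            i += Character.charCount(cp);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> pieces = splitByCodePoint("a\uD85D\uDE57b");
        for (String piece : pieces) {
            System.out.printf("U+%X (%d char(s))%n", piece.codePointAt(0), piece.length());
        }
    }
}
```

With this approach the supplementary character stays a single two-char element, so whatever consumes the array sees a complete code point.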

Time permitting, could you download the repository and test with your use case before the next release, which should be soon?

Anyway, thanks again for reporting and debugging this issue.

danfickle · 2020-11-27T11:50:03Z

I think this is fixed. Please feel free to re-open if required. Release soon.

EmanuelCozariz · 2020-11-27T11:51:39Z

Do you have a release timeline? We are blocked by this issue at the moment. Thank you.

danfickle · 2020-11-27T11:54:14Z

Hopefully on the weekend (Sunday 29 Nov) if no blocking issues come up.

danfickle added a commit that referenced this issue Nov 20, 2020

#605 Fix for justification code not being code point aware.

9f41b80

danfickle closed this as completed Nov 27, 2020

danfickle mentioned this issue Nov 30, 2020

Upload to maven central via bintray. #7

Open
