Cannot display U+xxxxx utf-32 symbols #2965

shopping0421 · 2024-11-19T04:01:58Z

Describe the bug
I want to display all Chinese text in the PDF.
Common Chinese characters display correctly.
But some rare characters that are encoded with 4 bytes do not display correctly and are rendered as other characters instead.
e.g.
https://www.compart.com/en/unicode/U+24256
U+24256 "𤉖" is displayed as 'PV' in the PDF.

To Reproduce
Steps to reproduce the behavior including code snippet (if applies):

Register a font from:
https://fonts.google.com/specimen/Cactus+Classical+Serif

which supports the character '𤉖':
https://fonts.google.com/specimen/Chocolate+Classical+Sans?preview.text=%F0%A4%89%96

Render a PDF containing the text '𤉖'.

You can make use of react-pdf REPL to share the snippet

Expected behavior
I should see '𤉖' displayed correctly in the PDF.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

OS: Mac OS
Browser: Chrome
React-pdf version: 4.1.4

shopping0421 · 2024-11-20T09:58:08Z

Hi there,
I finally fixed this issue with the attached patches. I hope someone can review them and release the fix.
@react-pdf+pdfkit+4.0.0.patch
@react-pdf+layout+4.1.2.patch

I think the main reason for the issue may be:
(suppose c is a character in the range U+010000 - U+10FFFF, which the title calls a utf-32 char)
c is composed of 2 UTF-16 code units (a surrogate pair).
c.length => 2
Reading c one code unit at a time => only c[0].codePointAt(0), a lone surrogate that has no mapped code point in the fonts.
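The behavior described above can be checked in plain JavaScript. This is a minimal sketch using only built-in string methods; the variable names are illustrative and nothing here is taken from react-pdf or the attached patches.

```javascript
// U+24256, written as its UTF-16 surrogate pair so the source stays ASCII-safe.
const c = "\uD850\uDE56";

// A JavaScript string counts UTF-16 code units, not characters.
const units = c.length; // 2

// Reading one code unit at a time yields the two surrogates.
const highSurrogate = c.charCodeAt(0); // 0xD850
const lowSurrogate = c.charCodeAt(1); // 0xDE56

// Indexing first and then asking for a code point still sees a lone surrogate.
const perUnit = c[0].codePointAt(0); // 0xD850

// Asking the whole string at index 0 decodes the pair.
const fullCodePoint = c.codePointAt(0); // 0x24256

// Iterating by code point (Array.from, for...of, spread) gives one element.
const byCodePoint = Array.from(c);

// The low bytes of the two surrogates are 0x50 and 0x56, ASCII "P" and "V".
const lowBytes = String.fromCharCode(highSurrogate & 0xff, lowSurrogate & 0xff);
```

The last line may explain why the PDF showed 'PV' for this character, although that is an inference from the values and is not confirmed in this thread.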

So, what I did was:

Correct the code point calculation for utf-32 chars.
Fix the layout library's font selection (before, it always fell back to the default font because the code point was not valid).
Fix pdfkit to compute the correct glyphs and encoded text for utf-32 chars.
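As an illustration of the first step only, the code point calculation can be sketched with the surrogate-pair formula from the UTF-16 reference linked in this comment. `codePointsOf` is a hypothetical helper written for this example; it is not the code in the attached patches, and the layout and pdfkit changes are not shown.

```javascript
// Hypothetical helper, not taken from the attached patches.
// Returns the Unicode code points of a string, combining surrogate pairs.
function codePointsOf(text) {
  const result = [];
  for (let i = 0; i < text.length; i++) {
    const high = text.charCodeAt(i);
    const low = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
    const isPair =
      high >= 0xd800 && high <= 0xdbff && low >= 0xdc00 && low <= 0xdfff;
    if (isPair) {
      // Combine the two 10-bit halves and add the 0x10000 offset.
      result.push((high - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000);
      i++; // skip the low surrogate that was just consumed
    } else {
      // BMP character (or an unpaired surrogate, passed through unchanged).
      result.push(high);
    }
  }
  return result;
}

const points = codePointsOf("A\uD850\uDE56");
```

Here `points` holds two entries, 0x41 for "A" and 0x24256 for "𤉖". The built-in `String.prototype.codePointAt` and string iteration do the same decoding, so a real fix could rely on those instead of manual arithmetic.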

With these changes, the utf-32 chars display correctly in my PDF.

reference:
https://en.wikipedia.org/wiki/UTF-16#Code_points_from_U+010000_to_U+10FFFF

shopping0421 changed the title ~~Cannot display U+xxxxx unicode-32 symbols~~ Cannot display U+xxxxx utf-32 symbols Nov 19, 2024
