This repository has been archived by the owner on Jan 25, 2023. It is now read-only.

Missing Korean Characters in PDF #1757

Closed
bhansali-mukesh opened this issue May 27, 2020 · 7 comments

Comments

@bhansali-mukesh

Defect Report

Use this template for filing a defect report. For feature requests and other matters, you can use part of the template and delete what you don't need.

Title

Missing Korean Characters in PDF

Font

NotoSansCJKjp-Regular.otf.
You can upload the problem font here unless it is a Chinese, Japanese or Korean font (these are large).

Where the font came from, and when

For example:
Site: https://www.google.com/get/noto/
Date: 2020-03-03

Font Version

  • Version 1.004;PS 1.004;hotconv 1.0.82;makeotf.lib2.5.63406

OS name and version

CentOS Linux release 7.7.1908 (Core)
Derived from Red Hat Enterprise Linux 7.7 (Source)

Application name and version

Jasper Reports 6.8.1

Issue

When a report is produced with this font for the Korean language in PDF format, some characters are missing from the report. In HTML it looks fine.
I tried a different font and found that the PDF renders correctly with the other font.

  1. Generate any Jasper report as PDF with this font; the locale must be Korean.
  2. The report looks fine in HTML, but a few characters are missing in the Korean PDF report.
  3. The PDF should match the HTML report, as it does with the other font.

Character data

통제청 추
Noto Sans CJK JP Regular.pdf

Type Dodam M.pdf
Type Dodam M

noto.zip

Noto Sans CJK JP Regular

Please include real character data to illustrate your issue. Unicode codepoints are helpful. This makes it possible for developers who don't know the language or script to copy/paste the text to reproduce the issue.
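
As a hedged illustration of the request above, the code points of the sample text in this report can be listed with a few lines of Python. This is only a sketch: the helper name `code_points` is made up for this example and is not part of any Noto or HarfBuzz tooling.

```python
def code_points(text):
    """Return the Unicode code point of every character in text, in U+XXXX form."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Sample text copied from the "Character data" section of this report.
sample = "통제청 추"
for ch, cp in zip(sample, code_points(sample)):
    print(repr(ch), cp)
```

For the sample above this lists U+D1B5, U+C81C, U+CCAD, U+0020 (the space) and U+CD94. All of them except the space fall in the Hangul Syllables block (U+AC00 to U+D7A3).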

Screenshot

If possible, include a screenshot or an image illustrating the issue.
Annotations are also helpful.

Tools for reporting bugs

Useful tools for reporting bugs are available at: https://github.com/googlei18n/

HarfBuzz hb-view and hb-shape

These are part of the HarfBuzz distribution and can help isolate if an issue is in the app/OS, shaping engine, or font.

  • hb-view renders the text with the exact font (for example, to see how ligatured characters shape) using your installed version of HarfBuzz.

For example:

  hb-view --font-file {path to font} --text-file {path to text file} --output-file '{sample}.png'
  • hb-shape shows glyph selection and positioning

Fontview

  • Fontview displays the text.

Fontdiff

  • Fontdiff displays the text using two versions of the font side by side.
@bhansali-mukesh
Author

Diff-Noto Sans CJK JP Regular

@punchcutter

The PDF creation looks broken. When I try to open the PDFs you attached, I get an error:

Cannot extract the embedded font 'XTCOAU+NotoSansCJKjp-Bold-Identity-H'. Some characters may not display or print correctly.

Also, you should get the latest version of the fonts from https://github.com/googlefonts/noto-cjk/releases if you don't have them. If it works with another font then which font?
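
To make the font name in that error message easier to read, here is a minimal sketch that splits it into its parts. It assumes the usual PDF naming conventions: a subset font carries a tag of six uppercase letters followed by `+`, and a composite (CID-keyed) font may have the CMap name, such as `Identity-H`, appended after a hyphen. The function name `split_pdf_font_name` is hypothetical and does not come from Jasper Reports or any PDF library.

```python
import re

# Optional six-letter subset tag and "+", then the PostScript font name,
# then an optional "-Identity-H" / "-Identity-V" CMap suffix.
FONT_NAME = re.compile(
    r"^(?:(?P<tag>[A-Z]{6})\+)?(?P<name>.+?)(?:-(?P<cmap>Identity-[HV]))?$"
)


def split_pdf_font_name(base_font):
    """Split a PDF font name into subset tag, PostScript name and CMap name."""
    match = FONT_NAME.match(base_font)
    if match is None:
        raise ValueError(f"not a usable font name: {base_font!r}")
    return {
        "subset_tag": match.group("tag"),
        "postscript_name": match.group("name"),
        "cmap": match.group("cmap"),
    }


print(split_pdf_font_name("XTCOAU+NotoSansCJKjp-Bold-Identity-H"))
```

For the name in the error this gives the subset tag `XTCOAU`, the PostScript name `NotoSansCJKjp-Bold` and the CMap `Identity-H`. The embedded face is the Bold weight, so it may be worth checking which font files the report actually uses, since the issue was filed against the Regular weight.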

@udayjadhav

@punchcutter We are already using the latest font.
I have attached the PDF file and an HTML snapshot as well. We are using the Noto Sans CJK JP Regular font; I also tried the font you mentioned.
You might be getting "Cannot extract the embedded font 'XTCOAU+NotoSansCJKjp-Bold-Identity-H'. Some characters may not display or print correctly." because the font may not be available in your classpath. You can try opening the PDF in the browser.
Noto Sans CJK JP Regular.pdf
Noto Sans CJK JP Regular
Type Dodam M.pdf
Type Dodam M

@patchew

patchew commented Jun 11, 2020 via email

I would also think that using Noto Sans CJK KR would be more appropriate for Korean than Noto Sans CJK JP.

@Harshad-Panmand

Harshad-Panmand commented Aug 7, 2020

@patchew / @punchcutter I have used 'Noto Sans CJK KR *' and it is still not working. In the PDF attached below, Korean characters are missing (as before).

CAPA 기록 보고서 V2 (1).pdf

CAPA_PDF_Snapshot

CAPA_HTML

@patchew

patchew commented Nov 12, 2020

@patchew / @punchcutter I have used 'Noto Sans CJK KR *' and it is still not working. In the PDF attached below, Korean characters are missing (as before).

@Harshad-Panmand, were you able to resolve this?
When I try to open your attached PDF, it seems that there is more to it than just using the Noto Sans CJK KR fonts:

Acrobat_Pro_DC_and_CAPA_V2_1_pdf

@simoncozens
Collaborator

This seems very similar to #1678 (same error message). Closing in favour of that. I suspect it's broken PDF generation software, but anything we can do to debug would be good.
