Determine string width before detecting the language of the string and applying this setting to the font shape engine leads to wrong get_string_width() #1231

Closed
kreier opened this issue Jul 25, 2024 · 2 comments · Fixed by #1233

Comments

kreier commented Jul 25, 2024

The width returned by pdf.get_string_width(string) depends on the script and language set for the text shaping engine (when it is used). However, even after explicitly setting the shaping engine to a specific script and language, for example with pdf.set_text_shaping(use_shaping_engine=True, script="arab", language="ara"), this setting can change, for instance when a string with Latin characters is printed: the shaping engine examines the first character, notices the mismatch, and switches to Latin text shaping. When the next string is rendered, its width is first determined with the stale (now Latin) setting; only afterwards does the shaping engine detect the string's language (Arabic in this case) and switch to that script and language. The returned width is therefore based on the wrong Latin setting.
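
The ordering problem can be traced with a small toy model. This is only a sketch of the sequence as described in this report, not fpdf2's actual code: `ToyPdf`, `detect_script`, and `trace_measurements` are hypothetical names, and the script detection is a deliberately crude stand-in.

```python
def detect_script(text):
    """Crude guess (assumption): 'arab' if any char is in U+0600..U+06FF, else 'latn'."""
    return "arab" if any("\u0600" <= ch <= "\u06ff" for ch in text) else "latn"


class ToyPdf:
    """Hypothetical stand-in with a 'sticky' shaping script, as described in the report."""

    def __init__(self, script):
        self.script = script

    def measure(self, text):
        # Measuring uses whatever script is currently set; it does not re-detect.
        return self.script

    def render(self, text):
        # Rendering detects the script of the text and switches to it.
        self.script = detect_script(text)


def trace_measurements(strings, initial_script="arab"):
    """Return the script that was active when each string was measured."""
    pdf = ToyPdf(initial_script)
    used = []
    for s in strings:
        used.append(pdf.measure(s))  # width is determined first ...
        pdf.render(s)                # ... then rendering may switch the script
    return used


print(trace_measurements(["الملوك", "test", "الملوك", "الملوك"]))
```

In this model, the Arabic string that directly follows "test" is measured while the Latin setting is still active, which matches the misplacement pattern reported here.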

I discovered this bug in a document that mixes Latin and non-Latin strings, where some of the non-Latin strings were misplaced. The example below visualizes this behavior.

Minimal code

from fpdf import FPDF

fontname = ["NotoArabic.ttf"]
# Arabic and Latin strings interleaved, so the shaping script changes between cells.
teststrings = ["الملوك", "الملوك", "test", "الملوك", "الملوك", "الملوك", "test", "الملوك", "test"]

def render_strings(pdf, teststrings):
    pdf.set_font('noto', size=24)
    pdf.set_draw_color(160)
    pdf.set_line_width(0.3)
    for string in teststrings:
        # Workaround: uncomment to reset script and language before measuring.
        # pdf.set_text_shaping(use_shaping_engine=True, script="arab", language="ara")
        width = pdf.get_string_width(string)
        # Right-align the string at x = 110 mm and outline the measured width.
        pdf.set_x(110 - width)
        pdf.rect(pdf.get_x(), pdf.get_y()+2, width, 13, style="D")
        pdf.cell(h=17, text=string)
        pdf.ln()
    pdf.ln()

for typeface in fontname:
    pdf = FPDF(orientation="P", unit="mm", format="A4")
    pdf.add_page()
    pdf.c_margin = 0
    pdf.add_font("noto", style="", fname="../../fonts/" + typeface)
    # Set the shaping engine to Arabic once, before any string is rendered.
    pdf.set_text_shaping(use_shaping_engine=True, script="arab", language="ara")
    render_strings(pdf, teststrings)
    pdf.output("fpdf2_switch_language" + typeface + ".pdf")

The output looks like this:

Screenshot 2024-07-25 at 13 12 56

Environment

@kreier kreier added the bug label Jul 25, 2024

kreier commented Jul 25, 2024

Update: setting the text shaping engine to the desired script and language every time before determining the string width solves the problem, at least for now. I added this line to the example code above and commented it out.
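
As a minimal sketch of this workaround, the two calls can be wrapped in one helper. The helper name `shaped_string_width` and its default arguments are assumptions for illustration; it relies only on the `set_text_shaping()` and `get_string_width()` calls already used in the example.

```python
def shaped_string_width(pdf, text, script="arab", language="ara"):
    """Re-apply the shaping settings, then measure the string.

    Hypothetical helper: `pdf` is expected to be an fpdf2 FPDF instance
    (or any object that offers the same two methods).
    """
    # Reset script and language so the measurement does not use a stale setting.
    pdf.set_text_shaping(use_shaping_engine=True, script=script, language=language)
    return pdf.get_string_width(text)
```

In the loop of the example, `width = shaped_string_width(pdf, string)` would then replace the direct `pdf.get_string_width(string)` call. Note that this forces the Arabic setting for the Latin test strings as well, just like the commented-out line does.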

andersonhc (Collaborator) commented

Thanks for reporting this issue, @kreier. I will take a look as soon as possible.
