pygettext does not work with new f-strings (same quotes) #113604

Closed
Obstbeeren opened this issue Dec 31, 2023 · 4 comments
Labels
type-bug An unexpected behavior, bug, or error

@Obstbeeren

Obstbeeren commented Dec 31, 2023

Bug report

Bug description:

When using the new f-strings in 3.12 with the same quotes, pygettext fails.

from gettext import gettext as _

print(_(f"Hello, {_("Test")}!"))

Running pygettext returns the following error:

*** test.py:3: Seen unexpected token "f""

CPython versions tested on:

3.12

Operating systems tested on:

Linux, Windows

@Obstbeeren Obstbeeren added the type-bug An unexpected behavior, bug, or error label Dec 31, 2023
@terryjreedy
Member

The print call works, at least on Windows with 3.13.0a2, but there does not seem to be a pygettext module.

@serhiy-storchaka
Member

pygettext is not a module. It is a script in Tools/i18n.

It fails because it currently does not support the new tokens for f-strings: FSTRING_START, FSTRING_MIDDLE and FSTRING_END. Previously the tokenizer produced a single STRING token for an f-string, which had to be parsed into an AST and processed recursively.
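
To see the token change described above, here is a minimal sketch using the standard tokenize module. The helper name token_names is made up for illustration, and the sample deliberately uses a plain placeholder so that it tokenizes on older versions as well.

```python
import io
import token
import tokenize


def token_names(source):
    """Return the token type names the tokenizer emits for `source`."""
    readline = io.StringIO(source).readline
    return [token.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


names = token_names('_(f"Hello, {name}!")\n')
print(names)
# On 3.12+ the list contains FSTRING_START / FSTRING_MIDDLE / FSTRING_END.
# On 3.11 and earlier the whole f-string is a single STRING token.
```

A tokenizer-based tool that only expects STRING after the opening parenthesis of _() will therefore see an unfamiliar token type on 3.12 and later.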

@serhiy-storchaka
Member

It seems to me that it works correctly in 3.12 with this example.

  1. It is not related to using same quotes. You get the same result with different quotes.
  2. It complains because it expects a literal string as the argument of _(). You get a similar complaint about _(x) and _('a'+'b'). This is not an error. Perhaps the message should be clearer about this.
  3. It correctly finds "Test" in the nested _("Test"). It does not find it in 3.11, though, which is a bug.
  4. It does not find nested calls in some other examples, e.g. _(_('a')), class A(f(_('a'))):, def f(x=_('a')):, etc.
  5. It can find false docstrings, e.g. in def f() -> lambda: '''not docstring''':.
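
The cases in the list above can be reproduced with a small AST walk. This is only a sketch, not how pygettext is implemented: the helper literal_msgids and the sample source are made up for illustration, and the sample uses different quotes inside the f-string so that it also parses before 3.12.

```python
import ast


def literal_msgids(source):
    """Collect string literals passed directly as the first argument of _()."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "_"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append(node.args[0].value)
    return sorted(found)


SAMPLE = '''
print(_(f"Hello, {_('Test')}!"))
_(_('a'))
class A(f(_('b'))): pass
def g(x=_('c')): pass
def h() -> lambda: """not docstring""": pass
'''

# Nested calls inside f-strings, other calls, class bases and defaults are all visited.
print(literal_msgids(SAMPLE))  # ['Test', 'a', 'b', 'c']

# The string in the return annotation of h() is not treated as a docstring.
print(ast.get_docstring(ast.parse(SAMPLE).body[-1]))  # None
```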

So, there are two issues:

  1. pygettext should be clearer about what is an error and what is not. _() can be used with a non-literal string, and in general that is not an error, but it can be a programming error if an f-string is used unintentionally, so pygettext should emit a warning.
  2. In some corner cases pygettext cannot find nested _() calls or finds false docstrings. It is desirable to make it more reliable, although processing class and def declarations can be too complicated. Using an AST (pygettext: use an AST parser instead of a tokenizer #104400) can make it simpler, but it can have other drawbacks.
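
One possible shape for the first point, sketched with the ast module: literal arguments are extracted, f-string arguments produce a warning, and other non-literal arguments are skipped. The function name check_gettext_args, the message wording and the policy itself are assumptions for illustration, not pygettext's actual behavior.

```python
import ast


def check_gettext_args(source, filename="<example>"):
    """Split _() calls into extractable msgids and warnings (hypothetical policy)."""
    msgids, warnings = [], []
    for node in ast.walk(ast.parse(source, filename)):
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "_"
                and node.args):
            continue
        arg = node.args[0]
        if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
            msgids.append(arg.value)
        elif isinstance(arg, ast.JoinedStr):
            # An f-string here is likely unintentional, so report it.
            warnings.append(
                f"{filename}:{node.lineno}: f-string passed to _(); not extracted")
        # Other non-literal arguments such as _(x) are skipped silently.
    return msgids, warnings


msgids, warnings = check_gettext_args("_('ok')\n_(f'Hello, {name}!')\n_(x)\n")
print(msgids)
print(warnings)
```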

Also, the code for handling f-strings in 3.11 is no longer used and can be deleted.

@tomasr8
Member

tomasr8 commented Feb 16, 2025

Fixed in #104402

@tomasr8 tomasr8 closed this as completed Feb 16, 2025