-
Notifications
You must be signed in to change notification settings - Fork 444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
extract: Wrong lineno when arguments are spread over multiple lines and contain a function call #1123
Comments
I also wrote a quick-and-dirty fix: diff --git a/babel/messages/extract.py b/babel/messages/extract.py
index 48573ee..182a74b 100644
--- a/babel/messages/extract.py
+++ b/babel/messages/extract.py
@@ -502,6 +502,7 @@ def extract_python(
:param options: a dictionary of additional options (optional)
:rtype: ``iterator``
"""
+ last_name_token = ""
funcname = lineno = message_lineno = None
stack_depth = -1
buf = []
@@ -521,6 +522,8 @@ def extract_python(
current_fstring_start = None
for tok, value, (lineno, _), _, _ in tokens:
+ if tok == NAME:
+ last_name_token = value
if stack_depth == -1 and tok == NAME and value in ('def', 'class'):
in_def = True
elif tok == OP and value == '(':
@@ -530,7 +533,8 @@ def extract_python(
in_def = False
continue
if funcname:
- message_lineno = lineno
+ if last_name_token in keywords:
+ message_lineno = lineno
stack_depth += 1
elif in_def and tok == OP and value == ':':
# End of a class definition without parens
diff --git a/tests/messages/test_extract.py b/tests/messages/test_extract.py
index 7d3a05a..852cdea 100644
--- a/tests/messages/test_extract.py
+++ b/tests/messages/test_extract.py
@@ -97,10 +97,10 @@ add_notice(req, ngettext("Bar deleted.",
messages = list(extract.extract_python(buf, ('ngettext', '_'), ['NOTE:'],
{'strip_comment_tags': False}))
- assert messages[0] == (3, 'ngettext', ('Catalog deleted.', 'Catalogs deleted.', None), ['NOTE: This Comment SHOULD Be Extracted'])
+ assert messages[0] == (2, 'ngettext', ('Catalog deleted.', 'Catalogs deleted.', None), ['NOTE: This Comment SHOULD Be Extracted'])
assert messages[1] == (6, '_', 'Locale deleted.', ['NOTE: This Comment SHOULD Be Extracted'])
assert messages[2] == (10, 'ngettext', ('Foo deleted.', 'Foos deleted.', None), ['NOTE: This Comment SHOULD Be Extracted'])
- assert messages[3] == (15, 'ngettext', ('Bar deleted.', 'Bars deleted.', None), ['NOTE: This Comment SHOULD Be Extracted', 'NOTE: And This One Too'])
+ assert messages[3] == (14, 'ngettext', ('Bar deleted.', 'Bars deleted.', None), ['NOTE: This Comment SHOULD Be Extracted', 'NOTE: And This One Too'])
def test_declarations(self):
buf = BytesIO(b"""\ |
Thanks for the investigation! Just for reference, GNU's gettext sets the line number to 2: > xgettext hello.py
> cat messages.po
...
#: line.py:2
msgid "hello"
msgstr "" I think it makes sense to try to be consistent with xgettext here, i.e. use lineno=2. Would you like to open a PR? |
Agreed, and in addition to being consistent with xgettext, it makes more sense to use the number of the line with the actual message.
I don't mind giving it a try, but I'm not sure I'll be able to free up enough time to look into it at the moment |
How would this break custom extractors? |
When a nested function call was inside a gettext call, but on a separate line, the message got extracted with the line number of that function call. We updated the line number whenever we encountered an opening parenthesis after a gettext function name. This commit fixes that by only updating the line number on the first argument inside a gettext call. Compared with `xgettext`, some test cases were thus wrong and corrected in this commit. Fixes python-babel#1123
When a gettext call had a nested function call on a new line, the extract function would use that nested call's line number when extracting the terms for the gettext call. The reason is that we set the line number on any encounter of an opening parenthesis after a gettext keyword. This does not work if either we have a nested call, or our first term starts on a new line. This commit fixes that by only setting the line number when we encounter the first argument inside a gettext call. Existing tests were adapted to work according to `xgettext` with regards to the line numbers. Fixes python-babel#1123
Currently the Python extractor does not support deeply nested gettext calls (deeper than as a direct argument to the top-level gettext call). e.g. ```py _("Hello %s", _("Person")) _("Hello %s", random_function(", ".join([_("Person 1"), _("Person 2")]))) ``` The extraction code was refactored quite a bit to simplify the flow and support this use-case. Fixes python-babel#1125 (meanwhile also fixes python-babel#1123)
Currently the Python extractor does not support deeply nested gettext calls (deeper than as a direct argument to the top-level gettext call). e.g. ```py _("Hello %s", _("Person")) _("Hello %s", random_function(", ".join([_("Person 1"), _("Person 2")]))) ``` The extraction code was refactored quite a bit to simplify the flow and support this use-case. Fixes python-babel#1125 (meanwhile also fixes python-babel#1123)
Currently the Python extractor does not support deeply nested gettext calls (deeper than as a direct argument to the top-level gettext call). e.g. ```py _("Hello %s", _("Person")) _("Hello %s", random_function(", ".join([_("Person 1"), _("Person 2")]))) ``` The extraction code was refactored quite a bit to simplify the flow and support this use-case. Fixes python-babel#1125 (meanwhile also fixes python-babel#1123)
When a gettext call had a nested function call on a new line, the extract function would use that nested call's line number when extracting the terms for the gettext call. The reason is that we set the line number on any encounter of an opening parenthesis after a gettext keyword. This does not work if either we have a nested call, or our first term starts on a new line. This commit fixes that by only setting the line number when we encounter the first argument inside a gettext call. Existing tests were adapted to work according to `xgettext` with regards to the line numbers. Fixes python-babel#1123
Overview
Consider this piece of code:
Extracting it yields the following result:
As you may have noticed, the lineno (the first component of the tuple) is wrong. It should be either 1 or 2 (depending on whether you expect the lineno of gettext or the lineno of the value), but not 3.
This is because the code setting message_lineno is executed every time an opening parenthesis is encountered and funcname is set:
babel/babel/messages/extract.py
Lines 526 to 533 in f91754b
while funcname is only changed if the name is in keywords (i.e., if the name is one of the gettext names):
babel/babel/messages/extract.py
Lines 615 to 616 in f91754b
So, in the case where you have a gettext call -setting funcname- with another function call inside its arguments -resulting in an opening parenthesis-, you meet the two conditions to change the value of lineno for something that might be wrong, for example, if you spread the arguments to gettext over multiple lines
Steps to Reproduce
Copy-paste this to your python REPL:
Actual Results
Expected Results
or maybe
depending on whether you expect the lineno of gettext or the lineno of the value
Additional Information
The first and last test cases of
test_comments_with_calls_that_spawn_multiple_lines
should fail, assuming you decide to consider the lineno to be the lineno of the gettext call. It currently fails if you replace the call tolen
in the test cases with something that isn't a function callThe text was updated successfully, but these errors were encountered: